Jiahao Wang

Papers: 20

Total Citations: 128

Papers (20)

Structure-Aware Sparse-View X-ray 3D Reconstruction

Universal Segmentation at Arbitrary Granularity with Language Instruction

CoopTrack: Exploring End-to-End Learning for Efficient Cooperative Sequential Perception

SpotActor: Training-Free Layout-Controlled Consistent Image Generation

SAUI: Scale-Aware Unseen Imagineer for Zero-Shot Object Detection

SceneCrafter: Controllable Multi-View Driving Scene Editing

Stepping Out of Similar Semantic Space for Open-Vocabulary Segmentation

Enhancing the Outcome Reward-based RL Training of MLLMs with Self-Consistency Sampling

IWRN: A Robust Blind Watermarking Method for Artwork Image Copyright Protection Against Noise Attack

ViLT-CLIP: Video and Language Tuning CLIP with Multimodal Prompt Learning and Scenario-guided Optimization

CRA-PCN: Point Cloud Completion with Intra- and Inter-level Cross-Resolution Transformers

RepKPU: Point Cloud Upsampling with Kernel Point Representation and Deformation

PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models

RobustLight: Improving Robustness via Diffusion Reinforcement Learning for Traffic Signal Control

Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Video Content

Mamba-Reg: Vision Mamba Also Needs Registers

Towards Precise Scaling Laws for Video Diffusion Transformers

Imbalance in Balance: Online Concept Balancing in Generation Models

DynamicID: Zero-Shot Multi-ID Image Personalization with Flexible Facial Editability

LiT: Delving into a Simple Linear Diffusion Transformer for Image Generation