Dahua Lin

41
Papers
2,696
Total Citations

Papers (41)

VBench: Comprehensive Benchmark Suite for Video Generative Models

CVPR 2024
996
citations

Scaffold-GS: Structured 3D Gaussians for View-Adaptive Rendering

CVPR 2024
589
citations

OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation

CVPR 2024
365
citations

SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction

ICLR 2024
209
citations

VideoBooth: Diffusion-based Video Generation with Image Prompts

CVPR 2024
118
citations

Unified Human-Scene Interaction via Prompted Chain-of-Contacts

ICLR 2024
100
citations

GPT4Point: A Unified Framework for Point-Language Understanding and Generation

CVPR 2024
62
citations

Long Context Tuning for Video Generation

ICCV 2025
56
citations

LEGION: Learning to Ground and Explain for Synthetic Image Detection

ICCV 2025
32
citations

Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction

CVPR 2025
31
citations

UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios

AAAI 2025
26
citations

SongGen: A Single Stage Auto-regressive Transformer for Text-to-Song Generation

ICML 2025
21
citations

MIA-DPO: Multi-Image Augmented Direct Preference Optimization For Large Vision-Language Models

ICLR 2025
19
citations

IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations

ICLR 2025
15
citations

Creation-MMBench: Assessing Context-Aware Creative Intelligence in MLLMs

ICCV 2025
12
citations

Horizon-GS: Unified 3D Gaussian Splatting for Large-Scale Aerial-to-Ground Scenes

CVPR 2025
11
citations

GenDoP: Auto-regressive Camera Trajectory Generation as a Director of Photography

ICCV 2025
7
citations

Utilize the Flow Before Stepping into the Same River Twice: Certainty Represented Knowledge Flow for Refusal-Aware Instruction Tuning

AAAI 2025
6
citations

Keyframe-Guided Creative Video Inpainting

CVPR 2025
6
citations

HiFlow: Training-free High-Resolution Image Generation with Flow-Aligned Guidance

NeurIPS 2025
6
citations

Multi-identity Human Image Animation with Structural Video Diffusion

ICCV 2025
5
citations

VFlowOpt: A Token Pruning Framework for LMMs with Visual Information Flow-Guided Optimization

ICCV 2025
2
citations

Bootstrap3D: Improving Multi-view Diffusion Model with Synthetic Data

ICCV 2025
2
citations

EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI

CVPR 2024
0
citations

Mixing Expert Knowledge: Bring Human Thoughts Back To the Game of Go

NeurIPS 2025
0
citations

OneLLM: One Framework to Align All Modalities with Language

CVPR 2024
0
citations

GPT-4V(ision) is a Human-Aligned Evaluator for Text-to-3D Generation

CVPR 2024
0
citations

Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

CVPR 2024
0
citations

3DTopia-XL: Scaling High-quality 3D Asset Generation via Primitive Diffusion

CVPR 2025
0
citations

From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models

CVPR 2024
0
citations

X-Prompt: Generalizable Auto-Regressive Visual Learning with In-Context Prompting

ICCV 2025
0
citations

Towards Text-guided 3D Scene Composition

CVPR 2024
0
citations

Cinematic Behavior Transfer via NeRF-based Differentiable Filming

CVPR 2024
0
citations

SAM2Long: Enhancing SAM 2 for Long Video Segmentation with a Training-Free Memory Tree

ICCV 2025
0
citations

HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting

CVPR 2024
0
citations

Visual-RFT: Visual Reinforcement Fine-Tuning

ICCV 2025
0
citations

MM-IFEngine: Towards Multimodal Instruction Following

ICCV 2025
0
citations

ByTheWay: Boost Your Text-to-Video Generation Model to Higher Quality in a Training-free Way

CVPR 2025
0
citations

Conical Visual Concentration for Efficient Large Vision-Language Models

CVPR 2025
0
citations

MuxServe: Flexible Spatial-Temporal Multiplexing for Multiple LLM Serving

ICML 2024
0
citations

Linear Alignment: A Closed-form Solution for Aligning Human Preferences without Tuning and Feedback

ICML 2024
0
citations