Hao Tan
28
Papers
913
Total Citations
Papers (28)
GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting
ECCV 2024
245
citations
DMV3D: Denoising Multi-view Diffusion Using 3D Large Reconstruction Model
ICLR 2024
227
citations
PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction
ICLR 2024
154
citations
LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias
ICLR 2025
86
citations
RandAR: Decoder-only Autoregressive Visual Generation in Random Orders
CVPR 2025
61
citations
Long-LRM: Long-sequence Large Reconstruction Model for Wide-coverage Gaussian Splats
ICCV 2025
56
citations
Numerical Pruning for Efficient Autoregressive Models
AAAI 2025
22
citations
Carve3D: Improving Multi-view Reconstruction Consistency for Diffusion Models with RL Finetuning
CVPR 2024
21
citations
Compound Text-Guided Prompt Tuning via Image-Adaptive Cues
AAAI 2024
13
citations
Gaussian Mixture Flow Matching Models
ICML 2025
8
citations
Turbo3D: Ultra-fast Text-to-3D Generation
CVPR 2025
6
citations
Adaptive Few-shot Prompting for Machine Translation with Pre-trained Language Models
AAAI 2025
6
citations
Buffer Anytime: Zero-Shot Video Depth and Normal from Image Priors
CVPR 2025
4
citations
Generating 3D-Consistent Videos from Unposed Internet Photos
CVPR 2025
4
citations
Recover and Match: Open-Vocabulary Multi-Label Recognition through Knowledge-Constrained Optimal Transport
CVPR 2025
0
citations
Large-scale Multi-view Tensor Clustering with Implicit Linear Kernels
CVPR 2025
0
citations
MegaSynth: Scaling Up 3D Scene Reconstruction with Synthesized Data
CVPR 2025
0
citations
RayZer: A Self-supervised Large View Synthesis Model
ICCV 2025
0
citations
VEGGIE: Instructional Editing and Reasoning Video Concepts with Grounded Generation
ICCV 2025
0
citations
DiffTell: A High-Quality Dataset for Describing Image Manipulation Changes
ICCV 2025
0
citations
LazyDiT: Lazy Learning for the Acceleration of Diffusion Transformers
AAAI 2025
0
citations
Building Vision-Language Models on Solid Foundations with Masked Distillation
CVPR 2024
0
citations
Efficient Federated Incomplete Multi-View Clustering
ICML 2025
0
citations
A Joint Speaker-Listener-Reinforcer Model for Referring Expressions
CVPR 2017
0
citations
EnvEdit: Environment Editing for Vision-and-Language Navigation
CVPR 2022
0
citations
Learning Navigational Visual Representations with Semantic Map Supervision
ICCV 2023
0
citations
Scaling Data Generation in Vision-and-Language Navigation
ICCV 2023
0
citations
VidLanKD: Improving Language Understanding via Video-Distilled Knowledge Transfer
NeurIPS 2021
0
citations