Xiaojuan Qi
23
Papers
600
Total Citations
Papers (23)
SC-GS: Sparse-Controlled Gaussian Splatting for Editable Dynamic Scenes
CVPR 2024
302
citations
RegionPLC: Regional Point-Language Contrastive Learning for Open-World 3D Scene Understanding
CVPR 2024
103
citations
VideoEspresso: A Large-Scale Chain-of-Thought Dataset for Fine-Grained Video Reasoning via Core Frame Selection
CVPR 2025
54
citations
V-IRL: Grounding Virtual Intelligence in Real Life
ECCV 2024, arXiv
35
citations
Mixture Compressor for Mixture-of-Experts LLMs Gains More
ICLR 2025
23
citations
DiST-4D: Disentangled Spatiotemporal Diffusion with Metric Depth for 4D Driving Scene Generation
ICCV 2025, arXiv
21
citations
Total-Decom: Decomposed 3D Scene Reconstruction with Minimal Interaction
CVPR 2024
12
citations
Can OOD Object Detectors Learn from Foundation Models?
ECCV 2024
12
citations
ObjectMover: Generative Object Movement with Video Prior
CVPR 2025
10
citations
SaCo Loss: Sample-wise Affinity Consistency for Vision-Language Pre-training
CVPR 2024
10
citations
Deformable Radial Kernel Splatting
CVPR 2025
8
citations
"Principal Components" Enable A New Language of Images
ICCV 2025
6
citations
A Data-Centric Revisit of Pre-Trained Vision Models for Robot Learning
CVPR 2025
3
citations
Equipping Vision Foundation Model with Mixture of Experts for Out-of-Distribution Detection
ICCV 2025
1
citations
How to Make Cross Encoder a Good Teacher for Efficient Image-Text Retrieval?
CVPR 2024
0
citations
Learning from Neighbors: Category Extrapolation for Long-Tail Learning
CVPR 2025
0
citations
UniScene: Unified Occupancy-centric Driving Scene Generation
CVPR 2025
0
citations
Holistic Tokenizer for Autoregressive Image Generation
ICCV 2025
0
citations
Aligning Effective Tokens with Video Anomaly in Large Language Models
ICCV 2025
0
citations
Mixture-of-Scores: Robust Image-Text Data Valuation via Three Lines of Code
ICCV 2025
0
citations
How Far are AI-generated Videos from Simulating the 3D Visual World: A Learned 3D Evaluation Approach
ICCV 2025
0
citations
EscherNet: A Generative Model for Scalable View Synthesis
CVPR 2024
0
citations
Classes Are Not Equal: An Empirical Study on Image Recognition Fairness
CVPR 2024
0
citations