Ran Xu
20
Papers
462
Total Citations
Papers (20)
ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding
CVPR 2024
192
citations
HIVE: Harnessing Human Feedback for Instructional Visual Editing
CVPR 2024
164
citations
Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization
ICLR 2024
104
citations
Trust but Verify: Programmatic VLM Evaluation in the Wild
ICCV 2025
2
citations
Continual-MAE: Adaptive Distribution Masked Autoencoders for Continual Test-Time Adaptation
CVPR 2024
0
citations
Position: TrustLLM: Trustworthiness in Large Language Models
ICML 2024
0
citations
WOAD: Weakly Supervised Online Action Detection in Untrimmed Videos
CVPR 2021 (arXiv)
0
citations
Use All the Labels: A Hierarchical Multi-Label Contrastive Learning Framework
CVPR 2022 (arXiv)
0
citations
ULIP: Learning a Unified Representation of Language, Images, and Point Clouds for 3D Understanding
CVPR 2023
0
citations
Mask-Free OVIS: Open-Vocabulary Instance Segmentation Without Manual Mask Annotations
CVPR 2023 (arXiv)
0
citations
Deformer: Dynamic Fusion Transformer for Robust Hand Pose Estimation
ICCV 2023 (arXiv)
0
citations
GlueGen: Plug and Play Multi-modal Encoders for X-to-image Generation
ICCV 2023
0
citations
Open Vocabulary Object Detection with Pseudo Bounding-Box Labels
ECCV 2022
0
citations
Burn after Reading: Online Adaptation for Cross-Domain Streaming Data
ECCV 2022
0
citations
SmartAdapt: Multi-Branch Object Detection Framework for Videos on Mobiles
CVPR 2022
0
citations
Structured Policy Optimization: Enhance Large Vision-Language Model via Self-referenced Dialogue
ICCV 2025
0
citations
Towards Scalable Spatial Intelligence via 2D-to-3D Data Lifting
ICCV 2025
0
citations
Text2Data: Low-Resource Data Generation with Textual Control
AAAI 2025
0
citations
Open Visual Knowledge Extraction via Relation-Oriented Multimodality Model Prompting
NeurIPS 2023
0
citations
UniControl: A Unified Diffusion Model for Controllable Visual Generation In the Wild
NeurIPS 2023
0
citations