Yongming Rao
43 Papers
1,127 Total Citations
Papers (43)
Runtime Neural Pruning
NeurIPS 2017
509
citations
Generative Multimodal Models are In-Context Learners
CVPR 2024
422
citations
Temporal Coherence or Temporal Motion: Which is More Critical for Video-based Person Re-identification?
ECCV 2020
81
citations
MetaDistiller: Network Self-Boosting via Meta-Learned Top-Down Distillation
ECCV 2020
41
citations
Sherpa3D: Boosting High-Fidelity Text-to-3D Generation via Coarse 3D Prior
CVPR 2024
37
citations
Efficient Inference of Vision Instruction-Following Models with Elastic Cache
ECCV 2024
25
citations
Coarse Correspondences Boost Spatial-Temporal Reasoning in Multimodal Language Model
CVPR 2025
9
citations
SparseMM: Head Sparsity Emerges from Visual Concept Responses in MLLMs
ICCV 2025
3
citations
PV-RAFT: Point-Voxel Correlation Fields for Scene Flow Estimation of Point Clouds
CVPR 2021
0
citations
Stochastic Trajectory Prediction via Motion Indeterminacy Diffusion
CVPR 2022
0
citations
FineDiving: A Fine-Grained Dataset for Procedure-Aware Action Quality Assessment
CVPR 2022
0
citations
Back to Reality: Weakly-Supervised 3D Object Detection With Shape-Guided Label Enhancement
CVPR 2022
0
citations
Point-BERT: Pre-Training 3D Point Cloud Transformers With Masked Point Modeling
CVPR 2022
0
citations
DenseCLIP: Language-Guided Dense Prediction With Context-Aware Prompting
CVPR 2022
0
citations
SemAffiNet: Semantic-Affine Transformation for Point Cloud Segmentation
CVPR 2022
0
citations
FLAG3D: A 3D Fitness Activity Dataset With Language Instruction
CVPR 2023
0
citations
DiffSwap: High-Fidelity and Controllable Face Swapping via 3D-Aware Masked Diffusion
CVPR 2023
0
citations
Learning Discriminative Aggregation Network for Video-Based Face Recognition
ICCV 2017
0
citations
Attention-Aware Deep Reinforcement Learning for Video Face Recognition
ICCV 2017
0
citations
Group-Aware Contrastive Regression for Action Quality Assessment
ICCV 2021
0
citations
PoinTr: Diverse Point Cloud Completion With Geometry-Aware Transformers
ICCV 2021
0
citations
RandomRooms: Unsupervised Pre-Training From Synthetic Shapes and Randomized Layouts for 3D Object Detection
ICCV 2021
0
citations
NerfingMVS: Guided Optimization of Neural Radiance Fields for Indoor Multi-View Stereo
ICCV 2021
0
citations
Towards Interpretable Deep Metric Learning With Structural Matching
ICCV 2021
0
citations
Counterfactual Attention Learning for Fine-Grained Visual Categorization and Re-Identification
ICCV 2021
0
citations
Take-A-Photo: 3D-to-2D Generative Pre-training of Point Cloud Models
ICCV 2023
0
citations
TCOVIS: Temporally Consistent Online Video Instance Segmentation
ICCV 2023
0
citations
Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models
CVPR 2025
0
citations
AMixer: Adaptive Weight Mixing for Self-Attention Free Vision Transformers
ECCV 2022
0
citations
LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection
ECCV 2022
0
citations
Unleashing Text-to-Image Diffusion Models for Visual Perception
ICCV 2023
0
citations
X-3D: Explicit 3D Structure Modeling for Point Cloud Recognition
CVPR 2024
0
citations
Learning Globally Optimized Object Detector via Policy Gradient
CVPR 2018
0
citations
Spherical Fractal Convolutional Neural Networks for Point Cloud Recognition
CVPR 2019
0
citations
COIN: A Large-Scale Dataset for Comprehensive Instructional Video Analysis
CVPR 2019
0
citations
Global-Local Bidirectional Reasoning for Unsupervised Representation Learning of 3D Point Clouds
CVPR 2020
0
citations
Deep Face Super-Resolution With Iterative Collaboration Between Attentive Recovery and Landmark Estimation
CVPR 2020
0
citations
Structure-Preserving Super Resolution With Gradient Guidance
CVPR 2020
0
citations
Global Filter Networks for Image Classification
NeurIPS 2021
0
citations
DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification
NeurIPS 2021
0
citations
HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions
NeurIPS 2022
0
citations
P2P: Tuning Pre-trained Image Models for Point Cloud Analysis with Point-to-Pixel Prompting
NeurIPS 2022
0
citations
UniPC: A Unified Predictor-Corrector Framework for Fast Sampling of Diffusion Models
NeurIPS 2023
0
citations