Yongming Rao

43 Papers
1,127 Total Citations

Papers (43)

Runtime Neural Pruning

NeurIPS 2017
509
citations

Generative Multimodal Models are In-Context Learners

CVPR 2024
422
citations

Temporal Coherence or Temporal Motion: Which is More Critical for Video-based Person Re-identification?

ECCV 2020
81
citations

MetaDistiller: Network Self-Boosting via Meta-Learned Top-Down Distillation

ECCV 2020
41
citations

Sherpa3D: Boosting High-Fidelity Text-to-3D Generation via Coarse 3D Prior

CVPR 2024
37
citations

Efficient Inference of Vision Instruction-Following Models with Elastic Cache

ECCV 2024
25
citations

Coarse Correspondences Boost Spatial-Temporal Reasoning in Multimodal Language Model

CVPR 2025 (arXiv)
9
citations

SparseMM: Head Sparsity Emerges from Visual Concept Responses in MLLMs

ICCV 2025
3
citations

PV-RAFT: Point-Voxel Correlation Fields for Scene Flow Estimation of Point Clouds

CVPR 2021
0
citations

Stochastic Trajectory Prediction via Motion Indeterminacy Diffusion

CVPR 2022 (arXiv)
0
citations

FineDiving: A Fine-Grained Dataset for Procedure-Aware Action Quality Assessment

CVPR 2022 (arXiv)
0
citations

Back to Reality: Weakly-Supervised 3D Object Detection With Shape-Guided Label Enhancement

CVPR 2022 (arXiv)
0
citations

Point-BERT: Pre-Training 3D Point Cloud Transformers With Masked Point Modeling

CVPR 2022
0
citations

DenseCLIP: Language-Guided Dense Prediction With Context-Aware Prompting

CVPR 2022 (arXiv)
0
citations

SemAffiNet: Semantic-Affine Transformation for Point Cloud Segmentation

CVPR 2022 (arXiv)
0
citations

FLAG3D: A 3D Fitness Activity Dataset With Language Instruction

CVPR 2023 (arXiv)
0
citations

DiffSwap: High-Fidelity and Controllable Face Swapping via 3D-Aware Masked Diffusion

CVPR 2023
0
citations

Learning Discriminative Aggregation Network for Video-Based Face Recognition

ICCV 2017
0
citations

Attention-Aware Deep Reinforcement Learning for Video Face Recognition

ICCV 2017
0
citations

Group-Aware Contrastive Regression for Action Quality Assessment

ICCV 2021 (arXiv)
0
citations

PoinTr: Diverse Point Cloud Completion With Geometry-Aware Transformers

ICCV 2021 (arXiv)
0
citations

RandomRooms: Unsupervised Pre-Training From Synthetic Shapes and Randomized Layouts for 3D Object Detection

ICCV 2021 (arXiv)
0
citations

NerfingMVS: Guided Optimization of Neural Radiance Fields for Indoor Multi-View Stereo

ICCV 2021 (arXiv)
0
citations

Towards Interpretable Deep Metric Learning With Structural Matching

ICCV 2021 (arXiv)
0
citations

Counterfactual Attention Learning for Fine-Grained Visual Categorization and Re-Identification

ICCV 2021 (arXiv)
0
citations

Take-A-Photo: 3D-to-2D Generative Pre-training of Point Cloud Models

ICCV 2023
0
citations

TCOVIS: Temporally Consistent Online Video Instance Segmentation

ICCV 2023 (arXiv)
0
citations

Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models

CVPR 2025
0
citations

AMixer: Adaptive Weight Mixing for Self-Attention Free Vision Transformers

ECCV 2022
0
citations

LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection

ECCV 2022
0
citations

Unleashing Text-to-Image Diffusion Models for Visual Perception

ICCV 2023 (arXiv)
0
citations

X-3D: Explicit 3D Structure Modeling for Point Cloud Recognition

CVPR 2024
0
citations

Learning Globally Optimized Object Detector via Policy Gradient

CVPR 2018
0
citations

Spherical Fractal Convolutional Neural Networks for Point Cloud Recognition

CVPR 2019
0
citations

COIN: A Large-Scale Dataset for Comprehensive Instructional Video Analysis

CVPR 2019
0
citations

Global-Local Bidirectional Reasoning for Unsupervised Representation Learning of 3D Point Clouds

CVPR 2020 (arXiv)
0
citations

Deep Face Super-Resolution With Iterative Collaboration Between Attentive Recovery and Landmark Estimation

CVPR 2020 (arXiv)
0
citations

Structure-Preserving Super Resolution With Gradient Guidance

CVPR 2020 (arXiv)
0
citations

Global Filter Networks for Image Classification

NeurIPS 2021
0
citations

DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification

NeurIPS 2021 (arXiv)
0
citations

HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions

NeurIPS 2022
0
citations

P2P: Tuning Pre-trained Image Models for Point Cloud Analysis with Point-to-Pixel Prompting

NeurIPS 2022
0
citations

UniPC: A Unified Predictor-Corrector Framework for Fast Sampling of Diffusion Models

NeurIPS 2023
0
citations