Papers (163)
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs
ICLR 2024
1,128
citations
Runtime Neural Pruning
NeurIPS 2017
509
citations
AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors
ICLR 2024
476
citations
Large Language Models Are Not Robust Multiple Choice Selectors
ICLR 2024
370
citations
Temporal Coherence or Temporal Motion: Which is More Critical for Video-based Person Re-identification?
ECCV 2020
81
citations
Graph-Based Social Relation Reasoning
ECCV 2020
51
citations
MetaDistiller: Network Self-Boosting via Meta-Learned Top-Down Distillation
ECCV 2020
41
citations
FlowIE: Efficient Image Enhancement via Rectified Flow
CVPR 2024
31
citations
LiDAR-based Person Re-identification
CVPR 2024
19
citations
EmbodiedOcc: Embodied 3D Occupancy Prediction for Vision-based Online Scene Understanding
ICCV 2025
16
citations
DPMesh: Exploiting Diffusion Prior for Occluded Human Mesh Recovery
CVPR 2024
16
citations
CL-MoE: Enhancing Multimodal Large Language Model with Dual Momentum Mixture-of-Experts for Continual Visual Question Answering
CVPR 2025 (arXiv)
10
citations
UniPre3D: Unified Pre-training of 3D Point Cloud Models with Cross-Modal Gaussian Splatting
CVPR 2025
6
citations
Enhancing Uncertainty Modeling with Semantic Graph for Hallucination Detection
AAAI 2025
6
citations
Continuous Visual Autoregressive Generation via Score Maximization
ICML 2025
5
citations
Secret Lies in Color: Enhancing AI-Generated Images Detection with Color Distribution Analysis
CVPR 2025
4
citations
Path Choice Matters for Clear Attributions in Path Methods
ICLR 2024
4
citations
Learning Dual-Level Deformable Implicit Representation for Real-World Scale Arbitrary Super-Resolution
ECCV 2024
4
citations
A Visual Leap in CLIP Compositionality Reasoning through Generation of Counterfactual Sets
ICCV 2025 (arXiv)
3
citations
FADE: Frequency-Aware Diffusion Model Factorization for Video Editing
CVPR 2025
2
citations
Can Large Language Models Help Multimodal Language Analysis? MMLA: A Comprehensive Benchmark
NeurIPS 2025
2
citations
D3QE: Learning Discrete Distribution Discrepancy-aware Quantization Error for Autoregressive-Generated Image Detection
ICCV 2025
1
citations
Efficient Deformable ConvNets: Rethinking Dynamic and Sparse Operator for Vision Applications
CVPR 2024
0
citations
Memory-based Adapters for Online 3D Scene Perception
CVPR 2024
0
citations
Towards Accurate Post-training Quantization for Diffusion Models
CVPR 2024
0
citations
Language Generation with Strictly Proper Scoring Rules
ICML 2024
0
citations
Exploring the Benefit of Activation Sparsity in Pre-training
ICML 2024
0
citations
On Prompt-Driven Safeguarding for Large Language Models
ICML 2024
0
citations
Few-Shot Character Understanding in Movies as an Assessment to Meta-Learning of Theory-of-Mind
ICML 2024
0
citations
Multi-Manifold Deep Metric Learning for Image Set Classification
CVPR 2015
0
citations
Deep Hashing for Compact Binary Codes Learning
CVPR 2015
0
citations
Learning Compact Binary Descriptors With Unsupervised Deep Neural Networks
CVPR 2016
0
citations
Learning Deep Binary Descriptor With Multi-Quantization
CVPR 2017
0
citations
Consistent-Aware Deep Learning for Person Re-Identification in a Camera Network
CVPR 2017
0
citations
Deep Adversarial Metric Learning
CVPR 2018 (arXiv)
0
citations
Deep Progressive Reinforcement Learning for Skeleton-Based Action Recognition
CVPR 2018
0
citations
Learning Globally Optimized Object Detector via Policy Gradient
CVPR 2018
0
citations
Deep Hashing via Discrepancy Minimization
CVPR 2018
0
citations
GraphBit: Bitwise Interaction Mining via Deep Reinforcement Learning
CVPR 2018
0
citations
Hardness-Aware Deep Metric Learning
CVPR 2019
0
citations
Spherical Fractal Convolutional Neural Networks for Point Cloud Recognition
CVPR 2019
0
citations
Learning Channel-Wise Interactions for Binary Convolutional Neural Networks
CVPR 2019
0
citations
Structural Relational Reasoning of Point Clouds
CVPR 2019
0
citations
Deep Fitting Degree Scoring Network for Monocular 3D Object Detection
CVPR 2019
0
citations
BridgeNet: A Continuity-Aware Probabilistic Network for Age Estimation
CVPR 2019
0
citations
COIN: A Large-Scale Dataset for Comprehensive Instructional Video Analysis
CVPR 2019
0
citations
UniformFace: Learning Deep Equidistributed Representation for Face Recognition
CVPR 2019
0
citations
Deep Embedding Learning With Discriminative Sampling Policy
CVPR 2019
0
citations
Enhanced Bayesian Compression via Deep Reinforcement Learning
CVPR 2019
0
citations
BiDet: An Efficient Binarized Object Detector
CVPR 2020 (arXiv)
0
citations
Deep Metric Learning via Adaptive Learnable Assessment
CVPR 2020
0
citations
Global-Local Bidirectional Reasoning for Unsupervised Representation Learning of 3D Point Clouds
CVPR 2020 (arXiv)
0
citations
Deep Face Super-Resolution With Iterative Collaboration Between Attentive Recovery and Landmark Estimation
CVPR 2020 (arXiv)
0
citations
Structure-Preserving Super Resolution With Gradient Guidance
CVPR 2020 (arXiv)
0
citations
Uncertainty-Aware Score Distribution Learning for Action Quality Assessment
CVPR 2020 (arXiv)
0
citations
Self-Supervised Video Hashing via Bidirectional Transformers
CVPR 2021
0
citations
Structure-Aware Face Clustering on a Large-Scale Graph With 10^7 Nodes
CVPR 2021
0
citations
Objects Are Different: Flexible Monocular 3D Object Detection
CVPR 2021 (arXiv)
0
citations
Deep Compositional Metric Learning
CVPR 2021
0
citations
Meta-Mining Discriminative Samples for Kinship Verification
CVPR 2021 (arXiv)
0
citations
Pseudo Facial Generation With Extreme Poses for Face Recognition
CVPR 2021
0
citations
WebFace260M: A Benchmark Unveiling the Power of Million-Scale Deep Face Recognition
CVPR 2021 (arXiv)
0
citations
Learning Probabilistic Ordinal Embeddings for Uncertainty-Aware Regression
CVPR 2021 (arXiv)
0
citations
PV-RAFT: Point-Voxel Correlation Fields for Scene Flow Estimation of Point Clouds
CVPR 2021
0
citations
HyperDet3D: Learning a Scene-Conditioned 3D Object Detector
CVPR 2022 (arXiv)
0
citations
Stochastic Trajectory Prediction via Motion Indeterminacy Diffusion
CVPR 2022 (arXiv)
0
citations
Bridge-Prompt: Towards Ordinal Action Understanding in Instructional Videos
CVPR 2022
0
citations
FineDiving: A Fine-Grained Dataset for Procedure-Aware Action Quality Assessment
CVPR 2022 (arXiv)
0
citations
Back to Reality: Weakly-Supervised 3D Object Detection With Shape-Guided Label Enhancement
CVPR 2022 (arXiv)
0
citations
Dimension Embeddings for Monocular 3D Object Detection
CVPR 2022
0
citations
Point-BERT: Pre-Training 3D Point Cloud Transformers With Masked Point Modeling
CVPR 2022
0
citations
DenseCLIP: Language-Guided Dense Prediction With Context-Aware Prompting
CVPR 2022 (arXiv)
0
citations
Attributable Visual Similarity Learning
CVPR 2022 (arXiv)
0
citations
SemAffiNet: Semantic-Affine Transformation for Point Cloud Segmentation
CVPR 2022 (arXiv)
0
citations
Shapley-NAS: Discovering Operation Contribution for Neural Architecture Search
CVPR 2022
0
citations
Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction
CVPR 2023 (arXiv)
0
citations
Towards All-in-One Pre-Training via Maximizing Multi-Modal Mutual Information
CVPR 2023 (arXiv)
0
citations
LOGO: A Long-Form Video Dataset for Group Action Quality Assessment
CVPR 2023
0
citations
Deep Factorized Metric Learning
CVPR 2023
0
citations
BEVFormer v2: Adapting Modern Image Backbones to Bird's-Eye-View Recognition via Perspective Supervision
CVPR 2023
0
citations
FLAG3D: A 3D Fitness Activity Dataset With Language Instruction
CVPR 2023 (arXiv)
0
citations
Diffusion-SDF: Text-To-Shape via Voxelized Diffusion
CVPR 2023
0
citations
Siamese Image Modeling for Self-Supervised Vision Representation Learning
CVPR 2023 (arXiv)
0
citations
Binarizing Sparse Convolutional Networks for Efficient Point Cloud Analysis
CVPR 2023 (arXiv)
0
citations
DiffTalk: Crafting Diffusion Models for Generalized Audio-Driven Portraits Animation
CVPR 2023 (arXiv)
0
citations
DiffSwap: High-Fidelity and Controllable Face Swapping via 3D-Aware Masked Diffusion
CVPR 2023
0
citations
Multiple Feature Fusion via Weighted Entropy for Visual Tracking
ICCV 2015
0
citations
Simultaneous Local Binary Feature Learning and Encoding for Face Recognition
ICCV 2015
0
citations
Local Subspace Collaborative Tracking
ICCV 2015
0
citations
Learning Discriminative Aggregation Network for Video-Based Face Recognition
ICCV 2017
0
citations
Attention-Aware Deep Reinforcement Learning for Video Face Recognition
ICCV 2017
0
citations
Cross-Modal Deep Variational Hashing
ICCV 2017
0
citations
Neighborhood Preserving Hashing for Scalable Video Retrieval
ICCV 2019
0
citations
Deep Meta Metric Learning
ICCV 2019
0
citations
Self-Critical Attention Learning for Person Re-Identification
ICCV 2019
0
citations
Robust Variational Bayesian Point Set Registration
ICCV 2019
0
citations
Group-Aware Contrastive Regression for Action Quality Assessment
ICCV 2021 (arXiv)
0
citations
Instance Similarity Learning for Unsupervised Feature Representation
ICCV 2021 (arXiv)
0
citations
PoinTr: Diverse Point Cloud Completion With Geometry-Aware Transformers
ICCV 2021 (arXiv)
0
citations
Frequency-Aware Spatiotemporal Transformers for Video Inpainting Detection
ICCV 2021
0
citations
Deep Relational Metric Learning
ICCV 2021 (arXiv)
0
citations
RandomRooms: Unsupervised Pre-Training From Synthetic Shapes and Randomized Layouts for 3D Object Detection
ICCV 2021 (arXiv)
0
citations
NerfingMVS: Guided Optimization of Neural Radiance Fields for Indoor Multi-View Stereo
ICCV 2021 (arXiv)
0
citations
Generalizable Mixed-Precision Quantization via Attribution Rank Preservation
ICCV 2021 (arXiv)
0
citations
Text-guided Sparse Voxel Pruning for Efficient 3D Visual Grounding
CVPR 2025
0
citations
Gait Recognition in the Wild: A Benchmark
ICCV 2021
0
citations
Human Trajectory Prediction via Counterfactual Analysis
ICCV 2021 (arXiv)
0
citations
Counterfactual Attention Learning for Fine-Grained Visual Categorization and Re-Identification
ICCV 2021 (arXiv)
0
citations
OPERA: Omni-Supervised Representation Learning with Hierarchical Supervisions
ICCV 2023 (arXiv)
0
citations
Take-A-Photo: 3D-to-2D Generative Pre-training of Point Cloud Models
ICCV 2023
0
citations
Token-Label Alignment for Vision Transformers
ICCV 2023 (arXiv)
0
citations
Skip-Plan: Procedure Planning in Instructional Videos via Condensed Action Space Learning
ICCV 2023
0
citations
TCOVIS: Temporally Consistent Online Video Instance Segmentation
ICCV 2023 (arXiv)
0
citations
CLIP-Cluster: CLIP-Guided Attribute Hallucination for Face Clustering
ICCV 2023
0
citations
SurroundOcc: Multi-camera 3D Occupancy Prediction for Autonomous Driving
ICCV 2023 (arXiv)
0
citations
Unleashing Text-to-Image Diffusion Models for Visual Perception
ICCV 2023 (arXiv)
0
citations
Deep Credible Metric Learning for Unsupervised Domain Adaptation Person Re-identification
ECCV 2020
0
citations
Reinforced Axial Refinement Network for Monocular 3D Object Detection
ECCV 2020
0
citations
Structural Deep Metric Learning for Room Layout Estimation
ECCV 2020
0
citations
Deep Hashing with Active Pairwise Supervision
ECCV 2020
0
citations
Rotation-robust Intersection over Union for 3D Object Detection
ECCV 2020
0
citations
Spatial Geometric Reasoning for Room Layout Estimation via Deep Reinforcement Learning
ECCV 2020
0
citations
Shap-CAM: Visual Explanations for Convolutional Neural Networks Based on Shapley Value
ECCV 2022
0
citations
Label2Label: A Language Modeling Framework for Multi-Attribute Learning
ECCV 2022
0
citations
Learning Dynamic Facial Radiance Fields for Few-Shot Talking Head Synthesis
ECCV 2022
0
citations
Learning Series-Parallel Lookup Tables for Efficient Image Super-Resolution
ECCV 2022
0
citations
AMixer: Adaptive Weight Mixing for Self-Attention Free Vision Transformers
ECCV 2022
0
citations
Dynamic Metric Learning with Cross-Level Concept Distillation
ECCV 2022
0
citations
LiDAR Distillation: Bridging the Beam-Induced Domain Gap for 3D Object Detection
ECCV 2022
0
citations
Are You Talking to a Machine? Dataset and Methods for Multilingual Image Question
NeurIPS 2015 (arXiv)
0
citations
Towards Interpretable Deep Metric Learning With Structural Matching
ICCV 2021 (arXiv)
0
citations
EfficientLLaVA: Generalizable Auto-Pruning for Large Vision-language Models
CVPR 2025
0
citations
GaussianWorld: Gaussian World Model for Streaming 3D Occupancy Prediction
CVPR 2025
0
citations
UniGoal: Towards Universal Zero-shot Goal-oriented Navigation
CVPR 2025
0
citations
Learning Counterfactually Decoupled Attention for Open-World Model Attribution
ICCV 2025
0
citations
EFTViT: Efficient Federated Training of Vision Transformers with Masked Images on Resource-Constrained Clients
ICCV 2025
0
citations
IGL-Nav: Incremental 3D Gaussian Localization for Image-goal Navigation
ICCV 2025
0
citations
WalkVLM: Aid Visually Impaired People Walking by Vision Language Model
ICCV 2025
0
citations
MCID: Multi-aspect Copyright Infringement Detection for Generated Images
ICCV 2025
0
citations
Authentic 4D Driving Simulation with a Video Generation Model
ICCV 2025
0
citations
SpectralAR: Spectral Autoregressive Visual Generation
ICCV 2025
0
citations
Entropy-Adaptive Diffusion Policy Optimization with Dynamic Step Alignment
ICCV 2025
0
citations
From Imitation to Innovation: The Emergence of AI's Unique Artistic Styles and the Challenge of Copyright Protection
ICCV 2025
0
citations
Learning with Open-world Noisy Data via Class-independent Margin in Dual Representation Space
AAAI 2025
0
citations
Teaching Large Language Models to Translate with Comparison
AAAI 2024
0
citations
MELO: Enhancing Model Editing with Neuron-Indexed Dynamic LoRA
AAAI 2024
0
citations
Tree-of-Reasoning Question Decomposition for Complex Question Answering with Large Language Models
AAAI 2024
0
citations
Learning Multi-Scale Video-Text Correspondence for Weakly Supervised Temporal Article Grounding
AAAI 2024
0
citations
Generative Multi-Modal Knowledge Retrieval with Large Language Models
AAAI 2024
0
citations
Auto MC-Reward: Automated Dense Reward Design with Large Language Models for Minecraft
CVPR 2024
0
citations
LowRankOcc: Tensor Decomposition and Low-Rank Recovery for Vision-based 3D Semantic Occupancy Prediction
CVPR 2024
0
citations
SelfOcc: Self-Supervised Vision-Based 3D Occupancy Prediction
CVPR 2024
0
citations
Global Filter Networks for Image Classification
NeurIPS 2021
0
citations
DynamicViT: Efficient Vision Transformers with Dynamic Token Sparsification
NeurIPS 2021 (arXiv)
0
citations
Topology-Imbalance Learning for Semi-Supervised Node Classification
NeurIPS 2021
0
citations
HorNet: Efficient High-Order Spatial Interactions with Recursive Gated Convolutions
NeurIPS 2022
0
citations
P2P: Tuning Pre-trained Image Models for Point Cloud Analysis with Point-to-Pixel Prompting
NeurIPS 2022
0
citations
A Win-win Deal: Towards Sparse and Robust Pre-trained Language Models
NeurIPS 2022
0
citations
OrdinalCLIP: Learning Rank Prompts for Language-Guided Ordinal Regression
NeurIPS 2022
0
citations
MCUFormer: Deploying Vision Transformers on Microcontrollers with Limited Memory
NeurIPS 2023
0
citations
UniPC: A Unified Predictor-Corrector Framework for Fast Sampling of Diffusion Models
NeurIPS 2023
0
citations
VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks
NeurIPS 2023
0
citations
Fed-FA: Theoretically Modeling Client Data Divergence for Federated Language Backdoor Defense
NeurIPS 2023
0
citations