Papers (124)
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
ICLR 2024
1,366
citations
Mitigating Object Hallucinations in Large Vision-Language Models through Visual Contrastive Decoding
CVPR 2024
449
citations
Cascade Graph Neural Networks for RGB-D Salient Object Detection
ECCV 2020
113
citations
DriveArena: A Closed-loop Generative Simulation Platform for Autonomous Driving
ICCV 2025
58
citations
VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM
CVPR 2025
40
citations
Enhancing Visual Document Understanding with Contrastive Learning in Large Visual-Language Models
CVPR 2024
37
citations
PolaFormer: Polarity-aware Linear Attention for Vision Transformers
ICLR 2025
35
citations
Multi-Space Alignments Towards Universal LiDAR Segmentation
CVPR 2024
30
citations
Insect-Foundation: A Foundation Model and Large-scale 1M Dataset for Visual Insect Understanding
CVPR 2024
29
citations
The Curse of Multi-Modalities: Evaluating Hallucinations of Large Multimodal Models across Language, Visual, and Audio
NeurIPS 2025
26
citations
AlignMamba: Enhancing Multimodal Mamba with Local and Global Cross-modal Alignment
CVPR 2025
17
citations
KD-DETR: Knowledge Distillation for Detection Transformer with Consistent Distillation Points Sampling
CVPR 2024
16
citations
Commonsense Prototype for Outdoor Unsupervised 3D Object Detection
CVPR 2024
16
citations
USE: Universal Segment Embeddings for Open-Vocabulary Image Segmentation
CVPR 2024
13
citations
LongPO: Long Context Self-Evolution of Large Language Models through Short-to-Long Preference Optimization
ICLR 2025
12
citations
Grab What You Need: Rethinking Complex Table Structure Recognition with Flexible Components Deliberation
AAAI 2024
10
citations
V2X-R: Cooperative LiDAR-4D Radar Fusion with Denoising Diffusion for 3D Object Detection
CVPR 2025
10
citations
MobileInst: Video Instance Segmentation on the Mobile
AAAI 2024
10
citations
Surf-D: Generating High-Quality Surfaces of Arbitrary Topologies Using Diffusion Models
ECCV 2024
10
citations
CADDreamer: CAD Object Generation from Single-view Images
CVPR 2025
9
citations
Inverse Weight-Balancing for Deep Long-Tailed Learning
AAAI 2024
7
citations
MetaCARD: Meta-Reinforcement Learning with Task Uncertainty Feedback via Decoupled Context-Aware Reward and Dynamics Components
AAAI 2024
4
citations
TIV-Diffusion: Towards Object-Centric Movement for Text-driven Image to Video Generation
AAAI 2025
4
citations
Symbolic Neural Ordinary Differential Equations
AAAI 2025
3
citations
MetaAT: Active Testing for Label-Efficient Evaluation of Dense Recognition Tasks
ECCV 2024
2
citations
RaSS: Improving Denoising Diffusion Samplers with Reinforced Active Sampling Scheduler
CVPR 2025
2
citations
Learning Latent Dynamic Robust Representations for World Models
ICML 2024
0
citations
A Unified Adaptive Testing System Enabled by Hierarchical Structure Search
ICML 2024
0
citations
Simplified Mirror-Based Camera Pose Computation via Rotation Averaging
CVPR 2015
0
citations
Object-Aware Dense Semantic Correspondence
CVPR 2017
0
citations
NM-Net: Mining Reliable Neighbors for Robust Feature Correspondences
CVPR 2019
0
citations
Target-Aware Deep Tracking
CVPR 2019
0
citations
RF-Net: An End-To-End Image Matching Network Based on Receptive Field
CVPR 2019
0
citations
LO-Net: Deep Real-Time Lidar Odometry
CVPR 2019
0
citations
Partial Order Pruning: For Best Speed/Accuracy Trade-Off in Neural Architecture Search
CVPR 2019
0
citations
Probabilistic Model Distillation for Semantic Correspondence
CVPR 2021
0
citations
Learning Semantic Person Image Generation by Region-Adaptive Normalization
CVPR 2021
0
citations
Mutual Graph Learning for Camouflaged Object Detection
CVPR 2021
0
citations
Drafting and Revision: Laplacian Pyramid Network for Fast High-Quality Artistic Style Transfer
CVPR 2021
0
citations
Multi-Object Tracking Meets Moving UAV
CVPR 2022
0
citations
Learning Optical Flow With Kernel Patch Attention
CVPR 2022
0
citations
Unsupervised Learning of Accurate Siamese Tracking
CVPR 2022
0
citations
Towards Bidirectional Arbitrary Image Rescaling: Joint Optimization and Cycle Idempotence
CVPR 2022
0
citations
DirecFormer: A Directed Attention in Transformer Approach to Robust Action Recognition
CVPR 2022
0
citations
NomMer: Nominate Synergistic Context in Vision Transformer for Visual Recognition
CVPR 2022
0
citations
Neural Collaborative Graph Machines for Table Structure Recognition
CVPR 2022
0
citations
SCPNet: Semantic Scene Completion on Point Cloud
CVPR 2023
0
citations
Learning Distortion Invariant Representation for Image Restoration From a Causality Perspective
CVPR 2023
0
citations
LoGoNet: Towards Accurate 3D Object Detection With Local-to-Global Cross-Modal Fusion
CVPR 2023
0
citations
Self-Supervised Non-Uniform Kernel Estimation With Flow-Based Motion Prior for Blind Image Deblurring
CVPR 2023
0
citations
Micron-BERT: BERT-Based Facial Micro-Expression Recognition
CVPR 2023
0
citations
Vector Quantization With Self-Attention for Quality-Independent Representation Learning
CVPR 2023
0
citations
Virtual Sparse Convolution for Multimodal 3D Object Detection
CVPR 2023
0
citations
Low-Rank Tensor Approximation With Laplacian Scale Mixture Modeling for Multiframe Image Denoising
ICCV 2015
0
citations
3D Fragment Reassembly Using Integrated Template Guidance and Fracture-Region Matching
ICCV 2015
0
citations
Semi-Supervised Zero-Shot Classification With Label Representation Learning
ICCV 2015
0
citations
FoveaNet: Perspective-Aware Urban Scene Parsing
ICCV 2017
0
citations
SBGAR: Semantics Based Group Activity Recognition
ICCV 2017
0
citations
Video Scene Parsing With Predictive Feature Learning
ICCV 2017
0
citations
Adversarial Examples Detection in Deep Networks With Convolutional Filter Statistics
ICCV 2017
0
citations
BMN: Boundary-Matching Network for Temporal Action Proposal Generation
ICCV 2019
0
citations
Semantics-Enhanced Adversarial Nets for Text-to-Image Synthesis
ICCV 2019
0
citations
Paint Transformer: Feed Forward Neural Painting With Stroke Prediction
ICCV 2021
0
citations
Saliency-Associated Object Tracking
ICCV 2021
0
citations
AdaAttN: Revisit Attention Mechanism in Arbitrary Neural Style Transfer
ICCV 2021
0
citations
Uncertainty-Guided Transformer Reasoning for Camouflaged Object Detection
ICCV 2021
0
citations
CoIn: Contrastive Instance Feature Mining for Outdoor 3D Object Detection with Very Limited Annotations
ICCV 2023
0
citations
UniSeg: A Unified Multi-Modal LiDAR Segmentation Network and the OpenPCSeg Codebase
ICCV 2023
0
citations
Surface Extraction from Neural Unsigned Distance Fields
ICCV 2023
0
citations
DetZero: Rethinking Offboard 3D Object Detection with Long-term Sequential Point Clouds
ICCV 2023
0
citations
Robo3D: Towards Robust and Reliable 3D Perception against Corruptions
ICCV 2023
0
citations
Low-Light Image Enhancement with Multi-Stage Residue Quantization and Brightness-Aware Attention
ICCV 2023
0
citations
Batch-based Model Registration for Fast 3D Sherd Reconstruction
ICCV 2023
0
citations
Fast Full-frame Video Stabilization with Iterative Optimization
ICCV 2023
0
citations
LMR: A Large-Scale Multi-Reference Dataset for Reference-Based Super-Resolution
ICCV 2023
0
citations
Constraining Depth Map Geometry for Multi-View Stereo: A Dual-Depth Approach with Saddle-shaped Depth Cells
ICCV 2023
0
citations
CiteTracker: Correlating Image and Text for Visual Tracking
ICCV 2023
0
citations
LIRA: Lifelong Image Restoration from Unknown Blended Distortions
ECCV 2020
0
citations
DDGCN: A Dynamic Directed Graph Convolutional Network for Action Recognition
ECCV 2020
0
citations
Sparse-to-Dense Depth Completion Revisited: Sampling Strategy and Graph Construction
ECCV 2020
0
citations
Learning Disentangled Feature Representation for Hybrid-distorted Image Restoration
ECCV 2020
0
citations
Uncertainty Learning in Kernel Estimation for Multi-stage Blind Image Super-Resolution
ECCV 2022
0
citations
Neural Color Operators for Sequential Image Retouching
ECCV 2022
0
citations
RRSR: Reciprocal Reference-Based Image Super-Resolution with Progressive Feature Alignment and Selection
ECCV 2022
0
citations
Self-Feature Distillation with Uncertainty Modeling for Degraded Image Recognition
ECCV 2022
0
citations
Homogeneous Multi-modal Feature Fusion and Interaction for 3D Object Detection
ECCV 2022
0
citations
Learning Parametric Sparse Models for Image Super-Resolution
NeurIPS 2016
0
citations
GAFlow: Incorporating Gaussian Attention into Optical Flow
ICCV 2023
0
citations
ECBench: Can Multi-modal Foundation Models Understand the Egocentric World? A Holistic Embodied Cognition Benchmark
CVPR 2025
0
citations
Breaking the Memory Barrier of Contrastive Loss via Tile-Based Strategy
CVPR 2025
0
citations
Parameterized Blur Kernel Prior Learning for Local Motion Deblurring
CVPR 2025
0
citations
Gain from Neighbors: Boosting Model Robustness in the Wild via Adversarial Perturbations Toward Neighboring Classes
CVPR 2025
0
citations
2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining
ICCV 2025
0
citations
Motal: Unsupervised 3D Object Detection by Modality and Task-specific Knowledge Transfer
ICCV 2025
0
citations
Controllable 3D Outdoor Scene Generation via Scene Graphs
ICCV 2025
0
citations
ViT-Split: Unleashing the Power of Vision Foundation Models via Efficient Splitting Heads
ICCV 2025
0
citations
CoStoDet-DDPM: Collaborative Training of Stochastic and Deterministic Models Improves Surgical Workflow Anticipation and Recognition
ICCV 2025
0
citations
Multi-Perspective Consolidation Enhanced Cognitive Diagnosis via Conditional Diffusion Model
AAAI 2025
0
citations
Training-Free Image Manipulation Localization Using Diffusion Models
AAAI 2025
0
citations
Automated Creation of Reusable and Diverse Toolsets for Enhancing LLM Reasoning
AAAI 2025
0
citations
Sunshine to Rainstorm: Cross-Weather Knowledge Distillation for Robust 3D Object Detection
AAAI 2024
0
citations
Integrated Decision Gradients: Compute Your Attributions Where the Model Makes Its Decision
AAAI 2024
0
citations
Improving GNN Calibration with Discriminative Ability: Insights and Strategies
AAAI 2024
0
citations
Pushing the Limit of Fine-Tuning for Few-Shot Learning: Where Feature Reusing Meets Cross-Scale Attention
AAAI 2024
0
citations
SMILEtrack: SiMIlarity LEarning for Occlusion-Aware Multiple Object Tracking
AAAI 2024
0
citations
Is Vanilla MLP in Neural Radiance Field Enough for Few-shot View Synthesis?
CVPR 2024
0
citations
SeD: Semantic-Aware Discriminator for Image Super-Resolution
CVPR 2024
0
citations
RTracker: Recoverable Tracking via PN Tree Structured Memory
CVPR 2024
0
citations
KVQ: Kwai Video Quality Assessment for Short-form Videos
CVPR 2024
0
citations
HRVDA: High-Resolution Visual Document Assistant
CVPR 2024
0
citations
HINTED: Hard Instance Enhanced Detector with Mixed-Density Feature Fusion for Sparsely-Supervised 3D Object Detection
CVPR 2024
0
citations
From Fourier to Neural ODEs: Flow Matching for Modeling Complex Systems
ICML 2024
0
citations
Beyond Sole Strength: Customized Ensembles for Generalized Vision-Language Models
ICML 2024
0
citations
Video Object Segmentation with Adaptive Feature Bank and Uncertain-Region Refinement
NeurIPS 2020
0
citations
Uncertainty-Driven Loss for Single Image Super-Resolution
NeurIPS 2021
0
citations
DeepReduce: A Sparse-tensor Communication Framework for Federated Deep Learning
NeurIPS 2021
0
citations
Discrete Compositional Representations as an Abstraction for Goal Conditioned Reinforcement Learning
NeurIPS 2022
0
citations
AttCAT: Explaining Transformers via Attentive Class Activation Tokens
NeurIPS 2022
0
citations
UP-DP: Unsupervised Prompt Learning for Data Pre-Selection with Vision-Language Models
NeurIPS 2023
0
citations
A Bounded Ability Estimation for Computerized Adaptive Testing
NeurIPS 2023
0
citations
GraphAdapter: Tuning Vision-Language Models With Dual Knowledge Graph
NeurIPS 2023
0
citations
Understanding and Addressing the Pitfalls of Bisimulation-based Representations in Offline Reinforcement Learning
NeurIPS 2023
0
citations
GradOrth: A Simple yet Efficient Out-of-Distribution Detection with Orthogonal Projection of Gradients
NeurIPS 2023
0
citations
From Cloze to Comprehension: Retrofitting Pre-trained Masked Language Models to Pre-trained Machine Reader
NeurIPS 2023
0
citations