Most Cited CVPR "3d all-atom models" Papers

5,589 papers found • Page 24 of 28

#4601

LayoutFormer: Hierarchical Text Detection Towards Scene Text Understanding

Min Liang, Jia-Wei Ma, Xiaobin Zhu et al.

CVPR 2024 • poster
#4602

Can Generative Video Models Help Pose Estimation?

Ruojin Cai, Jason Y. Zhang, Philipp Henzler et al.

CVPR 2025 • highlight • arXiv:2412.16155
#4603

ICE: Intrinsic Concept Extraction from a Single Image via Diffusion Models

Fernando Julio Cendra, Kai Han

CVPR 2025 • highlight • arXiv:2503.19902
#4604

On the Diversity and Realism of Distilled Dataset: An Efficient Dataset Distillation Paradigm

Peng Sun, Bei Shi, Daiwei Yu et al.

CVPR 2024 • poster • arXiv:2312.03526
#4605

HumanGaussian: Text-Driven 3D Human Generation with Gaussian Splatting

Xian Liu, Xiaohang Zhan, Jiaxiang Tang et al.

CVPR 2024 • highlight • arXiv:2311.17061
#4606

Depth Prompting for Sensor-Agnostic Depth Estimation

Jin-Hwi Park, Chanhwi Jeong, Junoh Lee et al.

CVPR 2024 • poster • arXiv:2405.11867
#4607

Modality-Collaborative Test-Time Adaptation for Action Recognition

Baochen Xiong, Xiaoshan Yang, Yaguang Song et al.

CVPR 2024 • poster
#4608

DRESS: Instructing Large Vision-Language Models to Align and Interact with Humans via Natural Language Feedback

Yangyi Chen, Karan Sikka, Michael Cogswell et al.

CVPR 2024 • poster • arXiv:2311.10081
#4609

Rethinking Inductive Biases for Surface Normal Estimation

Gwangbin Bae, Andrew J. Davison

CVPR 2024 • poster • arXiv:2403.00712
#4610

Visual Layout Composer: Image-Vector Dual Diffusion Model for Design Layout Generation

Mohammad Amin Shabani, Zhaowen Wang, Difan Liu et al.

CVPR 2024 • poster
#4611

Pose-Guided Self-Training with Two-Stage Clustering for Unsupervised Landmark Discovery

Siddharth Tourani, Ahmed Alwheibi, Arif Mahmood et al.

CVPR 2024 • highlight • arXiv:2403.16194
#4612

Exploiting Deblurring Networks for Radiance Fields

Haeyun Choi, Heemin Yang, Janghyeok Han et al.

CVPR 2025 • poster • arXiv:2502.14454
#4613

CH3Depth: Efficient and Flexible Depth Foundation Model with Flow Matching

Jiaqi Li, Yiran Wang, Jinghong Zheng et al.

CVPR 2025 • highlight
#4614

OVMR: Open-Vocabulary Recognition with Multi-Modal References

Zehong Ma, Shiliang Zhang, Longhui Wei et al.

CVPR 2024 • poster • arXiv:2406.04675
#4615

AETTA: Label-Free Accuracy Estimation for Test-Time Adaptation

Taeckyung Lee, Sorn Chottananurak, Taesik Gong et al.

CVPR 2024 • poster • arXiv:2404.01351
#4616

EventFly: Event Camera Perception from Ground to the Sky

Lingdong Kong, Dongyue Lu, Xiang Xu et al.

CVPR 2025 • poster • arXiv:2503.19916
#4617

A Simple Recipe for Language-guided Domain Generalized Segmentation

Mohammad Fahes, Tuan-Hung Vu, Andrei Bursuc et al.

CVPR 2024 • poster • arXiv:2311.17922
#4618

An Edit Friendly DDPM Noise Space: Inversion and Manipulations

Inbar Huberman-Spiegelglas, Vladimir Kulikov, Tomer Michaeli

CVPR 2024 • poster • arXiv:2304.06140
#4619

AdaShift: Learning Discriminative Self-Gated Neural Feature Activation With an Adaptive Shift Factor

Sudong Cai

CVPR 2024 • poster
#4620

EffiDec3D: An Optimized Decoder for High-Performance and Efficient 3D Medical Image Segmentation

Md Mostafijur Rahman, Radu Marculescu

CVPR 2025 • highlight
#4621

PredToken: Predicting Unknown Tokens and Beyond with Coarse-to-Fine Iterative Decoding

Xuesong Nie, Haoyuan Jin, Yunfeng Yan et al.

CVPR 2024 • poster
#4622

3D-Mem: 3D Scene Memory for Embodied Exploration and Reasoning

Yuncong Yang, Han Yang, Jiachen Zhou et al.

CVPR 2025 • poster • arXiv:2411.17735
#4623

Holistic Features are almost Sufficient for Text-to-Video Retrieval

Kaibin Tian, Ruixiang Zhao, Zijie Xin et al.

CVPR 2024 • poster
#4624

Causal Mode Multiplexer: A Novel Framework for Unbiased Multispectral Pedestrian Detection

Taeheon Kim, Sebin Shin, Youngjoon Yu et al.

CVPR 2024 • poster • arXiv:2403.01300
#4625

Seeing the Unseen: Visual Common Sense for Semantic Placement

Ram Ramrakhya, Aniruddha Kembhavi, Dhruv Batra et al.

CVPR 2024 • poster • arXiv:2401.07770
#4626

Diffuse, Attend, and Segment: Unsupervised Zero-Shot Segmentation using Stable Diffusion

Junjiao Tian, Lavisha Aggarwal, Andrea Colaco et al.

CVPR 2024 • poster
#4627

GOAT-Bench: A Benchmark for Multi-Modal Lifelong Navigation

Mukul Khanna, Ram Ramrakhya, Gunjan Chhablani et al.

CVPR 2024 • poster • arXiv:2404.06609
#4628

WonderJourney: Going from Anywhere to Everywhere

Hong-Xing Yu, Haoyi Duan, Junhwa Hur et al.

CVPR 2024 • poster • arXiv:2312.03884
#4629

HistoFS: Non-IID Histopathologic Whole Slide Image Classification via Federated Style Transfer with RoI-Preserving

Farchan Hakim Raswa, Chun-Shien Lu, Jia-Ching Wang

CVPR 2025 • poster
#4630

Efficient Dynamic Scene Editing via 4D Gaussian-based Static-Dynamic Separation

Joohyun Kwon, Hanbyel Cho, Junmo Kim

CVPR 2025 • poster • arXiv:2502.02091
#4631

Design2GarmentCode: Turning Design Concepts to Tangible Garments Through Program Synthesis

Feng Zhou, Ruiyang Liu, Chen Liu et al.

CVPR 2025 • poster • arXiv:2412.08603
#4632

CLIP-Driven Open-Vocabulary 3D Scene Graph Generation via Cross-Modality Contrastive Learning

Lianggangxu Chen, Xuejiao Wang, Jiale Lu et al.

CVPR 2024 • highlight
#4633

VisionZip: Longer is Better but Not Necessary in Vision Language Models

Senqiao Yang, Yukang Chen, Zhuotao Tian et al.

CVPR 2025 • poster • arXiv:2412.04467
#4634

Explainable Saliency: Articulating Reasoning with Contextual Prioritization

Nuo Chen, Ming Jiang, Qi Zhao

CVPR 2025 • poster
#4635

Training-Free Open-Vocabulary Segmentation with Offline Diffusion-Augmented Prototype Generation

Luca Barsellotti, Roberto Amoroso, Marcella Cornia et al.

CVPR 2024 • poster • arXiv:2404.06542
#4636

Exposure-slot: Exposure-centric Representations Learning with Slot-in-Slot Attention for Region-aware Exposure Correction

Donggoo Jung, Daehyun Kim, Guanghui Wang et al.

CVPR 2025 • poster
#4637

HRVDA: High-Resolution Visual Document Assistant

Chaohu Liu, Kun Yin, Haoyu Cao et al.

CVPR 2024 • poster • arXiv:2404.06918
#4638

A Dual-Augmentor Framework for Domain Generalization in 3D Human Pose Estimation

Qucheng Peng, Ce Zheng, Chen Chen

CVPR 2024 • poster • arXiv:2403.11310
#4639

Building Bridges across Spatial and Temporal Resolutions: Reference-Based Super-Resolution via Change Priors and Conditional Diffusion Model

Runmin Dong, Shuai Yuan, Bin Luo et al.

CVPR 2024 • poster • arXiv:2403.17460
#4640

Gaussian Shading: Provable Performance-Lossless Image Watermarking for Diffusion Models

Zijin Yang, Kai Zeng, Kejiang Chen et al.

CVPR 2024 • poster • arXiv:2404.04956
#4641

Unlocking the Potential of Unlabeled Data in Semi-Supervised Domain Generalization

Dongkwan Lee, Kyomin Hwang, Nojun Kwak

CVPR 2025 • poster • arXiv:2503.13915
#4642

Multimodal Sense-Informed Forecasting of 3D Human Motions

Zhenyu Lou, Qiongjie Cui, Haofan Wang et al.

CVPR 2024 • poster
#4643

Resolution Limit of Single-Photon LiDAR

Stanley H. Chan, Hashan K Weerasooriya, Weijian Zhang et al.

CVPR 2024 • poster • arXiv:2403.17719
#4644

Correlation-aware Coarse-to-fine MLPs for Deformable Medical Image Registration

Mingyuan Meng, Dagan Feng, Lei Bi et al.

CVPR 2024 • poster • arXiv:2406.00123
#4645

Ego-Exo4D: Understanding Skilled Human Activity from First- and Third-Person Perspectives

Kristen Grauman, Andrew Westbury, Lorenzo Torresani et al.

CVPR 2024 • poster • arXiv:2311.18259
#4646

RainyGS: Efficient Rain Synthesis with Physically-Based Gaussian Splatting

Qiyu Dai, Xingyu Ni, Qianfan Shen et al.

CVPR 2025 • poster • arXiv:2503.21442
#4647

CAD-SIGNet: CAD Language Inference from Point Clouds using Layer-wise Sketch Instance Guided Attention

Mohammad Sadil Khan, Elona Dupont, Sk Aziz Ali et al.

CVPR 2024 • highlight • arXiv:2402.17678
#4648

LAA-Net: Localized Artifact Attention Network for Quality-Agnostic and Generalizable Deepfake Detection

Dat Nguyen, Nesryne Mejri, Inder Pal Singh et al.

CVPR 2024 • poster • arXiv:2401.13856
#4649

The Devil is in the Details: StyleFeatureEditor for Detail-Rich StyleGAN Inversion and High Quality Image Editing

Denis Bobkov, Vadim Titov, Aibek Alanov et al.

CVPR 2024 • poster • arXiv:2406.10601
#4650

Make-It-Animatable: An Efficient Framework for Authoring Animation-Ready 3D Characters

Zhiyang Guo, Jinxu Xiang, Kai Ma et al.

CVPR 2025 • highlight • arXiv:2411.18197
#4651

Generative Quanta Color Imaging

Vishal Purohit, Junjie Luo, Yiheng Chi et al.

CVPR 2024 • poster • arXiv:2403.19066
#4652

Adaptive Random Feature Regularization on Fine-tuning Deep Neural Networks

Shin'ya Yamaguchi, Sekitoshi Kanai et al.

CVPR 2024 • poster
#4653

Self-Training Large Language Models for Improved Visual Program Synthesis With Visual Reinforcement

Zaid Khan, Vijay Kumar BG, Samuel Schulter et al.

CVPR 2024 • poster • arXiv:2404.04627
#4654

Generating Enhanced Negatives for Training Language-Based Object Detectors

Shiyu Zhao, Long Zhao, Vijay Kumar BG et al.

CVPR 2024 • poster • arXiv:2401.00094
#4655

Joint-Task Regularization for Partially Labeled Multi-Task Learning

Kento Nishi, Junsik Kim, Wanhua Li et al.

CVPR 2024 • poster • arXiv:2404.01976
#4656

MRFP: Learning Generalizable Semantic Segmentation from Sim-2-Real with Multi-Resolution Feature Perturbation

Sumanth Udupa, Prajwal Gurunath, Aniruddh Sikdar et al.

CVPR 2024 • poster • arXiv:2311.18331
#4657

ZoomLDM: Latent Diffusion Model for Multi-scale Image Generation

Srikar Yellapragada, Alexandros Graikos, Kostas Triaridis et al.

CVPR 2025 • poster • arXiv:2411.16969
#4658

Object Recognition as Next Token Prediction

Kaiyu Yue, Bor-Chun Chen, Jonas Geiping et al.

CVPR 2024 • highlight • arXiv:2312.02142
#4659

MuGE: Multiple Granularity Edge Detection

Caixia Zhou, Yaping Huang, Mengyang Pu et al.

CVPR 2024 • poster
#4660

STAR-Edge: Structure-aware Local Spherical Curve Representation for Thin-walled Edge Extraction from Unstructured Point Clouds

Zikuan Li, Honghua Chen, Yuecheng Wang et al.

CVPR 2025 • poster • arXiv:2503.00801
#4661

Shape Abstraction via Marching Differentiable Support Functions

Sunkyung Park, Jeongmin Lee, Dongjun Lee

CVPR 2025 • highlight
#4662

TensoFlow: Tensorial Flow-based Sampler for Inverse Rendering

Chun Gu, Xiaofei Wei, Li Zhang et al.

CVPR 2025 • poster • arXiv:2503.18328
#4663

Local-consistent Transformation Learning for Rotation-invariant Point Cloud Analysis

Yiyang Chen, Lunhao Duan, Shanshan Zhao et al.

CVPR 2024 • poster • arXiv:2403.11113
#4664

Spacetime Gaussian Feature Splatting for Real-Time Dynamic View Synthesis

Zhan Li, Zhang Chen, Zhong Li et al.

CVPR 2024 • poster • arXiv:2312.16812
#4665

LOTUS: Evasive and Resilient Backdoor Attacks through Sub-Partitioning

Siyuan Cheng, Guanhong Tao, Yingqi Liu et al.

CVPR 2024 • poster • arXiv:2403.17188
#4666

The More You See in 2D, the More You Perceive in 3D

Xinyang Han, Zelin Gao, Angjoo Kanazawa et al.

CVPR 2024 • highlight • arXiv:2404.03652
#4667

What, When, and Where? Self-Supervised Spatio-Temporal Grounding in Untrimmed Multi-Action Videos from Narrated Instructions

Brian Chen, Nina Shvetsova, Andrew Rouditchenko et al.

CVPR 2024 • poster • arXiv:2303.16990
#4668

RigGS: Rigging of 3D Gaussians for Modeling Articulated Objects in Videos

Yuxin Yao, Zhi Deng, Junhui Hou

CVPR 2025 • poster • arXiv:2503.16822
#4669

LoRA Recycle: Unlocking Tuning-Free Few-Shot Adaptability in Visual Foundation Models by Recycling Pre-Tuned LoRAs

Zixuan Hu, Yongxian Wei, Li Shen et al.

CVPR 2025 • poster
#4670

Structure-Aware Correspondence Learning for Relative Pose Estimation

Yihan Chen, Wenfei Yang, Huan Ren et al.

CVPR 2025 • highlight • arXiv:2503.18671
#4671

A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions

Jack Urbanek, Florian Bordes, Pietro Astolfi et al.

CVPR 2024 • poster • arXiv:2312.08578
#4672

Correlation-Decoupled Knowledge Distillation for Multimodal Sentiment Analysis with Incomplete Modalities

Mingcheng Li, Dingkang Yang, Xiao Zhao et al.

CVPR 2024 • poster • arXiv:2404.16456
#4673

ES³: Evolving Self-Supervised Learning of Robust Audio-Visual Speech Representations

Yuanhang Zhang, Shuang Yang, Shiguang Shan et al.

CVPR 2024 • poster
#4674

Depth-aware Test-Time Training for Zero-shot Video Object Segmentation

Weihuang Liu, Xi Shen, Haolun Li et al.

CVPR 2024 • poster • arXiv:2403.04258
#4675

MSU-4S - The Michigan State University Four Seasons Dataset

Daniel Kent, Mohammed Alyaqoub, Xiaohu Lu et al.

CVPR 2024 • poster
#4676

An Interactive Navigation Method with Effect-oriented Affordance

Xiaohan Wang, Yuehu Liu, Xinhang Song et al.

CVPR 2024 • poster
#4677

SegEarth-OV: Towards Training-Free Open-Vocabulary Segmentation for Remote Sensing Images

Kaiyu Li, Ruixun Liu, Xiangyong Cao et al.

CVPR 2025 • poster • arXiv:2410.01768
#4678

Rapid 3D Model Generation with Intuitive 3D Input

Tianrun Chen, Chaotao Ding, Shangzhan Zhang et al.

CVPR 2024 • highlight
#4679

Unsupervised Salient Instance Detection

Xin Tian, Ke Xu, Rynson W.H. Lau

CVPR 2024 • poster
#4680

Prior Does Matter: Visual Navigation via Denoising Diffusion Bridge Models

Hao Ren, Yiming Zeng, Zetong Bi et al.

CVPR 2025 • poster • arXiv:2504.10041
#4681

Multi-modal Instruction Tuned LLMs with Fine-grained Visual Perception

Junwen He, Yifan Wang, Lijun Wang et al.

CVPR 2024 • highlight • arXiv:2403.02969
#4682

CoDi-2: In-Context Interleaved and Interactive Any-to-Any Generation

Zineng Tang, Ziyi Yang, Mahmoud Khademi et al.

CVPR 2024 • highlight
#4683

PanoOcc: Unified Occupancy Representation for Camera-based 3D Panoptic Segmentation

Yuqi Wang, Yuntao Chen, Xingyu Liao et al.

CVPR 2024 • poster • arXiv:2306.10013
#4684

AdaBM: On-the-Fly Adaptive Bit Mapping for Image Super-Resolution

Cheeun Hong, Kyoung Mu Lee

CVPR 2024 • poster • arXiv:2404.03296
#4685

MULDE: Multiscale Log-Density Estimation via Denoising Score Matching for Video Anomaly Detection

Jakub Micorek, Horst Possegger, Dominik Narnhofer et al.

CVPR 2024 • poster • arXiv:2403.14497
#4686

Instance-level Expert Knowledge and Aggregate Discriminative Attention for Radiology Report Generation

Shenshen Bu, Taiji Li, Zhiming Dai et al.

CVPR 2024 • poster
#4687

HyperSDFusion: Bridging Hierarchical Structures in Language and Geometry for Enhanced 3D Text2Shape Generation

Zhiying Leng, Tolga Birdal, Xiaohui Liang et al.

CVPR 2024 • poster • arXiv:2403.00372
#4688

EDM: Equirectangular Projection-Oriented Dense Kernelized Feature Matching

Dongki Jung, Jaehoon Choi, Yonghan Lee et al.

CVPR 2025 • poster • arXiv:2502.20685
#4689

Masked Scene Modeling: Narrowing the Gap Between Supervised and Self-Supervised Learning in 3D Scene Understanding

Pedro Hermosilla, Christian Stippel, Leon Sick

CVPR 2025 • poster • arXiv:2504.06719
#4690

Just Add π! Pose Induced Video Transformers for Understanding Activities of Daily Living

Dominick Reilly, Srijan Das

CVPR 2024 • poster
#4691

PhysGaussian: Physics-Integrated 3D Gaussians for Generative Dynamics

Tianyi Xie, Zeshun Zong, Yuxing Qiu et al.

CVPR 2024 • highlight • arXiv:2311.12198
#4692

Viewpoint-Aware Visual Grounding in 3D Scenes

Xiangxi Shi, Zhonghua Wu, Stefan Lee

CVPR 2024 • poster
#4693

VideoSPatS: Video SPatiotemporal Splines for Disentangled Occlusion, Appearance and Motion Modeling and Editing

Juan Luis Gonzalez Bello, Xu Yao, Alex Whelan et al.

CVPR 2025 • poster • arXiv:2504.07146
#4694

SI-MIL: Taming Deep MIL for Self-Interpretability in Gigapixel Histopathology

Saarthak Kapse, Pushpak Pati, Srijan Das et al.

CVPR 2024 • poster • arXiv:2312.15010
#4695

h-Edit: Effective and Flexible Diffusion-Based Editing via Doob's h-Transform

Toan Nguyen, Kien Do, Duc Kieu et al.

CVPR 2025 • poster • arXiv:2503.02187
#4696

Long-Tail Class Incremental Learning via Independent Sub-prototype Construction

Xi Wang, Xu Yang, Jie Yin et al.

CVPR 2024 • poster
#4697

MM-OR: A Large Multimodal Operating Room Dataset for Semantic Understanding of High-Intensity Surgical Environments

Ege Özsoy, Chantal Pellegrini, Tobias Czempiel et al.

CVPR 2025 • poster • arXiv:2503.02579
#4698

VideoTree: Adaptive Tree-based Video Representation for LLM Reasoning on Long Videos

Ziyang Wang, Shoubin Yu, Elias Stengel-Eskin et al.

CVPR 2025 • poster • arXiv:2405.19209
#4699

An Aggregation-Free Federated Learning for Tackling Data Heterogeneity

Yuan Wang, Huazhu Fu, Renuga Kanagavelu et al.

CVPR 2024 • poster • arXiv:2404.18962
#4700

Repurposing Pre-trained Video Diffusion Models for Event-based Video Interpolation

Jingxi Chen, Brandon Y. Feng, Haoming Cai et al.

CVPR 2025 • poster • arXiv:2412.07761
#4701

Infrared Adversarial Car Stickers

Xiaopei Zhu, Yuqiu Liu, Zhanhao Hu et al.

CVPR 2024 • poster • arXiv:2405.09924
#4702

XFibrosis: Explicit Vessel-Fiber Modeling for Fibrosis Staging from Liver Pathology Images

Chong Yin, Siqi Liu, Fei Lyu et al.

CVPR 2024 • poster
#4703

Advancing Saliency Ranking with Human Fixations: Dataset, Models and Benchmarks

Bowen Deng, Siyang Song, Andrew French et al.

CVPR 2024 • poster
#4704

CALICO: Part-Focused Semantic Co-Segmentation with Large Vision-Language Models

Kiet A. Nguyen, Adheesh Juvekar, Tianjiao Yu et al.

CVPR 2025 • poster • arXiv:2412.19331
#4705

Implicit Event-RGBD Neural SLAM

Delin Qu, Chi Yan, Dong Wang et al.

CVPR 2024 • highlight • arXiv:2311.11013
#4706

Retraining-Free Model Quantization via One-Shot Weight-Coupling Learning

Chen Tang, Yuan Meng, Jiacheng Jiang et al.

CVPR 2024 • poster • arXiv:2401.01543
#4707

DreamControl: Control-Based Text-to-3D Generation with 3D Self-Prior

Tianyu Huang, Yihan Zeng, Zhilu Zhang et al.

CVPR 2024 • poster • arXiv:2312.06439
#4708

From Coarse to Fine-Grained Open-Set Recognition

Nico Lang, Vésteinn Snæbjarnarson, Elijah Cole et al.

CVPR 2024 • poster
#4709

Exploiting Inter-sample and Inter-feature Relations in Dataset Distillation

Wenxiao Deng, Wenbin Li, Tianyu Ding et al.

CVPR 2024 • poster • arXiv:2404.00563
#4710

Discriminative Pattern Calibration Mechanism for Source-Free Domain Adaptation

Haifeng Xia, Siyu Xia, Zhengming Ding

CVPR 2024 • poster
#4711

RAM-Avatar: Real-time Photo-Realistic Avatar from Monocular Videos with Full-body Control

Xiang Deng, Zerong Zheng, Yuxiang Zhang et al.

CVPR 2024 • poster
#4712

PhotoMaker: Customizing Realistic Human Photos via Stacked ID Embedding

Zhen Li, Mingdeng Cao, Xintao Wang et al.

CVPR 2024 • poster • arXiv:2312.04461
#4713

Privacy-Preserving Face Recognition Using Trainable Feature Subtraction

Yuxi Mi, Zhizhou Zhong, Yuge Huang et al.

CVPR 2024 • poster • arXiv:2403.12457
#4714

Arbitrary Motion Style Transfer with Multi-condition Motion Latent Diffusion Model

Wenfeng Song, Xingliang Jin, Shuai Li et al.

CVPR 2024 • poster
#4715

3DSFLabelling: Boosting 3D Scene Flow Estimation by Pseudo Auto-labelling

Chaokang Jiang, Guangming Wang, Jiuming Liu et al.

CVPR 2024 • poster • arXiv:2402.18146
#4716

Towards Effective and Sparse Adversarial Attack on Spiking Neural Networks via Breaking Invisible Surrogate Gradients

Li Lun, Kunyu Feng, Qinglong Ni et al.

CVPR 2025 • poster • arXiv:2503.03272
#4717

CPR-Coach: Recognizing Composite Error Actions based on Single-class Training

Shunli Wang, Shuaibing Wang, Dingkang Yang et al.

CVPR 2024 • poster • arXiv:2309.11718
#4718

Restoration by Generation with Constrained Priors

Zheng Ding, Xuaner Zhang, Zhuowen Tu et al.

CVPR 2024 • highlight • arXiv:2312.17161
#4719

LotusFilter: Fast Diverse Nearest Neighbor Search via a Learned Cutoff Table

Yusuke Matsui

CVPR 2025 • poster • arXiv:2506.04790
#4720

PatchDEMUX: A Certifiably Robust Framework for Multi-label Classifiers Against Adversarial Patches

Dennis Jacob, Chong Xiang, Prateek Mittal

CVPR 2025 • poster • arXiv:2505.24703
#4721

Unified Entropy Optimization for Open-Set Test-Time Adaptation

Zhengqing Gao, Xu-Yao Zhang, Cheng-Lin Liu

CVPR 2024 • poster • arXiv:2404.06065
#4722

Poly Kernel Inception Network for Remote Sensing Detection

Xinhao Cai, Qiuxia Lai, Yuwei Wang et al.

CVPR 2024 • poster • arXiv:2403.06258
#4723

Distraction is All You Need: Memory-Efficient Image Immunization against Diffusion-Based Image Editing

Ling Lo, Cheng Yeo, Hong-Han Shuai et al.

CVPR 2024 • highlight
#4724

Parameter Efficient Fine-tuning via Cross Block Orchestration for Segment Anything Model

Zelin Peng, Zhengqin Xu, Zhilin Zeng et al.

CVPR 2024 • poster • arXiv:2311.17112
#4725

MGMap: Mask-Guided Learning for Online Vectorized HD Map Construction

Xiaolu Liu, Song Wang, Wentong Li et al.

CVPR 2024 • poster • arXiv:2404.00876
#4726

ViT-Lens: Towards Omni-modal Representations

Stan Weixian Lei, Yixiao Ge, Kun Yi et al.

CVPR 2024 • poster • arXiv:2311.16081
#4727

Prompt-Driven Referring Image Segmentation with Instance Contrasting

Chao Shang, Zichen Song, Heqian Qiu et al.

CVPR 2024 • poster
#4728

CosmicMan: A Text-to-Image Foundation Model for Humans

Shikai Li, Jianglin Fu, Kaiyuan Liu et al.

CVPR 2024 • highlight • arXiv:2404.01294
#4729

MTMMC: A Large-Scale Real-World Multi-Modal Camera Tracking Benchmark

Sanghyun Woo, Kwanyong Park, Inkyu Shin et al.

CVPR 2024 • poster • arXiv:2403.20225
#4730

JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation

Yiyang Ma, Xingchao Liu, Xiaokang Chen et al.

CVPR 2025 • poster • arXiv:2411.07975
#4731

Align and Aggregate: Compositional Reasoning with Video Alignment and Answer Aggregation for Video Question-Answering

Zhaohe Liao, Jiangtong Li, Li Niu et al.

CVPR 2024 • poster • arXiv:2407.03008
#4732

Bias for Action: Video Implicit Neural Representations with Bias Modulation

Alper Kayabasi, Anil Kumar Vadathya, Guha Balakrishnan et al.

CVPR 2025 • poster • arXiv:2501.09277
#4733

EchoMatch: Partial-to-Partial Shape Matching via Correspondence Reflection

Yizheng Xie, Viktoria Ehm, Paul Roetzer et al.

CVPR 2025 • poster
#4734

Overload: Latency Attacks on Object Detection for Edge Devices

Erh-Chung Chen, Pin-Yu Chen, I-Hsin Chung et al.

CVPR 2024 • poster • arXiv:2304.05370
#4735

Neural Exposure Fusion for High-Dynamic Range Object Detection

Emmanuel Onzon, Maximilian Bömer, Fahim Mannan et al.

CVPR 2024 • poster
#4736

StoryGPT-V: Large Language Models as Consistent Story Visualizers

Xiaoqian Shen, Mohamed Elhoseiny

CVPR 2025 • poster • arXiv:2312.02252
#4737

MITracker: Multi-View Integration for Visual Object Tracking

Mengjie Xu, Yitao Zhu, Haotian Jiang et al.

CVPR 2025 • highlight • arXiv:2502.20111
#4738

Bridge Frame and Event: Common Spatiotemporal Fusion for High-Dynamic Scene Optical Flow

Hanyu Zhou, Haonan Wang, Haoyue Liu et al.

CVPR 2025 • poster • arXiv:2503.06992
#4739

Semantics, Distortion, and Style Matter: Towards Source-free UDA for Panoramic Segmentation

Xu Zheng, Pengyuan Zhou, ATHANASIOS et al.

CVPR 2024 • poster
#4740

PaSCo: Urban 3D Panoptic Scene Completion with Uncertainty Awareness

Anh-Quan Cao, Angela Dai, Raoul de Charette

CVPR 2024 • poster • arXiv:2312.02158
#4741

Evaluating Transferability in Retrieval Tasks: An Approach Using MMD and Kernel Methods

Mengyu Dai, Amir Hossein Raffiee, Aashish Jain et al.

CVPR 2024 • poster
#4742

ParameterNet: Parameters Are All You Need for Large-scale Visual Pretraining of Mobile Networks

Kai Han, Yunhe Wang, Jianyuan Guo et al.

CVPR 2024 • poster
#4743

Confronting Ambiguity in 6D Object Pose Estimation via Score-Based Diffusion on SE(3)

Tsu-Ching Hsiao, Hao-Wei Chen, Hsuan-Kung Yang et al.

CVPR 2024 • poster • arXiv:2305.15873
#4744

Communication-Efficient Collaborative Perception via Information Filling with Codebook

Yue Hu, Juntong Peng, Sifei Liu et al.

CVPR 2024 • poster • arXiv:2405.04966
#4745

QUADify: Extracting Meshes with Pixel-level Details and Materials from Images

Maximilian Frühauf, Hayko Riemenschneider, Markus Gross et al.

CVPR 2024 • highlight
#4746

Enhancing Post-training Quantization Calibration through Contrastive Learning

Yuzhang Shang, Gaowen Liu, Ramana Kompella et al.

CVPR 2024 • poster
#4747

LASO: Language-guided Affordance Segmentation on 3D Object

Yicong Li, Na Zhao, Junbin Xiao et al.

CVPR 2024 • poster
#4748

SAMWISE: Infusing Wisdom in SAM2 for Text-Driven Video Segmentation

Claudia Cuttano, Gabriele Trivigno, Gabriele Rosi et al.

CVPR 2025 • highlight • arXiv:2411.17646
#4749

Dispersed Structured Light for Hyperspectral 3D Imaging

Suhyun Shin, Seokjun Choi, Felix Heide et al.

CVPR 2024 • poster • arXiv:2311.18287
#4750

DualAD: Disentangling the Dynamic and Static World for End-to-End Driving

Simon Doll, Niklas Hanselmann, Lukas Schneider et al.

CVPR 2024 • poster • arXiv:2406.06264
#4751

Focus on Hiders: Exploring Hidden Threats for Enhancing Adversarial Training

Qian Li, Yuxiao Hu, Yinpeng Dong et al.

CVPR 2024 • poster • arXiv:2312.07067
#4752

ColorPCR: Color Point Cloud Registration with Multi-Stage Geometric-Color Fusion

Juncheng Mu, Lin Bie, Shaoyi Du et al.

CVPR 2024 • poster
#4753

Narrating the Video: Boosting Text-Video Retrieval via Comprehensive Utilization of Frame-Level Captions

Chan Hur, Jeong-hun Hong, Dong-hun Lee et al.

CVPR 2025 • poster • arXiv:2503.05186
#4754

Any-Shift Prompting for Generalization over Distributions

Zehao Xiao, Jiayi Shen, Mohammad Mahdi Derakhshani et al.

CVPR 2024 • poster • arXiv:2402.10099
#4755

MoFlow: One-Step Flow Matching for Human Trajectory Forecasting via Implicit Maximum Likelihood Estimation based Distillation

Yuxiang Fu, Qi Yan, Ke Li et al.

CVPR 2025 • poster • arXiv:2503.09950
#4756

Time-, Memory- and Parameter-Efficient Visual Adaptation

Otniel-Bogdan Mercea, Alexey Gritsenko, Cordelia Schmid et al.

CVPR 2024 • highlight • arXiv:2402.02887
#4757

Behind the Veil: Enhanced Indoor 3D Scene Reconstruction with Occluded Surfaces Completion

Su Sun, Cheng Zhao, Yuliang Guo et al.

CVPR 2024 • poster • arXiv:2404.03070
#4758

Reloc3r: Large-Scale Training of Relative Camera Pose Regression for Generalizable, Fast, and Accurate Visual Localization

Siyan Dong, Shuzhe Wang, Shaohui Liu et al.

CVPR 2025 • poster • arXiv:2412.08376
#4759

Revisiting Counterfactual Problems in Referring Expression Comprehension

Zhihan Yu, Ruifan Li

CVPR 2024 • poster
#4760

HuMoCon: Concept Discovery for Human Motion Understanding

Qihang Fang, Chengcheng Tang, Bugra Tekin et al.

CVPR 2025 • poster • arXiv:2505.20920
#4761

Differentiable Point-based Inverse Rendering

Hoon-Gyu Chung, Seokjun Choi, Seung-Hwan Baek

CVPR 2024 • poster • arXiv:2312.02480
#4762

VMINer: Versatile Multi-view Inverse Rendering with Near- and Far-field Light Sources

Fan Fei, Jiajun Tang, Ping Tan et al.

CVPR 2024 • highlight
#4763

SLAM3R: Real-Time Dense Scene Reconstruction from Monocular RGB Videos

Yuzheng Liu, Siyan Dong, Shuzhe Wang et al.

CVPR 2025 • highlight • arXiv:2412.09401
#4764

LIM: Large Interpolator Model for Dynamic Reconstruction

Remy Sabathier, Niloy J. Mitra, David Novotny

CVPR 2025 • poster • arXiv:2503.22537
#4765

ActiveDC: Distribution Calibration for Active Finetuning

Wenshuai Xu, Zhenghui Hu, Yu Lu et al.

CVPR 2024 • poster • arXiv:2311.07634
#4766

AUEditNet: Dual-Branch Facial Action Unit Intensity Manipulation with Implicit Disentanglement

Shiwei Jin, Zhen Wang, Lei Wang et al.

CVPR 2024 • poster • arXiv:2404.05063
#4767

VecFusion: Vector Font Generation with Diffusion

Vikas Thamizharasan, Difan Liu, Shantanu Agarwal et al.

CVPR 2024 • highlight • arXiv:2312.10540
#4768

Generating Non-Stationary Textures using Self-Rectification

Yang Zhou, Rongjun Xiao, Dani Lischinski et al.

CVPR 2024 • poster • arXiv:2401.02847
#4769

OOSTraj: Out-of-Sight Trajectory Prediction With Vision-Positioning Denoising

Haichao Zhang, Yi Xu, Hongsheng Lu et al.

CVPR 2024 • poster • arXiv:2404.02227
#4770

Arc2Avatar: Generating Expressive 3D Avatars from a Single Image via ID Guidance

Dimitrios Gerogiannis, Foivos Paraperas Papantoniou, Rolandos Alexandros Potamias et al.

CVPR 2025 • poster • arXiv:2501.05379
#4771

CyberDemo: Augmenting Simulated Human Demonstration for Real-World Dexterous Manipulation

Jun Wang, Yuzhe Qin, Kaiming Kuang et al.

CVPR 2024 • poster • arXiv:2402.14795
#4772

Towards Zero-Shot Anomaly Detection and Reasoning with Multimodal Large Language Models

Jiacong Xu, Shao-Yuan Lo, Bardia Safaei et al.

CVPR 2025 • highlight • arXiv:2502.07601
#4773

Video Harmonization with Triplet Spatio-Temporal Variation Patterns

Zonghui Guo, XinYu Han, Jie Zhang et al.

CVPR 2024 • poster
#4774

Rethinking Interactive Image Segmentation with Low Latency, High Quality, and Diverse Prompts

Qin Liu, Jaemin Cho, Mohit Bansal et al.

CVPR 2024 • poster • arXiv:2404.00741
#4775

SeMoLi: What Moves Together Belongs Together

Jenny Seidenschwarz, Aljoša Ošep, Francesco Ferroni et al.

CVPR 2024 • poster • arXiv:2402.19463
#4776

Locality-Aware Zero-Shot Human-Object Interaction Detection

Sanghyun Kim, Deunsol Jung, Minsu Cho

CVPR 2025 • poster • arXiv:2505.19503
#4777

HINTED: Hard Instance Enhanced Detector with Mixed-Density Feature Fusion for Sparsely-Supervised 3D Object Detection

Qiming Xia, Wei Ye, Hai Wu et al.

CVPR 2024 • poster
#4778

NAYER: Noisy Layer Data Generation for Efficient and Effective Data-free Knowledge Distillation

Minh-Tuan Tran, Trung Le, Xuan-May Le et al.

CVPR 2024 • poster • arXiv:2310.00258
#4779

Domain-Specific Block Selection and Paired-View Pseudo-Labeling for Online Test-Time Adaptation

Yeonguk Yu, Sungho Shin, Seunghyeok Back et al.

CVPR 2024 • poster • arXiv:2404.10966
#4780

Occlusion-aware Text-Image-Point Cloud Pretraining for Open-World 3D Object Recognition

Khanh Nguyen, Ghulam Mubashar Hassan, Ajmal Mian

CVPR 2025 • poster • arXiv:2502.10674
#4781

Relative Pose Estimation through Affine Corrections of Monocular Depth Priors

Yifan Yu, Shaohui Liu, Rémi Pautrat et al.

CVPR 2025 • highlight • arXiv:2501.05446
#4782

Revamping Federated Learning Security from a Defender's Perspective: A Unified Defense with Homomorphic Encrypted Data Space

Naveen Kumar Kummari, Reshmi Mitra, Krishna Mohan Chalavadi

CVPR 2024 • poster
#4783

LLMs are Good Sign Language Translators

Jia Gong, Lin Geng Foo, Yixuan He et al.

CVPR 2024 • poster • arXiv:2404.00925
#4784

Unseen Visual Anomaly Generation

Han Sun, Yunkang Cao, Hao Dong et al.

CVPR 2025 • poster • arXiv:2406.01078
#4785

PanoRecon: Real-Time Panoptic 3D Reconstruction from Monocular Video

Dong Wu, Zike Yan, Hongbin Zha

CVPR 2024 • poster
#4786

TACO: Benchmarking Generalizable Bimanual Tool-ACtion-Object Understanding

Yun Liu, Haolin Yang, Xu Si et al.

CVPR 2024 • poster • arXiv:2401.08399
#4787

InsTaG: Learning Personalized 3D Talking Head from Few-Second Video

Jiahe Li, Jiawei Zhang, Xiao Bai et al.

CVPR 2025 • poster • arXiv:2502.20387
#4788

Learning by Correction: Efficient Tuning Task for Zero-Shot Generative Vision-Language Reasoning

Rongjie Li, Yu Wu, Xuming He

CVPR 2024 • poster • arXiv:2404.00909
#4789

Calibrating Multi-modal Representations: A Pursuit of Group Robustness without Annotations

Chenyu You, Yifei Min, Weicheng Dai et al.

CVPR 2024 • poster • arXiv:2403.07241
#4790

Gaussian-Flow: 4D Reconstruction with Dynamic 3D Gaussian Particle

Youtian Lin, Zuozhuo Dai, Siyu Zhu et al.

CVPR 2024 • highlight • arXiv:2312.03431
#4791

Dysen-VDM: Empowering Dynamics-aware Text-to-Video Diffusion with LLMs

Hao Fei, Shengqiong Wu, Wei Ji et al.

CVPR 2024 • poster • arXiv:2308.13812
#4792

CLIP-driven Coarse-to-fine Semantic Guidance for Fine-grained Open-set Semi-supervised Learning

Xiaokun Li, Yaping Huang, Qingji Guan

CVPR 2025 • poster
#4793

Electromyography-Informed Facial Expression Reconstruction for Physiological-Based Synthesis and Analysis

Tim Büchner, Christoph Anders, Orlando Guntinas-Lichius et al.

CVPR 2025 • highlight • arXiv:2503.09556
#4794

Can Biases in ImageNet Models Explain Generalization?

Paul Gavrikov, Janis Keuper

CVPR 2024 • poster • arXiv:2404.01509
#4795

HumMUSS: Human Motion Understanding using State Space Models

Arnab Mondal, Stefano Alletto, Denis Tome

CVPR 2024 • poster • arXiv:2404.10880
#4796

Modeling Multimodal Social Interactions: New Challenges and Baselines with Densely Aligned Representations

Sangmin Lee, Bolin Lai, Fiona Ryan et al.

CVPR 2024 • poster • arXiv:2403.02090
#4797

CoLLM: A Large Language Model for Composed Image Retrieval

Chuong Huynh, Jinyu Yang, Ashish Tawari et al.

CVPR 2025 • poster • arXiv:2503.19910
#4798

FreeControl: Training-Free Spatial Control of Any Text-to-Image Diffusion Model with Any Condition

Sicheng Mo, Fangzhou Mu, Kuan Heng Lin et al.

CVPR 2024 • poster • arXiv:2312.07536
#4799

Benchmarking Object Detectors under Real-World Distribution Shifts in Satellite Imagery

Sara Al-Emadi, Yin Yang, Ferda Ofli

CVPR 2025 • poster • arXiv:2503.19202
#4800

How to Make Cross Encoder a Good Teacher for Efficient Image-Text Retrieval?

Yuxin Chen, Zongyang Ma, Ziqi Zhang et al.

CVPR 2024 • poster • arXiv:2407.07479