Most Cited CVPR "active vision" Papers

5,589 papers found • Page 23 of 28

#4401

AnyMoLe: Any Character Motion In-betweening Leveraging Video Diffusion Models

Kwan Yun, Seokhyeon Hong, Chaelin Kim et al.

CVPR 2025posterarXiv:2503.08417
#4402

SuGaR: Surface-Aligned Gaussian Splatting for Efficient 3D Mesh Reconstruction and High-Quality Mesh Rendering

Antoine Guédon, Vincent Lepetit

CVPR 2024posterarXiv:2311.12775
#4403

DiffusionPoser: Real-time Human Motion Reconstruction From Arbitrary Sparse Sensors Using Autoregressive Diffusion

Tom Van Wouwe, Seunghwan Lee, Antoine Falisse et al.

CVPR 2024posterarXiv:2308.16682
#4404

HumanRef: Single Image to 3D Human Generation via Reference-Guided Diffusion

Jingbo Zhang, Xiaoyu Li, Qi Zhang et al.

CVPR 2024posterarXiv:2311.16961
#4405

CurveCloudNet: Processing Point Clouds with 1D Structure

Colton Stearns, Alex Fu, Jiateng Liu et al.

CVPR 2024posterarXiv:2303.12050
#4406

Explaining in Diffusion: Explaining a Classifier with Diffusion Semantics

Tahira Kazimi, Ritika Allada, Pinar Yanardag

CVPR 2025poster
#4407

Harnessing Meta-Learning for Improving Full-Frame Video Stabilization

Muhammad Kashif Ali, Eun Woo Im, Dongjin Kim et al.

CVPR 2024posterarXiv:2403.03662
#4408

Physical 3D Adversarial Attacks against Monocular Depth Estimation in Autonomous Driving

Junhao Zheng, Chenhao Lin, Jiahao Sun et al.

CVPR 2024posterarXiv:2403.17301
#4409

SeaBird: Segmentation in Bird’s View with Dice Loss Improves Monocular 3D Detection of Large Objects

Abhinav Kumar, Yuliang Guo, Xinyu Huang et al.

CVPR 2024posterarXiv:2403.20318
#4410

Blurred LiDAR for Sharper 3D: Robust Handheld 3D Scanning with Diffuse LiDAR and RGB

Nikhil Behari, Aaron Young, Siddharth Somasundaram et al.

CVPR 2025highlightarXiv:2411.19474
#4411

MoML: Online Meta Adaptation for 3D Human Motion Prediction

Xiaoning Sun, Huaijiang Sun, Bin Li et al.

CVPR 2024poster
#4412

Let's Verify and Reinforce Image Generation Step by Step

Renrui Zhang, Chengzhuo Tong, Zhizheng Zhao et al.

CVPR 2025poster
#4413

Learning with Structural Labels for Learning with Noisy Labels

Noo-ri Kim, Jin-Seop Lee, Jee-Hyong Lee

CVPR 2024poster
#4414

EVPGS: Enhanced View Prior Guidance for Splatting-based Extrapolated View Synthesis

Jiahe Li, Feiyu Wang, Xiaochao Qu et al.

CVPR 2025posterarXiv:2503.21816
#4415

What If the TV Was Off? Examining Counterfactual Reasoning Abilities of Multi-modal Language Models

Letian Zhang, Xiaotong Zhai, Zhongkai Zhao et al.

CVPR 2024posterarXiv:2310.06627
#4416

Optical-Flow Guided Prompt Optimization for Coherent Video Generation

Hyelin Nam, Jaemin Kim, Dohun Lee et al.

CVPR 2025posterarXiv:2411.15540
#4417

Incremental Nuclei Segmentation from Histopathological Images via Future-class Awareness and Compatibility-inspired Distillation

Huyong Wang, Huisi Wu, Jing Qin

CVPR 2024poster
#4418

Model Inversion Robustness: Can Transfer Learning Help?

Sy-Tuyen Ho, Koh Jun Hao, Keshigeyan Chandrasegaran et al.

CVPR 2024posterarXiv:2405.05588
#4419

Scene-adaptive and Region-aware Multi-modal Prompt for Open Vocabulary Object Detection

Xiaowei Zhao, Xianglong Liu, Duorui Wang et al.

CVPR 2024poster
#4420

Detecting Open World Objects via Partial Attribute Assignment

Muli Yang, Gabriel James Goenawan, Huaiyuan Qin et al.

CVPR 2025poster
#4421

InteractDiffusion: Interaction Control in Text-to-Image Diffusion Models

Jiun Tian Hoe, Xudong Jiang, Chee Seng Chan et al.

CVPR 2024posterarXiv:2312.05849
#4422

OpenMIBOOD: Open Medical Imaging Benchmarks for Out-Of-Distribution Detection

Max Gutbrod, David Rauber, Danilo Weber Nunes et al.

CVPR 2025posterarXiv:2503.16247
#4423

MAP: MAsk-Pruning for Source-Free Model Intellectual Property Protection

Boyang Peng, Sanqing Qu, Yong Wu et al.

CVPR 2024posterarXiv:2403.04149
#4424

Dynamic Adapter Meets Prompt Tuning: Parameter-Efficient Transfer Learning for Point Cloud Analysis

Xin Zhou, Dingkang Liang, Wei Xu et al.

CVPR 2024posterarXiv:2403.01439
#4425

Reconciling Stochastic and Deterministic Strategies for Zero-shot Image Restoration using Diffusion Model in Dual

Chong Wang, Lanqing Guo, Zixuan Fu et al.

CVPR 2025posterarXiv:2503.01288
#4426

EMCAD: Efficient Multi-scale Convolutional Attention Decoding for Medical Image Segmentation

Md Mostafijur Rahman, Mustafa Munir, Radu Marculescu

CVPR 2024posterarXiv:2405.06880
#4427

CTRL-O: Language-Controllable Object-Centric Visual Representation Learning

Aniket Rajiv Didolkar, Andrii Zadaianchuk, Rabiul Awal et al.

CVPR 2025posterarXiv:2503.21747
#4428

On Exact Inversion of DPM-Solvers

Seongmin Hong, Kyeonghyun Lee, Suh Yoon Jeon et al.

CVPR 2024posterarXiv:2311.18387
#4429

Generate Like Experts: Multi-Stage Font Generation by Incorporating Font Transfer Process into Diffusion Models

Bin Fu, Fanghua Yu, Anran Liu et al.

CVPR 2024poster
#4430

A Unified Diffusion Framework for Scene-aware Human Motion Estimation from Sparse Signals

Jiangnan Tang, Jingya Wang, Kaiyang Ji et al.

CVPR 2024posterarXiv:2404.04890
#4431

LASA: Instance Reconstruction from Real Scans using A Large-scale Aligned Shape Annotation Dataset

Haolin Liu, Chongjie Ye, Yinyu Nie et al.

CVPR 2024posterarXiv:2312.12418
#4432

Subnet-Aware Dynamic Supernet Training for Neural Architecture Search

Jeimin Jeon, Youngmin Oh, Junghyup Lee et al.

CVPR 2025posterarXiv:2503.10740
#4433

MaskCLR: Attention-Guided Contrastive Learning for Robust Action Representation Learning

Mohamed Abdelfattah, Mariam Hassan, Alex Alahi

CVPR 2024poster
#4434

MARVEL-40M+: Multi-Level Visual Elaboration for High-Fidelity Text-to-3D Content Creation

Sankalp Sinha, Mohammad Sadil Khan, Muhammad Usama et al.

CVPR 2025posterarXiv:2411.17945
#4435

D3T: Distinctive Dual-Domain Teacher Zigzagging Across RGB-Thermal Gap for Domain-Adaptive Object Detection

Dinh Phat Do, Taehoon Kim, JAEMIN NA et al.

CVPR 2024posterarXiv:2403.09359
#4436

MAGICK: A Large-scale Captioned Dataset from Matting Generated Images using Chroma Keying

Ryan Burgert, Brian Price, Jason Kuen et al.

CVPR 2024poster
#4437

Intrinsic Image Diffusion for Indoor Single-view Material Estimation

Peter Kocsis, Vincent Sitzmann, Matthias Nießner

CVPR 2024posterarXiv:2312.12274
#4438

Prompt Highlighter: Interactive Control for Multi-Modal LLMs

Yuechen Zhang, Shengju Qian, Bohao Peng et al.

CVPR 2024posterarXiv:2312.04302
#4439

MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision

Ruicheng Wang, Sicheng Xu, Cassie Lee Dai et al.

CVPR 2025posterarXiv:2410.19115
#4440

PhyT2V: LLM-Guided Iterative Self-Refinement for Physics-Grounded Text-to-Video Generation

Qiyao Xue, Xiangyu Yin, Boyuan Yang et al.

CVPR 2025posterarXiv:2412.00596
#4441

Can Protective Perturbation Safeguard Personal Data from Being Exploited by Stable Diffusion?

Zhengyue Zhao, Jinhao Duan, Kaidi Xu et al.

CVPR 2024posterarXiv:2312.00084
#4442

NetTrack: Tracking Highly Dynamic Objects with a Net

Guangze Zheng, Shijie Lin, Haobo Zuo et al.

CVPR 2024posterarXiv:2403.11186
#4443

Scaling Up Video Summarization Pretraining with Large Language Models

Dawit Argaw Argaw, Seunghyun Yoon, Fabian Caba Heilbron et al.

CVPR 2024posterarXiv:2404.03398
#4444

Online Task-Free Continual Generative and Discriminative Learning via Dynamic Cluster Memory

飞 叶, Adrian Bors

CVPR 2024poster
#4445

FADES: Fair Disentanglement with Sensitive Relevance

Taeuk Jang, Xiaoqian Wang

CVPR 2024poster
#4446

Versatile Navigation Under Partial Observability via Value-guided Diffusion Policy

Gengyu Zhang, Hao Tang, Yan Yan

CVPR 2024posterarXiv:2404.02176
#4447

Improving Depth Completion via Depth Feature Upsampling

Yufei Wang, Ge Zhang, Shaoqian Wang et al.

CVPR 2024poster
#4448

AutoSSVH: Exploring Automated Frame Sampling for Efficient Self-Supervised Video Hashing

Niu Lian, Jun Li, Jinpeng Wang et al.

CVPR 2025posterarXiv:2504.03587
#4449

EmotiveTalk: Expressive Talking Head Generation through Audio Information Decoupling and Emotional Video Diffusion

Haotian Wang, Yuzhe Weng, Yueyan Li et al.

CVPR 2025posterarXiv:2411.16726
#4450

Deep Single Image Camera Calibration by Heatmap Regression to Recover Fisheye Images Under Manhattan World Assumption

Nobuhiko Wakai, Satoshi Sato, Yasunori Ishii et al.

CVPR 2024posterarXiv:2303.17166
#4451

LPOSS: Label Propagation Over Patches and Pixels for Open-vocabulary Semantic Segmentation

Vladan Stojnić, Yannis Kalantidis, Jiri Matas et al.

CVPR 2025posterarXiv:2503.19777
#4452

MAGiC-SLAM: Multi-Agent Gaussian Globally Consistent SLAM

Vladimir Yugay, Theo Gevers, Martin R. Oswald

CVPR 2025posterarXiv:2411.16785
#4453

MRFS: Mutually Reinforcing Image Fusion and Segmentation

HAO ZHANG, Xuhui Zuo, Jie Jiang et al.

CVPR 2024poster
#4454

Multi-agent Long-term 3D Human Pose Forecasting via Interaction-aware Trajectory Conditioning

Jaewoo Jeong, Daehee Park, Kuk-Jin Yoon

CVPR 2024highlightarXiv:2404.05218
#4455

Arbitrary-steps Image Super-resolution via Diffusion Inversion

Zongsheng Yue, Kang Liao, Chen Change Loy

CVPR 2025posterarXiv:2412.09013
#4456

OrCo: Towards Better Generalization via Orthogonality and Contrast for Few-Shot Class-Incremental Learning

Noor Ahmed, Anna Kukleva, Bernt Schiele

CVPR 2024highlightarXiv:2403.18550
#4457

3D-LFM: Lifting Foundation Model

Mosam Dabhi, László A. Jeni, Simon Lucey

CVPR 2024posterarXiv:2312.11894
#4458

DL2G: Degradation-guided Local-to-Global Restoration for Eyeglass Reflection Removal

Yizhilv, Xiao Lu, Hong Ding et al.

CVPR 2025poster
#4459

AutoLUT: LUT-Based Image Super-Resolution with Automatic Sampling and Adaptive Residual Learning

Yuheng Xu, Shijie Yang, Xin Liu et al.

CVPR 2025posterarXiv:2503.01565
#4460

Crab: A Unified Audio-Visual Scene Understanding Model with Explicit Cooperation

Henghui Du, Guangyao Li, Chang Zhou et al.

CVPR 2025posterarXiv:2503.13068
#4461

LASIL: Learner-Aware Supervised Imitation Learning For Long-term Microscopic Traffic Simulation

Ke Guo, Zhenwei Miao, Wei Jing et al.

CVPR 2024posterarXiv:2403.17601
#4462

EgoLife: Towards Egocentric Life Assistant

Jingkang Yang, Shuai Liu, Hongming Guo et al.

CVPR 2025posterarXiv:2503.03803
#4463

HybridNeRF: Efficient Neural Rendering via Adaptive Volumetric Surfaces

Haithem Turki, Vasu Agrawal, Samuel Rota Bulò et al.

CVPR 2024highlightarXiv:2312.03160
#4464

Cross-Modal 3D Representation with Multi-View Images and Point Clouds

Ziyang Zhou, Pinghui Wang, Zi Liang et al.

CVPR 2025poster
#4465

IIRP-Net: Iterative Inference Residual Pyramid Network for Enhanced Image Registration

Tai Ma, zhangsuwei, Jiafeng Li et al.

CVPR 2024poster
#4466

SEED-Bench: Benchmarking Multimodal Large Language Models

Bohao Li, Yuying Ge, Yixiao Ge et al.

CVPR 2024poster
#4467

Style Aligned Image Generation via Shared Attention

Amir Hertz, Andrey Voynov, Shlomi Fruchter et al.

CVPR 2024posterarXiv:2312.02133
#4468

NeRFDeformer: NeRF Transformation from a Single View via 3D Scene Flows

Zhenggang Tang, Jason Ren, Xiaoming Zhao et al.

CVPR 2024posterarXiv:2406.10543
#4469

Convex Relaxation for Robust Vanishing Point Estimation in Manhattan World

Bangyan Liao, Zhenjun Zhao, Haoang Li et al.

CVPR 2025posterarXiv:2505.04788
#4470

StyleStudio: Text-Driven Style Transfer with Selective Control of Style Elements

Mingkun Lei, Xue Song, Beier Zhu et al.

CVPR 2025posterarXiv:2412.08503
#4471

Invisible Backdoor Attack against Self-supervised Learning

Hanrong Zhang, Zhenting Wang, Boheng Li et al.

CVPR 2025posterarXiv:2405.14672
#4472

SketchFusion: Learning Universal Sketch Features through Fusing Foundation Models

Subhadeep Koley, Tapas Kumar Dutta, Aneeshan Sain et al.

CVPR 2025posterarXiv:2503.14129
#4473

BIVDiff: A Training-Free Framework for General-Purpose Video Synthesis via Bridging Image and Video Diffusion Models

Fengyuan Shi, Jiaxi Gu, Hang Xu et al.

CVPR 2024posterarXiv:2312.02813
#4474

CoMapGS: Covisibility Map-based Gaussian Splatting for Sparse Novel View Synthesis

Youngkyoon Jang, Eduardo Pérez-Pellitero

CVPR 2025posterarXiv:2503.20998
#4475

Active Domain Adaptation with False Negative Prediction for Object Detection

Yuzuru Nakamura, Yasunori Ishii, Takayoshi Yamashita

CVPR 2024highlight
#4476

LTGC: Long-tail Recognition via Leveraging LLMs-driven Generated Content

Qihao Zhao, Yalun Dai, Hao Li et al.

CVPR 2024posterarXiv:2403.05854
#4477

How to Train Neural Field Representations: A Comprehensive Study and Benchmark

Samuele Papa, Riccardo Valperga, David Knigge et al.

CVPR 2024posterarXiv:2312.10531
#4478

Interactive Medical Image Analysis with Concept-based Similarity Reasoning

Ta Duc Huy, Sen Kim Tran, Phan Nguyen et al.

CVPR 2025posterarXiv:2503.06873
#4479

Steepest Descent Density Control for Compact 3D Gaussian Splatting

Peihao Wang, Yuehao Wang, Dilin Wang et al.

CVPR 2025posterarXiv:2505.05587
#4480

Efficient Decoupled Feature 3D Gaussian Splatting via Hierarchical Compression

Zhenqi Dai, Ting Liu, Yanning Zhang

CVPR 2025poster
#4481

Image Quality Assessment: From Human to Machine Preference

Chunyi Li, Yuan Tian, Xiaoyue Ling et al.

CVPR 2025highlightarXiv:2503.10078
#4482

Towards Improved Text-Aligned Codebook Learning: Multi-Hierarchical Codebook-Text Alignment with Long Text

Guotao liang, Baoquan Zhang, Zhiyuan Wen et al.

CVPR 2025highlightarXiv:2503.01261
#4483

InterMimic: Towards Universal Whole-Body Control for Physics-Based Human-Object Interactions

Sirui Xu, Hung Yu Ling, Yu-Xiong Wang et al.

CVPR 2025highlightarXiv:2502.20390
#4484

EdgeMovingNet: Edge-preserving Point Cloud Reconstruction via Joint Geometry Features

Xinran Yang, Donghao Ji, Yuanqi Li et al.

CVPR 2025poster
#4485

Motion-adaptive Separable Collaborative Filters for Blind Motion Deblurring

Chengxu Liu, Xuan Wang, Xiangyu Xu et al.

CVPR 2024posterarXiv:2404.13153
#4486

SCoFT: Self-Contrastive Fine-Tuning for Equitable Image Generation

Zhixuan Liu, Peter Schaldenbrand, Beverley-Claire Okogwu et al.

CVPR 2024posterarXiv:2401.08053
#4487

MonoDGP: Monocular 3D Object Detection with Decoupled-Query and Geometry-Error Priors

Fanqi Pu, Yifan Wang, Jiru Deng et al.

CVPR 2025posterarXiv:2410.19590
#4488

PhysVLM: Enabling Visual Language Models to Understand Robotic Physical Reachability

Weijie Zhou, Manli Tao, Chaoyang Zhao et al.

CVPR 2025posterarXiv:2503.08481
#4489

Reg-PTQ: Regression-specialized Post-training Quantization for Fully Quantized Object Detector

Yifu Ding, Weilun Feng, Chuyan Chen et al.

CVPR 2024poster
#4490

Preserving Clusters in Prompt Learning for Unsupervised Domain Adaptation

Long Tung Vuong, Hoang Phan, Vy Vo et al.

CVPR 2025posterarXiv:2506.11493
#4491

BlenderGym: Benchmarking Foundational Model Systems for Graphics Editing

Yunqi Gu, Ian Huang, Jihyeon Je et al.

CVPR 2025highlightarXiv:2504.01786
#4492

FREE: Faster and Better Data-Free Meta-Learning

Yongxian Wei, Zixuan Hu, Zhenyi Wang et al.

CVPR 2024posterarXiv:2405.00984
#4493

Controllable Human Image Generation with Personalized Multi-Garments

Yisol Choi, Sangkyung Kwak, Sihyun Yu et al.

CVPR 2025posterarXiv:2411.16801
#4494

Open Vocabulary Semantic Scene Sketch Understanding

Ahmed Bourouis, Judith Fan, Yulia Gryaditskaya

CVPR 2024posterarXiv:2312.12463
#4495

Reconstructing Animals and the Wild

Peter Kulits, Michael J. Black, Silvia Zuffi

CVPR 2025posterarXiv:2411.18807
#4496

You Only Need Less Attention at Each Stage in Vision Transformers

Shuoxi Zhang, Hanpeng Liu, Stephen Lin et al.

CVPR 2024posterarXiv:2406.00427
#4497

Hierarchical Patch Diffusion Models for High-Resolution Video Generation

Ivan Skorokhodov, Willi Menapace, Aliaksandr Siarohin et al.

CVPR 2024posterarXiv:2406.07792
#4498

Rethinking the Up-Sampling Operations in CNN-based Generative Network for Generalizable Deepfake Detection

Chuangchuang Tan, Huan Liu, Yao Zhao et al.

CVPR 2024posterarXiv:2312.10461
#4499

Building a Strong Pre-Training Baseline for Universal 3D Large-Scale Perception

Haoming Chen, Zhizhong Zhang, Yanyun Qu et al.

CVPR 2024posterarXiv:2405.07201
#4500

InteractVLM: 3D Interaction Reasoning from 2D Foundational Models

Sai Kumar Dwivedi, Dimitrije Antić, Shashank Tripathi et al.

CVPR 2025posterarXiv:2504.05303
#4501

BoQ: A Place is Worth a Bag of Learnable Queries

Amar Ali-bey, Brahim Chaib-draa, Philippe Giguère

CVPR 2024posterarXiv:2405.07364
#4502

DIV-FF: Dynamic Image-Video Feature Fields For Environment Understanding in Egocentric Videos

Lorenzo Mur-Labadia, Jose J. Guerrero, Ruben Martinez-Cantin

CVPR 2025highlightarXiv:2503.08344
#4503

Rethinking Epistemic and Aleatoric Uncertainty for Active Open-Set Annotation: An Energy-Based Approach

Chen-Chen Zong, Sheng-Jun Huang

CVPR 2025posterarXiv:2502.19691
#4504

UFC-Net: Unrolling Fixed-point Continuous Network for Deep Compressive Sensing

Xiaoyang Wang, Hongping Gan

CVPR 2024poster
#4505

Symphonize 3D Semantic Scene Completion with Contextual Instance Queries

Haoyi Jiang, Tianheng Cheng, Naiyu Gao et al.

CVPR 2024posterarXiv:2306.15670
#4506

DepthCrafter: Generating Consistent Long Depth Sequences for Open-world Videos

Wenbo Hu, Xiangjun Gao, Xiaoyu Li et al.

CVPR 2025highlightarXiv:2409.02095
#4507

SeaLion: Semantic Part-Aware Latent Point Diffusion Models for 3D Generation

Dekai Zhu, Yan Di, Stefan Gavranovic et al.

CVPR 2025posterarXiv:2505.17721
#4508

CPLIP: Zero-Shot Learning for Histopathology with Comprehensive Vision-Language Alignment

Sajid Javed, Arif Mahmood, IYYAKUTTI IYAPPAN GANAPATHI et al.

CVPR 2024posterarXiv:2406.05205
#4509

Investigating the Role of Weight Decay in Enhancing Nonconvex SGD

Tao Sun, Yuhao Huang, Li Shen et al.

CVPR 2025poster
#4510

ActiveGAMER: Active GAussian Mapping through Efficient Rendering

Liyan Chen, Huangying Zhan, Kevin Chen et al.

CVPR 2025posterarXiv:2501.06897
#4511

PromptHash: Affinity-Prompted Collaborative Cross-Modal Learning for Adaptive Hashing Retrieval

Qiang Zou, Shuli Cheng, Jiayi Chen

CVPR 2025posterarXiv:2503.16064
#4512

GET: Unlocking the Multi-modal Potential of CLIP for Generalized Category Discovery

Enguang Wang, Zhimao Peng, Zhengyuan Xie et al.

CVPR 2025posterarXiv:2403.09974
#4513

MaskPLAN: Masked Generative Layout Planning from Partial Input

Hang Zhang, Anton Savov, Benjamin Dillenburger

CVPR 2024poster
#4514

Solving Masked Jigsaw Puzzles with Diffusion Vision Transformers

Jinyang Liu, Wondmgezahu Teshome, Sandesh Ghimire et al.

CVPR 2024posterarXiv:2404.07292
#4515

Exploration-Driven Generative Interactive Environments

Nedko Savov, Naser Kazemi, Mohammad Mahdi et al.

CVPR 2025posterarXiv:2504.02515
#4516

Towards Memorization-Free Diffusion Models

Chen Chen, Daochang Liu, Chang Xu

CVPR 2024posterarXiv:2404.00922
#4517

Extreme Rotation Estimation in the Wild

Hana Bezalel, Dotan Ankri, Ruojin Cai et al.

CVPR 2025posterarXiv:2411.07096
#4518

AV-RIR: Audio-Visual Room Impulse Response Estimation

Anton Ratnarajah, Sreyan Ghosh, Sonal Kumar et al.

CVPR 2024posterarXiv:2312.00834
#4519

Entangled View-Epipolar Information Aggregation for Generalizable Neural Radiance Fields

Zhiyuan Min, Yawei Luo, Wei Yang et al.

CVPR 2024posterarXiv:2311.11845
#4520

A-Teacher: Asymmetric Network for 3D Semi-Supervised Object Detection

Hanshi Wang, Zhipeng Zhang, Jin Gao et al.

CVPR 2024poster
#4521

HanDiffuser: Text-to-Image Generation With Realistic Hand Appearances

Supreeth Narasimhaswamy, Uttaran Bhattacharya, Xiang Chen et al.

CVPR 2024posterarXiv:2403.01693
#4522

Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks

Bin Xiao, Haiping Wu, Weijian Xu et al.

CVPR 2024posterarXiv:2311.06242
#4523

Classifier-Free Guidance Inside the Attraction Basin May Cause Memorization

Anubhav Jain, Yuya Kobayashi, Takashi Shibuya et al.

CVPR 2025posterarXiv:2411.16738
#4524

DMR: Decomposed Multi-Modality Representations for Frames and Events Fusion in Visual Reinforcement Learning

Haoran Xu, Peixi Peng, Guang Tan et al.

CVPR 2024poster
#4525

3D Feature Tracking via Event Camera

Siqi Li, Zhou Zhikuan, Zhou Xue et al.

CVPR 2024poster
#4526

Frequency-aware Event-based Video Deblurring for Real-World Motion Blur

Taewoo Kim, Hoonhee Cho, Kuk-Jin Yoon

CVPR 2024poster
#4527

FedHCA2: Towards Hetero-Client Federated Multi-Task Learning

Yuxiang Lu, Suizhi Huang, Yuwen Yang et al.

CVPR 2024poster
#4528

Improving Unsupervised Hierarchical Representation with Reinforcement Learning

Ruyi An, Yewen Li, Xu He et al.

CVPR 2024poster
#4529

Condensing Action Segmentation Datasets via Generative Network Inversion

Guodong Ding, Rongyu Chen, Angela Yao

CVPR 2025posterarXiv:2503.14112
#4530

Flash3D: Super-scaling Point Transformers through Joint Hardware-Geometry Locality

Liyan Chen, Gregory P. Meyer, Zaiwei Zhang et al.

CVPR 2025highlightarXiv:2412.16481
#4531

Global Latent Neural Rendering

Thomas Tanay, Matteo Maggioni

CVPR 2024posterarXiv:2312.08338
#4532

Data Poisoning based Backdoor Attacks to Contrastive Learning

Jinghuai Zhang, Hongbin Liu, Jinyuan Jia et al.

CVPR 2024posterarXiv:2211.08229
#4533

ProHOC: Probabilistic Hierarchical Out-of-Distribution Classification via Multi-Depth Networks

Erik Wallin, Fredrik Kahl, Lars Hammarstrand

CVPR 2025posterarXiv:2503.21397
#4534

RoHM: Robust Human Motion Reconstruction via Diffusion

Siwei Zhang, Bharat Lal Bhatnagar, Yuanlu Xu et al.

CVPR 2024posterarXiv:2401.08570
#4535

Motion Prompting: Controlling Video Generation with Motion Trajectories

Daniel Geng, Charles Herrmann, Junhwa Hur et al.

CVPR 2025posterarXiv:2412.02700
#4536

SportsHHI: A Dataset for Human-Human Interaction Detection in Sports Videos

Tao Wu, Runyu He, Gangshan Wu et al.

CVPR 2024posterarXiv:2404.04565
#4537

Classes Are Not Equal: An Empirical Study on Image Recognition Fairness

Jiequan Cui, Beier Zhu, Xin Wen et al.

CVPR 2024posterarXiv:2402.18133
#4538

EditSplat: Multi-View Fusion and Attention-Guided Optimization for View-Consistent 3D Scene Editing with 3D Gaussian Splatting

Dong In Lee, Hyeongcheol Park, Jiyoung Seo et al.

CVPR 2025posterarXiv:2412.11520
#4539

LinGen: Towards High-Resolution Minute-Length Text-to-Video Generation with Linear Computational Complexity

Hongjie Wang, Chih-Yao Ma, Yen-Cheng Liu et al.

CVPR 2025posterarXiv:2412.09856
#4540

ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic Object

Chenshuang Zhang, Fei Pan, Junmo Kim et al.

CVPR 2024highlightarXiv:2403.18775
#4541

Volumetrically Consistent 3D Gaussian Rasterization

Chinmay Talegaonkar, Yash Belhe, Ravi Ramamoorthi et al.

CVPR 2025highlightarXiv:2412.03378
#4542

BlockGCN: Redefine Topology Awareness for Skeleton-Based Action Recognition

Yuxuan Zhou, Xudong Yan, Zhi-Qi Cheng et al.

CVPR 2024poster
#4543

Dynamic Inertial Poser (DynaIP): Part-Based Motion Dynamics Learning for Enhanced Human Pose Estimation with Sparse Inertial Sensors

Yu Zhang, Songpengcheng Xia, Lei Chu et al.

CVPR 2024posterarXiv:2312.02196
#4544

Person-in-WiFi 3D: End-to-End Multi-Person 3D Pose Estimation with Wi-Fi

Kangwei Yan, Fei Wang, Bo Qian et al.

CVPR 2024poster
#4545

From Zero to Detail: Deconstructing Ultra-High-Definition Image Restoration from Progressive Spectral Perspective

Chen Zhao, Zhizhou Chen, Yunzhe Xu et al.

CVPR 2025posterarXiv:2503.13165
#4546

ERMVP: Communication-Efficient and Collaboration-Robust Multi-Vehicle Perception in Challenging Environments

Jingyu Zhang, Kun Yang, Yilei Wang et al.

CVPR 2024poster
#4547

GRAM: Global Reasoning for Multi-Page VQA

Itshak Blau, Sharon Fogel, Roi Ronen et al.

CVPR 2024posterarXiv:2401.03411
#4548

FreeCloth: Free-form Generation Enhances Challenging Clothed Human Modeling

Hang Ye, Xiaoxuan Ma, Hai Ci et al.

CVPR 2025highlightarXiv:2411.19942
#4549

HalluciDoctor: Mitigating Hallucinatory Toxicity in Visual Instruction Data

Qifan Yu, Juncheng Li, Longhui Wei et al.

CVPR 2024posterarXiv:2311.13614
#4550

Tri-Perspective View Decomposition for Geometry-Aware Depth Completion

Zhiqiang Yan, Yuankai Lin, Kun Wang et al.

CVPR 2024posterarXiv:2403.15008
#4551

Progressive Focused Transformer for Single Image Super-Resolution

Wei Long, Xingyu Zhou, Leheng Zhang et al.

CVPR 2025posterarXiv:2503.20337
#4552

KVQ: Boosting Video Quality Assessment via Saliency-guided Local Perception

Yunpeng Qu, Kun Yuan, Qizhi Xie et al.

CVPR 2025posterarXiv:2503.10259
#4553

MoPE-CLIP: Structured Pruning for Efficient Vision-Language Models with Module-wise Pruning Error Metric

Haokun Lin, Haoli Bai, Zhili Liu et al.

CVPR 2024posterarXiv:2403.07839
#4554

Efficient Personalization of Quantized Diffusion Model without Backpropagation

Hoigi Seo, Wongi Jeong, Kyungryeol Lee et al.

CVPR 2025posterarXiv:2503.14868
#4555

From Sparse Signal to Smooth Motion: Real-Time Motion Generation with Rolling Prediction Models

German Barquero, Nadine Bertsch, Manojkumar Marramreddy et al.

CVPR 2025posterarXiv:2504.05265
#4556

DiffusionRegPose: Enhancing Multi-Person Pose Estimation using a Diffusion-Based End-to-End Regression Approach

Dayi Tan, Hansheng Chen, Wei Tian et al.

CVPR 2024poster
#4557

Scalable Video-to-Dataset Generation for Cross-Platform Mobile Agents

Yunseok Jang, Yeda Song, Sungryull Sohn et al.

CVPR 2025posterarXiv:2505.12632
#4558

Bridging the Vision-Brain Gap with an Uncertainty-Aware Blur Prior

Haitao Wu, Qing Li, Changqing Zhang et al.

CVPR 2025posterarXiv:2503.04207
#4559

Tumor Micro-environment Interactions Guided Graph Learning for Survival Analysis of Human Cancers from Whole-slide Pathological Images

WEI SHAO, YangYang Shi, Daoqiang Zhang et al.

CVPR 2024poster
#4560

Perception-Oriented Video Frame Interpolation via Asymmetric Blending

Guangyang Wu, Xin Tao, Changlin Li et al.

CVPR 2024posterarXiv:2404.06692
#4561

Cooperation Does Matter: Exploring Multi-Order Bilateral Relations for Audio-Visual Segmentation

Qi Yang, Xing Nie, Tong Li et al.

CVPR 2024highlightarXiv:2312.06462
#4562

Exact Fusion via Feature Distribution Matching for Few-shot Image Generation

Yingbo Zhou, Yutong Ye, Pengyu Zhang et al.

CVPR 2024poster
#4563

Fooling Polarization-Based Vision using Locally Controllable Polarizing Projection

Zhuoxiao Li, Zhihang Zhong, Shohei Nobuhara et al.

CVPR 2024posterarXiv:2303.17890
#4564

Affine Equivariant Networks Based on Differential Invariants

Yikang Li, Yeqing Qiu, Yuxuan Chen et al.

CVPR 2024poster
#4565

Diffusion-based Blind Text Image Super-Resolution

Yuzhe Zhang, jiawei zhang, Hao Li et al.

CVPR 2024posterarXiv:2312.08886
#4566

Improving Generalized Zero-Shot Learning by Exploring the Diverse Semantics from External Class Names

Yapeng Li, Yong Luo, Zengmao Wang et al.

CVPR 2024poster
#4567

Continual Learning for Motion Prediction Model via Meta-Representation Learning and Optimal Memory Buffer Retention Strategy

Dae Jun Kang, Dongsuk Kum, Sanmin Kim

CVPR 2024poster
#4568

FlowDiffuser: Advancing Optical Flow Estimation with Diffusion Models

Ao Luo, XIN LI, Fan Yang et al.

CVPR 2024highlight
#4569

3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting

Zhiyin Qian, Shaofei Wang, Marko Mihajlovic et al.

CVPR 2024posterarXiv:2312.09228
#4570

Free Lunch Enhancements for Multi-modal Crowd Counting

Haoliang Meng, Xiaopeng Hong, Zhengqin Lai et al.

CVPR 2025poster
#4571

Drag Your Noise: Interactive Point-based Editing via Diffusion Semantic Propagation

Haofeng Liu, Chenshu Xu, Yifei Yang et al.

CVPR 2024posterarXiv:2404.01050
#4572

AdaRevD: Adaptive Patch Exiting Reversible Decoder Pushes the Limit of Image Deblurring

Xintian Mao, Xiwen Gao, Yan Wang

CVPR 2024posterarXiv:2406.09135
#4573

Puff-Net: Efficient Style Transfer with Pure Content and Style Feature Fusion Network

Sizhe Zheng, Pan Gao, Peng Zhou et al.

CVPR 2024posterarXiv:2405.19775
#4574

SynSP: Synergy of Smoothness and Precision in Pose Sequences Refinement

Tao Wang, Lei Jin, Zheng Wang et al.

CVPR 2024poster
#4575

Building Vision-Language Models on Solid Foundations with Masked Distillation

Sepehr Sameni, Kushal Kafle, Hao Tan et al.

CVPR 2024poster
#4576

Marten: Visual Question Answering with Mask Generation for Multi-modal Document Understanding

Zining Wang, Tongkun Guan, Pei Fu et al.

CVPR 2025posterarXiv:2503.14140
#4577

MS-DETR: Efficient DETR Training with Mixed Supervision

Chuyang Zhao, Yifan Sun, Wenhao Wang et al.

CVPR 2024posterarXiv:2401.03989
#4578

PUP 3D-GS: Principled Uncertainty Pruning for 3D Gaussian Splatting

Alex Hanson, Allen Tu, Vasu Singla et al.

CVPR 2025posterarXiv:2406.10219
#4579

Immune: Improving Safety Against Jailbreaks in Multi-modal LLMs via Inference-Time Alignment

Soumya Suvra Ghosal, Souradip Chakraborty, Vaibhav Singh et al.

CVPR 2025posterarXiv:2411.18688
#4580

Test-Time Backdoor Detection for Object Detection Models

Hangtao Zhang, Yichen Wang, Shihui Yan et al.

CVPR 2025posterarXiv:2503.15293
#4581

FaceChain-SuDe: Building Derived Class to Inherit Category Attributes for One-shot Subject-Driven Generation

Pengchong Qiao, Lei Shang, Chang Liu et al.

CVPR 2024posterarXiv:2403.06775
#4582

Towards High-fidelity Artistic Image Vectorization via Texture-Encapsulated Shape Parameterization

Ye Chen, Bingbing Ni, Jinfan Liu et al.

CVPR 2024poster
#4583

OmniSDF: Scene Reconstruction using Omnidirectional Signed Distance Functions and Adaptive Binoctrees

Hakyeong Kim, Andreas Meuleman, Hyeonjoong Jang et al.

CVPR 2024posterarXiv:2404.00678
#4584

PVC: Progressive Visual Token Compression for Unified Image and Video Processing in Large Vision-Language Models

Chenyu Yang, Xuan Dong, Xizhou Zhu et al.

CVPR 2025posterarXiv:2412.09613
#4585

Deformable One-shot Face Stylization via DINO Semantic Guidance

Yang Zhou, Zichong Chen, Hui Huang

CVPR 2024posterarXiv:2403.00459
#4586

Density-Guided Semi-Supervised 3D Semantic Segmentation with Dual-Space Hardness Sampling

Jianan Li, Qiulei Dong

CVPR 2024poster
#4587

Continual SFT Matches Multimodal RLHF with Negative Supervision

Ke Zhu, Yu Wang, Yanpeng Sun et al.

CVPR 2025posterarXiv:2411.14797
#4588

Decomposing Disease Descriptions for Enhanced Pathology Detection: A Multi-Aspect Vision-Language Pre-training Framework

Vu Minh Hieu Phan, Yutong Xie, Yuankai Qi et al.

CVPR 2024posterarXiv:2403.07636
#4589

MultiMorph: On-demand Atlas Construction

Mazdak Abulnaga, Andrew Hoopes, Neel Dey et al.

CVPR 2025posterarXiv:2504.00247
#4590

Any3DIS: Class-Agnostic 3D Instance Segmentation by 2D Mask Tracking

Phuc Nguyen, Minh Luu, Anh Tran et al.

CVPR 2025posterarXiv:2411.16183
#4591

DarkIR: Robust Low-Light Image Restoration

Daniel Feijoo, Juan C. Benito, Alvaro Garcia et al.

CVPR 2025posterarXiv:2412.13443
#4592

LP++: A Surprisingly Strong Linear Probe for Few-Shot CLIP

Yunshi HUANG, Fereshteh Shakeri, Jose Dolz et al.

CVPR 2024posterarXiv:2404.02285
#4593

ImagineFSL: Self-Supervised Pretraining Matters on Imagined Base Set for VLM-based Few-shot Learning

Haoyuan Yang, Xiaoou Li, Jiaming Lv et al.

CVPR 2025highlight
#4594

1-Lipschitz Layers Compared: Memory Speed and Certifiable Robustness

Bernd Prach, Fabio Brau, Giorgio Buttazzo et al.

CVPR 2024poster
#4595

VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models

Hyeonho Jeong, Geon Yeong Park, Jong Chul Ye

CVPR 2024posterarXiv:2312.00845
#4596

TokenHSI: Unified Synthesis of Physical Human-Scene Interactions through Task Tokenization

Liang Pan, Zeshi Yang, Zhiyang Dou et al.

CVPR 2025posterarXiv:2503.19901
#4597

DepthSplat: Connecting Gaussian Splatting and Depth

Haofei Xu, Songyou Peng, Fangjinhua Wang et al.

CVPR 2025posterarXiv:2410.13862
#4598

Towards Optimizing Large-Scale Multi-Graph Matching in Bioimaging

Max Kahl, Sebastian Stricker, Lisa Hutschenreiter et al.

CVPR 2025poster
#4599

PoNQ: a Neural QEM-based Mesh Representation

Nissim Maruani, Maks Ovsjanikov, Pierre Alliez et al.

CVPR 2024posterarXiv:2403.12870
#4600

M3-UDA: A New Benchmark for Unsupervised Domain Adaptive Fetal Cardiac Structure Detection

Bin Pu, Liwen Wang, Jiewen Yang et al.

CVPR 2024poster