Most Cited CVPR "multimodal video analysis" Papers

5,589 papers found • Page 22 of 28

#4201

Affine Equivariant Networks Based on Differential Invariants

Yikang Li, Yeqing Qiu, Yuxuan Chen et al.

CVPR 2024 • poster
#4202

Diffusion-based Blind Text Image Super-Resolution

Yuzhe Zhang, Jiawei Zhang, Hao Li et al.

CVPR 2024 • poster • arXiv:2312.08886
#4203

Improving Generalized Zero-Shot Learning by Exploring the Diverse Semantics from External Class Names

Yapeng Li, Yong Luo, Zengmao Wang et al.

CVPR 2024 • poster
#4204

Continual Learning for Motion Prediction Model via Meta-Representation Learning and Optimal Memory Buffer Retention Strategy

Dae Jun Kang, Dongsuk Kum, Sanmin Kim

CVPR 2024 • poster
#4205

FlowDiffuser: Advancing Optical Flow Estimation with Diffusion Models

Ao Luo, Xin Li, Fan Yang et al.

CVPR 2024 • highlight
#4206

3DGS-Avatar: Animatable Avatars via Deformable 3D Gaussian Splatting

Zhiyin Qian, Shaofei Wang, Marko Mihajlovic et al.

CVPR 2024 • poster • arXiv:2312.09228
#4207

Drag Your Noise: Interactive Point-based Editing via Diffusion Semantic Propagation

Haofeng Liu, Chenshu Xu, Yifei Yang et al.

CVPR 2024 • poster • arXiv:2404.01050
#4208

AdaRevD: Adaptive Patch Exiting Reversible Decoder Pushes the Limit of Image Deblurring

Xintian Mao, Xiwen Gao, Yan Wang

CVPR 2024 • poster • arXiv:2406.09135
#4209

MultiMorph: On-demand Atlas Construction

Mazdak Abulnaga, Andrew Hoopes, Neel Dey et al.

CVPR 2025 • poster • arXiv:2504.00247
#4210

Puff-Net: Efficient Style Transfer with Pure Content and Style Feature Fusion Network

Sizhe Zheng, Pan Gao, Peng Zhou et al.

CVPR 2024 • poster • arXiv:2405.19775
#4211

SynSP: Synergy of Smoothness and Precision in Pose Sequences Refinement

Tao Wang, Lei Jin, Zheng Wang et al.

CVPR 2024 • poster
#4212

Building Vision-Language Models on Solid Foundations with Masked Distillation

Sepehr Sameni, Kushal Kafle, Hao Tan et al.

CVPR 2024 • poster
#4213

Any3DIS: Class-Agnostic 3D Instance Segmentation by 2D Mask Tracking

Phuc Nguyen, Minh Luu, Anh Tran et al.

CVPR 2025 • poster • arXiv:2411.16183
#4214

MS-DETR: Efficient DETR Training with Mixed Supervision

Chuyang Zhao, Yifan Sun, Wenhao Wang et al.

CVPR 2024 • poster • arXiv:2401.03989
#4215

DarkIR: Robust Low-Light Image Restoration

Daniel Feijoo, Juan C. Benito, Alvaro Garcia et al.

CVPR 2025 • poster • arXiv:2412.13443
#4216

ImagineFSL: Self-Supervised Pretraining Matters on Imagined Base Set for VLM-based Few-shot Learning

Haoyuan Yang, Xiaoou Li, Jiaming Lv et al.

CVPR 2025 • highlight
#4217

TokenHSI: Unified Synthesis of Physical Human-Scene Interactions through Task Tokenization

Liang Pan, Zeshi Yang, Zhiyang Dou et al.

CVPR 2025 • poster • arXiv:2503.19901
#4218

FaceChain-SuDe: Building Derived Class to Inherit Category Attributes for One-shot Subject-Driven Generation

Pengchong Qiao, Lei Shang, Chang Liu et al.

CVPR 2024 • poster • arXiv:2403.06775
#4219

Towards High-fidelity Artistic Image Vectorization via Texture-Encapsulated Shape Parameterization

Ye Chen, Bingbing Ni, Jinfan Liu et al.

CVPR 2024 • poster
#4220

OmniSDF: Scene Reconstruction using Omnidirectional Signed Distance Functions and Adaptive Binoctrees

Hakyeong Kim, Andreas Meuleman, Hyeonjoong Jang et al.

CVPR 2024 • poster • arXiv:2404.00678
#4221

DepthSplat: Connecting Gaussian Splatting and Depth

Haofei Xu, Songyou Peng, Fangjinhua Wang et al.

CVPR 2025 • poster • arXiv:2410.13862
#4222

Deformable One-shot Face Stylization via DINO Semantic Guidance

Yang Zhou, Zichong Chen, Hui Huang

CVPR 2024 • poster • arXiv:2403.00459
#4223

Density-Guided Semi-Supervised 3D Semantic Segmentation with Dual-Space Hardness Sampling

Jianan Li, Qiulei Dong

CVPR 2024 • poster
#4224

Decomposing Disease Descriptions for Enhanced Pathology Detection: A Multi-Aspect Vision-Language Pre-training Framework

Vu Minh Hieu Phan, Yutong Xie, Yuankai Qi et al.

CVPR 2024 • poster • arXiv:2403.07636
#4225

Multitwine: Multi-Object Compositing with Text and Layout Control

Gemma Canet Tarrés, Zhe Lin, Zhifei Zhang et al.

CVPR 2025 • highlight • arXiv:2502.05165
#4226

LP++: A Surprisingly Strong Linear Probe for Few-Shot CLIP

Yunshi Huang, Fereshteh Shakeri, Jose Dolz et al.

CVPR 2024 • poster • arXiv:2404.02285
#4227

1-Lipschitz Layers Compared: Memory Speed and Certifiable Robustness

Bernd Prach, Fabio Brau, Giorgio Buttazzo et al.

CVPR 2024 • poster
#4228

VMC: Video Motion Customization using Temporal Attention Adaption for Text-to-Video Diffusion Models

Hyeonho Jeong, Geon Yeong Park, Jong Chul Ye

CVPR 2024 • poster • arXiv:2312.00845
#4229

RelationField: Relate Anything in Radiance Fields

Sebastian Koch, Johanna Wald, Mirco Colosi et al.

CVPR 2025 • poster • arXiv:2412.13652
#4230

Recover and Match: Open-Vocabulary Multi-Label Recognition through Knowledge-Constrained Optimal Transport

Hao Tan, Zichang Tan, Jun Li et al.

CVPR 2025 • poster • arXiv:2503.15337
#4231

PoNQ: a Neural QEM-based Mesh Representation

Nissim Maruani, Maks Ovsjanikov, Pierre Alliez et al.

CVPR 2024 • poster • arXiv:2403.12870
#4232

M3-UDA: A New Benchmark for Unsupervised Domain Adaptive Fetal Cardiac Structure Detection

Bin Pu, Liwen Wang, Jiewen Yang et al.

CVPR 2024 • poster
#4233

Expandable Subspace Ensemble for Pre-Trained Model-Based Class-Incremental Learning

Da-Wei Zhou, Hai-Long Sun, Han-Jia Ye et al.

CVPR 2024 • poster • arXiv:2403.12030
#4234

Beyond Image Super-Resolution for Image Recognition with Task-Driven Perceptual Loss

Jaeha Kim, Junghun Oh, Kyoung Mu Lee

CVPR 2024 • poster • arXiv:2404.01692
#4235

Point-VOS: Pointing Up Video Object Segmentation

Sabarinath Mahadevan, Idil Esen Zulfikar, Paul Voigtlaender et al.

CVPR 2024 • poster • arXiv:2402.05917
#4236

DFormerv2: Geometry Self-Attention for RGBD Semantic Segmentation

Bo-Wen Yin, Jiao-Long Cao, Ming-Ming Cheng et al.

CVPR 2025 • poster • arXiv:2504.04701
#4237

Light Transport-aware Diffusion Posterior Sampling for Single-View Reconstruction of 3D Volumes

Ludwic Leonard, Nils Thuerey, Rüdiger Westermann

CVPR 2025 • highlight • arXiv:2501.05226
#4238

A Dataset for Semantic Segmentation in the Presence of Unknowns

Zakaria Laskar, Tomas Vojir, Matej Grcic et al.

CVPR 2025 • poster • arXiv:2503.22309
#4239

3D Face Tracking from 2D Video through Iterative Dense UV to Image Flow

Felix Taubner, Prashant Raina, Mathieu Tuli et al.

CVPR 2024 • poster • arXiv:2404.09819
#4240

HIT: Estimating Internal Human Implicit Tissues from the Body Surface

Marilyn Keller, Vaibhav Arora, Abdelmouttaleb Dakri et al.

CVPR 2024 • poster
#4241

FoundHand: Large-Scale Domain-Specific Learning for Controllable Hand Image Generation

Kefan Chen, Chaerin Min, Linguang Zhang et al.

CVPR 2025 • highlight • arXiv:2412.02690
#4242

Efficient Event-Based Object Detection: A Hybrid Neural Network with Spatial and Temporal Attention

Soikat Hasan Ahmed, Jan Finkbeiner, Emre Neftci

CVPR 2025 • poster • arXiv:2403.10173
#4243

Locally Orderless Images for Optimization in Differentiable Rendering

Ishit Mehta, Manmohan Chandraker, Ravi Ramamoorthi

CVPR 2025 • highlight • arXiv:2503.21931
#4244

Authentic Hand Avatar from a Phone Scan via Universal Hand Model

Gyeongsik Moon, Weipeng Xu, Rohan Joshi et al.

CVPR 2024 • poster • arXiv:2405.07933
#4245

DeDe: Detecting Backdoor Samples for SSL Encoders via Decoders

Sizai Hou, Songze Li, Duanyi Yao

CVPR 2025 • poster • arXiv:2411.16154
#4246

Task-Driven Exploration: Decoupling and Inter-Task Feedback for Joint Moment Retrieval and Highlight Detection

Jin Yang, Ping Wei, Huan Li et al.

CVPR 2024 • poster • arXiv:2404.09263
#4247

Multiway Point Cloud Mosaicking with Diffusion and Global Optimization

Shengze Jin, Iro Armeni, Marc Pollefeys et al.

CVPR 2024 • poster • arXiv:2404.00429
#4248

Do Computer Vision Foundation Models Learn the Low-level Characteristics of the Human Visual System?

Yancheng Cai, Fei Yin, Dounia Hammou et al.

CVPR 2025 • highlight • arXiv:2502.20256
#4249

NeRSP: Neural 3D Reconstruction for Reflective Objects with Sparse Polarized Images

Yufei Han, Heng Guo, Koki Fukai et al.

CVPR 2024 • poster • arXiv:2406.07111
#4250

HDRFlow: Real-Time HDR Video Reconstruction with Large Motions

Gangwei Xu, Yujin Wang, Jinwei Gu et al.

CVPR 2024 • poster • arXiv:2403.03447
#4251

Style-Editor: Text-driven Object-centric Style Editing

Jihun Park, Jongmin Gim, Kyoungmin Lee et al.

CVPR 2025 • highlight • arXiv:2408.08461
#4252

Exploring Temporally-Aware Features for Point Tracking

Inès Hyeonsu Kim, Seokju Cho, Gabriel Huang et al.

CVPR 2025 • poster • arXiv:2501.12218
#4253

Generative Inbetweening through Frame-wise Conditions-Driven Video Generation

Tianyi Zhu, Dongwei Ren, Qilong Wang et al.

CVPR 2025 • poster • arXiv:2412.11755
#4254

Feature4X: Bridging Any Monocular Video to 4D Agentic AI with Versatile Gaussian Feature Fields

Shijie Zhou, Hui Ren, Yijia Weng et al.

CVPR 2025 • poster • arXiv:2503.20776
#4255

Beyond Average: Individualized Visual Scanpath Prediction

Xianyu Chen, Ming Jiang, Qi Zhao

CVPR 2024 • poster • arXiv:2404.12235
#4256

Beyond Text: Frozen Large Language Models in Visual Signal Comprehension

Lei Zhu, Fangyun Wei, Yanye Lu

CVPR 2024 • poster • arXiv:2403.07874
#4257

LEDITS++: Limitless Image Editing using Text-to-Image Models

Manuel Brack, Felix Friedrich, Katharina Kornmeier et al.

CVPR 2024 • poster • arXiv:2311.16711
#4258

Accurate Differential Operators for Hybrid Neural Fields

Aditya Chetan, Guandao Yang, Zichen Wang et al.

CVPR 2025 • poster • arXiv:2312.05984
#4259

Open Ad-hoc Categorization with Contextualized Feature Learning

Zilin Wang, Sangwoo Mo, Stella X. Yu et al.

CVPR 2025 • poster • arXiv:2512.16202
#4260

CAM Back Again: Large Kernel CNNs from a Weakly Supervised Object Localization Perspective

Shunsuke Yasuki, Masato Taki

CVPR 2024 • poster • arXiv:2403.06676
#4261

MERGE: Multi-faceted Hierarchical Graph-based GNN for Gene Expression Prediction from Whole Slide Histopathology Images

Aniruddha Ganguly, Debolina Chatterjee, Wentao Huang et al.

CVPR 2025 • poster • arXiv:2412.02601
#4262

Regularized Parameter Uncertainty for Improving Generalization in Reinforcement Learning

Pehuen Moure, Longbiao Cheng, Joachim Ott et al.

CVPR 2024 • poster
#4263

Robust Noisy Correspondence Learning with Equivariant Similarity Consistency

Yuchen Yang, Erkun Yang, Likai Wang et al.

CVPR 2024 • poster
#4264

Situational Awareness Matters in 3D Vision Language Reasoning

Yunze Man, Liang-Yan Gui, Yu-Xiong Wang

CVPR 2024 • poster • arXiv:2406.07544
#4265

Omni-ID: Holistic Identity Representation Designed for Generative Tasks

Guocheng Qian, Kuan-Chieh Wang, Or Patashnik et al.

CVPR 2025 • poster • arXiv:2412.09694
#4266

Decentralized Directed Collaboration for Personalized Federated Learning

Yingqi Liu, Yifan Shi, Qinglun Li et al.

CVPR 2024 • poster • arXiv:2405.17876
#4267

Task-Driven Wavelets using Constrained Empirical Risk Minimization

Eric Marcus, Ray Sheombarsing, Jan-Jakob Sonke et al.

CVPR 2024 • poster
#4268

BADGR: Bundle Adjustment Diffusion Conditioned by Gradients for Wide-Baseline Floor Plan Reconstruction

Yuguang Li, Ivaylo Boyadzhiev, Zixuan Liu et al.

CVPR 2025 • highlight • arXiv:2503.19340
#4269

SIFU: Side-view Conditioned Implicit Function for Real-world Usable Clothed Human Reconstruction

Zechuan Zhang, Zongxin Yang, Yi Yang

CVPR 2024 • highlight • arXiv:2312.06704
#4270

OmniVec2 - A Novel Transformer based Network for Large Scale Multimodal and Multitask Learning

Siddharth Srivastava, Gaurav Sharma

CVPR 2024 • poster • arXiv:2507.13364
#4271

Probing Synergistic High-Order Interaction in Infrared and Visible Image Fusion

Naishan Zheng, Man Zhou, Jie Huang et al.

CVPR 2024 • poster
#4272

SoundVista: Novel-View Ambient Sound Synthesis via Visual-Acoustic Binding

Mingfei Chen, Israel D. Gebru, Ishwarya Ananthabhotla et al.

CVPR 2025 • highlight • arXiv:2504.05576
#4273

Scaling Up Dynamic Human-Scene Interaction Modeling

Nan Jiang, Zhiyuan Zhang, Hongjie Li et al.

CVPR 2024 • highlight • arXiv:2403.08629
#4274

MVDoppler-Pose: Multi-Modal Multi-View mmWave Sensing for Long-Distance Self-Occluded Human Walking Pose Estimation

Jae-Ho Choi, Soheil Hor, Shubo Yang et al.

CVPR 2025 • poster
#4275

Utility-Fairness Trade-Offs and How to Find Them

Sepehr Dehdashtian, Bashir Sadeghi, Vishnu Naresh Boddeti

CVPR 2024 • poster • arXiv:2404.09454
#4276

A Bias-Free Training Paradigm for More General AI-generated Image Detection

Fabrizio Guillaro, Giada Zingarini, Ben Usman et al.

CVPR 2025 • poster • arXiv:2412.17671
#4277

ClimbingCap: Multi-Modal Dataset and Method for Rock Climbing in World Coordinate

Ming Yan, Xincheng Lin, Yuhua Luo et al.

CVPR 2025 • highlight • arXiv:2503.21268
#4278

Data-Free Quantization via Pseudo-label Filtering

Chunxiao Fan, Ziqi Wang, Dan Guo et al.

CVPR 2024 • poster
#4279

PhD: A ChatGPT-Prompted Visual Hallucination Evaluation Dataset

Jiazhen Liu, Yuhan Fu, Ruobing Xie et al.

CVPR 2025 • highlight • arXiv:2403.11116
#4280

DeepLA-Net: Very Deep Local Aggregation Networks for Point Cloud Analysis

Ziyin Zeng, Mingyue Dong, Jian Zhou et al.

CVPR 2025 • poster
#4281

Fitting Flats to Flats

Gabriel Dogadov, Ugo Finnendahl, Marc Alexa

CVPR 2024 • poster
#4282

HOIST-Former: Hand-held Objects Identification Segmentation and Tracking in the Wild

Supreeth Narasimhaswamy, Huy Anh Nguyen, Lihan Huang et al.

CVPR 2024 • poster
#4283

LamRA: Large Multimodal Model as Your Advanced Retrieval Assistant

Yikun Liu, Yajie Zhang, Jiayin Cai et al.

CVPR 2025 • poster • arXiv:2412.01720
#4284

Faster Parameter-Efficient Tuning with Token Redundancy Reduction

Kwonyoung Kim, Jungin Park, Jin Kim et al.

CVPR 2025 • poster • arXiv:2503.20282
#4285

Generative Multiview Relighting for 3D Reconstruction under Extreme Illumination Variation

Hadi Alzayer, Philipp Henzler, Jonathan T. Barron et al.

CVPR 2025 • highlight • arXiv:2412.15211
#4286

Towards Robust Learning to Optimize with Theoretical Guarantees

Qingyu Song, Wei Lin, Juncheng Wang et al.

CVPR 2024 • poster • arXiv:2506.14263
#4287

Animating General Image with Large Visual Motion Model

Dengsheng Chen, Xiaoming Wei, Xiaolin Wei

CVPR 2024 • poster
#4288

Feature Selection for Latent Factor Models

Rittwika Kansabanik, Adrian Barbu

CVPR 2025 • poster • arXiv:2412.10128
#4289

MetaCloak: Preventing Unauthorized Subject-driven Text-to-image Diffusion-based Synthesis via Meta-learning

Yixin Liu, Chenrui Fan, Yutong Dai et al.

CVPR 2024 • poster • arXiv:2311.13127
#4290

Stochastic Human Motion Prediction with Memory of Action Transition and Action Characteristic

Jianwei Tang, Hong Yang, Tengyue Chen et al.

CVPR 2025 • poster • arXiv:2507.04062
#4291

Attention IoU: Examining Biases in CelebA using Attention Maps

Aaron Serianni, Tyler Zhu, Olga Russakovsky et al.

CVPR 2025 • poster • arXiv:2503.19846
#4292

EventEgo3D: 3D Human Motion Capture from Egocentric Event Streams

Christen Millerdurai, Hiroyasu Akada, Jian Wang et al.

CVPR 2024 • poster • arXiv:2404.08640
#4293

Prof. Robot: Differentiable Robot Rendering Without Static and Self-Collisions

Quanyuan Ruan, Jiabao Lei, Wenhao Yuan et al.

CVPR 2025 • poster • arXiv:2503.11269
#4294

ModaVerse: Efficiently Transforming Modalities with LLMs

Xinyu Wang, Bohan Zhuang, Qi Wu

CVPR 2024 • poster • arXiv:2401.06395
#4295

Improving Generalization via Meta-Learning on Hard Samples

Nishant Jain, Arun Suggala, Pradeep Shenoy

CVPR 2024 • poster • arXiv:2403.12236
#4296

WaveFace: Authentic Face Restoration with Efficient Frequency Recovery

Yunqi Miao, Jiankang Deng, Jungong Han

CVPR 2024 • poster • arXiv:2403.12760
#4297

Hierarchical Histogram Threshold Segmentation – Auto-terminating High-detail Oversegmentation

Thomas Chang, Simon Seibt, Bartosz von Rymon Lipinski

CVPR 2024 • poster
#4298

Low-Biased General Annotated Dataset Generation

Dengyang Jiang, Haoyu Wang, Lei Zhang et al.

CVPR 2025 • poster • arXiv:2412.10831
#4299

CogAgent: A Visual Language Model for GUI Agents

Wenyi Hong, Weihan Wang, Qingsong Lv et al.

CVPR 2024 • highlight • arXiv:2312.08914
#4300

Learning Adaptive Spatial Coherent Correlations for Speech-Preserving Facial Expression Manipulation

Tianshui Chen, Jianman Lin, Zhijing Yang et al.

CVPR 2024 • highlight
#4301

UFORecon: Generalizable Sparse-View Surface Reconstruction from Arbitrary and Unfavorable Sets

Youngju Na, Woo Jae Kim, Kyu Han et al.

CVPR 2024 • poster • arXiv:2403.05086
#4302

Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data

Lihe Yang, Bingyi Kang, Zilong Huang et al.

CVPR 2024 • poster • arXiv:2401.10891
#4303

EventDance: Unsupervised Source-free Cross-modal Adaptation for Event-based Object Recognition

Xu Zheng, Addison, Lin Wang

CVPR 2024 • poster • arXiv:2403.14082
#4304

Real-World Efficient Blind Motion Deblurring via Blur Pixel Discretization

Insoo Kim, Jae Seok Choi, Geonseok Seo et al.

CVPR 2024 • poster • arXiv:2404.12168
#4305

ArtFormer: Controllable Generation of Diverse 3D Articulated Objects

Jiayi Su, Youhe Feng, Zheng Li et al.

CVPR 2025 • poster • arXiv:2412.07237
#4306

Learning Physics-Based Full-Body Human Reaching and Grasping from Brief Walking References

Yitang Li, Mingxian Lin, Zhuo Lin et al.

CVPR 2025 • poster • arXiv:2503.07481
#4307

Multi-criteria Token Fusion with One-step-ahead Attention for Efficient Vision Transformers

Sanghyeok Lee, Joonmyung Choi, Hyunwoo J. Kim

CVPR 2024 • poster • arXiv:2403.10030
#4308

MotiF: Making Text Count in Image Animation with Motion Focal Loss

Shijie Wang, Samaneh Azadi, Rohit Girdhar et al.

CVPR 2025 • poster • arXiv:2412.16153
#4309

BANF: Band-Limited Neural Fields for Levels of Detail Reconstruction

Ahan Shabanov, Shrisudhan Govindarajan, Cody Reading et al.

CVPR 2024 • poster • arXiv:2404.13024
#4310

HUGS: Holistic Urban 3D Scene Understanding via Gaussian Splatting

Hongyu Zhou, Jiahao Shao, Lu Xu et al.

CVPR 2024 • poster • arXiv:2403.12722
#4311

Human Motion Prediction Under Unexpected Perturbation

Jiangbei Yue, Baiyi Li, Julien Pettré et al.

CVPR 2024 • highlight • arXiv:2403.15891
#4312

LLMs are Good Action Recognizers

Haoxuan Qu, Yujun Cai, Jun Liu

CVPR 2024 • poster • arXiv:2404.00532
#4313

SynFog: A Photo-realistic Synthetic Fog Dataset based on End-to-end Imaging Simulation for Advancing Real-World Defogging in Autonomous Driving

Yiming Xie, Henglu Wei, Zhenyi Liu et al.

CVPR 2024 • poster • arXiv:2403.17094
#4314

NeRFiller: Completing Scenes via Generative 3D Inpainting

Ethan Weber, Aleksander Holynski, Varun Jampani et al.

CVPR 2024 • poster • arXiv:2312.04560
#4315

PeVL: Pose-Enhanced Vision-Language Model for Fine-Grained Human Action Recognition

Haosong Zhang, Mei Leong, Liyuan Li et al.

CVPR 2024 • poster
#4316

MPOD123: One Image to 3D Content Generation Using Mask-enhanced Progressive Outline-to-Detail Optimization

Jimin Xu, Tianbao Wang, Tao Jin et al.

CVPR 2024 • poster
#4317

MambaVision: A Hybrid Mamba-Transformer Vision Backbone

Ali Hatamizadeh, Jan Kautz

CVPR 2025 • poster • arXiv:2407.08083
#4318

Look-Up Table Compression for Efficient Image Restoration

Yinglong Li, Jiacheng Li, Zhiwei Xiong

CVPR 2024 • highlight
#4319

Hourglass Tokenizer for Efficient Transformer-Based 3D Human Pose Estimation

Wenhao Li, Mengyuan Liu, Hong Liu et al.

CVPR 2024 • highlight • arXiv:2311.12028
#4320

RepAn: Enhanced Annealing through Re-parameterization

Xiang Fei, Xiawu Zheng, Yan Wang et al.

CVPR 2024 • poster
#4321

PAPR in Motion: Seamless Point-level 3D Scene Interpolation

Shichong Peng, Yanshu Zhang, Ke Li

CVPR 2024 • highlight • arXiv:2406.05533
#4322

Towards Modern Image Manipulation Localization: A Large-Scale Dataset and Novel Methods

Chenfan Qu, Yiwu Zhong, Chongyu Liu et al.

CVPR 2024 • poster
#4323

Dense Vision Transformer Compression with Few Samples

Hanxiao Zhang, Yifan Zhou, Guo-Hua Wang

CVPR 2024 • poster • arXiv:2403.18708
#4324

Generative Photomontage

Sean J. Liu, Nupur Kumari, Ariel Shamir et al.

CVPR 2025 • poster • arXiv:2408.07116
#4325

Can Large Vision-Language Models Correct Semantic Grounding Errors By Themselves?

Yuan-Hong Liao, Rafid Mahmood, Sanja Fidler et al.

CVPR 2025 • poster • arXiv:2404.06510
#4326

IntrinsicAvatar: Physically Based Inverse Rendering of Dynamic Humans from Monocular Videos via Explicit Ray Tracing

Shaofei Wang, Bozidar Antic, Andreas Geiger et al.

CVPR 2024 • poster • arXiv:2312.05210
#4327

Exploring Pose-Aware Human-Object Interaction via Hybrid Learning

Eastman Z Y Wu, Yali Li, Yuan Wang et al.

CVPR 2024 • poster
#4328

All in One Framework for Multimodal Re-identification in the Wild

He Li, Mang Ye, Ming Zhang et al.

CVPR 2024 • poster • arXiv:2405.04741
#4329

Bilateral Adaptation for Human-Object Interaction Detection with Occlusion-Robustness

Guangzhi Wang, Yangyang Guo, Ziwei Xu et al.

CVPR 2024 • poster
#4330

Community Forensics: Using Thousands of Generators to Train Fake Image Detectors

Jeongsoo Park, Andrew Owens

CVPR 2025 • poster • arXiv:2411.04125
#4331

TCP: Textual-based Class-aware Prompt tuning for Visual-Language Model

Hantao Yao, Rui Zhang, Changsheng Xu

CVPR 2024 • poster
#4332

RMT: Retentive Networks Meet Vision Transformers

Qihang Fan, Huaibo Huang, Mingrui Chen et al.

CVPR 2024 • poster • arXiv:2309.11523
#4333

FedSOL: Stabilized Orthogonal Learning with Proximal Restrictions in Federated Learning

Gihun Lee, Minchan Jeong, SangMook Kim et al.

CVPR 2024 • poster • arXiv:2308.12532
#4334

Low-Rank Approximation for Sparse Attention in Multi-Modal LLMs

Lin Song, Yukang Chen, Shuai Yang et al.

CVPR 2024 • poster
#4335

LAENeRF: Local Appearance Editing for Neural Radiance Fields

Lukas Radl, Michael Steiner, Andreas Kurz et al.

CVPR 2024 • poster • arXiv:2312.09913
#4336

PanoGS: Gaussian-based Panoptic Segmentation for 3D Open Vocabulary Scene Understanding

Hongjia Zhai, Hai Li, Zhenzhe Li et al.

CVPR 2025 • poster • arXiv:2503.18107
#4337

Don't Look into the Dark: Latent Codes for Pluralistic Image Inpainting

Haiwei Chen, Yajie Zhao

CVPR 2024 • poster • arXiv:2403.18186
#4338

Hyperbolic Safety-Aware Vision-Language Models

Tobia Poppi, Tejaswi Kasarla, Pascal Mettes et al.

CVPR 2025 • highlight • arXiv:2503.12127
#4339

Improved Visual Grounding through Self-Consistent Explanations

Ruozhen He, Paola Cascante-Bonilla, Ziyan Yang et al.

CVPR 2024 • poster • arXiv:2312.04554
#4340

GLane3D: Detecting Lanes with Graph of 3D Keypoints

Halil İbrahim Öztürk, Muhammet Esat Kalfaoglu, Ozsel Kilinc

CVPR 2025 • poster • arXiv:2503.23882
#4341

Masked Point-Entity Contrast for Open-Vocabulary 3D Scene Understanding

Yan Wang, Baoxiong Jia, Ziyu Zhu et al.

CVPR 2025 • poster • arXiv:2504.19500
#4342

HyperNVD: Accelerating Neural Video Decomposition via Hypernetworks

Maria Pilligua, Danna Xue, Javier Vazquez-Corral

CVPR 2025 • poster • arXiv:2503.17276
#4343

Time of the Flight of the Gaussians: Optimizing Depth Indirectly in Dynamic Radiance Fields

Runfeng Li, Mikhail Okunev, Zixuan Guo et al.

CVPR 2025 • poster • arXiv:2505.05356
#4344

AZ-NAS: Assembling Zero-Cost Proxies for Network Architecture Search

Junghyup Lee, Bumsub Ham

CVPR 2024 • poster • arXiv:2403.19232
#4345

On the Faithfulness of Vision Transformer Explanations

Junyi Wu, Weitai Kang, Hao Tang et al.

CVPR 2024 • poster • arXiv:2404.01415
#4346

HOTFormerLoc: Hierarchical Octree Transformer for Versatile Lidar Place Recognition Across Ground and Aerial Views

Ethan Griffiths, Maryam Haghighat, Simon Denman et al.

CVPR 2025 • poster • arXiv:2503.08140
#4347

CHAIN: Enhancing Generalization in Data-Efficient GANs via lipsCHitz continuity constrAIned Normalization

Yao Ni, Piotr Koniusz

CVPR 2024 • poster • arXiv:2404.00521
#4348

OneFormer3D: One Transformer for Unified Point Cloud Segmentation

Maksim Kolodiazhnyi, Anna Vorontsova, Anton Konushin et al.

CVPR 2024 • poster • arXiv:2311.14405
#4349

PerLA: Perceptive 3D Language Assistant

Guofeng Mei, Wei Lin, Luigi Riz et al.

CVPR 2025 • poster • arXiv:2411.19774
#4350

One-2-3-45++: Fast Single Image to 3D Objects with Consistent Multi-View Generation and 3D Diffusion

Minghua Liu, Ruoxi Shi, Linghao Chen et al.

CVPR 2024 • poster • arXiv:2311.07885
#4351

C2KD: Bridging the Modality Gap for Cross-Modal Knowledge Distillation

Fushuo Huo, Wenchao Xu, Jingcai Guo et al.

CVPR 2024 • highlight
#4352

StrokeFaceNeRF: Stroke-based Facial Appearance Editing in Neural Radiance Field

Xiao-juan Li, Dingxi Zhang, Shu-Yu Chen et al.

CVPR 2024 • poster
#4353

Neural Modes: Self-supervised Learning of Nonlinear Modal Subspaces

Jiahong Wang, Yinwei Du, Stelian Coros et al.

CVPR 2024 • poster • arXiv:2404.17620
#4354

WHAM: Reconstructing World-grounded Humans with Accurate 3D Motion

Soyong Shin, Juyong Kim, Eni Halilaj et al.

CVPR 2024 • poster • arXiv:2312.07531
#4355

Magma: A Foundation Model for Multimodal AI Agents

Jianwei Yang, Reuben Tan, Qianhui Wu et al.

CVPR 2025 • poster • arXiv:2502.13130
#4356

CLOAF: CoLlisiOn-Aware Human Flow

Andrey Davydov, Martin Engilberge, Mathieu Salzmann et al.

CVPR 2024 • poster • arXiv:2403.09050
#4357

FedUV: Uniformity and Variance for Heterogeneous Federated Learning

Ha Min Son, Moon-Hyun Kim, Tai-Myoung Chung et al.

CVPR 2024 • poster • arXiv:2402.18372
#4358

GOAL: Global-local Object Alignment Learning

Hyungyu Choi, Young Kyun Jang, Chanho Eom

CVPR 2025 • poster • arXiv:2503.17782
#4359

MINIMA: Modality Invariant Image Matching

Jiangwei Ren, Xingyu Jiang, Zizhuo Li et al.

CVPR 2025 • poster • arXiv:2412.19412
#4360

GenAssets: Generating in-the-wild 3D Assets in Latent Space

Ze Yang, Jingkang Wang, Haowei Zhang et al.

CVPR 2025 • poster
#4361

Towards Enhanced Image Inpainting: Mitigating Unwanted Object Insertion and Preserving Color Consistency

Yikai Wang, Chenjie Cao, Junqiu Yu et al.

CVPR 2025 • highlight • arXiv:2312.04831
#4362

LT3SD: Latent Trees for 3D Scene Diffusion

Quan Meng, Lei Li, Matthias Nießner et al.

CVPR 2025 • poster • arXiv:2409.08215
#4363

Learning Occupancy for Monocular 3D Object Detection

Liang Peng, Junkai Xu, Haoran Cheng et al.

CVPR 2024 • poster • arXiv:2305.15694
#4364

Bring Event into RGB and LiDAR: Hierarchical Visual-Motion Fusion for Scene Flow

Hanyu Zhou, Yi Chang, Zhiwei Shi

CVPR 2024 • poster • arXiv:2403.07432
#4365

Language-driven Grasp Detection

An Dinh Vuong, Minh Nhat Vu, Baoru Huang et al.

CVPR 2024 • poster • arXiv:2406.09489
#4366

Each Test Image Deserves A Specific Prompt: Continual Test-Time Adaptation for 2D Medical Image Segmentation

Ziyang Chen, Yongsheng Pan, Yiwen Ye et al.

CVPR 2024 • poster • arXiv:2311.18363
#4367

Realistic Test-Time Adaptation of Vision-Language Models

Maxime Zanella, Clément Fuchs, Christophe De Vleeschouwer et al.

CVPR 2025 • highlight • arXiv:2501.03729
#4368

Iterative Predictor-Critic Code Decoding for Real-World Image Dehazing

Jiayi Fu, Siyu Liu, Zikun Liu et al.

CVPR 2025 • poster • arXiv:2503.13147
#4369

SPMTrack: Spatio-Temporal Parameter-Efficient Fine-Tuning with Mixture of Experts for Scalable Visual Tracking

Wenrui Cai, Qingjie Liu, Yunhong Wang

CVPR 2025 • poster • arXiv:2503.18338
#4370

HuPerFlow: A Comprehensive Benchmark for Human vs. Machine Motion Estimation Comparison

Yung-Hao Yang, Zitang Sun, Taiki Fukiage et al.

CVPR 2025 • highlight
#4371

Abductive Ego-View Accident Video Understanding for Safe Driving Perception

Jianwu Fang, Lei-lei Li, Junfei Zhou et al.

CVPR 2024 • highlight • arXiv:2403.00436
#4372

Prompting Vision Foundation Models for Pathology Image Analysis

Chong Yin, Siqi Liu, Kaiyang Zhou et al.

CVPR 2024 • poster
#4373

Generative Omnimatte: Learning to Decompose Video into Layers

Yao-Chih Lee, Erika Lu, Sarah Rumbley et al.

CVPR 2025 • highlight • arXiv:2411.16683
#4374

Unmixing Before Fusion: A Generalized Paradigm for Multi-Source-based Hyperspectral Image Synthesis

Yang Yu, Erting Pan, Xinya Wang et al.

CVPR 2024 • poster
#4375

Localizing Events in Videos with Multimodal Queries

Gengyuan Zhang, Mang Ling Ada Fok, Jialu Ma et al.

CVPR 2025 • poster • arXiv:2406.10079
#4376

Zero-Shot Image Restoration Using Few-Step Guidance of Consistency Models (and Beyond)

Tomer Garber, Tom Tirer

CVPR 2025 • poster • arXiv:2412.20596
#4377

Silent Branding Attack: Trigger-free Data Poisoning Attack on Text-to-Image Diffusion Models

Sangwon Jang, June Suk Choi, Jaehyeong Jo et al.

CVPR 2025 • poster • arXiv:2503.09669
#4378

Navigating Beyond Dropout: An Intriguing Solution towards Generalizable Image Super Resolution

Hongjun Wang, Jiyuan Chen, Yinqiang Zheng et al.

CVPR 2024 • poster • arXiv:2402.18929
#4379

Auto-Train-Once: Controller Network Guided Automatic Network Pruning from Scratch

Xidong Wu, Shangqian Gao, Zeyu Zhang et al.

CVPR 2024 • poster • arXiv:2403.14729
#4380

SCINeRF: Neural Radiance Fields from a Snapshot Compressive Image

Yunhao Li, Xiaodong Wang, Ping Wang et al.

CVPR 2024 • highlight • arXiv:2403.20018
#4381

Learning to Control Camera Exposure via Reinforcement Learning

Kyunghyun Lee, Ukcheol Shin, Byeong-Uk Lee

CVPR 2024 • poster • arXiv:2404.01636
#4382

Regressor-Segmenter Mutual Prompt Learning for Crowd Counting

Mingyue Guo, Li Yuan, Zhaoyi Yan et al.

CVPR 2024 • poster • arXiv:2312.01711
#4383

Vector Graphics Generation via Mutually Impulsed Dual-domain Diffusion

Zhongyin Zhao, Ye Chen, Zhangli Hu et al.

CVPR 2024 • poster
#4384

Spectral Meets Spatial: Harmonising 3D Shape Matching and Interpolation

Dongliang Cao, Marvin Eisenberger, Nafie El Amrani et al.

CVPR 2024 • poster • arXiv:2402.18920
#4385

FocSAM: Delving Deeply into Focused Objects in Segmenting Anything

You Huang, Zongyu Lan, Liujuan Cao et al.

CVPR 2024 • poster • arXiv:2405.18706
#4386

Pathways on the Image Manifold: Image Editing via Video Generation

Noam Rotstein, Gal Yona, Daniel Silver et al.

CVPR 2025 • poster • arXiv:2411.16819
#4387

Sculpt3D: Multi-View Consistent Text-to-3D Generation with Sparse 3D Prior

Chen Cheng, Xiaofeng Yang, Fan Yang et al.

CVPR 2024 • poster • arXiv:2403.09140
#4388

Scene-agnostic Pose Regression for Visual Localization

Junwei Zheng, Ruiping Liu, Yufan Chen et al.

CVPR 2025 • poster • arXiv:2503.19543
#4389

PhysAnimator: Physics-Guided Generative Cartoon Animation

Tianyi Xie, Yiwei Zhao, Ying Jiang et al.

CVPR 2025 • poster • arXiv:2501.16550
#4390

Conformal Prediction for Zero-Shot Models

Julio Silva-Rodríguez, Ismail Ben Ayed, Jose Dolz

CVPR 2025 • poster • arXiv:2505.24693
#4391

Real-time Free-view Human Rendering from Sparse-view RGB Videos using Double Unprojected Textures

Guoxing Sun, Rishabh Dabral, Heming Zhu et al.

CVPR 2025 • highlight • arXiv:2412.13183
#4392

Learning to Transform Dynamically for Better Adversarial Transferability

Rongyi Zhu, Zeliang Zhang, Susan Liang et al.

CVPR 2024 • poster • arXiv:2405.14077
#4393

Spin-UP: Spin Light for Natural Light Uncalibrated Photometric Stereo

Zongrui Li, Zhan Lu, Haojie Yan et al.

CVPR 2024 • poster • arXiv:2404.01612
#4394

SEAS: ShapE-Aligned Supervision for Person Re-Identification

Haidong Zhu, Pranav Budhwant, Zhaoheng Zheng et al.

CVPR 2024 • poster
#4395

Learning to Select Views for Efficient Multi-View Understanding

Yunzhong Hou, Stephen Gould, Liang Zheng

CVPR 2024 • poster
#4396

LidaRF: Delving into Lidar for Neural Radiance Field on Street Scenes

Shanlin Sun, Bingbing Zhuang, Ziyu Jiang et al.

CVPR 2024 • highlight • arXiv:2405.00900
#4397

Task2Box: Box Embeddings for Modeling Asymmetric Task Relationships

Rangel Daroya, Aaron Sun, Subhransu Maji

CVPR 2024 • highlight • arXiv:2403.17173
#4398

UniGS: Unified Representation for Image Generation and Segmentation

Lu Qi, Lehan Yang, Weidong Guo et al.

CVPR 2024 • poster • arXiv:2312.01985
#4399

ExMap: Leveraging Explainability Heatmaps for Unsupervised Group Robustness to Spurious Correlations

Rwiddhi Chakraborty, Adrian de Sena Sletten, Michael C. Kampffmeyer

CVPR 2024 • poster • arXiv:2403.13870
#4400

DUDF: Differentiable Unsigned Distance Fields with Hyperbolic Scaling

Miguel Fainstein, Viviana Siless, Emmanuel Iarussi

CVPR 2024 • poster • arXiv:2402.08876