Most Cited CVPR "constrained online convex optimization" Papers

5,589 papers found • Page 17 of 28

#3201

Multimodal Representation Learning by Alternating Unimodal Adaptation

Xiaohui Zhang, Jaehong Yoon, Mohit Bansal et al.

CVPR 2024posterarXiv:2311.10707
#3202

Compositional Video Understanding with Spatiotemporal Structure-based Transformers

Hoyeoung Yun, Jinwoo Ahn, Minseo Kim et al.

CVPR 2024poster
#3203

Retaining Knowledge and Enhancing Long-Text Representations in CLIP through Dual-Teacher Distillation

Yuheng Feng, Changsong Wen, Zelin Peng et al.

CVPR 2025poster
#3204

Make-It-Vivid: Dressing Your Animatable Biped Cartoon Characters from Text

Junshu Tang, Yanhong Zeng, Ke Fan et al.

CVPR 2024posterarXiv:2403.16897
#3205

GeoAvatar: Geometrically-Consistent Multi-Person Avatar Reconstruction from Sparse Multi-View Videos

Soohyun Lee, SeoYeon Kim, HeeKyung Lee et al.

CVPR 2025poster
#3206

Universal Domain Adaptation for Semantic Segmentation

Seun-An Choe, Keon Hee Park, Jinwoo Choi et al.

CVPR 2025posterarXiv:2505.22458
#3207

Coherent Temporal Synthesis for Incremental Action Segmentation

Guodong Ding, Hans Golong, Angela Yao

CVPR 2024posterarXiv:2403.06102
#3208

Person in Place: Generating Associative Skeleton-Guidance Maps for Human-Object Interaction Image Editing

ChangHee Yang, ChanHee Kang, Kyeongbo Kong et al.

CVPR 2024poster
#3209

WF-VAE: Enhancing Video VAE by Wavelet-Driven Energy Flow for Latent Video Diffusion Model

Zongjian Li, Bin Lin, Yang Ye et al.

CVPR 2025posterarXiv:2411.17459
#3210

Estimating Extreme 3D Image Rotations using Cascaded Attention

Shay Dekel, Yosi Keller, Martin Čadík

CVPR 2024poster
#3211

Breaking the Memory Barrier of Contrastive Loss via Tile-Based Strategy

Zesen Cheng, Hang Zhang, Kehan Li et al.

CVPR 2025highlight
#3212

Towards Real-World HDR Video Reconstruction: A Large-Scale Benchmark Dataset and A Two-Stage Alignment Network

Yong Shu, Liquan Shen, Xiangyu Hu et al.

CVPR 2024posterarXiv:2405.00244
#3213

Photo-SLAM: Real-time Simultaneous Localization and Photorealistic Mapping for Monocular Stereo and RGB-D Cameras

Huajian Huang, Longwei Li, Hui Cheng et al.

CVPR 2024posterarXiv:2311.16728
#3214

Attention Calibration for Disentangled Text-to-Image Personalization

Yanbing Zhang, Mengping Yang, Qin Zhou et al.

CVPR 2024posterarXiv:2403.18551
#3215

SmartMask: Context Aware High-Fidelity Mask Generation for Fine-grained Object Insertion and Layout Control

Jaskirat Singh, Jianming Zhang, Qing Liu et al.

CVPR 2024posterarXiv:2312.05039
#3216

GraCo: Granularity-Controllable Interactive Segmentation

Yian Zhao, Kehan Li, Zesen Cheng et al.

CVPR 2024highlightarXiv:2405.00587
#3217

Segment Every Out-of-Distribution Object

Wenjie Zhao, Jia Li, Xin Dong et al.

CVPR 2024posterarXiv:2311.16516
#3218

Upscale-A-Video: Temporal-Consistent Diffusion Model for Real-World Video Super-Resolution

Shangchen Zhou, Peiqing Yang, Jianyi Wang et al.

CVPR 2024highlightarXiv:2312.06640
#3219

Scaling Up to Excellence: Practicing Model Scaling for Photo-Realistic Image Restoration In the Wild

Fanghua Yu, Jinjin Gu, Zheyuan Li et al.

CVPR 2024posterarXiv:2401.13627
#3220

Masked and Shuffled Blind Spot Denoising for Real-World Images

Hamadi Chihaoui, Paolo Favaro

CVPR 2024posterarXiv:2404.09389
#3221

GeoMM: On Geodesic Perspective for Multi-modal Learning

Shibin Mei, Hang Wang, Bingbing Ni

CVPR 2025posterarXiv:2505.11216
#3222

Open-Vocabulary Object 6D Pose Estimation

Jaime Corsetti, Davide Boscaini, Changjae Oh et al.

CVPR 2024highlightarXiv:2312.00690
#3223

Generative Region-Language Pretraining for Open-Ended Object Detection

Chuang Lin, Yi Jiang, Lizhen Qu et al.

CVPR 2024posterarXiv:2403.10191
#3224

Boosting Diffusion Models with Moving Average Sampling in Frequency Domain

Yurui Qian, Qi Cai, Yingwei Pan et al.

CVPR 2024posterarXiv:2403.17870
#3225

Repurposing Stable Diffusion Attention for Training-Free Unsupervised Interactive Segmentation

Markus Karmann, Onay Urfalioglu

CVPR 2025posterarXiv:2411.10411
#3226

Discovering Syntactic Interaction Clues for Human-Object Interaction Detection

Jinguo Luo, Weihong Ren, Weibo Jiang et al.

CVPR 2024poster
#3227

Stabilizing and Accelerating Autofocus with Expert Trajectory Regularized Deep Reinforcement Learning

Shouhang Zhu, Chenglin Li, Yuankun Jiang et al.

CVPR 2025poster
#3228

Quantifying Uncertainty in Motion Prediction with Variational Bayesian Mixture

Juanwu Lu, Can Cui, Yunsheng Ma et al.

CVPR 2024posterarXiv:2404.03789
#3229

RADIOv2.5: Improved Baselines for Agglomerative Vision Foundation Models

Greg Heinrich, Mike Ranzinger, Danny Yin et al.

CVPR 2025posterarXiv:2412.07679
#3230

Generative Latent Coding for Ultra-Low Bitrate Image Compression

Zhaoyang Jia, Jiahao Li, Bin Li et al.

CVPR 2024posterarXiv:2512.20194
#3231

Selectively Informative Description can Reduce Undesired Embedding Entanglements in Text-to-Image Personalization

Jimyeong Kim, Jungwon Park, Wonjong Rhee

CVPR 2024posterarXiv:2403.15330
#3232

Font-Agent: Enhancing Font Understanding with Large Language Models

Yingxin Lai, Cuijie Xu, Haitian Shi et al.

CVPR 2025poster
#3233

Multi-Modal Contrastive Masked Autoencoders: A Two-Stage Progressive Pre-training Approach for RGBD Datasets

Muhammad Abdullah Jamal, Omid Mohareri

CVPR 2025poster
#3234

SG-PGM: Partial Graph Matching Network with Semantic Geometric Fusion for 3D Scene Graph Alignment and Its Downstream Tasks

Yaxu Xie, Alain Pagani, Didier Stricker

CVPR 2024posterarXiv:2403.19474
#3235

Text-to-Image Diffusion Models are Great Sketch-Photo Matchmakers

Subhadeep Koley, Ayan Kumar Bhunia, Aneeshan Sain et al.

CVPR 2024posterarXiv:2403.07214
#3236

Back to 3D: Few-Shot 3D Keypoint Detection with Back-Projected 2D Features

Thomas Wimmer, Peter Wonka, Maks Ovsjanikov

CVPR 2024posterarXiv:2311.18113
#3237

Alpha-CLIP: A CLIP Model Focusing on Wherever You Want

Zeyi Sun, Ye Fang, Tong Wu et al.

CVPR 2024posterarXiv:2312.03818
#3238

DemoFusion: Democratising High-Resolution Image Generation With No $$$

Ruoyi DU, Dongliang Chang, Timothy Hospedales et al.

CVPR 2024posterarXiv:2311.16973
#3239

Activity-Biometrics: Person Identification from Daily Activities

Shehreen Azad, Yogesh S. Rawat

CVPR 2024posterarXiv:2403.17360
#3240

Holoported Characters: Real-time Free-viewpoint Rendering of Humans from Sparse RGB Cameras

Ashwath Shetty, Marc Habermann, Guoxing Sun et al.

CVPR 2024posterarXiv:2312.07423
#3241

Neighbor Relations Matter in Video Scene Detection

Jiawei Tan, Hongxing Wang, Jiaxin Li et al.

CVPR 2024poster
#3242

STINR: Deciphering Spatial Transcriptomics via Implicit Neural Representation

Yisi Luo, Xile Zhao, Kai Ye et al.

CVPR 2025poster
#3243

Fast ODE-based Sampling for Diffusion Models in Around 5 Steps

Zhenyu Zhou, Defang Chen, Can Wang et al.

CVPR 2024highlightarXiv:2312.00094
#3244

Referring Image Editing: Object-level Image Editing via Referring Expressions

Chang Liu, Xiangtai Li, Henghui Ding

CVPR 2024poster
#3245

InNeRF360: Text-Guided 3D-Consistent Object Inpainting on 360-degree Neural Radiance Fields

Dongqing Wang, Tong Zhang, Alaa Abboud et al.

CVPR 2024posterarXiv:2305.15094
#3246

3D-SLNR: A Super Lightweight Neural Representation for Large-scale 3D Mapping

Chenhui Shi, Fulin Tang, Ning An et al.

CVPR 2025poster
#3247

R-SCoRe: Revisiting Scene Coordinate Regression for Robust Large-Scale Visual Localization

Xudong Jiang, Fangjinhua Wang, Silvano Galliani et al.

CVPR 2025posterarXiv:2501.01421
#3248

A Unified Image-Dense Annotation Generation Model for Underwater Scenes

Hongkai Lin, Dingkang Liang, Zhenghao Qi et al.

CVPR 2025posterarXiv:2503.21771
#3249

From-Ground-To-Objects: Coarse-to-Fine Self-supervised Monocular Depth Estimation of Dynamic Objects with Ground Contact Prior

Jaeho Moon, Juan Luis Gonzalez Bello, Byeongjun Kwon et al.

CVPR 2024posterarXiv:2312.10118
#3250

Unsupervised Blind Image Deblurring Based on Self-Enhancement

Lufei Chen, Xiangpeng Tian, Shuhua Xiong et al.

CVPR 2024poster
#3251

Mask Grounding for Referring Image Segmentation

Yong Xien Chng, Henry Zheng, Yizeng Han et al.

CVPR 2024posterarXiv:2312.12198
#3252

SignGraph: A Sign Sequence is Worth Graphs of Nodes

Shiwei Gan, Yafeng Yin, Zhiwei Jiang et al.

CVPR 2024poster
#3253

Embracing Unimodal Aleatoric Uncertainty for Robust Multimodal Fusion

Zixian Gao, Xun Jiang, Xing Xu et al.

CVPR 2024poster
#3254

DGC-GNN: Leveraging Geometry and Color Cues for Visual Descriptor-Free 2D-3D Matching

Shuzhe Wang, Juho Kannala, Daniel Barath

CVPR 2024posterarXiv:2306.12547
#3255

Towards More General Video-based Deepfake Detection through Facial Component Guided Adaptation for Foundation Model

Yue-Hua Han, Tai-Ming Huang, Kailung Hua et al.

CVPR 2025posterarXiv:2404.05583
#3256

FreeDrag: Feature Dragging for Reliable Point-based Image Editing

Pengyang Ling, Lin Chen, Pan Zhang et al.

CVPR 2024posterarXiv:2307.04684
#3257

HIMap: HybrId Representation Learning for End-to-end Vectorized HD Map Construction

Yi ZHOU, Hui Zhang, Jiaqian Yu et al.

CVPR 2024posterarXiv:2403.08639
#3258

FactCheXcker: Mitigating Measurement Hallucinations in Chest X-ray Report Generation Models

Alice Heiman, Xiaoman Zhang, Emma Chen et al.

CVPR 2025posterarXiv:2411.18672
#3259

Pursuing Temporal-Consistent Video Virtual Try-On via Dynamic Pose Interaction

Dong Li, Wenqi Zhong, Wei Yu et al.

CVPR 2025posterarXiv:2505.16980
#3260

TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models

Yushi Huang, Ruihao Gong, Jing Liu et al.

CVPR 2024highlightarXiv:2311.16503
#3261

GaussianAvatars: Photorealistic Head Avatars with Rigged 3D Gaussians

Shenhan Qian, Tobias Kirschstein, Liam Schoneveld et al.

CVPR 2024highlightarXiv:2312.02069
#3262

Temporally Consistent Object-Centric Learning by Contrasting Slots

Anna Manasyan, Maximilian Seitzer, Filip Radovic et al.

CVPR 2025posterarXiv:2412.14295
#3263

SET: Spectral Enhancement for Tiny Object Detection

Huixin Sun, Runqi Wang, Yanjing Li et al.

CVPR 2025poster
#3264

Fast3R: Towards 3D Reconstruction of 1000+ Images in One Forward Pass

Jianing "Jed" Yang, Alexander Sax, Kevin Liang et al.

CVPR 2025posterarXiv:2501.13928
#3265

Explaining CLIP's Performance Disparities on Data from Blind/Low Vision Users

Daniela Massiceti, Camilla Longden, Agnieszka Słowik et al.

CVPR 2024posterarXiv:2311.17315
#3266

MMCert: Provable Defense against Adversarial Attacks to Multi-modal Models

Yanting Wang, Hongye Fu, Wei Zou et al.

CVPR 2024posterarXiv:2403.19080
#3267

Illumination Spectrum Estimation for Multispectral Images via Surface Reflectance Modeling and Spatial-Spectral Feature Generation

Hyejin Oh, Woo-Shik Kim, Sangyoon Lee et al.

CVPR 2025poster
#3268

EchoWorld: Learning Motion-Aware World Models for Echocardiography Probe Guidance

Yang Yue, Yulin Wang, Haojun Jiang et al.

CVPR 2025posterarXiv:2504.13065
#3269

DiffusionMTL: Learning Multi-Task Denoising Diffusion Model from Partially Annotated Data

Hanrong Ye, Dan Xu

CVPR 2024posterarXiv:2403.15389
#3270

Revisiting Spatial-Frequency Information Integration from a Hierarchical Perspective for Panchromatic and Multi-Spectral Image Fusion

Jiangtong Tan, Jie Huang, Naishan Zheng et al.

CVPR 2024poster
#3271

CamFreeDiff: Camera-free Image to Panorama Generation with Diffusion Model

Xiaoding Yuan, Shitao Tang, Kejie Li et al.

CVPR 2025posterarXiv:2407.07174
#3272

MoVE-KD: Knowledge Distillation for VLMs with Mixture of Visual Encoders

jiajun cao, Yuan Zhang, Tao Huang et al.

CVPR 2025posterarXiv:2501.01709
#3273

FineSports: A Multi-person Hierarchical Sports Video Dataset for Fine-grained Action Understanding

Jinglin Xu, Guohao Zhao, Sibo Yin et al.

CVPR 2024poster
#3274

MULTIFLOW: Shifting Towards Task-Agnostic Vision-Language Pruning

Matteo Farina, Massimiliano Mancini, Elia Cunegatti et al.

CVPR 2024posterarXiv:2404.05621
#3275

Unleashing Unlabeled Data: A Paradigm for Cross-View Geo-Localization

Guopeng Li, Ming Qian, Gui-Song Xia

CVPR 2024posterarXiv:2403.14198
#3276

CURSOR: Scalable Mixed-Order Hypergraph Matching with CUR Decomposition

Qixuan Zheng, Ming Zhang, Hong Yan

CVPR 2024posterarXiv:2402.16594
#3277

FCS: Feature Calibration and Separation for Non-Exemplar Class Incremental Learning

Qiwei Li, Yuxin Peng, Jiahuan Zhou

CVPR 2024poster
#3278

DrivingSphere: Building a High-fidelity 4D World for Closed-loop Simulation

Tianyi Yan, Dongming Wu, Wencheng Han et al.

CVPR 2025posterarXiv:2411.11252
#3279

Rethinking the Adversarial Robustness of Multi-Exit Neural Networks in an Attack-Defense Game

Keyizhi Xu, Chi Zhang, Zhan Chen et al.

CVPR 2025poster
#3280

Exploring Regional Clues in CLIP for Zero-Shot Semantic Segmentation

Yi Zhang, Meng-Hao Guo, Miao Wang et al.

CVPR 2024poster
#3281

GALA: Generating Animatable Layered Assets from a Single Scan

Taeksoo Kim, Byungjun Kim, Shunsuke Saito et al.

CVPR 2024posterarXiv:2401.12979
#3282

Improving Graph Contrastive Learning via Adaptive Positive Sampling

Jiaming Zhuo, Feiyang Qin, Can Cui et al.

CVPR 2024poster
#3283

EntropyMark: Towards More Harmless Backdoor Watermark via Entropy-based Constraint for Open-source Dataset Copyright Protection

Ming Sun, Rui Wang, Zixuan Zhu et al.

CVPR 2025poster
#3284

Hearing Anything Anywhere

Mason Wang, Ryosuke Sawata, Samuel Clarke et al.

CVPR 2024posterarXiv:2406.07532
#3285

Dr2Net: Dynamic Reversible Dual-Residual Networks for Memory-Efficient Finetuning

Chen Zhao, Shuming Liu, Karttikeya Mangalam et al.

CVPR 2024poster
#3286

Diffusion-EDFs: Bi-equivariant Denoising Generative Modeling on SE(3) for Visual Robotic Manipulation

Hyunwoo Ryu, Jiwoo Kim, Hyunseok An et al.

CVPR 2024highlightarXiv:2309.02685
#3287

BOTH2Hands: Inferring 3D Hands from Both Text Prompts and Body Dynamics

Wenqian Zhang, Molin Huang, Yuxuan Zhou et al.

CVPR 2024posterarXiv:2312.07937
#3288

UniMamba: Unified Spatial-Channel Representation Learning with Group-Efficient Mamba for LiDAR-based 3D Object Detection

Xin Jin, Haisheng Su, Kai Liu et al.

CVPR 2025posterarXiv:2503.12009
#3289

Bayesian Exploration of Pre-trained Models for Low-shot Image Classification

Yibo Miao, Yu lei, Feng Zhou et al.

CVPR 2024posterarXiv:2404.00312
#3290

VolFormer: Explore More Comprehensive Cube Interaction for Hyperspectral Image Restoration and Beyond

Dabing Yu, Zheng Gao

CVPR 2025poster
#3291

Benchmarking the Robustness of Temporal Action Detection Models Against Temporal Corruptions

Runhao Zeng, Xiaoyong Chen, Jiaming Liang et al.

CVPR 2024posterarXiv:2403.20254
#3292

GoLF-NRT: Integrating Global Context and Local Geometry for Few-Shot View Synthesis

You Wang, Li Fang, Hao Zhu et al.

CVPR 2025posterarXiv:2505.19813
#3293

RepKPU: Point Cloud Upsampling with Kernel Point Representation and Deformation

Yi Rong, Haoran Zhou, Kang Xia et al.

CVPR 2024poster
#3294

4K4D: Real-Time 4D View Synthesis at 4K Resolution

Zhen Xu, Sida Peng, Haotong Lin et al.

CVPR 2024posterarXiv:2310.11448
#3295

Context-Guided Spatio-Temporal Video Grounding

Xin Gu, Heng Fan, Yan Huang et al.

CVPR 2024posterarXiv:2401.01578
#3296

SCFlow2: Plug-and-Play Object Pose Refiner with Shape-Constraint Scene Flow

Qingyuan Wang, Rui Song, Jiaojiao Li et al.

CVPR 2025posterarXiv:2504.09160
#3297

Theoretical Insights in Model Inversion Robustness and Conditional Entropy Maximization for Collaborative Inference Systems

Song Xia, Yi Yu, Wenhan Yang et al.

CVPR 2025highlightarXiv:2503.00383
#3298

Domain Adaptive Diabetic Retinopathy Grading with Model Absence and Flowing Data

Wenxin Su, Song Tang, Xiaofeng Liu et al.

CVPR 2025posterarXiv:2412.01203
#3299

TokenHMR: Advancing Human Mesh Recovery with a Tokenized Pose Representation

Sai Kumar Dwivedi, Yu Sun, Priyanka Patel et al.

CVPR 2024posterarXiv:2404.16752
#3300

Re-thinking Data Availability Attacks Against Deep Neural Networks

Bin Fang, Bo Li, Shuang Wu et al.

CVPR 2024poster
#3301

Rethinking Personalized Aesthetics Assessment: Employing Physique Aesthetics Assessment as An Exemplification

Haobin Zhong, Shuai He, Anlong Ming et al.

CVPR 2025highlight
#3302

Frequency-Biased Synergistic Design for Image Compression and Compensation

Jiaming Liu, Qi Zheng, Zihao Liu et al.

CVPR 2025poster
#3303

APT: Adaptive Personalized Training for Diffusion Models with Limited Data

JungWoo Chae, Jiyoon Kim, Jaewoong Choi et al.

CVPR 2025posterarXiv:2507.02687
#3304

Logit Standardization in Knowledge Distillation

Shangquan Sun, Wenqi Ren, Jingzhi Li et al.

CVPR 2024highlightarXiv:2403.01427
#3305

Foundations of the Theory of Performance-Based Ranking

Sébastien Piérard, Anaïs Halin, Anthony Cioppa et al.

CVPR 2025posterarXiv:2412.04227
#3306

A Unified Approach for Text- and Image-guided 4D Scene Generation

Yufeng Zheng, Xueting Li, Koki Nagano et al.

CVPR 2024posterarXiv:2311.16854
#3307

CONFORM: Contrast is All You Need for High-Fidelity Text-to-Image Diffusion Models

Tuna Han Salih Meral, Enis Simsar, Federico Tombari et al.

CVPR 2024posterarXiv:2312.06059
#3308

WISH: Weakly Supervised Instance Segmentation using Heterogeneous Labels

Hyeokjun Kweon, Kuk-Jin Yoon

CVPR 2025highlight
#3309

SPECAT: SPatial-spEctral Cumulative-Attention Transformer for High-Resolution Hyperspectral Image Reconstruction

Zhiyang Yao, Shuyang Liu, Xiaoyun Yuan et al.

CVPR 2024poster
#3310

Video-Based Human Pose Regression via Decoupled Space-Time Aggregation

Jijie He, Wenwu Yang

CVPR 2024posterarXiv:2403.19926
#3311

Neural Refinement for Absolute Pose Regression with Feature Synthesis

Shuai Chen, Yash Bhalgat, Xinghui Li et al.

CVPR 2024posterarXiv:2303.10087
#3312

Convex Combination Star Shape Prior for Data-driven Image Semantic Segmentation

Xinyu Zhao, Jun Xie, Shengzhe Chen et al.

CVPR 2025poster
#3313

VGGSfM: Visual Geometry Grounded Deep Structure From Motion

Jianyuan Wang, Nikita Karaev, Christian Rupprecht et al.

CVPR 2024highlight
#3314

Learning Conditional Space-Time Prompt Distributions for Video Class-Incremental Learning

Xiaohan Zou, Wenchao Ma, Shu Zhao

CVPR 2025highlight
#3315

Boosting Image Restoration via Priors from Pre-trained Models

Xiaogang Xu, Shu Kong, Tao Hu et al.

CVPR 2024posterarXiv:2403.06793
#3316

CPP-Net: Embracing Multi-Scale Feature Fusion into Deep Unfolding CP-PPA Network for Compressive Sensing

Zhen Guo, Hongping Gan

CVPR 2024poster
#3317

GenFlow: Generalizable Recurrent Flow for 6D Pose Refinement of Novel Objects

Sungphill Moon, Hyeontae Son, Dongcheol Hur et al.

CVPR 2024posterarXiv:2403.11510
#3318

PKU-DyMVHumans: A Multi-View Video Benchmark for High-Fidelity Dynamic Human Modeling

Xiaoyun Zheng, Liwei Liao, Xufeng Li et al.

CVPR 2024posterarXiv:2403.16080
#3319

Rotation-Equivariant Self-Supervised Method in Image Denoising

Hanze Liu, Jiahong Fu, Qi Xie et al.

CVPR 2025posterarXiv:2505.19618
#3320

MeshGPT: Generating Triangle Meshes with Decoder-Only Transformers

Yawar Siddiqui, Antonio Alliegro, Alexey Artemov et al.

CVPR 2024highlightarXiv:2311.15475
#3321

RadarDistill: Boosting Radar-based Object Detection Performance via Knowledge Distillation from LiDAR Features

Geonho Bang, Kwangjin Choi, Jisong Kim et al.

CVPR 2024posterarXiv:2403.05061
#3322

Task-Conditioned Adaptation of Visual Features in Multi-Task Policy Learning

Pierre Marza, Laetitia Matignon, Olivier Simonin et al.

CVPR 2024posterarXiv:2402.07739
#3323

ByTheWay: Boost Your Text-to-Video Generation Model to Higher Quality in a Training-free Way

Jiazi Bu, Pengyang Ling, Pan Zhang et al.

CVPR 2025posterarXiv:2410.06241
#3324

EasyDrag: Efficient Point-based Manipulation on Diffusion Models

Xingzhong Hou, Boxiao Liu, Yi Zhang et al.

CVPR 2024poster
#3325

MPDrive: Improving Spatial Understanding with Marker-Based Prompt Learning for Autonomous Driving

Zhi-Yuan Zhang, Xiaofan Li, Zhihao Xu et al.

CVPR 2025highlightarXiv:2504.00379
#3326

Learned Lossless Image Compression based on Bit Plane Slicing

Zhe Zhang, Huairui Wang, Zhenzhong Chen et al.

CVPR 2024poster
#3327

BEM: Balanced and Entropy-based Mix for Long-Tailed Semi-Supervised Learning

Hongwei Zheng, Linyuan Zhou, Han Li et al.

CVPR 2024posterarXiv:2404.01179
#3328

Dancing with Still Images: Video Distillation via Static-Dynamic Disentanglement

Ziyu Wang, Yue Xu, Cewu Lu et al.

CVPR 2024posterarXiv:2312.00362
#3329

SC-Tune: Unleashing Self-Consistent Referential Comprehension in Large Vision Language Models

Tongtian Yue, Jie Cheng, Longteng Guo et al.

CVPR 2024posterarXiv:2403.13263
#3330

Frequency-Adaptive Dilated Convolution for Semantic Segmentation

Linwei Chen, Lin Gu, Dezhi Zheng et al.

CVPR 2024highlightarXiv:2403.05369
#3331

Disentangled Pose and Appearance Guidance for Multi-Pose Generation

Tengfei Xiao, Yue Wu, Yuelong Li et al.

CVPR 2025poster
#3332

TexTile: A Differentiable Metric for Texture Tileability

Carlos Rodriguez-Pardo, Dan Casas, Elena Garces et al.

CVPR 2024posterarXiv:2403.12961
#3333

MatSynth: A Modern PBR Materials Dataset

Giuseppe Vecchio, Valentin Deschaintre

CVPR 2024posterarXiv:2401.06056
#3334

Image Processing GNN: Breaking Rigidity in Super-Resolution

Yuchuan Tian, Hanting Chen, Chao Xu et al.

CVPR 2024poster
#3335

ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth Estimation

Suraj Patni, Aradhye Agarwal, Chetan Arora

CVPR 2024posterarXiv:2403.18807
#3336

VI^3NR: Variance Informed Initialization for Implicit Neural Representations

Chamin Hewa Koneputugodage, Yizhak Ben-Shabat, Sameera Ramasinghe et al.

CVPR 2025poster
#3337

Efficient Diffusion as Low Light Enhancer

Guanzhou Lan, Qianli Ma, YUQI YANG et al.

CVPR 2025posterarXiv:2410.12346
#3338

Riemannian Multinomial Logistics Regression for SPD Neural Networks

Ziheng Chen, Yue Song, Gaowen Liu et al.

CVPR 2024posterarXiv:2305.11288
#3339

LED: A Large-scale Real-world Paired Dataset for Event Camera Denoising

Yuxing Duan

CVPR 2024posterarXiv:2405.19718
#3340

NoiseCollage: A Layout-Aware Text-to-Image Diffusion Model Based on Noise Cropping and Merging

Takahiro Shirakawa, Seiichi Uchida

CVPR 2024posterarXiv:2403.03485
#3341

OmniMedVQA: A New Large-Scale Comprehensive Evaluation Benchmark for Medical LVLM

Yutao Hu, Tianbin, Quanfeng Lu et al.

CVPR 2024posterarXiv:2402.09181
#3342

Visual Anagrams: Generating Multi-View Optical Illusions with Diffusion Models

Daniel Geng, Inbum Park, Andrew Owens

CVPR 2024posterarXiv:2311.17919
#3343

GliaNet: Adaptive Neural Network Structure Learning with Glia-Driven

Mengqiao Han, Liyuan Pan, Xiabi Liu

CVPR 2025poster
#3344

Weakly Supervised Semantic Segmentation via Progressive Confidence Region Expansion

Xiangfeng Xu, Pinyi Zhang, Wenxuan Huang et al.

CVPR 2025poster
#3345

Visual Programming for Zero-shot Open-Vocabulary 3D Visual Grounding

Zhihao Yuan, Jinke Ren, Chun-Mei Feng et al.

CVPR 2024posterarXiv:2311.15383
#3346

VidSeg: Training-free Video Semantic Segmentation based on Diffusion Models

Qian Wang, Abdelrahman Eldesokey, Mohit Mendiratta et al.

CVPR 2025poster
#3347

MOS-Attack: A Scalable Multi-objective Adversarial Attack Framework

Ping Guo, Cheng Gong, Fei Liu et al.

CVPR 2025posterarXiv:2501.07251
#3348

Holmes-VAU: Towards Long-term Video Anomaly Understanding at Any Granularity

Huaxin Zhang, Xiaohao Xu, Xiang Wang et al.

CVPR 2025highlightarXiv:2412.06171
#3349

SuperLightNet: Lightweight Parameter Aggregation Network for Multimodal Brain Tumor Segmentation

Feng Yu, Jiacheng Cao, Li Liu et al.

CVPR 2025poster
#3350

Towards HDR and HFR Video from Rolling-Mixed-Bit Spikings

Yakun Chang, Yeliduosi Xiaokaiti, Yujia Liu et al.

CVPR 2024poster
#3351

Learn from View Correlation: An Anchor Enhancement Strategy for Multi-view Clustering

Suyuan Liu, KE LIANG, Zhibin Dong et al.

CVPR 2024poster
#3352

Learning from Streaming Video with Orthogonal Gradients

Tengda Han, Dilara Gokay, Joseph Heyward et al.

CVPR 2025posterarXiv:2504.01961
#3353

Passive Snapshot Coded Aperture Dual-Pixel RGB-D Imaging

Bhargav Ghanekar, Salman Siddique Khan, Pranav Sharma et al.

CVPR 2024posterarXiv:2402.18102
#3354

UniPAD: A Universal Pre-training Paradigm for Autonomous Driving

Honghui Yang, Sha Zhang, Di Huang et al.

CVPR 2024posterarXiv:2310.08370
#3355

ECLIPSE: A Resource-Efficient Text-to-Image Prior for Image Generations

Maitreya Patel, Changhoon Kim, Sheng Cheng et al.

CVPR 2024posterarXiv:2312.04655
#3356

Low-Res Leads the Way: Improving Generalization for Super-Resolution by Self-Supervised Learning

Haoyu Chen, Wenbo Li, Jinjin Gu et al.

CVPR 2024posterarXiv:2403.02601
#3357

Neural Video Compression with Feature Modulation

Jiahao Li, Bin Li, Yan Lu

CVPR 2024posterarXiv:2402.17414
#3358

Nearest is Not Dearest: Towards Practical Defense against Quantization-conditioned Backdoor Attacks

Boheng Li, Yishuo Cai, Haowei Li et al.

CVPR 2024posterarXiv:2405.12725
#3359

Dual DETRs for Multi-Label Temporal Action Detection

Yuhan Zhu, Guozhen Zhang, Jing Tan et al.

CVPR 2024posterarXiv:2404.00653
#3360

Discriminative Probing and Tuning for Text-to-Image Generation

Leigang Qu, Wenjie Wang, Yongqi Li et al.

CVPR 2024posterarXiv:2403.04321
#3361

GigaTraj: Predicting Long-term Trajectories of Hundreds of Pedestrians in Gigapixel Complex Scenes

Haozhe Lin, Chunyu Wei, Li He et al.

CVPR 2024poster
#3362

Outdoor Scene Extrapolation with Hierarchical Generative Cellular Automata

Dongsu Zhang, Francis Williams, Žan Gojčič et al.

CVPR 2024highlightarXiv:2406.08292
#3363

Comparing the Decision-Making Mechanisms by Transformers and CNNs via Explanation Methods

Mingqi Jiang, Saeed Khorram, Li Fuxin

CVPR 2024posterarXiv:2212.06872
#3364

Continual Segmentation with Disentangled Objectness Learning and Class Recognition

Yizheng Gong, Siyue Yu, Xiaoyang Wang et al.

CVPR 2024posterarXiv:2403.03477
#3365

Image Sculpting: Precise Object Editing with 3D Geometry Control

Jiraphon Yenphraphai, Xichen Pan, Sainan Liu et al.

CVPR 2024posterarXiv:2401.01702
#3366

Sim-to-Real Causal Transfer: A Metric Learning Approach to Causally-Aware Interaction Representations

Ahmad Rahimi, Po-Chien Luan, Yuejiang Liu et al.

CVPR 2025posterarXiv:2312.04540
#3367

MANTA: Diffusion Mamba for Efficient and Effective Stochastic Long-Term Dense Action Anticipation

Olga Zatsarynna, Emad Bahrami, Yazan Abu Farha et al.

CVPR 2025poster
#3368

Attribute-Guided Pedestrian Retrieval: Bridging Person Re-ID with Internal Attribute Variability

Yan Huang, Zhang Zhang, Qiang Wu et al.

CVPR 2024poster
#3369

Weakly Misalignment-free Adaptive Feature Alignment for UAVs-based Multimodal Object Detection

Chen Chen, Jiahao Qi, Xingyue Liu et al.

CVPR 2024poster
#3370

Prompt-Driven Dynamic Object-Centric Learning for Single Domain Generalization

Deng Li, Aming Wu, Yaowei Wang et al.

CVPR 2024posterarXiv:2402.18447
#3371

EscherNet: A Generative Model for Scalable View Synthesis

Xin Kong, Shikun Liu, Xiaoyang Lyu et al.

CVPR 2024posterarXiv:2402.03908
#3372

GaussianIP: Identity-Preserving Realistic 3D Human Generation via Human-Centric Diffusion Prior

Zichen Tang, Yuan Yao, Miaomiao Cui et al.

CVPR 2025posterarXiv:2503.11143
#3373

CDI: Copyrighted Data Identification in Diffusion Models

Jan Dubiński, Antoni Kowalczuk, Franziska Boenisch et al.

CVPR 2025posterarXiv:2411.12858
#3374

MVCPS-NeuS: Multi-view Constrained Photometric Stereo for Neural Surface Reconstruction

Hiroaki Santo, Fumio Okura, Yasuyuki Matsushita

CVPR 2024poster
#3375

OHTA: One-shot Hand Avatar via Data-driven Implicit Priors

Xiaozheng Zheng, Chao Wen, Zhuo Su et al.

CVPR 2024posterarXiv:2402.18969
#3376

E-GPS: Explainable Geometry Problem Solving via Top-Down Solver and Bottom-Up Generator

Wenjun Wu, Lingling Zhang, Jun Liu et al.

CVPR 2024poster
#3377

MultiPhys: Multi-Person Physics-aware 3D Motion Estimation

Nicolás Ugrinovic, Boxiao Pan, Georgios Pavlakos et al.

CVPR 2024posterarXiv:2404.11987
#3378

LMDrive: Closed-Loop End-to-End Driving with Large Language Models

Hao Shao, Yuxuan Hu, Letian Wang et al.

CVPR 2024posterarXiv:2312.07488
#3379

ID-Blau: Image Deblurring by Implicit Diffusion-based reBLurring AUgmentation

Jia-Hao Wu, Fu-Jen Tsai, Yan-Tsung Peng et al.

CVPR 2024posterarXiv:2312.10998
#3380

Bridging Gait Recognition and Large Language Models Sequence Modeling

Shaopeng Yang, Jilong Wang, Saihui Hou et al.

CVPR 2025poster
#3381

GauHuman: Articulated Gaussian Splatting from Monocular Human Videos

Shoukang Hu, Tao Hu, Ziwei Liu

CVPR 2024posterarXiv:2312.02973
#3382

BEVNeXt: Reviving Dense BEV Frameworks for 3D Object Detection

Zhenxin Li, Shiyi Lan, Jose M. Alvarez et al.

CVPR 2024posterarXiv:2312.01696
#3383

AnySkill: Learning Open-Vocabulary Physical Skill for Interactive Agents

Jieming Cui, Tengyu Liu, Nian Liu et al.

CVPR 2024posterarXiv:2403.12835
#3384

HumanNeRF-SE: A Simple yet Effective Approach to Animate HumanNeRF with Diverse Poses

Caoyuan Ma, Yu-Lun Liu, Zhixiang Wang et al.

CVPR 2024posterarXiv:2312.02232
#3385

Towards Practical Real-Time Neural Video Compression

Zhaoyang Jia, Bin Li, Jiahao Li et al.

CVPR 2025posterarXiv:2502.20762
#3386

Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model

Feng Liu, Shiwei Zhang, Xiaofeng Wang et al.

CVPR 2025highlightarXiv:2411.19108
#3387

Cross-Rejective Open-Set SAR Image Registration

Shasha Mao, Shiming Lu, Zhaolong Du et al.

CVPR 2025poster
#3388

SurMo: Surface-based 4D Motion Modeling for Dynamic Human Rendering

Tao Hu, Fangzhou Hong, Ziwei Liu

CVPR 2024posterarXiv:2404.01225
#3389

LeftRefill: Filling Right Canvas based on Left Reference through Generalized Text-to-Image Diffusion Model

Chenjie Cao, Yunuo Cai, Qiaole Dong et al.

CVPR 2024posterarXiv:2305.11577
#3390

ManipLLM: Embodied Multimodal Large Language Model for Object-Centric Robotic Manipulation

Xiaoqi Li, Mingxu Zhang, Yiran Geng et al.

CVPR 2024posterarXiv:2312.16217
#3391

TAPT: Test-Time Adversarial Prompt Tuning for Robust Inference in Vision-Language Models

Xin Wang, Kai Chen, Jiaming Zhang et al.

CVPR 2025posterarXiv:2411.13136
#3392

Unveiling Parts Beyond Objects: Towards Finer-Granularity Referring Expression Segmentation

Wenxuan Wang, Tongtian Yue, Yisi Zhang et al.

CVPR 2024poster
#3393

PanoPose: Self-supervised Relative Pose Estimation for Panoramic Images

Diantao Tu, Hainan Cui, Xianwei Zheng et al.

CVPR 2024highlight
#3394

Mask4Align: Aligned Entity Prompting with Color Masks for Multi-Entity Localization Problems

Haoquan Zhang, Ronggang Huang, Yi Xie et al.

CVPR 2024poster
#3395

Global and Local Prompts Cooperation via Optimal Transport for Federated Learning

Hongxia Li, Wei Huang, Jingya Wang et al.

CVPR 2024posterarXiv:2403.00041
#3396

VSCode: General Visual Salient and Camouflaged Object Detection with 2D Prompt Learning

Ziyang Luo, Nian Liu, Wangbo Zhao et al.

CVPR 2024posterarXiv:2311.15011
#3397

Taxonomy-Aware Evaluation of Vision-Language Models

Vésteinn Snæbjarnarson, Kevin Du, Niklas Stoehr et al.

CVPR 2025posterarXiv:2504.05457
#3398

Dense Optical Tracking: Connecting the Dots

Guillaume Le Moing, Jean Ponce, Cordelia Schmid

CVPR 2024highlightarXiv:2312.00786
#3399

Multi-agent Collaborative Perception via Motion-aware Robust Communication Network

Shixin Hong, Yu LIU, Zhi Li et al.

CVPR 2024poster
#3400

Ungeneralizable Examples

Jingwen Ye, Xinchao Wang

CVPR 2024posterarXiv:2404.14016