Most Cited CVPR "risk allocation" Papers

#3801

CASP: Consistency-aware Audio-induced Saliency Prediction Model for Omnidirectional Video

Zhaolin Wan, Han Qin, Zhiyang Li et al.

CVPR 2025 poster
#3802

A Universal Scale-Adaptive Deformable Transformer for Image Restoration across Diverse Artifacts

Xuyi He, Yuhui Quan, Ruotao Xu et al.

CVPR 2025 poster
#3803

EarthLoc: Astronaut Photography Localization by Indexing Earth from Space

Gabriele Berton, Alex Stoken, Barbara Caputo et al.

CVPR 2024 poster arXiv:2403.06758
#3804

Neural Inverse Rendering from Propagating Light

Anagh Malik, Benjamin Attal, Andrew Xie et al.

CVPR 2025 poster arXiv:2506.05347
#3805

PairDETR: Joint Detection and Association of Human Bodies and Faces

Ammar Ali, Georgii Gaikov, Denis Rybalchenko et al.

CVPR 2024 poster
#3806

Close Imitation of Expert Retouching for Black-and-White Photography

Seunghyun Shin, Jisu Shin, Jihwan Bae et al.

CVPR 2024 poster
#3807

ViCaS: A Dataset for Combining Holistic and Pixel-level Video Understanding using Captions with Grounded Segmentation

Ali Athar, Xueqing Deng, Liang-Chieh Chen

CVPR 2025 poster arXiv:2412.09754
#3808

OmniLocalRF: Omnidirectional Local Radiance Fields from Dynamic Videos

Dongyoung Choi, Hyeonjoong Jang, Min H. Kim

CVPR 2024 poster arXiv:2404.00676
#3809

Using Human Feedback to Fine-tune Diffusion Models without Any Reward Model

Kai Yang, Jian Tao, Jiafei Lyu et al.

CVPR 2024 poster arXiv:2311.13231
#3810

A4A: Adapter for Adapter Transfer via All-for-All Mapping for Cross-Architecture Models

Keyu Tu, Mengqi Huang, Zhuowei Chen et al.

CVPR 2025 poster
#3811

Reconstructing Hands in 3D with Transformers

Georgios Pavlakos, Dandan Shan, Ilija Radosavovic et al.

CVPR 2024 poster arXiv:2312.05251
#3812

XFeat: Accelerated Features for Lightweight Image Matching

Guilherme Potje, Felipe Cadar, André Araujo et al.

CVPR 2024 poster arXiv:2404.19174
#3813

Systematic Comparison of Semi-supervised and Self-supervised Learning for Medical Image Classification

Zhe Huang, Ruijie Jiang, Shuchin Aeron et al.

CVPR 2024 poster arXiv:2307.08919
#3814

GoodSAM: Bridging Domain and Capacity Gaps via Segment Anything Model for Distortion-aware Panoramic Semantic Segmentation

Weiming Zhang, Yexin Liu, Xu Zheng et al.

CVPR 2024 poster arXiv:2403.16370
#3815

VRP-SAM: SAM with Visual Reference Prompt

Yanpeng Sun, Jiahui Chen, Shan Zhang et al.

CVPR 2024 poster arXiv:2402.17726
#3816

Towards Precise Embodied Dialogue Localization via Causality Guided Diffusion

Haoyu Wang, Le Wang, Sanping Zhou et al.

CVPR 2025 poster
#3817

Distilling Long-tailed Datasets

Zhenghao Zhao, Haoxuan Wang, Yuzhang Shang et al.

CVPR 2025 poster arXiv:2408.14506
#3818

DiffuScene: Denoising Diffusion Models for Generative Indoor Scene Synthesis

Jiapeng Tang, Yinyu Nie, Lev Markhasin et al.

CVPR 2024 poster arXiv:2303.14207
#3819

Disentangling Safe and Unsafe Image Corruptions via Anisotropy and Locality

Ramchandran Muthukumar, Ambar Pal, Jeremias Sulam et al.

CVPR 2025 poster
#3820

vid-TLDR: Training Free Token Merging for Light-weight Video Transformer

Joonmyung Choi, Sanghyeok Lee, Jaewon Chu et al.

CVPR 2024 poster arXiv:2403.13347
#3821

Looking Similar Sounding Different: Leveraging Counterfactual Cross-Modal Pairs for Audiovisual Representation Learning

Nikhil Singh, Chih-Wei Wu, Iroro Orife et al.

CVPR 2024 poster arXiv:2304.05600
#3822

Once-Tuning-Multiple-Variants: Tuning Once and Expanded as Multiple Vision-Language Model Variants

Chong Yu, Tao Chen, Zhongxue Gan

CVPR 2025 poster
#3823

SemiETS: Integrating Spatial and Content Consistencies for Semi-Supervised End-to-end Text Spotting

Dongliang Luo, Hanshen Zhu, Ziyang Zhang et al.

CVPR 2025 poster arXiv:2504.09966
#3824

YOLO-World: Real-Time Open-Vocabulary Object Detection

Tianheng Cheng, Lin Song, Yixiao Ge et al.

CVPR 2024 poster arXiv:2401.17270
#3825

Efficient Data Driven Mixture-of-Expert Extraction from Trained Networks

Uranik Berisha, Jens Mehnert, Alexandru Paul Condurache

CVPR 2025 poster arXiv:2505.15414
#3826

Doppelgängers and Adversarial Vulnerability

George Kamberov

CVPR 2025 highlight arXiv:2410.13193
#3827

Bézier Everywhere All at Once: Learning Drivable Lanes as Bézier Graphs

Hugh Blayney, Hanlin Tian, Hamish Scott et al.

CVPR 2024 poster
#3828

Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis

Bingda Tang, Sayak Paul, Boyang Zheng et al.

CVPR 2025 poster arXiv:2505.10046
#3829

Initialization Matters for Adversarial Transfer Learning

Andong Hua, Jindong Gu, Zhiyu Xue et al.

CVPR 2024 poster arXiv:2312.05716
#3830

Taming Self-Training for Open-Vocabulary Object Detection

Shiyu Zhao, Samuel Schulter, Long Zhao et al.

CVPR 2024 poster arXiv:2308.06412
#3831

MEAT: Multiview Diffusion Model for Human Generation on Megapixels with Mesh Attention

Yuhan Wang, Fangzhou Hong, Shuai Yang et al.

CVPR 2025 poster arXiv:2503.08664
#3832

Ink Dot-Oriented Differentiable Optimization for Neural Image Halftoning

Hao Jiang, Bingfeng Zhou, Yadong Mu

CVPR 2024 poster
#3833

GeoChat: Grounded Large Vision-Language Model for Remote Sensing

Kartik Kuckreja, Muhammad Sohail Danish, Muzammal Naseer et al.

CVPR 2024 poster arXiv:2311.15826
#3834

FACT: Frame-Action Cross-Attention Temporal Modeling for Efficient Action Segmentation

Zijia Lu, Ehsan Elhamifar

CVPR 2024 poster
#3835

Heterogeneous Skeleton-Based Action Representation Learning

Xiaoyan Ma, Jidong Kuang, Hongsong Wang et al.

CVPR 2025 poster arXiv:2506.03481
#3836

Matrix-Free Shared Intrinsics Bundle Adjustment

Daniel Safari

CVPR 2025 poster
#3837

GaussianAvatar: Towards Realistic Human Avatar Modeling from a Single Video via Animatable 3D Gaussians

Liangxiao Hu, Hongwen Zhang, Yuxiang Zhang et al.

CVPR 2024 poster arXiv:2312.02134
#3838

Seeing More with Less: Human-like Representations in Vision Models

Andrey Gizdov, Shimon Ullman, Daniel Harari

CVPR 2025 highlight
#3839

ShapeMatcher: Self-Supervised Joint Shape Canonicalization Segmentation Retrieval and Deformation

Yan Di, Chenyangguang Zhang, Chaowei Wang et al.

CVPR 2024 poster
#3840

Know Your Neighbors: Improving Single-View Reconstruction via Spatial Vision-Language Reasoning

Rui Li, Tobias Fischer, Mattia Segu et al.

CVPR 2024 poster arXiv:2404.03658
#3841

SVDTree: Semantic Voxel Diffusion for Single Image Tree Reconstruction

Yuan Li, Zhihao Liu, Bedrich Benes et al.

CVPR 2024 poster
#3842

SeedVR: Seeding Infinity in Diffusion Transformer Towards Generic Video Restoration

Jianyi Wang, Zhijie Lin, Meng Wei et al.

CVPR 2025 highlight arXiv:2501.01320
#3843

Chain of Semantics Programming in 3D Gaussian Splatting Representation for 3D Vision Grounding

Jiaxin Shi, Mingyue Xiang, Hao Sun et al.

CVPR 2025 poster
#3844

Hierarchical Diffusion Policy for Kinematics-Aware Multi-Task Robotic Manipulation

Xiao Ma, Sumit Patidar, Iain Haughton et al.

CVPR 2024 poster arXiv:2403.03890
#3845

Patch2Self2: Self-supervised Denoising on Coresets via Matrix Sketching

Shreyas Fadnavis, Agniva Chowdhury, Joshua Batson et al.

CVPR 2024 poster
#3846

FreeCustom: Tuning-Free Customized Image Generation for Multi-Concept Composition

Ganggui Ding, Canyu Zhao, Wen Wang et al.

CVPR 2024 poster arXiv:2405.13870
#3847

SATA: Spatial Autocorrelation Token Analysis for Enhancing the Robustness of Vision Transformers

Nikaan Nikzad, Yi Liao, Yongsheng Gao et al.

CVPR 2025 poster arXiv:2409.19850
#3848

AniGS: Animatable Gaussian Avatar from a Single Image with Inconsistent Gaussian Reconstruction

Lingteng Qiu, Shenhao Zhu, Qi Zuo et al.

CVPR 2025 poster arXiv:2412.02684
#3849

Generative Unlearning for Any Identity

Juwon Seo, Sung-Hoon Lee, Tae-Young Lee et al.

CVPR 2024 poster arXiv:2405.09879
#3850

Fuzzy Multimodal Learning for Trusted Cross-modal Retrieval

Siyuan Duan, Yuan Sun, Dezhong Peng et al.

CVPR 2025 poster
#3851

Traceable Federated Continual Learning

Qiang Wang, Bingyan Liu, Yawen Li

CVPR 2024 poster
#3852

Mitigating Object Dependencies: Improving Point Cloud Self-Supervised Learning through Object Exchange

Yanhao Wu, Tong Zhang, Wei Ke et al.

CVPR 2024 poster arXiv:2404.07504
#3853

NVILA: Efficient Frontier Visual Language Models

Zhijian Liu, Ligeng Zhu, Baifeng Shi et al.

CVPR 2025 poster arXiv:2412.04468
#3854

Multi-scale Dynamic and Hierarchical Relationship Modeling for Facial Action Units Recognition

Zihan Wang, Siyang Song, Cheng Luo et al.

CVPR 2024 poster arXiv:2404.06443
#3855

Emphasizing Discriminative Features for Dataset Distillation in Complex Scenarios

Kai Wang, Zekai Li, Zhi-Qi Cheng et al.

CVPR 2025 poster arXiv:2410.17193
#3856

Imagine Before Go: Self-Supervised Generative Map for Object Goal Navigation

Sixian Zhang, Xinyao Yu, Xinhang Song et al.

CVPR 2024 poster
#3857

FloVD: Optical Flow Meets Video Diffusion Model for Enhanced Camera-Controlled Video Synthesis

Wonjoon Jin, Qi Dai, Chong Luo et al.

CVPR 2025 poster arXiv:2502.08244
#3858

SVDinsTN: A Tensor Network Paradigm for Efficient Structure Search from Regularized Modeling Perspective

Yu-Bang Zheng, Xile Zhao, Junhua Zeng et al.

CVPR 2024 highlight arXiv:2305.14912
#3859

Exploring the Potential of Large Foundation Models for Open-Vocabulary HOI Detection

Ting Lei, Shaofeng Yin, Yang Liu

CVPR 2024 poster arXiv:2404.06194
#3860

Modality-Agnostic Structural Image Representation Learning for Deformable Multi-Modality Medical Image Registration

Tony C. W. Mok, Zi Li, Yunhao Bai et al.

CVPR 2024 highlight arXiv:2402.18933
#3861

PoseIRM: Enhance 3D Human Pose Estimation on Unseen Camera Settings via Invariant Risk Minimization

Yanlu Cai, Weizhong Zhang, Yuan Wu et al.

CVPR 2024 poster
#3862

On the Estimation of Image-matching Uncertainty in Visual Place Recognition

Mubariz Zaffar, Liangliang Nan, Julian F. P. Kooij

CVPR 2024 highlight arXiv:2404.00546
#3863

Subspace Constraint and Contribution Estimation for Heterogeneous Federated Learning

Xiangtao Zhang, Sheng Li, Ao Li et al.

CVPR 2025 poster
#3864

Pink: Unveiling the Power of Referential Comprehension for Multi-modal LLMs

Shiyu Xuan, Qingpei Guo, Ming Yang et al.

CVPR 2024 poster arXiv:2310.00582
#3865

LoS: Local Structure-Guided Stereo Matching

Kunhong Li, Longguang Wang, Ye Zhang et al.

CVPR 2024 poster
#3866

ACAttack: Adaptive Cross Attacking RGB-T Tracker via Multi-Modal Response Decoupling

Xinyu Xiang, Qinglong Yan, Hao Zhang et al.

CVPR 2025 poster
#3867

RadSimReal: Bridging the Gap Between Synthetic and Real Data in Radar Object Detection With Simulation

Oded Bialer, Yuval Haitman

CVPR 2024 poster arXiv:2404.18150
#3868

OCAI: Improving Optical Flow Estimation by Occlusion and Consistency Aware Interpolation

Jisoo Jeong, Hong Cai, Risheek Garrepalli et al.

CVPR 2024 poster arXiv:2403.18092
#3869

High-fidelity 3D Object Generation from Single Image with RGBN-Volume Gaussian Reconstruction Model

Yiyang Shen, Kun Zhou, He Wang et al.

CVPR 2025 highlight arXiv:2504.01512
#3870

FlowerFormer: Empowering Neural Architecture Encoding using a Flow-aware Graph Transformer

Dongyeong Hwang, Hyunju Kim, Sunwoo Kim et al.

CVPR 2024 poster arXiv:2403.12821
#3871

Mip-Splatting: Alias-free 3D Gaussian Splatting

Zehao Yu, Anpei Chen, Binbin Huang et al.

CVPR 2024 poster arXiv:2311.16493
#3872

Text2QR: Harmonizing Aesthetic Customization and Scanning Robustness for Text-Guided QR Code Generation

Guangyang Wu, Xiaohong Liu, Jun Jia et al.

CVPR 2024 poster arXiv:2403.06452
#3873

UniHOPE: A Unified Approach for Hand-Only and Hand-Object Pose Estimation

Yinqiao Wang, Hao Xu, Pheng-Ann Heng et al.

CVPR 2025 poster arXiv:2503.13303
#3874

ProMark: Proactive Diffusion Watermarking for Causal Attribution

Vishal Asnani, John Collomosse, Tu Bui et al.

CVPR 2024 poster arXiv:2403.09914
#3875

MMM: Generative Masked Motion Model

Ekkasit Pinyoanuntapong, Pu Wang, Minwoo Lee et al.

CVPR 2024 highlight arXiv:2312.03596
#3876

Toward Generalist Anomaly Detection via In-context Residual Learning with Few-shot Sample Prompts

Jiawen Zhu, Guansong Pang

CVPR 2024 poster arXiv:2403.06495
#3877

DiffForensics: Leveraging Diffusion Prior to Image Forgery Detection and Localization

Zeqin Yu, Jiangqun Ni, Yuzhen Lin et al.

CVPR 2024 poster
#3878

VideoGrounding-DINO: Towards Open-Vocabulary Spatio-Temporal Video Grounding

Syed Talal Wasim, Muzammal Naseer, Salman Khan et al.

CVPR 2024 poster
#3879

Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images

Jie Mei, Chenyu Lin, Yu Qiu et al.

CVPR 2025 poster arXiv:2503.17261
#3880

Neural Directional Encoding for Efficient and Accurate View-Dependent Appearance Modeling

Liwen Wu, Sai Bi, Zexiang Xu et al.

CVPR 2024 highlight arXiv:2405.14847
#3881

No Thing, Nothing: Highlighting Safety-Critical Classes for Robust LiDAR Semantic Segmentation in Adverse Weather

Junsung Park, HwiJeong Lee, Inha Kang et al.

CVPR 2025 poster arXiv:2503.15910
#3882

Learning Partonomic 3D Reconstruction from Image Collections

Xiaoqian Ruan, Pei Yu, Dian Jia et al.

CVPR 2025 poster
#3883

LOGICZSL: Exploring Logic-induced Representation for Compositional Zero-shot Learning

Peng Wu, Xiankai Lu, Hao Hu et al.

CVPR 2025 poster
#3884

SAFDNet: A Simple and Effective Network for Fully Sparse 3D Object Detection

Gang Zhang, Chen Junnan, Guohuan Gao et al.

CVPR 2024 poster arXiv:2403.05817
#3885

3D Student Splatting and Scooping

Jialin Zhu, Jiangbei Yue, Feixiang He et al.

CVPR 2025 poster arXiv:2503.10148
#3886

Sheared Backpropagation for Fine-tuning Foundation Models

Zhiyuan Yu, Li Shen, Liang Ding et al.

CVPR 2024 poster
#3887

On the Content Bias in Fréchet Video Distance

Songwei Ge, Aniruddha Mahapatra, Gaurav Parmar et al.

CVPR 2024 poster arXiv:2404.12391
#3888

RNb-NeuS: Reflectance and Normal-based Multi-View 3D Reconstruction

Baptiste Brument, Robin Bruneau, Yvain Queau et al.

CVPR 2024 poster arXiv:2312.01215
#3889

LEDiff: Latent Exposure Diffusion for HDR Generation

Chao Wang, Zhihao Xia, Thomas Leimkuehler et al.

CVPR 2025 poster arXiv:2412.14456
#3890

AIM-Fair: Advancing Algorithmic Fairness via Selectively Fine-Tuning Biased Models with Contextual Synthetic Data

Zengqun Zhao, Ziquan Liu, Yu Cao et al.

CVPR 2025 poster arXiv:2503.05665
#3891

Multiview Aerial Visual RECognition (MAVREC): Can Multi-view Improve Aerial Visual Perception?

Aritra Dutta, Srijan Das, Jacob Nielsen et al.

CVPR 2024 poster arXiv:2312.04548
#3892

VINECS: Video-based Neural Character Skinning

Zhouyingcheng Liao, Vladislav Golyanik, Marc Habermann et al.

CVPR 2024 poster arXiv:2307.00842
#3893

PanSplat: 4K Panorama Synthesis with Feed-Forward Gaussian Splatting

Cheng Zhang, Haofei Xu, Qianyi Wu et al.

CVPR 2025 poster arXiv:2412.12096
#3894

Plug and Play Active Learning for Object Detection

Chenhongyi Yang, Lichao Huang, Elliot Crowley

CVPR 2024 poster arXiv:2211.11612
#3895

Plug-and-Play Diffusion Distillation

Yi-Ting Hsiao, Siavash Khodadadeh, Kevin Duarte et al.

CVPR 2024 poster arXiv:2406.01954
#3896

CLIB-FIQA: Face Image Quality Assessment with Confidence Calibration

Fu-Zhao Ou, Chongyi Li, Shiqi Wang et al.

CVPR 2024 poster
#3897

GS-2DGS: Geometrically Supervised 2DGS for Reflective Object Reconstruction

Jinguang Tong, Xuesong Li, Fahira Afzal Maken et al.

CVPR 2025 poster arXiv:2506.13110
#3898

Polos: Multimodal Metric Learning from Human Feedback for Image Captioning

Yuiga Wada, Kanta Kaneda, Daichi Saito et al.

CVPR 2024 highlight arXiv:2402.18091
#3899

XScale-NVS: Cross-Scale Novel View Synthesis with Hash Featurized Manifold

Guangyu Wang, Jinzhi Zhang, Fan Wang et al.

CVPR 2024 poster arXiv:2403.19517
#3900

Differentiable Micro-Mesh Construction

Yishun Dou, Zhong Zheng, Qiaoqiao Jin et al.

CVPR 2024 poster
#3901

HOIDiffusion: Generating Realistic 3D Hand-Object Interaction Data

Mengqi Zhang, Yang Fu, Zheng Ding et al.

CVPR 2024 poster arXiv:2403.12011
#3902

CPGA: Coding Priors-Guided Aggregation Network for Compressed Video Quality Enhancement

Qiang Zhu, Jinhua Hao, Yukang Ding et al.

CVPR 2024 poster arXiv:2403.10362
#3903

DiSciPLE: Learning Interpretable Programs for Scientific Visual Discovery

Utkarsh Mall, Cheng Perng Phoo, Mia Chiquier et al.

CVPR 2025 poster arXiv:2502.10060
#3904

ProxyCap: Real-time Monocular Full-body Capture in World Space via Human-Centric Proxy-to-Motion Learning

Yuxiang Zhang, Hongwen Zhang, Liangxiao Hu et al.

CVPR 2024 poster arXiv:2307.01200
#3905

Learning from Synthetic Human Group Activities

Che-Jui Chang, Danrui Li, Deep Patel et al.

CVPR 2024 poster arXiv:2306.16772
#3906

Can’t Make an Omelette Without Breaking Some Eggs: Plausible Action Anticipation Using Large Video-Language Models

Himangi Mittal, Nakul Agarwal, Shao-Yuan Lo et al.

CVPR 2024 poster
#3907

Unsupervised 3D Structure Inference from Category-Specific Image Collections

Weikang Wang, Dongliang Cao, Florian Bernard

CVPR 2024 poster
#3908

Video2Game: Real-time Interactive Realistic and Browser-Compatible Environment from a Single Video

Hongchi Xia, Chih-Hao Lin, Wei-Chiu Ma et al.

CVPR 2024 poster
#3909

Identifying Important Group of Pixels using Interactions

Kosuke Sumiyasu, Kazuhiko Kawamoto, Hiroshi Kera

CVPR 2024 poster arXiv:2401.03785
#3910

Continuous Adverse Weather Removal via Degradation-Aware Distillation

Xin Lu, Jie Xiao, Yurui Zhu et al.

CVPR 2025 poster
#3911

Multi-Modal Proxy Learning Towards Personalized Visual Multiple Clustering

Jiawei Yao, Qi Qian, Juhua Hu

CVPR 2024 poster arXiv:2404.15655
#3912

Adaptive Bidirectional Displacement for Semi-Supervised Medical Image Segmentation

Hanyang Chi, Jian Pang, Bingfeng Zhang et al.

CVPR 2024 poster arXiv:2405.00378
#3913

HotSpot: Signed Distance Function Optimization with an Asymptotically Sufficient Condition

Zimo Wang, Cheng Wang, Taiki Yoshino et al.

CVPR 2025 highlight arXiv:2411.14628
#3914

DreamAvatar: Text-and-Shape Guided 3D Human Avatar Generation via Diffusion Models

Yukang Cao, Yan-Pei Cao, Kai Han et al.

CVPR 2024 poster arXiv:2304.00916
#3915

Genuine Knowledge from Practice: Diffusion Test-Time Adaptation for Video Adverse Weather Removal

Yijun Yang, Hongtao Wu, Angelica I. Aviles-Rivero et al.

CVPR 2024 poster arXiv:2403.07684
#3916

Are Conventional SNNs Really Efficient? A Perspective from Network Quantization

Guobin Shen, Dongcheng Zhao, Tenglong Li et al.

CVPR 2024 highlight
#3917

RILA: Reflective and Imaginative Language Agent for Zero-Shot Semantic Audio-Visual Navigation

Zeyuan Yang, Liu Jiageng, Peihao Chen et al.

CVPR 2024 poster
#3918

Sharingan: A Transformer Architecture for Multi-Person Gaze Following

Samy Tafasca, Anshul Gupta, Jean-Marc Odobez

CVPR 2024 poster
#3919

OA-CNNs: Omni-Adaptive Sparse CNNs for 3D Semantic Segmentation

Bohao Peng, Xiaoyang Wu, Li Jiang et al.

CVPR 2024 poster arXiv:2403.14418
#3920

SeriesBench: A Benchmark for Narrative-Driven Drama Series Understanding

Chenkai Zhang, Yiming Lei, Zeming Liu et al.

CVPR 2025 poster arXiv:2504.21435
#3921

Dynamic Support Information Mining for Category-Agnostic Pose Estimation

Pengfei Ren, Yuanyuan Gao, Haifeng Sun et al.

CVPR 2024 poster
#3922

Pre-trained Model Guided Fine-Tuning for Zero-Shot Adversarial Robustness

Sibo Wang, Jie Zhang, Zheng Yuan et al.

CVPR 2024 poster arXiv:2401.04350
#3923

Recurrent Feature Mining and Keypoint Mixup Padding for Category-Agnostic Pose Estimation

Junjie Chen, Weilong Chen, Yifan Zuo et al.

CVPR 2025 poster arXiv:2503.21140
#3924

MART: Masked Affective RepresenTation Learning via Masked Temporal Distribution Distillation

Zhicheng Zhang, Pancheng Zhao, Eunil Park et al.

CVPR 2024 poster
#3925

Uncertainty Meets Diversity: A Comprehensive Active Learning Framework for Indoor 3D Object Detection

Jiangyi Wang, Na Zhao

CVPR 2025 poster arXiv:2503.16125
#3926

MindBridge: A Cross-Subject Brain Decoding Framework

Shizun Wang, Songhua Liu, Zhenxiong Tan et al.

CVPR 2024 highlight arXiv:2404.07850
#3927

Loopy-SLAM: Dense Neural SLAM with Loop Closures

Lorenzo Liso, Erik Sandström, Vladimir Yugay et al.

CVPR 2024 poster arXiv:2402.09944
#3928

CosalPure: Learning Concept from Group Images for Robust Co-Saliency Detection

Jiayi Zhu, Qing Guo, Felix Juefei Xu et al.

CVPR 2024 poster arXiv:2403.18554
#3929

Neural Clustering based Visual Representation Learning

Guikun Chen, Xia Li, Yi Yang et al.

CVPR 2024 poster arXiv:2403.17409
#3930

ViVid-1-to-3: Novel View Synthesis with Video Diffusion Models

Jeong-gi Kwak, Erqun Dong, Yuhe Jin et al.

CVPR 2024 highlight arXiv:2312.01305
#3931

Text-Driven Fashion Image Editing with Compositional Concept Learning and Counterfactual Abduction

Shanshan Huang, Haoxuan Li, Chunyuan Zheng et al.

CVPR 2025 poster
#3932

CrossMAE: Cross-Modality Masked Autoencoders for Region-Aware Audio-Visual Pre-Training

Yuxin Guo, Siyang Sun, Shuailei Ma et al.

CVPR 2024 poster
#3933

Weakly-Supervised Audio-Visual Video Parsing with Prototype-based Pseudo-Labeling

Kranthi Kumar Rachavarapu, Kalyan Ramakrishnan, A. N. Rajagopalan

CVPR 2024 poster
#3934

CapHuman: Capture Your Moments in Parallel Universes

Chao Liang, Fan Ma, Linchao Zhu et al.

CVPR 2024 poster arXiv:2402.00627
#3935

Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection

Yicheng Xiao, Zhuoyan Luo, Yong Liu et al.

CVPR 2024 poster arXiv:2311.16464
#3936

ChAda-ViT: Channel Adaptive Attention for Joint Representation Learning of Heterogeneous Microscopy Images

Nicolas Bourriez, Ihab Bendidi, Cohen Ethan et al.

CVPR 2024 poster arXiv:2311.15264
#3937

Self-Discovering Interpretable Diffusion Latent Directions for Responsible Text-to-Image Generation

Hang Li, Chengzhi Shen, Philip H.S. Torr et al.

CVPR 2024 poster arXiv:2311.17216
#3938

Number it: Temporal Grounding Videos like Flipping Manga

Yongliang Wu, Xinting Hu, Yuyang Sun et al.

CVPR 2025 poster arXiv:2411.10332
#3939

VS: Reconstructing Clothed 3D Human from Single Image via Vertex Shift

Leyuan Liu, Yuhan Li, Yunqi Gao et al.

CVPR 2024 poster
#3940

Autoregressive Sequential Pretraining for Visual Tracking

Shiyi Liang, Yifan Bai, Yihong Gong et al.

CVPR 2025 poster
#3941

A Selective Re-learning Mechanism for Hyperspectral Fusion Imaging

Yuanye Liu, Jinyang Liu, Renwei Dian et al.

CVPR 2025 poster
#3942

CADRef: Robust Out-of-Distribution Detection via Class-Aware Decoupled Relative Feature Leveraging

Zhiwei Ling, Yachen Chang, Hailiang Zhao et al.

CVPR 2025 poster arXiv:2503.00325
#3943

Towards Automatic Power Battery Detection: New Challenge Benchmark Dataset and Baseline

Xiaoqi Zhao, Youwei Pang, Zhenyu Chen et al.

CVPR 2024 poster arXiv:2312.02528
#3944

Point Transformer V3: Simpler Faster Stronger

Xiaoyang Wu, Li Jiang, Peng-Shuai Wang et al.

CVPR 2024 poster
#3945

Improving Distant 3D Object Detection Using 2D Box Supervision

Zetong Yang, Zhiding Yu, Christopher Choy et al.

CVPR 2024 poster arXiv:2403.09230
#3946

Infrared Small Target Detection with Scale and Location Sensitivity

Qiankun Liu, Rui Liu, Bolun Zheng et al.

CVPR 2024 poster arXiv:2403.19366
#3947

The Art of Deception: Color Visual Illusions and Diffusion Models

Alexandra Gomez-Villa, Kai Wang, C. Alejandro Parraga et al.

CVPR 2025 poster arXiv:2412.10122
#3948

Wonder3D: Single Image to 3D using Cross-Domain Diffusion

Xiaoxiao Long, Yuan-Chen Guo, Cheng Lin et al.

CVPR 2024 highlight arXiv:2310.15008
#3949

Honeybee: Locality-enhanced Projector for Multimodal LLM

Junbum Cha, Woo-Young Kang, Jonghwan Mun et al.

CVPR 2024 highlight arXiv:2312.06742
#3950

Imputation-free and Alignment-free: Incomplete Multi-view Clustering Driven by Consensus Semantic Learning

Yuzhuo Dai, Jiaqi Jin, Zhibin Dong et al.

CVPR 2025 poster arXiv:2505.11182
#3951

Seeing Speech and Sound: Distinguishing and Locating Audio Sources in Visual Scenes

Hyeonggon Ryu, Seongyu Kim, Joon Chung et al.

CVPR 2025 poster
#3952

Minority-Focused Text-to-Image Generation via Prompt Optimization

Soobin Um, Jong Chul Ye

CVPR 2025 poster arXiv:2410.07838
#3953

Mining Supervision for Dynamic Regions in Self-Supervised Monocular Depth Estimation

Hoang Chuong Nguyen, Tianyu Wang, Jose M. Alvarez et al.

CVPR 2024 poster arXiv:2404.14908
#3954

pFedMxF: Personalized Federated Class-Incremental Learning with Mixture of Frequency Aggregation

Yifei Zhang, Hao Zhu, Alysa Ziying Tan et al.

CVPR 2025 poster
#3955

SleepVST: Sleep Staging from Near-Infrared Video Signals using Pre-Trained Transformers

Jonathan F. Carter, Joao Jorge, Oliver Gibson et al.

CVPR 2024 highlight arXiv:2404.03831
#3956

Enhancing 3D Fidelity of Text-to-3D using Cross-View Correspondences

Seungwook Kim, Kejie Li, Xueqing Deng et al.

CVPR 2024 poster arXiv:2404.10603
#3957

Learning to Rematch Mismatched Pairs for Robust Cross-Modal Retrieval

Haochen Han, Qinghua Zheng, Guang Dai et al.

CVPR 2024 poster arXiv:2403.05105
#3958

OmniMMI: A Comprehensive Multi-modal Interaction Benchmark in Streaming Video Contexts

Yuxuan Wang, Yueqian Wang, Bo Chen et al.

CVPR 2025 poster arXiv:2503.22952
#3959

Mamba-Reg: Vision Mamba Also Needs Registers

Feng Wang, Jiahao Wang, Sucheng Ren et al.

CVPR 2025 poster
#3960

EVS-assisted Joint Deblurring Rolling-Shutter Correction and Video Frame Interpolation through Sensor Inverse Modeling

Rui Jiang, Fangwen Tu, Yixuan Long et al.

CVPR 2024 poster
#3961

Lifting the Veil on Visual Information Flow in MLLMs: Unlocking Pathways to Faster Inference

Hao Yin, Guangzong Si, Zilei Wang

CVPR 2025 poster arXiv:2503.13108
#3962

Open-World Semantic Segmentation Including Class Similarity

Matteo Sodano, Federico Magistri, Lucas Nunes et al.

CVPR 2024 poster arXiv:2403.07532
#3963

MeLFusion: Synthesizing Music from Image and Language Cues using Diffusion Models

Sanjoy Chowdhury, Sayan Nag, Joseph K J et al.

CVPR 2024 highlight arXiv:2406.04673
#3964

Empowering Resampling Operation for Ultra-High-Definition Image Enhancement with Model-Aware Guidance

Yu, Jie Huang, Li et al.

CVPR 2024 poster
#3965

READ: Retrieval-Enhanced Asymmetric Diffusion for Motion Planning

Takeru Oba, Matthew Walter, Norimichi Ukita

CVPR 2024 poster
#3966

From Pixels to Graphs: Open-Vocabulary Scene Graph Generation with Vision-Language Models

Rongjie Li, Songyang Zhang, Dahua Lin et al.

CVPR 2024 poster arXiv:2404.00906
#3967

Not Just Text: Uncovering Vision Modality Typographic Threats in Image Generation Models

Hao Cheng, Erjia Xiao, Jiayan Yang et al.

CVPR 2025 poster arXiv:2412.05538
#3968

MeshPose: Unifying DensePose and 3D Body Mesh Reconstruction

Eric-Tuan Le, Antonios Kakolyris, Petros Koutras et al.

CVPR 2024 poster
#3969

Distribution Prototype Diffusion Learning for Open-set Supervised Anomaly Detection

Fuyun Wang, Tong Zhang, Yuanzhi Wang et al.

CVPR 2025 poster arXiv:2502.20981
#3970

Bayesian Differentiable Physics for Cloth Digitalization

Deshan Gong, Ningtao Mao, He Wang

CVPR 2024 poster arXiv:2402.17664
#3971

MemSAM: Taming Segment Anything Model for Echocardiography Video Segmentation

Xiaolong Deng, Huisi Wu, Runhao Zeng et al.

CVPR 2024 poster
#3972

SCEdit: Efficient and Controllable Image Diffusion Generation via Skip Connection Editing

Zeyinzi Jiang, Chaojie Mao, Yulin Pan et al.

CVPR 2024 highlight arXiv:2312.11392
#3973

InstaGen: Enhancing Object Detection by Training on Synthetic Dataset

Chengjian Feng, Yujie Zhong, Zequn Jie et al.

CVPR 2024 poster arXiv:2402.05937
#3974

OAKINK2: A Dataset of Bimanual Hands-Object Manipulation in Complex Task Completion

Xinyu Zhan, Lixin Yang, Yifei Zhao et al.

CVPR 2024 poster arXiv:2403.19417
#3975

Unified-IO 2: Scaling Autoregressive Multimodal Models with Vision Language Audio and Action

Jiasen Lu, Christopher Clark, Sangho Lee et al.

CVPR 2024 highlight
#3976

PTQ4SAM: Post-Training Quantization for Segment Anything

Chengtao Lv, Hong Chen, Jinyang Guo et al.

CVPR 2024 poster arXiv:2405.03144
#3977

Towards Fine-Grained Interpretability: Counterfactual Explanations for Misclassification with Saliency Partition

Zhang Lintong, Kang Yin, Seong-Whan Lee

CVPR 2025 poster arXiv:2511.07974
#3978

DiffLO: Semantic-Aware LiDAR Odometry with Diffusion-Based Refinement

Huang Yongshu, Chen Liu, Minghang Zhu et al.

CVPR 2025 poster
#3979

Narrative Action Evaluation with Prompt-Guided Multimodal Interaction

Shiyi Zhang, Sule Bai, Guangyi Chen et al.

CVPR 2024 poster arXiv:2404.14471
#3980

RAEncoder: A Label-Free Reversible Adversarial Examples Encoder for Dataset Intellectual Property Protection

Fan Xing, Zhuo Tian, Xuefeng Fan et al.

CVPR 2025 poster
#3981

Training-free Neural Architecture Search through Variance of Knowledge of Deep Network Weights

Ondrej Tybl, Lukas Neumann

CVPR 2025 poster
#3982

Training Diffusion Models Towards Diverse Image Generation with Reinforcement Learning

Zichen Miao, Jiang Wang, Ze Wang et al.

CVPR 2024 poster
#3983

SKE-Layout: Spatial Knowledge Enhanced Layout Generation with LLMs

Junsheng Wang, Nieqing Cao, Yan Ding et al.

CVPR 2025 poster
#3984

Visual Point Cloud Forecasting enables Scalable Autonomous Driving

Zetong Yang, Li Chen, Yanan Sun et al.

CVPR 2024 highlight arXiv:2312.17655
#3985

Light the Night: A Multi-Condition Diffusion Framework for Unpaired Low-Light Enhancement in Autonomous Driving

Jinlong Li, Baolu Li, Zhengzhong Tu et al.

CVPR 2024 poster arXiv:2404.04804
#3986

Shift the Lens: Environment-Aware Unsupervised Camouflaged Object Detection

Ji Du, Fangwei Hao, Mingyang Yu et al.

CVPR 2025 poster
#3987

MegaSaM: Accurate, Fast and Robust Structure and Motion from Casual Dynamic Videos

Zhengqi Li, Richard Tucker, Forrester Cole et al.

CVPR 2025 poster arXiv:2412.04463
#3988

MAD: Memory-Augmented Detection of 3D Objects

Ben Agro, Sergio Casas, Patrick Wang et al.

CVPR 2025 poster
#3989

Elite360D: Towards Efficient 360 Depth Estimation via Semantic- and Distance-Aware Bi-Projection Fusion

Hao Ai, Addison, Lin Wang

CVPR 2024 poster arXiv:2403.16376
#3990

Learning Triangular Distribution in Visual World

Ping Chen, Xingpeng Zhang, Chengtao Zhou et al.

CVPR 2024 poster arXiv:2311.18605
#3991

Your Student is Better Than Expected: Adaptive Teacher-Student Collaboration for Text-Conditional Diffusion Models

Nikita Starodubcev, Dmitry Baranchuk, Artem Fedorov et al.

CVPR 2024 poster arXiv:2312.10835
#3992

GLiDR: Topologically Regularized Graph Generative Network for Sparse LiDAR Point Clouds

Prashant Kumar, Kshitij Madhav Bhat, Vedang Bhupesh Shenvi Nadkarni et al.

CVPR 2024 poster arXiv:2312.00068
#3993

Floating No More: Object-Ground Reconstruction from a Single Image

Yunze Man, Yichen Sheng, Jianming Zhang et al.

CVPR 2025 poster arXiv:2407.18914
#3994

HumanNorm: Learning Normal Diffusion Model for High-quality and Realistic 3D Human Generation

Xin Huang, Ruizhi Shao, Qi Zhang et al.

CVPR 2024 poster arXiv:2310.01406
#3995

FIFA: Fine-grained Inter-frame Attention for Driver's Video Gaze Estimation

Daosong Hu, Mingyue Cui, Kai Huang

CVPR 2025 poster
#3996

Unbiased Estimator for Distorted Conics in Camera Calibration

Chaehyeon Song, Jaeho Shin, Myung-Hwan Jeon et al.

CVPR 2024 highlight arXiv:2403.04583
#3997

Siamese Learning with Joint Alignment and Regression for Weakly-Supervised Video Paragraph Grounding

Chaolei Tan, Jianhuang Lai, Wei-Shi Zheng et al.

CVPR 2024 poster arXiv:2403.11463
#3998

Enhancing Quality of Compressed Images by Mitigating Enhancement Bias Towards Compression Domain

Qunliang Xing, Mai Xu, Shengxi Li et al.

CVPR 2024 poster arXiv:2402.17200
#3999

Dynamic Pseudo Labeling via Gradient Cutting for High-Low Entropy Exploration

Jae Hyeon Park, Joo Hyeon Jeon, Jae Yun Lee et al.

CVPR 2025 poster
#4000

FASTer: Focal token Acquiring-and-Scaling Transformer for Long-term 3D Object Detection

Chenxu Dang, Pei An, Xinmin Zhang et al.

CVPR 2025 poster arXiv:2503.01899