Most Cited 2025 "numerical reconstruction" Papers

22,274 papers found • Page 98 of 112

#19401

GRAE-3DMOT: Geometry Relation-Aware Encoder for Online 3D Multi-Object Tracking

Hyunseop Kim, Hyo-Jun Lee, Yonguk Lee et al.

CVPR 2025poster
#19402

Generative Modeling of Class Probability for Multi-Modal Representation Learning

JungKyoo Shin, Bumsoo Kim, Eunwoo Kim

CVPR 2025highlightarXiv:2503.17417
#19403

Unified Medical Lesion Segmentation via Self-referring Indicator

Shijie Chang, Xiaoqi Zhao, Lihe Zhang et al.

CVPR 2025poster
#19404

SGSST: Scaling Gaussian Splatting Style Transfer

Bruno Galerne, Jianling WANG, Lara Raad et al.

CVPR 2025poster
#19405

DIO: Decomposable Implicit 4D Occupancy-Flow World Model

Christopher Diehl, Quinlan Sykora, Ben Agro et al.

CVPR 2025poster
#19406

HERA: Hybrid Explicit Representation for Ultra-Realistic Head Avatars

Hongrui Cai, Yuting Xiao, Xuan Wang et al.

CVPR 2025poster
#19407

Hierarchical Adaptive Filtering Network for Text Image Specular Highlight Removal

Zhi Jiang, Jingbo Hu, Ling Zhang et al.

CVPR 2025poster
#19408

Conformal Prediction and MLLM aided Uncertainty Quantification in Scene Graph Generation

Sayak Nag, Udita Ghosh, Calvin-Khang Ta et al.

CVPR 2025posterarXiv:2503.13947
#19409

Move-in-2D: 2D-Conditioned Human Motion Generation

Hsin-Ping Huang, Yang Zhou, Jui-Hsien Wang et al.

CVPR 2025posterarXiv:2412.13185
#19410

Boosting Domain Incremental Learning: Selecting the Optimal Parameters is All You Need

Qiang Wang, Xiang Song, Yuhang He et al.

CVPR 2025posterarXiv:2505.23744
#19411

Phoenix: A Motion-based Self-Reflection Framework for Fine-grained Robotic Action Correction

Wenke Xia, Ruoxuan Feng, Dong Wang et al.

CVPR 2025posterarXiv:2504.14588
#19412

4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models

Wanhua Li, Renping Zhou, Jiawei Zhou et al.

CVPR 2025posterarXiv:2503.10437
#19413

Quaffure: Real-Time Quasi-Static Neural Hair Simulation

Tuur Stuyck, Gene Wei-Chin Lin, Egor Larionov et al.

CVPR 2025posterarXiv:2412.10061
#19414

Implicit Bias Injection Attacks against Text-to-Image Diffusion Models

Huayang Huang, Xiangye Jin, Jiaxu Miao et al.

CVPR 2025posterarXiv:2504.01819
#19415

Reversing Flow for Image Restoration

Haina Qin, Wenyang Luo, Bing Li et al.

CVPR 2025posterarXiv:2506.16961
#19416

MTADiffusion: Mask Text Alignment Diffusion Model for Object Inpainting

jun huang, Ting Liu, Yihang Wu et al.

CVPR 2025posterarXiv:2506.23482
#19417

DAGSM: Disentangled Avatar Generation with GS-enhanced Mesh

Jingyu Zhuang, Di Kang, Linchao Bao et al.

CVPR 2025posterarXiv:2411.15205
#19418

Open-Canopy: Towards Very High Resolution Forest Monitoring

Fajwel Fogel, Yohann PERRON, Nikola Besic et al.

CVPR 2025highlightarXiv:2407.09392
#19419

S2D-LFE: Sparse-to-Dense Light Field Event Generation

Yutong Liu, Wenming Weng, Yueyi Zhang et al.

CVPR 2025poster
#19420

Beyond Generation: A Diffusion-based Low-level Feature Extractor for Detecting AI-generated Images

Nan Zhong, Haoyu Chen, Yiran Xu et al.

CVPR 2025poster
#19421

Image is All You Need to Empower Large-scale Diffusion Models for In-Domain Generation

Pu Cao, Feng Zhou, Lu Yang et al.

CVPR 2025posterarXiv:2312.08195
#19422

MedUnifier: Unifying Vision-and-Language Pre-training on Medical Data with Vision Generation Task using Discrete Visual Representations

Ziyang Zhang, Yang Yu, Yucheng Chen et al.

CVPR 2025posterarXiv:2503.01019
#19423

Inference-Scale Complexity in ANN-SNN Conversion for High-Performance and Low-Power Applications

Tong Bu, Maohua Li, Zhaofei Yu

CVPR 2025posterarXiv:2409.03368
#19424

GazeGene: Large-scale Synthetic Gaze Dataset with 3D Eyeball Annotations

Yiwei Bao, Zhiming Wang, Feng Lu

CVPR 2025poster
#19425

Multirate Neural Image Compression with Adaptive Lattice Vector Quantization

Hao Xu, Xiaolin Wu, Xi Zhang

CVPR 2025highlight
#19426

VideoGEM: Training-free Action Grounding in Videos

Felix Vogel, Walid Bousselham, Anna Kukleva et al.

CVPR 2025posterarXiv:2503.20348
#19427

SceneFactor: Factored Latent 3D Diffusion for Controllable 3D Scene Generation

Aleksei Bokhovkin, Quan Meng, Shubham Tulsiani et al.

CVPR 2025posterarXiv:2412.01801
#19428

DefMamba: Deformable Visual State Space Model

Leiye Liu, Miao Zhang, Jihao Yin et al.

CVPR 2025posterarXiv:2504.05794
#19429

PEER Pressure: Model-to-Model Regularization for Single Source Domain Generalization

Dongkyu Cho, Inwoo Hwang, Sanghack Lee

CVPR 2025posterarXiv:2505.12745
#19430

Steering Away from Harm: An Adaptive Approach to Defending Vision Language Model Against Jailbreaks

Han Wang, Gang Wang, Huan Zhang

CVPR 2025posterarXiv:2411.16721
#19431

Less is More: Efficient Image Vectorization with Adaptive Parameterization

Kaibo Zhao, Liang Bao, Yufei Li et al.

CVPR 2025poster
#19432

Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention

Wenbin An, Feng Tian, Sicong Leng et al.

CVPR 2025posterarXiv:2406.12718
#19433

PERSE: Personalized 3D Generative Avatars from A Single Portrait

Hyunsoo Cha, Inhee Lee, Hanbyul Joo

CVPR 2025posterarXiv:2412.21206
#19434

Towards Explainable and Unprecedented Accuracy in Matching Challenging Finger Crease Patterns

Zhenyu Zhou, Chengdong Dong, Ajay Kumar

CVPR 2025highlight
#19435

Animate and Sound an Image

Xihua Wang, Ruihua Song, Chongxuan Li et al.

CVPR 2025poster
#19436

Enhancing Few-Shot Class-Incremental Learning via Training-Free Bi-Level Modality Calibration

Yiyang Chen, Tianyu Ding, Lei Wang et al.

CVPR 2025poster
#19437

Discovering Hidden Visual Concepts Beyond Linguistic Input in Infant Learning

Xueyi Ke, Satoshi Tsutsui, Yayun Zhang et al.

CVPR 2025posterarXiv:2501.05205
#19438

BEVDiffuser: Plug-and-Play Diffusion Model for BEV Denoising with Ground-Truth Guidance

Xin Ye, Burhan Yaman, Sheng Cheng et al.

CVPR 2025highlightarXiv:2502.19694
#19439

Stop Learning it all to Mitigate Visual Hallucination, Focus on the Hallucination Target.

Dokyoon Yoon, Youngsook Song, Woomyoung Park

CVPR 2025posterarXiv:2506.11417
#19440

LeanGaussian: Breaking Pixel or Point Cloud Correspondence in Modeling 3D Gaussians

Jiamin WU, Kenkun Liu, Han Gao et al.

CVPR 2025posterarXiv:2404.16323
#19441

Rethinking Temporal Fusion with a Unified Gradient Descent View for 3D Semantic Occupancy Prediction

Dubing Chen, Huan Zheng, Jin Fang et al.

CVPR 2025posterarXiv:2504.12959
#19442

Star with Bilinear Mapping

Zelin Peng, Yu Huang, Zhengqin Xu et al.

CVPR 2025poster
#19443

OSMamba: Omnidirectional Spectral Mamba with Dual-Domain Prior Generator for Exposure Correction

Gehui Li, Bin Chen, Chen Zhao et al.

CVPR 2025posterarXiv:2411.15255
#19444

ArticulatedGS: Self-supervised Digital Twin Modeling of Articulated Objects using 3D Gaussian Splatting

Guo Junfu, Yu Xin, Gaoyi Liu et al.

CVPR 2025posterarXiv:2503.08135
#19445

Motions as Queries: One-Stage Multi-Person Holistic Human Motion Capture

Kenkun Liu, Yurong Fu, Weihao Yuan et al.

CVPR 2025poster
#19446

Large-scale Multi-view Tensor Clustering with Implicit Linear Kernels

Jiyuan Liu, Xinwang Liu, chuankun Li et al.

CVPR 2025poster
#19447

Accurate Scene Text Recognition with Efficient Model Scaling and Cloze Self-Distillation

Andrea Maracani, Savas Ozkan, Sijun Cho et al.

CVPR 2025posterarXiv:2503.16184
#19448

Analyzing the Synthetic-to-Real Domain Gap in 3D Hand Pose Estimation

Zhuoran ZHAO, Linlin Yang, Pengzhan Sun et al.

CVPR 2025posterarXiv:2503.19307
#19449

Zero-Shot 4D Lidar Panoptic Segmentation

Yushan Zhang, Aljoša Ošep, Laura Leal-Taixe et al.

CVPR 2025posterarXiv:2504.00848
#19450

POPEN: Preference-Based Optimization and Ensemble for LVLM-Based Reasoning Segmentation

Lanyun Zhu, Tianrun Chen, Qianxiong Xu et al.

CVPR 2025posterarXiv:2504.00640
#19451

HyperNet Fields: Efficiently Training Hypernetworks without Ground Truth by Learning Weight Trajectories

Eric Hedlin, Munawar Hayat, Fatih Porikli et al.

CVPR 2025posterarXiv:2412.17040
#19452

RICCARDO: Radar Hit Prediction and Convolution for Camera-Radar 3D Object Detection

Yunfei Long, Abhinav Kumar, Xiaoming Liu et al.

CVPR 2025posterarXiv:2504.09086
#19453

IDEA: Inverted Text with Cooperative Deformable Aggregation for Multi-modal Object Re-Identification

Yuhao Wang, Yongfeng Lv, Pingping Zhang et al.

CVPR 2025posterarXiv:2503.10324
#19454

MobilePortrait: Real-Time One-Shot Neural Head Avatars on Mobile Devices

Jianwen Jiang, Gaojie Lin, Zhengkun Rong et al.

CVPR 2025posterarXiv:2407.05712
#19455

Jailbreaking the Non-Transferable Barrier via Test-Time Data Disguising

Yongli Xiang, Ziming Hong, Lina Yao et al.

CVPR 2025posterarXiv:2503.17198
#19456

Joint Vision-Language Social Bias Removal for CLIP

Haoyu Zhang, Yangyang Guo, Mohan Kankanhalli

CVPR 2025posterarXiv:2411.12785
#19457

CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning

Yang Yue, Yulin Wang, Chenxin Tao et al.

CVPR 2025posterarXiv:2504.13820
#19458

Knowledge Bridger: Towards Training-Free Missing Modality Completion

Guanzhou Ke, Shengfeng He, Xiao-Li Wang et al.

CVPR 2025posterarXiv:2502.19834
#19459

SkySense-O: Towards Open-World Remote Sensing Interpretation with Vision-Centric Visual-Language Modeling

Qi Zhu, Jiangwei Lao, Deyi Ji et al.

CVPR 2025poster
#19460

AdaDARE-gamma: Balancing Stability and Plasticity in Multi-modal LLMs through Efficient Adaptation

Jingyi Xie, Jintao Yang, Zhunchen Luo et al.

CVPR 2025poster
#19461

Fingerprinting Denoising Diffusion Probabilistic Models

Huan Teng, Yuhui Quan, Chengyu Wang et al.

CVPR 2025poster
#19462

Query Efficient Black-Box Visual Prompting with Subspace Learning

Haozhen Zhang, Zhaogeng Liu, Hualin Zhang et al.

CVPR 2025poster
#19463

Efficient Depth Estimation for Unstable Stereo Camera Systems on AR Glasses

Yongfan Liu, Hyoukjun Kwon

CVPR 2025posterarXiv:2411.10013
#19464

Seeing is Not Believing: Adversarial Natural Object Optimization for Hard-Label 3D Scene Attacks

Daizong Liu, Wei Hu

CVPR 2025poster
#19465

Once-Tuning-Multiple-Variants: Tuning Once and Expanded as Multiple Vision-Language Model Variants

Chong Yu, Tao Chen, Zhongxue Gan

CVPR 2025poster
#19466

Heterogeneous Skeleton-Based Action Representation Learning

Xiaoyan Ma, jidong kuang, Hongsong Wang et al.

CVPR 2025posterarXiv:2506.03481
#19467

Recurrent Feature Mining and Keypoint Mixup Padding for Category-Agnostic Pose Estimation

Junjie Chen, Weilong Chen, Yifan Zuo et al.

CVPR 2025posterarXiv:2503.21140
#19468

Seeing Speech and Sound: Distinguishing and Locating Audio Sources in Visual Scenes

Hyeonggon Ryu, Seongyu Kim, Joon Chung et al.

CVPR 2025poster
#19469

FASTer: Focal token Acquiring-and-Scaling Transformer for Long-term 3D Objection Detection

Chenxu Dang, Pei An, Xinmin Zhang et al.

CVPR 2025posterarXiv:2503.01899
#19470

Seeking Consistent Flat Minima for Better Domain Generalization via Refining Loss Landscapes

Aodi Li, Liansheng Zhuang, Xiao Long et al.

CVPR 2025posterarXiv:2412.13573
#19471

Detect-and-Guide: Self-regulation of Diffusion Models for Safe Text-to-Image Generation via Guideline Token Optimization

Feifei Li, Mi Zhang, Yiming Sun et al.

CVPR 2025posterarXiv:2503.15197
#19472

Beyond Local Sharpness: Communication-Efficient Global Sharpness-aware Minimization for Federated Learning

Debora Caldarola, Pietro Cagnasso, Barbara Caputo et al.

CVPR 2025posterarXiv:2412.03752
#19473

LoKi: Low-dimensional KAN for Efficient Fine-tuning Image Models

Xuan Cai, Renjie Pan, Hua Yang

CVPR 2025poster
#19474

DIV-FF: Dynamic Image-Video Feature Fields For Environment Understanding in Egocentric Videos

Lorenzo Mur-Labadia, Jose J. Guerrero, Ruben Martinez-Cantin

CVPR 2025highlightarXiv:2503.08344
#19475

AdMiT: Adaptive Multi-Source Tuning in Dynamic Environments

Xiangyu Chang, Fahim Faisal Niloy, Sk Miraj Ahmed et al.

CVPR 2025poster
#19476

Viewpoint Rosetta Stone: Unlocking Unpaired Ego-Exo Videos for View-invariant Representation Learning

Mi Luo, Zihui Xue, Alex Dimakis et al.

CVPR 2025poster
#19477

GIF: Generative Inspiration for Face Recognition at Scale

Mohammad Saadabadi Saadabadi, Sahar Rahimi Malakshan, Ali Dabouei et al.

CVPR 2025poster
#19478

CrossSDF: 3D Reconstruction of Thin Structures From Cross-Sections

Thomas Walker, Salvatore Esposito, Daniel Rebain et al.

CVPR 2025posterarXiv:2412.04120
#19479

Investigating the Role of Weight Decay in Enhancing Nonconvex SGD

Tao Sun, Yuhao Huang, Li Shen et al.

CVPR 2025poster
#19480

MaskGWM: A Generalizable Driving World Model with Video Mask Reconstruction

Jingcheng Ni, Yuxin Guo, Yichen Liu et al.

CVPR 2025posterarXiv:2502.11663
#19481

AVQACL: A Novel Benchmark for Audio-Visual Question Answering Continual Learning

Kaixuan Wu, Xinde Li, Xinglin Li et al.

CVPR 2025poster
#19482

Divide and Conquer: Heterogeneous Noise Integration for Diffusion-based Adversarial Purification

Gaozheng Pei, Shaojie Lyu, Gong Chen et al.

CVPR 2025posterarXiv:2503.01407
#19483

Foley-Flow: Coordinated Video-to-Audio Generation with Masked Audio-Visual Alignment and Dynamic Conditional Flows

Shentong Mo, Yibing Song

CVPR 2025poster
#19484

DL2G: Degradation-guided Local-to-Global Restoration for Eyeglass Reflection Removal

Yizhilv, Xiao Lu, Hong Ding et al.

CVPR 2025poster
#19485

Efficient Decoupled Feature 3D Gaussian Splatting via Hierarchical Compression

Zhenqi Dai, Ting Liu, Yanning Zhang

CVPR 2025poster
#19486

Hearing Hands: Generating Sounds from Physical Interactions in 3D Scenes

Yiming Dou, Wonseok Oh, Yuqing Luo et al.

CVPR 2025posterarXiv:2506.09989
#19487

Domain Generalization in CLIP via Learning with Diverse Text Prompts

Changsong Wen, Zelin Peng, Yu Huang et al.

CVPR 2025poster
#19488

Alias-Free Latent Diffusion Models: Improving Fractional Shift Equivariance of Diffusion Latent Space

Yifan Zhou, Zeqi Xiao, Shuai Yang et al.

CVPR 2025posterarXiv:2503.09419
#19489

Weakly Supervised Contrastive Adversarial Training for Learning Robust Features from Semi-supervised Data

Lilin Zhang, Chengpei Wu, Ning Yang

CVPR 2025posterarXiv:2503.11032
#19490

CarPlanner: Consistent Auto-regressive Trajectory Planning for Large-Scale Reinforcement Learning in Autonomous Driving

Dongkun Zhang, Jiaming Liang, Ke Guo et al.

CVPR 2025posterarXiv:2502.19908
#19491

UCM-VeID V2: A Richer Dataset and A Pre-training Method for UAV Cross-Modality Vehicle Re-Identification

Xingyue Liu, Jiahao Qi, Chen Chen et al.

CVPR 2025poster
#19492

Unboxed: Geometrically and Temporally Consistent Video Outpainting

Zhongrui Yu, Martina Megaro-Boldini, Robert Sumner et al.

CVPR 2025poster
#19493

Visual Lexicon: Rich Image Features in Language Space

XuDong Wang, Xingyi Zhou, Alireza Fathi et al.

CVPR 2025posterarXiv:2412.06774
#19494

Continual SFT Matches Multimodal RLHF with Negative Supervision

Ke Zhu, Yu Wang, Yanpeng Sun et al.

CVPR 2025posterarXiv:2411.14797
#19495

Decoupled Motion Expression Video Segmentation

Hao Fang, Runmin Cong, Xiankai Lu et al.

CVPR 2025poster
#19496

Mixture of Submodules for Domain Adaptive Person Search

Minsu Kim, Seungryong Kim, Kwanghoon Sohn

CVPR 2025poster
#19497

Unsupervised Discovery of Facial Landmarks and Head Pose

Satyajit Tourani, Siddharth Tourani, Arif Mahmood et al.

CVPR 2025poster
#19498

Dynamic Integration of Task-Specific Adapters for Class Incremental Learning

Jiashuo Li, Shaokun Wang, Bo Qian et al.

CVPR 2025posterarXiv:2409.14983
#19499

Test-time Augmentation Improves Efficiency in Conformal Prediction

Divya M Shanmugam, Helen Lu, Swami Sankaranarayanan et al.

CVPR 2025posterarXiv:2505.22764
#19500

GREAT: Geometry-Intention Collaborative Inference for Open-Vocabulary 3D Object Affordance Grounding

Yawen Shao, Wei Zhai, Yuhang Yang et al.

CVPR 2025posterarXiv:2411.19626
#19501

Dual Diffusion for Unified Image Generation and Understanding

Zijie Li, Henry Li, Yichun Shi et al.

CVPR 2025posterarXiv:2501.00289
#19502

Commonsense Video Question Answering through Video-Grounded Entailment Tree Reasoning

Huabin Liu, Filip Ilievski, Cees G. M. Snoek

CVPR 2025posterarXiv:2501.05069
#19503

Enduring, Efficient and Robust Trajectory Prediction Attack in Autonomous Driving via Optimization-Driven Multi-Frame Perturbation Framework

Yi Yu, Weizhen Han, Libing Wu et al.

CVPR 2025highlight
#19504

UNEM: UNrolled Generalized EM for Transductive Few-Shot Learning

Long Zhou, Fereshteh Shakeri, Aymen Sadraoui et al.

CVPR 2025posterarXiv:2412.16739
#19505

Flash-Split: 2D Reflection Removal with Flash Cues and Latent Diffusion Separation

Tianfu Wang, Mingyang Xie, Haoming Cai et al.

CVPR 2025posterarXiv:2501.00637
#19506

Test-Time Backdoor Detection for Object Detection Models

Hangtao Zhang, Yichen Wang, Shihui Yan et al.

CVPR 2025posterarXiv:2503.15293
#19507

Classifier-Free Guidance Inside the Attraction Basin May Cause Memorization

Anubhav Jain, Yuya Kobayashi, Takashi Shibuya et al.

CVPR 2025posterarXiv:2411.16738
#19508

Reconciling Stochastic and Deterministic Strategies for Zero-shot Image Restoration using Diffusion Model in Dual

Chong Wang, Lanqing Guo, Zixuan Fu et al.

CVPR 2025posterarXiv:2503.01288
#19509

Harnessing Frequency Spectrum Insights for Image Copyright Protection Against Diffusion Models

Zhenguang Liu, Chao Shuai, Shaojing Fan et al.

CVPR 2025posterarXiv:2503.11071
#19510

Gain from Neighbors: Boosting Model Robustness in the Wild via Adversarial Perturbations Toward Neighboring Classes

Zhou Yang, Mingtao Feng, Tao Huang et al.

CVPR 2025poster
#19511

Enhancing Creative Generation on Stable Diffusion-based Models

Jiyeon Han, Dahee Kwon, Gayoung Lee et al.

CVPR 2025posterarXiv:2503.23538
#19512

EquiPose: Exploiting Permutation Equivariance for Relative Camera Pose Estimation

Yuzhen Liu, Qiulei Dong

CVPR 2025poster
#19513

EarthDial: Turning Multi-sensory Earth Observations to Interactive Dialogues

Sagar Soni, Akshay Dudhane, Hiyam Debary et al.

CVPR 2025posterarXiv:2412.15190
#19514

Visual Consensus Prompting for Co-Salient Object Detection

Jie Wang, Nana Yu, Zihao Zhang et al.

CVPR 2025posterarXiv:2504.14254
#19515

Classifier-guided CLIP Distillation for Unsupervised Multi-label Classification

Dongseob Kim, Hyunjung Shim

CVPR 2025posterarXiv:2503.16873
#19516

UniGoal: Towards Universal Zero-shot Goal-oriented Navigation

Hang Yin, Xiuwei Xu, Linqing Zhao et al.

CVPR 2025posterarXiv:2503.10630
#19517

Floating No More: Object-Ground Reconstruction from a Single Image

Yunze Man, Yichen Sheng, Jianming Zhang et al.

CVPR 2025posterarXiv:2407.18914
#19518

pFedMxF: Personalized Federated Class-Incremental Learning with Mixture of Frequency Aggregation

Yifei Zhang, Hao Zhu, Alysa Ziying Tan et al.

CVPR 2025poster
#19519

The Art of Deception: Color Visual Illusions and Diffusion Models

Alexandra Gomez-Villa, Kai Wang, C.Alejandro Parraga et al.

CVPR 2025posterarXiv:2412.10122
#19520

ACAttack: Adaptive Cross Attacking RGB-T Tracker via Multi-Modal Response Decoupling

Xinyu Xiang, Qinglong Yan, HAO ZHANG et al.

CVPR 2025poster
#19521

Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis

Bingda Tang, Sayak Paul, Boyang Zheng et al.

CVPR 2025posterarXiv:2505.10046
#19522

SemiETS: Integrating Spatial and Content Consistencies for Semi-Supervised End-to-end Text Spotting

Dongliang Luo, Hanshen Zhu, Ziyang Zhang et al.

CVPR 2025posterarXiv:2504.09966
#19523

Distilling Long-tailed Datasets

Zhenghao Zhao, Haoxuan Wang, Yuzhang Shang et al.

CVPR 2025posterarXiv:2408.14506
#19524

Knowledge Memorization and Rumination for Pre-trained Model-based Class-Incremental Learning

Zijian Gao, Wangwang Jia, Xingxing Zhang et al.

CVPR 2025poster
#19525

PQPP: A Joint Benchmark for Text-to-Image Prompt and Query Performance Prediction

Eduard Poesina, Adriana Valentina Costache, Adrian-Gabriel Chifu et al.

CVPR 2025posterarXiv:2406.04746
#19526

CheXwhatsApp: A Dataset for Exploring Challenges in the Diagnosis of Chest X-rays through Mobile Devices

Mariamma Antony, Rajiv Porana, Sahil M. Lathiya et al.

CVPR 2025poster
#19527

From Prototypes to General Distributions: An Efficient Curriculum for Masked Image Modeling

Jinhong Lin, Cheng-En Wu, Huanran Li et al.

CVPR 2025posterarXiv:2411.10685
#19528

SINR: Sparsity Driven Compressed Implicit Neural Representations

Dhananjaya Jayasundara, Sudarshan Rajagopalan, Yasiru Ranasinghe et al.

CVPR 2025posterarXiv:2503.19576
#19529

Erase Diffusion: Empowering Object Removal Through Calibrating Diffusion Pathways

Yi Liu, Hao Zhou, Benlei Cui et al.

CVPR 2025highlightarXiv:2503.07026
#19530

DVHGNN: Multi-Scale Dilated Vision HGNN for Efficient Vision Recognition

Caoshuo Li, Tanzhe Li, Xiaobin Hu et al.

CVPR 2025posterarXiv:2503.14867
#19531

RobSense: A Robust Multi-modal Foundation Model for Remote Sensing with Static, Temporal, and Incomplete Data Adaptability

Minh Kha Do, Kang Han, Phu Lai et al.

CVPR 2025poster
#19532

MESC-3D:Mining Effective Semantic Cues for 3D Reconstruction from a Single Image

Shaoming Li, Qing Cai, Songqi KONG et al.

CVPR 2025poster
#19533

Probing the Mid-level Vision Capabilities of Self-Supervised Learning

Xuweiyi Chen, Markus Marks, Zezhou Cheng

CVPR 2025posterarXiv:2411.17474
#19534

ConceptGuard: Continual Personalized Text-to-Image Generation with Forgetting and Confusion Mitigation

Zirun Guo, Tao Jin

CVPR 2025posterarXiv:2503.10358
#19535

Diffusion Bridge: Leveraging Diffusion Model to Reduce the Modality Gap Between Text and Vision for Zero-Shot Image Captioning

Jeongryong Lee, Yejee Shin, Geonhui Son et al.

CVPR 2025poster
#19536

The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation

Bingjie Gao, Xinyu Gao, Xiaoxue Wu et al.

CVPR 2025posterarXiv:2504.11739
#19537

VLMs-Guided Representation Distillation for Efficient Vision-Based Reinforcement Learning

Haoran Xu, Peixi Peng, Guang Tan et al.

CVPR 2025poster
#19538

Beyond Image Classification: A Video Benchmark and Dual-Branch Hybrid Discrimination Framework for Compositional Zero-Shot Learning

Dongyao Jiang, Haodong Jing, Yongqiang Ma et al.

CVPR 2025poster
#19539

AniGrad: Anisotropic Gradient-Adaptive Sampling for 3D Reconstruction From Monocular Video

Noah Stier, Alex Rich, Pradeep Sen et al.

CVPR 2025poster
#19540

Easy-editable Image Vectorization with Multi-layer Multi-scale Distributed Visual Feature Embedding

Ye Chen, Zhangli Hu, Zhongyin Zhao et al.

CVPR 2025poster
#19541

DynFocus: Dynamic Cooperative Network Empowers LLMs with Video Understanding

Yudong Han, Qingpei Guo, Liyuan Pan et al.

CVPR 2025posterarXiv:2411.12355
#19542

Automated Proof of Polynomial Inequalities via Reinforcement Learning

Banglong Liu, Niuniu Qi, Xia Zeng et al.

CVPR 2025posterarXiv:2503.06592
#19543

BIP3D: Bridging 2D Images and 3D Perception for Embodied Intelligence

Xuewu Lin, Tianwei Lin, Alan Huang et al.

CVPR 2025posterarXiv:2411.14869
#19544

MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors

Riku Murai, Eric Dexheimer, Andrew J. Davison

CVPR 2025highlightarXiv:2412.12392
#19545

Toward Robust Neural Reconstruction from Sparse Point Sets

Amine Ouasfi, Shubhendu Jena, Eric Marchand et al.

CVPR 2025posterarXiv:2412.16361
#19546

Just Dance with pi! A Poly-modal Inductor for Weakly-supervised Video Anomaly Detection

Snehashis Majhi, Giacomo D'Amicantonio, Antitza Dantcheva et al.

CVPR 2025highlight
#19547

Gaussian Splashing: Unified Particles for Versatile Motion Synthesis and Rendering

Yutao Feng, Xiang Feng, Yintong Shang et al.

CVPR 2025posterarXiv:2401.15318
#19548

Improving Accuracy and Calibration via Differentiated Deep Mutual Learning

Han Liu, Peng Cui, Bingning Wang et al.

CVPR 2025poster
#19549

Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer

Jiahao Cui, Hui Li, Qingkun Su et al.

CVPR 2025posterarXiv:2412.00733
#19550

Improving the Transferability of Adversarial Attacks on Face Recognition with Diverse Parameters Augmentation

Fengfan Zhou, Bangjie Yin, Hefei Ling et al.

CVPR 2025posterarXiv:2411.15555
#19551

Reward Fine-Tuning Two-Step Diffusion Models via Learning Differentiable Latent-Space Surrogate Reward

Zhiwei Jia, Yuesong Nan, Huixi Zhao et al.

CVPR 2025posterarXiv:2411.15247
#19552

CaricatureBooth: Data-Free Interactive Caricature Generation in a Photo Booth

Zhiyu Qu, Yunqi Miao, Zhensong Zhang et al.

CVPR 2025poster
#19553

Channel-wise Noise Scheduled Diffusion for Inverse Rendering in Indoor Scenes

JunYong Choi, Min-Cheol Sagong, SeokYeong Lee et al.

CVPR 2025posterarXiv:2503.09993
#19554

Discovering Fine-Grained Visual-Concept Relations by Disentangled Optimal Transport Concept Bottleneck Models

Yan Xie, Zequn Zeng, Hao Zhang et al.

CVPR 2025posterarXiv:2505.07209
#19555

Embodied Scene Understanding for Vision Language Models via MetaVQA

Weizhen Wang, Chenda Duan, Zhenghao Peng et al.

CVPR 2025posterarXiv:2501.09167
#19556

Learning Flow Fields in Attention for Controllable Person Image Generation

Zijian Zhou, Shikun Liu, Xiao Han et al.

CVPR 2025posterarXiv:2412.08486
#19557

Forming Auxiliary High-confident Instance-level Loss to Promote Learning from Label Proportions

Tianhao Ma, Han Chen, Juncheng Hu et al.

CVPR 2025posterarXiv:2411.10364
#19558

Towards Open-Vocabulary Audio-Visual Event Localization

Jinxing Zhou, Dan Guo, Ruohao Guo et al.

CVPR 2025posterarXiv:2411.11278
#19559

ADU: Adaptive Detection of Unknown Categories in Black-Box Domain Adaptation

Yushan Lai, Guowen Li, Haoyuan Liang et al.

CVPR 2025poster
#19560

Coeff-Tuning: A Graph Filter Subspace View for Tuning Attention-Based Large Models

Zichen Miao, WEI CHEN, Qiang Qiu

CVPR 2025highlightarXiv:2503.18337
#19561

MVBoost: Boost 3D Reconstruction with Multi-View Refinement

Xiangyu Liu, Xiaomei Zhang, Zhiyuan Ma et al.

CVPR 2025posterarXiv:2411.17772
#19562

CARL: A Framework for Equivariant Image Registration

Hastings Greer, Lin Tian, François-Xavier Vialard et al.

CVPR 2025posterarXiv:2405.16738
#19563

AeroGen: Enhancing Remote Sensing Object Detection with Diffusion-Driven Data Generation

Datao Tang, Xiangyong Cao, Xuan Wu et al.

CVPR 2025posterarXiv:2411.15497
#19564

UniSTD: Towards Unified Spatio-Temporal Learning across Diverse Disciplines

Chen Tang, Xinzhu Ma, Encheng Su et al.

CVPR 2025posterarXiv:2503.20748
#19565

RENO: Real-Time Neural Compression for 3D LiDAR Point Clouds

Kang You, Tong Chen, Dandan Ding et al.

CVPR 2025posterarXiv:2503.12382
#19566

MaskGaussian: Adaptive 3D Gaussian Representation from Probabilistic Masks

Yifei Liu, Zhihang Zhong, Yifan Zhan et al.

CVPR 2025posterarXiv:2412.20522
#19567

Forensics Adapter: Adapting CLIP for Generalizable Face Forgery Detection

Xinjie Cui, Yuezun Li, Ao Luo et al.

CVPR 2025poster
#19568

Co-Speech Gesture Video Generation with Implicit Motion-Audio Entanglement

Xinjie Li, Ziyi Chen, Xinlu Yu et al.

CVPR 2025poster
#19569

Janus: Decoupling Visual Encoding for Unified Multimodal Understanding and Generation

Chengyue Wu, Xiaokang Chen, Zhiyu Wu et al.

CVPR 2025posterarXiv:2410.13848
#19570

FATE: Full-head Gaussian Avatar with Textural Editing from Monocular Video

Jiawei Zhang, Zijian Wu, Zhiyang Liang et al.

CVPR 2025posterarXiv:2411.15604
#19571

Can Knowledge be Transferred from Unimodal to Multimodal? Investigating the Transitivity of Multimodal Knowledge Editing

Lingyong Fang, Xinzhong Wang, Depeng depeng wang et al.

ICCV 2025poster
#19572

ConsNoTrainLoRA: Data-driven Weight Initialization of Low-rank Adapters using Constraints

Debasmit Das, Hyoungwoo Park, Munawar Hayat et al.

ICCV 2025posterarXiv:2507.08044
#19573

UDC-VIT: A Real-World Video Dataset for Under-Display Cameras

Kyusu Ahn, JiSoo Kim, Sangik Lee et al.

ICCV 2025highlightarXiv:2501.18545
#19574

Is Visual in-Context Learning for Compositional Medical Tasks within Reach?

Simon Reiß, Zdravko Marinov, Alexander Jaus et al.

ICCV 2025posterarXiv:2507.00868
#19575

Optimal Transport for Brain-Image Alignment: Unveiling Redundancy and Synergy in Neural Information Processing

Yang Xiao, Wang Lu, Jie Ji et al.

ICCV 2025posterarXiv:2503.10663
#19576

Chimera: Improving Generalist Model with Domain-Specific Experts

Tianshuo Peng, Mingsheng Li, Jiakang Yuan et al.

ICCV 2025posterarXiv:2412.05983
#19577

Enhanced Event-based Dense Stereo via Cross-Sensor Knowledge Distillation

Haihao Zhang, Yunjian Zhang, Jianing Li et al.

ICCV 2025poster
#19578

Not Only Vision: Evolve Visual Speech Recognition via Peripheral Information

Zhaoxin Yuan, Shuang Yang, Shiguang Shan et al.

ICCV 2025poster
#19579

ReFlex: Text-Guided Editing of Real Images in Rectified Flow via Mid-Step Feature Extraction and Attention Adaptation

Jimyeong Kim, Jungwon Park, Yeji Song et al.

ICCV 2025highlightarXiv:2507.01496
#19580

Imbalance in Balance: Online Concept Balancing in Generation Models

Yukai Shi, Jiarong Ou, Rui Chen et al.

ICCV 2025posterarXiv:2507.13345
#19581

RALoc: Enhancing Outdoor LiDAR Localization via Rotation Awareness

Yuyang Yang, Wen Li, Sheng Ao et al.

ICCV 2025highlight
#19582

Derm1M: A Million-scale Vision-Language Dataset Aligned with Clinical Ontology Knowledge for Dermatology

Siyuan Yan, Ming Hu, Yiwen Jiang et al.

ICCV 2025highlightarXiv:2503.14911
#19583

MagicID: Hybrid Preference Optimization for ID-Consistent and Dynamic-Preserved Video Customization

Hengjia Li, Lifan Jiang, Xi Xiao et al.

ICCV 2025posterarXiv:2503.12689
#19584

Visual Interestingness Decoded: How GPT-4o Mirrors Human Interests

Fitim Abdullahu, Helmut Grabner

ICCV 2025posterarXiv:2510.13316
#19585

D-Attn: Decomposed Attention for Large Vision-and-Language Model

Chia-Wen Kuo, Sijie Zhu, Fan Chen et al.

ICCV 2025posterarXiv:2502.01906
#19586

Understanding Personal Concept in Open-Vocabulary Semantic Segmentation

Sunghyun Park, Jungsoo Lee, Shubhankar Borse et al.

ICCV 2025posterarXiv:2507.11030
#19587

CoDa-4DGS: Dynamic Gaussian Splatting with Context and Deformation Awareness for Autonomous Driving

Rui Song, Chenwei Liang, Yan Xia et al.

ICCV 2025posterarXiv:2503.06744
#19588

UnZipLoRA: Separating Content and Style from a Single Image

Chang Liu, Viraj Shah, Aiyu Cui et al.

ICCV 2025highlightarXiv:2412.04465
#19589

SAM Encoder Breach by Adversarial Simplicial Complex Triggers Downstream Model Failures

Yi Qin, Rui Wang, Tao Huang et al.

ICCV 2025posterarXiv:2508.06127
#19590

Semi-supervised Concept Bottleneck Models

Lijie Hu, Tianhao Huang, Huanyi Xie et al.

ICCV 2025posterarXiv:2406.18992
#19591

WINS: Winograd Structured Pruning for Fast Winograd Convolution

Cheonjun Park, Hyunjae Oh, Mincheol Park et al.

ICCV 2025highlight
#19592

Sparsity Outperforms Low-Rank Projections in Few-Shot Adaptation

Nairouz Mrabah, Nicolas Richet, Ismail Ayed et al.

ICCV 2025posterarXiv:2504.12436
#19593

ART: Adaptive Relation Tuning for Generalized Relation Prediction

Gopika Sudhakaran, Hikaru Shindo, Patrick Schramowski et al.

ICCV 2025posterarXiv:2507.23543
#19594

Feed-Forward SceneDINO for Unsupervised Semantic Scene Completion

Aleksandar Jevtić, Christoph Reich, Felix Wimbauer et al.

ICCV 2025posterarXiv:2507.06230
#19595

No Pose at All: Self-Supervised Pose-Free 3D Gaussian Splatting from Sparse Views

Ranran Huang, Krystian Mikolajczyk

ICCV 2025highlightarXiv:2508.01171
#19596

Cooperative Pseudo Labeling for Unsupervised Federated Classification

Kuangpu Guo, Lijun Sheng, Yongcan Yu et al.

ICCV 2025posterarXiv:2510.10100
#19597

MemDistill: Distilling LiDAR Knowledge into Memory for Camera-Only 3D Object Detection

Donghyeon Kwon, Youngseok Yoon, Hyeongseok Son et al.

ICCV 2025poster
#19598

From Sharp to Blur: Unsupervised Domain Adaptation for 2D Human Pose Estimation Under Extreme Motion Blur Using Event Cameras

Youngho Kim, Hoonhee Cho, Kuk-Jin Yoon

ICCV 2025posterarXiv:2507.22438
#19599

Augmenting Moment Retrieval: Zero-Dependency Two-Stage Learning

Zhengxuan Wei, Jiajin Tang, Sibei Yang

ICCV 2025posterarXiv:2510.19622
#19600

PAN-Crafter: Learning Modality-Consistent Alignment for PAN-Sharpening

Jeonghyeok Do, Sungpyo Kim, Geunhyuk Youk et al.

ICCV 2025posterarXiv:2505.23367