Most Cited CVPR "uncertain preferences" Papers

5,589 papers found • Page 26 of 28

#5001

AnyMap: Learning a General Camera Model for Structure-from-Motion with Unknown Distortion in Dynamic Scenes

Andrea Porfiri Dal Cin, Georgi Dikov, Jihong Ju et al.

CVPR 2025 • poster
#5002

Floating No More: Object-Ground Reconstruction from a Single Image

Yunze Man, Yichen Sheng, Jianming Zhang et al.

CVPR 2025 • poster • arXiv:2407.18914
#5003

SKE-Layout: Spatial Knowledge Enhanced Layout Generation with LLMs

Junsheng Wang, Nieqing Cao, Yan Ding et al.

CVPR 2025 • poster
#5004

DiffLO: Semantic-Aware LiDAR Odometry with Diffusion-Based Refinement

Huang Yongshu, Chen Liu, Minghang Zhu et al.

CVPR 2025 • poster
#5005

pFedMxF: Personalized Federated Class-Incremental Learning with Mixture of Frequency Aggregation

Yifei Zhang, Hao Zhu, Alysa Ziying Tan et al.

CVPR 2025 • poster
#5006

Seeking Consistent Flat Minima for Better Domain Generalization via Refining Loss Landscapes

Aodi Li, Liansheng Zhuang, Xiao Long et al.

CVPR 2025 • poster • arXiv:2412.13573
#5007

The Art of Deception: Color Visual Illusions and Diffusion Models

Alexandra Gomez-Villa, Kai Wang, C. Alejandro Parraga et al.

CVPR 2025 • poster • arXiv:2412.10122
#5008

Continuous Adverse Weather Removal via Degradation-Aware Distillation

Xin Lu, Jie Xiao, Yurui Zhu et al.

CVPR 2025 • poster
#5009

Noise-Consistent Siamese-Diffusion for Medical Image Synthesis and Segmentation

Kunpeng Qiu, Zhiqiang Gao, Zhiying Zhou et al.

CVPR 2025 • poster • arXiv:2505.06068
#5010

High-fidelity 3D Object Generation from Single Image with RGBN-Volume Gaussian Reconstruction Model

Yiyang Shen, Kun Zhou, He Wang et al.

CVPR 2025 • highlight • arXiv:2504.01512
#5011

FASTer: Focal token Acquiring-and-Scaling Transformer for Long-term 3D Object Detection

Chenxu Dang, Pei An, Xinmin Zhang et al.

CVPR 2025 • poster • arXiv:2503.01899
#5012

ACAttack: Adaptive Cross Attacking RGB-T Tracker via Multi-Modal Response Decoupling

Xinyu Xiang, Qinglong Yan, Hao Zhang et al.

CVPR 2025 • poster
#5013

Subspace Constraint and Contribution Estimation for Heterogeneous Federated Learning

Xiangtao Zhang, Sheng Li, Ao Li et al.

CVPR 2025 • poster
#5014

Emphasizing Discriminative Features for Dataset Distillation in Complex Scenarios

Kai Wang, Zekai Li, Zhi-Qi Cheng et al.

CVPR 2025 • poster • arXiv:2410.17193
#5015

RNb-NeuS: Reflectance and Normal-based Multi-View 3D Reconstruction

Baptiste Brument, Robin Bruneau, Yvain Queau et al.

CVPR 2024 • poster • arXiv:2312.01215
#5016

Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis

Bingda Tang, Sayak Paul, Boyang Zheng et al.

CVPR 2025 • poster • arXiv:2505.10046
#5017

Seeing Speech and Sound: Distinguishing and Locating Audio Sources in Visual Scenes

Hyeonggon Ryu, Seongyu Kim, Joon Chung et al.

CVPR 2025 • poster
#5018

SemiETS: Integrating Spatial and Content Consistencies for Semi-Supervised End-to-end Text Spotting

Dongliang Luo, Hanshen Zhu, Ziyang Zhang et al.

CVPR 2025 • poster • arXiv:2504.09966
#5019

Recurrent Feature Mining and Keypoint Mixup Padding for Category-Agnostic Pose Estimation

Junjie Chen, Weilong Chen, Yifan Zuo et al.

CVPR 2025 • poster • arXiv:2503.21140
#5020

Distilling Long-tailed Datasets

Zhenghao Zhao, Haoxuan Wang, Yuzhang Shang et al.

CVPR 2025 • poster • arXiv:2408.14506
#5021

Heterogeneous Skeleton-Based Action Representation Learning

Xiaoyan Ma, Jidong Kuang, Hongsong Wang et al.

CVPR 2025 • poster • arXiv:2506.03481
#5022

Once-Tuning-Multiple-Variants: Tuning Once and Expanded as Multiple Vision-Language Model Variants

Chong Yu, Tao Chen, Zhongxue Gan

CVPR 2025 • poster
#5023

Seeing is Not Believing: Adversarial Natural Object Optimization for Hard-Label 3D Scene Attacks

Daizong Liu, Wei Hu

CVPR 2025 • poster
#5024

Knowledge Memorization and Rumination for Pre-trained Model-based Class-Incremental Learning

Zijian Gao, Wangwang Jia, Xingxing Zhang et al.

CVPR 2025 • poster
#5025

Unveil Inversion and Invariance in Flow Transformer for Versatile Image Editing

Pengcheng Xu, Boyuan Jiang, Xiaobin Hu et al.

CVPR 2025 • poster • arXiv:2411.15843
#5026

SeqMvRL: A Sequential Fusion Framework for Multi-view Representation Learning

Ren Wang, Haoliang Sun, Yuxiu Lin et al.

CVPR 2025 • poster
#5027

Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language

Yicheng Chen, Xiangtai Li, Yining Li et al.

CVPR 2025 • poster • arXiv:2406.20085
#5028

HomoGen: Enhanced Video Inpainting via Homography Propagation and Diffusion

Ding Ding, Yueming Pan, Ruoyu Feng et al.

CVPR 2025 • poster
#5029

Hierarchical Diffusion Policy for Kinematics-Aware Multi-Task Robotic Manipulation

Xiao Ma, Sumit Patidar, Iain Haughton et al.

CVPR 2024 • poster • arXiv:2403.03890
#5030

Structure-from-Motion with a Non-Parametric Camera Model

Yihan Wang, Linfei Pan, Marc Pollefeys et al.

CVPR 2025 • highlight
#5031

EventPSR: Surface Normal and Reflectance Estimation from Photometric Stereo Using an Event Camera

Bohan Yu, Jin Han, Boxin Shi et al.

CVPR 2025 • highlight
#5032

LAL: Enhancing 3D Human Motion Prediction with Latency-aware Auxiliary Learning

Xiaoning Sun, Dong Wei, Huaijiang Sun et al.

CVPR 2025 • poster
#5033

Sea-ing in Low-light

Nisha Varghese, A. N. Rajagopalan

CVPR 2025 • poster
#5034

Enhancing Vision-Language Compositional Understanding with Multimodal Synthetic Data

Haoxin Li, Boyang Li

CVPR 2025 • poster • arXiv:2503.01167
#5035

DiskVPS: Vanishing Point Detector via Hough Transform in a Disk Region

Jianping Wu

CVPR 2025 • poster
#5036

Efficient Depth Estimation for Unstable Stereo Camera Systems on AR Glasses

Yongfan Liu, Hyoukjun Kwon

CVPR 2025 • poster • arXiv:2411.10013
#5037

Towards Autonomous Micromobility through Scalable Urban Simulation

Wayne Wu, Honglin He, Chaoyuan Zhang et al.

CVPR 2025 • highlight • arXiv:2505.00690
#5038

Robotic Visual Instruction

Yanbang Li, ZiYang Gong, Haoyang Li et al.

CVPR 2025 • poster • arXiv:2505.00693
#5039

PQPP: A Joint Benchmark for Text-to-Image Prompt and Query Performance Prediction

Eduard Poesina, Adriana Valentina Costache, Adrian-Gabriel Chifu et al.

CVPR 2025 • poster • arXiv:2406.04746
#5040

CheXwhatsApp: A Dataset for Exploring Challenges in the Diagnosis of Chest X-rays through Mobile Devices

Mariamma Antony, Rajiv Porana, Sahil M. Lathiya et al.

CVPR 2025 • poster
#5041

Learned Binocular-Encoding Optics for RGBD Imaging Using Joint Stereo and Focus Cues

Yuhui Liu, Liangxun Ou, Qiang Fu et al.

CVPR 2025 • poster
#5042

Learning-enabled Polynomial Lyapunov Function Synthesis via High-Accuracy Counterexample-Guided Framework

Hanrui Zhao, Niuniu Qi, Mengxin Ren et al.

CVPR 2025 • poster
#5043

Improving Autoregressive Visual Generation with Cluster-Oriented Token Prediction

Teng Hu, Jiangning Zhang, Ran Yi et al.

CVPR 2025 • poster • arXiv:2501.00880
#5044

Query Efficient Black-Box Visual Prompting with Subspace Learning

Haozhen Zhang, Zhaogeng Liu, Hualin Zhang et al.

CVPR 2025 • poster
#5045

Fingerprinting Denoising Diffusion Probabilistic Models

Huan Teng, Yuhui Quan, Chengyu Wang et al.

CVPR 2025 • poster
#5046

Advancing Adversarial Robustness in GNeRFs: The IL2-NeRF Attack

Nicole Meng, Caleb Manicke, Ronak Sahu et al.

CVPR 2025 • poster
#5047

From Prototypes to General Distributions: An Efficient Curriculum for Masked Image Modeling

Jinhong Lin, Cheng-En Wu, Huanran Li et al.

CVPR 2025 • poster • arXiv:2411.10685
#5048

SINR: Sparsity Driven Compressed Implicit Neural Representations

Dhananjaya Jayasundara, Sudarshan Rajagopalan, Yasiru Ranasinghe et al.

CVPR 2025 • poster • arXiv:2503.19576
#5049

AdaDARE-gamma: Balancing Stability and Plasticity in Multi-modal LLMs through Efficient Adaptation

Jingyi Xie, Jintao Yang, Zhunchen Luo et al.

CVPR 2025 • poster
#5050

Spiking Transformer: Introducing Accurate Addition-Only Spiking Self-Attention for Transformer

Yufei Guo, Xiaode Liu, Yuanpei Chen et al.

CVPR 2025 • poster
#5051

SkySense-O: Towards Open-World Remote Sensing Interpretation with Vision-Centric Visual-Language Modeling

Qi Zhu, Jiangwei Lao, Deyi Ji et al.

CVPR 2025 • poster
#5052

NoiseCtrl: A Sampling-Algorithm-Agnostic Conditional Generation Method for Diffusion Models

Longquan Dai, He Wang, Jinhui Tang

CVPR 2025 • poster
#5053

Knowledge Bridger: Towards Training-Free Missing Modality Completion

Guanzhou Ke, Shengfeng He, Xiao-Li Wang et al.

CVPR 2025 • poster • arXiv:2502.19834
#5054

Erase Diffusion: Empowering Object Removal Through Calibrating Diffusion Pathways

Yi Liu, Hao Zhou, Benlei Cui et al.

CVPR 2025 • highlight • arXiv:2503.07026
#5055

BimArt: A Unified Approach for the Synthesis of 3D Bimanual Interaction with Articulated Objects

Wanyue Zhang, Rishabh Dabral, Vladislav Golyanik et al.

CVPR 2025 • poster • arXiv:2412.05066
#5056

Towards Long-Horizon Vision-Language Navigation: Platform, Benchmark and Method

Xinshuai Song, Weixing Chen, Yang Liu et al.

CVPR 2025 • poster • arXiv:2412.09082
#5057

PCM: Picard Consistency Model for Fast Parallel Sampling of Diffusion Models

Junhyuk So, Jiwoong Shin, Chaeyeon Jang et al.

CVPR 2025 • poster • arXiv:2503.19731
#5058

Towards Precise Scaling Laws for Video Diffusion Transformers

Yuanyang Yin, Yaqi Zhao, Mingwu Zheng et al.

CVPR 2025 • poster • arXiv:2411.17470
#5059

CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning

Yang Yue, Yulin Wang, Chenxin Tao et al.

CVPR 2025 • poster • arXiv:2504.13820
#5060

RobSense: A Robust Multi-modal Foundation Model for Remote Sensing with Static, Temporal, and Incomplete Data Adaptability

Minh Kha Do, Kang Han, Phu Lai et al.

CVPR 2025 • poster
#5061

Sonic: Shifting Focus to Global Audio Perception in Portrait Animation

Xiaozhong Ji, Xiaobin Hu, Zhihong Xu et al.

CVPR 2025 • poster • arXiv:2411.16331
#5062

MESC-3D: Mining Effective Semantic Cues for 3D Reconstruction from a Single Image

Shaoming Li, Qing Cai, Songqi Kong et al.

CVPR 2025 • poster
#5063

Probing the Mid-level Vision Capabilities of Self-Supervised Learning

Xuweiyi Chen, Markus Marks, Zezhou Cheng

CVPR 2025 • poster • arXiv:2411.17474
#5064

T2SG: Traffic Topology Scene Graph for Topology Reasoning in Autonomous Driving

Changsheng Lv, Mengshi Qi, Liang Liu et al.

CVPR 2025 • poster • arXiv:2411.18894
#5065

ConceptGuard: Continual Personalized Text-to-Image Generation with Forgetting and Confusion Mitigation

Zirun Guo, Tao Jin

CVPR 2025 • poster • arXiv:2503.10358
#5066

Diffusion Bridge: Leveraging Diffusion Model to Reduce the Modality Gap Between Text and Vision for Zero-Shot Image Captioning

Jeongryong Lee, Yejee Shin, Geonhui Son et al.

CVPR 2025 • poster
#5067

MODfinity: Unsupervised Domain Adaptation with Multimodal Information Flow Intertwining

Shanglin Liu, Jianming Lv, Jingdan Kang et al.

CVPR 2025 • poster
#5068

Volume Tells: Dual Cycle-Consistent Diffusion for 3D Fluorescence Microscopy De-noising and Super-Resolution

Zelin Li, Chenwei Wang, Zhaoke Huang et al.

CVPR 2025 • highlight • arXiv:2503.02261
#5069

Joint Vision-Language Social Bias Removal for CLIP

Haoyu Zhang, Yangyang Guo, Mohan Kankanhalli

CVPR 2025 • poster • arXiv:2411.12785
#5070

The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation

Bingjie Gao, Xinyu Gao, Xiaoxue Wu et al.

CVPR 2025 • poster • arXiv:2504.11739
#5071

CLIP is Almost All You Need: Towards Parameter-Efficient Scene Text Retrieval without OCR

Xugong Qin, Peng Zhang, Jun Jie Ou Yang et al.

CVPR 2025 • poster
#5072

DreamTrack: Dreaming the Future for Multimodal Visual Object Tracking

Mingzhe Guo, Weiping Tan, Wenyu Ran et al.

CVPR 2025 • poster
#5073

Learned Image Compression with Dictionary-based Entropy Model

Jingbo Lu, Leheng Zhang, Xingyu Zhou et al.

CVPR 2025 • poster • arXiv:2504.00496
#5074

Training Data Provenance Verification: Did Your Model Use Synthetic Data from My Generative Model for Training?

Yuechen Xie, Jie Song, Huiqiong Wang et al.

CVPR 2025 • poster • arXiv:2503.09122
#5075

VLMs-Guided Representation Distillation for Efficient Vision-Based Reinforcement Learning

Haoran Xu, Peixi Peng, Guang Tan et al.

CVPR 2025 • poster
#5076

IDEA: Inverted Text with Cooperative Deformable Aggregation for Multi-modal Object Re-Identification

Yuhao Wang, Yongfeng Lv, Pingping Zhang et al.

CVPR 2025 • poster • arXiv:2503.10324
#5077

Shadow Generation Using Diffusion Model with Geometry Prior

Haonan Zhao, Qingyang Liu, Xinhao Tao et al.

CVPR 2025 • poster
#5078

RICCARDO: Radar Hit Prediction and Convolution for Camera-Radar 3D Object Detection

Yunfei Long, Abhinav Kumar, Xiaoming Liu et al.

CVPR 2025 • poster • arXiv:2504.09086
#5079

HyperNet Fields: Efficiently Training Hypernetworks without Ground Truth by Learning Weight Trajectories

Eric Hedlin, Munawar Hayat, Fatih Porikli et al.

CVPR 2025 • poster • arXiv:2412.17040
#5080

AniGrad: Anisotropic Gradient-Adaptive Sampling for 3D Reconstruction From Monocular Video

Noah Stier, Alex Rich, Pradeep Sen et al.

CVPR 2025 • poster
#5081

iG-6DoF: Model-free 6DoF Pose Estimation for Unseen Object via Iterative 3D Gaussian Splatting

Tuo Cao, Fei Luo, Jiongming Qin et al.

CVPR 2025 • poster
#5082

BOOTPLACE: Bootstrapped Object Placement with Detection Transformers

Hang Zhou, Xinxin Zuo, Rui Ma et al.

CVPR 2025 • poster • arXiv:2503.21991
#5083

Q-PART: Quasi-Periodic Adaptive Regression with Test-time Training for Pediatric Left Ventricular Ejection Fraction Regression

Jie Liu, Tiexin Qin, Hui Liu et al.

CVPR 2025 • poster • arXiv:2503.04131
#5084

POPEN: Preference-Based Optimization and Ensemble for LVLM-Based Reasoning Segmentation

Lanyun Zhu, Tianrun Chen, Qianxiong Xu et al.

CVPR 2025 • poster • arXiv:2504.00640
#5085

Devils in Middle Layers of Large Vision-Language Models: Interpreting, Detecting and Mitigating Object Hallucinations via Attention Lens

Zhangqi Jiang, Junkai Chen, Beier Zhu et al.

CVPR 2025 • poster • arXiv:2411.16724
#5086

UNIALIGN: Scaling Multimodal Alignment within One Unified Model

Bo Zhou, Liulei Li, Yujia Wang et al.

CVPR 2025 • poster
#5087

How to Merge Your Multimodal Models Over Time?

Sebastian Dziadzio, Vishaal Udandarao, Karsten Roth et al.

CVPR 2025 • poster • arXiv:2412.06712
#5088

Zero-Shot 4D Lidar Panoptic Segmentation

Yushan Zhang, Aljoša Ošep, Laura Leal-Taixe et al.

CVPR 2025 • poster • arXiv:2504.00848
#5089

DSPNet: Dual-vision Scene Perception for Robust 3D Question Answering

Jingzhou Luo, Yang Liu, Weixing Chen et al.

CVPR 2025 • poster • arXiv:2503.03190
#5090

Efficient Motion-Aware Video MLLM

Zijia Zhao, Yuqi Huo, Tongtian Yue et al.

CVPR 2025 • highlight • arXiv:2503.13016
#5091

Easy-editable Image Vectorization with Multi-layer Multi-scale Distributed Visual Feature Embedding

Ye Chen, Zhangli Hu, Zhongyin Zhao et al.

CVPR 2025 • poster
#5092

DynFocus: Dynamic Cooperative Network Empowers LLMs with Video Understanding

Yudong Han, Qingpei Guo, Liyuan Pan et al.

CVPR 2025 • poster • arXiv:2411.12355
#5093

Hyperspectral Pansharpening via Diffusion Models with Iteratively Zero-Shot Guidance

Jin-Liang Xiao, Ting-Zhu Huang, Liang-Jian Deng et al.

CVPR 2025 • poster
#5094

Analyzing the Synthetic-to-Real Domain Gap in 3D Hand Pose Estimation

Zhuoran Zhao, Linlin Yang, Pengzhan Sun et al.

CVPR 2025 • poster • arXiv:2503.19307
#5095

Automated Proof of Polynomial Inequalities via Reinforcement Learning

Banglong Liu, Niuniu Qi, Xia Zeng et al.

CVPR 2025 • poster • arXiv:2503.06592
#5096

Active Hyperspectral Imaging Using an Event Camera

Bohan Yu, Jinxiu Liang, Zhuofeng Wang et al.

CVPR 2025 • highlight
#5097

Bridging the Gap between Gaussian Diffusion Models and Universal Quantization for Image Compression

Lucas Relic, Roberto Azevedo, Yang Zhang et al.

CVPR 2025 • poster • arXiv:2504.02579
#5098

InstanceGaussian: Appearance-Semantic Joint Gaussian Representation for 3D Instance-Level Perception

Haijie Li, Yanmin Wu, Jiarui Meng et al.

CVPR 2025 • poster • arXiv:2411.19235
#5099

BIP3D: Bridging 2D Images and 3D Perception for Embodied Intelligence

Xuewu Lin, Tianwei Lin, Alan Huang et al.

CVPR 2025 • poster • arXiv:2411.14869
#5100

MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors

Riku Murai, Eric Dexheimer, Andrew J. Davison

CVPR 2025 • highlight • arXiv:2412.12392
#5101

Accurate Scene Text Recognition with Efficient Model Scaling and Cloze Self-Distillation

Andrea Maracani, Savas Ozkan, Sijun Cho et al.

CVPR 2025 • poster • arXiv:2503.16184
#5102

Advancing Generalizable Tumor Segmentation with Anomaly-Aware Open-Vocabulary Attention Maps and Frozen Foundation Diffusion Models

Yankai Jiang, Peng Zhang, Donglin Yang et al.

CVPR 2025 • poster • arXiv:2505.02753
#5103

Online Task-Free Continual Learning via Dynamic Expansionable Memory Distribution

Fei Ye, Adrian Bors

CVPR 2025 • poster
#5104

OffsetOPT: Explicit Surface Reconstruction without Normals

Huan Lei

CVPR 2025 • poster • arXiv:2503.15763
#5105

Towards Smart Point-and-Shoot Photography

Jiawan Li, Fei Zhou, Zhipeng Zhong et al.

CVPR 2025 • poster • arXiv:2505.03638
#5106

Image Reconstruction from Readout-Multiplexed Single-Photon Detector Arrays

Shashwath Bharadwaj, Ruangrawee Kitichotkul, Akshay Agarwal et al.

CVPR 2025 • highlight • arXiv:2312.02971
#5107

Learning on Model Weights using Tree Experts

Eliahu Horwitz, Bar Cavia, Jonathan Kahana et al.

CVPR 2025 • poster • arXiv:2410.13569
#5108

Large-scale Multi-view Tensor Clustering with Implicit Linear Kernels

Jiyuan Liu, Xinwang Liu, Chuankun Li et al.

CVPR 2025 • poster
#5109

Plug-and-Play Interpretable Responsible Text-to-Image Generation via Dual-Space Multi-facet Concept Control

Basim Azam, Naveed Akhtar

CVPR 2025 • poster • arXiv:2503.18324
#5110

Generalized Gaussian Entropy Model for Point Cloud Attribute Compression with Dynamic Likelihood Intervals

Changhao Peng

CVPR 2025 • poster • arXiv:2506.09510
#5111

LEMON: Learning 3D Human-Object Interaction Relation from 2D Images

Yuhang Yang, Wei Zhai, Hongchen Luo et al.

CVPR 2024 • poster • arXiv:2312.08963
#5112

The Impact Label Noise and Choice of Threshold has on Cross-Entropy and Soft-Dice in Image Segmentation

Marcus Nordström, Atsuto Maki, Henrik Hult

CVPR 2025 • poster
#5113

Toward Robust Neural Reconstruction from Sparse Point Sets

Amine Ouasfi, Shubhendu Jena, Eric Marchand et al.

CVPR 2025 • poster • arXiv:2412.16361
#5114

Just Dance with pi! A Poly-modal Inductor for Weakly-supervised Video Anomaly Detection

Snehashis Majhi, Giacomo D'Amicantonio, Antitza Dantcheva et al.

CVPR 2025 • highlight
#5115

EfficientLLaVA: Generalizable Auto-Pruning for Large Vision-language Models

Yinan Liang, Ziwei Wang, Xiuwei Xu et al.

CVPR 2025 • poster
#5116

Motions as Queries: One-Stage Multi-Person Holistic Human Motion Capture

Kenkun Liu, Yurong Fu, Weihao Yuan et al.

CVPR 2025 • poster
#5117

ArticulatedGS: Self-supervised Digital Twin Modeling of Articulated Objects using 3D Gaussian Splatting

Guo Junfu, Yu Xin, Gaoyi Liu et al.

CVPR 2025 • poster • arXiv:2503.08135
#5118

HSI-GPT: A General-Purpose Large Scene-Motion-Language Model for Human Scene Interaction

Yuan Wang, Yali Li, Lixiang Li et al.

CVPR 2025 • highlight
#5119

SoftShadow: Leveraging Soft Masks for Penumbra-Aware Shadow Removal

Xinrui Wang, Lanqing Guo, Xiyu Wang et al.

CVPR 2025 • poster • arXiv:2409.07041
#5120

Towards Visual Discrimination and Reasoning of Real-World Physical Dynamics: Physics-Grounded Anomaly Detection

Wenqiao Li, Yao Gu, Xintao Chen et al.

CVPR 2025 • poster • arXiv:2503.03562
#5121

Flow-NeRF: Joint Learning of Geometry, Poses, and Dense Flow within Unified Neural Representations

Xunzhi Zheng, Dan Xu

CVPR 2025 • poster • arXiv:2503.10464
#5122

MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models

Wenyi Hong, Yean Cheng, Zhuoyi Yang et al.

CVPR 2025 • poster • arXiv:2501.02955
#5123

Rethinking Noisy Video-Text Retrieval via Relation-aware Alignment

Huakai Lai, Guoxin Xiong, Huayu Mai et al.

CVPR 2025 • poster
#5124

Towards Satellite Image Road Graph Extraction: A Global-Scale Dataset and A Novel Method

Pan Yin, Kaiyu Li, Xiangyong Cao et al.

CVPR 2025 • poster • arXiv:2411.16733
#5125

Adaptive Parameter Selection for Tuning Vision-Language Models

Yi Zhang, Yi-Xuan Deng, Meng-Hao Guo et al.

CVPR 2025 • poster
#5126

ManipTrans: Efficient Dexterous Bimanual Manipulation Transfer via Residual Learning

Kailin Li, Puhao Li, Tengyu Liu et al.

CVPR 2025 • poster • arXiv:2503.21860
#5127

COB-GS: Clear Object Boundaries in 3DGS Segmentation Based on Boundary-Adaptive Gaussian Splitting

Jiaxin Zhang, Junjun Jiang, Youyu Chen et al.

CVPR 2025 • poster • arXiv:2503.19443
#5128

GROVE: A Generalized Reward for Learning Open-Vocabulary Physical Skill

Jieming Cui, Tengyu Liu, Ziyu Meng et al.

CVPR 2025 • poster • arXiv:2504.04191
#5129

Reconstructing Close Human Interaction with Appearance and Proxemics Reasoning

Buzhen Huang, Chen Li, Chongyang Xu et al.

CVPR 2025 • poster • arXiv:2507.02565
#5130

OSMamba: Omnidirectional Spectral Mamba with Dual-Domain Prior Generator for Exposure Correction

Gehui Li, Bin Chen, Chen Zhao et al.

CVPR 2025 • poster • arXiv:2411.15255
#5131

M3GYM: A Large-Scale Multimodal Multi-view Multi-person Pose Dataset for Fitness Activity Understanding in Real-world Settings

Qingzheng Xu, Ru Cao, Xin Shen et al.

CVPR 2025 • poster
#5132

Gaussian Splashing: Unified Particles for Versatile Motion Synthesis and Rendering

Yutao Feng, Xiang Feng, Yintong Shang et al.

CVPR 2025 • poster • arXiv:2401.15318
#5133

Improving Accuracy and Calibration via Differentiated Deep Mutual Learning

Han Liu, Peng Cui, Bingning Wang et al.

CVPR 2025 • poster
#5134

Star with Bilinear Mapping

Zelin Peng, Yu Huang, Zhengqin Xu et al.

CVPR 2025 • poster
#5135

RUBIK: A Structured Benchmark for Image Matching across Geometric Challenges

Thibaut Loiseau, Guillaume Bourmaud

CVPR 2025 • poster • arXiv:2502.19955
#5136

Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer

Jiahao Cui, Hui Li, Qingkun Su et al.

CVPR 2025 • poster • arXiv:2412.00733
#5137

Improving the Transferability of Adversarial Attacks on Face Recognition with Diverse Parameters Augmentation

Fengfan Zhou, Bangjie Yin, Hefei Ling et al.

CVPR 2025 • poster • arXiv:2411.15555
#5138

Rethinking Temporal Fusion with a Unified Gradient Descent View for 3D Semantic Occupancy Prediction

Dubing Chen, Huan Zheng, Jin Fang et al.

CVPR 2025 • poster • arXiv:2504.12959
#5139

OPTICAL: Leveraging Optimal Transport for Contribution Allocation in Dataset Distillation

Xiao Cui, Yulei Qin, Wengang Zhou et al.

CVPR 2025 • highlight
#5140

Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks

Wei-Jin Huang, Yuan-Ming Li, Zhi-Wei Xia et al.

CVPR 2025 • poster • arXiv:2503.22405
#5141

LeanGaussian: Breaking Pixel or Point Cloud Correspondence in Modeling 3D Gaussians

Jiamin Wu, Kenkun Liu, Han Gao et al.

CVPR 2025 • poster • arXiv:2404.16323
#5142

Reward Fine-Tuning Two-Step Diffusion Models via Learning Differentiable Latent-Space Surrogate Reward

Zhiwei Jia, Yuesong Nan, Huixi Zhao et al.

CVPR 2025 • poster • arXiv:2411.15247
#5143

STEP: Enhancing Video-LLMs’ Compositional Reasoning by Spatio-Temporal Graph-guided Self-Training

Haiyi Qiu, Minghe Gao, Long Qian et al.

CVPR 2025 • poster • arXiv:2412.00161
#5144

CaricatureBooth: Data-Free Interactive Caricature Generation in a Photo Booth

Zhiyu Qu, Yunqi Miao, Zhensong Zhang et al.

CVPR 2025 • poster
#5145

Incremental Object Keypoint Learning

Mingfu Liang, Jiahuan Zhou, Xu Zou et al.

CVPR 2025 • poster • arXiv:2503.20248
#5146

Stop Learning it all to Mitigate Visual Hallucination, Focus on the Hallucination Target.

Dokyoon Yoon, Youngsook Song, Woomyoung Park

CVPR 2025 • poster • arXiv:2506.11417
#5147

HOT: Hadamard-based Optimized Training

Seonggon Kim, Juncheol Shin, Seung-taek Woo et al.

CVPR 2025 • poster • arXiv:2503.21261
#5148

SceneDiffuser++: City-Scale Traffic Simulation via a Generative World Model

Shuhan Tan, John Wheatley Lambert, Hong Jeon et al.

CVPR 2025 • poster • arXiv:2506.21976
#5149

Learning Extremely High Density Crowds as Active Matters

Feixiang He, Jiangbei Yue, Jialin Zhu et al.

CVPR 2025 • poster • arXiv:2503.12168
#5150

Mosaic3D: Foundation Dataset and Model for Open-Vocabulary 3D Segmentation

Junha Lee, Chunghyun Park, Jaesung Choe et al.

CVPR 2025 • poster • arXiv:2502.02548
#5151

Channel-wise Noise Scheduled Diffusion for Inverse Rendering in Indoor Scenes

JunYong Choi, Min-Cheol Sagong, SeokYeong Lee et al.

CVPR 2025 • poster • arXiv:2503.09993
#5152

MobileH2R: Learning Generalizable Human to Mobile Robot Handover Exclusively from Scalable and Diverse Synthetic Data

Zifan Wang, Ziqing Chen, Junyu Chen et al.

CVPR 2025 • poster • arXiv:2501.04595
#5153

Is this Generated Person Existed in Real-world? Fine-grained Detecting and Calibrating Abnormal Human-body

Zeqing Wang, Qingyang Ma, Wentao Wan et al.

CVPR 2025 • highlight • arXiv:2411.14205
#5154

Discovering Fine-Grained Visual-Concept Relations by Disentangled Optimal Transport Concept Bottleneck Models

Yan Xie, Zequn Zeng, Hao Zhang et al.

CVPR 2025 • poster • arXiv:2505.07209
#5155

ODCR: Orthogonal Decoupling Contrastive Regularization for Unpaired Image Dehazing

Zhongze Wang, Haitao Zhao, Jingchao Peng et al.

CVPR 2024 • poster • arXiv:2404.17825
#5156

Align-KD: Distilling Cross-Modal Alignment Knowledge for Mobile Vision-Language Large Model Enhancement

Qianhan Feng, Wenshuo Li, Tong Lin et al.

CVPR 2025 • poster
#5157

Think Small, Act Big: Primitive Prompt Learning for Lifelong Robot Manipulation

Yuanqi Yao, Siao Liu, Haoming Song et al.

CVPR 2025 • poster • arXiv:2504.00420
#5158

BEVDiffuser: Plug-and-Play Diffusion Model for BEV Denoising with Ground-Truth Guidance

Xin Ye, Burhan Yaman, Sheng Cheng et al.

CVPR 2025 • highlight • arXiv:2502.19694
#5159

Discovering Hidden Visual Concepts Beyond Linguistic Input in Infant Learning

Xueyi Ke, Satoshi Tsutsui, Yayun Zhang et al.

CVPR 2025 • poster • arXiv:2501.05205
#5160

Shape and Texture: What Influences Reliable Optical Flow Estimation?

Libo Long, Xiao Hu, Jochen Lang

CVPR 2025 • poster
#5161

AI-Face: A Million-Scale Demographically Annotated AI-Generated Face Dataset and Fairness Benchmark

Li Lin, Santosh Santosh, Mingyang Wu et al.

CVPR 2025 • poster • arXiv:2406.00783
#5162

Attribute-formed Class-specific Concept Space: Endowing Language Bottleneck Model with Better Interpretability and Scalability

Jianyang Zhang, Qianli Luo, Guowu Yang et al.

CVPR 2025 • poster • arXiv:2503.20301
#5163

Learning Textual Prompts for Open-World Semi-Supervised Learning

Yuxin Fan, Junbiao Cui, Jiye Liang

CVPR 2025 • poster
#5164

Embodied Scene Understanding for Vision Language Models via MetaVQA

Weizhen Wang, Chenda Duan, Zhenghao Peng et al.

CVPR 2025 • poster • arXiv:2501.09167
#5165

Rate-In: Information-Driven Adaptive Dropout Rates for Improved Inference-Time Uncertainty Estimation

Tal Zeevi, Ravid Shwartz-Ziv, Yann LeCun et al.

CVPR 2025 • poster • arXiv:2412.07169
#5166

RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins

Yao Mu, Tianxing Chen, Zanxin Chen et al.

CVPR 2025 • highlight • arXiv:2504.13059
#5167

Aesthetic Post-Training Diffusion Models from Generic Preferences with Step-by-step Preference Optimization

Zhanhao Liang, Yuhui Yuan, Shuyang Gu et al.

CVPR 2025 • poster • arXiv:2406.04314
#5168

VinaBench: Benchmark for Faithful and Consistent Visual Narratives

Silin Gao, Sheryl Mathew, Li Mi et al.

CVPR 2025 • poster • arXiv:2503.20871
#5169

Beyond Background Shift: Rethinking Instance Replay in Continual Semantic Segmentation

Hongmei Yin, Tingliang Feng, Fan Lyu et al.

CVPR 2025 • poster • arXiv:2503.22136
#5170

DKC: Differentiated Knowledge Consolidation for Cloth-Hybrid Lifelong Person Re-identification

Zhenyu Cui, Jiahuan Zhou, Yuxin Peng

CVPR 2025 • poster
#5171

Learning Flow Fields in Attention for Controllable Person Image Generation

Zijian Zhou, Shikun Liu, Xiao Han et al.

CVPR 2025 • poster • arXiv:2412.08486
#5172

Rectification-specific Supervision and Constrained Estimator for Online Stereo Rectification

Rui Gong, Kim-Hui Yap, Weide Liu et al.

CVPR 2025 • poster
#5173

VoxelSplat: Dynamic Gaussian Splatting as an Effective Loss for Occupancy and Flow Prediction

Ziyue Zhu, Shenlong Wang, Jin Xie et al.

CVPR 2025 • poster • arXiv:2506.05563
#5174

Dual Focus-Attention Transformer for Robust Point Cloud Registration

Kexue Fu, Ming'zhi Yuan, Changwei Wang et al.

CVPR 2025 • poster
#5175

Forming Auxiliary High-confident Instance-level Loss to Promote Learning from Label Proportions

Tianhao Ma, Han Chen, Juncheng Hu et al.

CVPR 2025 • poster • arXiv:2411.10364
#5176

Towards Open-Vocabulary Audio-Visual Event Localization

Jinxing Zhou, Dan Guo, Ruohao Guo et al.

CVPR 2025 • poster • arXiv:2411.11278
#5177

SOGS: Second-Order Anchor for Advanced 3D Gaussian Splatting

Jiahui Zhang, Fangneng Zhan, Ling Shao et al.

CVPR 2025 • poster • arXiv:2503.07476
#5178

Chapter-Llama: Efficient Chaptering in Hour-Long Videos with LLMs

Lucas Ventura, Antoine Yang, Cordelia Schmid et al.

CVPR 2025 • poster • arXiv:2504.00072
#5179

Traversing Distortion-Perception Tradeoff using a Single Score-Based Generative Model

Yuhan Wang, Suzhi Bi, Ying-Jun Angela Zhang et al.

CVPR 2025 • poster • arXiv:2503.20297
#5180

IceDiff: High Resolution and High-Quality Arctic Sea Ice Forecasting with Generative Diffusion Prior

Jingyi Xu, Siwei Tu, Weidong Yang et al.

CVPR 2025 • poster
#5181

Coeff-Tuning: A Graph Filter Subspace View for Tuning Attention-Based Large Models

Zichen Miao, Wei Chen, Qiang Qiu

CVPR 2025 • highlight • arXiv:2503.18337
#5182

MVBoost: Boost 3D Reconstruction with Multi-View Refinement

Xiangyu Liu, Xiaomei Zhang, Zhiyuan Ma et al.

CVPR 2025 • poster • arXiv:2411.17772
#5183

Enhancing Few-Shot Class-Incremental Learning via Training-Free Bi-Level Modality Calibration

Yiyang Chen, Tianyu Ding, Lei Wang et al.

CVPR 2025 • poster
#5184

SpatialLLM: A Compound 3D-Informed Design towards Spatially-Intelligent Large Multimodal Models

Wufei Ma, Luoxin Ye, Nessa McWeeney et al.

CVPR 2025 • highlight • arXiv:2505.00788
#5185

Animate and Sound an Image

Xihua Wang, Ruihua Song, Chongxuan Li et al.

CVPR 2025 • poster
#5186

Leveraging Perturbation Robustness to Enhance Out-of-Distribution Detection

Wenxi Chen, Raymond A. Yeh, Shaoshuai Mou et al.

CVPR 2025 • poster • arXiv:2503.18784
#5187

Semi-Supervised State-Space Model with Dynamic Stacking Filter for Real-World Video Deraining

Shangquan Sun, Wenqi Ren, Juxiang Zhou et al.

CVPR 2025 • poster • arXiv:2505.16811
#5188

Towards Explainable and Unprecedented Accuracy in Matching Challenging Finger Crease Patterns

Zhenyu Zhou, Chengdong Dong, Ajay Kumar

CVPR 2025 • highlight
#5189

MobilePortrait: Real-Time One-Shot Neural Head Avatars on Mobile Devices

Jianwen Jiang, Gaojie Lin, Zhengkun Rong et al.

CVPR 2025 • poster • arXiv:2407.05712
#5190

Jailbreaking the Non-Transferable Barrier via Test-Time Data Disguising

Yongli Xiang, Ziming Hong, Lina Yao et al.

CVPR 2025 • poster • arXiv:2503.17198
#5191

MobileMamba: Lightweight Multi-Receptive Visual Mamba Network

Haoyang He, Jiangning Zhang, Yuxuan Cai et al.

CVPR 2025 • poster • arXiv:2411.15941
#5192

EarthDial: Turning Multi-sensory Earth Observations to Interactive Dialogues

Sagar Soni, Akshay Dudhane, Hiyam Debary et al.

CVPR 2025 • poster • arXiv:2412.15190
#5193

Learning Endogenous Attention for Incremental Object Detection

Xiang Song, Yuhang He, Jingyuan Li et al.

CVPR 2025 • poster
#5194

PERSE: Personalized 3D Generative Avatars from A Single Portrait

Hyunsoo Cha, Inhee Lee, Hanbyul Joo

CVPR 2025 • poster • arXiv:2412.21206
#5195

Beyond Clean Training Data: A Versatile and Model-Agnostic Framework for Out-of-Distribution Detection with Contaminated Training Data

Yuchuan Li, Jae-Mo Kang, Il-Min Kim

CVPR 2025 • poster
#5196

Minimizing Labeled, Maximizing Unlabeled: An Image-Driven Approach for Video Instance Segmentation

Fangyun Wei, Jinjing Zhao, Kun Yan et al.

CVPR 2025 • poster
#5197

Sketchy Bounding-box Supervision for 3D Instance Segmentation

Qian Deng, Le Hui, Jin Xie et al.

CVPR 2025 • poster • arXiv:2505.16399
#5198

DeClotH: Decomposable 3D Cloth and Human Body Reconstruction from a Single Image

Hyeongjin Nam, Donghwan Kim, Jeongtaek Oh et al.

CVPR 2025 • poster • arXiv:2503.19373
#5199

Learning to Sample Effective and Diverse Prompts for Text-to-Image Generation

Taeyoung Yun, Dinghuai Zhang, Jinkyoo Park et al.

CVPR 2025 • poster • arXiv:2502.11477
#5200

Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention

Wenbin An, Feng Tian, Sicong Leng et al.

CVPR 2025 • poster • arXiv:2406.12718