Most Cited CVPR "sensitive optimality" Papers

5,589 papers found • Page 26 of 28

#5001

SkySense-O: Towards Open-World Remote Sensing Interpretation with Vision-Centric Visual-Language Modeling

Qi Zhu, Jiangwei Lao, Deyi Ji et al.

CVPR 2025 • poster
#5002

IndoorGS: Geometric Cues Guided Gaussian Splatting for Indoor Scene Reconstruction

Cong Ruan, Yuesong Wang, Bin Zhang et al.

CVPR 2025 • poster
#5003

Pose Priors from Language Models

Sanjay Subramanian, Evonne Ng, Lea Müller et al.

CVPR 2025 • poster • arXiv:2405.03689
#5004

Generative Densification: Learning to Densify Gaussians for High-Fidelity Generalizable 3D Reconstruction

Seungtae Nam, Xiangyu Sun, Gyeongjin Kang et al.

CVPR 2025 • highlight • arXiv:2412.06234
#5005

Towards Optimizing Large-Scale Multi-Graph Matching in Bioimaging

Max Kahl, Sebastian Stricker, Lisa Hutschenreiter et al.

CVPR 2025 • poster
#5006

Test-Time Backdoor Detection for Object Detection Models

Hangtao Zhang, Yichen Wang, Shihui Yan et al.

CVPR 2025 • poster • arXiv:2503.15293
#5007

Fairy: Fast Parallelized Instruction-Guided Video-to-Video Synthesis

Bichen Wu, Ching-Yao Chuang, Xiaoyan Wang et al.

CVPR 2024 • poster • arXiv:2312.13834
#5008

Classifier-Free Guidance Inside the Attraction Basin May Cause Memorization

Anubhav Jain, Yuya Kobayashi, Takashi Shibuya et al.

CVPR 2025 • poster • arXiv:2411.16738
#5009

Image Quality Assessment: From Human to Machine Preference

Chunyi Li, Yuan Tian, Xiaoyue Ling et al.

CVPR 2025 • highlight • arXiv:2503.10078
#5010

MoGe: Unlocking Accurate Monocular Geometry Estimation for Open-Domain Images with Optimal Training Supervision

Ruicheng Wang, Sicheng Xu, Cassie Lee Dai et al.

CVPR 2025 • poster • arXiv:2410.19115
#5011

Knowledge Bridger: Towards Training-Free Missing Modality Completion

Guanzhou Ke, Shengfeng He, Xiao-Li Wang et al.

CVPR 2025 • poster • arXiv:2502.19834
#5012

Reconciling Stochastic and Deterministic Strategies for Zero-shot Image Restoration using Diffusion Model in Dual

Chong Wang, Lanqing Guo, Zixuan Fu et al.

CVPR 2025 • poster • arXiv:2503.01288
#5013

Harnessing Frequency Spectrum Insights for Image Copyright Protection Against Diffusion Models

Zhenguang Liu, Chao Shuai, Shaojing Fan et al.

CVPR 2025 • poster • arXiv:2503.11071
#5014

Gain from Neighbors: Boosting Model Robustness in the Wild via Adversarial Perturbations Toward Neighboring Classes

Zhou Yang, Mingtao Feng, Tao Huang et al.

CVPR 2025 • poster
#5015

M^3-VOS: Multi-Phase, Multi-Transition, and Multi-Scenery Video Object Segmentation

Zixuan Chen, Jiaxin Li, Junxuan Liang et al.

CVPR 2025 • poster • arXiv:2412.13803
#5016

Enhancing Creative Generation on Stable Diffusion-based Models

Jiyeon Han, Dahee Kwon, Gayoung Lee et al.

CVPR 2025 • poster • arXiv:2503.23538
#5017

EquiPose: Exploiting Permutation Equivariance for Relative Camera Pose Estimation

Yuzhen Liu, Qiulei Dong

CVPR 2025 • poster
#5018

Visual Consensus Prompting for Co-Salient Object Detection

Jie Wang, Nana Yu, Zihao Zhang et al.

CVPR 2025 • poster • arXiv:2504.14254
#5019

BimArt: A Unified Approach for the Synthesis of 3D Bimanual Interaction with Articulated Objects

Wanyue Zhang, Rishabh Dabral, Vladislav Golyanik et al.

CVPR 2025 • poster • arXiv:2412.05066
#5020

DiffusionDrive: Truncated Diffusion Model for End-to-End Autonomous Driving

Bencheng Liao, Shaoyu Chen, Haoran Yin et al.

CVPR 2025 • highlight • arXiv:2411.15139
#5021

Towards Long-Horizon Vision-Language Navigation: Platform, Benchmark and Method

Xinshuai Song, Weixing Chen, Yang Liu et al.

CVPR 2025 • poster • arXiv:2412.09082
#5022

Classifier-guided CLIP Distillation for Unsupervised Multi-label Classification

Dongseob Kim, Hyunjung Shim

CVPR 2025 • poster • arXiv:2503.16873
#5023

CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning

Yang Yue, Yulin Wang, Chenxin Tao et al.

CVPR 2025 • poster • arXiv:2504.13820
#5024

Higher-Order Ratio Cycles for Fast and Globally Optimal Shape Matching

Paul Roetzer, Viktoria Ehm, Daniel Cremers et al.

CVPR 2025 • poster
#5025

Joint Vision-Language Social Bias Removal for CLIP

Haoyu Zhang, Yangyang Guo, Mohan Kankanhalli

CVPR 2025 • poster • arXiv:2411.12785
#5026

ReDiffDet: Rotation-equivariant Diffusion Model for Oriented Object Detection

Jiaqi Zhao, Zeyu Ding, Yong Zhou et al.

CVPR 2025 • poster
#5027

UniGoal: Towards Universal Zero-shot Goal-oriented Navigation

Hang Yin, Xiuwei Xu, Linqing Zhao et al.

CVPR 2025 • poster • arXiv:2503.10630
#5028

IDEA: Inverted Text with Cooperative Deformable Aggregation for Multi-modal Object Re-Identification

Yuhao Wang, Yongfeng Lv, Pingping Zhang et al.

CVPR 2025 • poster • arXiv:2503.10324
#5029

FIFA: Fine-grained Inter-frame Attention for Driver's Video Gaze Estimation

Daosong Hu, Mingyue Cui, Kai Huang

CVPR 2025 • poster
#5030

RICCARDO: Radar Hit Prediction and Convolution for Camera-Radar 3D Object Detection

Yunfei Long, Abhinav Kumar, Xiaoming Liu et al.

CVPR 2025 • poster • arXiv:2504.09086
#5031

Floating No More: Object-Ground Reconstruction from a Single Image

Yunze Man, Yichen Sheng, Jianming Zhang et al.

CVPR 2025 • poster • arXiv:2407.18914
#5032

SKE-Layout: Spatial Knowledge Enhanced Layout Generation with LLMs

Junsheng Wang, Nieqing Cao, Yan Ding et al.

CVPR 2025 • poster
#5033

DiffLO: Semantic-Aware LiDAR Odometry with Diffusion-Based Refinement

Huang Yongshu, Chen Liu, Minghang Zhu et al.

CVPR 2025 • poster
#5034

pFedMxF: Personalized Federated Class-Incremental Learning with Mixture of Frequency Aggregation

Yifei Zhang, Hao Zhu, Alysa Ziying Tan et al.

CVPR 2025 • poster
#5035

HyperNet Fields: Efficiently Training Hypernetworks without Ground Truth by Learning Weight Trajectories

Eric Hedlin, Munawar Hayat, Fatih Porikli et al.

CVPR 2025 • poster • arXiv:2412.17040
#5036

The Art of Deception: Color Visual Illusions and Diffusion Models

Alexandra Gomez-Villa, Kai Wang, C. Alejandro Parraga et al.

CVPR 2025 • poster • arXiv:2412.10122
#5037

iG-6DoF: Model-free 6DoF Pose Estimation for Unseen Object via Iterative 3D Gaussian Splatting

Tuo Cao, Fei Luo, Jiongming Qin et al.

CVPR 2025 • poster
#5038

Continuous Adverse Weather Removal via Degradation-Aware Distillation

Xin Lu, Jie Xiao, Yurui Zhu et al.

CVPR 2025 • poster
#5039

Condensing Action Segmentation Datasets via Generative Network Inversion

Guodong Ding, Rongyu Chen, Angela Yao

CVPR 2025 • poster • arXiv:2503.14112
#5040

High-fidelity 3D Object Generation from Single Image with RGBN-Volume Gaussian Reconstruction Model

Yiyang Shen, Kun Zhou, He Wang et al.

CVPR 2025 • highlight • arXiv:2504.01512
#5041

ACAttack: Adaptive Cross Attacking RGB-T Tracker via Multi-Modal Response Decoupling

Xinyu Xiang, Qinglong Yan, Hao Zhang et al.

CVPR 2025 • poster
#5042

Subspace Constraint and Contribution Estimation for Heterogeneous Federated Learning

Xiangtao Zhang, Sheng Li, Ao Li et al.

CVPR 2025 • poster
#5043

Emphasizing Discriminative Features for Dataset Distillation in Complex Scenarios

Kai Wang, Zekai Li, Zhi-Qi Cheng et al.

CVPR 2025 • poster • arXiv:2410.17193
#5044

Noise-Consistent Siamese-Diffusion for Medical Image Synthesis and Segmentation

Kunpeng Qiu, Zhiqiang Gao, Zhiying Zhou et al.

CVPR 2025 • poster • arXiv:2505.06068
#5045

Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis

Bingda Tang, Sayak Paul, Boyang Zheng et al.

CVPR 2025 • poster • arXiv:2505.10046
#5046

POPEN: Preference-Based Optimization and Ensemble for LVLM-Based Reasoning Segmentation

Lanyun Zhu, Tianrun Chen, Qianxiong Xu et al.

CVPR 2025 • poster • arXiv:2504.00640
#5047

SemiETS: Integrating Spatial and Content Consistencies for Semi-Supervised End-to-end Text Spotting

Dongliang Luo, Hanshen Zhu, Ziyang Zhang et al.

CVPR 2025 • poster • arXiv:2504.09966
#5048

Devils in Middle Layers of Large Vision-Language Models: Interpreting, Detecting and Mitigating Object Hallucinations via Attention Lens

Zhangqi Jiang, Junkai Chen, Beier Zhu et al.

CVPR 2025 • poster • arXiv:2411.16724
#5049

Distilling Long-tailed Datasets

Zhenghao Zhao, Haoxuan Wang, Yuzhang Shang et al.

CVPR 2025 • poster • arXiv:2408.14506
#5050

UNIALIGN: Scaling Multimodal Alignment within One Unified Model

Bo Zhou, Liulei Li, Yujia Wang et al.

CVPR 2025 • poster
#5051

Zero-Shot 4D Lidar Panoptic Segmentation

Yushan Zhang, Aljoša Ošep, Laura Leal-Taixe et al.

CVPR 2025 • poster • arXiv:2504.00848
#5052

DSPNet: Dual-vision Scene Perception for Robust 3D Question Answering

Jingzhou Luo, Yang Liu, Weixing Chen et al.

CVPR 2025 • poster • arXiv:2503.03190
#5053

Knowledge Memorization and Rumination for Pre-trained Model-based Class-Incremental Learning

Zijian Gao, Wangwang Jia, Xingxing Zhang et al.

CVPR 2025 • poster
#5054

Unveil Inversion and Invariance in Flow Transformer for Versatile Image Editing

Pengcheng Xu, Boyuan Jiang, Xiaobin Hu et al.

CVPR 2025 • poster • arXiv:2411.15843
#5055

SeqMvRL: A Sequential Fusion Framework for Multi-view Representation Learning

Ren Wang, Haoliang Sun, Yuxiu Lin et al.

CVPR 2025 • poster
#5056

Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language

Yicheng Chen, Xiangtai Li, Yining Li et al.

CVPR 2025 • poster • arXiv:2406.20085
#5057

Efficient Motion-Aware Video MLLM

Zijia Zhao, Yuqi Huo, Tongtian Yue et al.

CVPR 2025 • highlight • arXiv:2503.13016
#5058

RNb-NeuS: Reflectance and Normal-based Multi-View 3D Reconstruction

Baptiste Brument, Robin Bruneau, Yvain Queau et al.

CVPR 2024 • poster • arXiv:2312.01215
#5059

Hyperspectral Pansharpening via Diffusion Models with Iteratively Zero-Shot Guidance

Jin-Liang Xiao, Ting-Zhu Huang, Liang-Jian Deng et al.

CVPR 2025 • poster
#5060

Structure-from-Motion with a Non-Parametric Camera Model

Yihan Wang, Linfei Pan, Marc Pollefeys et al.

CVPR 2025 • highlight
#5061

EventPSR: Surface Normal and Reflectance Estimation from Photometric Stereo Using an Event Camera

Bohan Yu, Jin Han, Boxin Shi et al.

CVPR 2025 • highlight
#5062

LAL: Enhancing 3D Human Motion Prediction with Latency-aware Auxiliary Learning

Xiaoning Sun, Dong Wei, Huaijiang Sun et al.

CVPR 2025 • poster
#5063

Sea-ing in Low-light

Nisha Varghese, A. N. Rajagopalan

CVPR 2025 • poster
#5064

Analyzing the Synthetic-to-Real Domain Gap in 3D Hand Pose Estimation

Zhuoran Zhao, Linlin Yang, Pengzhan Sun et al.

CVPR 2025 • poster • arXiv:2503.19307
#5065

Accurate Scene Text Recognition with Efficient Model Scaling and Cloze Self-Distillation

Andrea Maracani, Savas Ozkan, Sijun Cho et al.

CVPR 2025 • poster • arXiv:2503.16184
#5066

DiskVPS: Vanishing Point Detector via Hough Transform in a Disk Region

Jianping Wu

CVPR 2025 • poster
#5067

Towards Smart Point-and-Shoot Photography

Jiawan Li, Fei Zhou, Zhipeng Zhong et al.

CVPR 2025 • poster • arXiv:2505.03638
#5068

Towards Autonomous Micromobility through Scalable Urban Simulation

Wayne Wu, Honglin He, Chaoyuan Zhang et al.

CVPR 2025 • highlight • arXiv:2505.00690
#5069

Image Reconstruction from Readout-Multiplexed Single-Photon Detector Arrays

Shashwath Bharadwaj, Ruangrawee Kitichotkul, Akshay Agarwal et al.

CVPR 2025 • highlight • arXiv:2312.02971
#5070

Learning on Model Weights using Tree Experts

Eliahu Horwitz, Bar Cavia, Jonathan Kahana et al.

CVPR 2025 • poster • arXiv:2410.13569
#5071

PQPP: A Joint Benchmark for Text-to-Image Prompt and Query Performance Prediction

Eduard Poesina, Adriana Valentina Costache, Adrian-Gabriel Chifu et al.

CVPR 2025 • poster • arXiv:2406.04746
#5072

CheXwhatsApp: A Dataset for Exploring Challenges in the Diagnosis of Chest X-rays through Mobile Devices

Mariamma Antony, Rajiv Porana, Sahil M. Lathiya et al.

CVPR 2025 • poster
#5073

Large-scale Multi-view Tensor Clustering with Implicit Linear Kernels

Jiyuan Liu, Xinwang Liu, Chuankun Li et al.

CVPR 2025 • poster
#5074

Learning-enabled Polynomial Lyapunov Function Synthesis via High-Accuracy Counterexample-Guided Framework

Hanrui Zhao, Niuniu Qi, Mengxin Ren et al.

CVPR 2025 • poster
#5075

The Impact Label Noise and Choice of Threshold has on Cross-Entropy and Soft-Dice in Image Segmentation

Marcus Nordström, Atsuto Maki, Henrik Hult

CVPR 2025 • poster
#5076

EfficientLLaVA: Generalizable Auto-Pruning for Large Vision-language Models

Yinan Liang, Ziwei Wang, Xiuwei Xu et al.

CVPR 2025 • poster
#5077

Advancing Adversarial Robustness in GNeRFs: The IL2-NeRF Attack

Nicole Meng, Caleb Manicke, Ronak Sahu et al.

CVPR 2025 • poster
#5078

From Prototypes to General Distributions: An Efficient Curriculum for Masked Image Modeling

Jinhong Lin, Cheng-En Wu, Huanran Li et al.

CVPR 2025 • poster • arXiv:2411.10685
#5079

Motions as Queries: One-Stage Multi-Person Holistic Human Motion Capture

Kenkun Liu, Yurong Fu, Weihao Yuan et al.

CVPR 2025 • poster
#5080

SINR: Sparsity Driven Compressed Implicit Neural Representations

Dhananjaya Jayasundara, Sudarshan Rajagopalan, Yasiru Ranasinghe et al.

CVPR 2025 • poster • arXiv:2503.19576
#5081

ArticulatedGS: Self-supervised Digital Twin Modeling of Articulated Objects using 3D Gaussian Splatting

Guo Junfu, Yu Xin, Gaoyi Liu et al.

CVPR 2025 • poster • arXiv:2503.08135
#5082

Spiking Transformer: Introducing Accurate Addition-Only Spiking Self-Attention for Transformer

Yufei Guo, Xiaode Liu, Yuanpei Chen et al.

CVPR 2025 • poster
#5083

Rethinking Noisy Video-Text Retrieval via Relation-aware Alignment

Huakai Lai, Guoxin Xiong, Huayu Mai et al.

CVPR 2025 • poster
#5084

NoiseCtrl: A Sampling-Algorithm-Agnostic Conditional Generation Method for Diffusion Models

Longquan Dai, He Wang, Jinhui Tang

CVPR 2025 • poster
#5085

Reconstructing Close Human Interaction with Appearance and Proxemics Reasoning

Buzhen Huang, Chen Li, Chongyang Xu et al.

CVPR 2025 • poster • arXiv:2507.02565
#5086

OSMamba: Omnidirectional Spectral Mamba with Dual-Domain Prior Generator for Exposure Correction

Gehui Li, Bin Chen, Chen Zhao et al.

CVPR 2025 • poster • arXiv:2411.15255
#5087

Erase Diffusion: Empowering Object Removal Through Calibrating Diffusion Pathways

Yi Liu, Hao Zhou, Benlei Cui et al.

CVPR 2025 • highlight • arXiv:2503.07026
#5088

M3GYM: A Large-Scale Multimodal Multi-view Multi-person Pose Dataset for Fitness Activity Understanding in Real-world Settings

Qingzheng Xu, Ru Cao, Xin Shen et al.

CVPR 2025 • poster
#5089

Star with Bilinear Mapping

Zelin Peng, Yu Huang, Zhengqin Xu et al.

CVPR 2025 • poster
#5090

PCM: Picard Consistency Model for Fast Parallel Sampling of Diffusion Models

Junhyuk So, Jiwoong Shin, Chaeyeon Jang et al.

CVPR 2025 • poster • arXiv:2503.19731
#5091

Towards Precise Scaling Laws for Video Diffusion Transformers

Yuanyang Yin, Yaqi Zhao, Mingwu Zheng et al.

CVPR 2025 • poster • arXiv:2411.17470
#5092

RUBIK: A Structured Benchmark for Image Matching across Geometric Challenges

Thibaut Loiseau, Guillaume Bourmaud

CVPR 2025 • poster • arXiv:2502.19955
#5093

RobSense: A Robust Multi-modal Foundation Model for Remote Sensing with Static, Temporal, and Incomplete Data Adaptability

Minh Kha Do, Kang Han, Phu Lai et al.

CVPR 2025 • poster
#5094

Sonic: Shifting Focus to Global Audio Perception in Portrait Animation

Xiaozhong Ji, Xiaobin Hu, Zhihong Xu et al.

CVPR 2025 • poster • arXiv:2411.16331
#5095

MESC-3D: Mining Effective Semantic Cues for 3D Reconstruction from a Single Image

Shaoming Li, Qing Cai, Songqi Kong et al.

CVPR 2025 • poster
#5096

Probing the Mid-level Vision Capabilities of Self-Supervised Learning

Xuweiyi Chen, Markus Marks, Zezhou Cheng

CVPR 2025 • poster • arXiv:2411.17474
#5097

T2SG: Traffic Topology Scene Graph for Topology Reasoning in Autonomous Driving

Changsheng Lv, Mengshi Qi, Liang Liu et al.

CVPR 2025 • poster • arXiv:2411.18894
#5098

ConceptGuard: Continual Personalized Text-to-Image Generation with Forgetting and Confusion Mitigation

Zirun Guo, Tao Jin

CVPR 2025 • poster • arXiv:2503.10358
#5099

Diffusion Bridge: Leveraging Diffusion Model to Reduce the Modality Gap Between Text and Vision for Zero-Shot Image Captioning

Jeongryong Lee, Yejee Shin, Geonhui Son et al.

CVPR 2025 • poster
#5100

MODfinity: Unsupervised Domain Adaptation with Multimodal Information Flow Intertwining

Shanglin Liu, Jianming Lv, Jingdan Kang et al.

CVPR 2025 • poster
#5101

Volume Tells: Dual Cycle-Consistent Diffusion for 3D Fluorescence Microscopy De-noising and Super-Resolution

Zelin Li, Chenwei Wang, Zhaoke Huang et al.

CVPR 2025 • highlight • arXiv:2503.02261
#5102

Rethinking Temporal Fusion with a Unified Gradient Descent View for 3D Semantic Occupancy Prediction

Dubing Chen, Huan Zheng, Jin Fang et al.

CVPR 2025 • poster • arXiv:2504.12959
#5103

The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation

Bingjie Gao, Xinyu Gao, Xiaoxue Wu et al.

CVPR 2025 • poster • arXiv:2504.11739
#5104

CLIP is Almost All You Need: Towards Parameter-Efficient Scene Text Retrieval without OCR

Xugong Qin, Peng Zhang, Jun Jie Ou Yang et al.

CVPR 2025 • poster
#5105

DreamTrack: Dreaming the Future for Multimodal Visual Object Tracking

Mingzhe Guo, Weiping Tan, Wenyu Ran et al.

CVPR 2025 • poster
#5106

Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks

Wei-Jin Huang, Yuan-Ming Li, Zhi-Wei Xia et al.

CVPR 2025 • poster • arXiv:2503.22405
#5107

Learned Image Compression with Dictionary-based Entropy Model

Jingbo Lu, Leheng Zhang, Xingyu Zhou et al.

CVPR 2025 • poster • arXiv:2504.00496
#5108

Training Data Provenance Verification: Did Your Model Use Synthetic Data from My Generative Model for Training?

Yuechen Xie, Jie Song, Huiqiong Wang et al.

CVPR 2025 • poster • arXiv:2503.09122
#5109

VLMs-Guided Representation Distillation for Efficient Vision-Based Reinforcement Learning

Haoran Xu, Peixi Peng, Guang Tan et al.

CVPR 2025 • poster
#5110

LeanGaussian: Breaking Pixel or Point Cloud Correspondence in Modeling 3D Gaussians

Jiamin Wu, Kenkun Liu, Han Gao et al.

CVPR 2025 • poster • arXiv:2404.16323
#5111

Shadow Generation Using Diffusion Model with Geometry Prior

Haonan Zhao, Qingyang Liu, Xinhao Tao et al.

CVPR 2025 • poster
#5112

STEP: Enhancing Video-LLMs’ Compositional Reasoning by Spatio-Temporal Graph-guided Self-Training

Haiyi Qiu, Minghe Gao, Long Qian et al.

CVPR 2025 • poster • arXiv:2412.00161
#5113

Stop Learning it all to Mitigate Visual Hallucination, Focus on the Hallucination Target.

Dokyoon Yoon, Youngsook Song, Woomyoung Park

CVPR 2025 • poster • arXiv:2506.11417
#5114

AniGrad: Anisotropic Gradient-Adaptive Sampling for 3D Reconstruction From Monocular Video

Noah Stier, Alex Rich, Pradeep Sen et al.

CVPR 2025 • poster
#5115

HOT: Hadamard-based Optimized Training

Seonggon Kim, Juncheol Shin, Seung-taek Woo et al.

CVPR 2025 • poster • arXiv:2503.21261
#5116

BOOTPLACE: Bootstrapped Object Placement with Detection Transformers

Hang Zhou, Xinxin Zuo, Rui Ma et al.

CVPR 2025 • poster • arXiv:2503.21991
#5117

Q-PART: Quasi-Periodic Adaptive Regression with Test-time Training for Pediatric Left Ventricular Ejection Fraction Regression

Jie Liu, Tiexin Qin, Hui Liu et al.

CVPR 2025 • poster • arXiv:2503.04131
#5118

Align-KD: Distilling Cross-Modal Alignment Knowledge for Mobile Vision-Language Large Model Enhancement

Qianhan Feng, Wenshuo Li, Tong Lin et al.

CVPR 2025 • poster
#5119

Think Small, Act Big: Primitive Prompt Learning for Lifelong Robot Manipulation

Yuanqi Yao, Siao Liu, Haoming Song et al.

CVPR 2025 • poster • arXiv:2504.00420
#5120

BEVDiffuser: Plug-and-Play Diffusion Model for BEV Denoising with Ground-Truth Guidance

Xin Ye, Burhan Yaman, Sheng Cheng et al.

CVPR 2025 • highlight • arXiv:2502.19694
#5121

How to Merge Your Multimodal Models Over Time?

Sebastian Dziadzio, Vishaal Udandarao, Karsten Roth et al.

CVPR 2025 • poster • arXiv:2412.06712
#5122

Discovering Hidden Visual Concepts Beyond Linguistic Input in Infant Learning

Xueyi Ke, Satoshi Tsutsui, Yayun Zhang et al.

CVPR 2025 • poster • arXiv:2501.05205
#5123

Learning Textual Prompts for Open-World Semi-Supervised Learning

Yuxin Fan, Junbiao Cui, Jiye Liang

CVPR 2025 • poster
#5124

Hierarchical Diffusion Policy for Kinematics-Aware Multi-Task Robotic Manipulation

Xiao Ma, Sumit Patidar, Iain Haughton et al.

CVPR 2024 • poster • arXiv:2403.03890
#5125

Easy-editable Image Vectorization with Multi-layer Multi-scale Distributed Visual Feature Embedding

Ye Chen, Zhangli Hu, Zhongyin Zhao et al.

CVPR 2025 • poster
#5126

DynFocus: Dynamic Cooperative Network Empowers LLMs with Video Understanding

Yudong Han, Qingpei Guo, Liyuan Pan et al.

CVPR 2025 • poster • arXiv:2411.12355
#5127

Aesthetic Post-Training Diffusion Models from Generic Preferences with Step-by-step Preference Optimization

Zhanhao Liang, Yuhui Yuan, Shuyang Gu et al.

CVPR 2025 • poster • arXiv:2406.04314
#5128

VinaBench: Benchmark for Faithful and Consistent Visual Narratives

Silin Gao, Sheryl Mathew, Li Mi et al.

CVPR 2025 • poster • arXiv:2503.20871
#5129

Automated Proof of Polynomial Inequalities via Reinforcement Learning

Banglong Liu, Niuniu Qi, Xia Zeng et al.

CVPR 2025 • poster • arXiv:2503.06592
#5130

Active Hyperspectral Imaging Using an Event Camera

Bohan Yu, Jinxiu Liang, Zhuofeng Wang et al.

CVPR 2025 • highlight
#5131

Bridging the Gap between Gaussian Diffusion Models and Universal Quantization for Image Compression

Lucas Relic, Roberto Azevedo, Yang Zhang et al.

CVPR 2025 • poster • arXiv:2504.02579
#5132

InstanceGaussian: Appearance-Semantic Joint Gaussian Representation for 3D Instance-Level Perception

Haijie Li, Yanmin Wu, Jiarui Meng et al.

CVPR 2025 • poster • arXiv:2411.19235
#5133

BIP3D: Bridging 2D Images and 3D Perception for Embodied Intelligence

Xuewu Lin, Tianwei Lin, Alan Huang et al.

CVPR 2025 • poster • arXiv:2411.14869
#5134

MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors

Riku Murai, Eric Dexheimer, Andrew J. Davison

CVPR 2025 • highlight • arXiv:2412.12392
#5135

VoxelSplat: Dynamic Gaussian Splatting as an Effective Loss for Occupancy and Flow Prediction

Ziyue Zhu, Shenlong Wang, Jin Xie et al.

CVPR 2025 • poster • arXiv:2506.05563
#5136

SOGS: Second-Order Anchor for Advanced 3D Gaussian Splatting

Jiahui Zhang, Fangneng Zhan, Ling Shao et al.

CVPR 2025 • poster • arXiv:2503.07476
#5137

Online Task-Free Continual Learning via Dynamic Expansionable Memory Distribution

Fei Ye, Adrian Bors

CVPR 2025 • poster
#5138

OffsetOPT: Explicit Surface Reconstruction without Normals

Huan Lei

CVPR 2025 • poster • arXiv:2503.15763
#5139

Chapter-Llama: Efficient Chaptering in Hour-Long Videos with LLMs

Lucas Ventura, Antoine Yang, Cordelia Schmid et al.

CVPR 2025 • poster • arXiv:2504.00072
#5140

Enhancing Few-Shot Class-Incremental Learning via Training-Free Bi-Level Modality Calibration

Yiyang Chen, Tianyu Ding, Lei Wang et al.

CVPR 2025 • poster
#5141

Animate and Sound an Image

Xihua Wang, Ruihua Song, Chongxuan Li et al.

CVPR 2025 • poster
#5142

Plug-and-Play Interpretable Responsible Text-to-Image Generation via Dual-Space Multi-facet Concept Control

Basim Azam, Naveed Akhtar

CVPR 2025 • poster • arXiv:2503.18324
#5143

Generalized Gaussian Entropy Model for Point Cloud Attribute Compression with Dynamic Likelihood Intervals

Changhao Peng

CVPR 2025 • poster • arXiv:2506.09510
#5144

Advancing Generalizable Tumor Segmentation with Anomaly-Aware Open-Vocabulary Attention Maps and Frozen Foundation Diffusion Models

Yankai Jiang, Peng Zhang, Donglin Yang et al.

CVPR 2025 • poster • arXiv:2505.02753
#5145

Towards Explainable and Unprecedented Accuracy in Matching Challenging Finger Crease Patterns

Zhenyu Zhou, Chengdong Dong, Ajay Kumar

CVPR 2025 • highlight
#5146

Toward Robust Neural Reconstruction from Sparse Point Sets

Amine Ouasfi, Shubhendu Jena, Eric Marchand et al.

CVPR 2025 • poster • arXiv:2412.16361
#5147

Just Dance with pi! A Poly-modal Inductor for Weakly-supervised Video Anomaly Detection

Snehashis Majhi, Giacomo D'Amicantonio, Antitza Dantcheva et al.

CVPR 2025 • highlight
#5148

PERSE: Personalized 3D Generative Avatars from A Single Portrait

Hyunsoo Cha, Inhee Lee, Hanbyul Joo

CVPR 2025 • poster • arXiv:2412.21206
#5149

Sketchy Bounding-box Supervision for 3D Instance Segmentation

Qian Deng, Le Hui, Jin Xie et al.

CVPR 2025 • poster • arXiv:2505.16399
#5150

Mitigating Object Hallucinations in Large Vision-Language Models with Assembly of Global and Local Attention

Wenbin An, Feng Tian, Sicong Leng et al.

CVPR 2025 • poster • arXiv:2406.12718
#5151

HSI-GPT: A General-Purpose Large Scene-Motion-Language Model for Human Scene Interaction

Yuan Wang, Yali Li, Lixiang Li et al.

CVPR 2025 • highlight
#5152

Diffusion Model is Effectively Its Own Teacher

Xinyin Ma, Runpeng Yu, Songhua Liu et al.

CVPR 2025 • poster
#5153

SoftShadow: Leveraging Soft Masks for Penumbra-Aware Shadow Removal

Xinrui Wang, Lanqing Guo, Xiyu Wang et al.

CVPR 2025 • poster • arXiv:2409.07041
#5154

Towards Visual Discrimination and Reasoning of Real-World Physical Dynamics: Physics-Grounded Anomaly Detection

Wenqiao Li, Yao Gu, Xintao Chen et al.

CVPR 2025 • poster • arXiv:2503.03562
#5155

Flow-NeRF: Joint Learning of Geometry, Poses, and Dense Flow within Unified Neural Representations

Xunzhi Zheng, Dan Xu

CVPR 2025 • poster • arXiv:2503.10464
#5156

MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models

Wenyi Hong, Yean Cheng, Zhuoyi Yang et al.

CVPR 2025 • poster • arXiv:2501.02955
#5157

RASP: Revisiting 3D Anamorphic Art for Shadow-Guided Packing of Irregular Objects

Soumyaratna Debnath, Ashish Tiwari, Kaustubh Sadekar et al.

CVPR 2025 • poster • arXiv:2504.02465
#5158

Exploring Scene Affinity for Semi-Supervised LiDAR Semantic Segmentation

Chuandong Liu, Xingxing Weng, Shuguo Jiang et al.

CVPR 2025 • poster • arXiv:2408.11280
#5159

Towards Satellite Image Road Graph Extraction: A Global-Scale Dataset and A Novel Method

Pan Yin, Kaiyu Li, Xiangyong Cao et al.

CVPR 2025 • poster • arXiv:2411.16733
#5160

Adaptive Parameter Selection for Tuning Vision-Language Models

Yi Zhang, Yi-Xuan Deng, Meng-Hao Guo et al.

CVPR 2025 • poster
#5161

ManipTrans: Efficient Dexterous Bimanual Manipulation Transfer via Residual Learning

Kailin Li, Puhao Li, Tengyu Liu et al.

CVPR 2025 • poster • arXiv:2503.21860
#5162

COB-GS: Clear Object Boundaries in 3DGS Segmentation Based on Boundary-Adaptive Gaussian Splitting

Jiaxin Zhang, Junjun Jiang, Youyu Chen et al.

CVPR 2025 • poster • arXiv:2503.19443
#5163

GROVE: A Generalized Reward for Learning Open-Vocabulary Physical Skill

Jieming Cui, Tengyu Liu, Ziyu Meng et al.

CVPR 2025 • poster • arXiv:2504.04191
#5164

Revisiting Audio-Visual Segmentation with Vision-Centric Transformer

Shaofei Huang, Rui Ling, Tianrui Hui et al.

CVPR 2025 • poster • arXiv:2506.23623
#5165

Less is More: Efficient Image Vectorization with Adaptive Parameterization

Kaibo Zhao, Liang Bao, Yufei Li et al.

CVPR 2025 • poster
#5166

Steering Away from Harm: An Adaptive Approach to Defending Vision Language Model Against Jailbreaks

Han Wang, Gang Wang, Huan Zhang

CVPR 2025 • poster • arXiv:2411.16721
#5167

PEER Pressure: Model-to-Model Regularization for Single Source Domain Generalization

Dongkyu Cho, Inwoo Hwang, Sanghack Lee

CVPR 2025 • poster • arXiv:2505.12745
#5168

Gaussian Splashing: Unified Particles for Versatile Motion Synthesis and Rendering

Yutao Feng, Xiang Feng, Yintong Shang et al.

CVPR 2025 • poster • arXiv:2401.15318
#5169

Improving Accuracy and Calibration via Differentiated Deep Mutual Learning

Han Liu, Peng Cui, Bingning Wang et al.

CVPR 2025 • poster
#5170

Similarity-Guided Layer-Adaptive Vision Transformer for UAV Tracking

Chaocan Xue, Bineng Zhong, Qihua Liang et al.

CVPR 2025 • poster • arXiv:2503.06625
#5171

Unified Dense Prediction of Video Diffusion

Lehan Yang, Lu Qi, Xiangtai Li et al.

CVPR 2025 • poster • arXiv:2503.09344
#5172

Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer

Jiahao Cui, Hui Li, Qingkun Su et al.

CVPR 2025 • poster • arXiv:2412.00733
#5173

Improving the Transferability of Adversarial Attacks on Face Recognition with Diverse Parameters Augmentation

Fengfan Zhou, Bangjie Yin, Hefei Ling et al.

CVPR 2025 • poster • arXiv:2411.15555
#5174

Joint Scheduling of Causal Prompts and Tasks for Multi-Task Learning

Chaoyang Li, Jianyang Qin, Jinhao Cui et al.

CVPR 2025 • poster
#5175

OPTICAL: Leveraging Optimal Transport for Contribution Allocation in Dataset Distillation

Xiao Cui, Yulei Qin, Wengang Zhou et al.

CVPR 2025 • highlight
#5176

Is Your World Simulator a Good Story Presenter? A Consecutive Events-Based Benchmark for Future Long Video Generation

Yiping Wang, Xuehai He, Kuan Wang et al.

CVPR 2025 • poster • arXiv:2412.16211
#5177

DynScene: Scalable Generation of Dynamic Robotic Manipulation Scenes for Embodied AI

Sangmin Lee, Sungyong Park, Heewon Kim

CVPR 2025 • poster
#5178

Reward Fine-Tuning Two-Step Diffusion Models via Learning Differentiable Latent-Space Surrogate Reward

Zhiwei Jia, Yuesong Nan, Huixi Zhao et al.

CVPR 2025 • poster • arXiv:2411.15247
#5179

CaricatureBooth: Data-Free Interactive Caricature Generation in a Photo Booth

Zhiyu Qu, Yunqi Miao, Zhensong Zhang et al.

CVPR 2025 • poster
#5180

Incremental Object Keypoint Learning

Mingfu Liang, Jiahuan Zhou, Xu Zou et al.

CVPR 2025 • poster • arXiv:2503.20248
#5181

DefMamba: Deformable Visual State Space Model

Leiye Liu, Miao Zhang, Jihao Yin et al.

CVPR 2025 • poster • arXiv:2504.05794
#5182

SceneFactor: Factored Latent 3D Diffusion for Controllable 3D Scene Generation

Aleksei Bokhovkin, Quan Meng, Shubham Tulsiani et al.

CVPR 2025 • poster • arXiv:2412.01801
#5183

VideoGEM: Training-free Action Grounding in Videos

Felix Vogel, Walid Bousselham, Anna Kukleva et al.

CVPR 2025 • poster • arXiv:2503.20348
#5184

LEMON: Learning 3D Human-Object Interaction Relation from 2D Images

Yuhang Yang, Wei Zhai, Hongchen Luo et al.

CVPR 2024 • poster • arXiv:2312.08963
#5185

SceneDiffuser++: City-Scale Traffic Simulation via a Generative World Model

Shuhan Tan, John Wheatley Lambert, Hong Jeon et al.

CVPR 2025 • poster • arXiv:2506.21976
#5186

Learning Extremely High Density Crowds as Active Matters

Feixiang He, Jiangbei Yue, Jialin Zhu et al.

CVPR 2025 • poster • arXiv:2503.12168
#5187

Mosaic3D: Foundation Dataset and Model for Open-Vocabulary 3D Segmentation

Junha Lee, Chunghyun Park, Jaesung Choe et al.

CVPR 2025 • poster • arXiv:2502.02548
#5188

Channel-wise Noise Scheduled Diffusion for Inverse Rendering in Indoor Scenes

JunYong Choi, Min-Cheol Sagong, SeokYeong Lee et al.

CVPR 2025 • poster • arXiv:2503.09993
#5189

MobileH2R: Learning Generalizable Human to Mobile Robot Handover Exclusively from Scalable and Diverse Synthetic Data

Zifan Wang, Ziqing Chen, Junyu Chen et al.

CVPR 2025 • poster • arXiv:2501.04595
#5190

Is this Generated Person Existed in Real-world? Fine-grained Detecting and Calibrating Abnormal Human-body

Zeqing Wang, Qingyang Ma, Wentao Wan et al.

CVPR 2025 • highlight • arXiv:2411.14205
#5191

ProReflow: Progressive Reflow with Decomposed Velocity

Lei Ke, Haohang Xu, Xuefei Ning et al.

CVPR 2025 • poster • arXiv:2503.04824
#5192

Discovering Fine-Grained Visual-Concept Relations by Disentangled Optimal Transport Concept Bottleneck Models

Yan Xie, Zequn Zeng, Hao Zhang et al.

CVPR 2025 • poster • arXiv:2505.07209
#5193

Event-Equalized Dense Video Captioning

Kangyi Wu, Pengna Li, Jingwen Fu et al.

CVPR 2025 • poster
#5194

GazeGene: Large-scale Synthetic Gaze Dataset with 3D Eyeball Annotations

Yiwei Bao, Zhiming Wang, Feng Lu

CVPR 2025 • poster
#5195

Shape and Texture: What Influences Reliable Optical Flow Estimation?

Libo Long, Xiao Hu, Jochen Lang

CVPR 2025 • poster
#5196

Attribute-formed Class-specific Concept Space: Endowing Language Bottleneck Model with Better Interpretability and Scalability

Jianyang Zhang, Qianli Luo, Guowu Yang et al.

CVPR 2025 • poster • arXiv:2503.20301
#5197

Feature Information Driven Position Gaussian Distribution Estimation for Tiny Object Detection

Jinghao Bian, Mingtao Feng, Weisheng Dong et al.

CVPR 2025 • poster
#5198

PRaDA: Projective Radial Distortion Averaging

Daniil Sinitsyn, Linus Härenstam-Nielsen, Daniel Cremers

CVPR 2025 • poster • arXiv:2504.16499
#5199

Embodied Scene Understanding for Vision Language Models via MetaVQA

Weizhen Wang, Chenda Duan, Zhenghao Peng et al.

CVPR 2025 • poster • arXiv:2501.09167
#5200

Rate-In: Information-Driven Adaptive Dropout Rates for Improved Inference-Time Uncertainty Estimation

Tal Zeevi, Ravid Shwartz-Ziv, Yann LeCun et al.

CVPR 2025 • poster • arXiv:2412.07169