Most Cited CVPR "extra gradient algorithm" Papers

5,589 papers found • Page 26 of 28

#5001

ACAttack: Adaptive Cross Attacking RGB-T Tracker via Multi-Modal Response Decoupling

Xinyu Xiang, Qinglong Yan, HAO ZHANG et al.

CVPR 2025 • poster
#5002

Subspace Constraint and Contribution Estimation for Heterogeneous Federated Learning

Xiangtao Zhang, Sheng Li, Ao Li et al.

CVPR 2025 • poster
#5003

Emphasizing Discriminative Features for Dataset Distillation in Complex Scenarios

Kai Wang, Zekai Li, Zhi-Qi Cheng et al.

CVPR 2025 • poster • arXiv:2410.17193
#5004

RNb-NeuS: Reflectance and Normal-based Multi-View 3D Reconstruction

Baptiste Brument, Robin Bruneau, Yvain Queau et al.

CVPR 2024 • poster • arXiv:2312.01215
#5005

AdMiT: Adaptive Multi-Source Tuning in Dynamic Environments

Xiangyu Chang, Fahim Faisal Niloy, Sk Miraj Ahmed et al.

CVPR 2025 • poster
#5006

Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis

Bingda Tang, Sayak Paul, Boyang Zheng et al.

CVPR 2025 • poster • arXiv:2505.10046
#5007

Fortifying Federated Learning Towards Trustworthiness via Auditable Data Valuation and Verifiable Client Contribution

Naveen Kumar Kummari, Ranjeet Ranjan Jha, Krishna Mohan Chalavadi et al.

CVPR 2025 • poster
#5008

SemiETS: Integrating Spatial and Content Consistencies for Semi-Supervised End-to-end Text Spotting

Dongliang Luo, Hanshen Zhu, Ziyang Zhang et al.

CVPR 2025 • poster • arXiv:2504.09966
#5009

DIV-FF: Dynamic Image-Video Feature Fields For Environment Understanding in Egocentric Videos

Lorenzo Mur-Labadia, Jose J. Guerrero, Ruben Martinez-Cantin

CVPR 2025 • highlight • arXiv:2503.08344
#5010

Distilling Long-tailed Datasets

Zhenghao Zhao, Haoxuan Wang, Yuzhang Shang et al.

CVPR 2025 • poster • arXiv:2408.14506
#5011

Beyond Local Sharpness: Communication-Efficient Global Sharpness-aware Minimization for Federated Learning

Debora Caldarola, Pietro Cagnasso, Barbara Caputo et al.

CVPR 2025 • poster • arXiv:2412.03752
#5012

Learning Physics From Video: Unsupervised Physical Parameter Estimation for Continuous Dynamical Systems

Alejandro Castañeda Garcia, Jan Warchocki, Jan van Gemert et al.

CVPR 2025 • poster • arXiv:2410.01376
#5013

Knowledge Memorization and Rumination for Pre-trained Model-based Class-Incremental Learning

Zijian Gao, Wangwang Jia, Xingxing Zhang et al.

CVPR 2025 • poster
#5014

Unveil Inversion and Invariance in Flow Transformer for Versatile Image Editing

Pengcheng Xu, Boyuan Jiang, Xiaobin Hu et al.

CVPR 2025 • poster • arXiv:2411.15843
#5015

SeqMvRL: A Sequential Fusion Framework for Multi-view Representation Learning

Ren Wang, Haoliang Sun, Yuxiu Lin et al.

CVPR 2025 • poster
#5016

Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language

Yicheng Chen, Xiangtai Li, Yining Li et al.

CVPR 2025 • poster • arXiv:2406.20085
#5017

Detect-and-Guide: Self-regulation of Diffusion Models for Safe Text-to-Image Generation via Guideline Token Optimization

Feifei Li, Mi Zhang, Yiming Sun et al.

CVPR 2025 • poster • arXiv:2503.15197
#5018

Hierarchical Diffusion Policy for Kinematics-Aware Multi-Task Robotic Manipulation

Xiao Ma, Sumit Patidar, Iain Haughton et al.

CVPR 2024 • poster • arXiv:2403.03890
#5019

AnyMap: Learning a General Camera Model for Structure-from-Motion with Unknown Distortion in Dynamic Scenes

Andrea Porfiri Dal Cin, Georgi Dikov, Jihong Ju et al.

CVPR 2025 • poster
#5020

Structure-from-Motion with a Non-Parametric Camera Model

Yihan Wang, Linfei Pan, Marc Pollefeys et al.

CVPR 2025 • highlight
#5021

EventPSR: Surface Normal and Reflectance Estimation from Photometric Stereo Using an Event Camera

Bohan Yu, Jin Han, Boxin Shi et al.

CVPR 2025 • highlight
#5022

LAL: Enhancing 3D Human Motion Prediction with Latency-aware Auxiliary Learning

Xiaoning Sun, Dong Wei, Huaijiang Sun et al.

CVPR 2025 • poster
#5023

Sea-ing in Low-light

Nisha Varghese, A. N. Rajagopalan

CVPR 2025 • poster
#5024

Seeking Consistent Flat Minima for Better Domain Generalization via Refining Loss Landscapes

Aodi Li, Liansheng Zhuang, Xiao Long et al.

CVPR 2025 • poster • arXiv:2412.13573
#5025

FASTer: Focal token Acquiring-and-Scaling Transformer for Long-term 3D Object Detection

Chenxu Dang, Pei An, Xinmin Zhang et al.

CVPR 2025 • poster • arXiv:2503.01899
#5026

Seeing Speech and Sound: Distinguishing and Locating Audio Sources in Visual Scenes

Hyeonggon Ryu, Seongyu Kim, Joon Chung et al.

CVPR 2025 • poster
#5027

DiskVPS: Vanishing Point Detector via Hough Transform in a Disk Region

Jianping Wu

CVPR 2025 • poster
#5028

Recurrent Feature Mining and Keypoint Mixup Padding for Category-Agnostic Pose Estimation

Junjie Chen, Weilong Chen, Yifan Zuo et al.

CVPR 2025 • poster • arXiv:2503.21140
#5029

Towards Autonomous Micromobility through Scalable Urban Simulation

Wayne Wu, Honglin He, Chaoyuan Zhang et al.

CVPR 2025 • highlight • arXiv:2505.00690
#5030

Heterogeneous Skeleton-Based Action Representation Learning

Xiaoyan Ma, jidong kuang, Hongsong Wang et al.

CVPR 2025 • poster • arXiv:2506.03481
#5031

PQPP: A Joint Benchmark for Text-to-Image Prompt and Query Performance Prediction

Eduard Poesina, Adriana Valentina Costache, Adrian-Gabriel Chifu et al.

CVPR 2025 • poster • arXiv:2406.04746
#5032

CheXwhatsApp: A Dataset for Exploring Challenges in the Diagnosis of Chest X-rays through Mobile Devices

Mariamma Antony, Rajiv Porana, Sahil M. Lathiya et al.

CVPR 2025 • poster
#5033

Learning-enabled Polynomial Lyapunov Function Synthesis via High-Accuracy Counterexample-Guided Framework

Hanrui Zhao, Niuniu Qi, Mengxin Ren et al.

CVPR 2025 • poster
#5034

Once-Tuning-Multiple-Variants: Tuning Once and Expanded as Multiple Vision-Language Model Variants

Chong Yu, Tao Chen, Zhongxue Gan

CVPR 2025 • poster
#5035

Seeing is Not Believing: Adversarial Natural Object Optimization for Hard-Label 3D Scene Attacks

Daizong Liu, Wei Hu

CVPR 2025 • poster
#5036

HomoGen: Enhanced Video Inpainting via Homography Propagation and Diffusion

Ding Ding, Yueming Pan, Ruoyu Feng et al.

CVPR 2025 • poster
#5037

Advancing Adversarial Robustness in GNeRFs: The IL2-NeRF Attack

Nicole Meng, Caleb Manicke, Ronak Sahu et al.

CVPR 2025 • poster
#5038

From Prototypes to General Distributions: An Efficient Curriculum for Masked Image Modeling

Jinhong Lin, Cheng-En Wu, Huanran Li et al.

CVPR 2025 • poster • arXiv:2411.10685
#5039

SINR: Sparsity Driven Compressed Implicit Neural Representations

Dhananjaya Jayasundara, Sudarshan Rajagopalan, Yasiru Ranasinghe et al.

CVPR 2025 • poster • arXiv:2503.19576
#5040

Spiking Transformer: Introducing Accurate Addition-Only Spiking Self-Attention for Transformer

Yufei Guo, Xiaode Liu, Yuanpei Chen et al.

CVPR 2025 • poster
#5041

NoiseCtrl: A Sampling-Algorithm-Agnostic Conditional Generation Method for Diffusion Models

Longquan Dai, He Wang, Jinhui Tang

CVPR 2025 • poster
#5042

Enhancing Vision-Language Compositional Understanding with Multimodal Synthetic Data

Haoxin Li, Boyang Li

CVPR 2025 • poster • arXiv:2503.01167
#5043

Efficient Depth Estimation for Unstable Stereo Camera Systems on AR Glasses

Yongfan Liu, Hyoukjun Kwon

CVPR 2025 • poster • arXiv:2411.10013
#5044

Erase Diffusion: Empowering Object Removal Through Calibrating Diffusion Pathways

Yi Liu, Hao Zhou, Benlei Cui et al.

CVPR 2025 • highlight • arXiv:2503.07026
#5045

Robotic Visual Instruction

Yanbang Li, ZiYang Gong, Haoyang Li et al.

CVPR 2025 • poster • arXiv:2505.00693
#5046

Learned Binocular-Encoding Optics for RGBD Imaging Using Joint Stereo and Focus Cues

Yuhui Liu, Liangxun Ou, Qiang Fu et al.

CVPR 2025 • poster
#5047

PCM: Picard Consistency Model for Fast Parallel Sampling of Diffusion Models

Junhyuk So, Jiwoong Shin, Chaeyeon Jang et al.

CVPR 2025 • poster • arXiv:2503.19731
#5048

Towards Precise Scaling Laws for Video Diffusion Transformers

Yuanyang Yin, Yaqi Zhao, Mingwu Zheng et al.

CVPR 2025 • poster • arXiv:2411.17470
#5049

Improving Autoregressive Visual Generation with Cluster-Oriented Token Prediction

Teng Hu, Jiangning Zhang, Ran Yi et al.

CVPR 2025 • poster • arXiv:2501.00880
#5050

RobSense: A Robust Multi-modal Foundation Model for Remote Sensing with Static, Temporal, and Incomplete Data Adaptability

Minh Kha Do, Kang Han, Phu Lai et al.

CVPR 2025 • poster
#5051

Sonic: Shifting Focus to Global Audio Perception in Portrait Animation

Xiaozhong Ji, Xiaobin Hu, Zhihong Xu et al.

CVPR 2025 • poster • arXiv:2411.16331
#5052

MESC-3D: Mining Effective Semantic Cues for 3D Reconstruction from a Single Image

Shaoming Li, Qing Cai, Songqi KONG et al.

CVPR 2025 • poster
#5053

Probing the Mid-level Vision Capabilities of Self-Supervised Learning

Xuweiyi Chen, Markus Marks, Zezhou Cheng

CVPR 2025 • poster • arXiv:2411.17474
#5054

T2SG: Traffic Topology Scene Graph for Topology Reasoning in Autonomous Driving

Changsheng Lv, Mengshi Qi, Liang Liu et al.

CVPR 2025 • poster • arXiv:2411.18894
#5055

ConceptGuard: Continual Personalized Text-to-Image Generation with Forgetting and Confusion Mitigation

Zirun Guo, Tao Jin

CVPR 2025 • poster • arXiv:2503.10358
#5056

Diffusion Bridge: Leveraging Diffusion Model to Reduce the Modality Gap Between Text and Vision for Zero-Shot Image Captioning

Jeongryong Lee, Yejee Shin, Geonhui Son et al.

CVPR 2025 • poster
#5057

MODfinity: Unsupervised Domain Adaptation with Multimodal Information Flow Intertwining

Shanglin Liu, Jianming Lv, Jingdan Kang et al.

CVPR 2025 • poster
#5058

Volume Tells: Dual Cycle-Consistent Diffusion for 3D Fluorescence Microscopy De-noising and Super-Resolution

ZELIN LI, Chenwei Wang, Zhaoke Huang et al.

CVPR 2025 • highlight • arXiv:2503.02261
#5059

Query Efficient Black-Box Visual Prompting with Subspace Learning

Haozhen Zhang, Zhaogeng Liu, Hualin Zhang et al.

CVPR 2025 • poster
#5060

The Devil is in the Prompts: Retrieval-Augmented Prompt Optimization for Text-to-Video Generation

Bingjie Gao, Xinyu Gao, Xiaoxue Wu et al.

CVPR 2025 • poster • arXiv:2504.11739
#5061

CLIP is Almost All You Need: Towards Parameter-Efficient Scene Text Retrieval without OCR

Xugong Qin, peng zhang, Jun Jie Ou Yang et al.

CVPR 2025 • poster
#5062

DreamTrack: Dreaming the Future for Multimodal Visual Object Tracking

Mingzhe Guo, Weiping Tan, Wenyu Ran et al.

CVPR 2025 • poster
#5063

Fingerprinting Denoising Diffusion Probabilistic Models

Huan Teng, Yuhui Quan, Chengyu Wang et al.

CVPR 2025 • poster
#5064

Learned Image Compression with Dictionary-based Entropy Model

Jingbo Lu, Leheng Zhang, Xingyu Zhou et al.

CVPR 2025 • poster • arXiv:2504.00496
#5065

Training Data Provenance Verification: Did Your Model Use Synthetic Data from My Generative Model for Training?

Yuechen Xie, Jie Song, Huiqiong Wang et al.

CVPR 2025 • poster • arXiv:2503.09122
#5066

VLMs-Guided Representation Distillation for Efficient Vision-Based Reinforcement Learning

Haoran Xu, Peixi Peng, Guang Tan et al.

CVPR 2025 • poster
#5067

Shadow Generation Using Diffusion Model with Geometry Prior

Haonan Zhao, Qingyang Liu, Xinhao Tao et al.

CVPR 2025 • poster
#5068

AdaDARE-gamma: Balancing Stability and Plasticity in Multi-modal LLMs through Efficient Adaptation

Jingyi Xie, Jintao Yang, Zhunchen Luo et al.

CVPR 2025 • poster
#5069

SkySense-O: Towards Open-World Remote Sensing Interpretation with Vision-Centric Visual-Language Modeling

Qi Zhu, Jiangwei Lao, Deyi Ji et al.

CVPR 2025 • poster
#5070

AniGrad: Anisotropic Gradient-Adaptive Sampling for 3D Reconstruction From Monocular Video

Noah Stier, Alex Rich, Pradeep Sen et al.

CVPR 2025 • poster
#5071

BOOTPLACE: Bootstrapped Object Placement with Detection Transformers

Hang Zhou, Xinxin Zuo, Rui Ma et al.

CVPR 2025 • poster • arXiv:2503.21991
#5072

Q-PART: Quasi-Periodic Adaptive Regression with Test-time Training for Pediatric Left Ventricular Ejection Fraction Regression

Jie Liu, Tiexin Qin, Hui Liu et al.

CVPR 2025 • poster • arXiv:2503.04131
#5073

Knowledge Bridger: Towards Training-Free Missing Modality Completion

Guanzhou Ke, Shengfeng He, Xiao-Li Wang et al.

CVPR 2025 • poster • arXiv:2502.19834
#5074

BimArt: A Unified Approach for the Synthesis of 3D Bimanual Interaction with Articulated Objects

Wanyue Zhang, Rishabh Dabral, Vladislav Golyanik et al.

CVPR 2025 • poster • arXiv:2412.05066
#5075

Towards Long-Horizon Vision-Language Navigation: Platform, Benchmark and Method

Xinshuai Song, weixing chen, Yang Liu et al.

CVPR 2025 • poster • arXiv:2412.09082
#5076

CheXWorld: Exploring Image World Modeling for Radiograph Representation Learning

Yang Yue, Yulin Wang, Chenxin Tao et al.

CVPR 2025 • poster • arXiv:2504.13820
#5077

How to Merge Your Multimodal Models Over Time?

Sebastian Dziadzio, Vishaal Udandarao, Karsten Roth et al.

CVPR 2025 • poster • arXiv:2412.06712
#5078

Joint Vision-Language Social Bias Removal for CLIP

Haoyu Zhang, Yangyang Guo, Mohan Kankanhalli

CVPR 2025 • poster • arXiv:2411.12785
#5079

IDEA: Inverted Text with Cooperative Deformable Aggregation for Multi-modal Object Re-Identification

Yuhao Wang, Yongfeng Lv, Pingping Zhang et al.

CVPR 2025 • poster • arXiv:2503.10324
#5080

Easy-editable Image Vectorization with Multi-layer Multi-scale Distributed Visual Feature Embedding

Ye Chen, Zhangli Hu, Zhongyin Zhao et al.

CVPR 2025 • poster
#5081

DynFocus: Dynamic Cooperative Network Empowers LLMs with Video Understanding

Yudong Han, Qingpei Guo, Liyuan Pan et al.

CVPR 2025 • poster • arXiv:2411.12355
#5082

RICCARDO: Radar Hit Prediction and Convolution for Camera-Radar 3D Object Detection

Yunfei Long, Abhinav Kumar, Xiaoming Liu et al.

CVPR 2025 • poster • arXiv:2504.09086
#5083

HyperNet Fields: Efficiently Training Hypernetworks without Ground Truth by Learning Weight Trajectories

Eric Hedlin, Munawar Hayat, Fatih Porikli et al.

CVPR 2025 • poster • arXiv:2412.17040
#5084

Automated Proof of Polynomial Inequalities via Reinforcement Learning

Banglong Liu, Niuniu Qi, Xia Zeng et al.

CVPR 2025 • poster • arXiv:2503.06592
#5085

Active Hyperspectral Imaging Using an Event Camera

Bohan Yu, Jinxiu Liang, Zhuofeng Wang et al.

CVPR 2025 • highlight
#5086

Bridging the Gap between Gaussian Diffusion Models and Universal Quantization for Image Compression

Lucas Relic, Roberto Azevedo, Yang Zhang et al.

CVPR 2025 • poster • arXiv:2504.02579
#5087

iG-6DoF: Model-free 6DoF Pose Estimation for Unseen Object via Iterative 3D Gaussian Splatting

Tuo Cao, Fei LUO, Jiongming Qin et al.

CVPR 2025 • poster
#5088

InstanceGaussian: Appearance-Semantic Joint Gaussian Representation for 3D Instance-Level Perception

Haijie Li, Yanmin Wu, Jiarui Meng et al.

CVPR 2025 • poster • arXiv:2411.19235
#5089

BIP3D: Bridging 2D Images and 3D Perception for Embodied Intelligence

Xuewu Lin, Tianwei Lin, Alan Huang et al.

CVPR 2025 • poster • arXiv:2411.14869
#5090

MASt3R-SLAM: Real-Time Dense SLAM with 3D Reconstruction Priors

Riku Murai, Eric Dexheimer, Andrew J. Davison

CVPR 2025 • highlight • arXiv:2412.12392
#5091

POPEN: Preference-Based Optimization and Ensemble for LVLM-Based Reasoning Segmentation

Lanyun Zhu, Tianrun Chen, Qianxiong Xu et al.

CVPR 2025 • poster • arXiv:2504.00640
#5092

Advancing Generalizable Tumor Segmentation with Anomaly-Aware Open-Vocabulary Attention Maps and Frozen Foundation Diffusion Models

Yankai Jiang, Peng Zhang, Donglin Yang et al.

CVPR 2025 • poster • arXiv:2505.02753
#5093

Online Task-Free Continual Learning via Dynamic Expansionable Memory Distribution

Fei Ye, Adrian Bors

CVPR 2025 • poster
#5094

OffsetOPT: Explicit Surface Reconstruction without Normals

Huan Lei

CVPR 2025 • poster • arXiv:2503.15763
#5095

Devils in Middle Layers of Large Vision-Language Models: Interpreting, Detecting and Mitigating Object Hallucinations via Attention Lens

Zhangqi Jiang, Junkai Chen, Beier Zhu et al.

CVPR 2025 • poster • arXiv:2411.16724
#5096

UNIALIGN: Scaling Multimodal Alignment within One Unified Model

bo zhou, Liulei Li, Yujia Wang et al.

CVPR 2025 • poster
#5097

Plug-and-Play Interpretable Responsible Text-to-Image Generation via Dual-Space Multi-facet Concept Control

Basim Azam, Naveed Akhtar

CVPR 2025 • poster • arXiv:2503.18324
#5098

Generalized Gaussian Entropy Model for Point Cloud Attribute Compression with Dynamic Likelihood Intervals

Changhao Peng

CVPR 2025 • poster • arXiv:2506.09510
#5099

LEMON: Learning 3D Human-Object Interaction Relation from 2D Images

Yuhang Yang, Wei Zhai, Hongchen Luo et al.

CVPR 2024 • poster • arXiv:2312.08963
#5100

Zero-Shot 4D Lidar Panoptic Segmentation

Yushan Zhang, Aljoša Ošep, Laura Leal-Taixe et al.

CVPR 2025 • poster • arXiv:2504.00848
#5101

DSPNet: Dual-vision Scene Perception for Robust 3D Question Answering

Jingzhou Luo, Yang Liu, weixing chen et al.

CVPR 2025 • poster • arXiv:2503.03190
#5102

Toward Robust Neural Reconstruction from Sparse Point Sets

Amine Ouasfi, Shubhendu Jena, Eric Marchand et al.

CVPR 2025 • poster • arXiv:2412.16361
#5103

Just Dance with pi! A Poly-modal Inductor for Weakly-supervised Video Anomaly Detection

Snehashis Majhi, Giacomo D'Amicantonio, Antitza Dantcheva et al.

CVPR 2025 • highlight
#5104

Efficient Motion-Aware Video MLLM

Zijia Zhao, Yuqi Huo, Tongtian Yue et al.

CVPR 2025 • highlight • arXiv:2503.13016
#5105

Hyperspectral Pansharpening via Diffusion Models with Iteratively Zero-Shot Guidance

Jin-Liang Xiao, Ting-Zhu Huang, Liang-Jian Deng et al.

CVPR 2025 • poster
#5106

Analyzing the Synthetic-to-Real Domain Gap in 3D Hand Pose Estimation

Zhuoran ZHAO, Linlin Yang, Pengzhan Sun et al.

CVPR 2025 • poster • arXiv:2503.19307
#5107

HSI-GPT: A General-Purpose Large Scene-Motion-Language Model for Human Scene Interaction

Yuan Wang, Yali Li, Lixiang Li et al.

CVPR 2025 • highlight
#5108

SoftShadow: Leveraging Soft Masks for Penumbra-Aware Shadow Removal

Xinrui Wang, Lanqing Guo, Xiyu Wang et al.

CVPR 2025 • poster • arXiv:2409.07041
#5109

Towards Visual Discrimination and Reasoning of Real-World Physical Dynamics: Physics-Grounded Anomaly Detection

wenqiao Li, Yao Gu, Xintao Chen et al.

CVPR 2025 • poster • arXiv:2503.03562
#5110

Flow-NeRF: Joint Learning of Geometry, Poses, and Dense Flow within Unified Neural Representations

Xunzhi Zheng, Dan Xu

CVPR 2025 • poster • arXiv:2503.10464
#5111

MotionBench: Benchmarking and Improving Fine-grained Video Motion Understanding for Vision Language Models

Wenyi Hong, Yean Cheng, Zhuoyi Yang et al.

CVPR 2025 • poster • arXiv:2501.02955
#5112

Towards Satellite Image Road Graph Extraction: A Global-Scale Dataset and A Novel Method

Pan Yin, Kaiyu Li, Xiangyong Cao et al.

CVPR 2025 • poster • arXiv:2411.16733
#5113

Adaptive Parameter Selection for Tuning Vision-Language Models

Yi Zhang, Yi-Xuan Deng, Meng-Hao Guo et al.

CVPR 2025 • poster
#5114

ManipTrans: Efficient Dexterous Bimanual Manipulation Transfer via Residual Learning

Kailin Li, Puhao Li, Tengyu Liu et al.

CVPR 2025 • poster • arXiv:2503.21860
#5115

COB-GS: Clear Object Boundaries in 3DGS Segmentation Based on Boundary-Adaptive Gaussian Splitting

Jiaxin Zhang, Junjun Jiang, Youyu Chen et al.

CVPR 2025 • poster • arXiv:2503.19443
#5116

GROVE: A Generalized Reward for Learning Open-Vocabulary Physical Skill

Jieming Cui, Tengyu Liu, Ziyu Meng et al.

CVPR 2025 • poster • arXiv:2504.04191
#5117

Accurate Scene Text Recognition with Efficient Model Scaling and Cloze Self-Distillation

Andrea Maracani, Savas Ozkan, Sijun Cho et al.

CVPR 2025 • poster • arXiv:2503.16184
#5118

Towards Smart Point-and-Shoot Photography

Jiawan Li, Fei Zhou, Zhipeng Zhong et al.

CVPR 2025 • poster • arXiv:2505.03638
#5119

Image Reconstruction from Readout-Multiplexed Single-Photon Detector Arrays

Shashwath Bharadwaj, Ruangrawee Kitichotkul, Akshay Agarwal et al.

CVPR 2025 • highlight • arXiv:2312.02971
#5120

Learning on Model Weights using Tree Experts

Eliahu Horwitz, Bar Cavia, Jonathan Kahana et al.

CVPR 2025 • poster • arXiv:2410.13569
#5121

Gaussian Splashing: Unified Particles for Versatile Motion Synthesis and Rendering

Yutao Feng, Xiang Feng, Yintong Shang et al.

CVPR 2025 • poster • arXiv:2401.15318
#5122

Improving Accuracy and Calibration via Differentiated Deep Mutual Learning

Han Liu, Peng Cui, Bingning Wang et al.

CVPR 2025 • poster
#5123

Large-scale Multi-view Tensor Clustering with Implicit Linear Kernels

Jiyuan Liu, Xinwang Liu, chuankun Li et al.

CVPR 2025 • poster
#5124

The Impact Label Noise and Choice of Threshold has on Cross-Entropy and Soft-Dice in Image Segmentation

Marcus Nordström, Atsuto Maki, Henrik Hult

CVPR 2025 • poster
#5125

Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer

Jiahao Cui, Hui Li, Qingkun Su et al.

CVPR 2025 • poster • arXiv:2412.00733
#5126

Improving the Transferability of Adversarial Attacks on Face Recognition with Diverse Parameters Augmentation

Fengfan Zhou, Bangjie Yin, Hefei Ling et al.

CVPR 2025 • poster • arXiv:2411.15555
#5127

EfficientLLaVA: Generalizable Auto-Pruning for Large Vision-language Models

Yinan Liang, Ziwei Wang, Xiuwei Xu et al.

CVPR 2025 • poster
#5128

OPTICAL: Leveraging Optimal Transport for Contribution Allocation in Dataset Distillation

Xiao Cui, Yulei Qin, Wengang Zhou et al.

CVPR 2025 • highlight
#5129

Motions as Queries: One-Stage Multi-Person Holistic Human Motion Capture

Kenkun Liu, Yurong Fu, Weihao Yuan et al.

CVPR 2025 • poster
#5130

Reward Fine-Tuning Two-Step Diffusion Models via Learning Differentiable Latent-Space Surrogate Reward

Zhiwei Jia, Yuesong Nan, Huixi Zhao et al.

CVPR 2025 • poster • arXiv:2411.15247
#5131

ArticulatedGS: Self-supervised Digital Twin Modeling of Articulated Objects using 3D Gaussian Splatting

Guo Junfu, Yu Xin, Gaoyi Liu et al.

CVPR 2025 • poster • arXiv:2503.08135
#5132

CaricatureBooth: Data-Free Interactive Caricature Generation in a Photo Booth

Zhiyu Qu, Yunqi Miao, Zhensong Zhang et al.

CVPR 2025 • poster
#5133

Incremental Object Keypoint Learning

Mingfu Liang, Jiahuan Zhou, Xu Zou et al.

CVPR 2025 • poster • arXiv:2503.20248
#5134

Rethinking Noisy Video-Text Retrieval via Relation-aware Alignment

Huakai Lai, Guoxin Xiong, Huayu Mai et al.

CVPR 2025 • poster
#5135

SceneDiffuser++: City-Scale Traffic Simulation via a Generative World Model

Shuhan Tan, John Wheatley Lambert, Hong Jeon et al.

CVPR 2025 • poster • arXiv:2506.21976
#5136

Learning Extremely High Density Crowds as Active Matters

Feixiang He, Jiangbei Yue, Jialin Zhu et al.

CVPR 2025 • poster • arXiv:2503.12168
#5137

Mosaic3D: Foundation Dataset and Model for Open-Vocabulary 3D Segmentation

Junha Lee, Chunghyun Park, Jaesung Choe et al.

CVPR 2025 • poster • arXiv:2502.02548
#5138

Channel-wise Noise Scheduled Diffusion for Inverse Rendering in Indoor Scenes

JunYong Choi, Min-Cheol Sagong, SeokYeong Lee et al.

CVPR 2025 • poster • arXiv:2503.09993
#5139

MobileH2R: Learning Generalizable Human to Mobile Robot Handover Exclusively from Scalable and Diverse Synthetic Data

Zifan Wang, Ziqing Chen, Junyu Chen et al.

CVPR 2025 • poster • arXiv:2501.04595
#5140

Is this Generated Person Existed in Real-world? Fine-grained Detecting and Calibrating Abnormal Human-body

Zeqing Wang, Qingyang Ma, Wentao Wan et al.

CVPR 2025 • highlight • arXiv:2411.14205
#5141

Reconstructing Close Human Interaction with Appearance and Proxemics Reasoning

Buzhen Huang, Chen Li, Chongyang Xu et al.

CVPR 2025 • poster • arXiv:2507.02565
#5142

Discovering Fine-Grained Visual-Concept Relations by Disentangled Optimal Transport Concept Bottleneck Models

Yan Xie, Zequn Zeng, Hao Zhang et al.

CVPR 2025 • poster • arXiv:2505.07209
#5143

ODCR: Orthogonal Decoupling Contrastive Regularization for Unpaired Image Dehazing

Zhongze Wang, Haitao Zhao, Jingchao Peng et al.

CVPR 2024 • poster • arXiv:2404.17825
#5144

OSMamba: Omnidirectional Spectral Mamba with Dual-Domain Prior Generator for Exposure Correction

Gehui Li, Bin Chen, Chen Zhao et al.

CVPR 2025 • poster • arXiv:2411.15255
#5145

M3GYM: A Large-Scale Multimodal Multi-view Multi-person Pose Dataset for Fitness Activity Understanding in Real-world Settings

Qingzheng Xu, Ru Cao, Xin Shen et al.

CVPR 2025 • poster
#5146

Star with Bilinear Mapping

Zelin Peng, Yu Huang, Zhengqin Xu et al.

CVPR 2025 • poster
#5147

RUBIK: A Structured Benchmark for Image Matching across Geometric Challenges

Thibaut Loiseau, Guillaume Bourmaud

CVPR 2025 • poster • arXiv:2502.19955
#5148

Rethinking Temporal Fusion with a Unified Gradient Descent View for 3D Semantic Occupancy Prediction

Dubing Chen, Huan Zheng, Jin Fang et al.

CVPR 2025 • poster • arXiv:2504.12959
#5149

Shape and Texture: What Influences Reliable Optical Flow Estimation?

Libo Long, Xiao Hu, Jochen Lang

CVPR 2025 • poster
#5150

AI-Face: A Million-Scale Demographically Annotated AI-Generated Face Dataset and Fairness Benchmark

Li Lin, Santosh Santosh, Mingyang Wu et al.

CVPR 2025 • poster • arXiv:2406.00783
#5151

Attribute-formed Class-specific Concept Space: Endowing Language Bottleneck Model with Better Interpretability and Scalability

Jianyang Zhang, Qianli Luo, Guowu Yang et al.

CVPR 2025 • poster • arXiv:2503.20301
#5152

Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks

Wei-Jin Huang, Yuan-Ming Li, Zhi-Wei Xia et al.

CVPR 2025 • poster • arXiv:2503.22405
#5153

LeanGaussian: Breaking Pixel or Point Cloud Correspondence in Modeling 3D Gaussians

Jiamin WU, Kenkun Liu, Han Gao et al.

CVPR 2025 • poster • arXiv:2404.16323
#5154

Embodied Scene Understanding for Vision Language Models via MetaVQA

Weizhen Wang, Chenda Duan, Zhenghao Peng et al.

CVPR 2025 • poster • arXiv:2501.09167
#5155

Rate-In: Information-Driven Adaptive Dropout Rates for Improved Inference-Time Uncertainty Estimation

Tal Zeevi, Ravid Shwartz-Ziv, Yann LeCun et al.

CVPR 2025 • poster • arXiv:2412.07169
#5156

RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins

Yao Mu, Tianxing Chen, Zanxin Chen et al.

CVPR 2025 • highlight • arXiv:2504.13059
#5157

STEP: Enhancing Video-LLMs’ Compositional Reasoning by Spatio-Temporal Graph-guided Self-Training

Haiyi Qiu, Minghe Gao, Long Qian et al.

CVPR 2025 • poster • arXiv:2412.00161
#5158

Stop Learning it all to Mitigate Visual Hallucination, Focus on the Hallucination Target.

Dokyoon Yoon, Youngsook Song, Woomyoung Park

CVPR 2025 • poster • arXiv:2506.11417
#5159

Beyond Background Shift: Rethinking Instance Replay in Continual Semantic Segmentation

Hongmei Yin, Tingliang Feng, Fan Lyu et al.

CVPR 2025 • poster • arXiv:2503.22136
#5160

DKC: Differentiated Knowledge Consolidation for Cloth-Hybrid Lifelong Person Re-identification

Zhenyu Cui, Jiahuan Zhou, Yuxin Peng

CVPR 2025 • poster
#5161

HOT: Hadamard-based Optimized Training

Seonggon Kim, Juncheol Shin, Seung-taek Woo et al.

CVPR 2025 • poster • arXiv:2503.21261
#5162

Learning Flow Fields in Attention for Controllable Person Image Generation

Zijian Zhou, Shikun Liu, Xiao Han et al.

CVPR 2025 • poster • arXiv:2412.08486
#5163

Rectification-specific Supervision and Constrained Estimator for Online Stereo Rectification

Rui Gong, Kim-Hui Yap, Weide Liu et al.

CVPR 2025 • poster
#5164

Dual Focus-Attention Transformer for Robust Point Cloud Registration

Kexue Fu, Ming'zhi Yuan, Changwei Wang et al.

CVPR 2025 • poster
#5165

Forming Auxiliary High-confident Instance-level Loss to Promote Learning from Label Proportions

Tianhao Ma, Han Chen, Juncheng Hu et al.

CVPR 2025 • poster • arXiv:2411.10364
#5166

Towards Open-Vocabulary Audio-Visual Event Localization

Jinxing Zhou, Dan Guo, Ruohao Guo et al.

CVPR 2025 • poster • arXiv:2411.11278
#5167

Align-KD: Distilling Cross-Modal Alignment Knowledge for Mobile Vision-Language Large Model Enhancement

Qianhan Feng, Wenshuo Li, Tong Lin et al.

CVPR 2025 • poster
#5168

Traversing Distortion-Perception Tradeoff using a Single Score-Based Generative Model

Yuhan Wang, Suzhi Bi, Ying-Jun Angela Zhang et al.

CVPR 2025 • poster • arXiv:2503.20297
#5169

IceDiff: High Resolution and High-Quality Arctic Sea Ice Forecasting with Generative Diffusion Prior

Jingyi Xu, Siwei Tu, Weidong Yang et al.

CVPR 2025 • poster
#5170

Coeff-Tuning: A Graph Filter Subspace View for Tuning Attention-Based Large Models

Zichen Miao, WEI CHEN, Qiang Qiu

CVPR 2025 • highlight • arXiv:2503.18337
#5171

MVBoost: Boost 3D Reconstruction with Multi-View Refinement

Xiangyu Liu, Xiaomei Zhang, Zhiyuan Ma et al.

CVPR 2025 • poster • arXiv:2411.17772
#5172

Think Small, Act Big: Primitive Prompt Learning for Lifelong Robot Manipulation

Yuanqi Yao, Siao Liu, Haoming Song et al.

CVPR 2025 • poster • arXiv:2504.00420
#5173

BEVDiffuser: Plug-and-Play Diffusion Model for BEV Denoising with Ground-Truth Guidance

Xin Ye, Burhan Yaman, Sheng Cheng et al.

CVPR 2025 • highlight • arXiv:2502.19694
#5174

SpatialLLM: A Compound 3D-Informed Design towards Spatially-Intelligent Large Multimodal Models

Wufei Ma, Luoxin Ye, Nessa McWeeney et al.

CVPR 2025 • highlight • arXiv:2505.00788
#5175

Leveraging Perturbation Robustness to Enhance Out-of-Distribution Detection

Wenxi Chen, Raymond A. Yeh, Shaoshuai Mou et al.

CVPR 2025 • poster • arXiv:2503.18784
#5176

Discovering Hidden Visual Concepts Beyond Linguistic Input in Infant Learning

Xueyi Ke, Satoshi Tsutsui, Yayun Zhang et al.

CVPR 2025 • poster • arXiv:2501.05205
#5177

Semi-Supervised State-Space Model with Dynamic Stacking Filter for Real-World Video Deraining

Shangquan Sun, Wenqi Ren, Juxiang Zhou et al.

CVPR 2025 • poster • arXiv:2505.16811
#5178

MobilePortrait: Real-Time One-Shot Neural Head Avatars on Mobile Devices

Jianwen Jiang, Gaojie Lin, Zhengkun Rong et al.

CVPR 2025 • poster • arXiv:2407.05712
#5179

Jailbreaking the Non-Transferable Barrier via Test-Time Data Disguising

Yongli Xiang, Ziming Hong, Lina Yao et al.

CVPR 2025 • poster • arXiv:2503.17198
#5180

MobileMamba: Lightweight Multi-Receptive Visual Mamba Network

Haoyang He, Jiangning Zhang, Yuxuan Cai et al.

CVPR 2025 • poster • arXiv:2411.15941
#5181

EarthDial: Turning Multi-sensory Earth Observations to Interactive Dialogues

Sagar Soni, Akshay Dudhane, Hiyam Debary et al.

CVPR 2025 • poster • arXiv:2412.15190
#5182

Learning Endogenous Attention for Incremental Object Detection

Xiang Song, Yuhang He, Jingyuan Li et al.

CVPR 2025 • poster
#5183

Learning Textual Prompts for Open-World Semi-Supervised Learning

Yuxin Fan, Junbiao Cui, Jiye Liang

CVPR 2025 • poster
#5184

Aesthetic Post-Training Diffusion Models from Generic Preferences with Step-by-step Preference Optimization

Zhanhao Liang, Yuhui Yuan, Shuyang Gu et al.

CVPR 2025 • poster • arXiv:2406.04314
#5185

Beyond Clean Training Data: A Versatile and Model-Agnostic Framework for Out-of-Distribution Detection with Contaminated Training Data

Yuchuan Li, Jae-Mo Kang, Il-Min Kim

CVPR 2025 • poster
#5186

Minimizing Labeled, Maximizing Unlabeled: An Image-Driven Approach for Video Instance Segmentation

Fangyun Wei, Jinjing Zhao, Kun Yan et al.

CVPR 2025 • poster
#5187

VinaBench: Benchmark for Faithful and Consistent Visual Narratives

Silin Gao, Sheryl Mathew, Li Mi et al.

CVPR 2025 • poster • arXiv:2503.20871
#5188

DeClotH: Decomposable 3D Cloth and Human Body Reconstruction from a Single Image

Hyeongjin Nam, Donghwan Kim, Jeongtaek Oh et al.

CVPR 2025 • poster • arXiv:2503.19373
#5189

Learning to Sample Effective and Diverse Prompts for Text-to-Image Generation

Taeyoung Yun, Dinghuai Zhang, Jinkyoo Park et al.

CVPR 2025 • poster • arXiv:2502.11477
#5190

VoxelSplat: Dynamic Gaussian Splatting as an Effective Loss for Occupancy and Flow Prediction

Ziyue Zhu, Shenlong Wang, Jin Xie et al.

CVPR 2025 • poster • arXiv:2506.05563
#5191

CARL: A Framework for Equivariant Image Registration

Hastings Greer, Lin Tian, François-Xavier Vialard et al.

CVPR 2025 • poster • arXiv:2405.16738
#5192

Perceptual Inductive Bias Is What You Need Before Contrastive Learning

Junru Zhao, Tianqin Li, Dunhan Jiang et al.

CVPR 2025 • poster • arXiv:2506.01201
#5193

SOGS: Second-Order Anchor for Advanced 3D Gaussian Splatting

Jiahui Zhang, Fangneng Zhan, Ling Shao et al.

CVPR 2025 • poster • arXiv:2503.07476
#5194

Chapter-Llama: Efficient Chaptering in Hour-Long Videos with LLMs

Lucas Ventura, Antoine Yang, Cordelia Schmid et al.

CVPR 2025 • poster • arXiv:2504.00072
#5195

Enhancing Few-Shot Class-Incremental Learning via Training-Free Bi-Level Modality Calibration

Yiyang Chen, Tianyu Ding, Lei Wang et al.

CVPR 2025 • poster
#5196

AeroGen: Enhancing Remote Sensing Object Detection with Diffusion-Driven Data Generation

Datao Tang, Xiangyong Cao, Xuan Wu et al.

CVPR 2025 • poster • arXiv:2411.15497
#5197

UniSTD: Towards Unified Spatio-Temporal Learning across Diverse Disciplines

Chen Tang, Xinzhu Ma, Encheng Su et al.

CVPR 2025 • poster • arXiv:2503.20748
#5198

Dynamic Prompt Optimizing for Text-to-Image Generation

Wenyi Mo, Tianyu Zhang, Yalong Bai et al.

CVPR 2024 • poster • arXiv:2404.04095
#5199

Robust Audio-Visual Segmentation via Audio-Guided Visual Convergent Alignment

Chen Liu, Peike Li, Liying Yang et al.

CVPR 2025 • poster • arXiv:2503.12847
#5200

Animate and Sound an Image

Xihua Wang, Ruihua Song, Chongxuan Li et al.

CVPR 2025 • poster