Most Cited CVPR "recurrent propagation module" Papers

5,589 papers found • Page 21 of 28

#4001

UniGoal: Towards Universal Zero-shot Goal-oriented Navigation

Hang Yin, Xiuwei Xu, Linqing Zhao et al.

CVPR 2025 • poster • arXiv:2503.10630
#4002

TIGER: Time-Varying Denoising Model for 3D Point Cloud Generation with Diffusion Process

Zhiyuan Ren, Minchul Kim, Feng Liu et al.

CVPR 2024 • poster
#4003

Q-Bench-Video: Benchmark the Video Quality Understanding of LMMs

Zicheng Zhang, Ziheng Jia, Haoning Wu et al.

CVPR 2025 • poster • arXiv:2409.20063
#4004

HOLD: Category-agnostic 3D Reconstruction of Interacting Hands and Objects from Video

Zicong Fan, Maria Parelli, Maria Kadoglou et al.

CVPR 2024 • highlight • arXiv:2311.18448
#4005

Learning Continual Compatible Representation for Re-indexing Free Lifelong Person Re-identification

Zhenyu Cui, Jiahuan Zhou, Xun Wang et al.

CVPR 2024 • poster
#4006

LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation

Linfeng Yuan, Miaojing Shi, Zijie Yue et al.

CVPR 2024 • poster • arXiv:2306.08736
#4007

Alignment, Mining and Fusion: Representation Alignment with Hard Negative Mining and Selective Knowledge Fusion for Medical Visual Question Answering

Yuanhao Zou, Zhaozheng Yin

CVPR 2025 • poster • arXiv:2510.08791
#4008

ReDiffDet: Rotation-equivariant Diffusion Model for Oriented Object Detection

Jiaqi Zhao, Zeyu Ding, Yong Zhou et al.

CVPR 2025 • poster
#4009

Quilt-LLaVA: Visual Instruction Tuning by Extracting Localized Narratives from Open-Source Histopathology Videos

Mehmet Saygin Seyfioglu, Wisdom Ikezogwo, Fatemeh Ghezloo et al.

CVPR 2024 • poster • arXiv:2312.04746
#4010

X-Adapter: Adding Universal Compatibility of Plugins for Upgraded Diffusion Model

Lingmin Ran, Xiaodong Cun, Jia-Wei Liu et al.

CVPR 2024 • poster • arXiv:2312.02238
#4011

DeconfuseTrack: Dealing with Confusion for Multi-Object Tracking

Cheng Huang, Shoudong Han, Mengyu He et al.

CVPR 2024 • poster
#4012

Efficient Test-time Adaptive Object Detection via Sensitivity-Guided Pruning

Kunyu Wang, Xueyang Fu, Xin Lu et al.

CVPR 2025 • poster • arXiv:2506.02462
#4013

AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving

Mingfu Liang, Jong-Chyi Su, Samuel Schulter et al.

CVPR 2024 • poster • arXiv:2403.17373
#4014

Open-Vocabulary Attention Maps with Token Optimization for Semantic Segmentation in Diffusion Models

Pablo Marcos-Manchón, Roberto Alcover-Couso, Juan SanMiguel et al.

CVPR 2024 • poster • arXiv:2403.14291
#4015

Attentive Illumination Decomposition Model for Multi-Illuminant White Balancing

Dongyoung Kim, Jinwoo Kim, Junsang Yu et al.

CVPR 2024 • poster • arXiv:2402.18277
#4016

Visual-Augmented Dynamic Semantic Prototype for Generative Zero-Shot Learning

Wenjin Hou, Shiming Chen, Shuhuang Chen et al.

CVPR 2024 • poster • arXiv:2404.14808
#4017

A&B BNN: Add&Bit-Operation-Only Hardware-Friendly Binary Neural Network

Ruichen Ma, Guanchao Qiao, Yian Liu et al.

CVPR 2024 • poster • arXiv:2403.03739
#4018

Joint Optimization of Neural Radiance Fields and Continuous Camera Motion from a Monocular Video

Hoang Chuong Nguyen, Wei Mao, Jose M. Alvarez et al.

CVPR 2025 • poster • arXiv:2504.19819
#4019

PMNI: Pose-free Multi-view Normal Integration for Reflective and Textureless Surface Reconstruction

Mingzhi Pei, Xu Cao, Xiangyi Wang et al.

CVPR 2025 • poster • arXiv:2504.08410
#4020

OpenESS: Event-based Semantic Scene Understanding with Open Vocabularies

Lingdong Kong, Youquan Liu, Lai Xing Ng et al.

CVPR 2024 • highlight • arXiv:2405.05259
#4021

Z*: Zero-shot Style Transfer via Attention Reweighting

Yingying Deng, Xiangyu He, Fan Tang et al.

CVPR 2024 • poster
#4022

Video-Bench: Human-Aligned Video Generation Benchmark

Hui Han, Siyuan Li, Jiaqi Chen et al.

CVPR 2025 • poster • arXiv:2504.04907
#4023

Dual Consolidation for Pre-Trained Model-Based Domain-Incremental Learning

Da-Wei Zhou, Zi-Wen Cai, Han-Jia Ye et al.

CVPR 2025 • poster • arXiv:2410.00911
#4024

MFogHub: Bridging Multi-Regional and Multi-Satellite Data for Global Marine Fog Detection and Forecasting

Mengqiu Xu, Kaixin Chen, Heng Guo et al.

CVPR 2025 • poster • arXiv:2505.10281
#4025

G-HOP: Generative Hand-Object Prior for Interaction Reconstruction and Grasp Synthesis

Yufei Ye, Abhinav Gupta, Kris Kitani et al.

CVPR 2024 • poster • arXiv:2404.12383
#4026

3D Face Reconstruction with the Geometric Guidance of Facial Part Segmentation

Zidu Wang, Xiangyu Zhu, Tianshuo Zhang et al.

CVPR 2024 • highlight • arXiv:2312.00311
#4027

Spike-guided Motion Deblurring with Unknown Modal Spatiotemporal Alignment

Jiyuan Zhang, Shiyan Chen, Yajing Zheng et al.

CVPR 2024 • poster
#4028

Multi-Task Dense Prediction via Mixture of Low-Rank Experts

Yuqi Yang, Peng-Tao Jiang, Qibin Hou et al.

CVPR 2024 • poster • arXiv:2403.17749
#4029

COBRA: COmBinatorial Retrieval Augmentation for Few-Shot Adaptation

Arnav Mohanty Das, Gantavya Bhatt, Lilly Kumari et al.

CVPR 2025 • poster • arXiv:2412.17684
#4030

A Bayesian Approach to OOD Robustness in Image Classification

Prakhar Kaushik, Adam Kortylewski, Alan L. Yuille

CVPR 2024 • poster • arXiv:2403.07277
#4031

ChatPose: Chatting about 3D Human Pose

Yao Feng, Jing Lin, Sai Kumar Dwivedi et al.

CVPR 2024 • poster • arXiv:2311.18836
#4032

ConCon-Chi: Concept-Context Chimera Benchmark for Personalized Vision-Language Tasks

Andrea Rosasco, Stefano Berti, Giulia Pasquale et al.

CVPR 2024 • poster
#4033

Instance-aware Contrastive Learning for Occluded Human Mesh Reconstruction

Mi-Gyeong Gwon, Gi-Mun Um, Won-Sik Cheong et al.

CVPR 2024 • poster
#4034

Modality-agnostic Domain Generalizable Medical Image Segmentation by Multi-Frequency in Multi-Scale Attention

Ju-Hyeon Nam, Nur Suriza Syazwany, Su Jung Kim et al.

CVPR 2024 • poster • arXiv:2405.06284
#4035

All-directional Disparity Estimation for Real-world QPD Images

Hongtao Yu, Shaohui Song, Lihu Sun et al.

CVPR 2025 • highlight
#4036

CARE Transformer: Mobile-Friendly Linear Visual Transformer via Decoupled Dual Interaction

Yuan Zhou, Qingshan Xu, Jiequan Cui et al.

CVPR 2025 • highlight • arXiv:2411.16170
#4037

DITTO: Dual and Integrated Latent Topologies for Implicit 3D Reconstruction

Jaehyeok Shim, Kyungdon Joo

CVPR 2024 • poster • arXiv:2403.05005
#4038

NC-TTT: A Noise Contrastive Approach for Test-Time Training

David Osowiechi, Gustavo Vargas Hakim, Mehrdad Noori et al.

CVPR 2024 • highlight
#4039

VidHalluc: Evaluating Temporal Hallucinations in Multimodal Large Language Models for Video Understanding

Chaoyu Li, Eun Woo Im, Pooyan Fazli

CVPR 2025 • poster • arXiv:2412.03735
#4040

VL2Lite: Task-Specific Knowledge Distillation from Large Vision-Language Models to Lightweight Networks

Jinseong Jang, Chunfei Ma, Byeongwon Lee

CVPR 2025 • poster
#4041

Filter Images First, Generate Instructions Later: Pre-Instruction Data Selection for Visual Instruction Tuning

Bardia Safaei, Faizan Siddiqui, Jiacong Xu et al.

CVPR 2025 • highlight • arXiv:2503.07591
#4042

Black Swan: Abductive and Defeasible Video Reasoning in Unpredictable Events

Aditya Chinchure, Sahithya Ravi, Raymond Ng et al.

CVPR 2025 • poster • arXiv:2412.05725
#4043

UniMODE: Unified Monocular 3D Object Detection

Zhuoling Li, Xiaogang Xu, Ser-Nam Lim et al.

CVPR 2024 • highlight
#4044

Semantic-guided Cross-Modal Prompt Learning for Skeleton-based Zero-shot Action Recognition

Anqi Zhu, Jingmin Zhu, James Bailey et al.

CVPR 2025 • poster
#4045

Habitat Synthetic Scenes Dataset (HSSD-200): An Analysis of 3D Scene Scale and Realism Tradeoffs for ObjectGoal Navigation

Mukul Khanna, Yongsen Mao, Hanxiao Jiang et al.

CVPR 2024 • poster • arXiv:2306.11290
#4046

Sculpting Holistic 3D Representation in Contrastive Language-Image-3D Pre-training

Yipeng Gao, Zeyu Wang, Wei-Shi Zheng et al.

CVPR 2024 • poster • arXiv:2311.01734
#4047

KeyPoint Relative Position Encoding for Face Recognition

Minchul Kim, Feng Liu, Yiyang Su et al.

CVPR 2024 • poster • arXiv:2403.14852
#4048

QDFormer: Towards Robust Audiovisual Segmentation in Complex Environments with Quantization-based Semantic Decomposition

Xiang Li, Jinglu Wang, Xiaohao Xu et al.

CVPR 2024 • poster • arXiv:2310.00132
#4049

CraftsMan3D: High-fidelity Mesh Generation with 3D Native Diffusion and Interactive Geometry Refiner

Weiyu Li, Jiarui Liu, Hongyu Yan et al.

CVPR 2025 • poster
#4050

SEEN-DA: SEmantic ENtropy guided Domain-aware Attention for Domain Adaptive Object Detection

Haochen Li, Rui Zhang, Hantao Yao et al.

CVPR 2025 • poster
#4051

From a Bird's Eye View to See: Joint Camera and Subject Registration without the Camera Calibration

Zekun Qian, Ruize Han, Wei Feng et al.

CVPR 2024 • poster • arXiv:2212.09298
#4052

Seeking Consistent Flat Minima for Better Domain Generalization via Refining Loss Landscapes

Aodi Li, Liansheng Zhuang, Xiao Long et al.

CVPR 2025 • poster • arXiv:2412.13573
#4053

WAVE: Weight Templates for Adaptive Initialization of Variable-sized Models

Fu Feng, Yucheng Xie, Jing Wang et al.

CVPR 2025 • poster • arXiv:2406.17503
#4054

Joint2Human: High-Quality 3D Human Generation via Compact Spherical Embedding of 3D Joints

Muxin Zhang, Qiao Feng, Zhuo Su et al.

CVPR 2024 • poster • arXiv:2312.08591
#4055

RoadSocial: A Diverse VideoQA Dataset and Benchmark for Road Event Understanding from Social Video Narratives

Chirag Parikh, Deepti Rawat, Rakshitha R. T. et al.

CVPR 2025 • poster • arXiv:2503.21459
#4056

Investigating Compositional Challenges in Vision-Language Models for Visual Grounding

Yunan Zeng, Yan Huang, Jinjin Zhang et al.

CVPR 2024 • highlight
#4057

SDPose: Tokenized Pose Estimation via Circulation-Guide Self-Distillation

Chen Sichen, Yingyi Zhang, Siming Huang et al.

CVPR 2024 • poster • arXiv:2404.03518
#4058

ChainHOI: Joint-based Kinematic Chain Modeling for Human-Object Interaction Generation

Ling-An Zeng, Guohong Huang, Yi-Lin Wei et al.

CVPR 2025 • poster • arXiv:2503.13130
#4059

FlowVid: Taming Imperfect Optical Flows for Consistent Video-to-Video Synthesis

Feng Liang, Bichen Wu, Jialiang Wang et al.

CVPR 2024 • highlight • arXiv:2312.17681
#4060

DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving

Chen Min, Dawei Zhao, Liang Xiao et al.

CVPR 2024 • poster • arXiv:2405.04390
#4061

Decoupled Distillation to Erase: A General Unlearning Method for Any Class-centric Tasks

Yu Zhou, Dian Zheng, Qijie Mo et al.

CVPR 2025 • highlight • arXiv:2503.23751
#4062

Accept the Modality Gap: An Exploration in the Hyperbolic Space

Sameera Ramasinghe, Violetta Shevchenko, Gil Avraham et al.

CVPR 2024 • highlight
#4063

MirageRoom: 3D Scene Segmentation with 2D Pre-trained Models by Mirage Projection

Haowen Sun, Yueqi Duan, Juncheng Yan et al.

CVPR 2024 • highlight
#4064

ABBSPO: Adaptive Bounding Box Scaling and Symmetric Prior based Orientation Prediction for Detecting Aerial Image Objects

Woojin Lee, Hyugjae Chang, Jaeho Moon et al.

CVPR 2025 • poster • arXiv:2512.10031
#4065

Advancing Multiple Instance Learning with Continual Learning for Whole Slide Imaging

Xianrui Li, Yufei Cui, Jun Li et al.

CVPR 2025 • highlight • arXiv:2505.10649
#4066

Segment Anything, Even Occluded

Wei-En Tai, Yu-Lin Shih, Cheng Sun et al.

CVPR 2025 • poster • arXiv:2503.06261
#4067

ReWind: Understanding Long Videos with Instructed Learnable Memory

Anxhelo Diko, Tinghuai Wang, Wassim Swaileh et al.

CVPR 2025 • poster • arXiv:2411.15556
#4068

Cross-View Completion Models are Zero-shot Correspondence Estimators

Honggyu An, Jin Hyeon Kim, Seonghoon Park et al.

CVPR 2025 • highlight • arXiv:2412.09072
#4069

EIDT-V: Exploiting Intersections in Diffusion Trajectories for Model-Agnostic, Zero-Shot, Training-Free Text-to-Video Generation

Diljeet Jagpal, Xi Chen, Vinay P. Namboodiri

CVPR 2025 • poster • arXiv:2504.06861
#4070

CAD: Photorealistic 3D Generation via Adversarial Distillation

Ziyu Wan, Despoina Paschalidou, Ian Huang et al.

CVPR 2024 • poster • arXiv:2312.06663
#4071

Scaling up Image Segmentation across Data and Tasks

Pei Wang, Zhaowei Cai, Hao Yang et al.

CVPR 2025 • poster
#4072

Mind the Gap: Confidence Discrepancy Can Guide Federated Semi-Supervised Learning Across Pseudo-Mismatch

Yijie Liu, Xinyi Shang, Yiqun Zhang et al.

CVPR 2025 • poster • arXiv:2503.13227
#4073

Linguistics-aware Masked Image Modeling for Self-supervised Scene Text Recognition

Yifei Zhang, Chang Liu, Jin Wei et al.

CVPR 2025 • poster • arXiv:2503.18746
#4074

Perturbing Attention Gives You More Bang for the Buck: Subtle Imaging Perturbations That Efficiently Fool Customized Diffusion Models

Jingyao Xu, Yuetong Lu, Yandong Li et al.

CVPR 2024 • poster • arXiv:2404.15081
#4075

TriTex: Learning Texture from a Single Mesh via Triplane Semantic Features

Dana Cohen-Bar, Daniel Cohen-Or, Gal Chechik et al.

CVPR 2025 • poster • arXiv:2503.16630
#4076

CityDreamer: Compositional Generative Model of Unbounded 3D Cities

Haozhe Xie, Zhaoxi Chen, Fangzhou Hong et al.

CVPR 2024 • poster • arXiv:2309.00610
#4077

Noisy-Correspondence Learning for Text-to-Image Person Re-identification

Yang Qin, Yingke Chen, Dezhong Peng et al.

CVPR 2024 • poster • arXiv:2308.09911
#4078

Do We Always Need the Simplicity Bias? Looking for Optimal Inductive Biases in the Wild

Damien Teney, Liangze Jiang, Florin Gogianu et al.

CVPR 2025 • poster • arXiv:2503.10065
#4079

ColabSfM: Collaborative Structure-from-Motion by Point Cloud Registration

Johan Edstedt, André Mateus, Alberto Jaenal

CVPR 2025 • poster • arXiv:2503.17093
#4080

Random Entangled Tokens for Adversarially Robust Vision Transformer

Huihui Gong, Minjing Dong, Siqi Ma et al.

CVPR 2024 • poster
#4081

PNeRV: Enhancing Spatial Consistency via Pyramidal Neural Representation for Videos

Qi Zhao, M. Salman Asif, Zhan Ma

CVPR 2024 • poster • arXiv:2404.08921
#4082

OST: Refining Text Knowledge with Optimal Spatio-Temporal Descriptor for General Video Recognition

Tongjia Chen, Hongshan Yu, Zhengeng Yang et al.

CVPR 2024 • poster • arXiv:2312.00096
#4083

DYSON: Dynamic Feature Space Self-Organization for Online Task-Free Class Incremental Learning

Yuhang He, YingJie Chen, Yuhan Jin et al.

CVPR 2024 • poster
#4084

Harnessing Large Language Models for Training-free Video Anomaly Detection

Luca Zanella, Willi Menapace, Massimiliano Mancini et al.

CVPR 2024 • poster • arXiv:2404.01014
#4085

AnyMap: Learning a General Camera Model for Structure-from-Motion with Unknown Distortion in Dynamic Scenes

Andrea Porfiri Dal Cin, Georgi Dikov, Jihong Ju et al.

CVPR 2025 • poster
#4086

VideoComp: Advancing Fine-Grained Compositional and Temporal Alignment in Video-Text Models

Dahun Kim, AJ Piergiovanni, Ganesh Satish Mallya et al.

CVPR 2025 • poster • arXiv:2504.03970
#4087

Continuous Pose for Monocular Cameras in Neural Implicit Representation

Qi Ma, Danda Paudel, Ajad Chhatkuli et al.

CVPR 2024 • poster • arXiv:2311.17119
#4088

DreamCache: Finetuning-Free Lightweight Personalized Image Generation via Feature Caching

Emanuele Aiello, Umberto Michieli, Diego Valsesia et al.

CVPR 2025 • poster • arXiv:2411.17786
#4089

Learned Trajectory Embedding for Subspace Clustering

Yaroslava Lochman, Christopher Zach, Carl Olsson

CVPR 2024 • poster
#4090

Shining Yourself: High-Fidelity Ornaments Virtual Try-on with Diffusion Model

Yingmao Miao, Zhanpeng Huang, Rui Han et al.

CVPR 2025 • poster • arXiv:2503.16065
#4091

BadCLIP: Dual-Embedding Guided Backdoor Attack on Multimodal Contrastive Learning

Siyuan Liang, Mingli Zhu, Aishan Liu et al.

CVPR 2024 • highlight • arXiv:2311.12075
#4092

Rethinking the Objectives of Vector-Quantized Tokenizers for Image Synthesis

Yuchao Gu, Xintao Wang, Yixiao Ge et al.

CVPR 2024 • poster • arXiv:2212.03185
#4093

Weakly Supervised Video Individual Counting

Xinyan Liu, Guorong Li, Yuankai Qi et al.

CVPR 2024 • poster
#4094

FairRAG: Fair Human Generation via Fair Retrieval Augmentation

Robik Shrestha, Yang Zou, Qiuyu Chen et al.

CVPR 2024 • poster • arXiv:2403.19964
#4095

MicroDiffusion: Implicit Representation-Guided Diffusion for 3D Reconstruction from Limited 2D Microscopy Projections

Mude Hui, Zihao Wei, Hongru Zhu et al.

CVPR 2024 • poster • arXiv:2403.10815
#4096

Learning Inclusion Matching for Animation Paint Bucket Colorization

Yuekun Dai, Shangchen Zhou, Blake Li et al.

CVPR 2024 • poster • arXiv:2403.18342
#4097

ESCAPE: Encoding Super-keypoints for Category-Agnostic Pose Estimation

Khoi D Nguyen, Chen Li, Gim Hee Lee

CVPR 2024 • poster
#4098

Higher-Order Ratio Cycles for Fast and Globally Optimal Shape Matching

Paul Roetzer, Viktoria Ehm, Daniel Cremers et al.

CVPR 2025 • poster
#4099

LucidDreamer: Towards High-Fidelity Text-to-3D Generation via Interval Score Matching

Yixun Liang, Xin Yang, Jiantao Lin et al.

CVPR 2024 • highlight • arXiv:2311.11284
#4100

Preserving Fairness Generalization in Deepfake Detection

Li Lin, Xinan He, Yan Ju et al.

CVPR 2024 • poster • arXiv:2402.17229
#4101

RepViT: Revisiting Mobile CNN From ViT Perspective

Ao Wang, Hui Chen, Zijia Lin et al.

CVPR 2024 • poster • arXiv:2307.09283
#4102

Improved Implicit Neural Representation with Fourier Reparameterized Training

Kexuan Shi, Xingyu Zhou, Shuhang Gu

CVPR 2024 • poster • arXiv:2401.07402
#4103

Gradient Alignment for Cross-Domain Face Anti-Spoofing

Minh Binh Le, Simon Woo

CVPR 2024 • poster • arXiv:2402.18817
#4104

U-VAP: User-specified Visual Appearance Personalization via Decoupled Self Augmentation

You Wu, Kean Liu, Xiaoyue Mi et al.

CVPR 2024 • poster • arXiv:2403.20231
#4105

Chebyshev Attention Depth Permutation Texture Network with Latent Texture Attribute Loss

Ravishankar Evani, Deepu Rajan, Shangbo Mao

CVPR 2025 • poster
#4106

Noise-Consistent Siamese-Diffusion for Medical Image Synthesis and Segmentation

Kunpeng Qiu, Zhiqiang Gao, Zhiying Zhou et al.

CVPR 2025 • poster • arXiv:2505.06068
#4107

Task-Specific Gradient Adaptation for Few-Shot One-Class Classification

Yunlong Li, Xiabi Liu, Liyuan Pan et al.

CVPR 2025 • poster
#4108

Generalizable Object Keypoint Localization from Generative Priors

Dongkai Wang, Jiang Duan, Liangjian Wen et al.

CVPR 2025 • poster
#4109

Glossy Object Reconstruction with Cost-effective Polarized Acquisition

Bojian Wu, Yifan Peng, Ruizhen Hu et al.

CVPR 2025 • highlight • arXiv:2504.07025
#4110

Insights from the Use of Previously Unseen Neural Architecture Search Datasets

Rob Geada, David Towers, Matthew Forshaw et al.

CVPR 2024 • poster • arXiv:2404.02189
#4111

A Pedestrian is Worth One Prompt: Towards Language Guidance Person Re-Identification

Zexian Yang, Dayan Wu, Chenming Wu et al.

CVPR 2024 • highlight
#4112

Towards Universal Dataset Distillation via Task-Driven Diffusion

Ding Qi, Jian Li, Junyao Gao et al.

CVPR 2025 • poster
#4113

Layout-Agnostic Scene Text Image Synthesis with Diffusion Models

Qilong Zhangli, Jindong Jiang, Di Liu et al.

CVPR 2024 • poster • arXiv:2406.01062
#4114

Towards Explicit Geometry-Reflectance Collaboration for Generalized LiDAR Segmentation in Adverse Weather

Longyu Yang, Ping Hu, Shangbo Yuan et al.

CVPR 2025 • poster • arXiv:2506.02396
#4115

CoDi: Conditional Diffusion Distillation for Higher-Fidelity and Faster Image Generation

Kangfu Mei, Mauricio Delbracio, Hossein Talebi et al.

CVPR 2024 • poster • arXiv:2310.01407
#4116

PillarHist: A Quantization-aware Pillar Feature Encoder based on Height-aware Histogram

Sifan Zhou, Zhihang Yuan, Dawei Yang et al.

CVPR 2025 • poster
#4117

From SAM to CAMs: Exploring Segment Anything Model for Weakly Supervised Semantic Segmentation

Hyeokjun Kweon, Kuk-Jin Yoon

CVPR 2024 • poster
#4118

Comprehensive Information Bottleneck for Unveiling Universal Attribution to Interpret Vision Transformers

Jung-Ho Hong, Ho-Joong Kim, Kyu-Sung Jeon et al.

CVPR 2025 • highlight • arXiv:2507.04388
#4119

Detect-and-Guide: Self-regulation of Diffusion Models for Safe Text-to-Image Generation via Guideline Token Optimization

Feifei Li, Mi Zhang, Yiming Sun et al.

CVPR 2025 • poster • arXiv:2503.15197
#4120

Classifier-guided CLIP Distillation for Unsupervised Multi-label Classification

Dongseob Kim, Hyunjung Shim

CVPR 2025 • poster • arXiv:2503.16873
#4121

MeGA: Hybrid Mesh-Gaussian Head Avatar for High-Fidelity Rendering and Head Editing

Cong Wang, Di Kang, Heyi Sun et al.

CVPR 2025 • poster • arXiv:2404.19026
#4122

ORIDa: Object-centric Real-world Image Composition Dataset

Jinwoo Kim, Sangmin Han, Jinho Jeong et al.

CVPR 2025 • poster • arXiv:2506.08964
#4123

Learning Physics From Video: Unsupervised Physical Parameter Estimation for Continuous Dynamical Systems

Alejandro Castañeda Garcia, Jan Warchocki, Jan van Gemert et al.

CVPR 2025 • poster • arXiv:2410.01376
#4124

Vlogger: Make Your Dream A Vlog

Shaobin Zhuang, Kunchang Li, Xinyuan Chen et al.

CVPR 2024 • poster • arXiv:2401.09414
#4125

EpiDiff: Enhancing Multi-View Synthesis via Localized Epipolar-Constrained Diffusion

Zehuan Huang, Hao Wen, Junting Dong et al.

CVPR 2024 • poster • arXiv:2312.06725
#4126

T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation

Kaiyue Sun, Kaiyi Huang, Xian Liu et al.

CVPR 2025 • poster • arXiv:2407.14505
#4127

IPoD: Implicit Field Learning with Point Diffusion for Generalizable 3D Object Reconstruction from Single RGB-D Images

Yushuang Wu, Luyue Shi, Junhao Cai et al.

CVPR 2024 • highlight • arXiv:2404.00269
#4128

Dragin3D: Image Editing by Dragging in 3D Space

Weiran Guang, Xiaoguang Gu, Mengqi Huang et al.

CVPR 2025 • poster
#4129

CoMatcher: Multi-View Collaborative Feature Matching

Jintao Zhang, Zimin Xia, Mingyue Dong et al.

CVPR 2025 • poster • arXiv:2504.01872
#4130

MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding

Bo He, Hengduo Li, Young Kyun Jang et al.

CVPR 2024 • poster • arXiv:2404.05726
#4131

SVGDreamer: Text Guided SVG Generation with Diffusion Model

XiMing Xing, Chuang Wang, Haitao Zhou et al.

CVPR 2024 • poster • arXiv:2312.16476
#4132

DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation

Wang Zhao, Yan-Pei Cao, Jiale Xu et al.

CVPR 2025 • poster • arXiv:2412.15200
#4133

Dual Prototype Attention for Unsupervised Video Object Segmentation

Suhwan Cho, Minhyeok Lee, Seunghoon Lee et al.

CVPR 2024 • poster • arXiv:2211.12036
#4134

R-Cyclic Diffuser: Reductive and Cyclic Latent Diffusion for 3D Clothed Human Digitalization

Kennard Chan, Fayao Liu, Guosheng Lin et al.

CVPR 2024 • poster
#4135

Contrastive Mean-Shift Learning for Generalized Category Discovery

Sua Choi, Dahyun Kang, Minsu Cho

CVPR 2024 • poster • arXiv:2404.09451
#4136

DiffusionDrive: Truncated Diffusion Model for End-to-End Autonomous Driving

Bencheng Liao, Shaoyu Chen, Haoran Yin et al.

CVPR 2025 • highlight • arXiv:2411.15139
#4137

Panacea: Panoramic and Controllable Video Generation for Autonomous Driving

Yuqing Wen, Yucheng Zhao, Yingfei Liu et al.

CVPR 2024 • poster • arXiv:2408.07605
#4138

Flexible Biometrics Recognition: Bridging the Multimodality Gap through Attention Alignment and Prompt Tuning

Leslie Ching Ow Tiong, Dick Sigmund, Chen-Hui Chan et al.

CVPR 2024 • poster
#4139

Beyond Local Sharpness: Communication-Efficient Global Sharpness-aware Minimization for Federated Learning

Debora Caldarola, Pietro Cagnasso, Barbara Caputo et al.

CVPR 2025 • poster • arXiv:2412.03752
#4140

Dual-Interrelated Diffusion Model for Few-Shot Anomaly Image Generation

Ying Jin, Jinlong Peng, Qingdong He et al.

CVPR 2025 • poster • arXiv:2408.13509
#4141

Towards Variable and Coordinated Holistic Co-Speech Motion Generation

Yifei Liu, Qiong Cao, Yandong Wen et al.

CVPR 2024 • poster • arXiv:2404.00368
#4142

Minimal Perspective Autocalibration

Andrea Porfiri Dal Cin, Timothy Duff, Luca Magri et al.

CVPR 2024 • poster • arXiv:2405.05605
#4143

VASparse: Towards Efficient Visual Hallucination Mitigation via Visual-Aware Token Sparsification

Xianwei Zhuang, Zhihong Zhu, Yuxin Xie et al.

CVPR 2025 • poster • arXiv:2501.06553
#4144

ReGenNet: Towards Human Action-Reaction Synthesis

Liang Xu, Yizhou Zhou, Yichao Yan et al.

CVPR 2024 • poster • arXiv:2403.11882
#4145

Koala-36M: A Large-scale Video Dataset Improving Consistency between Fine-grained Conditions and Video Content

Qiuheng Wang, Yukai Shi, Jiarong Ou et al.

CVPR 2025 • poster • arXiv:2410.08260
#4146

VSRD: Instance-Aware Volumetric Silhouette Rendering for Weakly Supervised 3D Object Detection

Zihua Liu, Hiroki Sakuma, Masatoshi Okutomi

CVPR 2024 • poster • arXiv:2404.00149
#4147

Class Incremental Learning with Multi-Teacher Distillation

Haitao Wen, Lili Pan, Yu Dai et al.

CVPR 2024 • poster
#4148

AvatarArtist: Open-Domain 4D Avatarization

Hongyu Liu, Xuan Wang, Ziyu Wan et al.

CVPR 2025 • poster • arXiv:2503.19906
#4149

Parameter Efficient Self-Supervised Geospatial Domain Adaptation

Linus Scheibenreif, Michael Mommert, Damian Borth

CVPR 2024 • poster
#4150

ALGM: Adaptive Local-then-Global Token Merging for Efficient Semantic Segmentation with Plain Vision Transformers

Narges Norouzi, Svetlana Orlova, Daan de Geus et al.

CVPR 2024 • poster • arXiv:2406.09936
#4151

Omni-Scene: Omni-Gaussian Representation for Ego-Centric Sparse-View Scene Reconstruction

Dongxu Wei, Zhiqi Li, Peidong Liu

CVPR 2025 • poster • arXiv:2412.06273
#4152

Beyond Seen Primitive Concepts and Attribute-Object Compositional Learning

Nirat Saini, Khoi Pham, Abhinav Shrivastava

CVPR 2024 • poster
#4153

Scaling Laws of Synthetic Images for Model Training ... for Now

Lijie Fan, Kaifeng Chen, Dilip Krishnan et al.

CVPR 2024 • poster • arXiv:2312.04567
#4154

Visual Consensus Prompting for Co-Salient Object Detection

Jie Wang, Nana Yu, Zihao Zhang et al.

CVPR 2025 • poster • arXiv:2504.14254
#4155

Where's the Liability in the Generative Era? Recovery-based Black-Box Detection of AI-Generated Content

Haoyue Bai, Yiyou Sun, Wei Cheng et al.

CVPR 2025 • poster • arXiv:2505.01008
#4156

TacoDepth: Towards Efficient Radar-Camera Depth Estimation with One-stage Fusion

Yiran Wang, Jiaqi Li, Chaoyi Hong et al.

CVPR 2025 • poster • arXiv:2504.11773
#4157

No Pains, More Gains: Recycling Sub-Salient Patches for Efficient High-Resolution Image Recognition

Rong Qin, Xin Liu, Xingyu Liu et al.

CVPR 2025 • highlight
#4158

MEGA: Masked Generative Autoencoder for Human Mesh Recovery

Guénolé Fiche, Simon Leglaive, Xavier Alameda-Pineda et al.

CVPR 2025 • poster • arXiv:2405.18839
#4159

NLPrompt: Noise-Label Prompt Learning for Vision-Language Models

Bikang Pan, Qun Li, Xiaoying Tang et al.

CVPR 2025 • highlight • arXiv:2412.01256
#4160

UDiFF: Generating Conditional Unsigned Distance Fields with Optimal Wavelet Diffusion

Junsheng Zhou, Weiqi Zhang, Baorui Ma et al.

CVPR 2024 • poster • arXiv:2404.06851
#4161

Learning Group Activity Features Through Person Attribute Prediction

Chihiro Nakatani, Hiroaki Kawashima, Norimichi Ukita

CVPR 2024 • poster • arXiv:2403.02753
#4162

MICap: A Unified Model for Identity-Aware Movie Descriptions

Haran Raajesh, Naveen Reddy Desanur, Zeeshan Khan et al.

CVPR 2024 • poster • arXiv:2405.11483
#4163

UFineBench: Towards Text-based Person Retrieval with Ultra-fine Granularity

Jialong Zuo, Hanyu Zhou, Ying Nie et al.

CVPR 2024 • poster • arXiv:2312.03441
#4164

Test-Time Zero-Shot Temporal Action Localization

Benedetta Liberatori, Alessandro Conti, Paolo Rota et al.

CVPR 2024 • poster • arXiv:2404.05426
#4165

Improving the Training of Data-Efficient GANs via Quality Aware Dynamic Discriminator Rejection Sampling

Zhaoyu Zhang, Yang Hua, Guanxiong Sun et al.

CVPR 2025 • poster
#4166

FreeU: Free Lunch in Diffusion U-Net

Chenyang Si, Ziqi Huang, Yuming Jiang et al.

CVPR 2024 • poster • arXiv:2309.11497
#4167

Towards Text-guided 3D Scene Composition

Qihang Zhang, Chaoyang Wang, Aliaksandr Siarohin et al.

CVPR 2024 • poster • arXiv:2312.08885
#4168

Instance-aware Exploration-Verification-Exploitation for Instance ImageGoal Navigation

Xiaohan Lei, Min Wang, Wengang Zhou et al.

CVPR 2024 • poster • arXiv:2402.17587
#4169

AnyScene: Customized Image Synthesis with Composited Foreground

Ruidong Chen, Lanjun Wang, Weizhi Nie et al.

CVPR 2024 • poster
#4170

Learning SO(3)-Invariant Semantic Correspondence via Local Shape Transform

Chunghyun Park, Seungwook Kim, Jaesik Park et al.

CVPR 2024 • poster • arXiv:2404.11156
#4171

Color Shift Estimation-and-Correction for Image Enhancement

Yiyu Li, Ke Xu, Gerhard Hancke et al.

CVPR 2024 • poster • arXiv:2405.17725
#4172

RGBD Objects in the Wild: Scaling Real-World 3D Object Learning from RGB-D Videos

Hongchi Xia, Yang Fu, Sifei Liu et al.

CVPR 2024 • poster • arXiv:2401.12592
#4173

VideoDirector: Precise Video Editing via Text-to-Video Models

Yukun Wang, Longguang Wang, Zhiyuan Ma et al.

CVPR 2025 • poster • arXiv:2411.17592
#4174

Endow SAM with Keen Eyes: Temporal-spatial Prompt Learning for Video Camouflaged Object Detection

Wenjun Hui, Zhenfeng Zhu, Shuai Zheng et al.

CVPR 2024 • poster
#4175

NICE: Neurogenesis Inspired Contextual Encoding for Replay-free Class Incremental Learning

Mustafa B Gurbuz, Jean Moorman, Constantine Dovrolis

CVPR 2024 • poster
#4176

Taming Mode Collapse in Score Distillation for Text-to-3D Generation

Peihao Wang, Dejia Xu, Zhiwen Fan et al.

CVPR 2024 • poster • arXiv:2401.00909
#4177

Aligning and Prompting Everything All at Once for Universal Visual Perception

Yunhang Shen, Chaoyou Fu, Peixian Chen et al.

CVPR 2024 • poster • arXiv:2312.02153
#4178

Learning to Filter Outlier Edges in Global SfM

Nicole Damblon, Marc Pollefeys, Daniel Barath

CVPR 2025 • highlight
#4179

MODA: Motion-Drift Augmentation for Inertial Human Motion Analysis

Yinghao Wu, Shihui Guo, Yipeng Qin

CVPR 2025 • poster
#4180

ZONE: Zero-Shot Instruction-Guided Local Editing

Shanglin Li, Bohan Zeng, Yutang Feng et al.

CVPR 2024 • poster • arXiv:2312.16794
#4181

FocusMAE: Gallbladder Cancer Detection from Ultrasound Videos with Focused Masked Autoencoders

Soumen Basu, Mayuna Gupta, Chetan Madan et al.

CVPR 2024 • poster • arXiv:2403.08848
#4182

SLADE: Shielding against Dual Exploits in Large Vision-Language Models

Md Zarif Hossain, Ahmed Imteaj

CVPR 2025 • poster
#4183

Noisy One-point Homographies are Surprisingly Good

Yaqing Ding, Jonathan Astermark, Magnus Oskarsson et al.

CVPR 2024 • poster
#4184

CSTA: CNN-based Spatiotemporal Attention for Video Summarization

Jaewon Son, Jaehun Park, Kwangsu Kim

CVPR 2024 • poster • arXiv:2405.11905
#4185

SRTube: Video-Language Pre-Training with Action-Centric Video Tube Features and Semantic Role Labeling

Juhee Lee, Jewon Kang

CVPR 2024 • poster
#4186

SUGAR: Pre-training 3D Visual Representations for Robotics

Shizhe Chen, Ricardo Garcia Pinel, Ivan Laptev et al.

CVPR 2024 • poster • arXiv:2404.01491
#4187

DiN: Diffusion Model for Robust Medical VQA with Semantic Noisy Labels

Erjian Guo, Zhen Zhao, Zicheng Wang et al.

CVPR 2025 • poster • arXiv:2503.18536
#4188

SnAG: Scalable and Accurate Video Grounding

Fangzhou Mu, Sicheng Mo, Yin Li

CVPR 2024 • poster • arXiv:2404.02257
#4189

CarPlanner: Consistent Auto-regressive Trajectory Planning for Large-Scale Reinforcement Learning in Autonomous Driving

Dongkun Zhang, Jiaming Liang, Ke Guo et al.

CVPR 2025 • poster • arXiv:2502.19908
#4190

A Unified Framework for Heterogeneous Semi-supervised Learning

Marzi Heidari, Abdullah Alchihabi, Hao Yan et al.

CVPR 2025 • poster • arXiv:2503.00286
#4191

GLaMM: Pixel Grounding Large Multimodal Model

Hanoona Rasheed, Muhammad Maaz, Sahal Shaji Mullappilly et al.

CVPR 2024 • poster • arXiv:2311.03356
#4192

V2V3D: View-to-View Denoised 3D Reconstruction for Light Field Microscopy

Jiayin Zhao, Zhenqi Fu, Tao Yu et al.

CVPR 2025 • poster • arXiv:2504.07853
#4193

ManiFPT: Defining and Analyzing Fingerprints of Generative Models

Hae Jin Song, Mahyar Khayatkhoei, Wael AbdAlmageed

CVPR 2024 • poster • arXiv:2402.10401
#4194

Towards Universal AI-Generated Image Detection by Variational Information Bottleneck Network

Haifeng Zhang, Qinghui He, Xiuli Bi et al.

CVPR 2025 • poster
#4195

SeCap: Self-Calibrating and Adaptive Prompts for Cross-view Person Re-Identification in Aerial-Ground Networks

Shining Wang, Yunlong Wang, Ruiqi Wu et al.

CVPR 2025 • highlight • arXiv:2503.06965
#4196

EquiPose: Exploiting Permutation Equivariance for Relative Camera Pose Estimation

Yuzhen Liu, Qiulei Dong

CVPR 2025 • poster
#4197

Spectral Informed Mamba for Robust Point Cloud Processing

Ali Bahri, Moslem Yazdanpanah, Mehrdad Noori et al.

CVPR 2025 • poster • arXiv:2503.04953
#4198

Enhancing Creative Generation on Stable Diffusion-based Models

Jiyeon Han, Dahee Kwon, Gayoung Lee et al.

CVPR 2025 • poster • arXiv:2503.23538
#4199

WeatherGen: A Unified Diverse Weather Generator for LiDAR Point Clouds via Spider Mamba Diffusion

Yang Wu, Yun Zhu, Kaihua Zhang et al.

CVPR 2025 • poster • arXiv:2504.13561
#4200

Hiding Images in Diffusion Models by Editing Learned Score Functions

Haoyu Chen, Yunqiao Yang, Nan Zhong et al.

CVPR 2025 • poster • arXiv:2503.18459