Most Cited CVPR "prompt aggregation" Papers

#3801

Spectral Informed Mamba for Robust Point Cloud Processing

Ali Bahri, Moslem Yazdanpanah, Mehrdad Noori et al.

CVPR 2025 • poster • arXiv:2503.04953
#3802

DiN: Diffusion Model for Robust Medical VQA with Semantic Noisy Labels

Erjian Guo, Zhen Zhao, Zicheng Wang et al.

CVPR 2025 • poster • arXiv:2503.18536
#3803

VideoDirector: Precise Video Editing via Text-to-Video Models

Yukun Wang, Longguang Wang, Zhiyuan Ma et al.

CVPR 2025 • poster • arXiv:2411.17592
#3804

MEGA: Masked Generative Autoencoder for Human Mesh Recovery

Guénolé Fiche, Simon Leglaive, Xavier Alameda-Pineda et al.

CVPR 2025 • poster • arXiv:2405.18839
#3805

TacoDepth: Towards Efficient Radar-Camera Depth Estimation with One-stage Fusion

Yiran Wang, Jiaqi Li, Chaoyi Hong et al.

CVPR 2025 • poster • arXiv:2504.11773
#3806

Omni-Scene: Omni-Gaussian Representation for Ego-Centric Sparse-View Scene Reconstruction

Dongxu Wei, Zhiqi Li, Peidong Liu

CVPR 2025 • poster • arXiv:2412.06273
#3807

AvatarArtist: Open-Domain 4D Avatarization

Hongyu Liu, Xuan Wang, Ziyu Wan et al.

CVPR 2025 • poster • arXiv:2503.19906
#3808

DI-PCG: Diffusion-based Efficient Inverse Procedural Content Generation for High-quality 3D Asset Creation

Wang Zhao, Yan-Pei Cao, Jiale Xu et al.

CVPR 2025 • poster • arXiv:2412.15200
#3809

Dragin3D: Image Editing by Dragging in 3D Space

Weiran Guang, Xiaoguang Gu, Mengqi Huang et al.

CVPR 2025 • poster
#3810

T2V-CompBench: A Comprehensive Benchmark for Compositional Text-to-video Generation

Kaiyue Sun, Kaiyi Huang, Xian Liu et al.

CVPR 2025 • poster • arXiv:2407.14505
#3811

ORIDa: Object-centric Real-world Image Composition Dataset

Jinwoo Kim, Sangmin Han, Jinho Jeong et al.

CVPR 2025 • poster • arXiv:2506.08964
#3812

MeGA: Hybrid Mesh-Gaussian Head Avatar for High-Fidelity Rendering and Head Editing

Cong Wang, Di Kang, Heyi Sun et al.

CVPR 2025 • poster • arXiv:2404.19026
#3813

Comprehensive Information Bottleneck for Unveiling Universal Attribution to Interpret Vision Transformers

Jung-Ho Hong, Ho-Joong Kim, Kyu-Sung Jeon et al.

CVPR 2025 • highlight • arXiv:2507.04388
#3814

Towards Universal Dataset Distillation via Task-Driven Diffusion

Ding Qi, Jian Li, Junyao Gao et al.

CVPR 2025 • poster
#3815

VideoComp: Advancing Fine-Grained Compositional and Temporal Alignment in Video-Text Models

Dahun Kim, AJ Piergiovanni, Ganesh Satish Mallya et al.

CVPR 2025 • poster • arXiv:2504.03970
#3816

Do We Always Need the Simplicity Bias? Looking for Optimal Inductive Biases in the Wild

Damien Teney, Liangze Jiang, Florin Gogianu et al.

CVPR 2025 • poster • arXiv:2503.10065
#3817

TriTex: Learning Texture from a Single Mesh via Triplane Semantic Features

Dana Cohen-Bar, Daniel Cohen-Or, Gal Chechik et al.

CVPR 2025 • poster • arXiv:2503.16630
#3818

Cross-View Completion Models are Zero-shot Correspondence Estimators

Honggyu An, Jin Hyeon Kim, Seonghoon Park et al.

CVPR 2025 • highlight • arXiv:2412.09072
#3819

Segment Anything, Even Occluded

Wei-En Tai, Yu-Lin Shih, Cheng Sun et al.

CVPR 2025 • poster • arXiv:2503.06261
#3820

Advancing Multiple Instance Learning with Continual Learning for Whole Slide Imaging

Xianrui Li, Yufei Cui, Jun Li et al.

CVPR 2025 • highlight • arXiv:2505.10649
#3821

Decoupled Distillation to Erase: A General Unlearning Method for Any Class-centric Tasks

Yu Zhou, Dian Zheng, Qijie Mo et al.

CVPR 2025 • highlight • arXiv:2503.23751
#3822

ChainHOI: Joint-based Kinematic Chain Modeling for Human-Object Interaction Generation

Ling-An Zeng, Guohong Huang, Yi-Lin Wei et al.

CVPR 2025 • poster • arXiv:2503.13130
#3823

WAVE: Weight Templates for Adaptive Initialization of Variable-sized Models

Fu Feng, Yucheng Xie, Jing Wang et al.

CVPR 2025 • poster • arXiv:2406.17503
#3824

CraftsMan3D: High-fidelity Mesh Generation with 3D Native Diffusion and Interactive Geometry Refiner

Weiyu Li, Jiarui Liu, Hongyu Yan et al.

CVPR 2025 • poster
#3825

Black Swan: Abductive and Defeasible Video Reasoning in Unpredictable Events

Aditya Chinchure, Sahithya Ravi, Raymond Ng et al.

CVPR 2025 • poster • arXiv:2412.05725
#3826

Filter Images First, Generate Instructions Later: Pre-Instruction Data Selection for Visual Instruction Tuning

Bardia Safaei, Faizan Siddiqui, Jiacong Xu et al.

CVPR 2025 • highlight • arXiv:2503.07591
#3827

VL2Lite: Task-Specific Knowledge Distillation from Large Vision-Language Models to Lightweight Networks

Jinseong Jang, Chunfei Ma, Byeongwon Lee

CVPR 2025 • poster
#3828

CARE Transformer: Mobile-Friendly Linear Visual Transformer via Decoupled Dual Interaction

Yuan Zhou, Qingshan Xu, Jiequan Cui et al.

CVPR 2025 • highlight • arXiv:2411.16170
#3829

Video-Bench: Human-Aligned Video Generation Benchmark

Hui Han, Siyuan Li, Jiaqi Chen et al.

CVPR 2025 • poster • arXiv:2504.04907
#3830

PMNI: Pose-free Multi-view Normal Integration for Reflective and Textureless Surface Reconstruction

Mingzhi Pei, Xu Cao, Xiangyi Wang et al.

CVPR 2025 • poster • arXiv:2504.08410
#3831

Joint Optimization of Neural Radiance Fields and Continuous Camera Motion from a Monocular Video

Hoang Chuong Nguyen, Wei Mao, Jose M. Alvarez et al.

CVPR 2025 • poster • arXiv:2504.19819
#3832

Dynamic Pseudo Labeling via Gradient Cutting for High-Low Entropy Exploration

Jae Hyeon Park, Joo Hyeon Jeon, Jae Yun Lee et al.

CVPR 2025 • poster
#3833

MegaSaM: Accurate, Fast and Robust Structure and Motion from Casual Dynamic Videos

Zhengqi Li, Richard Tucker, Forrester Cole et al.

CVPR 2025 • poster • arXiv:2412.04463
#3834

Shift the Lens: Environment-Aware Unsupervised Camouflaged Object Detection

Ji Du, Fangwei Hao, Mingyang Yu et al.

CVPR 2025 • poster
#3835

Robust Audio-Visual Segmentation via Audio-Guided Visual Convergent Alignment

Chen Liu, Peike Li, Liying Yang et al.

CVPR 2025 • poster • arXiv:2503.12847
#3836

Distribution Prototype Diffusion Learning for Open-set Supervised Anomaly Detection

Fuyun Wang, Tong Zhang, Yuanzhi Wang et al.

CVPR 2025 • poster • arXiv:2502.20981
#3837

Lifting the Veil on Visual Information Flow in MLLMs: Unlocking Pathways to Faster Inference

Hao Yin, Guangzong Si, Zilei Wang

CVPR 2025 • poster • arXiv:2503.13108
#3838

Minority-Focused Text-to-Image Generation via Prompt Optimization

Soobin Um, Jong Chul Ye

CVPR 2025 • poster • arXiv:2410.07838
#3839

CADRef: Robust Out-of-Distribution Detection via Class-Aware Decoupled Relative Feature Leveraging

Zhiwei Ling, Yachen Chang, Hailiang Zhao et al.

CVPR 2025 • poster • arXiv:2503.00325
#3840

A Selective Re-learning Mechanism for Hyperspectral Fusion Imaging

Yuanye Liu, Jinyang Liu, Renwei Dian et al.

CVPR 2025 • poster
#3841

Text-Driven Fashion Image Editing with Compositional Concept Learning and Counterfactual Abduction

Shanshan Huang, Haoxuan Li, Chunyuan Zheng et al.

CVPR 2025 • poster
#3842

HotSpot: Signed Distance Function Optimization with an Asymptotically Sufficient Condition

Zimo Wang, Cheng Wang, Taiki Yoshino et al.

CVPR 2025 • highlight • arXiv:2411.14628
#3843

3D Student Splatting and Scooping

Jialin Zhu, Jiangbei Yue, Feixiang He et al.

CVPR 2025 • poster • arXiv:2503.10148
#3844

LOGICZSL: Exploring Logic-induced Representation for Compositional Zero-shot Learning

Peng Wu, Xiankai Lu, Hao Hu et al.

CVPR 2025 • poster
#3845

Learning Partonomic 3D Reconstruction from Image Collections

Xiaoqian Ruan, Pei Yu, Dian Jia et al.

CVPR 2025 • poster
#3846

No Thing, Nothing: Highlighting Safety-Critical Classes for Robust LiDAR Semantic Segmentation in Adverse Weather

Junsung Park, HwiJeong Lee, Inha Kang et al.

CVPR 2025 • poster • arXiv:2503.15910
#3847

Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images

Jie Mei, Chenyu Lin, Yu Qiu et al.

CVPR 2025 • poster • arXiv:2503.17261
#3848

UniHOPE: A Unified Approach for Hand-Only and Hand-Object Pose Estimation

Yinqiao Wang, Hao Xu, Pheng-Ann Heng et al.

CVPR 2025 • poster • arXiv:2503.13303
#3849

Chain of Semantics Programming in 3D Gaussian Splatting Representation for 3D Vision Grounding

Jiaxin Shi, Mingyue Xiang, Hao Sun et al.

CVPR 2025 • poster
#3850

MEAT: Multiview Diffusion Model for Human Generation on Megapixels with Mesh Attention

Yuhan Wang, Fangzhou Hong, Shuai Yang et al.

CVPR 2025 • poster • arXiv:2503.08664
#3851

Efficient Data Driven Mixture-of-Expert Extraction from Trained Networks

Uranik Berisha, Jens Mehnert, Alexandru Paul Condurache

CVPR 2025 • poster • arXiv:2505.15414
#3852

Towards Precise Embodied Dialogue Localization via Causality Guided Diffusion

Haoyu Wang, Le Wang, Sanping Zhou et al.

CVPR 2025 • poster
#3853

Neural Inverse Rendering from Propagating Light

Anagh Malik, Benjamin Attal, Andrew Xie et al.

CVPR 2025 • poster • arXiv:2506.05347
#3854

A Universal Scale-Adaptive Deformable Transformer for Image Restoration across Diverse Artifacts

Xuyi He, Yuhui Quan, Ruotao Xu et al.

CVPR 2025 • poster
#3855

Gromov–Wasserstein Problem with Cyclic Symmetry

Shoichiro Takeda, Yasunori Akagi

CVPR 2025 • poster
#3856

RoGSplat: Learning Robust Generalizable Human Gaussian Splatting from Sparse Multi-View Images

Junjin Xiao, Qing Zhang, Yongwei Nie et al.

CVPR 2025 • poster • arXiv:2503.14198
#3857

SDBF: Steep-Decision-Boundary Fingerprinting for Hard-Label Tampering Detection of DNN Models

Xiaofan Bai, Shixin Li, Xiaojing Ma et al.

CVPR 2025 • poster
#3858

AnyCam: Learning to Recover Camera Poses and Intrinsics from Casual Videos

Felix Wimbauer, Weirong Chen, Dominik Muhle et al.

CVPR 2025 • poster • arXiv:2503.23282
#3859

Effortless Active Labeling for Long-Term Test-Time Adaptation

Guowei Wang, Changxing Ding

CVPR 2025 • poster • arXiv:2503.14564
#3860

LATTE-MV: Learning to Anticipate Table Tennis Hits from Monocular Videos

Daniel Etaat, Dvij Rajesh Kalaria, Nima Rahmanian et al.

CVPR 2025 • poster • arXiv:2503.20936
#3861

Learning Person-Specific Animatable Face Models from In-the-Wild Images via a Shared Base Model

Yuxiang Mao, Zhenfeng Fan, Zhijie Zhang et al.

CVPR 2025 • poster
#3862

Gradient Inversion Attacks on Parameter-Efficient Fine-Tuning

Hasin Us Sami, Swapneel Sen, Amit K. Roy-Chowdhury et al.

CVPR 2025 • poster • arXiv:2506.04453
#3863

RDD: Robust Feature Detector and Descriptor using Deformable Transformer

Gonglin Chen, Tianwen Fu, Haiwei Chen et al.

CVPR 2025 • poster • arXiv:2505.08013
#3864

Stacking Brick by Brick: Aligned Feature Isolation for Incremental Face Forgery Detection

Jikang Cheng, Zhiyuan Yan, Ying Zhang et al.

CVPR 2025 • poster • arXiv:2411.11396
#3865

Closest Neighbors are Harmful for Lightweight Masked Auto-encoders

Jian Meng, Ahmed Hasssan, Li Yang et al.

CVPR 2025 • poster
#3866

SKDream: Controllable Multi-view and 3D Generation with Arbitrary Skeletons

Yuanyou Xu, Zongxin Yang, Yi Yang

CVPR 2025 • highlight
#3867

Dual Exposure Stereo for Extended Dynamic Range 3D Imaging

Juhyung Choi, Jinneyong Kim, Seokjun Choi et al.

CVPR 2025 • poster • arXiv:2412.02351
#3868

Samba: A Unified Mamba-based Framework for General Salient Object Detection

Jiahao He, Keren Fu, Xiaohong Liu et al.

CVPR 2025 • highlight
#3869

FlexUOD: The Answer to Real-world Unsupervised Image Outlier Detection

Zhonghang Liu, Kun Zhou, Changshuo Wang et al.

CVPR 2025 • poster
#3870

GraphI2P: Image-to-Point Cloud Registration with Exploring Pattern of Correspondence via Graph Learning

Lin Bie, Shouan Pan, Siqi Li et al.

CVPR 2025 • poster
#3871

FedCS: Coreset Selection for Federated Learning

Chenhe Hao, Weiying Xie, Daixun Li et al.

CVPR 2025 • poster
#3872

Cross-Rejective Open-Set SAR Image Registration

Shasha Mao, Shiming Lu, Zhaolong Du et al.

CVPR 2025 • poster
#3873

Timestep Embedding Tells: It's Time to Cache for Video Diffusion Model

Feng Liu, Shiwei Zhang, Xiaofeng Wang et al.

CVPR 2025 • highlight • arXiv:2411.19108
#3874

Bridging Gait Recognition and Large Language Models Sequence Modeling

Shaopeng Yang, Jilong Wang, Saihui Hou et al.

CVPR 2025 • poster
#3875

MANTA: Diffusion Mamba for Efficient and Effective Stochastic Long-Term Dense Action Anticipation

Olga Zatsarynna, Emad Bahrami, Yazan Abu Farha et al.

CVPR 2025 • poster
#3876

FIFA: Fine-grained Inter-frame Attention for Driver's Video Gaze Estimation

Daosong Hu, Mingyue Cui, Kai Huang

CVPR 2025 • poster
#3877

Learning from Streaming Video with Orthogonal Gradients

Tengda Han, Dilara Gokay, Joseph Heyward et al.

CVPR 2025 • poster • arXiv:2504.01961
#3878

SuperLightNet: Lightweight Parameter Aggregation Network for Multimodal Brain Tumor Segmentation

Feng Yu, Jiacheng Cao, Li Liu et al.

CVPR 2025 • poster
#3879

VidSeg: Training-free Video Semantic Segmentation based on Diffusion Models

Qian Wang, Abdelrahman Eldesokey, Mohit Mendiratta et al.

CVPR 2025 • poster
#3880

Efficient Diffusion as Low Light Enhancer

Guanzhou Lan, Qianli Ma, Yuqi Yang et al.

CVPR 2025 • poster • arXiv:2410.12346
#3881

VI^3NR: Variance Informed Initialization for Implicit Neural Representations

Chamin Hewa Koneputugodage, Yizhak Ben-Shabat, Sameera Ramasinghe et al.

CVPR 2025 • poster
#3882

MPDrive: Improving Spatial Understanding with Marker-Based Prompt Learning for Autonomous Driving

Zhi-Yuan Zhang, Xiaofan Li, Zhihao Xu et al.

CVPR 2025 • highlight • arXiv:2504.00379
#3883

Rotation-Equivariant Self-Supervised Method in Image Denoising

Hanze Liu, Jiahong Fu, Qi Xie et al.

CVPR 2025 • poster • arXiv:2505.19618
#3884

Foundations of the Theory of Performance-Based Ranking

Sébastien Piérard, Anaïs Halin, Anthony Cioppa et al.

CVPR 2025 • poster • arXiv:2412.04227
#3885

APT: Adaptive Personalized Training for Diffusion Models with Limited Data

JungWoo Chae, Jiyoon Kim, Jaewoong Choi et al.

CVPR 2025 • poster • arXiv:2507.02687
#3886

Theoretical Insights in Model Inversion Robustness and Conditional Entropy Maximization for Collaborative Inference Systems

Song Xia, Yi Yu, Wenhan Yang et al.

CVPR 2025 • highlight • arXiv:2503.00383
#3887

VolFormer: Explore More Comprehensive Cube Interaction for Hyperspectral Image Restoration and Beyond

Dabing Yu, Zheng Gao

CVPR 2025 • poster
#3888

UniMamba: Unified Spatial-Channel Representation Learning with Group-Efficient Mamba for LiDAR-based 3D Object Detection

Xin Jin, Haisheng Su, Kai Liu et al.

CVPR 2025 • poster • arXiv:2503.12009
#3889

CamFreeDiff: Camera-free Image to Panorama Generation with Diffusion Model

Xiaoding Yuan, Shitao Tang, Kejie Li et al.

CVPR 2025 • poster • arXiv:2407.07174
#3890

Illumination Spectrum Estimation for Multispectral Images via Surface Reflectance Modeling and Spatial-Spectral Feature Generation

Hyejin Oh, Woo-Shik Kim, Sangyoon Lee et al.

CVPR 2025 • poster
#3891

SET: Spectral Enhancement for Tiny Object Detection

Huixin Sun, Runqi Wang, Yanjing Li et al.

CVPR 2025 • poster
#3892

Temporally Consistent Object-Centric Learning by Contrasting Slots

Anna Manasyan, Maximilian Seitzer, Filip Radovic et al.

CVPR 2025 • poster • arXiv:2412.14295
#3893

FactCheXcker: Mitigating Measurement Hallucinations in Chest X-ray Report Generation Models

Alice Heiman, Xiaoman Zhang, Emma Chen et al.

CVPR 2025 • poster • arXiv:2411.18672
#3894

R-SCoRe: Revisiting Scene Coordinate Regression for Robust Large-Scale Visual Localization

Xudong Jiang, Fangjinhua Wang, Silvano Galliani et al.

CVPR 2025 • poster • arXiv:2501.01421
#3895

Repurposing Stable Diffusion Attention for Training-Free Unsupervised Interactive Segmentation

Markus Karmann, Onay Urfalioglu

CVPR 2025 • poster • arXiv:2411.10411
#3896

Retaining Knowledge and Enhancing Long-Text Representations in CLIP through Dual-Teacher Distillation

Yuheng Feng, Changsong Wen, Zelin Peng et al.

CVPR 2025 • poster
#3897

Code-as-Monitor: Constraint-aware Visual Programming for Reactive and Proactive Robotic Failure Detection

Enshen Zhou, Qi Su, Cheng Chi et al.

CVPR 2025 • poster • arXiv:2412.04455
#3898

Rethinking Reconstruction and Denoising in the Dark: New Perspective, General Architecture and Beyond

Long Ma, Tengyu Ma, Ziye Li et al.

CVPR 2025 • poster
#3899

Deterministic Certification of Graph Neural Networks against Graph Poisoning Attacks with Arbitrary Perturbations

Jiate Li, Meng Pang, Yun Dong et al.

CVPR 2025 • poster • arXiv:2503.18503
#3900

Layered Motion Fusion: Lifting Motion Segmentation to 3D in Egocentric Videos

Vadim Tschernezki, Diane Larlus, Andrea Vedaldi et al.

CVPR 2025 • poster • arXiv:2506.05546
#3901

A Focused Human Body Model for Accurate Anthropometric Measurements Extraction

Shuhang Chen, Xianliang Huang, Zhizhou Zhong et al.

CVPR 2025 • poster
#3902

All-Optical Nonlinear Diffractive Deep Network for Ultrafast Image Denoising

Xiaoling Zhou, Zhemg Lee, Wei Ye et al.

CVPR 2025 • highlight
#3903

SIR-DIFF: Sparse Image Sets Restoration with Multi-View Diffusion Model

Yucheng Mao, Boyang Wang, Nilesh Kulkarni et al.

CVPR 2025 • poster • arXiv:2503.14463
#3904

SASep: Saliency-Aware Structured Separation of Geometry and Feature for Open Set Learning on Point Clouds

Jinfeng Xu, Xianzhi Li, Yuan Tang et al.

CVPR 2025 • poster • arXiv:2506.13224
#3905

Beyond Sight: Towards Cognitive Alignment in LVLM via Enriched Visual Knowledge

Yaqi Zhao, Yuanyang Yin, Lin Li et al.

CVPR 2025 • poster • arXiv:2411.16824
#3906

BIMBA: Selective-Scan Compression for Long-Range Video Question Answering

Md Mohaiminul Islam, Tushar Nagarajan, Huiyu Wang et al.

CVPR 2025 • poster • arXiv:2503.09590
#3907

Task-aware Cross-modal Feature Refinement Transformer with Large Language Models for Visual Grounding

Wenbo Chen, Zhen Xu, Ruotao Xu et al.

CVPR 2025 • poster
#3908

Document Haystacks: Vision-Language Reasoning Over Piles of 1000+ Documents

Jun Chen, Dannong Xu, Junjie Fei et al.

CVPR 2025 • poster • arXiv:2411.16740
#3909

ReCapture: Generative Video Camera Controls for User-Provided Videos using Masked Video Fine-Tuning

David Junhao Zhang, Roni Paiss, Shiran Zada et al.

CVPR 2025 • poster • arXiv:2411.05003
#3910

GS-DiT: Advancing Video Generation with Dynamic 3D Gaussian Fields through Efficient Dense 3D Point Tracking

Weikang Bian, Zhaoyang Huang, Xiaoyu Shi et al.

CVPR 2025 • poster
#3911

CoCoGaussian: Leveraging Circle of Confusion for Gaussian Splatting from Defocused Images

Jungho Lee, Suhwan Cho, Taeoh Kim et al.

CVPR 2025 • poster • arXiv:2412.16028
#3912

MaIR: A Locality- and Continuity-Preserving Mamba for Image Restoration

Boyun Li, Haiyu Zhao, Wenxin Wang et al.

CVPR 2025 • poster • arXiv:2412.20066
#3913

Few-shot Implicit Function Generation via Equivariance

Suizhi Huang, Xingyi Yang, Hongtao Lu et al.

CVPR 2025 • highlight • arXiv:2501.01601
#3914

SerialGen: Personalized Image Generation by First Standardization Then Personalization

Cong Xie, Han Zou, Ruiqi Yu et al.

CVPR 2025 • poster • arXiv:2412.01485
#3915

SimAvatar: Simulation-Ready Avatars with Layered Hair and Clothing

Xueting Li, Ye Yuan, Shalini De Mello et al.

CVPR 2025 • poster • arXiv:2412.09545
#3916

Camouflage Anything: Learning to Hide using Controlled Out-painting and Representation Engineering

Biplab Das, Viswanath Gopalakrishnan

CVPR 2025 • poster
#3917

AniDoc: Animation Creation Made Easier

Yihao Meng, Hao Ouyang, Hanlin Wang et al.

CVPR 2025 • poster • arXiv:2412.14173
#3918

PointSR: Self-Regularized Point Supervision for Drone-View Object Detection

Weizhuo Li, Yue Xi, Wenjing Jia et al.

CVPR 2025 • poster
#3919

Self-Expansion of Pre-trained Models with Mixture of Adapters for Continual Learning

Huiyi Wang, Haodong Lu, Lina Yao et al.

CVPR 2025 • poster • arXiv:2403.18886
#3920

Brain-Inspired Spiking Neural Networks for Energy-Efficient Object Detection

Ziqi Li, Tao Gao, Yisheng An et al.

CVPR 2025 • poster
#3921

DnLUT: Ultra-Efficient Color Image Denoising via Channel-Aware Lookup Tables

Sidi Yang, Binxiao Huang, Yulun Zhang et al.

CVPR 2025 • poster • arXiv:2503.15931
#3922

Incomplete Multi-View Multi-label Learning via Disentangled Representation and Label Semantic Embedding

Xu Yan, Jun Yin, Jie Wen

CVPR 2025 • poster
#3923

Building Vision Models upon Heat Conduction

Zhaozhi Wang, Yue Liu, Yunjie Tian et al.

CVPR 2025 • poster • arXiv:2405.16555
#3924

PDFactor: Learning Tri-Perspective View Policy Diffusion Field for Multi-Task Robotic Manipulation

Jingyi Tian, Le Wang, Sanping Zhou et al.

CVPR 2025 • poster
#3925

Open Set Label Shift with Test Time Out-of-Distribution Reference

Changkun Ye, Russell Tsuchida, Lars Petersson et al.

CVPR 2025 • poster • arXiv:2505.05868
#3926

Effective SAM Combination for Open-Vocabulary Semantic Segmentation

Minhyeok Lee, Suhwan Cho, Jungho Lee et al.

CVPR 2025 • poster • arXiv:2411.14723
#3927

Do We Really Need Curated Malicious Data for Safety Alignment in Multi-modal Large Language Models?

Yanbo Wang, Jiyang Guan, Jian Liang et al.

CVPR 2025 • poster • arXiv:2504.10000
#3928

Generating 6DoF Object Manipulation Trajectories from Action Description in Egocentric Vision

Tomoya Yoshida, Shuhei Kurita, Taichi Nishimura et al.

CVPR 2025 • highlight • arXiv:2506.03605
#3929

Pattern Analogies: Learning to Perform Programmatic Image Edits by Analogy

Aditya Ganeshan, Thibault Groueix, Paul Guerrero et al.

CVPR 2025 • poster • arXiv:2412.12463
#3930

Efficient Transfer Learning for Video-language Foundation Models

Haoxing Chen, Zizheng Huang, Yan Hong et al.

CVPR 2025 • poster • arXiv:2411.11223
#3931

SVDC: Consistent Direct Time-of-Flight Video Depth Completion with Frequency Selective Fusion

Xuan Zhu, Jijun Xiang, Xianqi Wang et al.

CVPR 2025 • poster • arXiv:2503.01257
#3932

FLAIR: VLM with Fine-grained Language-informed Image Representations

Rui Xiao, Sanghwan Kim, Iuliana Georgescu et al.

CVPR 2025 • poster • arXiv:2412.03561
#3933

CryptoFace: End-to-End Encrypted Face Recognition

Wei Ao, Vishnu Naresh Boddeti

CVPR 2025 • poster • arXiv:2509.00332
#3934

SynthLight: Portrait Relighting with Diffusion Model by Learning to Re-render Synthetic Faces

Sumit Chaturvedi, Mengwei Ren, Yannick Hold-Geoffroy et al.

CVPR 2025 • poster • arXiv:2501.09756
#3935

Plug-and-Play Versatile Compressed Video Enhancement

Huimin Zeng, Jiacheng Li, Zhiwei Xiong

CVPR 2025 • poster • arXiv:2504.15380
#3936

Geometry Field Splatting with Gaussian Surfels

Kaiwen Jiang, Venkataram Sivaram, Cheng Peng et al.

CVPR 2025 • poster • arXiv:2411.17067
#3937

PS-EIP: Robust Photometric Stereo Based on Event Interval Profile

Kazuma Kitazawa, Takahito Aoto, Satoshi Ikehata et al.

CVPR 2025 • poster • arXiv:2503.18341
#3938

Model Diagnosis and Correction via Linguistic and Implicit Attribute Editing

Xuanbai Chen, Xiang Xu, Zhihua Li et al.

CVPR 2025 • poster
#3939

BFANet: Revisiting 3D Semantic Segmentation with Boundary Feature Analysis

Weiguang Zhao, Rui Zhang, Qiufeng Wang et al.

CVPR 2025 • poster • arXiv:2503.12539
#3940

DUNE: Distilling a Universal Encoder from Heterogeneous 2D and 3D Teachers

Mert Bülent Sarıyıldız, Philippe Weinzaepfel, Thomas Lucas et al.

CVPR 2025 • poster
#3941

NeISF++: Neural Incident Stokes Field for Polarized Inverse Rendering of Conductors and Dielectrics

Chenhao Li, Taishi Ono, Takeshi Uemori et al.

CVPR 2025 • poster • arXiv:2411.10189
#3942

Gazing at Rewards: Eye Movements as a Lens into Human and AI Decision-Making in Hybrid Visual Foraging

Bo Wang, Dingwei Tan, Yen-Ling Kuo et al.

CVPR 2025 • poster • arXiv:2411.09176
#3943

Layer- and Timestep-Adaptive Differentiable Token Compression Ratios for Efficient Diffusion Transformers

Haoran You, Connelly Barnes, Yuqian Zhou et al.

CVPR 2025 • poster • arXiv:2412.16822
#3944

FreePCA: Integrating Consistency Information across Long-short Frames in Training-free Long Video Generation via Principal Component Analysis

Jiangtong Tan, Hu Yu, Jie Huang et al.

CVPR 2025 • highlight • arXiv:2505.01172
#3945

Improving Semi-Supervised Semantic Segmentation with Sliced-Wasserstein Feature Alignment and Uniformity

Chen Yi Lu, Kasra Derakhshandeh, Somali Chaterji

CVPR 2025 • poster
#3946

Theory-Inspired Deep Multi-View Multi-Label Learning with Incomplete Views and Noisy Labels

Quanjiang Li, Tingjin Luo, Jiahui Liao

CVPR 2025 • poster
#3947

Asynchronous Collaborative Graph Representation for Frames and Events

Dianze Li, Jianing Li, Xu Liu et al.

CVPR 2025 • poster
#3948

SeqAfford: Sequential 3D Affordance Reasoning via Multimodal Large Language Model

Chunlin Yu, Hanqing Wang, Ye Shi et al.

CVPR 2025 • poster • arXiv:2412.01550
#3949

Bridging Viewpoint Gaps: Geometric Reasoning Boosts Semantic Correspondence

Qiyang Qian, Hansheng Chen, Masayoshi Tomizuka et al.

CVPR 2025 • poster
#3950

ROLL: Robust Noisy Pseudo-label Learning for Multi-View Clustering with Noisy Correspondence

Yuan Sun, Yongxiang Li, Zhenwen Ren et al.

CVPR 2025 • highlight
#3951

Label Shift Meets Online Learning: Ensuring Consistent Adaptation with Universal Dynamic Regret

Yucong Dai, Shilin Gu, Ruidong Fan et al.

CVPR 2025 • highlight
#3952

PolarNeXt: Rethink Instance Segmentation with Polar Representation

Jiacheng Sun, Xinghong Zhou, Yiqiang Wu et al.

CVPR 2025 • poster
#3953

ShiftwiseConv: Small Convolutional Kernel with Large Kernel Effect

Dachong Li, Li Li, Zhuangzhuang Chen et al.

CVPR 2025 • poster • arXiv:2401.12736
#3954

G3Flow: Generative 3D Semantic Flow for Pose-aware and Generalizable Object Manipulation

Tianxing Chen, Yao Mu, Zhixuan Liang et al.

CVPR 2025 • poster • arXiv:2411.18369
#3955

Evolving High-Quality Rendering and Reconstruction in a Unified Framework with Contribution-Adaptive Regularization

You Shen, Zhipeng Zhang, Xinyang Li et al.

CVPR 2025 • poster • arXiv:2503.00881
#3956

Language-Guided Salient Object Ranking

Fang Liu, Yuhao Liu, Ke Xu et al.

CVPR 2025 • poster
#3957

Structure from Collision

Takuhiro Kaneko

CVPR 2025 • highlight • arXiv:2505.21335
#3958

PRaDA: Projective Radial Distortion Averaging

Daniil Sinitsyn, Linus Härenstam-Nielsen, Daniel Cremers

CVPR 2025 • poster • arXiv:2504.16499
#3959

Feature Information Driven Position Gaussian Distribution Estimation for Tiny Object Detection

Jinghao Bian, Mingtao Feng, Weisheng Dong et al.

CVPR 2025 • poster
#3960

Event-Equalized Dense Video Captioning

Kangyi Wu, Pengna Li, Jingwen Fu et al.

CVPR 2025 • poster
#3961

ProReflow: Progressive Reflow with Decomposed Velocity

Lei Ke, Haohang Xu, Xuefei Ning et al.

CVPR 2025 • poster • arXiv:2503.04824
#3962

Is Your World Simulator a Good Story Presenter? A Consecutive Events-Based Benchmark for Future Long Video Generation

Yiping Wang, Xuehai He, Kuan Wang et al.

CVPR 2025 • poster • arXiv:2412.16211
#3963

Joint Scheduling of Causal Prompts and Tasks for Multi-Task Learning

Chaoyang Li, Jianyang Qin, Jinhao Cui et al.

CVPR 2025 • poster
#3964

Unified Dense Prediction of Video Diffusion

Lehan Yang, Lu Qi, Xiangtai Li et al.

CVPR 2025 • poster • arXiv:2503.09344
#3965

Similarity-Guided Layer-Adaptive Vision Transformer for UAV Tracking

Chaocan Xue, Bineng Zhong, Qihua Liang et al.

CVPR 2025 • poster • arXiv:2503.06625
#3966

Revisiting Audio-Visual Segmentation with Vision-Centric Transformer

Shaofei Huang, Rui Ling, Tianrui Hui et al.

CVPR 2025 • poster • arXiv:2506.23623
#3967

Exploring Scene Affinity for Semi-Supervised LiDAR Semantic Segmentation

Chuandong Liu, Xingxing Weng, Shuguo Jiang et al.

CVPR 2025 • poster • arXiv:2408.11280
#3968

RASP: Revisiting 3D Anamorphic Art for Shadow-Guided Packing of Irregular Objects

Soumyaratna Debnath, Ashish Tiwari, Kaustubh Sadekar et al.

CVPR 2025 • poster • arXiv:2504.02465
#3969

Sketchy Bounding-box Supervision for 3D Instance Segmentation

Qian Deng, Le Hui, Jin Xie et al.

CVPR 2025 • poster • arXiv:2505.16399
#3970

Chapter-Llama: Efficient Chaptering in Hour-Long Videos with LLMs

Lucas Ventura, Antoine Yang, Cordelia Schmid et al.

CVPR 2025 • poster • arXiv:2504.00072
#3971

SOGS: Second-Order Anchor for Advanced 3D Gaussian Splatting

Jiahui Zhang, Fangneng Zhan, Ling Shao et al.

CVPR 2025 • poster • arXiv:2503.07476
#3972

VoxelSplat: Dynamic Gaussian Splatting as an Effective Loss for Occupancy and Flow Prediction

Ziyue Zhu, Shenlong Wang, Jin Xie et al.

CVPR 2025 • poster • arXiv:2506.05563
#3973

VinaBench: Benchmark for Faithful and Consistent Visual Narratives

Silin Gao, Sheryl Mathew, Li Mi et al.

CVPR 2025 • poster • arXiv:2503.20871
#3974

Aesthetic Post-Training Diffusion Models from Generic Preferences with Step-by-step Preference Optimization

Zhanhao Liang, Yuhui Yuan, Shuyang Gu et al.

CVPR 2025 • poster • arXiv:2406.04314
#3975

Learning Textual Prompts for Open-World Semi-Supervised Learning

Yuxin Fan, Junbiao Cui, Jiye Liang

CVPR 2025 • poster
#3976

Think Small, Act Big: Primitive Prompt Learning for Lifelong Robot Manipulation

Yuanqi Yao, Siao Liu, Haoming Song et al.

CVPR 2025 • poster • arXiv:2504.00420
#3977

Align-KD: Distilling Cross-Modal Alignment Knowledge for Mobile Vision-Language Large Model Enhancement

Qianhan Feng, Wenshuo Li, Tong Lin et al.

CVPR 2025 • poster
#3978

HOT: Hadamard-based Optimized Training

Seonggon Kim, Juncheol Shin, Seung-taek Woo et al.

CVPR 2025 • poster • arXiv:2503.21261
#3979

JTD-UAV: MLLM-Enhanced Joint Tracking and Description Framework for Anti-UAV Systems

Yifan Wang, Jian Zhao, Zhaoxin Fan et al.

CVPR 2025 • poster
#3980

STEP: Enhancing Video-LLMs’ Compositional Reasoning by Spatio-Temporal Graph-guided Self-Training

Haiyi Qiu, Minghe Gao, Long Qian et al.

CVPR 2025 • poster • arXiv:2412.00161
#3981

Modeling Multiple Normal Action Representations for Error Detection in Procedural Tasks

Wei-Jin Huang, Yuan-Ming Li, Zhi-Wei Xia et al.

CVPR 2025 • poster • arXiv:2503.22405
#3982

RUBIK: A Structured Benchmark for Image Matching across Geometric Challenges

Thibaut Loiseau, Guillaume Bourmaud

CVPR 2025 • poster • arXiv:2502.19955
#3983

Reconstructing Close Human Interaction with Appearance and Proxemics Reasoning

Buzhen Huang, Chen Li, Chongyang Xu et al.

CVPR 2025 • poster • arXiv:2507.02565
#3984

Rethinking Noisy Video-Text Retrieval via Relation-aware Alignment

Huakai Lai, Guoxin Xiong, Huayu Mai et al.

CVPR 2025 • poster
#3985

EfficientLLaVA: Generalizable Auto-Pruning for Large Vision-language Models

Yinan Liang, Ziwei Wang, Xiuwei Xu et al.

CVPR 2025 • poster
#3986

The Impact Label Noise and Choice of Threshold has on Cross-Entropy and Soft-Dice in Image Segmentation

Marcus Nordström, Atsuto Maki, Henrik Hult

CVPR 2025 • poster
#3987

Learning on Model Weights using Tree Experts

Eliahu Horwitz, Bar Cavia, Jonathan Kahana et al.

CVPR 2025 • poster • arXiv:2410.13569
#3988

Image Reconstruction from Readout-Multiplexed Single-Photon Detector Arrays

Shashwath Bharadwaj, Ruangrawee Kitichotkul, Akshay Agarwal et al.

CVPR 2025 • highlight • arXiv:2312.02971
#3989

Towards Smart Point-and-Shoot Photography

Jiawan Li, Fei Zhou, Zhipeng Zhong et al.

CVPR 2025 • poster • arXiv:2505.03638
#3990

Hyperspectral Pansharpening via Diffusion Models with Iteratively Zero-Shot Guidance

Jin-Liang Xiao, Ting-Zhu Huang, Liang-Jian Deng et al.

CVPR 2025 • poster
#3991

Efficient Motion-Aware Video MLLM

Zijia Zhao, Yuqi Huo, Tongtian Yue et al.

CVPR 2025 • highlight • arXiv:2503.13016
#3992

DSPNet: Dual-vision Scene Perception for Robust 3D Question Answering

Jingzhou Luo, Yang Liu, Weixing Chen et al.

CVPR 2025 • poster • arXiv:2503.03190
#3993

Exploring CLIP's Dense Knowledge for Weakly Supervised Semantic Segmentation

Zhiwei Yang, Yucong Meng, Kexue Fu et al.

CVPR 2025 • poster • arXiv:2503.20826
#3994

UNIALIGN: Scaling Multimodal Alignment within One Unified Model

Bo Zhou, Liulei Li, Yujia Wang et al.

CVPR 2025 • poster
#3995

Devils in Middle Layers of Large Vision-Language Models: Interpreting, Detecting and Mitigating Object Hallucinations via Attention Lens

Zhangqi Jiang, Junkai Chen, Beier Zhu et al.

CVPR 2025 • poster • arXiv:2411.16724
#3996

iG-6DoF: Model-free 6DoF Pose Estimation for Unseen Object via Iterative 3D Gaussian Splatting

Tuo Cao, Fei Luo, Jiongming Qin et al.

CVPR 2025 • poster
#3997

Towards Long-Horizon Vision-Language Navigation: Platform, Benchmark and Method

Xinshuai Song, Weixing Chen, Yang Liu et al.

CVPR 2025 • poster • arXiv:2412.09082
#3998

BimArt: A Unified Approach for the Synthesis of 3D Bimanual Interaction with Articulated Objects

Wanyue Zhang, Rishabh Dabral, Vladislav Golyanik et al.

CVPR 2025 • poster • arXiv:2412.05066
#3999

Improving Autoregressive Visual Generation with Cluster-Oriented Token Prediction

Teng Hu, Jiangning Zhang, Ran Yi et al.

CVPR 2025 • poster • arXiv:2501.00880
#4000

Learned Binocular-Encoding Optics for RGBD Imaging Using Joint Stereo and Focus Cues

Yuhui Liu, Liangxun Ou, Qiang Fu et al.

CVPR 2025 • poster