Most Cited 2025 "physics-guided architecture" Papers

22,274 papers found

#22001

A Tiny Change, A Giant Leap: Long-Tailed Class-Incremental Learning via Geometric Prototype Alignment

Xinyi Lai, Luojun Lin, Weijie Chen et al.

ICCV 2025 poster
#22002

CountSE: Soft Exemplar Open-set Object Counting

Shuai Liu, Peng Zhang, Shiwei Zhang et al.

ICCV 2025 highlight
#22003

Sparfels: Fast Reconstruction from Sparse Unposed Imagery

Shubhendu Jena, Amine Ouasfi, Mae Younes et al.

ICCV 2025 highlight arXiv:2505.02178
#22004

Bias-Resilient Weakly Supervised Semantic Segmentation Using Normalizing Flows

Xianglin Qiu, Xiaoyang Wang, Zhen Zhang et al.

ICCV 2025 poster
#22005

Text-guided Visual Prompt DINO for Generic Segmentation

Yuchen Guan, Chong Sun, Canmiao Fu et al.

ICCV 2025 poster arXiv:2508.06146
#22006

GenieBlue: Integrating both Linguistic and Multimodal Capabilities for Large Language Models on Mobile Devices

Xudong LU, Yinghao Chen, Renshou Wu et al.

ICCV 2025 poster arXiv:2503.06019
#22007

MedVSR: Medical Video Super-Resolution with Cross State-Space Propagation

Xinyu Liu, Guolei Sun, Cheng Wang et al.

ICCV 2025 poster arXiv:2509.21265
#22008

FE-CLIP: Frequency Enhanced CLIP Model for Zero-Shot Anomaly Detection and Segmentation

Tao Gong, Qi Chu, Bin Liu et al.

ICCV 2025 poster
#22009

Top2Pano: Learning to Generate Indoor Panoramas from Top-Down View

Zitong Zhang, Suranjan Gautam, Rui Yu

ICCV 2025 poster arXiv:2507.21371
#22010

SAMora: Enhancing SAM through Hierarchical Self-Supervised Pre-Training for Medical Images

Shuhang Chen, Hangjie Yuan, Pengwei Liu et al.

ICCV 2025 poster arXiv:2511.08626
#22011

Vamba: Understanding Hour-Long Videos with Hybrid Mamba-Transformers

Weiming Ren, Wentao Ma, Huan Yang et al.

ICCV 2025 poster arXiv:2503.11579
#22012

MuGS: Multi-Baseline Generalizable Gaussian Splatting Reconstruction

Yaopeng Lou, Liao Shen, Tianqi Liu et al.

ICCV 2025 poster arXiv:2508.04297
#22013

Region-Level Data Attribution for Text-to-Image Generative Models

Trong Bang Nguyen, Phi Le Nguyen, Simon Lucey et al.

ICCV 2025 poster
#22014

Trans-Adapter: A Plug-and-Play Framework for Transparent Image Inpainting

Yuekun Dai, Haitian Li, Shangchen Zhou et al.

ICCV 2025 poster arXiv:2508.01098
#22015

4D-Bench: Benchmarking Multi-modal Large Language Models for 4D Object Understanding

Wenxuan Zhu, Bing Li, Cheng Zheng et al.

ICCV 2025 poster arXiv:2503.17827
#22016

Multi-View Slot Attention Using Paraphrased Texts for Face Anti-Spoofing

Jeongmin Yu, Susang Kim, Kisu Lee et al.

ICCV 2025 poster arXiv:2509.06336
#22017

Generalization-Preserved Learning: Closing the Backdoor to Catastrophic Forgetting in Continual Deepfake Detection

Xueyi Zhang, Peiyin Zhu, Chengwei Zhang et al.

ICCV 2025 poster
#22018

SegAnyPET: Universal Promptable Segmentation from Positron Emission Tomography Images

Yichi Zhang, Le Xue, Wenbo Zhang et al.

ICCV 2025 poster arXiv:2502.14351
#22019

LangBridge: Interpreting Image as a Combination of Language Embeddings

Jiaqi Liao, Yuwei Niu, Fanqing Meng et al.

ICCV 2025 poster arXiv:2503.19404
#22020

IGD: Instructional Graphic Design with Multimodal Layer Generation

Yadong Qu, Shancheng Fang, Yuxin Wang et al.

ICCV 2025 poster arXiv:2507.09910
#22021

Robustifying Zero-Shot Vision Language Models by Subspaces Alignment

Junhao Dong, Piotr Koniusz, Liaoyuan Feng et al.

ICCV 2025 poster
#22022

SyncVP: Joint Diffusion for Synchronous Multi-Modal Video Prediction

Enrico Pallotta, Sina Mokhtarzadeh Azar, Shuai Li et al.

CVPR 2025 poster arXiv:2503.18933
#22023

Exploration via Feature Perturbation in Contextual Bandits

Seouh-won Yi, Min-hwan Oh

NEURIPS 2025 spotlight arXiv:2510.17390
#22024

CABLD: Contrast-Agnostic Brain Landmark Detection with Consistency-Based Regularization

Soorena Salari, Arash Harirpoush, Hassan Rivaz et al.

ICCV 2025 poster arXiv:2411.17845
#22025

The Devil is in the Spurious Correlations: Boosting Moment Retrieval with Dynamic Learning

Xinyang Zhou, Fanyue Wei, Lixin Duan et al.

ICCV 2025 poster arXiv:2501.07305
#22026

Parameter-Efficient Adaptation of Geospatial Foundation Models through Embedding Deflection

Romain Thoreau, Valerio Marsocci, Dawa Derksen

ICCV 2025 poster arXiv:2503.09493
#22027

On the Recovery of Cameras from Fundamental Matrices

Rakshith Madhavan, Federica Arrigoni

ICCV 2025 highlight
#22028

RhythmGaussian: Repurposing Generalizable Gaussian Model For Remote Physiological Measurement

Hao LU, Yuting Zhang, Jiaqi Tang et al.

ICCV 2025 highlight
#22029

Superpowering Open-Vocabulary Object Detectors for X-ray Vision

Pablo Garcia-Fernandez, Lorenzo Vaquero, Mingxuan Liu et al.

ICCV 2025 poster arXiv:2503.17071
#22030

Any Large Language Model Can Be a Reliable Judge: Debiasing with a Reasoning-based Bias Detector

Haoyan Yang, Runxue Bao, Cao (Danica) Xiao et al.

NEURIPS 2025 poster arXiv:2505.17100
#22031

Generating Physically Sound Designs from Text and a Set of Physical Constraints

Gregory Barber, Todd Henry, Mulugeta Haile

NEURIPS 2025 poster
#22032

CityGS-X: A Scalable Architecture for Efficient and Geometrically Accurate Large-Scale Scene Reconstruction

Yuanyuan Gao, Hao Li, Jiaqi Chen et al.

ICCV 2025 poster arXiv:2503.23044
#22033

AIRA: Activation-Informed Low-Rank Adaptation for Large Models

Lujun Li, Dezhi Li, Cheng Lin et al.

ICCV 2025 poster
#22034

True Impact of Cascade Length in Contextual Cascading Bandits

Hyun-jun Choi, Joongkyu Lee, Min-hwan Oh

NEURIPS 2025 poster
#22035

Embodied Navigation with Auxiliary Task of Action Description Prediction

Haru Kondoh, Asako Kanezaki

ICCV 2025 poster arXiv:2510.21809
#22036

Thompson Sampling for Multi-Objective Linear Contextual Bandit

Somangchan Park, Heesang Ann, Min-hwan Oh

NEURIPS 2025 poster arXiv:2512.00930
#22037

Geo4D: Leveraging Video Generators for Geometric 4D Scene Reconstruction

Zeren Jiang, Chuanxia Zheng, Iro Laina et al.

ICCV 2025 highlight arXiv:2504.07961
#22038

Cross-View Isolated Sign Language Recognition via View Synthesis and Feature Disentanglement

Xin Shen, Xinyu Wang, Lei Shen et al.

ICCV 2025 poster
#22039

Towards Video Thinking Test: A Holistic Benchmark for Advanced Video Reasoning and Understanding

Yuanhan Zhang, Yunice Chew, Yuhao Dong et al.

ICCV 2025 poster arXiv:2507.15028
#22040

Semantic versus Identity: A Divide-and-Conquer Approach towards Adjustable Medical Image De-Identification

Yuan Tian, Shuo Wang, Rongzhao Zhang et al.

ICCV 2025 poster arXiv:2507.21703
#22041

Face Retouching with Diffusion Data Generation and Spectral Restorement

Zhidan Xu, Xiaoqin Zhang, Shijian Lu

ICCV 2025 poster
#22042

Fuzzy Contrastive Decoding to Alleviate Object Hallucination in Large Vision-Language Models

Jieun Kim, Jinmyeong Kim, Yoonji Kim et al.

ICCV 2025 poster
#22043

Bayesian Optimization with Preference Exploration using a Monotonic Neural Network Ensemble

Hanyang Wang, Juergen Branke, Matthias Poloczek

NEURIPS 2025 poster arXiv:2501.18792
#22044

Zero-Shot Compositional Video Learning with Coding Rate Reduction

Heeseok Jung, Jun-Hyeon Bak, Yujin Jeong et al.

ICCV 2025 poster
#22045

Att-Adapter: A Robust and Precise Domain-Specific Multi-Attributes T2I Diffusion Adapter via Conditional Variational Autoencoder

Wonwoong Cho, Yan-Ying Chen, Matthew Klenk et al.

ICCV 2025 highlight arXiv:2503.11937
#22046

LaCoOT: Layer Collapse through Optimal Transport

Victor Quétu, Zhu LIAO, Nour Hezbri et al.

ICCV 2025 poster arXiv:2406.08933
#22047

Neural Solver of Dichromatic Reflection Model for Specular Highlight Removal

Gang Fu

ICCV 2025 poster
#22048

Wavelet Policy: Lifting Scheme for Policy Learning in Long-Horizon Tasks

Hao Huang, Shuaihang Yuan, Geeta Chandra Raju Bethala et al.

ICCV 2025 poster arXiv:2507.04331
#22049

Accident Anticipation via Temporal Occurrence Prediction

Tianhao Zhao, Yiyang Zou, Zihao Mao et al.

NEURIPS 2025 oral arXiv:2510.22260
#22050

ProSAM: Enhancing the Robustness of SAM-based Visual Reference Segmentation with Probabilistic Prompts

Xiaoqi Wang, Clint Sebastian, Wenbin He et al.

ICCV 2025 poster arXiv:2506.21835
#22051

ZipVL: Accelerating Vision-Language Models through Dynamic Token Sparsity

Yefei He, Feng Chen, Jing Liu et al.

ICCV 2025 poster
#22052

FlowMixer: A Depth-Agnostic Neural Architecture for Interpretable Spatiotemporal Forecasting

Fares Mehouachi, Saif Eddin Jabari

NEURIPS 2025 oral
#22053

Contrastive Flow Matching

George Stoica, Vivek Ramanujan, Xiang Fan et al.

ICCV 2025 poster arXiv:2506.05350
#22054

Class Token as Proxy: Optimal Transport-assisted Proxy Learning for Weakly Supervised Semantic Segmentation

Jian Wang, Tianhong Dai, Bingfeng Zhang et al.

ICCV 2025 poster
#22055

Explore In-Context Message Passing Operator for Graph Neural Networks in A Mean Field Game

Tingting Dan, Xinwei Huang, Won Hwa Kim et al.

NEURIPS 2025 poster
#22056

Representation Shift: Unifying Token Compression with FlashAttention

Joonmyung Choi, Sanghyeok Lee, Byungoh Ko et al.

ICCV 2025 poster arXiv:2508.00367
#22057

HOLa: Zero-Shot HOI Detection with Low-Rank Decomposed VLM Feature Adaptation

Qinqian Lei, Bo Wang, Robby Tan

ICCV 2025 poster arXiv:2507.15542
#22058

Topology-aware Graph Diffusion Model with Persistent Homology

Joonhyuk Park, Donghyun Lee, Yujee Song et al.

NEURIPS 2025 poster
#22059

AllGCD: Leveraging All Unlabeled Data for Generalized Category Discovery

Xinzi Cao, Ke Chen, Feidiao Yang et al.

ICCV 2025 poster
#22060

Towards Long-Horizon Vision-Language-Action System: Reasoning, Acting and Memory

Daixun Li, Yusi Zhang, Mingxiang Cao et al.

ICCV 2025 poster
#22061

UniFuse: A Unified All-in-One Framework for Multi-Modal Medical Image Fusion Under Diverse Degradations and Misalignments

Dayong Su, Yafei Zhang, Huafeng Li et al.

ICCV 2025 poster arXiv:2506.22736
#22062

3DGS-LM: Faster Gaussian-Splatting Optimization with Levenberg-Marquardt

Lukas Höllein, Aljaz Bozic, Michael Zollhöfer et al.

ICCV 2025 poster arXiv:2409.12892
#22063

V^2Dial: Unification of Video and Visual Dialog via Multimodal Experts

Adnen Abdessaied, Anna Rohrbach, Marcus Rohrbach et al.

CVPR 2025 poster
#22064

Online Mixture of Experts: No-Regret Learning for Optimal Collective Decision-Making

Larkin Liu, Jalal Etesami

NEURIPS 2025 poster arXiv:2510.21788
#22065

GLEAM: Learning Generalizable Exploration Policy for Active Mapping in Complex 3D Indoor Scene

Xiao Chen, Tai Wang, Quanyi Li et al.

ICCV 2025 poster arXiv:2505.20294
#22066

CopyrightShield: Enhancing Diffusion Model Security Against Copyright Infringement Attacks

Zhixiang Guo, Siyuan Liang, Aishan Liu et al.

ICCV 2025 poster arXiv:2412.01528
#22067

CA2C: A Prior-Knowledge-Free Approach for Robust Label Noise Learning via Asymmetric Co-learning and Co-training

Mengmeng Sheng, Zeren Sun, Tianfei Zhou et al.

ICCV 2025 poster
#22068

Learnable Logit Adjustment for Imbalanced Semi-Supervised Learning under Class Distribution Mismatch

Lee Hyuck, Taemin Park, Heeyoung Kim

ICCV 2025 poster
#22069

Mind the Gap: Aligning Vision Foundation Models to Image Feature Matching

Yuhan Liu, Jingwen Fu, Yang Wu et al.

ICCV 2025 poster arXiv:2507.10318
#22070

Contextual Integrity in LLMs via Reasoning and Reinforcement Learning

Guangchen (Eric) Lan, Huseyin A. Inan, Sahar Abdelnabi et al.

NEURIPS 2025 poster arXiv:2506.04245
#22071

CARL: Causality-guided Architecture Representation Learning for an Interpretable Performance Predictor

Han Ji, Yuqi Feng, Jiahao Fan et al.

ICCV 2025 poster arXiv:2506.04001
#22072

Seeing the Abstract: Translating the Abstract Language for Vision Language Models

Davide Talon, Federico Girella, Ziyue Liu et al.

CVPR 2025 poster arXiv:2505.03242
#22073

TCFG: Truncated Classifier-Free Guidance for Efficient and Scalable Text-to-Image Acceleration

Xiaomeng Fu, Jia Li

ICCV 2025 poster
#22074

Point Cloud Self-supervised Learning via 3D to Multi-view Masked Learner

Zhimin Chen, Xuewei Chen, Xiao Guo et al.

ICCV 2025 poster arXiv:2311.10887
#22075

SPRO: Improving Image Generation via Self-Play

Ritika Jha, Aanisha Bhattacharyya, Yaman Singla et al.

NEURIPS 2025 poster
#22076

MSA2: Multi-task Framework with Structure-aware and Style-adaptive Character Representation for Open-set Chinese Text Recognition

Yangfu Li, Hongjian Zhan, Qi Liu et al.

ICCV 2025 poster
#22077

DiffPCI: Large Motion Point Cloud frame Interpolation with Diffusion Model

Tianyu Zhang, Haobo Jiang, Jian Yang et al.

ICCV 2025 poster
#22078

DiffPS: Leveraging Prior Knowledge of Diffusion Model for Person Search

Giyeol Kim, Sooyoung Yang, Jihyong Oh et al.

ICCV 2025 highlight
#22079

Feature Purification Matters: Suppressing Outlier Propagation for Training-Free Open-Vocabulary Semantic Segmentation

Shuo Jin, Siyue Yu, Bingfeng Zhang et al.

ICCV 2025 highlight
#22080

ROVI: A VLM-LLM Re-Captioned Dataset for Open-Vocabulary Instance-Grounded Text-to-Image Generation

Cihang Peng, Qiming HOU, Zhong Ren et al.

ICCV 2025 poster arXiv:2508.01008
#22081

MultiModal Action Conditioned Video Simulation

Yichen Li, Antonio Torralba

ICCV 2025 poster
#22082

Local Dense Logit Relations for Enhanced Knowledge Distillation

Liuchi Xu, Kang Liu, Jinshuai Liu et al.

ICCV 2025 poster arXiv:2507.15911
#22083

FW-Merging: Scaling Model Merging with Frank-Wolfe Optimization

Hao Chen, Shell Xu Hu, Wayne Luk et al.

ICCV 2025 poster arXiv:2503.12649
#22084

HIS-GPT: Towards 3D Human-In-Scene Multimodal Understanding

JIAHE ZHAO, RuiBing Hou, zejie tian et al.

ICCV 2025 poster arXiv:2503.12955
#22085

SVTRv2: CTC Beats Encoder-Decoder Models in Scene Text Recognition

Yongkun Du, Zhineng Chen, Hongtao Xie et al.

ICCV 2025 poster arXiv:2411.15858
#22086

Soft Local Completeness: Rethinking Completeness in XAI

Ziv Weiss Haddad, Oren Barkan, Yehonatan Elisha et al.

ICCV 2025 poster
#22087

ClearSight: Human Vision-Inspired Solutions for Event-Based Motion Deblurring

Xiaopeng LIN, Yulong Huang, Hongwei Ren et al.

ICCV 2025 poster arXiv:2501.15808
#22088

PBFG: A New Physically-Based Dataset and Removal of Lens Flares and Glares

Jie Zhu, Sungkil Lee

ICCV 2025 poster
#22089

Correspondence as Video: Test-Time Adaption on SAM2 for Reference Segmentation in the Wild

Haoran Wang, Zekun Li, Jian Zhang et al.

ICCV 2025 poster arXiv:2508.07759
#22090

An Information-Theoretic Regularizer for Lossy Neural Image Compression

ZHANG YINGWEN, Meng Wang, Xihua Sheng et al.

ICCV 2025 poster arXiv:2411.16727
#22091

Knowledge-Guided Part Segmentation

Xuejian Gou, Fang Liu, Licheng Jiao et al.

ICCV 2025 poster
#22092

Controllable Feature Whitening for Hyperparameter-Free Bias Mitigation

Yooshin Cho, Hanbyel Cho, Janghyeon Lee et al.

ICCV 2025 poster arXiv:2507.20284
#22093

OPHR: Mastering Volatility Trading with Multi-Agent Deep Reinforcement Learning

Zeting Chen, Xinyu Cai, Molei Qin et al.

NEURIPS 2025 poster
#22094

KV-Edit: Training-Free Image Editing for Precise Background Preservation

Tianrui Zhu, Shiyi Zhang, Jiawei Shao et al.

ICCV 2025 poster arXiv:2502.17363
#22095

FusionPhys: A Flexible Framework for Fusing Complementary Sensing Modalities in Remote Physiological Measurement

Chenhang Ying, Huiyu Yang, Jieyi Ge et al.

ICCV 2025 poster
#22096

DiffVSR: Revealing an Effective Recipe for Taming Robust Video Super-Resolution Against Complex Degradations

Xiaohui Li, Yihao Liu, Shuo Cao et al.

ICCV 2025 poster arXiv:2501.10110
#22097

Moment Quantization for Video Temporal Grounding

Xiaolong Sun, Le Wang, Sanping Zhou et al.

ICCV 2025 poster arXiv:2504.02286
#22098

Test-time Adaptation for Foundation Medical Segmentation Model Without Parametric Updates

Kecheng Chen, Xinyu Luo, Tiexin Qin et al.

ICCV 2025 highlight arXiv:2504.02008
#22099

Power of Cooperative Supervision: Multiple Teachers Framework for Advanced 3D Semi-Supervised Object Detection

Jin-Hee Lee, Jae-keun Lee, Jeseok Kim et al.

ICCV 2025 poster
#22100

Adapting In-Domain Few-Shot Segmentation to New Domains without Source Domain Retraining

Qi Fan, Kaiqi Liu, Nian Liu et al.

ICCV 2025 poster arXiv:2504.21414
#22101

ESCNet: Edge-Semantic Collaborative Network for Camouflaged Object Detection

Sheng Ye, Xin Chen, Yan Zhang et al.

ICCV 2025 poster
#22102

ASGS: Single-Domain Generalizable Open-Set Object Detection via Adaptive Subgraph Searching

Yuxuan Yuan, Luyao Tang, Chaoqi Chen et al.

ICCV 2025 poster
#22103

DADet: Safeguarding Image Conditional Diffusion Models against Adversarial and Backdoor Attacks via Diffusion Anomaly Detection

Hongwei Yu, Xinlong Ding, Jiawei Li et al.

ICCV 2025 highlight
#22104

ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations

Tianming Liang, Kun-Yu Lin, Chaolei Tan et al.

ICCV 2025 poster arXiv:2501.14607
#22105

Multi-Schema Proximity Network for Composed Image Retrieval

Jiangming Shi, Xiangbo Yin, yeyunchen yeyunchen et al.

ICCV 2025 poster
#22106

LEGO-Maker: A Semantic-Driven Algorithm for Text-to-3D Generation

Yifei Zhang, Lei Chen

ICCV 2025 poster
#22107

CNS-Bench: Benchmarking Image Classifier Robustness Under Continuous Nuisance Shifts

Olaf Dünkel, Artur Jesslen, Jiahao Xie et al.

ICCV 2025 poster arXiv:2507.17651
#22108

COVTrack: Continuous Open-Vocabulary Tracking via Adaptive Multi-Cue Fusion

Zekun Qian, Ruize Han, Zhixiang Wang et al.

ICCV 2025 poster
#22109

Dense Policy: Bidirectional Autoregressive Learning of Actions

Yue Su, Xinyu Zhan, Hongjie Fang et al.

ICCV 2025 poster arXiv:2503.13217
#22110

monoVLN: Bridging the Observation Gap between Monocular and Panoramic Vision and Language Navigation

Ren-Jie Lu, Yu Zhou, Hao Cheng et al.

ICCV 2025 poster
#22111

Graph Domain Adaptation with Dual-branch Encoder and Two-level Alignment for Whole Slide Image-based Survival Prediction

Yuntao Shou, Xiangyong Cao, PeiqiangYan PeiqiangYan et al.

ICCV 2025 poster arXiv:2411.14001
#22112

DOGR: Towards Versatile Visual Document Grounding and Referring

Yinan Zhou, Yuxin Chen, Haokun Lin et al.

ICCV 2025 poster arXiv:2411.17125
#22113

ReME: A Data-Centric Framework for Training-Free Open-Vocabulary Segmentation

Xiwei Xuan, Ziquan Deng, Kwan-Liu Ma

ICCV 2025 highlight arXiv:2506.21233
#22114

MonoMobility: Zero-Shot 3D Mobility Analysis from Monocular Videos

Hongyi Zhou, Xiaogang Wang, Yulan Guo et al.

ICCV 2025 poster arXiv:2505.11868
#22115

Mind the Gap: Detecting Black-box Adversarial Attacks in the Making through Query Update Analysis

Jeonghwan Park, Niall McLaughlin, Ihsen Alouani

CVPR 2025 poster arXiv:2503.02986
#22116

Performing Defocus Deblurring by Modeling its Formation Process

Zhengbo Zhang, Lin Geng Foo, Hossein Rahmani et al.

ICCV 2025 poster
#22117

CasP: Improving Semi-Dense Feature Matching Pipeline Leveraging Cascaded Correspondence Priors for Guidance

Peiqi Chen, Lei Yu, Yi Wan et al.

ICCV 2025 highlight arXiv:2507.17312
#22118

Supervised Exploratory Learning for Long-Tailed Visual Recognition

Zhongquan Jian, Yanhao Chen, Wangyancheng Wangyancheng et al.

ICCV 2025 poster
#22119

The Burden of Interactive Alignment with Inconsistent Preferences

Ali Shirali

NEURIPS 2025 poster arXiv:2510.16368
#22120

An Efficient Hybrid Vision Transformer for TinyML Applications

Fanhong Zeng, Huanan LI, Juntao Guan et al.

ICCV 2025 poster
#22121

MMAIF: Multi-task and Multi-degradation All-in-One for Image Fusion with Language Guidance

Zihan Cao, Yu Zhong, Ziqi Wang et al.

ICCV 2025 poster arXiv:2503.14944
#22122

Blind Video Super-Resolution based on Implicit Kernels

Qiang Zhu, Yuxuan Jiang, Shuyuan Zhu et al.

ICCV 2025 poster arXiv:2503.07856
#22123

OmniDiff: A Comprehensive Benchmark for Fine-grained Image Difference Captioning

Yuan Liu, Saihui Hou, Saijie Hou et al.

ICCV 2025 poster arXiv:2503.11093
#22124

Toward Long-Tailed Online Anomaly Detection through Class-Agnostic Concepts

Chiao-An Yang, Kuan-Chuan Peng, Raymond A. Yeh

ICCV 2025 poster arXiv:2507.16946
#22125

Collective Counterfactual Explanations: Balancing Individual Goals and Collective Dynamics

Ahmad-Reza Ehyaei, Ali Shirali, Samira Samadi

NEURIPS 2025 poster arXiv:2402.04579
#22126

More Reliable Pseudo-labels, Better Performance: A Generalized Approach to Single Positive Multi-label Learning

Luong Tran, Thieu Vo, Anh Nguyen et al.

ICCV 2025 poster arXiv:2508.20381
#22127

GauCho: Gaussian Distributions with Cholesky Decomposition for Oriented Object Detection

Jeffri Erwin Murrugarra Llerena, José Henrique Marques, Claudio Jung

CVPR 2025 poster arXiv:2502.01565
#22128

TimeExpert: An Expert-Guided Video LLM for Video Temporal Grounding

Zuhao Yang, Yingchen Yu, Yunqing Zhao et al.

ICCV 2025 poster arXiv:2508.01699
#22129

CaptionSmiths: Flexibly Controlling Language Pattern in Image Captioning

Kuniaki Saito, Donghyun Kim, Kwanyong Park et al.

ICCV 2025 highlight arXiv:2507.01409
#22130

DCHM: Depth-Consistent Human Modeling for Multiview Detection

Jiahao Ma, Tianyu Wang, Miaomiao Liu et al.

ICCV 2025 poster arXiv:2507.14505
#22131

Adversarial Robustness of Discriminative Self-Supervised Learning in Vision

Ömer Veysel Çağatan, Ömer TAL, M. Emre Gursoy

ICCV 2025 poster arXiv:2503.06361
#22132

HPSv3: Towards Wide-Spectrum Human Preference Score

Yuhang Ma, Keqiang Sun, Xiaoshi Wu et al.

ICCV 2025 poster arXiv:2508.03789
#22133

Self supervised learning for in vivo localization of microelectrode arrays using raw local field potential

Tianxiao He, Malhar Patel, Chenyi Li et al.

NEURIPS 2025 poster
#22134

Active Perception Meets Rule-Guided RL: A Two-Phase Approach for Precise Object Navigation in Complex Environments

Liang Qin, Min Wang, Peiwei Li et al.

ICCV 2025 poster
#22135

HiERO: Understanding the Hierarchy of Human Behavior Enhances Reasoning on Egocentric Videos

Simone Alberto Peirone, Francesca Pistilli, Giuseppe Averta

ICCV 2025 poster arXiv:2505.12911
#22136

UNIS: A Unified Framework for Achieving Unbiased Neural Implicit Surfaces in Volume Rendering

Junkai Deng, Hanting Niu, Jiaze Li et al.

ICCV 2025 poster
#22137

Inter2Former: Dynamic Hybrid Attention for Efficient High-Precision Interactive Segmentation

You Huang, Lichao Chen, Jiayi Ji et al.

ICCV 2025 poster
#22138

On the Provable Importance of Gradients for Autonomous Language-Assisted Image Clustering

Bo Peng, Jie Lu, Guangquan Zhang et al.

ICCV 2025 highlight
#22139

IntrinsicControlNet: Cross-distribution Image Generation with Real and Unreal

Jiayuan Lu, Rengan Xie, Zixuan Xie et al.

ICCV 2025 poster
#22140

MH-LVC: Multi-Hypothesis Temporal Prediction for Learned Conditional Residual Video Coding

Gao Zong lin, Huu-Tai Phung, Yi-Chen Yao et al.

ICCV 2025 poster arXiv:2510.12479
#22141

DC-AE 1.5: Accelerating Diffusion Model Convergence with Structured Latent Space

Junyu Chen, Dongyun Zou, Wenkun He et al.

ICCV 2025 poster arXiv:2508.00413
#22142

Hate in Plain Sight: On the Risks of Moderating AI-Generated Hateful Illusions

Yiting Qu, Ziqing Yang, Yihan Ma et al.

ICCV 2025 poster arXiv:2507.22617
#22143

Loss Functions for Predictor-based Neural Architecture Search

Han Ji, Yuqi Feng, Jiahao Fan et al.

ICCV 2025 poster arXiv:2506.05869
#22144

Advancing Text-to-3D Generation with Linearized Lookahead Variational Score Distillation

Yu Lei, Bingde Liu, Qingsong Xie et al.

ICCV 2025 poster arXiv:2507.09748
#22145

GUI-Rise: Structured Reasoning and History Summarization for GUI Navigation

Tao Liu, Chongyu Wang, Rongjie Li et al.

NEURIPS 2025 poster arXiv:2510.27210
#22146

Steering Guidance for Personalized Text-to-Image Diffusion Models

Sunghyun Park, Seokeon Choi, Hyoungwoo Park et al.

ICCV 2025 poster arXiv:2508.00319
#22147

ONLY: One-Layer Intervention Sufficiently Mitigates Hallucinations in Large Vision-Language Models

Zifu Wan, Ce Zhang, Silong Yong et al.

ICCV 2025 poster arXiv:2507.00898
#22148

Structural Information-based Hierarchical Diffusion for Offline Reinforcement Learning

Xianghua Zeng, Hao Peng, Yicheng Pan et al.

NEURIPS 2025 oral arXiv:2509.21942
#22149

Single GPU Task Adaptation of Pathology Foundation Models for Whole Slide Image Analysis

Neeraj Kumar, Chad Vanderbilt

NEURIPS 2025 poster arXiv:2506.05184
#22150

Domain-aware Category-level Geometry Learning Segmentation for 3D Point Clouds

Pei He, Lingling Li, Licheng Jiao et al.

ICCV 2025 poster arXiv:2508.11265
#22151

Function-centric Bayesian Network for Zero-Shot Object Goal Navigation

Sixian Zhang, Xinyao Yu, Xinhang Song et al.

ICCV 2025 poster
#22152

GaussianReg: Rapid 2D/3D Registration for Emergency Surgery via Explicit 3D Modeling with Gaussian Primitives

Weihao Yu, Xiaoqing Guo, Xinyu Liu et al.

ICCV 2025 poster
#22153

Precise Diffusion Inversion: Towards Novel Samples and Few-Step Models

Jing Zuo, Luoping Cui, Chuang Zhu et al.

NEURIPS 2025 poster
#22154

ArgoTweak: Towards Self-Updating HD Maps through Structured Priors

Lena Wild, Rafael Valencia, Patric Jensfelt

ICCV 2025 poster arXiv:2509.08764
#22155

Event-aided Dense and Continuous Point Tracking: Everywhere and Anytime

Zhexiong Wan, Jianqin Luo, Yuchao Dai et al.

ICCV 2025 poster
#22156

Context-Aware Academic Emotion Dataset and Benchmark

Luming Zhao, Jingwen Xuan, Jiamin Lou et al.

ICCV 2025 poster arXiv:2507.00586
#22157

FlowSeek: Optical Flow Made Easier with Depth Foundation Models and Motion Bases

Matteo Poggi, Fabio Tosi

ICCV 2025 poster arXiv:2509.05297
#22158

TPG-INR: Target Prior-Guided Implicit 3D CT Reconstruction for Enhanced Sparse-view Imaging

QingleiCao QingleiCao, Ziyao Tang, Xiaoqin Tang

ICCV 2025 highlight
#22159

SpatialCrafter: Unleashing the Imagination of Video Diffusion Models for Scene Reconstruction from Limited Observations

Songchun Zhang, Huiyao Xu, Sitong Guo et al.

ICCV 2025 poster arXiv:2505.11992
#22160

Consistently Simulating Human Personas with Multi-Turn Reinforcement Learning

Marwa Abdulhai, Ryan Cheng, Donovan Clay et al.

NEURIPS 2025 poster arXiv:2511.00222
#22161

All Parts Matter: A Unified Mask-Free Virtual Try-On Framework

Chenghu Du, Shengwu Xiong, Yi Rong

ICCV 2025 poster
#22162

Efficient Visual Place Recognition Through Multimodal Semantic Knowledge Integration

Sitao Zhang, Hongda Mao, Qingshuang Chen et al.

ICCV 2025 poster
#22163

COME: Dual Structure-Semantic Learning with Collaborative MoE for Universal Lesion Detection Across Heterogeneous Ultrasound Datasets

Lingyu Chen, Yawen Zeng, Yue Wang et al.

ICCV 2025 poster arXiv:2508.09886
#22164

NATRA: Noise-Agnostic Framework for Trajectory Prediction with Noisy Observations

Rongqing Li, Changsheng Li, Ruilin Lv et al.

ICCV 2025 poster
#22165

MS3D: High-Quality 3D Generation via Multi-Scale Representation Modeling

Guan Luo, Jianfeng Zhang

ICCV 2025 poster
#22166

JPEG Processing Neural Operator for Backward-Compatible Coding

Woo Kyoung Han, Yongjun Lee, Byeonghun Lee et al.

ICCV 2025 poster arXiv:2507.23521
#22167

UniDxMD: Towards Unified Representation for Cross-Modal Unsupervised Domain Adaptation in 3D Semantic Segmentation

Zhengyin Liang, Hui Yin, Min Liang et al.

ICCV 2025 highlight
#22168

Hybrid Layout Control for Diffusion Transformer: Fewer Annotations, Superior Aesthetics

Keming Wu, Junwen Chen, Zhanhao Liang et al.

ICCV 2025 poster
#22169

PLAN: Proactive Low-Rank Allocation for Continual Learning

XIEQUN WANG, Zhan Zhuang, Yu Zhang

ICCV 2025 poster arXiv:2510.21188
#22170

Leveraging Spatial Invariance to Boost Adversarial Transferability

Zihan Zhou, LI LI, Yanli Ren et al.

ICCV 2025 poster
#22171

LayerLock: Non-collapsing Representation Learning with Progressive Freezing

Goker Erdogan, Nikhil Parthasarathy, Catalin Ionescu et al.

ICCV 2025 poster arXiv:2509.10156
#22172

T2I-Copilot: A Training-Free Multi-Agent Text-to-Image System for Enhanced Prompt Interpretation and Interactive Generation

Chieh-Yun Chen, Min Shi, Gong Zhang et al.

ICCV 2025 poster arXiv:2507.20536
#22173

FedXDS: Leveraging Model Attribution Methods to counteract Data Heterogeneity in Federated Learning

Maximilian Hoefler, Karsten Mueller, Wojciech Samek

ICCV 2025 poster
#22174

Visual Textualization for Image Prompted Object Detection

Yongjian Wu, Yang Zhou, Jiya Saiyin et al.

ICCV 2025 poster arXiv:2506.23785
#22175

TerraMind: Large-Scale Generative Multimodality for Earth Observation

Johannes Jakubik, Felix Yang, Benedikt Blumenstiel et al.

ICCV 2025 poster arXiv:2504.11171
#22176

LLaVA-SP: Enhancing Visual Representation with Visual Spatial Tokens for MLLMs

Haoran Lou, Chunxiao Fan, Ziyan Liu et al.

ICCV 2025 poster arXiv:2507.00505
#22177

Dr. Splat: Directly Referring 3D Gaussian Splatting via Direct Language Embedding Registration

JUNSEONG KIM, GeonU Kim, Kim Yu-Ji et al.

CVPR 2025 highlight arXiv:2502.16652
#22178

AV-Link: Temporally-Aligned Diffusion Features for Cross-Modal Audio-Video Generation

Moayed Haji-Ali, Willi Menapace, Aliaksandr Siarohin et al.

ICCV 2025 poster arXiv:2412.15191
#22179

EgoPressure: A Dataset for Hand Pressure and Pose Estimation in Egocentric Vision

Yiming Zhao, Taein Kwon, Paul Streli et al.

CVPR 2025 highlight arXiv:2409.02224
#22180

Generative Video Bi-flow

Chen Liu, Tobias Ritschel

ICCV 2025 poster arXiv:2503.06364
#22181

Transformer-based Tooth Alignment Prediction with Occlusion and Collision Constraints

DongZhenXing DongZhenXing, Jiazhou Chen

ICCV 2025 poster arXiv:2410.20806
#22182

A Unified Framework for Industrial Cel-Animation Colorization with Temporal-Structural Awareness

Xiaoyi Feng, Tao Huang, Peng Wang et al.

ICCV 2025 poster
#22183

FHGS: Feature-Homogenized Gaussian Splatting

Qigeng Duan, Benyun ZHAO, Mingqiao Han et al.

NEURIPS 2025 poster arXiv:2505.19154
#22184

Towards Robust Defense against Customization via Protective Perturbation Resistant to Diffusion-based Purification

Wenkui Yang, Jie Cao, Junxian Duan et al.

ICCV 2025 highlight arXiv:2509.13922
#22185

SD2Actor: Continuous State Decomposition via Diffusion Embeddings for Robotic Manipulation

lijiayi jiayi

ICCV 2025 poster
#22186

ADCD-Net: Robust Document Image Forgery Localization via Adaptive DCT Feature and Hierarchical Content Disentanglement

KA WONG, Jicheng Zhou, Haiwei Wu et al.

ICCV 2025 poster arXiv:2507.16397
#22187

Omegance: A Single Parameter for Various Granularities in Diffusion-Based Synthesis

Xinyu Hou, Zongsheng Yue, Xiaoming Li et al.

ICCV 2025 poster arXiv:2411.17769
#22188

PixTalk: Controlling Photorealistic Image Processing and Editing with Language

Marcos Conde, Zihao Lu, Radu Timofte

ICCV 2025 poster
#22189

Scene Graph Guided Generation: Enable Accurate Relations Generation in Text-to-Image Models via Textural Rectification

Guibao SHEN, Luozhou Wang, Jiantao Lin et al.

ICCV 2025 poster
#22190

ReMP-AD: Retrieval-enhanced Multi-modal Prompt Fusion for Few-Shot Industrial Visual Anomaly Detection

Hongchi Ma, Guanglei Yang, Debin Zhao et al.

ICCV 2025 poster
#22191

GMMamba: Group Masking Mamba for Whole Slide Image Classification

Tingting Zheng, Hongxun Yao, Kui Jiang et al.

ICCV 2025 poster
#22192

TimeFormer: Capturing Temporal Relationships of Deformable 3D Gaussians for Robust Reconstruction

Dadong Jiang, Zhi Hou, Zhihui Ke et al.

ICCV 2025 poster arXiv:2411.11941
#22193

Beyond Brain Decoding: Visual-Semantic Reconstructions to Mental Creation Extension Based on fMRI

Haodong Jing, Dongyao Jiang, Yongqiang Ma et al.

ICCV 2025 poster
#22194

RareCLIP: Rarity-aware Online Zero-shot Industrial Anomaly Detection

Jianfang He, Min Cao, Silong Peng et al.

ICCV 2025 poster
#22195

Tracing Copied Pixels and Regularizing Patch Affinity in Copy Detection

Yichen Lu, Siwei Nie, Minlong Lu et al.

ICCV 2025 poster
#22196

Pretrained Reversible Generation as Unsupervised Visual Representation Learning

Rongkun Xue, Jinouwen Zhang, Yazhe Niu et al.

ICCV 2025 poster arXiv:2412.01787
#22197

BadVideo: Stealthy Backdoor Attack against Text-to-Video Generation

Ruotong Wang, Mingli Zhu, Jiarong Ou et al.

ICCV 2025 poster arXiv:2504.16907
#22198

Temporal Rate Reduction Clustering for Human Motion Segmentation

Xianghan Meng, Zhengyu Tong, Zhiyuan Huang et al.

ICCV 2025 poster arXiv:2506.21249
#22199

Hierarchy UGP: Hierarchy Unified Gaussian Primitive for Large-Scale Dynamic Scene Reconstruction

Hongyang Sun, Qinglin Yang, Jiawei Wang et al.

ICCV 2025 poster
#22200

Beyond Isolated Words: Diffusion Brush for Handwritten Text-Line Generation

Gang Dai, Yifan Zhang, Yutao Qin et al.

ICCV 2025 poster arXiv:2508.03256