Most Cited ICCV "bev segmentation" Papers

2,701 papers found • Page 12 of 14

#2201

Federated Domain Generalization with Domain-specific Soft Prompts Generation

Jianhan Wu, Xiaoyang Qu, Zhangcheng Huang et al.

ICCV 2025posterarXiv:2509.20807
#2202

ForgeLens: Data-Efficient Forgery Focus for Generalizable Forgery Image Detection

Yingjian Chen, Lei Zhang, Yakun Niu

ICCV 2025posterarXiv:2408.13697
#2203

Incremental Few-Shot Semantic Segmentation via Multi-Level Switchable Visual Prompts

Maoxian Wan, Kaige Li, Qichuan Geng et al.

ICCV 2025poster
#2204

Embodied Representation Alignment with Mirror Neurons

Wentao Zhu, Zhining Zhang, Yuwei Ren et al.

ICCV 2025posterarXiv:2509.21136
#2205

Selective Contrastive Learning for Weakly Supervised Affordance Grounding

WonJun Moon, Hyun Seok Seong, Jae-Pil Heo

ICCV 2025posterarXiv:2508.07877
#2206

EVOLVE: Event-Guided Deformable Feature Transfer and Dual-Memory Refinement for Low-Light Video Object Segmentation

Jong Hyeon Baek, Jiwon oh, Yeong Jun Koh

ICCV 2025poster
#2207

AG2aussian: Anchor-Graph Structured Gaussian Splatting for Instance-Level 3D Scene Understanding and Editing

Zhaonan Wang, Manyi Li, Changhe Tu

ICCV 2025poster
#2208

InterGSEdit: Interactive 3D Gaussian Splatting Editing with 3D Geometry-Consistent Attention Prior

Minghao Wen, Shengjie Wu, Kangkan Wang et al.

ICCV 2025posterarXiv:2507.04961
#2209

Benchmarking Multimodal Large Language Models Against Image Corruptions

Xinkuan Qiu, Meina Kan, Yongbin Zhou et al.

ICCV 2025poster
#2210

Deterministic Object Pose Confidence Region Estimation

Jinghao Wang, Zhang Li, Zi Wang et al.

ICCV 2025posterarXiv:2506.22720
#2211

Decoupled Multi-Predictor Optimization for Inference-Efficient Model Tuning

Liwei Luo, Shuaitengyuan Li, Dongwei Ren et al.

ICCV 2025posterarXiv:2511.03245
#2212

ACAM-KD: Adaptive and Cooperative Attention Masking for Knowledge Distillation

Qizhen Lan, Qing Tian

ICCV 2025posterarXiv:2503.06307
#2213

MotionCtrl: A Real-time Controllable Vision-Language-Motion Model

Bin Cao, Sipeng Zheng, Ye Wang et al.

ICCV 2025poster
#2214

SALAD -- Semantics-Aware Logical Anomaly Detection

Matic Fučka, Vitjan Zavrtanik, Danijel Skocaj

ICCV 2025posterarXiv:2509.02101
#2215

VLR-Driver: Large Vision-Language-Reasoning Models for Embodied Autonomous Driving

Fanjie Kong, Yitong Li, Weihuang Chen et al.

ICCV 2025poster
#2216

Vid-Group: Temporal Video Grounding Pretraining from Unlabeled Videos in the Wild

Peijun Bao, Chenqi Kong, SIYUAN YANG et al.

ICCV 2025poster
#2217

Temperature in Cosine-based Softmax Loss

Takumi Kobayashi

ICCV 2025poster
#2218

Multi-modal Segment Anything Model for Camouflaged Scene Segmentation

Guangyu Ren, Hengyan Liu, Michalis Lazarou et al.

ICCV 2025poster
#2219

Can We Achieve Efficient Diffusion Without Self-Attention? Distilling Self-Attention into Convolutions

ZiYi Dong, Chengxing Zhou, Weijian Deng et al.

ICCV 2025posterarXiv:2504.21292
#2220

Ultra-Precision 6DoF Pose Estimation Using 2-D Interpolated Discrete Fourier Transform

Guowei Shi, Zian Mao, Peisen Huang

ICCV 2025poster
#2221

AMDANet: Attention-Driven Multi-Perspective Discrepancy Alignment for RGB-Infrared Image Fusion and Segmentation

Haifeng Zhong, Fan Tang, Zhuo Chen et al.

ICCV 2025poster
#2222

Prompt Guidance and Human Proximal Perception for HOT Prediction with Regional Joint Loss

Yuxiao Wang, Yu Lei, Zhenao WEI et al.

ICCV 2025posterarXiv:2507.01630
#2223

Coupling the Generator with Teacher for Effective Data-Free Knowledge Distillation

Xu Chen, Yang Li, Yahong Han et al.

ICCV 2025poster
#2224

Towards a Universal Image Degradation Model via Content-Degradation Disentanglement

Wenbo Yang, Zhongling Wang, Zhou Wang

ICCV 2025posterarXiv:2505.12860
#2225

Know Your Attention Maps: Class-specific Token Masking for Weakly Supervised Semantic Segmentation

Joëlle Hanna, Damian Borth

ICCV 2025posterarXiv:2507.06848
#2226

Unraveling the Smoothness Properties of Diffusion Models: A Gaussian Mixture Perspective

Yingyu Liang, Zhizhou Sha, Zhenmei Shi et al.

ICCV 2025posterarXiv:2405.16418
#2227

FDPT: Federated Discrete Prompt Tuning for Black-Box Visual-Language Models

Jiaqi Wu, Simin Chen, Jing Tang et al.

ICCV 2025poster
#2228

A Tiny Change, A Giant Leap: Long-Tailed Class-Incremental Learning via Geometric Prototype Alignment

xinyi lai, Luojun Lin, Weijie Chen et al.

ICCV 2025poster
#2229

Sparfels: Fast Reconstruction from Sparse Unposed Imagery

Shubhendu Jena, Amine Ouasfi, Mae Younes et al.

ICCV 2025highlightarXiv:2505.02178
#2230

Underwater Visual SLAM with Depth Uncertainty and Medium Modeling

Rui Liu, Sheng Fan, Wenguan Wang et al.

ICCV 2025highlight
#2231

LangBridge: Interpreting Image as a Combination of Language Embeddings

Jiaqi Liao, Yuwei Niu, Fanqing Meng et al.

ICCV 2025posterarXiv:2503.19404
#2232

Embodied Navigation with Auxiliary Task of Action Description Prediction

Haru Kondoh, Asako Kanezaki

ICCV 2025posterarXiv:2510.21809
#2233

Contrastive Flow Matching

George Stoica, Vivek Ramanujan, Xiang Fan et al.

ICCV 2025posterarXiv:2506.05350
#2234

HOLa: Zero-Shot HOI Detection with Low-Rank Decomposed VLM Feature Adaptation

Qinqian Lei, Bo Wang, Robby Tan

ICCV 2025posterarXiv:2507.15542
#2235

AllGCD: Leveraging All Unlabeled Data for Generalized Category Discovery

Xinzi Cao, Ke Chen, Feidiao Yang et al.

ICCV 2025poster
#2236

Towards Long-Horizon Vision-Language-Action System: Reasoning, Acting and Memory

Daixun Li, Yusi Zhang, Mingxiang Cao et al.

ICCV 2025poster
#2237

UniFuse: A Unified All-in-One Framework for Multi-Modal Medical Image Fusion Under Diverse Degradations and Misalignments

Dayong Su, Yafei Zhang, Huafeng Li et al.

ICCV 2025posterarXiv:2506.22736
#2238

CopyrightShield: Enhancing Diffusion Model Security Against Copyright Infringement Attacks

Zhixiang Guo, Siyuan Liang, Aishan Liu et al.

ICCV 2025posterarXiv:2412.01528
#2239

Learnable Logit Adjustment for Imbalanced Semi-Supervised Learning under Class Distribution Mismatch

lee hyuck, Taemin Park, Heeyoung Kim

ICCV 2025poster
#2240

DiffPCI: Large Motion Point Cloud frame Interpolation with Diffusion Model

tianyu zhang, Haobo Jiang, jian Yang et al.

ICCV 2025poster
#2241

Local Dense Logit Relations for Enhanced Knowledge Distillation

Liuchi Xu, Kang Liu, Jinshuai Liu et al.

ICCV 2025posterarXiv:2507.15911
#2242

HIS-GPT: Towards 3D Human-In-Scene Multimodal Understanding

JIAHE ZHAO, RuiBing Hou, zejie tian et al.

ICCV 2025posterarXiv:2503.12955
#2243

Soft Local Completeness: Rethinking Completeness in XAI

Ziv Weiss Haddad, Oren Barkan, Yehonatan Elisha et al.

ICCV 2025poster
#2244

PBFG: A New Physically-Based Dataset and Removal of Lens Flares and Glares

Jie Zhu, Sungkil Lee

ICCV 2025poster
#2245

Correspondence as Video: Test-Time Adaption on SAM2 for Reference Segmentation in the Wild

Haoran Wang, Zekun Li, Jian Zhang et al.

ICCV 2025posterarXiv:2508.07759
#2246

An Information-Theoretic Regularizer for Lossy Neural Image Compression

ZHANG YINGWEN, Meng Wang, Xihua Sheng et al.

ICCV 2025posterarXiv:2411.16727
#2247

Controllable Feature Whitening for Hyperparameter-Free Bias Mitigation

Yooshin Cho, Hanbyel Cho, Janghyeon Lee et al.

ICCV 2025posterarXiv:2507.20284
#2248

KV-Edit: Training-Free Image Editing for Precise Background Preservation

Tianrui Zhu, Shiyi Zhang, Jiawei Shao et al.

ICCV 2025posterarXiv:2502.17363
#2249

FusionPhys: A Flexible Framework for Fusing Complementary Sensing Modalities in Remote Physiological Measurement

Chenhang Ying, Huiyu Yang, Jieyi Ge et al.

ICCV 2025poster
#2250

DiffVSR: Revealing an Effective Recipe for Taming Robust Video Super-Resolution Against Complex Degradations

Xiaohui Li, Yihao Liu, Shuo Cao et al.

ICCV 2025posterarXiv:2501.10110
#2251

Power of Cooperative Supervision: Multiple Teachers Framework for Advanced 3D Semi-Supervised Object Detection

Jin-Hee Lee, Jae-keun Lee, Jeseok Kim et al.

ICCV 2025poster
#2252

Adapting In-Domain Few-Shot Segmentation to New Domains without Source Domain Retraining

Qi Fan, Kaiqi Liu, Nian Liu et al.

ICCV 2025posterarXiv:2504.21414
#2253

ASGS: Single-Domain Generalizable Open-Set Object Detection via Adaptive Subgraph Searching

Yuxuan Yuan, Luyao Tang, Chaoqi Chen et al.

ICCV 2025poster
#2254

COVTrack: Continuous Open-Vocabulary Tracking via Adaptive Multi-Cue Fusion

Zekun Qian, Ruize Han, Zhixiang Wang et al.

ICCV 2025poster
#2255

CasP: Improving Semi-Dense Feature Matching Pipeline Leveraging Cascaded Correspondence Priors for Guidance

Peiqi Chen, Lei Yu, Yi Wan et al.

ICCV 2025highlightarXiv:2507.17312
#2256

MMAIF: Multi-task and Multi-degradation All-in-One for Image Fusion with Language Guidance

Zihan Cao, Yu Zhong, Ziqi Wang et al.

ICCV 2025posterarXiv:2503.14944
#2257

Blind Video Super-Resolution based on Implicit Kernels

Qiang Zhu, Yuxuan Jiang, Shuyuan Zhu et al.

ICCV 2025posterarXiv:2503.07856
#2258

Toward Long-Tailed Online Anomaly Detection through Class-Agnostic Concepts

Chiao-An Yang, Kuan-Chuan Peng, Raymond A. Yeh

ICCV 2025posterarXiv:2507.16946
#2259

Adversarial Robustness of Discriminative Self-Supervised Learning in Vision

Ömer Veysel Çağatan, Ömer TAL, M. Emre Gursoy

ICCV 2025posterarXiv:2503.06361
#2260

HPSv3: Towards Wide-Spectrum Human Preference Score

Yuhang Ma, Keqiang Sun, Xiaoshi Wu et al.

ICCV 2025posterarXiv:2508.03789
#2261

UNIS: A Unified Framework for Achieving Unbiased Neural Implicit Surfaces in Volume Rendering

Junkai Deng, Hanting Niu, Jiaze Li et al.

ICCV 2025poster
#2262

IntrinsicControlNet: Cross-distribution Image Generation with Real and Unreal

Jiayuan Lu, Rengan Xie, Zixuan Xie et al.

ICCV 2025poster
#2263

Advancing Text-to-3D Generation with Linearized Lookahead Variational Score Distillation

Yu Lei, Bingde Liu, Qingsong Xie et al.

ICCV 2025posterarXiv:2507.09748
#2264

Steering Guidance for Personalized Text-to-Image Diffusion Models

Sunghyun Park, Seokeon Choi, Hyoungwoo Park et al.

ICCV 2025posterarXiv:2508.00319
#2265

ONLY: One-Layer Intervention Sufficiently Mitigates Hallucinations in Large Vision-Language Models

Zifu Wan, Ce Zhang, Silong Yong et al.

ICCV 2025posterarXiv:2507.00898
#2266

Domain-aware Category-level Geometry Learning Segmentation for 3D Point Clouds

Pei He, Lingling Li, Licheng Jiao et al.

ICCV 2025posterarXiv:2508.11265
#2267

Event-aided Dense and Continuous Point Tracking: Everywhere and Anytime

Zhexiong Wan, Jianqin Luo, Yuchao Dai et al.

ICCV 2025poster
#2268

Context-Aware Academic Emotion Dataset and Benchmark

Luming Zhao, Jingwen Xuan, Jiamin Lou et al.

ICCV 2025posterarXiv:2507.00586
#2269

FlowSeek: Optical Flow Made Easier with Depth Foundation Models and Motion Bases

Matteo Poggi, Fabio Tosi

ICCV 2025posterarXiv:2509.05297
#2270

TPG-INR: Target Prior-Guided Implicit 3D CT Reconstruction for Enhanced Sparse-view Imaging

QingleiCao QingleiCao, Ziyao Tang, Xiaoqin Tang

ICCV 2025highlight
#2271

Efficient Visual Place Recognition Through Multimodal Semantic Knowledge Integration

Sitao Zhang, Hongda Mao, Qingshuang Chen et al.

ICCV 2025poster
#2272

COME: Dual Structure-Semantic Learning with Collaborative MoE for Universal Lesion Detection Across Heterogeneous Ultrasound Datasets

Lingyu Chen, Yawen Zeng, Yue Wang et al.

ICCV 2025posterarXiv:2508.09886
#2273

NATRA: Noise-Agnostic Framework for Trajectory Prediction with Noisy Observations

Rongqing Li, Changsheng Li, Ruilin Lv et al.

ICCV 2025poster
#2274

MS3D: High-Quality 3D Generation via Multi-Scale Representation Modeling

Guan Luo, Jianfeng Zhang

ICCV 2025poster
#2275

UniDxMD: Towards Unified Representation for Cross-Modal Unsupervised Domain Adaptation in 3D Semantic Segmentation

Zhengyin Liang, Hui Yin, Min Liang et al.

ICCV 2025highlight
#2276

PLAN: Proactive Low-Rank Allocation for Continual Learning

XIEQUN WANG, Zhan Zhuang, Yu Zhang

ICCV 2025posterarXiv:2510.21188
#2277

Leveraging Spatial Invariance to Boost Adversarial Transferability

Zihan Zhou, LI LI, Yanli Ren et al.

ICCV 2025poster
#2278

TerraMind: Large-Scale Generative Multimodality for Earth Observation

Johannes Jakubik, Felix Yang, Benedikt Blumenstiel et al.

ICCV 2025posterarXiv:2504.11171
#2279

SD2Actor: Continuous State Decomposition via Diffusion Embeddings for Robotic Manipulation

lijiayi jiayi

ICCV 2025poster
#2280

Omegance: A Single Parameter for Various Granularities in Diffusion-Based Synthesis

Xinyu Hou, Zongsheng Yue, Xiaoming Li et al.

ICCV 2025posterarXiv:2411.17769
#2281

Scene Graph Guided Generation: Enable Accurate Relations Generation in Text-to-Image Models via Textural Rectification

Guibao SHEN, Luozhou Wang, Jiantao Lin et al.

ICCV 2025poster
#2282

ReMP-AD: Retrieval-enhanced Multi-modal Prompt Fusion for Few-Shot Industrial Visual Anomaly Detection

Hongchi Ma, Guanglei Yang, Debin Zhao et al.

ICCV 2025poster
#2283

TimeFormer: Capturing Temporal Relationships of Deformable 3D Gaussians for Robust Reconstruction

Dadong Jiang, Zhi Hou, Zhihui Ke et al.

ICCV 2025posterarXiv:2411.11941
#2284

Backdoor Mitigation by Distance-Driven Detoxification

Shaokui Wei, Jiayin Liu, Hongyuan Zha

ICCV 2025highlightarXiv:2411.09585
#2285

UnrealZoo: Enriching Photo-realistic Virtual Worlds for Embodied AI

Fangwei Zhong, Kui Wu, Churan Wang et al.

ICCV 2025highlightarXiv:2412.20977
#2286

HFD-Teacher: High-Frequency Depth Distillation from Depth Foundation Models for Enhanced Depth Completion

Zhiyuan Yang, Anqi Cheng, Haiyue Zhu et al.

ICCV 2025poster
#2287

Height-Fidelity Dense Global Fusion for Multi-modal 3D Object Detection

Hanshi Wang, Jin Gao, Weiming Hu et al.

ICCV 2025highlightarXiv:2507.04369
#2288

SMSTracker: Tri-path Score Mask Sigma Fusion for Multi-Modal Tracking

Sixian Chan, Zedong Li, Xiaoqin Zhang et al.

ICCV 2025highlight
#2289

Two Losses, One Goal: Balancing Conflict Gradients for Semi-supervised Semantic Segmentation

Rui Sun, Huayu Mai, Wangkai Li et al.

ICCV 2025highlight
#2290

Region-based Cluster Discrimination for Visual Representation Learning

Yin Xie, Kaicheng Yang, Xiang An et al.

ICCV 2025highlightarXiv:2507.20025
#2291

CMB-ML: A Cosmic Microwave Background Dataset for the Oldest Possible Computer Vision Task

James Amato, Yunan Xie, Leonel Medina-Varela et al.

ICCV 2025poster
#2292

Shape of Motion: 4D Reconstruction from a Single Video

Qianqian Wang, Vickie Ye, Hang Gao et al.

ICCV 2025highlightarXiv:2407.13764
#2293

EditCLIP: Representation Learning for Image Editing

Qian Wang, Aleksandar Cvejic, Abdelrahman Eldesokey et al.

ICCV 2025posterarXiv:2503.20318
#2294

MOVE: Motion-Guided Few-Shot Video Object Segmentation

Kaining Ying, Hengrui Hu, Henghui Ding

ICCV 2025posterarXiv:2507.22061
#2295

CorrCLIP: Reconstructing Patch Correlations in CLIP for Open-Vocabulary Semantic Segmentation

Dengke Zhang, Fagui Liu, Quan Tang

ICCV 2025posterarXiv:2411.10086
#2296

mmCooper: A Multi-agent Multi-stage Communication-efficient and Collaboration-robust Cooperative Perception Framework

Bingyi Liu, Jian Teng, Hongfei Xue et al.

ICCV 2025posterarXiv:2501.12263
#2297

FreqPDE: Rethinking Positional Depth Embedding for Multi-View 3D Object Detection Transformers

Junjie Zhang, Haisheng Su, Feixiang Song et al.

ICCV 2025posterarXiv:2510.15385
#2298

RealCam-I2V: Real-World Image-to-Video Generation with Interactive Complex Camera Control

Teng Li, Guangcong Zheng, Rui Jiang et al.

ICCV 2025posterarXiv:2502.10059
#2299

VAGUE: Visual Contexts Clarify Ambiguous Expressions

Heejeong Nam, Jinwoo Ahn, Keummin Ka et al.

ICCV 2025posterarXiv:2411.14137
#2300

What's Making That Sound Right Now? Video-centric Audio-Visual Localization

hahyeon choi, Junhoo Lee, Nojun Kwak

ICCV 2025posterarXiv:2507.04667
#2301

RARE: Refine Any Registration of Pairwise Point Clouds via Zero-Shot Learning

Chengyu Zheng, Honghua Chen, Jin Huang et al.

ICCV 2025posterarXiv:2507.19950
#2302

OV-SCAN: Semantically Consistent Alignment for Novel Object Discovery in Open-Vocabulary 3D Object Detection

Adrian Chow, Evelien Riddell, Yimu Wang et al.

ICCV 2025posterarXiv:2503.06435
#2303

SC-Lane: Slope-aware and Consistent Road Height Estimation Framework for 3D Lane Detection

Chaesong Park, Eunbin Seo, JihyeonHwang JihyeonHwang et al.

ICCV 2025posterarXiv:2508.10411
#2304

Exploring the Visual Feature Space for Multimodal Neural Decoding

Weihao Xia, Cengiz Oztireli

ICCV 2025posterarXiv:2505.15755
#2305

Backdoor Defense via Enhanced Splitting and Trap Isolation

Hongrui Yu, Lu Qi, Wanyu Lin et al.

ICCV 2025poster
#2306

ATAS: Any-to-Any Self-Distillation for Enhanced Open-Vocabulary Dense Prediction

Soonwoo Cha, Jiwoo Song, Juan Yeo et al.

ICCV 2025posterarXiv:2506.08678
#2307

D3: Training-Free AI-Generated Video Detection Using Second-Order Features

Chende Zheng, Ruiqi suo, Chenhao Lin et al.

ICCV 2025posterarXiv:2508.00701
#2308

Overcoming Dual Drift for Continual Long-Tailed Visual Question Answering

Feifei Zhang, Zhihao Wang, Xi Zhang et al.

ICCV 2025poster
#2309

χ: Symmetry Understanding of 3D Shapes via Chirality Disentanglement

Weikang Wang, Tobias Weißberg, Nafie El Amrani et al.

ICCV 2025poster
#2310

VideoAuteur: Towards Long Narrative Video Generation

Junfei Xiao, Feng Cheng, Lu Qi et al.

ICCV 2025posterarXiv:2501.06173
#2311

Robust and Efficient 3D Gaussian Splatting for Urban Scene Reconstruction

Zhensheng Yuan, Haozhi Huang, Zhen Xiong et al.

ICCV 2025posterarXiv:2507.23006
#2312

Neural Architecture Search Driven by Locally Guided Diffusion for Personalized Federated Learning

PENG LIAO, Xilu Wang, Yaochu Jin et al.

ICCV 2025poster
#2313

Bridging Local Inductive Bias and Long-Range Dependencies with Pixel-Mamba for End-to-end Whole Slide Image Analysis

Zhongwei Qiu, Hanqing Chao, Tiancheng Lin et al.

ICCV 2025poster
#2314

Neuroverse3D: Developing In-Context Learning Universal Model for Neuroimaging in 3D

Jiesi Hu, Hanyang Peng, Yanwu Yang et al.

ICCV 2025posterarXiv:2503.02410
#2315

Taming Flow Matching with Unbalanced Optimal Transport into Fast Pansharpening

Zihan Cao, Yu Zhong, Liang-Jian Deng

ICCV 2025posterarXiv:2503.14975
#2316

ZeroKey: Point-Level Reasoning and Zero-Shot 3D Keypoint Detection from Large Language Models

Bingchen Gong, Diego Gomez, Abdullah Hamdi et al.

ICCV 2025posterarXiv:2412.06292
#2317

How Do Multimodal Large Language Models Handle Complex Multimodal Reasoning? Placing Them in An Extensible Escape Game

Ziyue Wang, Yurui Dong, Fuwen Luo et al.

ICCV 2025poster
#2318

Towards Human-like Virtual Beings: Simulating Human Behavior in 3D Scenes

CHEN LIANG, Wenguan Wang, Yi Yang

ICCV 2025poster
#2319

S3R-GS: Streamlining the Pipeline for Large-Scale Street Scene Reconstruction

Guangting Zheng, Jiajun Deng, Xiaomeng Chu et al.

ICCV 2025posterarXiv:2503.08217
#2320

The Source Image is the Best Attention for Infrared and Visible Image Fusion

Song Wang, Xie Han, Liqun Kuang et al.

ICCV 2025poster
#2321

Video2BEV: Transforming Drone Videos to BEVs for Video-based Geo-localization

Hao Ju, Shaofei Huang, Si Liu et al.

ICCV 2025posterarXiv:2411.13610
#2322

CLIPer: Hierarchically Improving Spatial Representation of CLIP for Open-Vocabulary Semantic Segmentation

Lin Sun, Jiale Cao, Jin Xie et al.

ICCV 2025posterarXiv:2411.13836
#2323

Wave-MambaAD: Wavelet-driven State Space Model for Multi-class Unsupervised Anomaly Detection

Qiao Zhang, Mingwen Shao, Xinyuan Chen et al.

ICCV 2025poster
#2324

Scendi Score: Prompt‑Aware Diversity Evaluation via Schur Complement of CLIP Embeddings

Azim Ospanov, Mohammad Jalali, Farzan Farnia

ICCV 2025highlightarXiv:2412.18645
#2325

Scaling Laws for Native Multimodal Models

Mustafa Shukor, Enrico Fini, Victor Guilherme Turrisi da Costa et al.

ICCV 2025posterarXiv:2504.07951
#2326

VoxelKP: A Voxel-based Network Architecture for Human Keypoint Estimation in LiDAR Data

Jian Shi, Peter Wonka

ICCV 2025posterarXiv:2312.08871
#2327

A View-consistent Sampling Method for Regularized Training of Neural Radiance Fields

Aoxiang Fan, Corentin Dumery, Nicolas Talabot et al.

ICCV 2025posterarXiv:2507.04408
#2328

Autoregressive Denoising Score Matching is a Good Video Anomaly Detector

hanwen Zhang, Congqi Cao, Qinyi Lv et al.

ICCV 2025posterarXiv:2506.23282
#2329

IFAdapter: Instance Feature Control for Grounded Text-to-Image Generation

YINWEI WU, Xianpan Zhou, bing ma et al.

ICCV 2025posterarXiv:2409.08240
#2330

A Constrained Optimization Approach for Gaussian Splatting from Coarsely-posed Images and Noisy Lidar Point Clouds

Jizong Peng, Tze Ho Elden Tse, Kai Xu et al.

ICCV 2025highlightarXiv:2504.09129
#2331

Towards Scalable Spatial Intelligence via 2D-to-3D Data Lifting

Xingyu Miao, Haoran Duan, Quanhao Qian et al.

ICCV 2025highlightarXiv:2507.18678
#2332

EYE3:Turn Anything into Naked-eye 3D

Yingde Song, Zongyuan Yang, Baolin Liu et al.

ICCV 2025poster
#2333

C2MIL: Synchronizing Semantic and Topological Causalities in Multiple Instance Learning for Robust and Interpretable Survival Analysis

Min Cen, Zhenfeng Zhuang, Yuzhe Zhang et al.

ICCV 2025poster
#2334

SEGA: A Stepwise Evolution Paradigm for Content-Aware Layout Generation with Design Prior

Bo Zhao, Haoran Wang, Jinghui Wang et al.

ICCV 2025highlightarXiv:2510.15749
#2335

TryOn-Refiner: Conditional Rectified-flow-based TryOn Refiner for More Accurate Detail Reconstruction

Wen Qian

ICCV 2025poster
#2336

Scoring, Remember, and Reference: Catching Camouflaged Objects in Videos

Yuang Feng, Shuyong Gao, Fuzhen Yan et al.

ICCV 2025posterarXiv:2503.17050
#2337

Recognizing Actions from Robotic View for Natural Human-Robot Interaction

Ziyi Wang, Peiming Li, Hong Liu et al.

ICCV 2025posterarXiv:2507.22522
#2338

Addressing Text Embedding Leakage in Diffusion-based Image Editing

Sunung Mun, Jinhwan Nam, Sunghyun Cho et al.

ICCV 2025posterarXiv:2412.04715
#2339

Robin3D: Improving 3D Large Language Model via Robust Instruction Tuning

Weitai Kang, Haifeng Huang, Yuzhang Shang et al.

ICCV 2025posterarXiv:2410.00255
#2340

FRET: Feature Redundancy Elimination for Test Time Adaptation

Linjing You, Jiabao Lu, Xiayuan Huang et al.

ICCV 2025posterarXiv:2505.10641
#2341

Motion-2-to-3: Leveraging 2D Motion Data for 3D Motion Generations

Ruoxi Guo, Huaijin Pi, Zehong Shen et al.

ICCV 2025poster
#2342

A₀ : An Affordance-Aware Hierarchical Model for General Robotic Manipulation

Rongtao Xu, Jian Zhang, Minghao Guo et al.

ICCV 2025posterarXiv:2504.12636
#2343

PVMamba: Parallelizing Vision Mamba via Dynamic State Aggregation

Fei Xie, Zhongdao Wang, Weijia Zhang et al.

ICCV 2025poster
#2344

Controllable and Expressive One-Shot Video Head Swapping

Chaonan Ji, Jinwei Qi, Peng Zhang et al.

ICCV 2025posterarXiv:2506.16852
#2345

Adversarial Training for Probabilistic Robustness

YI ZHANG, Yuhang Chen, Zhen Chen et al.

ICCV 2025poster
#2346

Learning to See Inside Opaque Liquid Containers using Speckle Vibrometry

Matan Kichler, Shai Bagon, Mark Sheinin

ICCV 2025posterarXiv:2507.20757
#2347

LightBSR: Towards Lightweight Blind Super-Resolution via Discriminative Implicit Degradation Representation Learning

Jiang Yuan, ji ma, Bo Wang et al.

ICCV 2025posterarXiv:2506.22710
#2348

When Pixel Difference Patterns Meet ViT: PiDiViT for Few-Shot Object Detection

Hongliang Zhou, Yongxiang Liu, Canyu Mo et al.

ICCV 2025poster
#2349

SPD: Shallow Backdoor Protecting Deep Backdoor Against Backdoor Detection

Shunjie Yuan, Xinghua Li, Xuelin Cao et al.

ICCV 2025poster
#2350

Rethinking DPO-style Diffusion Aligning Frameworks

XUN WU, Shaohan Huang, Lingjie Jiang et al.

ICCV 2025highlight
#2351

Ensemble Foreground Management for Unsupervised Object Discovery

Ziling Wu, Armaghan Moemeni, Praminda Caleb-Solly

ICCV 2025highlightarXiv:2507.20860
#2352

Hierarchical Variational Test-Time Prompt Generation for Zero-Shot Generalization

Zhaoyang Wu, Fang Liu, Licheng Jiao et al.

ICCV 2025poster
#2353

OCSplats: Observation Completeness Quantification and Label Noise Separation in 3DGS

Han Ling, Yinghui Sun, Xian Xu et al.

ICCV 2025posterarXiv:2508.01239
#2354

GWM: Towards Scalable Gaussian World Models for Robotic Manipulation

Guanxing Lu, Baoxiong Jia, Puhao Li et al.

ICCV 2025posterarXiv:2508.17600
#2355

Boosting Multimodal Learning via Disentangled Gradient Learning

Shicai Wei, Chunbo Luo, Yang Luo

ICCV 2025posterarXiv:2507.10213
#2356

TAG-WM: Tamper-Aware Generative Image Watermarking via Diffusion Inversion Sensitivity

Yuzhuo Chen, Zehua Ma, Han Fang et al.

ICCV 2025posterarXiv:2506.23484
#2357

HORT: Monocular Hand-held Objects Reconstruction with Transformers

Zerui Chen, Rolandos Alexandros Potamias, Shizhe Chen et al.

ICCV 2025posterarXiv:2503.21313
#2358

CaliMatch: Adaptive Calibration for Improving Safe Semi-supervised Learning

Jinsoo Bae, Seoung Bum Kim, Hyungrok Do

ICCV 2025posterarXiv:2508.00922
#2359

Reminiscence Attack on Residuals: Exploiting Approximate Machine Unlearning for Privacy

Yaxin Xiao, Qingqing Ye, Li Hu et al.

ICCV 2025posterarXiv:2507.20573
#2360

Tensor-aggregated LoRA in Federated Fine-tuning

Zhixuan Li, Binqian Xu, Xiangbo Shu et al.

ICCV 2025poster
#2361

QR-LoRA: Efficient and Disentangled Fine-tuning via QR Decomposition for Customized Generation

Jiahui Yang, Yongjia Ma, Donglin Di et al.

ICCV 2025posterarXiv:2507.04599
#2362

Self-Supervised Sparse Sensor Fusion for Long Range Perception

Edoardo Palladin, Samuel Brucker, Filippo Ghilotti et al.

ICCV 2025posterarXiv:2508.13995
#2363

AccidentalGS: 3D Gaussian Splatting from Accidental Camera Motion

Mao Mao, Xujie Shen, Guyuan Chen et al.

ICCV 2025poster
#2364

Competitive Distillation: A Simple Learning Strategy for Improving Visual Classification

Daqian Shi, Xiaolei Diao, Xu Chen et al.

ICCV 2025posterarXiv:2506.23285
#2365

Unified Adversarial Augmentation for Improving Palmprint Recognition

Jianlong Jin, Chenglong Zhao, Ruixin Zhang et al.

ICCV 2025poster
#2366

Adding Additional Control to One-Step Diffusion with Joint Distribution Matching

Yihong Luo, Tianyang Hu, Yifan Song et al.

ICCV 2025posterarXiv:2503.06652
#2367

Unified Multi-Agent Trajectory Modeling with Masked Trajectory Diffusion

songru Yang, Zhenwei Shi, Zhengxia Zou

ICCV 2025poster
#2368

Bridging Class Imbalance and Partial Labeling via Spectral-Balanced Energy Propagation for Skeleton-based Action Recognition

Yandan Wang, Chenqi Guo, Yinglong Ma et al.

ICCV 2025poster
#2369

ForeSight: Multi-View Streaming Joint Object Detection and Trajectory Forecasting

Sandro Papais, Letian Wang, Brian Cheong et al.

ICCV 2025posterarXiv:2508.07089
#2370

Dual Domain Control via Active Learning for Remote Sensing Domain Incremental Object Detection

Jiachen Sun, De Cheng, Xi Yang et al.

ICCV 2025poster
#2371

Enpowering Your Pansharpening Models with Generalizability: Unified Distribution is All You Need

Yongchuan Cui, Peng Liu, HUI ZHANG

ICCV 2025posterarXiv:2510.22217
#2372

Beyond Low-Rank Tuning: Model Prior-Guided Rank Allocation for Effective Transfer in Low-Data and Large-Gap Regimes.

Chuyan Zhang, Kefan Wang, Yun Gu

ICCV 2025posterarXiv:2507.00327
#2373

OracleFusion: Assisting the Decipherment of Oracle Bone Script with Structurally Constrained Semantic Typography

Li Caoshuo, Zengmao Ding, Xiaobin Hu et al.

ICCV 2025posterarXiv:2506.21101
#2374

COSMO: Combination of Selective Memorization for Low-cost Vision-and-Language Navigation

Siqi Zhang, Yanyuan Qiao, Qunbo Wang et al.

ICCV 2025posterarXiv:2503.24065
#2375

CoStoDet-DDPM: Collaborative Training of Stochastic and Deterministic Models Improves Surgical Workflow Anticipation and Recognition

Kaixiang Yang, Xin Li, Qiang Li et al.

ICCV 2025posterarXiv:2503.10216
#2376

Exploring Weather-aware Aggregation and Adaptation for Semantic Segmentation under Adverse Conditions

Yuwen Pan, Rui Sun, Wangkai Li et al.

ICCV 2025poster
#2377

Transparent Vision: A Theory of Hierarchical Invariant Representations

Shuren Qi, Yushu Zhang, CHAO WANG et al.

ICCV 2025poster
#2378

TemCoCo: Temporally Consistent Multi-modal Video Fusion with Visual-Semantic Collaboration

Gong Meiqi, Hao Zhang, Xunpeng Yi et al.

ICCV 2025posterarXiv:2508.17817
#2379

RetinexMCNet: A Memory Controller Dominated Network for Low-Light Video Enhancement Based on Retinex

Meiao Wang, Xuejing Kang, Yaxi Lu et al.

ICCV 2025poster
#2380

Sliced Wasserstein Bridge for Open-Vocabulary Video Instance Segmentation

Zheyun Qin, Deng Yu, Chuanchen Luo et al.

ICCV 2025highlight
#2381

Frequency-Aware Autoregressive Modeling for Efficient High-Resolution Image Synthesis

Zhuokun Chen, Jugang Fan, Zhuowei Yu et al.

ICCV 2025posterarXiv:2507.20454
#2382

Lightweight Gradient-Aware Upscaling of 3D Gaussian Splatting Images

Simon Niedermayr, Christoph Neuhauser, Rüdiger Westermann

ICCV 2025posterarXiv:2503.14171
#2383

RoBridge: A Hierarchical Architecture Bridging Cognition and Execution for General Robotic Manipulation

Kaidong Zhang, Rongtao Xu, Ren Pengzhen et al.

ICCV 2025posterarXiv:2505.01709
#2384

3D Gaussian Splatting Driven Multi-View Robust Physical Adversarial Camouflage Generation

Tianrui Lou, Xiaojun Jia, Siyuan Liang et al.

ICCV 2025posterarXiv:2507.01367
#2385

LLM-Assisted Semantic Guidance for Sparsely Annotated Remote Sensing Object Detection

Wei Liao, Chunyan Xu, Chenxu Wang et al.

ICCV 2025posterarXiv:2509.16970
#2386

DADM: Dual Alignment of Domain and Modality for Face Anti-spoofing

Yang JingYi, Xun Lin, Zitong YU et al.

ICCV 2025posterarXiv:2503.00429
#2387

PacGDC: Label-Efficient Generalizable Depth Completion with Projection Ambiguity and Consistency

Haotian Wang, Aoran Xiao, Xiaoqin Zhang et al.

ICCV 2025posterarXiv:2507.07374
#2388

SuMa: A Subspace Mapping Approach for Robust and Effective Concept Erasure in Text-to-Image Diffusion Models

Kien Nguyen, Anh Tran, Cuong Pham

ICCV 2025posterarXiv:2509.05625
#2389

Recovering Parametric Scenes from Very Few Time-of-Flight Pixels

Carter Sifferman, Yiquan Li, Yiming Li et al.

ICCV 2025posterarXiv:2509.16132
#2390

MCAM: Multimodal Causal Analysis Model for Ego-Vehicle-Level Driving Video Understanding

Tongtong Cheng, Rongzhen Li, Yixin Xiong et al.

ICCV 2025posterarXiv:2507.06072
#2391

Engage for All: Making Ordinary Image Descriptions Appealing Again!

Yuyan Chen, Yifan Jiang, Li Zhou et al.

ICCV 2025poster
#2392

HiGarment: Cross-modal Harmony Based Diffusion Model for Flat Sketch to Realistic Garment Image

Junyi Guo, Jingxuan Zhang, Fangyu Wu et al.

ICCV 2025posterarXiv:2505.23186
#2393

Geometry Distributions

Biao Zhang, Jing Ren, Peter Wonka

ICCV 2025highlightarXiv:2411.16076
#2394

Towards Effective Foundation Model Adaptation for Extreme Cross-Domain Few-Shot Learning

Fei Zhou, Peng Wang, Lei Zhang et al.

ICCV 2025poster
#2395

Efficient Adaptation of Pre-trained Vision Transformer underpinned by Approximately Orthogonal Fine-Tuning Strategy

Yiting Yang, Hao Luo, Yuan Sun et al.

ICCV 2025posterarXiv:2507.13260
#2396

ConsistentCity: Semantic Flow-guided Occupancy DiT for Temporally Consistent Driving Scene Synthesis

Benjin Zhu, Xiaogang Wang, Hongsheng Li

ICCV 2025poster
#2397

Gaussian Variation Field Diffusion for High-fidelity Video-to-4D Synthesis

Bowen Zhang, Sicheng Xu, Chuxin Wang et al.

ICCV 2025posterarXiv:2507.23785
#2398

Neurons: Emulating the Human Visual Cortex Improves Fidelity and Interpretability in fMRI-to-Video Reconstruction

Haonan Wang, Qixiang ZHANG, Lehan Wang et al.

ICCV 2025posterarXiv:2503.11167
#2399

Outdoor Monocular SLAM with Global Scale-Consistent 3D Gaussian Pointmaps

Chong Cheng, Sicheng Yu, Zijian Wang et al.

ICCV 2025posterarXiv:2507.03737
#2400

RogSplat: Robust Gaussian Splatting via Generative Priors

Hanyang Kong, Xingyi Yang, Xinchao Wang

ICCV 2025poster