Most Cited ICCV "cost minimization" Papers

2,701 papers found • Page 10 of 14

#1801

Bridging the Sky and Ground: Towards View-Invariant Feature Learning for Aerial-Ground Person Re-Identification

Wajahat Khalid, Bin Liu, Xulin Li et al.

ICCV 2025
#1802

PASD: A Pixel-Adaptive Swarm Dynamics Approach for Unsupervised Low-Light Image Enhancement

Shuai Jin, Yuhua Qian, Feijiang Li et al.

ICCV 2025
#1803

Proactive Scene Decomposition and Reconstruction

Baicheng Li, Zike Yan, Dong Wu et al.

ICCV 2025arXiv:2510.16272
#1804

Unified Category-Level Object Detection and Pose Estimation from RGB Images using 3D Prototypes

Tom Fischer, Xiaojie Zhang, Eddy Ilg

ICCV 2025arXiv:2508.02157
#1805

A Hyperdimensional One Place Signature to Represent Them All: Stackable Descriptors For Visual Place Recognition

Connor Malone, Somayeh Hussaini, Tobias Fischer et al.

ICCV 2025arXiv:2412.06153
#1806

WalkVLM: Aid Visually Impaired People Walking by Vision Language Model

Zhiqiang Yuan, Ting Zhang, Yeshuang Zhu et al.

ICCV 2025
#1807

Error Recognition in Procedural Videos using Generalized Task Graph

Shih-Po Lee, Ehsan Elhamifar

ICCV 2025
#1808

MotionStreamer: Streaming Motion Generation via Diffusion-based Autoregressive Model in Causal Latent Space

Lixing Xiao, Shunlin Lu, Huaijin Pi et al.

ICCV 2025arXiv:2503.15451
#1809

Mixture of Experts Guided by Gaussian Splatters Matters: A new Approach to Weakly-Supervised Video Anomaly Detection

Giacomo D'Amicantonio, Snehashis Majhi, Quan Kong et al.

ICCV 2025highlightarXiv:2508.06318
#1810

What If: Understanding Motion Through Sparse Interactions

Stefan A. Baumann, Nick Stracke, Timy Phan et al.

ICCV 2025
#1811

RoboAnnotatorX: A Comprehensive and Universal Annotation Framework for Accurate Understanding of Long-horizon Robot Demonstration

Longxin Kou, Fei Ni, Jianye HAO et al.

ICCV 2025
#1812

FaceShield: Defending Facial Image against Deepfake Threats

Jaehwan Jeong, Sumin In, Sieun Kim et al.

ICCV 2025arXiv:2412.09921
#1813

Task-Oriented Human Grasp Synthesis via Context- and Task-Aware Diffusers

An Lun Liu, Yu-Wei Chao, Yi-Ting Chen

ICCV 2025arXiv:2507.11287
#1814

Ouroboros: Single-step Diffusion Models for Cycle-consistent Forward and Inverse Rendering

shanlin sun, Yifan Wang, Hanwen Zhang et al.

ICCV 2025arXiv:2508.14461
#1815

Beyond Label Semantics: Language-Guided Action Anatomy for Few-shot Action Recognition

Zefeng Qian, Xincheng Yao, Yifei Huang et al.

ICCV 2025arXiv:2507.16287
#1816

MamTiff-CAD: Multi-Scale Latent Diffusion with Mamba+ for Complex Parametric Sequence

Liyuan Deng, Yunpeng Bai, Yongkang Dai et al.

ICCV 2025arXiv:2511.17647
#1817

Local Scale Equivariance with Latent Deep Equilibrium Canonicalizer

Md Ashiqur Rahman, Chiao-An Yang, Michael N Cheng et al.

ICCV 2025arXiv:2508.14187
#1818

EfficientMT: Efficient Temporal Adaptation for Motion Transfer in Text-to-Video Diffusion Models

Yufei Cai, Hu Han, Yuxiang Wei et al.

ICCV 2025arXiv:2503.19369
#1819

InteractAvatar: Modeling Hand-Face Interaction in Photorealistic Avatars with Deformable Gaussians

Kefan Chen, Sergiu Oprea, Justin Theiss et al.

ICCV 2025arXiv:2504.07949
#1820

Im2Haircut: Single-view Strand-based Hair Reconstruction for Human Avatars

Vanessa Sklyarova, Egor Zakharov, Malte Prinzler et al.

ICCV 2025arXiv:2509.01469
#1821

TeRA: Rethinking Text-guided Realistic 3D Avatar Generation

Yanwen Wang, Yiyu Zhuang, Jiawei Zhang et al.

ICCV 2025arXiv:2509.02466
#1822

Open-World Skill Discovery from Unsegmented Demonstration Videos

Jingwen Deng, Zihao Wang, Shaofei Cai et al.

ICCV 2025
#1823

Deep Adaptive Unfolded Network via Spatial Morphology Stripping and Spectral Filtration for Pan-sharpening

Hebaixu Wang, Jiayi Ma

ICCV 2025
#1824

Reference-based Super-Resolution via Image-based Retrieval-Augmented Generation Diffusion

Byeonghun Lee, Hyunmin Cho, Honggyu Choi et al.

ICCV 2025
#1825

Vulnerability-Aware Spatio-Temporal Learning for Generalizable Deepfake Video Detection

Dat NGUYEN, Marcella Astrid, Anis Kacem et al.

ICCV 2025arXiv:2501.01184
#1826

Multi-modal Identity Extraction

Ryan Webster, Teddy Furon

ICCV 2025
#1827

E-NeMF: Event-based Neural Motion Field for Novel Space-time View Synthesis of Dynamic Scenes

Yan Liu, Zehao Chen, Haojie Yan et al.

ICCV 2025
#1828

CombatVLA: An Efficient Vision-Language-Action Model for Combat Tasks in 3D Action Role-Playing Games

Peng Chen, Pi Bu, Yingyao Wang et al.

ICCV 2025arXiv:2503.09527
#1829

MonSTeR: a Unified Model for Motion, Scene, Text Retrieval

Luca Collorone, Matteo Gioia, Massimiliano Pappa et al.

ICCV 2025arXiv:2510.03200
#1830

Blind Noisy Image Deblurring Using Residual Guidance Strategy

Heyan Liu, Jianing Sun, Jun Liu et al.

ICCV 2025
#1831

Drawing Developmental Trajectory from Cortical Surface Reconstruction

WENXUAN WU, ruowen qu, Zhongliang Liu et al.

ICCV 2025
#1832

Less is More: Improving Motion Diffusion Models with Sparse Keyframes

Jinseok Bae, Inwoo Hwang, Young-Yoon Lee et al.

ICCV 2025arXiv:2503.13859
#1833

DGTalker: Disentangled Generative Latent Space Learning for Audio-Driven Gaussian Talking Heads

Xiaoxi Liang, Yanbo Fan, Qiya Yang et al.

ICCV 2025
#1834

VLABench: A Large-Scale Benchmark for Language-Conditioned Robotics Manipulation with Long-Horizon Reasoning Tasks

shiduo zhang, Zhe Xu, Peiju Liu et al.

ICCV 2025arXiv:2412.18194
#1835

TrackVerse: A Large-Scale Object-Centric Video Dataset for Image-Level Representation Learning

Yibing Wei, Samuel Church, Victor Suciu et al.

ICCV 2025
#1836

Causal-Entity Reflected Egocentric Traffic Accident Video Synthesis

Lei-lei Li, Jianwu Fang, Junbin Xiao et al.

ICCV 2025arXiv:2506.23263
#1837

Robust Test-Time Adaptation for Single Image Denoising Using Deep Gaussian Prior

Qing Ma, Pengwei Liang, Xiong Zhou et al.

ICCV 2025
#1838

Augmented Mass-Spring Model for Real-Time Dense Hair Simulation

Jorge Herrera, Yi Zhou, Xin Sun et al.

ICCV 2025arXiv:2412.17144
#1839

Punching Bag vs. Punching Person: Motion Transferability in Videos

Raiyaan Abdullah, Jared Claypoole, Michael Cogswell et al.

ICCV 2025arXiv:2508.00085
#1840

Laboring on less labors: RPCA Paradigm for Pan-sharpening

honghui xu, Chuangjie Fang, Yibin Wang et al.

ICCV 2025
#1841

Riemannian-Geometric Fingerprints of Generative Models

Hae Jin Song, Laurent Itti

ICCV 2025highlightarXiv:2506.22802
#1842

G-DexGrasp: Generalizable Dexterous Grasping Synthesis Via Part-Aware Prior Retrieval and Prior-Assisted Generation

Juntao Jian, Xiuping Liu, Zixuanchen Zixuanchen et al.

ICCV 2025arXiv:2503.19457
#1843

WarpHE4D: Dense 4D Head Map toward Full Head Reconstruction

Jongseob Yun, Yong-Hoon Kwon, Min-Gyu Park et al.

ICCV 2025
#1844

Continuous-Time Human Motion Field from Event Cameras

Ziyun Wang, Ruijun Zhang, Zi-Yan Liu et al.

ICCV 2025
#1845

ISP2HRNet: Learning to Reconstruct High Resolution Image from Irregularly Sampled Pixels via Hierarchical Gradient Learning

Yuanlin Wang, Ruiqin Xiong, Rui Zhao et al.

ICCV 2025highlight
#1846

LDIP: Long Distance Information Propagation for Video Super-Resolution

Michael Bernasconi, Abdelaziz Djelouah, Yang Zhang et al.

ICCV 2025
#1847

MBTI: Masked Blending Transformers with Implicit Positional Encoding for Frame-rate Agnostic Motion Estimation

Jungwoo Huh, Yeseung Park, Seongjean Kim et al.

ICCV 2025
#1848

Event-Driven Storytelling with Multiple Lifelike Humans in a 3D Scene

Donggeun Lim, Jinseok Bae, Inwoo Hwang et al.

ICCV 2025arXiv:2507.19232
#1849

Fast Image Super-Resolution via Consistency Rectified Flow

Jiaqi Xu, Wenbo Li, Haoze Sun et al.

ICCV 2025
#1850

GENMO: A GENeralist Model for Human MOtion

Jiefeng Li, Jinkun Cao, Haotian Zhang et al.

ICCV 2025highlightarXiv:2505.01425
#1851

Event-guided HDR Reconstruction with Diffusion Priors

Yixin Yang, jiawei zhang, Yang Zhang et al.

ICCV 2025
#1852

Learning Efficient and Generalizable Human Representation with Human Gaussian Model

Yifan Liu, Shengjun Zhang, Chensheng Dai et al.

ICCV 2025arXiv:2507.18758
#1853

AffordDexGrasp: Open-set Language-guided Dexterous Grasp with Generalizable-Instructive Affordance

Yilin Wei, Mu Lin, Yuhao Lin et al.

ICCV 2025arXiv:2503.07360
#1854

Robust Adverse Weather Removal via Spectral-based Spatial Grouping

Yuhwan Jeong, Yunseo Yang, Youngho Yoon et al.

ICCV 2025arXiv:2507.22498
#1855

Switch-a-View: View Selection Learned from Unlabeled In-the-wild Videos

Sagnik Majumder, Tushar Nagarajan, Ziad Al-Halah et al.

ICCV 2025arXiv:2412.18386
#1856

Hipandas: Hyperspectral Image Joint Denoising and Super-Resolution by Image Fusion with the Panchromatic Image

Shuang Xu, Zixiang Zhao, Haowen Bai et al.

ICCV 2025arXiv:2412.04201
#1857

Avat3r: Large Animatable Gaussian Reconstruction Model for High-fidelity 3D Head Avatars

Tobias Kirschstein, Javier Romero, Artem Sevastopolsky et al.

ICCV 2025arXiv:2502.20220
#1858

TimeBooth: Disentangled Facial Invariant Representation for Diverse and Personalized Face Aging

Zepeng Su, zhulin liu, Zongyan Zhang et al.

ICCV 2025
#1859

GDKVM: Echocardiography Video Segmentation via Spatiotemporal Key-Value Memory with Gated Delta Rule

Rui Wang, Yimu Sun, Jingxing Guo et al.

ICCV 2025arXiv:2512.10252
#1860

Scaling Action Detection: AdaTAD++ with Transformer-Enhanced Temporal-Spatial Adaptation

Tanay Agrawal, Abid Ali, Antitza Dantcheva et al.

ICCV 2025
#1861

VideoSetDiff: Identifying and Reasoning Similarities and Differences in Similar Videos

YUE QIU, Yanjun Sun, Takuma Yagi et al.

ICCV 2025
#1862

NAPPure: Adversarial Purification for Robust Image Classification under Non-Additive Perturbations

Junjie Nan, Jianing Li, Wei Chen et al.

ICCV 2025arXiv:2510.14025
#1863

HADES: Human Avatar with Dynamic Explicit Hair Strands

Zhanfeng Liao, Hanzhang Tu, Cheng Peng et al.

ICCV 2025
#1864

FlowDPS : Flow-Driven Posterior Sampling for Inverse Problems

Jeongsol Kim, Bryan Sangwoo Kim, Jong Ye

ICCV 2025
#1865

ZFusion: Efficient Deep Compositional Zero-shot Learning for Blind Image Super-Resolution with Generative Diffusion Prior

Alireza Esmaeilzehi, Hossein Zaredar, Yapeng Tian et al.

ICCV 2025
#1866

DreamRelation: Relation-Centric Video Customization

Yujie Wei, Shiwei Zhang, Hangjie Yuan et al.

ICCV 2025arXiv:2503.07602
#1867

Learning A Unified Template for Gait Recognition

Panjian Huang, Saihui Hou, Junzhou Huang et al.

ICCV 2025
#1868

GestureHYDRA: Semantic Co-speech Gesture Synthesis via Hybrid Modality Diffusion Transformer and Cascaded-Synchronized Retrieval-Augmented Generation

Quanwei Yang, Luying Huang, Kaisiyuan Wang et al.

ICCV 2025arXiv:2507.22731
#1869

FoundIR: Unleashing Million-scale Training Data to Advance Foundation Models for Image Restoration

Hao Li, Xiang Chen, Jiangxin Dong et al.

ICCV 2025arXiv:2412.01427
#1870

Highlight What You Want: Weakly-Supervised Instance-Level Controllable Infrared-Visible Image Fusion

Zeyu Wang, Jizheng Zhang, Haiyu Song et al.

ICCV 2025
#1871

FaceLift: Learning Generalizable Single Image 3D Face Reconstruction from Synthetic Heads

Weijie Lyu, Yi Zhou, Ming-Hsuan Yang et al.

ICCV 2025arXiv:2412.17812
#1872

Visual Chronicles: Using Multimodal LLMs to Analyze Massive Collections of Images

Boyang Deng, Kyle Genova, Songyou Peng et al.

ICCV 2025highlightarXiv:2504.08727
#1873

Latent-Reframe: Enabling Camera Control for Video Diffusion Models without Training

Zhenghong Zhou, Jie An, Jiebo Luo

ICCV 2025arXiv:2412.06029
#1874

Neuromanifold-Regularized KANs for Shape-fair Feature Representations

Mazlum Arslan, Weihong Guo, Shuo Li

ICCV 2025
#1875

Image Intrinsic Scale Assessment: Bridging the Gap Between Quality and Resolution

Vlad Hosu, Lorenzo Agnolucci, Daisuke Iso et al.

ICCV 2025arXiv:2502.06476
#1876

Less Static, More Private: Towards Transferable Privacy-Preserving Action Recognition by Generative Decoupled Learning

Zhi-Wei Xia, Kun-Yu Lin, Yuan-Ming Li et al.

ICCV 2025
#1877

Blind2Sound: Self-Supervised Image Denoising without Residual Noise

Jiazheng Liu, Zejin Wang, Bohao Chen et al.

ICCV 2025arXiv:2303.05183
#1878

IMoRe: Implicit Program-Guided Reasoning for Human Motion Q&A

Chen Li, Chinthani Sugandhika, Ee Yeo Keat et al.

ICCV 2025arXiv:2508.01984
#1879

AdaDCP: Learning an Adapter with Discrete Cosine Prior for Clear-to-Adverse Domain Generalization

Qi Bi, Yixian Shen, Jingjun Yi et al.

ICCV 2025
#1880

MorphoGen: Efficient Unconditional Generation of Long-Range Projection Neuronal Morphology via a Global-to-Local Framework

Tianfang Zhu, Hongyang Zhou, Anan LI

ICCV 2025
#1881

GaussianSpeech: Audio-Driven Personalized 3D Gaussian Avatars

Shivangi Aneja, Artem Sevastopolsky, Tobias Kirschstein et al.

ICCV 2025
#1882

A Quality-Guided Mixture of Score-Fusion Experts Framework for Human Recognition

Jie Zhu, Yiyang Su, Minchul Kim et al.

ICCV 2025arXiv:2508.00053
#1883

Capturing head avatar with hand contacts from a monocular video

Haonan He, Yufeng Zheng, Jie Song

ICCV 2025arXiv:2510.17181
#1884

MAVFlow: Preserving Paralinguistic Elements with Conditional Flow Matching for Zero-Shot AV2AV Multilingual Translation

Sungwoo Cho, Jeongsoo Choi, Sungnyun Kim et al.

ICCV 2025arXiv:2503.11026
#1885

Privacy-centric Deep Motion Retargeting for Anonymization of Skeleton-Based Motion Visualization

Thomas Carr, Depeng Xu, Shuhan Yuan et al.

ICCV 2025
#1886

UniPhys: Unified Planner and Controller with Diffusion for Flexible Physics-Based Character Control

Yan Wu, Korrawe Karunratanakul, Zhengyi Luo et al.

ICCV 2025highlightarXiv:2504.12540
#1887

UniRes: Universal Image Restoration for Complex Degradations

Mo Zhou, Keren Ye, Mauricio Delbracio et al.

ICCV 2025arXiv:2506.05599
#1888

SV4D 2.0: Enhancing Spatio-Temporal Consistency in Multi-View Video Diffusion for High-Quality 4D Generation

Chun-Han Yao, Yiming Xie, Vikram Voleti et al.

ICCV 2025arXiv:2503.16396
#1889

Light-A-Video: Training-free Video Relighting via Progressive Light Fusion

Yujie Zhou, Jiazi Bu, Pengyang Ling et al.

ICCV 2025arXiv:2502.08590
#1890

Go to Zero: Towards Zero-shot Motion Generation with Million-scale Data

Ke Fan, Shunlin Lu, Minyue Dai et al.

ICCV 2025highlightarXiv:2507.07095
#1891

DynamicFace: High-Quality and Consistent Face Swapping for Image and Video using Composable 3D Facial Priors

Runqi Wang, Yang Chen, Sijie Xu et al.

ICCV 2025arXiv:2501.08553
#1892

DisenQ: Disentangling Q-Former for Activity-Biometrics

Shehreen Azad, Yogesh Rawat

ICCV 2025highlightarXiv:2507.07262
#1893

Controllable Weather Synthesis and Removal with Video Diffusion Models

Chih-Hao Lin, Zian Wang, Ruofan Liang et al.

ICCV 2025arXiv:2505.00704
#1894

T2Bs: Text-to-Character Blendshapes via Video Generation

Jiahao Luo, Chaoyang Wang, Michael Vasilkovsky et al.

ICCV 2025arXiv:2509.10678
#1895

Unfolding-Associative Encoder-Decoder Network with Progressive Alignment for Pansharpening

Shijie Fang, Hongping Gan

ICCV 2025
#1896

MOERL: When Mixture-of-Experts Meet Reinforcement Learning for Adverse Weather Image Restoration

Tao Wang, Peiwen Xia, Bo Li et al.

ICCV 2025
#1897

LOMM: Latest Object Memory Management for Temporally Consistent Video Instance Segmentation

Seunghun Lee, Jiwan Seo, Minwoo Choi et al.

ICCV 2025
#1898

DuoCLR: Dual-Surrogate Contrastive Learning for Skeleton-based Human Action Segmentation

Haitao Tian

ICCV 2025arXiv:2509.05543
#1899

EVDM: Event-based Real-world Video Deblurring with Mamba

Zhijing Sun, Senyan Xu, Kean Liu et al.

ICCV 2025
#1900

Q-Norm: Robust Representation Learning via Quality-Adaptive Normalization

Lanning Zhang, Ying Zhou, Fei Gao et al.

ICCV 2025
#1901

Proxy-Bridged Game Transformer for Interactive Extreme Motion Prediction

Yanwen Fang, Wenqi Jia, Xu Cao et al.

ICCV 2025
#1902

π-AVAS: Can Physics-Integrated Audio-Visual Modeling Boost Neural Acoustic Synthesis?

Susan Liang, Chao Huang, Yolo Yunlong Tang et al.

ICCV 2025
#1903

SemGes: Semantics-aware Co-Speech Gesture Generation using Semantic Coherence and Relevance Learning

Lanmiao Liu, Esam Ghaleb, asli ozyurek et al.

ICCV 2025arXiv:2507.19359
#1904

Metric Convolutions: A Unifying Theory to Adaptive Image Convolutions

Thomas Dagès, Michael Lindenbaum, Alfred Bruckstein

ICCV 2025arXiv:2406.05400
#1905

RobAVA: A Large-scale Dataset and Baseline Towards Video based Robotic Arm Action Understanding

Baoli Sun, Ning Wang, Xinzhu Ma et al.

ICCV 2025
#1906

IDFace: Face Template Protection for Efficient and Secure Identification

Sunpill Kim, Seunghun Paik, Chanwoo Hwang et al.

ICCV 2025arXiv:2507.12050
#1907

I2VControl: Disentangled and Unified Video Motion Synthesis Control

Wanquan Feng, Tianhao Qi, Jiawei Liu et al.

ICCV 2025arXiv:2411.17765
#1908

Generic Event Boundary Detection via Denoising Diffusion

Jaejun Hwang, Dayoung Gong, Manjin Kim et al.

ICCV 2025arXiv:2508.12084
#1909

Not All Degradations Are Equal: A Targeted Feature Denoising Framework for Generalizable Image Super-Resolution

hongjun wang, Jiyuan Chen, Zhengwei Yin et al.

ICCV 2025arXiv:2509.14841
#1910

Fine-Grained 3D Gaussian Head Avatars Modeling from Static Captures via Joint Reconstruction and Registration

Yuan Sun, Xuan Wang, Cong Wang et al.

ICCV 2025
#1911

Attention to Trajectory: Trajectory-Aware Open-Vocabulary Tracking

Yunhao Li, Yifan Jiao, Dan Meng et al.

ICCV 2025arXiv:2503.08145
#1912

MistSense: Versatile Online Detection of Procedural and Execution Mistakes

Constantin Patsch, Yuankai Wu, Marsil Zakour et al.

ICCV 2025
#1913

SEREP: Semantic Facial Expression Representation for Robust In-the-Wild Capture and Retargeting

Arthur Josi, Luiz Gustavo Hafemann, Abdallah Dib et al.

ICCV 2025arXiv:2412.14371
#1914

LUT-Fuse: Towards Extremely Fast Infrared and Visible Image Fusion via Distillation to Learnable Look-Up Tables

Xunpeng Yi, yibing zhang, Xinyu Xiang et al.

ICCV 2025arXiv:2509.00346
#1915

Morph: A Motion-free Physics Optimization Framework for Human Motion Generation

Zhuo Li, Mingshuang Luo, RuiBing Hou et al.

ICCV 2025arXiv:2411.14951
#1916

MixANT: Observation-dependent Memory Propagation for Stochastic Dense Action Anticipation

Syed Talal Wasim, Hamid Suleman, Olga Zatsarynna et al.

ICCV 2025arXiv:2509.11394
#1917

DeSPITE: Exploring Contrastive Deep Skeleton-Pointcloud-IMU-Text Embeddings for Advanced Point Cloud Human Activity Understanding

Thomas Kreutz, Max Mühlhäuser, Alejandro Sanchez Guinea

ICCV 2025arXiv:2506.13897
#1918

Efficient Concertormer for Image Deblurring and Beyond

Pin-Hung Kuo, Jinshan Pan, Shao-Yi Chien et al.

ICCV 2025arXiv:2404.06135
#1919

FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait

Taekyung Ki, Dongchan Min, Gyeongsu Chae

ICCV 2025arXiv:2412.01064
#1920

2HandedAfforder: Learning Precise Actionable Bimanual Affordances from Human Videos

Marvin Heidinger, Snehal Jauhri, Vignesh Prasad et al.

ICCV 2025arXiv:2503.09320
#1921

TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation

Wenhao Wang, Yi Yang

ICCV 2025arXiv:2411.04709
#1922

SCFlow: Implicitly Learning Style and Content Disentanglement with Flow Models

Pingchuan Ma, Xiaopei Yang, Ming Gui et al.

ICCV 2025arXiv:2508.03402
#1923

Penalizing Boundary Activation for Object Completeness in Diffusion Models

Haoyang Xu, Tianhao Zhao, Sibei Yang et al.

ICCV 2025arXiv:2509.16968
#1924

RayZer: A Self-supervised Large View Synthesis Model

Hanwen Jiang, Hao Tan, Peng Wang et al.

ICCV 2025arXiv:2505.00702
#1925

MatchDiffusion: Training-free Generation of Match-Cuts

Alejandro Pardo, Fabio Pizzati, Tong Zhang et al.

ICCV 2025arXiv:2411.18677
#1926

Dual-Expert Consistency Model for Efficient and High-Quality Video Generation

Zhengyao Lyu, Chenyang Si, Tianlin Pan et al.

ICCV 2025
#1927

Straighten Viscous Rectified Flow via Noise Optimization

Jimin Dai, Jiexi Yan, Jian Yang et al.

ICCV 2025highlightarXiv:2507.10218
#1928

Scalable Dual Fingerprinting for Hierarchical Attribution of Text-to-Image Models

Jianwei Fei, Yunshu Dai, Peipeng Yu et al.

ICCV 2025highlight
#1929

QuantCache: Adaptive Importance-Guided Quantization with Hierarchical Latent and Layer Caching for Video Generation

Junyi Wu, Zhiteng Li, Zheng Hui et al.

ICCV 2025arXiv:2503.06545
#1930

CRAM: Large Scale Video Continual Learning with Bootstrapped Compression

Shivani Mall, Joao F. Henriques

ICCV 2025arXiv:2508.05001
#1931

Tree-NeRV: Efficient Non-Uniform Sampling for Neural Video Representation via Tree-Structured Feature Grids

Jiancheng Zhao, Yifan Zhan, Qingtian Zhu et al.

ICCV 2025
#1932

MaTe: Images Are All You Need for Material Transfer via Diffusion Transformer

Nisha Huang, Henglin Liu, Yizhou Lin et al.

ICCV 2025
#1933

VEGGIE: Instructional Editing and Reasoning Video Concepts with Grounded Generation

Shoubin Yu, Difan Liu, Ziqiao Ma et al.

ICCV 2025arXiv:2503.14350
#1934

CycleVAR: Repurposing Autoregressive Model for Unsupervised One-Step Image Translation

Yi Liu, Shengqian Li, Zuzeng Lin et al.

ICCV 2025arXiv:2506.23347
#1935

Adaptive Caching for Faster Video Generation with Diffusion Transformers

Kumara Kahatapitiya, Haozhe Liu, Sen He et al.

ICCV 2025arXiv:2411.02397
#1936

Edicho: Consistent Image Editing in the Wild

Qingyan Bai, Hao Ouyang, Yinghao Xu et al.

ICCV 2025arXiv:2412.21079
#1937

LUSD: Localized Update Score Distillation for Text-Guided Image Editing

Worameth Chinchuthakun, Tossaporn Saengja, Nontawat Tritrong et al.

ICCV 2025arXiv:2503.11054
#1938

FlowChef: Steering of Rectified Flow Models for Controlled Generations

Maitreya Patel, Song Wen, Dimitris Metaxas et al.

ICCV 2025
#1939

Translation of Text Embedding via Delta Vector to Suppress Strongly Entangled Content in Text-to-Image Diffusion Models

Eunseo Koh, SeungHoo Hong, Tae-Young Kim et al.

ICCV 2025arXiv:2508.10407
#1940

SynTag: Enhancing the Geometric Robustness of Inversion-based Generative Image Watermarking

Han Fang, Kejiang Chen, Zehua Ma et al.

ICCV 2025
#1941

IQA-Adapter: Exploring Knowledge Transfer from Image Quality Assessment to Diffusion-based Generative Models

Khaled Abud, Sergey Lavrushkin, Alexey Kirillov et al.

ICCV 2025highlightarXiv:2412.01794
#1942

Dual Recursive Feedback on Generation and Appearance Latents for Pose-Robust Text-to-Image Diffusion

Jiwon Kim, Pureum Kim, SeonHwa Kim et al.

ICCV 2025arXiv:2508.09575
#1943

Anti-Tamper Protection for Unauthorized Individual Image Generation

Zelin Li, Ruohan Zong, Yifan Liu et al.

ICCV 2025arXiv:2508.06325
#1944

Continual Personalization for Diffusion Models

Yu-Chien Liao, Jr-Jen Chen, Chi-Pin Huang et al.

ICCV 2025arXiv:2510.02296
#1945

WikiAutoGen: Towards Multi-Modal Wikipedia-Style Article Generation

Zhongyu Yang, Jun Chen, Dannong Xu et al.

ICCV 2025arXiv:2503.19065
#1946

QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning

Haoxuan Wang, Yuzhang Shang, Zhihang Yuan et al.

ICCV 2025arXiv:2402.03666
#1947

Split-and-Combine: Enhancing Style Augmentation for Single Domain Generalization

Zhen Zhang, Zhen Zhang, Qianlong Dang et al.

ICCV 2025
#1948

Zero-Shot Depth Aware Image Editing with Diffusion Models

Rishubh Parihar, Sachidanand VS, Venkatesh Babu Radhakrishnan

ICCV 2025
#1949

Global and Local Entailment Learning for Natural World Imagery

Srikumar Sastry, Aayush Dhakal, Eric Xing et al.

ICCV 2025arXiv:2506.21476
#1950

TRKT: Weakly Supervised Dynamic Scene Graph Generation with Temporal-enhanced Relation-aware Knowledge Transferring

Zhu Xu, Ting Lei, Zhimin Li et al.

ICCV 2025arXiv:2508.04943
#1951

Pose-Star: Anatomy-Aware Editing for Open-World Fashion Images

Yuran Dong, Mang Ye

ICCV 2025arXiv:2507.03402
#1952

Who Controls the Authorization? Invertible Networks for Copyright Protection in Text-to-Image Synthesis

Baoyue Hu, Yang Wei, Junhao Xiao et al.

ICCV 2025
#1953

Magic Insert: Style-Aware Drag-and-Drop

Nataniel Ruiz, Yuanzhen Li, Neal Wadhwa et al.

ICCV 2025highlightarXiv:2407.02489
#1954

FontAnimate: High Quality Few-shot Font Generation via Animating Font Transfer Process

Bin Fu, Zixuan Wang, Kainan Yan et al.

ICCV 2025
#1955

CharaConsist: Fine-Grained Consistent Character Generation

Mengyu Wang, Henghui Ding, Jianing Peng et al.

ICCV 2025arXiv:2507.11533
#1956

LiT: Delving into a Simple Linear Diffusion Transformer for Image Generation

Jiahao Wang, Ning Kang, Lewei Yao et al.

ICCV 2025arXiv:2501.12976
#1957

TextMaster: A Unified Framework for Realistic Text Editing via Glyph-Style Dual-Control

Zhenyu Yan, Jian Wang, Aoqiang Wang et al.

ICCV 2025arXiv:2410.09879
#1958

Beyond Perspective: Neural 360-Degree Video Compression

Andy Regensky, Marc Windsheimer, Fabian Brand et al.

ICCV 2025
#1959

MCID: Multi-aspect Copyright Infringement Detection for Generated Images

Chuanwei Huang, Zexi Jia, Hongyan Fei et al.

ICCV 2025
#1960

Text2Outfit: Controllable Outfit Generation with Multimodal Language Models

Yuanhao Zhai, Yen-Liang Lin, Minxu Peng et al.

ICCV 2025
#1961

One-Step Specular Highlight Removal with Adapted Diffusion Models

Mahir Atmis, LEVENT KARACAN, Mehmet SARIGÜL

ICCV 2025
#1962

DiGA3D: Coarse-to-Fine Diffusional Propagation of Geometry and Appearance for Versatile 3D Inpainting

Jingyi Pan, Dan Xu, Qiong Luo

ICCV 2025arXiv:2507.00429
#1963

DIMCIM: A Quantitative Evaluation Framework for Default-mode Diversity and Generalization in Text-to-Image Generative Models

Revant Teotia, Candace Ross, Karen Ullrich et al.

ICCV 2025arXiv:2506.05108
#1964

From Linearity to Non-Linearity: How Masked Autoencoders Capture Spatial Correlations

Anthony Bisulco, Rahul Ramesh, Randall Balestriero et al.

ICCV 2025arXiv:2508.15404
#1965

Reusing Computation in Text-to-Image Diffusion for Efficient Generation of Image Sets

Dale Decatur, Thibault Groueix, Wang Yifan et al.

ICCV 2025arXiv:2508.21032
#1966

Cross-Granularity Online Optimization with Masked Compensated Information for Learned Image Compression

Haowei Kuang, Wenhan Yang, Zongming Guo et al.

ICCV 2025
#1967

Stroke2Sketch: Harnessing Stroke Attributes for Training-Free Sketch Generation

Rui Yang, Huining Li, Yiyi Long et al.

ICCV 2025arXiv:2510.16319
#1968

TeEFusion: Blending Text Embeddings to Distill Classifier-Free Guidance

Minghao Fu, Guo-Hua Wang, Xiaohao Chen et al.

ICCV 2025arXiv:2507.18192
#1969

FiVE-Bench: A Fine-grained Video Editing Benchmark for Evaluating Emerging Diffusion and Rectified Flow Models

Minghan LI, Chenxi Xie, Yichen Wu et al.

ICCV 2025
#1970

Co-Painter: Fine-Grained Controllable Image Stylization via Implicit Decoupling and Adaptive Injection

Bowen Fu, Wei Wei, Jiaqi Tang et al.

ICCV 2025
#1971

PLA: Prompt Learning Attack against Text-to-Image Generative Models

XINQI LYU, Yihao LIU, Yanjie Li et al.

ICCV 2025arXiv:2508.03696
#1972

FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion

Haonan Qiu, Shiwei Zhang, Yujie Wei et al.

ICCV 2025arXiv:2412.09626
#1973

Holistic Tokenizer for Autoregressive Image Generation

Anlin Zheng, Haochen Wang, Yucheng Zhao et al.

ICCV 2025arXiv:2507.02358
#1974

Toward Better Out-painting: Improving the Image Composition with Initialization Policy Model

Xuan Han, Yihao Zhao, Yanhao Ge et al.

ICCV 2025
#1975

Versatile Transition Generation with Image-to-Video Diffusion

Zuhao Yang, Jiahui Zhang, Yingchen Yu et al.

ICCV 2025arXiv:2508.01698
#1976

MetaMorph: Multimodal Understanding and Generation via Instruction Tuning

Shengbang Tong, David Fan, Jiachen Zhu et al.

ICCV 2025arXiv:2412.14164
#1977

SafetyDPO: Scalable Safety Alignment for Text-to-Image Generation

Runtao Liu, I Chen, Jindong Gu et al.

ICCV 2025
#1978

DiffIP: Representation Fingerprints for Robust IP Protection of Diffusion Models

Zhuoling Li, Haoxuan Qu, Jason Kuen et al.

ICCV 2025
#1979

FairHuman: Boosting Hand and Face Quality in Human Image Generation with Minimum Potential Delay Fairness in Diffusion Models

Yuxuan Wang, Tianwei Cao, Huayu Zhang et al.

ICCV 2025arXiv:2507.02714
#1980

Processing and acquisition traces in visual encoders: What does CLIP know about your camera?

Ryan Ramos, Vladan Stojnić, Giorgos Kordopatis-Zilos et al.

ICCV 2025highlightarXiv:2508.10637
#1981

AM-Adapter: Appearance Matching Adapter for Exemplar-based Semantic Image Synthesis in-the-Wild

Siyoon Jin, Jisu Nam, Jiyoung Kim et al.

ICCV 2025
#1982

Diffusion Epistemic Uncertainty with Asymmetric Learning for Diffusion-Generated Image Detection

Yingsong Huang, Hui Guo, Jing Huang et al.

ICCV 2025arXiv:2601.14625
#1983

Draw Your Mind: Personalized Generation via Condition-Level Modeling in Text-to-Image Diffusion Models

Hyungjin Kim, Seokho Ahn, Young-Duk Seo

ICCV 2025arXiv:2508.03481
#1984

Calibrating MLLM-as-a-judge via Multimodal Bayesian Prompt Ensembles

Eric Slyman, Mehrab Tanjim, Kushal Kafle et al.

ICCV 2025arXiv:2509.08777
#1985

V.I.P. : Iterative Online Preference Distillation for Efficient Video Diffusion Models

Jisoo Kim, Wooseok Seo, Junwan Kim et al.

ICCV 2025arXiv:2508.03254
#1986

LOTA: Bit-Planes Guided AI-Generated Image Detection

Renxi Cheng, Hongsong Wang, Yang Zhang et al.

ICCV 2025arXiv:2510.14230
#1987

X-Prompt: Generalizable Auto-Regressive Visual Learning with In-Context Prompting

Zeyi Sun, Ziyang Chu, Pan Zhang et al.

ICCV 2025
#1988

AnyI2V: Animating Any Conditional Image with Motion Control

Ziye Li, Xincheng Shuai, Hao Luo et al.

ICCV 2025arXiv:2507.02857
#1989

Streamlining Image Editing with Layered Diffusion Brushes

Peyman Gholami, Robert Xiao

ICCV 2025arXiv:2405.00313
#1990

EEdit : Rethinking the Spatial and Temporal Redundancy for Efficient Image Editing

Zexuan Yan, Yue Ma, Chang Zou et al.

ICCV 2025arXiv:2503.10270
#1991

RAGDiffusion: Faithful Cloth Generation via External Knowledge Assimilation

Yuhan Li, Xianfeng Tan, Wenxiang Shang et al.

ICCV 2025highlightarXiv:2411.19528
#1992

Instruction-based Image Editing with Planning, Reasoning, and Generation

Liya Ji, Chenyang Qi, Qifeng Chen

ICCV 2025
#1993

HDR Image Generation via Gain Map Decomposed Diffusion

Yuanshen Guan, Ruikang Xu, Yinuo Liao et al.

ICCV 2025
#1994

ESSENTIAL: Episodic and Semantic Memory Integration for Video Class-Incremental Learning

Jongseo Lee, Kyungho Bae, Kyle Min et al.

ICCV 2025highlightarXiv:2508.10896
#1995

Accelerating Diffusion Transformer via Gradient-Optimized Cache

Junxiang Qiu, Lin Liu, Shuo Wang et al.

ICCV 2025arXiv:2503.05156
#1996

The Silent Assistant: NoiseQuery as Implicit Guidance for Goal-Driven Image Generation

Ruoyu Wang, Huayang Huang, Ye Zhu et al.

ICCV 2025highlightarXiv:2412.05101
#1997

Progressive Growing of Video Tokenizers for Temporally Compact Latent Spaces

Aniruddha Mahapatra, Long Mai, David Bourgin et al.

ICCV 2025arXiv:2501.05442
#1998

ArtEditor: Learning Customized Instructional Image Editor from Few-Shot Examples

Shijie Huang, Yiren Song, Yuxuan Zhang et al.

ICCV 2025
#1999

MC-Bench: A Benchmark for Multi-Context Visual Grounding in the Era of MLLMs

Yunqiu Xu, Linchao Zhu, Yi Yang

ICCV 2025arXiv:2410.12332
#2000

Disrupting Model Merging: A Parameter-Level Defense Without Sacrificing Accuracy

JUNHAO WEI, YU ZHE, Jun Sakuma

ICCV 2025arXiv:2503.07661