Most Cited ICCV "tracking-by-detection" Papers

2,701 papers found • Page 10 of 14

#1801

Bridging the Sky and Ground: Towards View-Invariant Feature Learning for Aerial-Ground Person Re-Identification

Wajahat Khalid, Bin Liu, Xulin Li et al.

ICCV 2025

#1802

PASD: A Pixel-Adaptive Swarm Dynamics Approach for Unsupervised Low-Light Image Enhancement

Shuai Jin, Yuhua Qian, Feijiang Li et al.

ICCV 2025

#1803

Proactive Scene Decomposition and Reconstruction

Baicheng Li, Zike Yan, Dong Wu et al.

ICCV 2025 • arXiv:2510.16272

#1804

Unified Category-Level Object Detection and Pose Estimation from RGB Images using 3D Prototypes

Tom Fischer, Xiaojie Zhang, Eddy Ilg

ICCV 2025 • arXiv:2508.02157

#1805

A Hyperdimensional One Place Signature to Represent Them All: Stackable Descriptors For Visual Place Recognition

Connor Malone, Somayeh Hussaini, Tobias Fischer et al.

ICCV 2025 • arXiv:2412.06153

#1806

WalkVLM: Aid Visually Impaired People Walking by Vision Language Model

Zhiqiang Yuan, Ting Zhang, Yeshuang Zhu et al.

ICCV 2025

#1807

Error Recognition in Procedural Videos using Generalized Task Graph

Shih-Po Lee, Ehsan Elhamifar

ICCV 2025

#1808

MotionStreamer: Streaming Motion Generation via Diffusion-based Autoregressive Model in Causal Latent Space

Lixing Xiao, Shunlin Lu, Huaijin Pi et al.

ICCV 2025 • arXiv:2503.15451

#1809

Mixture of Experts Guided by Gaussian Splatters Matters: A new Approach to Weakly-Supervised Video Anomaly Detection

Giacomo D'Amicantonio, Snehashis Majhi, Quan Kong et al.

ICCV 2025 • highlight • arXiv:2508.06318

#1810

What If: Understanding Motion Through Sparse Interactions

Stefan A. Baumann, Nick Stracke, Timy Phan et al.

ICCV 2025

#1811

RoboAnnotatorX: A Comprehensive and Universal Annotation Framework for Accurate Understanding of Long-horizon Robot Demonstration

Longxin Kou, Fei Ni, Jianye HAO et al.

ICCV 2025

#1812

FaceShield: Defending Facial Image against Deepfake Threats

Jaehwan Jeong, Sumin In, Sieun Kim et al.

ICCV 2025 • arXiv:2412.09921

#1813

Task-Oriented Human Grasp Synthesis via Context- and Task-Aware Diffusers

An Lun Liu, Yu-Wei Chao, Yi-Ting Chen

ICCV 2025 • arXiv:2507.11287

#1814

Ouroboros: Single-step Diffusion Models for Cycle-consistent Forward and Inverse Rendering

shanlin sun, Yifan Wang, Hanwen Zhang et al.

ICCV 2025 • arXiv:2508.14461

#1815

Beyond Label Semantics: Language-Guided Action Anatomy for Few-shot Action Recognition

Zefeng Qian, Xincheng Yao, Yifei Huang et al.

ICCV 2025 • arXiv:2507.16287

#1816

MamTiff-CAD: Multi-Scale Latent Diffusion with Mamba+ for Complex Parametric Sequence

Liyuan Deng, Yunpeng Bai, Yongkang Dai et al.

ICCV 2025 • arXiv:2511.17647

#1817

Local Scale Equivariance with Latent Deep Equilibrium Canonicalizer

Md Ashiqur Rahman, Chiao-An Yang, Michael N Cheng et al.

ICCV 2025 • arXiv:2508.14187

#1818

EfficientMT: Efficient Temporal Adaptation for Motion Transfer in Text-to-Video Diffusion Models

Yufei Cai, Hu Han, Yuxiang Wei et al.

ICCV 2025 • arXiv:2503.19369

#1819

InteractAvatar: Modeling Hand-Face Interaction in Photorealistic Avatars with Deformable Gaussians

Kefan Chen, Sergiu Oprea, Justin Theiss et al.

ICCV 2025 • arXiv:2504.07949

#1820

Im2Haircut: Single-view Strand-based Hair Reconstruction for Human Avatars

Vanessa Sklyarova, Egor Zakharov, Malte Prinzler et al.

ICCV 2025 • arXiv:2509.01469

#1821

TeRA: Rethinking Text-guided Realistic 3D Avatar Generation

Yanwen Wang, Yiyu Zhuang, Jiawei Zhang et al.

ICCV 2025 • arXiv:2509.02466

#1822

Open-World Skill Discovery from Unsegmented Demonstration Videos

Jingwen Deng, Zihao Wang, Shaofei Cai et al.

ICCV 2025

#1823

Deep Adaptive Unfolded Network via Spatial Morphology Stripping and Spectral Filtration for Pan-sharpening

Hebaixu Wang, Jiayi Ma

ICCV 2025

#1824

Reference-based Super-Resolution via Image-based Retrieval-Augmented Generation Diffusion

Byeonghun Lee, Hyunmin Cho, Honggyu Choi et al.

ICCV 2025

#1825

Vulnerability-Aware Spatio-Temporal Learning for Generalizable Deepfake Video Detection

Dat NGUYEN, Marcella Astrid, Anis Kacem et al.

ICCV 2025 • arXiv:2501.01184

#1826

Multi-modal Identity Extraction

Ryan Webster, Teddy Furon

ICCV 2025

#1827

E-NeMF: Event-based Neural Motion Field for Novel Space-time View Synthesis of Dynamic Scenes

Yan Liu, Zehao Chen, Haojie Yan et al.

ICCV 2025

#1828

CombatVLA: An Efficient Vision-Language-Action Model for Combat Tasks in 3D Action Role-Playing Games

Peng Chen, Pi Bu, Yingyao Wang et al.

ICCV 2025 • arXiv:2503.09527

#1829

MonSTeR: a Unified Model for Motion, Scene, Text Retrieval

Luca Collorone, Matteo Gioia, Massimiliano Pappa et al.

ICCV 2025 • arXiv:2510.03200

#1830

Blind Noisy Image Deblurring Using Residual Guidance Strategy

Heyan Liu, Jianing Sun, Jun Liu et al.

ICCV 2025

#1831

Drawing Developmental Trajectory from Cortical Surface Reconstruction

WENXUAN WU, ruowen qu, Zhongliang Liu et al.

ICCV 2025

#1832

Less is More: Improving Motion Diffusion Models with Sparse Keyframes

Jinseok Bae, Inwoo Hwang, Young-Yoon Lee et al.

ICCV 2025 • arXiv:2503.13859

#1833

DGTalker: Disentangled Generative Latent Space Learning for Audio-Driven Gaussian Talking Heads

Xiaoxi Liang, Yanbo Fan, Qiya Yang et al.

ICCV 2025

#1834

VLABench: A Large-Scale Benchmark for Language-Conditioned Robotics Manipulation with Long-Horizon Reasoning Tasks

shiduo zhang, Zhe Xu, Peiju Liu et al.

ICCV 2025 • arXiv:2412.18194

#1835

TrackVerse: A Large-Scale Object-Centric Video Dataset for Image-Level Representation Learning

Yibing Wei, Samuel Church, Victor Suciu et al.

ICCV 2025

#1836

Causal-Entity Reflected Egocentric Traffic Accident Video Synthesis

Lei-lei Li, Jianwu Fang, Junbin Xiao et al.

ICCV 2025 • arXiv:2506.23263

#1837

Robust Test-Time Adaptation for Single Image Denoising Using Deep Gaussian Prior

Qing Ma, Pengwei Liang, Xiong Zhou et al.

ICCV 2025

#1838

Augmented Mass-Spring Model for Real-Time Dense Hair Simulation

Jorge Herrera, Yi Zhou, Xin Sun et al.

ICCV 2025 • arXiv:2412.17144

#1839

Punching Bag vs. Punching Person: Motion Transferability in Videos

Raiyaan Abdullah, Jared Claypoole, Michael Cogswell et al.

ICCV 2025 • arXiv:2508.00085

#1840

Laboring on less labors: RPCA Paradigm for Pan-sharpening

honghui xu, Chuangjie Fang, Yibin Wang et al.

ICCV 2025

#1841

Riemannian-Geometric Fingerprints of Generative Models

Hae Jin Song, Laurent Itti

ICCV 2025 • highlight • arXiv:2506.22802

#1842

G-DexGrasp: Generalizable Dexterous Grasping Synthesis Via Part-Aware Prior Retrieval and Prior-Assisted Generation

Juntao Jian, Xiuping Liu, Zixuanchen Zixuanchen et al.

ICCV 2025 • arXiv:2503.19457

#1843

WarpHE4D: Dense 4D Head Map toward Full Head Reconstruction

Jongseob Yun, Yong-Hoon Kwon, Min-Gyu Park et al.

ICCV 2025

#1844

Continuous-Time Human Motion Field from Event Cameras

Ziyun Wang, Ruijun Zhang, Zi-Yan Liu et al.

ICCV 2025

#1845

ISP2HRNet: Learning to Reconstruct High Resolution Image from Irregularly Sampled Pixels via Hierarchical Gradient Learning

Yuanlin Wang, Ruiqin Xiong, Rui Zhao et al.

ICCV 2025 • highlight

#1846

LDIP: Long Distance Information Propagation for Video Super-Resolution

Michael Bernasconi, Abdelaziz Djelouah, Yang Zhang et al.

ICCV 2025

#1847

MBTI: Masked Blending Transformers with Implicit Positional Encoding for Frame-rate Agnostic Motion Estimation

Jungwoo Huh, Yeseung Park, Seongjean Kim et al.

ICCV 2025

#1848

Event-Driven Storytelling with Multiple Lifelike Humans in a 3D Scene

Donggeun Lim, Jinseok Bae, Inwoo Hwang et al.

ICCV 2025 • arXiv:2507.19232

#1849

Fast Image Super-Resolution via Consistency Rectified Flow

Jiaqi Xu, Wenbo Li, Haoze Sun et al.

ICCV 2025

#1850

GENMO: A GENeralist Model for Human MOtion

Jiefeng Li, Jinkun Cao, Haotian Zhang et al.

ICCV 2025 • highlight • arXiv:2505.01425

#1851

Event-guided HDR Reconstruction with Diffusion Priors

Yixin Yang, jiawei zhang, Yang Zhang et al.

ICCV 2025

#1852

Learning Efficient and Generalizable Human Representation with Human Gaussian Model

Yifan Liu, Shengjun Zhang, Chensheng Dai et al.

ICCV 2025 • arXiv:2507.18758

#1853

AffordDexGrasp: Open-set Language-guided Dexterous Grasp with Generalizable-Instructive Affordance

Yilin Wei, Mu Lin, Yuhao Lin et al.

ICCV 2025 • arXiv:2503.07360

#1854

Robust Adverse Weather Removal via Spectral-based Spatial Grouping

Yuhwan Jeong, Yunseo Yang, Youngho Yoon et al.

ICCV 2025 • arXiv:2507.22498

#1855

Switch-a-View: View Selection Learned from Unlabeled In-the-wild Videos

Sagnik Majumder, Tushar Nagarajan, Ziad Al-Halah et al.

ICCV 2025 • arXiv:2412.18386

#1856

Hipandas: Hyperspectral Image Joint Denoising and Super-Resolution by Image Fusion with the Panchromatic Image

Shuang Xu, Zixiang Zhao, Haowen Bai et al.

ICCV 2025 • arXiv:2412.04201

#1857

Avat3r: Large Animatable Gaussian Reconstruction Model for High-fidelity 3D Head Avatars

Tobias Kirschstein, Javier Romero, Artem Sevastopolsky et al.

ICCV 2025 • arXiv:2502.20220

#1858

TimeBooth: Disentangled Facial Invariant Representation for Diverse and Personalized Face Aging

Zepeng Su, zhulin liu, Zongyan Zhang et al.

ICCV 2025

#1859

GDKVM: Echocardiography Video Segmentation via Spatiotemporal Key-Value Memory with Gated Delta Rule

Rui Wang, Yimu Sun, Jingxing Guo et al.

ICCV 2025 • arXiv:2512.10252

#1860

Scaling Action Detection: AdaTAD++ with Transformer-Enhanced Temporal-Spatial Adaptation

Tanay Agrawal, Abid Ali, Antitza Dantcheva et al.

ICCV 2025

#1861

VideoSetDiff: Identifying and Reasoning Similarities and Differences in Similar Videos

YUE QIU, Yanjun Sun, Takuma Yagi et al.

ICCV 2025

#1862

NAPPure: Adversarial Purification for Robust Image Classification under Non-Additive Perturbations

Junjie Nan, Jianing Li, Wei Chen et al.

ICCV 2025 • arXiv:2510.14025

#1863

HADES: Human Avatar with Dynamic Explicit Hair Strands

Zhanfeng Liao, Hanzhang Tu, Cheng Peng et al.

ICCV 2025

#1864

FlowDPS: Flow-Driven Posterior Sampling for Inverse Problems

Jeongsol Kim, Bryan Sangwoo Kim, Jong Ye

ICCV 2025

#1865

ZFusion: Efficient Deep Compositional Zero-shot Learning for Blind Image Super-Resolution with Generative Diffusion Prior

Alireza Esmaeilzehi, Hossein Zaredar, Yapeng Tian et al.

ICCV 2025

#1866

DreamRelation: Relation-Centric Video Customization

Yujie Wei, Shiwei Zhang, Hangjie Yuan et al.

ICCV 2025 • arXiv:2503.07602

#1867

Learning A Unified Template for Gait Recognition

Panjian Huang, Saihui Hou, Junzhou Huang et al.

ICCV 2025

#1868

GestureHYDRA: Semantic Co-speech Gesture Synthesis via Hybrid Modality Diffusion Transformer and Cascaded-Synchronized Retrieval-Augmented Generation

Quanwei Yang, Luying Huang, Kaisiyuan Wang et al.

ICCV 2025 • arXiv:2507.22731

#1869

FoundIR: Unleashing Million-scale Training Data to Advance Foundation Models for Image Restoration

Hao Li, Xiang Chen, Jiangxin Dong et al.

ICCV 2025 • arXiv:2412.01427

#1870

Highlight What You Want: Weakly-Supervised Instance-Level Controllable Infrared-Visible Image Fusion

Zeyu Wang, Jizheng Zhang, Haiyu Song et al.

ICCV 2025

#1871

FaceLift: Learning Generalizable Single Image 3D Face Reconstruction from Synthetic Heads

Weijie Lyu, Yi Zhou, Ming-Hsuan Yang et al.

ICCV 2025 • arXiv:2412.17812

#1872

Visual Chronicles: Using Multimodal LLMs to Analyze Massive Collections of Images

Boyang Deng, Kyle Genova, Songyou Peng et al.

ICCV 2025 • highlight • arXiv:2504.08727

#1873

Latent-Reframe: Enabling Camera Control for Video Diffusion Models without Training

Zhenghong Zhou, Jie An, Jiebo Luo

ICCV 2025 • arXiv:2412.06029

#1874

Neuromanifold-Regularized KANs for Shape-fair Feature Representations

Mazlum Arslan, Weihong Guo, Shuo Li

ICCV 2025

#1875

Image Intrinsic Scale Assessment: Bridging the Gap Between Quality and Resolution

Vlad Hosu, Lorenzo Agnolucci, Daisuke Iso et al.

ICCV 2025 • arXiv:2502.06476

#1876

Less Static, More Private: Towards Transferable Privacy-Preserving Action Recognition by Generative Decoupled Learning

Zhi-Wei Xia, Kun-Yu Lin, Yuan-Ming Li et al.

ICCV 2025

#1877

Blind2Sound: Self-Supervised Image Denoising without Residual Noise

Jiazheng Liu, Zejin Wang, Bohao Chen et al.

ICCV 2025 • arXiv:2303.05183

#1878

IMoRe: Implicit Program-Guided Reasoning for Human Motion Q&A

Chen Li, Chinthani Sugandhika, Ee Yeo Keat et al.

ICCV 2025 • arXiv:2508.01984

#1879

AdaDCP: Learning an Adapter with Discrete Cosine Prior for Clear-to-Adverse Domain Generalization

Qi Bi, Yixian Shen, Jingjun Yi et al.

ICCV 2025

#1880

MorphoGen: Efficient Unconditional Generation of Long-Range Projection Neuronal Morphology via a Global-to-Local Framework

Tianfang Zhu, Hongyang Zhou, Anan LI

ICCV 2025

#1881

GaussianSpeech: Audio-Driven Personalized 3D Gaussian Avatars

Shivangi Aneja, Artem Sevastopolsky, Tobias Kirschstein et al.

ICCV 2025

#1882

A Quality-Guided Mixture of Score-Fusion Experts Framework for Human Recognition

Jie Zhu, Yiyang Su, Minchul Kim et al.

ICCV 2025 • arXiv:2508.00053

#1883

Capturing head avatar with hand contacts from a monocular video

Haonan He, Yufeng Zheng, Jie Song

ICCV 2025 • arXiv:2510.17181

#1884

MAVFlow: Preserving Paralinguistic Elements with Conditional Flow Matching for Zero-Shot AV2AV Multilingual Translation

Sungwoo Cho, Jeongsoo Choi, Sungnyun Kim et al.

ICCV 2025 • arXiv:2503.11026

#1885

Privacy-centric Deep Motion Retargeting for Anonymization of Skeleton-Based Motion Visualization

Thomas Carr, Depeng Xu, Shuhan Yuan et al.

ICCV 2025

#1886

UniPhys: Unified Planner and Controller with Diffusion for Flexible Physics-Based Character Control

Yan Wu, Korrawe Karunratanakul, Zhengyi Luo et al.

ICCV 2025 • highlight • arXiv:2504.12540

#1887

UniRes: Universal Image Restoration for Complex Degradations

Mo Zhou, Keren Ye, Mauricio Delbracio et al.

ICCV 2025 • arXiv:2506.05599

#1888

SV4D 2.0: Enhancing Spatio-Temporal Consistency in Multi-View Video Diffusion for High-Quality 4D Generation

Chun-Han Yao, Yiming Xie, Vikram Voleti et al.

ICCV 2025 • arXiv:2503.16396

#1889

Light-A-Video: Training-free Video Relighting via Progressive Light Fusion

Yujie Zhou, Jiazi Bu, Pengyang Ling et al.

ICCV 2025 • arXiv:2502.08590

#1890

Go to Zero: Towards Zero-shot Motion Generation with Million-scale Data

Ke Fan, Shunlin Lu, Minyue Dai et al.

ICCV 2025 • highlight • arXiv:2507.07095

#1891

DynamicFace: High-Quality and Consistent Face Swapping for Image and Video using Composable 3D Facial Priors

Runqi Wang, Yang Chen, Sijie Xu et al.

ICCV 2025 • arXiv:2501.08553

#1892

DisenQ: Disentangling Q-Former for Activity-Biometrics

Shehreen Azad, Yogesh Rawat

ICCV 2025 • highlight • arXiv:2507.07262

#1893

Controllable Weather Synthesis and Removal with Video Diffusion Models

Chih-Hao Lin, Zian Wang, Ruofan Liang et al.

ICCV 2025 • arXiv:2505.00704

#1894

T2Bs: Text-to-Character Blendshapes via Video Generation

Jiahao Luo, Chaoyang Wang, Michael Vasilkovsky et al.

ICCV 2025 • arXiv:2509.10678

#1895

Unfolding-Associative Encoder-Decoder Network with Progressive Alignment for Pansharpening

Shijie Fang, Hongping Gan

ICCV 2025

#1896

MOERL: When Mixture-of-Experts Meet Reinforcement Learning for Adverse Weather Image Restoration

Tao Wang, Peiwen Xia, Bo Li et al.

ICCV 2025

#1897

LOMM: Latest Object Memory Management for Temporally Consistent Video Instance Segmentation

Seunghun Lee, Jiwan Seo, Minwoo Choi et al.

ICCV 2025

#1898

DuoCLR: Dual-Surrogate Contrastive Learning for Skeleton-based Human Action Segmentation

Haitao Tian

ICCV 2025 • arXiv:2509.05543

#1899

EVDM: Event-based Real-world Video Deblurring with Mamba

Zhijing Sun, Senyan Xu, Kean Liu et al.

ICCV 2025

#1900

Q-Norm: Robust Representation Learning via Quality-Adaptive Normalization

Lanning Zhang, Ying Zhou, Fei Gao et al.

ICCV 2025

#1901

Proxy-Bridged Game Transformer for Interactive Extreme Motion Prediction

Yanwen Fang, Wenqi Jia, Xu Cao et al.

ICCV 2025

#1902

π-AVAS: Can Physics-Integrated Audio-Visual Modeling Boost Neural Acoustic Synthesis?

Susan Liang, Chao Huang, Yolo Yunlong Tang et al.

ICCV 2025

#1903

SemGes: Semantics-aware Co-Speech Gesture Generation using Semantic Coherence and Relevance Learning

Lanmiao Liu, Esam Ghaleb, asli ozyurek et al.

ICCV 2025 • arXiv:2507.19359

#1904

Metric Convolutions: A Unifying Theory to Adaptive Image Convolutions

Thomas Dagès, Michael Lindenbaum, Alfred Bruckstein

ICCV 2025 • arXiv:2406.05400

#1905

RobAVA: A Large-scale Dataset and Baseline Towards Video based Robotic Arm Action Understanding

Baoli Sun, Ning Wang, Xinzhu Ma et al.

ICCV 2025

#1906

IDFace: Face Template Protection for Efficient and Secure Identification

Sunpill Kim, Seunghun Paik, Chanwoo Hwang et al.

ICCV 2025 • arXiv:2507.12050

#1907

I2VControl: Disentangled and Unified Video Motion Synthesis Control

Wanquan Feng, Tianhao Qi, Jiawei Liu et al.

ICCV 2025 • arXiv:2411.17765

#1908

Generic Event Boundary Detection via Denoising Diffusion

Jaejun Hwang, Dayoung Gong, Manjin Kim et al.

ICCV 2025 • arXiv:2508.12084

#1909

Not All Degradations Are Equal: A Targeted Feature Denoising Framework for Generalizable Image Super-Resolution

hongjun wang, Jiyuan Chen, Zhengwei Yin et al.

ICCV 2025 • arXiv:2509.14841

#1910

Fine-Grained 3D Gaussian Head Avatars Modeling from Static Captures via Joint Reconstruction and Registration

Yuan Sun, Xuan Wang, Cong Wang et al.

ICCV 2025

#1911

Attention to Trajectory: Trajectory-Aware Open-Vocabulary Tracking

Yunhao Li, Yifan Jiao, Dan Meng et al.

ICCV 2025 • arXiv:2503.08145

#1912

MistSense: Versatile Online Detection of Procedural and Execution Mistakes

Constantin Patsch, Yuankai Wu, Marsil Zakour et al.

ICCV 2025

#1913

SEREP: Semantic Facial Expression Representation for Robust In-the-Wild Capture and Retargeting

Arthur Josi, Luiz Gustavo Hafemann, Abdallah Dib et al.

ICCV 2025 • arXiv:2412.14371

#1914

LUT-Fuse: Towards Extremely Fast Infrared and Visible Image Fusion via Distillation to Learnable Look-Up Tables

Xunpeng Yi, yibing zhang, Xinyu Xiang et al.

ICCV 2025 • arXiv:2509.00346

#1915

Morph: A Motion-free Physics Optimization Framework for Human Motion Generation

Zhuo Li, Mingshuang Luo, RuiBing Hou et al.

ICCV 2025 • arXiv:2411.14951

#1916

MixANT: Observation-dependent Memory Propagation for Stochastic Dense Action Anticipation

Syed Talal Wasim, Hamid Suleman, Olga Zatsarynna et al.

ICCV 2025 • arXiv:2509.11394

#1917

DeSPITE: Exploring Contrastive Deep Skeleton-Pointcloud-IMU-Text Embeddings for Advanced Point Cloud Human Activity Understanding

Thomas Kreutz, Max Mühlhäuser, Alejandro Sanchez Guinea

ICCV 2025 • arXiv:2506.13897

#1918

Efficient Concertormer for Image Deblurring and Beyond

Pin-Hung Kuo, Jinshan Pan, Shao-Yi Chien et al.

ICCV 2025 • arXiv:2404.06135

#1919

FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait

Taekyung Ki, Dongchan Min, Gyeongsu Chae

ICCV 2025 • arXiv:2412.01064

#1920

2HandedAfforder: Learning Precise Actionable Bimanual Affordances from Human Videos

Marvin Heidinger, Snehal Jauhri, Vignesh Prasad et al.

ICCV 2025 • arXiv:2503.09320

#1921

TIP-I2V: A Million-Scale Real Text and Image Prompt Dataset for Image-to-Video Generation

Wenhao Wang, Yi Yang

ICCV 2025 • arXiv:2411.04709

#1922

SCFlow: Implicitly Learning Style and Content Disentanglement with Flow Models

Pingchuan Ma, Xiaopei Yang, Ming Gui et al.

ICCV 2025 • arXiv:2508.03402

#1923

Penalizing Boundary Activation for Object Completeness in Diffusion Models

Haoyang Xu, Tianhao Zhao, Sibei Yang et al.

ICCV 2025 • arXiv:2509.16968

#1924

RayZer: A Self-supervised Large View Synthesis Model

Hanwen Jiang, Hao Tan, Peng Wang et al.

ICCV 2025 • arXiv:2505.00702

#1925

MatchDiffusion: Training-free Generation of Match-Cuts

Alejandro Pardo, Fabio Pizzati, Tong Zhang et al.

ICCV 2025 • arXiv:2411.18677

#1926

Dual-Expert Consistency Model for Efficient and High-Quality Video Generation

Zhengyao Lyu, Chenyang Si, Tianlin Pan et al.

ICCV 2025

#1927

Straighten Viscous Rectified Flow via Noise Optimization

Jimin Dai, Jiexi Yan, Jian Yang et al.

ICCV 2025 • highlight • arXiv:2507.10218

#1928

Scalable Dual Fingerprinting for Hierarchical Attribution of Text-to-Image Models

Jianwei Fei, Yunshu Dai, Peipeng Yu et al.

ICCV 2025 • highlight

#1929

QuantCache: Adaptive Importance-Guided Quantization with Hierarchical Latent and Layer Caching for Video Generation

Junyi Wu, Zhiteng Li, Zheng Hui et al.

ICCV 2025 • arXiv:2503.06545

#1930

CRAM: Large Scale Video Continual Learning with Bootstrapped Compression

Shivani Mall, Joao F. Henriques

ICCV 2025 • arXiv:2508.05001

#1931

Tree-NeRV: Efficient Non-Uniform Sampling for Neural Video Representation via Tree-Structured Feature Grids

Jiancheng Zhao, Yifan Zhan, Qingtian Zhu et al.

ICCV 2025

#1932

MaTe: Images Are All You Need for Material Transfer via Diffusion Transformer

Nisha Huang, Henglin Liu, Yizhou Lin et al.

ICCV 2025

#1933

VEGGIE: Instructional Editing and Reasoning Video Concepts with Grounded Generation

Shoubin Yu, Difan Liu, Ziqiao Ma et al.

ICCV 2025 • arXiv:2503.14350

#1934

CycleVAR: Repurposing Autoregressive Model for Unsupervised One-Step Image Translation

Yi Liu, Shengqian Li, Zuzeng Lin et al.

ICCV 2025 • arXiv:2506.23347

#1935

Adaptive Caching for Faster Video Generation with Diffusion Transformers

Kumara Kahatapitiya, Haozhe Liu, Sen He et al.

ICCV 2025 • arXiv:2411.02397

#1936

Edicho: Consistent Image Editing in the Wild

Qingyan Bai, Hao Ouyang, Yinghao Xu et al.

ICCV 2025 • arXiv:2412.21079

#1937

LUSD: Localized Update Score Distillation for Text-Guided Image Editing

Worameth Chinchuthakun, Tossaporn Saengja, Nontawat Tritrong et al.

ICCV 2025 • arXiv:2503.11054

#1938

FlowChef: Steering of Rectified Flow Models for Controlled Generations

Maitreya Patel, Song Wen, Dimitris Metaxas et al.

ICCV 2025

#1939

Translation of Text Embedding via Delta Vector to Suppress Strongly Entangled Content in Text-to-Image Diffusion Models

Eunseo Koh, SeungHoo Hong, Tae-Young Kim et al.

ICCV 2025 • arXiv:2508.10407

#1940

SynTag: Enhancing the Geometric Robustness of Inversion-based Generative Image Watermarking

Han Fang, Kejiang Chen, Zehua Ma et al.

ICCV 2025

#1941

IQA-Adapter: Exploring Knowledge Transfer from Image Quality Assessment to Diffusion-based Generative Models

Khaled Abud, Sergey Lavrushkin, Alexey Kirillov et al.

ICCV 2025 • highlight • arXiv:2412.01794

#1942

Dual Recursive Feedback on Generation and Appearance Latents for Pose-Robust Text-to-Image Diffusion

Jiwon Kim, Pureum Kim, SeonHwa Kim et al.

ICCV 2025 • arXiv:2508.09575

#1943

Anti-Tamper Protection for Unauthorized Individual Image Generation

Zelin Li, Ruohan Zong, Yifan Liu et al.

ICCV 2025 • arXiv:2508.06325

#1944

Continual Personalization for Diffusion Models

Yu-Chien Liao, Jr-Jen Chen, Chi-Pin Huang et al.

ICCV 2025 • arXiv:2510.02296

#1945

WikiAutoGen: Towards Multi-Modal Wikipedia-Style Article Generation

Zhongyu Yang, Jun Chen, Dannong Xu et al.

ICCV 2025 • arXiv:2503.19065

#1946

QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning

Haoxuan Wang, Yuzhang Shang, Zhihang Yuan et al.

ICCV 2025 • arXiv:2402.03666

#1947

Split-and-Combine: Enhancing Style Augmentation for Single Domain Generalization

Zhen Zhang, Zhen Zhang, Qianlong Dang et al.

ICCV 2025

#1948

Zero-Shot Depth Aware Image Editing with Diffusion Models

Rishubh Parihar, Sachidanand VS, Venkatesh Babu Radhakrishnan

ICCV 2025

#1949

Global and Local Entailment Learning for Natural World Imagery

Srikumar Sastry, Aayush Dhakal, Eric Xing et al.

ICCV 2025 • arXiv:2506.21476

#1950

TRKT: Weakly Supervised Dynamic Scene Graph Generation with Temporal-enhanced Relation-aware Knowledge Transferring

Zhu Xu, Ting Lei, Zhimin Li et al.

ICCV 2025 • arXiv:2508.04943

#1951

Pose-Star: Anatomy-Aware Editing for Open-World Fashion Images

Yuran Dong, Mang Ye

ICCV 2025 • arXiv:2507.03402

#1952

Who Controls the Authorization? Invertible Networks for Copyright Protection in Text-to-Image Synthesis

Baoyue Hu, Yang Wei, Junhao Xiao et al.

ICCV 2025

#1953

Magic Insert: Style-Aware Drag-and-Drop

Nataniel Ruiz, Yuanzhen Li, Neal Wadhwa et al.

ICCV 2025 • highlight • arXiv:2407.02489

#1954

FontAnimate: High Quality Few-shot Font Generation via Animating Font Transfer Process

Bin Fu, Zixuan Wang, Kainan Yan et al.

ICCV 2025

#1955

CharaConsist: Fine-Grained Consistent Character Generation

Mengyu Wang, Henghui Ding, Jianing Peng et al.

ICCV 2025 • arXiv:2507.11533

#1956

LiT: Delving into a Simple Linear Diffusion Transformer for Image Generation

Jiahao Wang, Ning Kang, Lewei Yao et al.

ICCV 2025 • arXiv:2501.12976

#1957

TextMaster: A Unified Framework for Realistic Text Editing via Glyph-Style Dual-Control

Zhenyu Yan, Jian Wang, Aoqiang Wang et al.

ICCV 2025 • arXiv:2410.09879

#1958

Beyond Perspective: Neural 360-Degree Video Compression

Andy Regensky, Marc Windsheimer, Fabian Brand et al.

ICCV 2025

#1959

MCID: Multi-aspect Copyright Infringement Detection for Generated Images

Chuanwei Huang, Zexi Jia, Hongyan Fei et al.

ICCV 2025

#1960

Text2Outfit: Controllable Outfit Generation with Multimodal Language Models

Yuanhao Zhai, Yen-Liang Lin, Minxu Peng et al.

ICCV 2025

#1961

One-Step Specular Highlight Removal with Adapted Diffusion Models

Mahir Atmis, LEVENT KARACAN, Mehmet SARIGÜL

ICCV 2025

#1962

DiGA3D: Coarse-to-Fine Diffusional Propagation of Geometry and Appearance for Versatile 3D Inpainting

Jingyi Pan, Dan Xu, Qiong Luo

ICCV 2025 • arXiv:2507.00429

#1963

DIMCIM: A Quantitative Evaluation Framework for Default-mode Diversity and Generalization in Text-to-Image Generative Models

Revant Teotia, Candace Ross, Karen Ullrich et al.

ICCV 2025 • arXiv:2506.05108

#1964

From Linearity to Non-Linearity: How Masked Autoencoders Capture Spatial Correlations

Anthony Bisulco, Rahul Ramesh, Randall Balestriero et al.

ICCV 2025 • arXiv:2508.15404

#1965

Reusing Computation in Text-to-Image Diffusion for Efficient Generation of Image Sets

Dale Decatur, Thibault Groueix, Wang Yifan et al.

ICCV 2025 • arXiv:2508.21032

#1966

Cross-Granularity Online Optimization with Masked Compensated Information for Learned Image Compression

Haowei Kuang, Wenhan Yang, Zongming Guo et al.

ICCV 2025

#1967

Stroke2Sketch: Harnessing Stroke Attributes for Training-Free Sketch Generation

Rui Yang, Huining Li, Yiyi Long et al.

ICCV 2025 • arXiv:2510.16319

#1968

TeEFusion: Blending Text Embeddings to Distill Classifier-Free Guidance

Minghao Fu, Guo-Hua Wang, Xiaohao Chen et al.

ICCV 2025 • arXiv:2507.18192

#1969

FiVE-Bench: A Fine-grained Video Editing Benchmark for Evaluating Emerging Diffusion and Rectified Flow Models

Minghan LI, Chenxi Xie, Yichen Wu et al.

ICCV 2025

#1970

Co-Painter: Fine-Grained Controllable Image Stylization via Implicit Decoupling and Adaptive Injection

Bowen Fu, Wei Wei, Jiaqi Tang et al.

ICCV 2025

#1971

PLA: Prompt Learning Attack against Text-to-Image Generative Models

XINQI LYU, Yihao LIU, Yanjie Li et al.

ICCV 2025 • arXiv:2508.03696

#1972

FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion

Haonan Qiu, Shiwei Zhang, Yujie Wei et al.

ICCV 2025 • arXiv:2412.09626

#1973

Holistic Tokenizer for Autoregressive Image Generation

Anlin Zheng, Haochen Wang, Yucheng Zhao et al.

ICCV 2025 • arXiv:2507.02358

#1974

Toward Better Out-painting: Improving the Image Composition with Initialization Policy Model

Xuan Han, Yihao Zhao, Yanhao Ge et al.

ICCV 2025

#1975

Versatile Transition Generation with Image-to-Video Diffusion

Zuhao Yang, Jiahui Zhang, Yingchen Yu et al.

ICCV 2025 • arXiv:2508.01698

#1976

MetaMorph: Multimodal Understanding and Generation via Instruction Tuning

Shengbang Tong, David Fan, Jiachen Zhu et al.

ICCV 2025 • arXiv:2412.14164

#1977

SafetyDPO: Scalable Safety Alignment for Text-to-Image Generation

Runtao Liu, I Chen, Jindong Gu et al.

ICCV 2025

#1978

DiffIP: Representation Fingerprints for Robust IP Protection of Diffusion Models

Zhuoling Li, Haoxuan Qu, Jason Kuen et al.

ICCV 2025

#1979

FairHuman: Boosting Hand and Face Quality in Human Image Generation with Minimum Potential Delay Fairness in Diffusion Models

Yuxuan Wang, Tianwei Cao, Huayu Zhang et al.

ICCV 2025 • arXiv:2507.02714

#1980

Processing and acquisition traces in visual encoders: What does CLIP know about your camera?

Ryan Ramos, Vladan Stojnić, Giorgos Kordopatis-Zilos et al.

ICCV 2025 • highlight • arXiv:2508.10637

#1981

AM-Adapter: Appearance Matching Adapter for Exemplar-based Semantic Image Synthesis in-the-Wild

Siyoon Jin, Jisu Nam, Jiyoung Kim et al.

ICCV 2025

#1982

Diffusion Epistemic Uncertainty with Asymmetric Learning for Diffusion-Generated Image Detection

Yingsong Huang, Hui Guo, Jing Huang et al.

ICCV 2025 • arXiv:2601.14625

#1983

Draw Your Mind: Personalized Generation via Condition-Level Modeling in Text-to-Image Diffusion Models

Hyungjin Kim, Seokho Ahn, Young-Duk Seo

ICCV 2025 • arXiv:2508.03481

#1984

Calibrating MLLM-as-a-judge via Multimodal Bayesian Prompt Ensembles

Eric Slyman, Mehrab Tanjim, Kushal Kafle et al.

ICCV 2025 • arXiv:2509.08777

#1985

V.I.P.: Iterative Online Preference Distillation for Efficient Video Diffusion Models

Jisoo Kim, Wooseok Seo, Junwan Kim et al.

ICCV 2025 • arXiv:2508.03254

#1986

LOTA: Bit-Planes Guided AI-Generated Image Detection

Renxi Cheng, Hongsong Wang, Yang Zhang et al.

ICCV 2025 • arXiv:2510.14230

#1987

X-Prompt: Generalizable Auto-Regressive Visual Learning with In-Context Prompting

Zeyi Sun, Ziyang Chu, Pan Zhang et al.

ICCV 2025

#1988

AnyI2V: Animating Any Conditional Image with Motion Control

Ziye Li, Xincheng Shuai, Hao Luo et al.

ICCV 2025 • arXiv:2507.02857

#1989

Streamlining Image Editing with Layered Diffusion Brushes

Peyman Gholami, Robert Xiao

ICCV 2025 • arXiv:2405.00313

#1990

EEdit: Rethinking the Spatial and Temporal Redundancy for Efficient Image Editing

Zexuan Yan, Yue Ma, Chang Zou et al.

ICCV 2025 • arXiv:2503.10270

#1991

RAGDiffusion: Faithful Cloth Generation via External Knowledge Assimilation

Yuhan Li, Xianfeng Tan, Wenxiang Shang et al.

ICCV 2025 • highlight • arXiv:2411.19528

#1992

Instruction-based Image Editing with Planning, Reasoning, and Generation

Liya Ji, Chenyang Qi, Qifeng Chen

ICCV 2025

#1993

HDR Image Generation via Gain Map Decomposed Diffusion

Yuanshen Guan, Ruikang Xu, Yinuo Liao et al.

ICCV 2025

#1994

ESSENTIAL: Episodic and Semantic Memory Integration for Video Class-Incremental Learning

Jongseo Lee, Kyungho Bae, Kyle Min et al.

ICCV 2025 • highlight • arXiv:2508.10896

#1995

Accelerating Diffusion Transformer via Gradient-Optimized Cache

Junxiang Qiu, Lin Liu, Shuo Wang et al.

ICCV 2025 • arXiv:2503.05156

#1996

The Silent Assistant: NoiseQuery as Implicit Guidance for Goal-Driven Image Generation

Ruoyu Wang, Huayang Huang, Ye Zhu et al.

ICCV 2025 • highlight • arXiv:2412.05101

#1997

Progressive Growing of Video Tokenizers for Temporally Compact Latent Spaces

Aniruddha Mahapatra, Long Mai, David Bourgin et al.

ICCV 2025 • arXiv:2501.05442

#1998

ArtEditor: Learning Customized Instructional Image Editor from Few-Shot Examples

Shijie Huang, Yiren Song, Yuxuan Zhang et al.

ICCV 2025

#1999

MC-Bench: A Benchmark for Multi-Context Visual Grounding in the Era of MLLMs

Yunqiu Xu, Linchao Zhu, Yi Yang

ICCV 2025 • arXiv:2410.12332

#2000

Disrupting Model Merging: A Parameter-Level Defense Without Sacrificing Accuracy

JUNHAO WEI, YU ZHE, Jun Sakuma

ICCV 2025 • arXiv:2503.07661
