Most Cited 2025 "masked autoencoder paradigm" Papers

22,274 papers found • Page 107 of 112

Filters:Most Cited 2025 masked autoencoder paradigm Clear all

Conference

AAAI 2025 (3,028)COLM 2025 (418)CVPR 2025 (2,873)ICCV 2025 (2,701)ICLR 2025 (3,827)ICML 2025 (3,340)ISMAR 2025 (229)NEURIPS 2025 (5,858)AAAI 2024 (2,289)CVPR 2024 (2,716)ECCV 2024 (2,387)ICLR 2024 (2,297)ICML 2024 (2,635)

Paper Type

poster (24,624)paper (8,558)oral (1,594)spotlight (1,421)highlight (975)

#21201

CHARM3R: Towards Unseen Camera Height Robust Monocular 3D Detector

Abhinav Kumar, Yuliang Guo, Zhihao Zhang et al.

ICCV 2025posterarXiv:2508.11185

#21202

PID-controlled Langevin Dynamics for Faster Sampling on Generative Models

Hongyi Chen, Jianhai Shu, Jingtao Ding et al.

NEURIPS 2025posterarXiv:2511.12603

#21203

Learning on the Go: A Meta-learning Object Navigation Model

Xiaorong Qin, Xinhang Song, Sixian Zhang et al.

ICCV 2025poster

#21204

WonderPlay: Dynamic 3D Scene Generation from a Single Image and Actions

Zizhang Li, Hong-Xing Yu, Wei Liu et al.

ICCV 2025highlightarXiv:2505.18151

#21205

Beyond the Destination: A Novel Benchmark for Exploration-Aware Embodied Question Answering

Kaixuan Jiang, Yang Liu, Weixing Chen et al.

ICCV 2025posterarXiv:2503.11117

#21206

Not all Views are Created Equal: Analyzing Viewpoint Instabilities in Vision Foundation Models

Mateusz Michalkiewicz, Xinyue Bai, Mahsa Baktashmotlagh et al.

ICCV 2025posterarXiv:2412.19920

#21207

CHROME: Clothed Human Reconstruction with Occlusion-Resilience and Multiview-Consistency from a Single Image

Arindam Dutta, Meng Zheng, Zhongpai Gao et al.

ICCV 2025highlightarXiv:2503.15671

#21208

ReCoT: Reflective Self-Correction Training for Mitigating Confirmation Bias in Large Vision-Language Models

Mengxue Qu, Yibo Hu, Kunyang Han et al.

ICCV 2025poster

#21209

STree: Speculative Tree Decoding for Hybrid State Space Models

Yangchao Wu, Zongyue Qin, Alex Wong et al.

NEURIPS 2025posterarXiv:2505.14969

#21210

Discovering Knowledge Deficiencies of Language Models on Massive Knowledge Base

Linxin Song, Xuwei Ding, Jieyu Zhang et al.

COLM 2025paper

#21211

OMNI-DC: Highly Robust Depth Completion with Multiresolution Depth Integration

Yiming Zuo, Willow Yang, Zeyu Ma et al.

ICCV 2025posterarXiv:2411.19278

#21212

Multi-Objective Reinforcement Learning with Max-Min Criterion: A Game-Theoretic Approach

woohyeon Byeon, Giseung Park, Jongseong Chae et al.

NEURIPS 2025posterarXiv:2510.20235

#21213

SNEAKDOOR: Stealthy Backdoor Attacks against Distribution Matching-based Dataset Condensation

He Yang, Dongyi Lv, Song Ma et al.

NEURIPS 2025poster

#21214

CogNav: Cognitive Process Modeling for Object Goal Navigation with LLMs

Yihan Cao, Jiazhao Zhang, Zhinan Yu et al.

ICCV 2025posterarXiv:2412.10439

#21215

Disentangling Hyperedges through the Lens of Category Theory

Yoonho Lee, Junseok Lee, Sangwoo Seo et al.

NEURIPS 2025posterarXiv:2510.16289

#21216

Bridging the Sky and Ground: Towards View-Invariant Feature Learning for Aerial-Ground Person Re-Identification

Wajahat Khalid, Bin Liu, Xulin Li et al.

ICCV 2025poster

#21217

WalkVLM: Aid Visually Impaired People Walking by Vision Language Model

Zhiqiang Yuan, Ting Zhang, Yeshuang Zhu et al.

ICCV 2025poster

#21218

VIGFace: Virtual Identity Generation for Privacy-Free Face Recognition Dataset

Minsoo Kim, Min-Cheol Sagong, Gi Pyo Nam et al.

ICCV 2025poster

#21219

MotionStreamer: Streaming Motion Generation via Diffusion-based Autoregressive Model in Causal Latent Space

Lixing Xiao, Shunlin Lu, Huaijin Pi et al.

ICCV 2025posterarXiv:2503.15451

#21220

Benchmarking Retrieval-Augmented Multimomal Generation for Document Question Answering

Kuicai Dong, CHANG YUJING, Shijie Huang et al.

NEURIPS 2025poster

#21221

MUniverse: A Simulation and Benchmarking Suite for Motor Unit Decomposition

Pranav Mamidanna, Thomas Klotz, Dimitrios Chalatsis et al.

NEURIPS 2025poster

#21222

CineTechBench: A Benchmark for Cinematographic Technique Understanding and Generation

Xinran Wang, Songyu Xu, Shan Xiangxuan et al.

NEURIPS 2025posterarXiv:2505.15145

#21223

Mixture of Experts Guided by Gaussian Splatters Matters: A new Approach to Weakly-Supervised Video Anomaly Detection

Giacomo D'Amicantonio, Snehashis Majhi, Quan Kong et al.

ICCV 2025highlightarXiv:2508.06318

#21224

What If: Understanding Motion Through Sparse Interactions

Stefan A. Baumann, Nick Stracke, Timy Phan et al.

ICCV 2025poster

#21225

Sekai: A Video Dataset towards World Exploration

Zhen Li, Chuanhao Li, Xiaofeng Mao et al.

NEURIPS 2025posterarXiv:2506.15675

#21226

Beyond Label Semantics: Language-Guided Action Anatomy for Few-shot Action Recognition

Zefeng Qian, Xincheng Yao, Yifei Huang et al.

ICCV 2025posterarXiv:2507.16287

#21227

Homogeneous Algorithms Can Reduce Competition in Personalized Pricing

Nathanael Jo, Ashia Wilson, Kathleen Creel et al.

NEURIPS 2025posterarXiv:2503.15634

#21228

TIDMAD: Time Series Dataset for Discovering Dark Matter with AI Denoising

Jessica Fry, Xinyi Fu, Zhenghao Fu et al.

NEURIPS 2025spotlightarXiv:2406.04378

#21229

Revisiting Generative Infrared and Visible Image Fusion Based on Human Cognitive Laws

Lin Guo, Xiaoqing Luo, Wei Xie et al.

NEURIPS 2025spotlightarXiv:2510.26268

#21230

MamTiff-CAD: Multi-Scale Latent Diffusion with Mamba+ for Complex Parametric Sequence

Liyuan Deng, Yunpeng Bai, Yongkang Dai et al.

ICCV 2025posterarXiv:2511.17647

#21231

Local Scale Equivariance with Latent Deep Equilibrium Canonicalizer

Md Ashiqur Rahman, Chiao-An Yang, Michael N Cheng et al.

ICCV 2025posterarXiv:2508.14187

#21232

RigAnyFace: Scaling Neural Facial Mesh Auto-Rigging with Unlabeled Data

Wenchao Ma, Dario Kneubuehler, Maurice Chu et al.

NEURIPS 2025posterarXiv:2511.18601

#21233

EfficientMT: Efficient Temporal Adaptation for Motion Transfer in Text-to-Video Diffusion Models

Yufei Cai, Hu Han, Yuxiang Wei et al.

ICCV 2025posterarXiv:2503.19369

#21234

Moment- and Power-Spectrum-Based Gaussianity Regularization for Text-to-Image Models

Jisung Hwang, Jaihoon Kim, Minhyuk Sung

NEURIPS 2025posterarXiv:2509.07027

#21235

Collab-RAG: Boosting Retrieval-Augmented Generation for Complex Question Answering via White-Box and Black-Box LLM Collaboration

Ran Xu, Wenqi Shi, Yuchen Zhuang et al.

COLM 2025paper

#21236

Deep Adaptive Unfolded Network via Spatial Morphology Stripping and Spectral Filtration for Pan-sharpening

Hebaixu Wang, Jiayi Ma

ICCV 2025poster

#21237

Reference-based Super-Resolution via Image-based Retrieval-Augmented Generation Diffusion

Byeonghun Lee, Hyunmin Cho, Honggyu Choi et al.

ICCV 2025poster

#21238

Vulnerability-Aware Spatio-Temporal Learning for Generalizable Deepfake Video Detection

Dat NGUYEN, Marcella Astrid, Anis Kacem et al.

ICCV 2025posterarXiv:2501.01184

#21239

Multi-modal Identity Extraction

Ryan Webster, Teddy Furon

ICCV 2025poster

#21240

Understanding Differential Transformer Unchains Pretrained Self-Attentions

Chaerin Kong, Jiho Jang, Nojun Kwak

NEURIPS 2025posterarXiv:2505.16333

#21241

CombatVLA: An Efficient Vision-Language-Action Model for Combat Tasks in 3D Action Role-Playing Games

Peng Chen, Pi Bu, Yingyao Wang et al.

ICCV 2025posterarXiv:2503.09527

#21242

Blind Noisy Image Deblurring Using Residual Guidance Strategy

Heyan Liu, Jianing Sun, Jun Liu et al.

ICCV 2025poster

#21243

Drawing Developmental Trajectory from Cortical Surface Reconstruction

WENXUAN WU, ruowen qu, Zhongliang Liu et al.

ICCV 2025poster

#21244

ActiveVOO: Value of Observation Guided Active Knowledge Acquisition for Open-World Embodied Lifted Regression Planning

Xiaotian Liu, Ali Pesaranghader, Jaehong Kim et al.

NEURIPS 2025poster

#21245

Less is More: Improving Motion Diffusion Models with Sparse Keyframes

Jinseok Bae, Inwoo Hwang, Young-Yoon Lee et al.

ICCV 2025posterarXiv:2503.13859

#21246

DGTalker: Disentangled Generative Latent Space Learning for Audio-Driven Gaussian Talking Heads

Xiaoxi Liang, Yanbo Fan, Qiya Yang et al.

ICCV 2025poster

#21247

Sample Efficient Preference Alignment in LLMs via Active Exploration

Viraj Mehta, Syrine Belakaria, Vikramjeet Das et al.

COLM 2025paper

#21248

Causal-Entity Reflected Egocentric Traffic Accident Video Synthesis

Lei-lei Li, Jianwu Fang, Junbin Xiao et al.

ICCV 2025posterarXiv:2506.23263

#21249

Riemannian-Geometric Fingerprints of Generative Models

Hae Jin Song, Laurent Itti

ICCV 2025highlightarXiv:2506.22802

#21250

G-DexGrasp: Generalizable Dexterous Grasping Synthesis Via Part-Aware Prior Retrieval and Prior-Assisted Generation

Juntao Jian, Xiuping Liu, Zixuanchen Zixuanchen et al.

ICCV 2025posterarXiv:2503.19457

#21251

Fast Projection-Free Approach (without Optimization Oracle) for Optimization over Compact Convex Set

Chenghao Liu, Enming Liang, Minghua Chen

NEURIPS 2025spotlight

#21252

ISP2HRNet: Learning to Reconstruct High Resolution Image from Irregularly Sampled Pixels via Hierarchical Gradient Learning

Yuanlin Wang, Ruiqin Xiong, Rui Zhao et al.

ICCV 2025highlight

#21253

Learning to Factorize Spatio-Temporal Foundation Models

Siru Zhong, Junjie Qiu, Yangyu Wu et al.

NEURIPS 2025oral

#21254

Event-Driven Storytelling with Multiple Lifelike Humans in a 3D Scene

Donggeun Lim, Jinseok Bae, Inwoo Hwang et al.

ICCV 2025posterarXiv:2507.19232

#21255

Robust and Scalable Autonomous Reinforcement Learning in Irreversible Environments

Sang-Hyun Lee

NEURIPS 2025poster

#21256

Disentangling misreporting from genuine adaptation in strategic settings: a causal approach

Dylan Zapzalka, Trenton Chang, Lindsay Warrenburg et al.

NEURIPS 2025poster

#21257

Fast Image Super-Resolution via Consistency Rectified Flow

Jiaqi Xu, Wenbo Li, Haoze Sun et al.

ICCV 2025poster

#21258

Event-guided HDR Reconstruction with Diffusion Priors

Yixin Yang, jiawei zhang, Yang Zhang et al.

ICCV 2025poster

#21259

AffordDexGrasp: Open-set Language-guided Dexterous Grasp with Generalizable-Instructive Affordance

Yilin Wei, Mu Lin, Yuhao Lin et al.

ICCV 2025posterarXiv:2503.07360

#21260

Robust Adverse Weather Removal via Spectral-based Spatial Grouping

Yuhwan Jeong, Yunseo Yang, Youngho Yoon et al.

ICCV 2025posterarXiv:2507.22498

#21261

Hipandas: Hyperspectral Image Joint Denoising and Super-Resolution by Image Fusion with the Panchromatic Image

Shuang Xu, Zixiang Zhao, Haowen Bai et al.

ICCV 2025posterarXiv:2412.04201

#21262

Revolutionizing Graph Aggregation: From Suppression to Amplification via BoostGCN

Jiaxin Wu, Chenglong Pang, Guangxiong Chen et al.

NEURIPS 2025poster

#21263

VA-GS: Enhancing the Geometric Representation of Gaussian Splatting via View Alignment

Qing Li, Huifang Feng, Xun Gong et al.

NEURIPS 2025posterarXiv:2510.11473

#21264

Scaling Law with Learning Rate Annealing

Howe Tissue, Venus Wang, Lu Wang

NEURIPS 2025posterarXiv:2408.11029

#21265

RODS: Robust Optimization Inspired Diffusion Sampling for Detecting and Reducing Hallucination in Generative Models

Yiqi Tian, Pengfei Jin, Mingze Yuan et al.

NEURIPS 2025posterarXiv:2507.12201

#21266

A*-Thought: Efficient Reasoning via Bidirectional Compression for Low-Resource Settings

Xiaoang Xu, Shuo Wang, Xu Han et al.

NEURIPS 2025posterarXiv:2505.24550

#21267

VideoSetDiff: Identifying and Reasoning Similarities and Differences in Similar Videos

YUE QIU, Yanjun Sun, Takuma Yagi et al.

ICCV 2025poster

#21268

HADES: Human Avatar with Dynamic Explicit Hair Strands

Zhanfeng Liao, Hanzhang Tu, Cheng Peng et al.

ICCV 2025poster

#21269

DreamRelation: Relation-Centric Video Customization

Yujie Wei, Shiwei Zhang, Hangjie Yuan et al.

ICCV 2025posterarXiv:2503.07602

#21270

A Learning-Augmented Dynamic Programming Approach for Orienteering Problem with Time Windows

Guansheng Peng, Lining Xing, Fuyan Ma et al.

NEURIPS 2025poster

#21271

FoundIR: Unleashing Million-scale Training Data to Advance Foundation Models for Image Restoration

Hao Li, Xiang Chen, Jiangxin Dong et al.

ICCV 2025posterarXiv:2412.01427

#21272

Highlight What You Want: Weakly-Supervised Instance-Level Controllable Infrared-Visible Image Fusion

Zeyu Wang, Jizheng Zhang, Haiyu Song et al.

ICCV 2025poster

#21273

FaceLift: Learning Generalizable Single Image 3D Face Reconstruction from Synthetic Heads

Weijie Lyu, Yi Zhou, Ming-Hsuan Yang et al.

ICCV 2025posterarXiv:2412.17812

#21274

Visual Chronicles: Using Multimodal LLMs to Analyze Massive Collections of Images

Boyang Deng, Kyle Genova, Songyou Peng et al.

ICCV 2025highlightarXiv:2504.08727

#21275

Blind2Sound: Self-Supervised Image Denoising without Residual Noise

Jiazheng Liu, Zejin Wang, Bohao Chen et al.

ICCV 2025posterarXiv:2303.05183

#21276

IMoRe: Implicit Program-Guided Reasoning for Human Motion Q&A

Chen Li, Chinthani Sugandhika, Ee Yeo Keat et al.

ICCV 2025posterarXiv:2508.01984

#21277

$100K or 100 Days: Trade-offs when Pre-Training with Academic Resources

Apoorv Khandelwal, Tian Yun, Nihal V. Nayak et al.

COLM 2025paper

#21278

MAVFlow: Preserving Paralinguistic Elements with Conditional Flow Matching for Zero-Shot AV2AV Multilingual Translation

Sungwoo Cho, Jeongsoo Choi, Sungnyun Kim et al.

ICCV 2025posterarXiv:2503.11026

#21279

Privacy-centric Deep Motion Retargeting for Anonymization of Skeleton-Based Motion Visualization

Thomas Carr, Depeng Xu, Shuhan Yuan et al.

ICCV 2025poster

#21280

UniPhys: Unified Planner and Controller with Diffusion for Flexible Physics-Based Character Control

Yan Wu, Korrawe Karunratanakul, Zhengyi Luo et al.

ICCV 2025highlightarXiv:2504.12540

#21281

UniRes: Universal Image Restoration for Complex Degradations

Mo Zhou, Keren Ye, Mauricio Delbracio et al.

ICCV 2025posterarXiv:2506.05599

#21282

SV4D 2.0: Enhancing Spatio-Temporal Consistency in Multi-View Video Diffusion for High-Quality 4D Generation

Chun-Han Yao, Yiming Xie, Vikram Voleti et al.

ICCV 2025posterarXiv:2503.16396

#21283

Light-A-Video: Training-free Video Relighting via Progressive Light Fusion

Yujie Zhou, Jiazi Bu, Pengyang Ling et al.

ICCV 2025posterarXiv:2502.08590

#21284

Go to Zero: Towards Zero-shot Motion Generation with Million-scale Data

Ke Fan, Shunlin Lu, Minyue Dai et al.

ICCV 2025highlightarXiv:2507.07095

#21285

Graph Few-Shot Learning via Adaptive Spectrum Experts and Cross-Set Distribution Calibration

Yonghao Liu, Yajun Wang, Chunli Guo et al.

NEURIPS 2025posterarXiv:2510.12140

#21286

Re-coding for Uncertainties: Edge-awareness Semantic Concordance for Resilient Event-RGB Segmentation

Nan Bao, Yifan Zhao, Lin Zhu et al.

NEURIPS 2025posterarXiv:2511.08269

#21287

Group-wise Scaling and Orthogonal Decomposition for Domain-Invariant Feature Extraction in Face Anti-Spoofing

Seungjin Jung, Kanghee Lee, Yonghyun Jeong et al.

ICCV 2025posterarXiv:2507.04006

#21288

LLM Unlearning Without an Expert Curated Dataset

Xiaoyuan Zhu, Muru Zhang, Ollie Liu et al.

COLM 2025paperarXiv:2508.06595

#21289

RRO: LLM Agent Optimization Through Rising Reward Trajectories

Zilong Wang, Jingfeng Yang, Sreyashi Nag et al.

COLM 2025paper

#21290

DynamicFace: High-Quality and Consistent Face Swapping for Image and Video using Composable 3D Facial Priors

Runqi Wang, Yang Chen, Sijie Xu et al.

ICCV 2025posterarXiv:2501.08553

#21291

DisenQ: Disentangling Q-Former for Activity-Biometrics

Shehreen Azad, Yogesh Rawat

ICCV 2025highlightarXiv:2507.07262

#21292

Finding Low-Rank Matrix Weights in DNNs via Riemannian Optimization: RAdaGrad and RAdamW

Fengmiao Bian, Jinyang ZHENG, Ziyun Liu et al.

NEURIPS 2025poster

#21293

Online Bilateral Trade With Minimal Feedback: Don’t Waste Seller’s Time

Francesco Bacchiocchi, Matteo Castiglioni, Roberto Colomboni et al.

NEURIPS 2025poster

#21294

ProtoPairNet: Interpretable Regression through Prototypical Pair Reasoning

Rose Gurung, Ronilo Ragodos, Chiyu Ma et al.

NEURIPS 2025poster

#21295

Revisiting Frank-Wolfe for Structured Nonconvex Optimization

Hoomaan Maskan, Yikun Hou, Suvrit Sra et al.

NEURIPS 2025posterarXiv:2503.08921

#21296

T2Bs: Text-to-Character Blendshapes via Video Generation

Jiahao Luo, Chaoyang Wang, Michael Vasilkovsky et al.

ICCV 2025posterarXiv:2509.10678

#21297

LOMM: Latest Object Memory Management for Temporally Consistent Video Instance Segmentation

Seunghun Lee, Jiwan Seo, Minwoo Choi et al.

ICCV 2025poster

#21298

Chirality in Action: Time-Aware Video Representation Learning by Latent Straightening

Piyush Nitin Bagad, Andrew Zisserman

NEURIPS 2025oralarXiv:2509.08502

#21299

Quasi-Self-Concordant Optimization with $\ell_{\infty}$ Lewis Weights

Alina Ene, Ta Duy Nguyen, Adrian Vladu

NEURIPS 2025poster

#21300

MeshAnything V2: Artist-Created Mesh Generation with Adjacent Mesh Tokenization

Yiwen Chen, Yikai Wang, Yihao Luo et al.

ICCV 2025posterarXiv:2408.02555

#21301

π-AVAS: Can Physics-Integrated Audio-Visual Modeling Boost Neural Acoustic Synthesis?

Susan Liang, Chao Huang, Yolo Yunlong Tang et al.

ICCV 2025poster

#21302

SemGes: Semantics-aware Co-Speech Gesture Generation using Semantic Coherence and Relevance Learning

Lanmiao Liu, Esam Ghaleb, asli ozyurek et al.

ICCV 2025posterarXiv:2507.19359

#21303

I2VControl: Disentangled and Unified Video Motion Synthesis Control

Wanquan Feng, Tianhao Qi, Jiawei Liu et al.

ICCV 2025posterarXiv:2411.17765

#21304

MeshLLM: Empowering Large Language Models to Progressively Understand and Generate 3D Mesh

Shuangkang Fang, I-Chao Shen, Yufeng Wang et al.

ICCV 2025highlightarXiv:2508.01242

#21305

LUT-Fuse: Towards Extremely Fast Infrared and Visible Image Fusion via Distillation to Learnable Look-Up Tables

Xunpeng Yi, yibing zhang, Xinyu Xiang et al.

ICCV 2025posterarXiv:2509.00346

#21306

MixANT: Observation-dependent Memory Propagation for Stochastic Dense Action Anticipation

Syed Talal Wasim, Hamid Suleman, Olga Zatsarynna et al.

ICCV 2025posterarXiv:2509.11394

#21307

Imagine All The Relevance: Scenario-Profiled Indexing with Knowledge Expansion for Dense Retrieval

Sangam Lee, Ryang Heo, SeongKu Kang et al.

COLM 2025paper

#21308

FLOAT: Generative Motion Latent Flow Matching for Audio-driven Talking Portrait

Taekyung Ki, Dongchan Min, Gyeongsu Chae

ICCV 2025posterarXiv:2412.01064

#21309

M²IV: Towards Efficient and Fine-grained Multimodal In-Context Learning via Representation Engineering

Yanshu Li, Yi Cao, Hongyang He et al.

COLM 2025paper

#21310

2HandedAfforder: Learning Precise Actionable Bimanual Affordances from Human Videos

Marvin Heidinger, Snehal Jauhri, Vignesh Prasad et al.

ICCV 2025posterarXiv:2503.09320

#21311

Language models align with brain regions that represent concepts across modalities

Maria Ryskina, Greta Tuckute, Alexander Fung et al.

COLM 2025paperarXiv:2508.11536

#21312

Mixed-Sample SGD: an End-to-end Analysis of Supervised Transfer Learning

Yuyang Deng, Samory Kpotufe

NEURIPS 2025posterarXiv:2507.04194

#21313

RayZer: A Self-supervised Large View Synthesis Model

Hanwen Jiang, Hao Tan, Peng Wang et al.

ICCV 2025posterarXiv:2505.00702

#21314

MatchDiffusion: Training-free Generation of Match-Cuts

Alejandro Pardo, Fabio Pizzati, Tong Zhang et al.

ICCV 2025posterarXiv:2411.18677

#21315

Scalable Dual Fingerprinting for Hierarchical Attribution of Text-to-Image Models

Jianwei Fei, Yunshu Dai, Peipeng Yu et al.

ICCV 2025highlight

#21316

QuantCache: Adaptive Importance-Guided Quantization with Hierarchical Latent and Layer Caching for Video Generation

Junyi Wu, Zhiteng Li, Zheng Hui et al.

ICCV 2025posterarXiv:2503.06545

#21317

Tree-NeRV: Efficient Non-Uniform Sampling for Neural Video Representation via Tree-Structured Feature Grids

Jiancheng Zhao, Yifan Zhan, Qingtian Zhu et al.

ICCV 2025poster

#21318

MaTe: Images Are All You Need for Material Transfer via Diffusion Transformer

Nisha Huang, Henglin Liu, Yizhou Lin et al.

ICCV 2025poster

#21319

Adaptive Caching for Faster Video Generation with Diffusion Transformers

Kumara Kahatapitiya, Haozhe Liu, Sen He et al.

ICCV 2025posterarXiv:2411.02397

#21320

FlowChef: Steering of Rectified Flow Models for Controlled Generations

Maitreya Patel, Song Wen, Dimitris Metaxas et al.

ICCV 2025poster

#21321

SynTag: Enhancing the Geometric Robustness of Inversion-based Generative Image Watermarking

Han Fang, Kejiang Chen, Zehua Ma et al.

ICCV 2025poster

#21322

Robo-Instruct: Simulator-Augmented Instruction Alignment For Finetuning Code LLMs

Zichao Hu, Junyi Jessy Li, Arjun Guha et al.

COLM 2025paper

#21323

WikiAutoGen: Towards Multi-Modal Wikipedia-Style Article Generation

Zhongyu Yang, Jun Chen, Dannong Xu et al.

ICCV 2025posterarXiv:2503.19065

#21324

QuEST: Low-bit Diffusion Model Quantization via Efficient Selective Finetuning

Haoxuan Wang, Yuzhang Shang, Zhihang Yuan et al.

ICCV 2025posterarXiv:2402.03666

#21325

ALOPE: Adaptive Layer Optimization for Translation Quality Estimation using Large Language Models

Archchana Sindhujan, Shenbin Qian, Chan Chi Chun Matthew et al.

COLM 2025paper

#21326

Split-and-Combine: Enhancing Style Augmentation for Single Domain Generalization

Zhen Zhang, Zhen Zhang, Qianlong Dang et al.

ICCV 2025poster

#21327

Fast Non-Log-Concave Sampling under Nonconvex Equality and Inequality Constraints with Landing

Kijung Jeon, Michael Muehlebach, Molei Tao

NEURIPS 2025posterarXiv:2510.22044

#21328

Fractional Langevin Dynamics for Combinatorial Optimization via Polynomial-Time Escape

Shiyue Wang, Ziao Guo, Changhong Lu et al.

NEURIPS 2025poster

#21329

Zero-Shot Depth Aware Image Editing with Diffusion Models

Rishubh Parihar, Sachidanand VS, Venkatesh Babu Radhakrishnan

ICCV 2025poster

#21330

Pose-Star: Anatomy-Aware Editing for Open-World Fashion Images

Yuran Dong, Mang Ye

ICCV 2025posterarXiv:2507.03402

#21331

Who Controls the Authorization? Invertible Networks for Copyright Protection in Text-to-Image Synthesis

Baoyue Hu, Yang Wei, Junhao Xiao et al.

ICCV 2025poster

#21332

Retrieval-Augmented Generation with Conflicting Evidence

Han Wang, Archiki Prasad, Elias Stengel-Eskin et al.

COLM 2025paperarXiv:2504.13079

#21333

FontAnimate: High Quality Few-shot Font Generation via Animating Font Transfer Process

Bin Fu, Zixuan Wang, Kainan Yan et al.

ICCV 2025poster

#21334

CASCADE Your Datasets for Cross-Mode Knowledge Retrieval of Language Models

Runlong Zhou, Yi Zhang

COLM 2025paper

#21335

LiT: Delving into a Simple Linear Diffusion Transformer for Image Generation

Jiahao Wang, Ning Kang, Lewei Yao et al.

ICCV 2025posterarXiv:2501.12976

#21336

Sherkala-Chat: Building a State-of-the-Art LLM for Kazakh in a Moderately Resourced Setting

Fajri Koto, Rituraj Joshi, Nurdaulet Mukhituly et al.

COLM 2025paper

#21337

Pre-Trained Policy Discriminators are General Reward Models

Shihan Dou, Shichun Liu, Yuming Yang et al.

NEURIPS 2025posterarXiv:2507.05197

#21338

TextMaster: A Unified Framework for Realistic Text Editing via Glyph-Style Dual-Control

Zhenyu Yan, Jian Wang, Aoqiang Wang et al.

ICCV 2025posterarXiv:2410.09879

#21339

MCID: Multi-aspect Copyright Infringement Detection for Generated Images

Chuanwei Huang, Zexi Jia, Hongyan Fei et al.

ICCV 2025poster

#21340

Text2Outfit: Controllable Outfit Generation with Multimodal Language Models

Yuanhao Zhai, Yen-Liang Lin, Minxu Peng et al.

ICCV 2025poster

#21341

Advancing Language Multi-Agent Learning with Credit Re-Assignment for Interactive Environment Generalization

Zhitao He, Zijun Liu, Peng Li et al.

COLM 2025paper

#21342

Self-Steering Language Models

Gabriel Grand, Joshua B. Tenenbaum, Vikash Mansinghka et al.

COLM 2025paper

#21343

Universal Few-shot Spatial Control for Diffusion Models

Kiet Nguyen, Chanhyuk Lee, Donggyun Kim et al.

NEURIPS 2025posterarXiv:2509.07530

#21344

DIMCIM: A Quantitative Evaluation Framework for Default-mode Diversity and Generalization in Text-to-Image Generative Models

Revant Teotia, Candace Ross, Karen Ullrich et al.

ICCV 2025posterarXiv:2506.05108

#21345

Cross-Granularity Online Optimization with Masked Compensated Information for Learned Image Compression

Haowei Kuang, Wenhan Yang, Zongming Guo et al.

ICCV 2025poster

#21346

Layer as Puzzle Pieces: Compressing Large Language Models through Layer Concatenation

Fei Wang, Li Shen, Liang Ding et al.

NEURIPS 2025posterarXiv:2510.15304

#21347

MOF-BFN: Metal-Organic Frameworks Structure Prediction via Bayesian Flow Networks

Rui Jiao, Hanlin Wu, Wenbing Huang et al.

NEURIPS 2025poster

#21348

TeEFusion: Blending Text Embeddings to Distill Classifier-Free Guidance

Minghao Fu, Guo-Hua Wang, Xiaohao Chen et al.

ICCV 2025posterarXiv:2507.18192

#21349

CompSlider: Compositional Slider for Disentangled Multiple-Attribute Image Generation

Zixin Zhu, Kevin Duarte, Mamshad Nayeem Rizve et al.

ICCV 2025posterarXiv:2509.01028

#21350

Predicting Functional Brain Connectivity with Context-Aware Deep Neural Networks

Alexander Ratzan, Sidharth Goel, Junhao Wen et al.

NEURIPS 2025poster

#21351

PLA: Prompt Learning Attack against Text-to-Image Generative Models

XINQI LYU, Yihao LIU, Yanjie Li et al.

ICCV 2025posterarXiv:2508.03696

#21352

Holistic Tokenizer for Autoregressive Image Generation

Anlin Zheng, Haochen Wang, Yucheng Zhao et al.

ICCV 2025posterarXiv:2507.02358

#21353

DanceEditor: Towards Iterative Editable Music-driven Dance Generation with Open-Vocabulary Descriptions

Hengyuan Zhang, Zhe Li, Xingqun Qi et al.

ICCV 2025posterarXiv:2508.17342

#21354

Toward Better Out-painting: Improving the Image Composition with Initialization Policy Model

Xuan Han, Yihao Zhao, Yanhao Ge et al.

ICCV 2025poster

#21355

Versatile Transition Generation with Image-to-Video Diffusion

Zuhao Yang, Jiahui Zhang, Yingchen Yu et al.

ICCV 2025posterarXiv:2508.01698

#21356

MetaMorph: Multimodal Understanding and Generation via Instruction Tuning

Shengbang Tong, David Fan, Jiachen Zhu et al.

ICCV 2025posterarXiv:2412.14164

#21357

DiffIP: Representation Fingerprints for Robust IP Protection of Diffusion Models

Zhuoling Li, Haoxuan Qu, Jason Kuen et al.

ICCV 2025poster

#21358

Processing and acquisition traces in visual encoders: What does CLIP know about your camera?

Ryan Ramos, Vladan Stojnić, Giorgos Kordopatis-Zilos et al.

ICCV 2025highlightarXiv:2508.10637

#21359

AM-Adapter: Appearance Matching Adapter for Exemplar-based Semantic Image Synthesis in-the-Wild

Siyoon Jin, Jisu Nam, Jiyoung Kim et al.

ICCV 2025poster

#21360

Diffusion Epistemic Uncertainty with Asymmetric Learning for Diffusion-Generated Image Detection

Yingsong Huang, Hui Guo, Jing Huang et al.

ICCV 2025posterarXiv:2601.14625

#21361

Draw Your Mind: Personalized Generation via Condition-Level Modeling in Text-to-Image Diffusion Models

Hyungjin Kim, Seokho Ahn, Young-Duk Seo

ICCV 2025posterarXiv:2508.03481

#21362

KINDLE: Knowledge-Guided Distillation for Prior-Free Gene Regulatory Network Inference

Rui Peng, Yuchen Lu, Qichen Sun et al.

NEURIPS 2025oralarXiv:2505.09664

#21363

V.I.P. : Iterative Online Preference Distillation for Efficient Video Diffusion Models

Jisoo Kim, Wooseok Seo, Junwan Kim et al.

ICCV 2025posterarXiv:2508.03254

#21364

X-Prompt: Generalizable Auto-Regressive Visual Learning with In-Context Prompting

Zeyi Sun, Ziyang Chu, Pan Zhang et al.

ICCV 2025poster

#21365

AnyI2V: Animating Any Conditional Image with Motion Control

Ziye Li, Xincheng Shuai, Hao Luo et al.

ICCV 2025posterarXiv:2507.02857

#21366

Diffusion Models Meet Contextual Bandits

Imad Aouali

NEURIPS 2025posterarXiv:2402.10028

#21367

Transfer Learning on Edge Connecting Probability Estimation Under Graphon Model

Yuyao Wang, Yu-Hung Cheng, Debarghya Mukherjee et al.

NEURIPS 2025posterarXiv:2510.05527

#21368

EEdit : Rethinking the Spatial and Temporal Redundancy for Efficient Image Editing

Zexuan Yan, Yue Ma, Chang Zou et al.

ICCV 2025posterarXiv:2503.10270

#21369

RAGDiffusion: Faithful Cloth Generation via External Knowledge Assimilation

Yuhan Li, Xianfeng Tan, Wenxiang Shang et al.

ICCV 2025highlightarXiv:2411.19528

#21370

Instruction-based Image Editing with Planning, Reasoning, and Generation

Liya Ji, Chenyang Qi, Qifeng Chen

ICCV 2025poster

#21371

HDR Image Generation via Gain Map Decomposed Diffusion

Yuanshen Guan, Ruikang Xu, Yinuo Liao et al.

ICCV 2025poster

#21372

ESSENTIAL: Episodic and Semantic Memory Integration for Video Class-Incremental Learning

Jongseo Lee, Kyungho Bae, Kyle Min et al.

ICCV 2025highlightarXiv:2508.10896

#21373

Accelerating Diffusion Transformer via Gradient-Optimized Cache

Junxiang Qiu, Lin Liu, Shuo Wang et al.

ICCV 2025posterarXiv:2503.05156

#21374

The Silent Assistant: NoiseQuery as Implicit Guidance for Goal-Driven Image Generation

Ruoyu Wang, Huayang Huang, Ye Zhu et al.

ICCV 2025highlightarXiv:2412.05101

#21375

Progressive Growing of Video Tokenizers for Temporally Compact Latent Spaces

Aniruddha Mahapatra, Long Mai, David Bourgin et al.

ICCV 2025posterarXiv:2501.05442

#21376

MC-Bench: A Benchmark for Multi-Context Visual Grounding in the Era of MLLMs

Yunqiu Xu, Linchao Zhu, Yi Yang

ICCV 2025posterarXiv:2410.12332

#21377

Planning and Learning in Average Risk-aware MDPs

Weikai Wang, Erick Delage

NEURIPS 2025posterarXiv:2503.17629

#21378

HyTIP: Hybrid Temporal Information Propagation for Masked Conditional Residual Video Coding

Yi-Hsin Chen, Yi-Chen Yao, Kuan-Wei Ho et al.

ICCV 2025posterarXiv:2508.02072

#21379

DACoN: DINO for Anime Paint Bucket Colorization with Any Number of Reference Images

Kazuma Nagata, Naoshi Kaneko

ICCV 2025posterarXiv:2509.14685

#21380

Vertical Federated Feature Screening

Huajun Yin, Liyuan Wang, Yingqiu Zhu et al.

NEURIPS 2025poster

#21381

Parametric Shadow Control for Portrait Generation in Text-to-Image Diffusion Models

Haoming Cai, Tsung-Wei Huang, Shiv Gehlot et al.

ICCV 2025posterarXiv:2503.21943

#21382

UniversalBooth: Model-Agnostic Personalized Text-to-Image Generation

Songhua Liu, Ruonan Yu, Xinchao Wang

ICCV 2025poster

#21383

Uni-RL: Unifying Online and Offline RL via Implicit Value Regularization

Haoran Xu, Liyuan Mao, Hui Jin et al.

NEURIPS 2025poster

#21384

Tight Bounds for Maximum Weight Matroid Independent Set and Matching in the Zero Communication Model

Ilan Doron-Arad

NEURIPS 2025poster

#21385

CoMatch: Dynamic Covisibility-Aware Transformer for Bilateral Subpixel-Level Semi-Dense Image Matching

Zizhuo Li, Yifan Lu, Linfeng Tang et al.

ICCV 2025highlightarXiv:2503.23925

#21386

Hyper-Modality Enhancement for Multimodal Sentiment Analysis with Missing Modalities

Yan Zhuang, Minhao Liu, Wei Bai et al.

NEURIPS 2025poster

#21387

LoMix: Learnable Weighted Multi-Scale Logits Mixing for Medical Image Segmentation

Md Mostafijur Rahman, Radu Marculescu

NEURIPS 2025posterarXiv:2510.22995

#21388

LOCATEdit: Graph Laplacian Optimized Cross Attention for Localized Text-Guided Image Editing

Achint Soni, Meet Soni, Sirisha Rambhatla

ICCV 2025posterarXiv:2503.21541

#21389

Style over Substance: Distilled Language Models Reason Via Stylistic Replication

Philip Lippmann, Jie Yang

COLM 2025paper

#21390

FastVAR: Linear Visual Autoregressive Modeling via Cached Token Pruning

Hang Guo, Yawei Li, Taolin Zhang et al.

ICCV 2025posterarXiv:2503.23367

#21391

Beyond Isolated Words: Diffusion Brush for Handwritten Text-Line Generation

Gang Dai, Yifan Zhang, Yutao Qin et al.

ICCV 2025posterarXiv:2508.03256

#21392

BadVideo: Stealthy Backdoor Attack against Text-to-Video Generation

Ruotong Wang, Mingli Zhu, Jiarong Ou et al.

ICCV 2025posterarXiv:2504.16907

#21393

Spectral Analysis of Representational Similarity with Limited Neurons

Hyunmo Kang, Abdulkadir Canatar, SueYeon Chung

NEURIPS 2025posterarXiv:2502.19648

#21394

SmolVLM: Redefining small and efficient multimodal models

Andrés Marafioti, Orr Zohar, Miquel Farré et al.

COLM 2025paper

#21395

Hybrid Autoencoders for Tabular Data: Leveraging Model-Based Augmentation in Low-Label Settings

Erel Naor, Ofir Lindenbaum

NEURIPS 2025posterarXiv:2511.06961

#21396

Tracing Copied Pixels and Regularizing Patch Affinity in Copy Detection

Yichen Lu, Siwei Nie, Minlong Lu et al.

ICCV 2025poster

#21397

PixTalk: Controlling Photorealistic Image Processing and Editing with Language

Marcos Conde, Zihao Lu, Radu Timofte

ICCV 2025poster

#21398

A Unified Framework for Industrial Cel-Animation Colorization with Temporal-Structural Awareness

Xiaoyi Feng, Tao Huang, Peng Wang et al.

ICCV 2025poster

#21399

T2I-Copilot: A Training-Free Multi-Agent Text-to-Image System for Enhanced Prompt Interpretation and Interactive Generation

Chieh-Yun Chen, Min Shi, Gong Zhang et al.

ICCV 2025posterarXiv:2507.20536

#21400

LayerLock: Non-collapsing Representation Learning with Progressive Freezing

Goker Erdogan, Nikhil Parthasarathy, Catalin Ionescu et al.

ICCV 2025posterarXiv:2509.10156

← Previous

1...105 106 107 108 109...112