Most Cited 2025 "analysis" Papers

22,274 papers found • Page 33 of 112

Filters:Most Cited 2025 analysis Clear all

Conference

AAAI 2025 (3,028)COLM 2025 (418)CVPR 2025 (2,873)ICCV 2025 (2,701)ICLR 2025 (3,827)ICML 2025 (3,340)ISMAR 2025 (229)NEURIPS 2025 (5,858)AAAI 2024 (2,289)CVPR 2024 (2,716)ECCV 2024 (2,387)ICLR 2024 (2,297)ICML 2024 (2,635)

Paper Type

poster (24,624)paper (8,558)oral (1,594)spotlight (1,421)highlight (975)

#6401

GCE-Pose: Global Context Enhancement for Category-level Object Pose Estimation

Weihang Li, Hongli XU, Junwen Huang et al.

CVPR 2025arXiv:2502.04293

citations

#6402

Question-Aware Gaussian Experts for Audio-Visual Question Answering

Hongyeob Kim, Inyoung Jung, Dayoon Suh et al.

CVPR 2025highlightarXiv:2503.04459

citations

#6403

RelationField: Relate Anything in Radiance Fields

Sebastian Koch, Johanna Wald, Mirco Colosi et al.

CVPR 2025arXiv:2412.13652

citations

#6404

TRCE: Towards Reliable Malicious Concept Erasure in Text-to-Image Diffusion Models

Ruidong Chen, honglin guo, Lanjun Wang et al.

ICCV 2025arXiv:2503.07389

citations

#6405

Boosting the visual interpretability of CLIP via adversarial fine-tuning

Shizhan Gong, Haoyu LEI, Qi Dou et al.

ICLR 2025

citations

#6406

LIRM: Large Inverse Rendering Model for Progressive Reconstruction of Shape, Materials and View-dependent Radiance Fields

Zhengqin Li, Dilin Wang, Ka chen et al.

CVPR 2025arXiv:2504.20026

citations

#6407

LazyMAR: Accelerating Masked Autoregressive Models via Feature Caching

Feihong Yan, qingyan wei, Jiayi Tang et al.

ICCV 2025arXiv:2503.12450

citations

#6408

Do Computer Vision Foundation Models Learn the Low-level Characteristics of the Human Visual System?

Yancheng Cai, Fei Yin, Dounia Hammou et al.

CVPR 2025highlightarXiv:2502.20256

citations

#6409

Active Task Disambiguation with LLMs

Katarzyna Kobalczyk, Nicolás Astorga, Tennison Liu et al.

ICLR 2025arXiv:2502.04485

citations

#6410

MotionAgent: Fine-grained Controllable Video Generation via Motion Field Agent

Xinyao Liao, Xianfang Zeng, Liao Wang et al.

ICCV 2025arXiv:2502.03207

citations

#6411

ChatHuman: Chatting about 3D Humans with Tools

Jing Lin, Yao Feng, Weiyang Liu et al.

CVPR 2025arXiv:2405.04533

citations

#6412

OpenRSD: Towards Open-prompts for Object Detection in Remote Sensing Images

Ziyue Huang, Yongchao Feng, Ziqi Liu et al.

ICCV 2025arXiv:2503.06146

citations

#6413

TurboFill: Adapting Few-step Text-to-image Model for Fast Image Inpainting

Liangbin Xie, Daniil Pakhomov, Zhonghao Wang et al.

CVPR 2025arXiv:2504.00996

citations

#6414

S4M: S4 for multivariate time series forecasting with Missing values

Jing Peng, Meiqi Yang, Qiong Zhang et al.

ICLR 2025oralarXiv:2503.00900

citations

#6415

MotiF: Making Text Count in Image Animation with Motion Focal Loss

Shijie Wang, Samaneh Azadi, Rohit Girdhar et al.

CVPR 2025arXiv:2412.16153

citations

#6416

Keyframe-oriented Vision Token Pruning: Enhancing Efficiency of Large Vision Language Models on Long-Form Video Processing

Yudong Liu, Jingwei Sun, Yueqian Lin et al.

ICCV 2025arXiv:2503.10742

citations

#6417

Towards Understanding the Robustness of Diffusion-Based Purification: A Stochastic Perspective

Yiming Liu, Kezhao Liu, Yao Xiao et al.

ICLR 2025arXiv:2404.14309

citations

#6418

AerialVG: A Challenging Benchmark for Aerial Visual Grounding by Exploring Positional Relations

Junli Liu, Qizhi Chen, Zhigang Wang et al.

ICCV 2025arXiv:2504.07836

citations

#6419

EvEnhancer: Empowering Effectiveness, Efficiency and Generalizability for Continuous Space-Time Video Super-Resolution with Events

Shuoyan Wei, Feng Li, Shengeng Tang et al.

CVPR 2025highlightarXiv:2505.04657

citations

#6420

CWNet: Causal Wavelet Network for Low-Light Image Enhancement

Tongshun Zhang, Pingping Liu, Yubing Lu et al.

ICCV 2025arXiv:2507.10689

citations

#6421

Rethinking the Embodied Gap in Vision-and-Language Navigation: A Holistic Study of Physical and Visual Disparities

Liuyi Wang, Xinyuan Xia, Hui Zhao et al.

ICCV 2025arXiv:2507.13019

citations

#6422

Understanding and Mitigating Bottlenecks of State Space Models through the Lens of Recency and Over-smoothing

Peihao Wang, Ruisi Cai, Yuehao Wang et al.

ICLR 2025arXiv:2501.00658

citations

#6423

GOAL: Global-local Object Alignment Learning

Hyungyu Choi, Young Kyun Jang, Chanho Eom

CVPR 2025arXiv:2503.17782

citations

#6424

GeoLoRA: Geometric integration for parameter efficient fine-tuning

Steffen Schotthöfer, Emanuele Zangrando, Gianluca Ceruti et al.

ICLR 2025arXiv:2410.18720

citations

#6425

PADRe: A Unifying Polynomial Attention Drop-in Replacement for Efficient Vision Transformer

Pierre-David Letourneau, Manish Singh, Hsin-Pai Cheng et al.

ICLR 2025arXiv:2407.11306

citations

#6426

Reconstructing Humans with a Biomechanically Accurate Skeleton

Yan Xia, Xiaowei Zhou, Etienne Vouga et al.

CVPR 2025arXiv:2503.21751

citations

#6427

UNOPose: Unseen Object Pose Estimation with an Unposed RGB-D Reference Image

Xingyu Liu, Gu Wang, Ruida Zhang et al.

CVPR 2025arXiv:2411.16106

citations

#6428

Reference-Based 3D-Aware Image Editing with Triplanes

Bahri Batuhan Bilecen, Yiğit Yalın, Ning Yu et al.

CVPR 2025highlightarXiv:2404.03632

citations

#6429

LookCloser: Frequency-aware Radiance Field for Tiny-Detail Scene

Xiaoyu Zhang, Weihong Pan, Chong Bao et al.

CVPR 2025arXiv:2503.18513

citations

#6430

FlexGen: Flexible Multi-View Generation from Text and Image Inputs

Xinli Xu, Wenhang Ge, Jiantao Lin et al.

ICCV 2025arXiv:2410.10745

citations

#6431

Distributed Speculative Inference (DSI): Speculation Parallelism for Provably Faster Lossless Language Model Inference

Nadav Timor, Jonathan Mamou, Daniel Korat et al.

ICLR 2025arXiv:2405.14105

citations

#6432

HarmonySet: A Comprehensive Dataset for Understanding Video-Music Semantic Alignment and Temporal Synchronization

Zitang Zhou, Ke Mei, Yu Lu et al.

CVPR 2025arXiv:2503.01725

citations

#6433

Gradient-Guided Annealing for Domain Generalization

Aristotelis Ballas, Christos Diou

CVPR 2025highlightarXiv:2502.20162

citations

#6434

SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization

Hongrui Jia, Chaoya Jiang, Haiyang Xu et al.

CVPR 2025arXiv:2411.11909

citations

#6435

APHQ-ViT: Post-Training Quantization with Average Perturbation Hessian Based Reconstruction for Vision Transformers

Zhuguanyu Wu, Jiayi Zhang, Jiaxin Chen et al.

CVPR 2025arXiv:2504.02508

citations

#6436

Co-op: Correspondence-based Novel Object Pose Estimation

Sungphill Moon, Hyeontae Son, Dongcheol Hur et al.

CVPR 2025arXiv:2503.17731

citations

#6437

TruthPrInt: Mitigating Large Vision-Language Models Object Hallucination Via Latent Truthful-Guided Pre-Intervention

Jinhao Duan, Fei Kong, Hao Cheng et al.

ICCV 2025

citations

#6438

RoBridge: A Hierarchical Architecture Bridging Cognition and Execution for General Robotic Manipulation

Kaidong Zhang, Rongtao Xu, Ren Pengzhen et al.

ICCV 2025arXiv:2505.01709

citations

#6439

RigGS: Rigging of 3D Gaussians for Modeling Articulated Objects in Videos

Yuxin Yao, Zhi Deng, Junhui Hou

CVPR 2025arXiv:2503.16822

citations

#6440

VLsI: Verbalized Layers-to-Interactions from Large to Small Vision Language Models

Byung-Kwan Lee, Ryo Hachiuma, Yu-Chiang Frank Wang et al.

CVPR 2025arXiv:2412.01822

citations

#6441

Can Generative Video Models Help Pose Estimation?

Ruojin Cai, Jason Y. Zhang, Philipp Henzler et al.

CVPR 2025highlightarXiv:2412.16155

citations

#6442

RealEdit: Reddit Edits As a Large-scale Empirical Dataset for Image Transformations

Peter Sushko, Ayana Bharadwaj, Zhi Yang Lim et al.

CVPR 2025arXiv:2502.03629

citations

#6443

Causal Representation Learning from Multimodal Biomedical Observations

Yuewen Sun, Lingjing Kong, Guangyi Chen et al.

ICLR 2025arXiv:2411.06518

citations

#6444

Robustness Auditing for Linear Regression: To Singularity and Beyond

Ittai Rubinstein, Samuel Hopkins

ICLR 2025arXiv:2410.07916

citations

#6445

Integrative Decoding: Improving Factuality via Implicit Self-consistency

Yi Cheng, Xiao Liang, Yeyun Gong et al.

ICLR 2025arXiv:2410.01556

citations

#6446

FOCUS: Knowledge-enhanced Adaptive Visual Compression for Few-shot Whole Slide Image Classification

Zhengrui Guo, Conghao Xiong, Jiabo MA et al.

CVPR 2025arXiv:2411.14743

citations

#6447

Vision-Language Models Can't See the Obvious

YASSER ABDELAZIZ DAHOU DJILALI, Ngoc Huynh, Phúc Lê Khắc et al.

ICCV 2025arXiv:2507.04741

citations

#6448

RI3D: Few-Shot Gaussian Splatting With Repair and Inpainting Diffusion Priors

Avinash Paliwal, xilong zhou, Wei Ye et al.

ICCV 2025arXiv:2503.10860

citations

#6449

Beyond One Shot, Beyond One Perspective: Cross-View and Long-Horizon Distillation for Better LiDAR Representations

Xiang Xu, Lingdong Kong, Song Wang et al.

ICCV 2025arXiv:2507.05260

citations

#6450

VideoRFSplat: Direct Scene-Level Text-to-3D Gaussian Splatting Generation with Flexible Pose and Multi-View Joint Modeling

Hyojun Go, Byeongjun Park, Hyelin Nam et al.

ICCV 2025arXiv:2503.15855

citations

#6451

FlowR: Flowing from Sparse to Dense 3D Reconstructions

Tobias Fischer, Samuel Rota Bulò, Yung-Hsu Yang et al.

ICCV 2025highlightarXiv:2504.01647

citations

#6452

ODDR: Outlier Detection & Dimension Reduction Based Defense Against Adversarial Patches

Nandish Chattopadhyay, Amira Guesmi, Muhammad Abdullah Hanif et al.

ICCV 2025arXiv:2311.12084

citations

#6453

Cycle Consistency as Reward: Learning Image-Text Alignment without Human Preferences

Hyojin Bahng, Caroline Chan, Fredo Durand et al.

ICCV 2025arXiv:2506.02095

citations

#6454

HERMES: temporal-coHERent long-forM understanding with Episodes and Semantics

Gueter Josmy Faure, Jia-Fong Yeh, Min-Hung Chen et al.

ICCV 2025arXiv:2408.17443

citations

#6455

DASH: Detection and Assessment of Systematic Hallucinations of VLMs

Maximilian Augustin, Yannic Neuhaus, Matthias Hein

ICCV 2025arXiv:2503.23573

citations

#6456

Mesh-RFT: Enhancing Mesh Generation via Fine-grained Reinforcement Fine-Tuning

Jian Liu, Jing Xu, Song Guo et al.

NEURIPS 2025spotlightarXiv:2505.16761

citations

#6457

DataRater: Meta-Learned Dataset Curation

Dan Andrei Calian, Greg Farquhar, Iurii Kemaev et al.

NEURIPS 2025arXiv:2505.17895

citations

#6458

Hyperbolic Dataset Distillation

Wenyuan Li, Guang Li, Keisuke Maeda et al.

NEURIPS 2025arXiv:2505.24623

citations

#6459

Noise Injection Reveals Hidden Capabilities of Sandbagging Language Models

Cameron Tice, Philipp Kreer, Nathan Helm-Burger et al.

NEURIPS 2025arXiv:2412.01784

citations

#6460

On the Relation between Rectified Flows and Optimal Transport

Johannes Hertrich, Antonin Chambolle, Julie Delon

NEURIPS 2025arXiv:2505.19712

citations

#6461

MLRC-Bench: Can Language Agents Solve Machine Learning Research Challenges?

Yunxiang Zhang, Muhammad Khalifa, Shitanshu Bhushan et al.

NEURIPS 2025arXiv:2504.09702

citations

#6462

T-SHIRT: Token-Selective Hierarchical Data Selection for Instruction Tuning

Yanjun Fu, Faisal Hamman, Sanghamitra Dutta

NEURIPS 2025arXiv:2506.01317

citations

#6463

Dynam3D: Dynamic Layered 3D Tokens Empower VLM for Vision-and-Language Navigation

Zihan Wang, Seungjun Lee, Gim Hee Lee

NEURIPS 2025oralarXiv:2505.11383

citations

#6464

LayerCraft: Enhancing Text-to-Image Generation with CoT Reasoning and Layered Object Integration

Yuyao Zhang, Jinghao Li, Yu-Wing Tai

NEURIPS 2025arXiv:2504.00010

citations

#6465

Compiler-R1: Towards Agentic Compiler Auto-tuning with Reinforcement Learning

Haolin Pan, Hongyu Lin, Haoran Luo et al.

NEURIPS 2025arXiv:2506.15701

citations

#6466

On Extending Direct Preference Optimization to Accommodate Ties

Jinghong Chen, Guangyu Yang, Weizhe Lin et al.

NEURIPS 2025arXiv:2409.17431

citations

#6467

Generalized Gradient Norm Clipping & Non-Euclidean $(L_0,L_1)$-Smoothness

Thomas Pethick, Wanyun Xie, Mete Erdogan et al.

NEURIPS 2025oralarXiv:2506.01913

citations

#6468

Straight-Line Diffusion Model for Efficient 3D Molecular Generation

Yuyan Ni, Shikun Feng, Haohan Chi et al.

NEURIPS 2025arXiv:2503.02918

citations

#6469

Multimodal LLMs as Customized Reward Models for Text-to-Image Generation

Shijie Zhou, Ruiyi Zhang, Huaisheng Zhu et al.

ICCV 2025arXiv:2507.21391

citations

#6470

Space Group Equivariant Crystal Diffusion

Rees Chang, Angela Pak, Alex Guerra et al.

NEURIPS 2025arXiv:2505.10994

citations

#6471

Latent Policy Barrier: Learning Robust Visuomotor Policies by Staying In-Distribution

Zhanyi Sun, Shuran Song

NEURIPS 2025spotlightarXiv:2508.05941

citations

#6472

Accelerating Diffusion Transformer via Gradient-Optimized Cache

Junxiang Qiu, Lin Liu, Shuo Wang et al.

ICCV 2025arXiv:2503.05156

citations

#6473

Importance-Based Token Merging for Efficient Image and Video Generation

Haoyu Wu, Jingyi Xu, Hieu Le et al.

ICCV 2025arXiv:2411.16720

citations

#6474

Universal generalization guarantees for Wasserstein distributionally robust models

Tam Le, Jerome Malick

ICLR 2025arXiv:2402.11981

citations

#6475

TempSamp-R1: Effective Temporal Sampling with Reinforcement Fine-Tuning for Video LLMs

Yunheng Li, Jing Cheng, Shaoyong Jia et al.

NEURIPS 2025oralarXiv:2509.18056

citations

#6476

LittleBit: Ultra Low-Bit Quantization via Latent Factorization

Banseok Lee, Dongkyu Kim, Youngcheon You et al.

NEURIPS 2025arXiv:2506.13771

citations

#6477

ObjectGS: Object-aware Scene Reconstruction and Scene Understanding via Gaussian Splatting

Ruijie Zhu, Mulin Yu, Linning Xu et al.

ICCV 2025arXiv:2507.15454

citations

#6478

MLE-Dojo: Interactive Environments for Empowering LLM Agents in Machine Learning Engineering

Rushi Qiang, Yuchen Zhuang, Yinghao Li et al.

NEURIPS 2025arXiv:2505.07782

citations

#6479

Dense SAE Latents Are Features, Not Bugs

Xiaoqing Sun, Alessandro Stolfo, Joshua Engels et al.

NEURIPS 2025arXiv:2506.15679

citations

#6480

HAIF-GS: Hierarchical and Induced Flow-Guided Gaussian Splatting for Dynamic Scene

Jianing Chen, Zehao Li, Yujun Cai et al.

NEURIPS 2025oralarXiv:2506.09518

citations

#6481

Domain-RAG: Retrieval-Guided Compositional Image Generation for Cross-Domain Few-Shot Object Detection

Yu Li, Xingyu Qiu, Yuqian Fu et al.

NEURIPS 2025arXiv:2506.05872

citations

#6482

NOVA: A Benchmark for Rare Anomaly Localization and Clinical Reasoning in Brain MRI

Cosmin Bercea, Jun Li, Philipp Raffler et al.

NEURIPS 2025oral

citations

#6483

Is PRM Necessary? Problem-Solving RL Implicitly Induces PRM Capability in LLMs

Zhangyin Feng, Qianglong Chen, Ning Lu et al.

NEURIPS 2025arXiv:2505.11227

citations

#6484

Dynamic Typography: Bringing Text to Life via Video Diffusion Prior

Zichen Liu, Yihao Meng, Hao Ouyang et al.

ICCV 2025arXiv:2404.11614

citations

#6485

BOOM: Benchmarking Out-Of-distribution Molecular Property Predictions of Machine Learning Models

Evan Antoniuk, Shehtab Zaman, Tal Ben-Nun et al.

NEURIPS 2025arXiv:2505.01912

citations

#6486

Wicked Oddities: Selectively Poisoning for Effective Clean-Label Backdoor Attacks

Hung Quang Nguyen, Hieu Nguyen, Anh Ta et al.

ICLR 2025arXiv:2407.10825

citations

#6487

CLIP Under the Microscope: A Fine-Grained Analysis of Multi-Object Representation

Reza Abbasi, Ali Nazari, Aminreza Sefid et al.

CVPR 2025arXiv:2502.19842

citations

#6488

Model Reveals What to Cache: Profiling-Based Feature Reuse for Video Diffusion Models

Xuran Ma, Yexin Liu, Yaofu LIU et al.

ICCV 2025arXiv:2504.03140

citations

#6489

RGB-Event ISP: The Dataset and Benchmark

Yunfan LU, Yanlin Qian, Ziyang Rao et al.

ICLR 2025arXiv:2501.19129

citations

#6490

Cross-modal Ship Re-Identification via Optical and SAR Imagery: A Novel Dataset and Method

Han Wang, Shengyang Li, Jian Yang et al.

ICCV 2025arXiv:2506.22027

citations

#6491

mmCooper: A Multi-agent Multi-stage Communication-efficient and Collaboration-robust Cooperative Perception Framework

Bingyi Liu, Jian Teng, Hongfei Xue et al.

ICCV 2025arXiv:2501.12263

citations

#6492

CompCap: Improving Multimodal Large Language Models with Composite Captions

Xiaohui Chen, Satya Narayan Shukla, Mahmoud Azab et al.

ICCV 2025arXiv:2412.05243

citations

#6493

TurboReg: TurboClique for Robust and Efficient Point Cloud Registration

Shaocheng Yan, Pengcheng Shi, Zhenjun Zhao et al.

ICCV 2025arXiv:2507.01439

citations

#6494

SpatialSplat: Efficient Semantic 3D from Sparse Unposed Images

Yu Sheng, Jiajun Deng, Xinran Zhang et al.

ICCV 2025arXiv:2505.23044

citations

#6495

Seg4Diff: Unveiling Open-Vocabulary Semantic Segmentation in Text-to-Image Diffusion Transformers

Chaehyun Kim, Heeseong Shin, Eunbeen Hong et al.

NEURIPS 2025

citations

#6496

PhysDrive: A Multimodal Remote Physiological Measurement Dataset for In-vehicle Driver Monitoring

Wang, Xiao Yang, Qingyong Hu et al.

NEURIPS 2025arXiv:2507.19172

citations

#6497

A Statistical Framework for Ranking LLM-based Chatbots

Siavash Ameli, Siyuan Zhuang, Ion Stoica et al.

ICLR 2025arXiv:2412.18407

citations

#6498

IGL-Bench: Establishing the Comprehensive Benchmark for Imbalanced Graph Learning

Jiawen Qin, Haonan Yuan, Qingyun Sun et al.

ICLR 2025arXiv:2406.09870

citations

#6499

RadarSplat: Radar Gaussian Splatting for High-Fidelity Data Synthesis and 3D Reconstruction of Autonomous Driving Scenes

Pou-Chun Kung, Skanda Harisha, Ram Vasudevan et al.

ICCV 2025arXiv:2506.01379

citations

#6500

DyMU: Dynamic Merging and Virtual Unmerging for Efficient Variable-Length VLMs

Zhenhailong Wang, Senthil Purushwalkam, Caiming Xiong et al.

NEURIPS 2025

citations

#6501

Uncertain Multimodal Intention and Emotion Understanding in the Wild

Qu Yang, QingHongYa Shi, Tongxin Wang et al.

CVPR 2025

citations

#6502

ConStellaration: A dataset of QI-like stellarator plasma boundaries and optimization benchmarks

Santiago Cadena, Andrea Merlo, Emanuel Laude et al.

NEURIPS 2025arXiv:2506.19583

citations

#6503

Not-So-Optimal Transport Flows for 3D Point Cloud Generation

Ka-Hei Hui, Chao Liu, xiaohui zeng et al.

ICLR 2025arXiv:2502.12456

citations

#6504

EchoWorld: Learning Motion-Aware World Models for Echocardiography Probe Guidance

Yang Yue, Yulin Wang, Haojun Jiang et al.

CVPR 2025arXiv:2504.13065

citations

#6505

CamFreeDiff: Camera-free Image to Panorama Generation with Diffusion Model

Xiaoding Yuan, Shitao Tang, Kejie Li et al.

CVPR 2025arXiv:2407.07174

citations

#6506

Multitwine: Multi-Object Compositing with Text and Layout Control

Gemma Canet Tarrés, Zhe Lin, Zhifei Zhang et al.

CVPR 2025highlightarXiv:2502.05165

citations

#6507

VELOCITI: Benchmarking Video-Language Compositional Reasoning with Strict Entailment

Darshana Saravanan, Varun Gupta, Darshan Singh S et al.

CVPR 2025arXiv:2406.10889

citations

#6508

Reinforcement learning with combinatorial actions for coupled restless bandits

Lily Xu, Bryan Wilder, Elias Khalil et al.

ICLR 2025arXiv:2503.01919

citations

#6509

Diffusion Bridge AutoEncoders for Unsupervised Representation Learning

Yeongmin Kim, Kwanghyeon Lee, Minsang Park et al.

ICLR 2025arXiv:2405.17111

citations

#6510

Reconciling Model Multiplicity for Downstream Decision Making

Ally Du, Dung Daniel Ngo, Steven Wu

ICLR 2025arXiv:2405.19667

citations

#6511

Overcoming Sparsity Artifacts in Crosscoders to Interpret Chat-Tuning

Julian Minder, Clément Dumas, Caden Juang et al.

NEURIPS 2025arXiv:2504.02922

citations

#6512

Lay2Story: Extending Diffusion Transformers for Layout-Togglable Story Generation

Ao Ma, Jiasong Feng, Ke Cao et al.

ICCV 2025arXiv:2508.08949

citations

#6513

Student-Informed Teacher Training

Nico Messikommer, Jiaxu Xing, Elie Aljalbout et al.

ICLR 2025arXiv:2412.09149

citations

#6514

FaceShield: Defending Facial Image against Deepfake Threats

Jaehwan Jeong, Sumin In, Sieun Kim et al.

ICCV 2025arXiv:2412.09921

citations

#6515

Improving Gaussian Splatting with Localized Points Management

Haosen Yang, Chenhao Zhang, Wenqing Wang et al.

CVPR 2025highlightarXiv:2406.04251

citations

#6516

ReNeg: Learning Negative Embedding with Reward Guidance

Xiaomin Li, yixuan liu, Takashi Isobe et al.

CVPR 2025highlightarXiv:2412.19637

citations

#6517

DWIM: Towards Tool-aware Visual Reasoning via Discrepancy-aware Workflow Generation & Instruct-Masking Tuning

Fucai Ke, Vijay Kumar b g, Xingjian Leng et al.

ICCV 2025arXiv:2503.19263

citations

#6518

Time Travel is Cheating: Going Live with DeepFund for Real-Time Fund Investment Benchmarking

Changlun Li, Yao SHI, Chen Wang et al.

NEURIPS 2025arXiv:2505.11065

citations

#6519

GenPO: Generative Diffusion Models Meet On-Policy Reinforcement Learning

Shutong Ding, Ke Hu, Shan Zhong et al.

NEURIPS 2025arXiv:2505.18763

citations

#6520

Rotation-Equivariant Self-Supervised Method in Image Denoising

Hanze Liu, Jiahong Fu, Qi Xie et al.

CVPR 2025arXiv:2505.19618

citations

#6521

On Speeding Up Language Model Evaluation

Jin Zhou, Christian Belardi, Ruihan Wu et al.

ICLR 2025arXiv:2407.06172

citations

#6522

Understanding Representation Dynamics of Diffusion Models via Low-Dimensional Modeling

Xiao Li, Zekai Zhang, Xiang Li et al.

NEURIPS 2025arXiv:2502.05743

citations

#6523

Motion Modes: What Could Happen Next?

Karran Pandey, Yannick Hold-Geoffroy, Matheus Gadelha et al.

CVPR 2025arXiv:2412.00148

citations

#6524

Learning from Streaming Video with Orthogonal Gradients

Tengda Han, Dilara Gokay, Joseph Heyward et al.

CVPR 2025arXiv:2504.01961

citations

#6525

``Principal Components" Enable A New Language of Images

Xin Wen, Bingchen Zhao, Ismail Elezi et al.

ICCV 2025

citations

#6526

CanonSwap: High-Fidelity and Consistent Video Face Swapping via Canonical Space Modulation

Xiangyang Luo, Ye Zhu, Yunfei Liu et al.

ICCV 2025arXiv:2507.02691

citations

#6527

DreamFuse: Adaptive Image Fusion with Diffusion Transformer

Junjia Huang, Pengxiang Yan, Jiyang Liu et al.

ICCV 2025arXiv:2504.08291

citations

#6528

AURELIA: Test-time Reasoning Distillation in Audio-Visual LLMs

Sanjoy Chowdhury, Hanan Gani, Nishit Anand et al.

ICCV 2025arXiv:2503.23219

citations

#6529

Divergence-enhanced Knowledge-guided Context Optimization for Visual-Language Prompt Tuning

Yilun Li, Miaomiao Cheng, Xu Han et al.

ICLR 2025

citations

#6530

Robust 3D Shape Reconstruction in Zero-Shot from a Single Image in the Wild

Junhyeong Cho, Kim Youwang, Hunmin Yang et al.

CVPR 2025arXiv:2403.14539

citations

#6531

Driving View Synthesis on Free-form Trajectories with Generative Prior

Zeyu Yang, Zijie Pan, Yuankun Yang et al.

ICCV 2025arXiv:2412.01717

citations

#6532

Bayesian Experimental Design Via Contrastive Diffusions

Jacopo Iollo, Christophe Heinkelé, Pierre Alliez et al.

ICLR 2025arXiv:2410.11826

citations

#6533

Adapt-$\infty$: Scalable Continual Multimodal Instruction Tuning via Dynamic Data Selection

Adyasha Maharana, Jaehong Yoon, Tianlong Chen et al.

ICLR 2025arXiv:2410.10636

citations

#6534

Secure and Confidential Certificates of Online Fairness

Olive Franzese, Ali Shahin Shamsabadi, Carter Luck et al.

NEURIPS 2025arXiv:2410.02777

citations

#6535

CDI: Copyrighted Data Identification in Diffusion Models

Jan Dubiński, Antoni Kowalczuk, Franziska Boenisch et al.

CVPR 2025arXiv:2411.12858

citations

#6536

Learning a Neural Solver for Parametric PDEs to Enhance Physics-Informed Methods

Lise Le Boudec, Emmanuel de Bézenac, Louis Serrano et al.

ICLR 2025arXiv:2410.06820

citations

#6537

GIViC: Generative Implicit Video Compression

Ge Gao, Siyue Teng, Tianhao Peng et al.

ICCV 2025arXiv:2503.19604

citations

#6538

ZeroStereo: Zero-shot Stereo Matching from Single Images

Xianqi Wang, Hao Yang, Gangwei Xu et al.

ICCV 2025arXiv:2501.08654

citations

#6539

CATSplat: Context-Aware Transformer with Spatial Guidance for Generalizable 3D Gaussian Splatting from A Single-View Image

Wonseok Roh, Hwanhee Jung, JongWook Kim et al.

ICCV 2025arXiv:2412.12906

citations

#6540

Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval

Davide Caffagni, Sara Sarto, Marcella Cornia et al.

CVPR 2025arXiv:2503.01980

citations

#6541

Large Self-Supervised Models Bridge the Gap in Domain Adaptive Object Detection

Marc-Antoine Lavoie, Anas Mahmoud, Steven L. Waslander

CVPR 2025arXiv:2503.23220

citations

#6542

RAGDiffusion: Faithful Cloth Generation via External Knowledge Assimilation

Yuhan Li, Xianfeng Tan, Wenxiang Shang et al.

ICCV 2025highlightarXiv:2411.19528

citations

#6543

Decoupling Layout from Glyph in Online Chinese Handwriting Generation

Minsi Ren, Yan-Ming Zhang, yi chen

ICLR 2025arXiv:2410.02309

citations

#6544

DualPM: Dual Posed-Canonical Point Maps for 3D Shape and Pose Reconstruction

Ben Kaye, Tomas Jakab, Shangzhe Wu et al.

CVPR 2025highlightarXiv:2412.04464

citations

#6545

Prompt-CAM: Making Vision Transformers Interpretable for Fine-Grained Analysis

Arpita Chowdhury, Dipanjyoti Paul, Zheda Mai et al.

CVPR 2025arXiv:2501.09333

citations

#6546

Efficient Prompt Compression with Evaluator Heads for Long-Context Transformer Inference

Weizhi Fei, Xueyan Niu, XIE GUOQING et al.

NEURIPS 2025spotlightarXiv:2501.12959

citations

#6547

3D-MolT5: Leveraging Discrete Structural Information for Molecule-Text Modeling

Qizhi Pei, Rui Yan, Kaiyuan Gao et al.

ICLR 2025arXiv:2406.05797

citations

#6548

Sparfels: Fast Reconstruction from Sparse Unposed Imagery

Shubhendu Jena, Amine Ouasfi, Mae Younes et al.

ICCV 2025highlightarXiv:2505.02178

citations

#6549

Enhancing Reward Models for High-quality Image Generation: Beyond Text-Image Alignment

ying ba, Tianyu Zhang, Yalong Bai et al.

ICCV 2025arXiv:2507.19002

citations

#6550

Curly Flow Matching for Learning Non-gradient Field Dynamics

Katarina Petrović, Lazar Atanackovic, Viggo Moro et al.

NEURIPS 2025arXiv:2510.26645

citations

#6551

LASeR: Learning to Adaptively Select Reward Models with Multi-Arm Bandits

Duy Nguyen, Archiki Prasad, Elias Stengel-Eskin et al.

NEURIPS 2025arXiv:2410.01735

citations

#6552

Diffuman4D: 4D Consistent Human View Synthesis from Sparse-View Videos with Spatio-Temporal Diffusion Models

Yudong Jin, Sida Peng, Xuan Wang et al.

ICCV 2025arXiv:2507.13344

citations

#6553

Dense-SfM: Structure from Motion with Dense Consistent Matching

JongMin Lee, Sungjoo Yoo

CVPR 2025arXiv:2501.14277

citations

#6554

Efficient Source-Free Time-Series Adaptation via Parameter Subspace Disentanglement

Gaurav Patel, Christopher M. Sandino, Behrooz Mahasseni et al.

ICLR 2025arXiv:2410.02147

citations

#6555

Dual-Process Image Generation

Grace Luo, Jonathan Granskog, Aleksander Holynski et al.

ICCV 2025arXiv:2506.01955

citations

#6556

Multilevel Generative Samplers for Investigating Critical Phenomena

Ankur Singha, Elia Cellini, Kim A. Nicoli et al.

ICLR 2025arXiv:2503.08918

citations

#6557

E(3)-equivariant models cannot learn chirality: Field-based molecular generation

Alexandru Dumitrescu, Dani Korpela, Markus Heinonen et al.

ICLR 2025arXiv:2402.15864

citations

#6558

Execution Guided Line-by-Line Code Generation

Boaz Lavon, Shahar Katz, Lior Wolf

NEURIPS 2025arXiv:2506.10948

citations

#6559

On the Loss of Context Awareness in General Instruction Fine-tuning

Yihan Wang, Andrew Bai, Nanyun Peng et al.

NEURIPS 2025arXiv:2411.02688

citations

#6560

KAC: Kolmogorov-Arnold Classifier for Continual Learning

Yusong Hu, Zichen Liang, Fei Yang et al.

CVPR 2025highlightarXiv:2503.21076

citations

#6561

Dynamical Low-Rank Compression of Neural Networks with Robustness under Adversarial Attacks

Steffen Schotthöfer, Lexie Yang, Stefan Schnake

NEURIPS 2025oralarXiv:2505.08022

citations

#6562

Deep Continuous-Time State-Space Models for Marked Event Sequences

Yuxin Chang, Alex Boyd, Cao (Danica) Xiao et al.

NEURIPS 2025oralarXiv:2412.19634

citations

#6563

The Curse of Conditions: Analyzing and Improving Optimal Transport for Conditional Flow-Based Generation

Ho Kei Cheng, Alex Schwing

ICCV 2025arXiv:2503.10636

citations

#6564

SparseAlign: a Fully Sparse Framework for Cooperative Object Detection

Yunshuang Yuan, Yan Xia, Daniel Cremers et al.

CVPR 2025arXiv:2503.12982

citations

#6565

EVOS: Efficient Implicit Neural Training via EVOlutionary Selector

Weixiang Zhang, Shuzhao Xie, Chengwei Ren et al.

CVPR 2025arXiv:2412.10153

citations

#6566

Vgent: Graph-based Retrieval-Reasoning-Augmented Generation For Long Video Understanding

Xiaoqian Shen, Wenxuan Zhang, Jun Chen et al.

NEURIPS 2025oralarXiv:2510.14032

citations

#6567

Towards In-the-wild 3D Plane Reconstruction from a Single Image

Jiachen Liu, Rui Yu, Sili Chen et al.

CVPR 2025highlightarXiv:2506.02493

citations

#6568

Augmented Mass-Spring Model for Real-Time Dense Hair Simulation

Jorge Herrera, Yi Zhou, Xin Sun et al.

ICCV 2025arXiv:2412.17144

citations

#6569

On the Expressive Power of Sparse Geometric MPNNs

Yonatan Sverdlov, Nadav Dym

ICLR 2025arXiv:2407.02025

citations

#6570

Cross-Modal Interactive Perception Network with Mamba for Lung Tumor Segmentation in PET-CT Images

Jie Mei, Chenyu Lin, Yu Qiu et al.

CVPR 2025arXiv:2503.17261

citations

#6571

IGD: Token Decisiveness Modeling via Information Gain in LLMs for Personalized Recommendation

Zijie Lin, Yang Zhang, Xiaoyan Zhao et al.

NEURIPS 2025arXiv:2506.13229

citations

#6572

OpenVision: A Fully-Open, Cost-Effective Family of Advanced Vision Encoders for Multimodal Learning

Xianhang Li, Yanqing Liu, Haoqin Tu et al.

ICCV 2025arXiv:2505.04601

citations

#6573

Tight Time Complexities in Parallel Stochastic Optimization with Arbitrary Computation Dynamics

Alexander Tyurin

ICLR 2025arXiv:2408.04929

citations

#6574

Probabilistic Learning to Defer: Handling Missing Expert Annotations and Controlling Workload Distribution

Cuong Nguyen, Thanh-Toan Do, Gustavo Carneiro

ICLR 2025

citations

#6575

Control-oriented Clustering of Visual Latent Representation

Han Qi, Haocheng Yin, Heng Yang

ICLR 2025arXiv:2410.05063

citations

#6576

Open-Vocabulary Octree-Graph for 3D Scene Understanding

Zhigang Wang, Yifei Su, Chenhui Li et al.

ICCV 2025arXiv:2411.16253

citations

#6577

Towards Realistic Example-based Modeling via 3D Gaussian Stitching

Xinyu Gao, Ziyi Yang, Bingchen Gong et al.

CVPR 2025arXiv:2408.15708

citations

#6578

Towards Omnimodal Expressions and Reasoning in Referring Audio-Visual Segmentation

Kaining Ying, Henghui Ding, Guangquan Jie et al.

ICCV 2025arXiv:2507.22886

citations

#6579

VasTSD: Learning 3D Vascular Tree-state Space Diffusion Model for Angiography Synthesis

Zhifeng Wang, Renjiao Yi, Xin Wen et al.

CVPR 2025arXiv:2503.12758

citations

#6580

Ensembles of Low-Rank Expert Adapters

Yinghao Li, Vianne Gao, Chao Zhang et al.

ICLR 2025arXiv:2502.00089

citations

#6581

DropletVideo: A Dataset and Approach to Explore Integral Spatio-Temporal Consistent Video Generation

Runze Zhang, Guoguang Du, Xiaochuan Li et al.

ICCV 2025highlightarXiv:2503.06053

citations

#6582

Mitigating Hallucination Through Theory-Consistent Symmetric Multimodal Preference Optimization

Wenqi Liu, Xuemeng Song, Jiaxi Li et al.

NEURIPS 2025arXiv:2506.11712

citations

#6583

Weakly Supervised Temporal Action Localization via Dual-Prior Collaborative Learning Guided by Multimodal Large Language Models

Quan Zhang, Jinwei Fang, Rui Yuan et al.

CVPR 2025arXiv:2411.08466

citations

#6584

On Denoising Walking Videos for Gait Recognition

Dongyang Jin, Chao Fan, Jingzhe Ma et al.

CVPR 2025arXiv:2505.18582

citations

#6585

Think-RM: Enabling Long-Horizon Reasoning in Generative Reward Models

Ilgee Hong, Changlong Yu, Liang Qiu et al.

NEURIPS 2025arXiv:2505.16265

citations

#6586

Thought Communication in Multiagent Collaboration

Yujia Zheng, Zhuokai Zhao, Zijian Li et al.

NEURIPS 2025spotlightarXiv:2510.20733

citations

#6587

TSAM: Temporal SAM Augmented with Multimodal Prompts for Referring Audio-Visual Segmentation

Abduljalil Radman, Jorma Laaksonen

CVPR 2025

citations

#6588

3D Occupancy Prediction with Low-Resolution Queries via Prototype-aware View Transformation

Gyeongrok Oh, Sung June Kim, Heeju Ko et al.

CVPR 2025arXiv:2503.15185

citations

#6589

Where, What, Why: Towards Explainable Driver Attention Prediction

Yuchen Zhou, Jiayu Tang, Xiaoyan Xiao et al.

ICCV 2025highlightarXiv:2506.23088

citations

#6590

Imputation-free and Alignment-free: Incomplete Multi-view Clustering Driven by Consensus Semantic Learning

yuzhuo dai, Jiaqi Jin, Zhibin Dong et al.

CVPR 2025arXiv:2505.11182

citations

#6591

Ineq-Comp: Benchmarking Human-Intuitive Compositional Reasoning in Automated Theorem Proving of Inequalities

Haoyu Zhao, Yihan Geng, Shange Tang et al.

NEURIPS 2025arXiv:2505.12680

citations

#6592

Hyperbolic Safety-Aware Vision-Language Models

Tobia Poppi, Tejaswi Kasarla, Pascal Mettes et al.

CVPR 2025highlightarXiv:2503.12127

citations

#6593

SQLens: An End-to-End Framework for Error Detection and Correction in Text-to-SQL

Yue Gong, Chuan Lei, Xiao Qin et al.

NEURIPS 2025arXiv:2506.04494

citations

#6594

KLASS: KL-Guided Fast Inference in Masked Diffusion Models

Seo Hyun Kim, Sunwoo Hong, Hojung Jung et al.

NEURIPS 2025spotlightarXiv:2511.05664

citations

#6595

Masked Point-Entity Contrast for Open-Vocabulary 3D Scene Understanding

Yan Wang, Baoxiong Jia, Ziyu Zhu et al.

CVPR 2025arXiv:2504.19500

citations

#6596

FlexOLMo: Open Language Models for Flexible Data Use

Weijia Shi, Akshita Bhagia, Kevin Farhat et al.

NEURIPS 2025spotlightarXiv:2507.07024

citations

#6597

SimXRD-4M: Big Simulated X-ray Diffraction Data and Crystal Symmetry Classification Benchmark

Bin Cao, Yang Liu, Zinan Zheng et al.

ICLR 2025

citations

#6598

Generalizable, real-time neural decoding with hybrid state-space models

Avery Hee-Woon Ryoo, Nanda H Krishna, Ximeng Mao et al.

NEURIPS 2025arXiv:2506.05320

citations

#6599

Bridging Information Asymmetry in Text-video Retrieval: A Data-centric Approach

Zechen Bai, Tianjun Xiao, Tong He et al.

ICLR 2025arXiv:2408.07249

citations

#6600

VALLR: Visual ASR Language Model for Lip Reading

Marshall Thomas, Edward Fish, Richard Bowden

ICCV 2025arXiv:2503.21408

citations

← Previous

1...31 32 33 34 35...112