Most Cited 2025 "benchmark failure modes" Papers

22,274 papers found • Page 15 of 112

#2801

ViSpeak: Visual Instruction Feedback in Streaming Videos

Shenghao Fu, Qize Yang, Yuan-Ming Li et al.

ICCV 2025posterarXiv:2503.12769
11
citations
#2802

Enhancing Trustworthiness of Graph Neural Networks with Rank-Based Conformal Training

Ting Wang, Zhixin Zhou, Rui Luo

AAAI 2025paperarXiv:2501.02767
11
citations
#2803

Measuring Human and AI Values Based on Generative Psychometrics with Large Language Models

Haoran Ye, Yuhang Xie, Yuanyi Ren et al.

AAAI 2025paperarXiv:2409.12106
10
citations
#2804

Consistency Checks for Language Model Forecasters

Daniel Paleka, Abhimanyu Pallavi Sudhir, Alejandro Alvarez et al.

ICLR 2025posterarXiv:2412.18544
10
citations
#2805

Beyond Walking: A Large-Scale Image-Text Benchmark for Text-based Person Anomaly Search

Shuyu Yang, Yaxiong Wang, Li Zhu et al.

ICCV 2025highlightarXiv:2411.17776
10
citations
#2806

Efficient Model Editing with Task-Localized Sparse Fine-tuning

Leonardo Iurada, Marco Ciccone, Tatiana Tommasi

ICLR 2025posterarXiv:2504.02620
10
citations
#2807

Chain of Attack: On the Robustness of Vision-Language Models Against Transfer-Based Adversarial Attacks

Peng Xie, Yequan Bie, Jianda Mao et al.

CVPR 2025posterarXiv:2411.15720
10
citations
#2808

Emergent Temporal Correspondences from Video Diffusion Transformers

Jisu Nam, Soowon Son, Dahyun Chung et al.

NEURIPS 2025oralarXiv:2506.17220
10
citations
#2809

Robot-R1: Reinforcement Learning for Enhanced Embodied Reasoning in Robotics

Dongyoung Kim, Huiwon Jang, Sumin Park et al.

NEURIPS 2025posterarXiv:2506.00070
10
citations
#2810

DyWA: Dynamics-adaptive World Action Model for Generalizable Non-prehensile Manipulation

Jiangran Lyu, Ziming Li, Xuesong Shi et al.

ICCV 2025posterarXiv:2503.16806
10
citations
#2811

RoMA: Scaling up Mamba-based Foundation Models for Remote Sensing

Fengxiang Wang, Yulin Wang, Mingshuo Chen et al.

NEURIPS 2025posterarXiv:2503.10392
10
citations
#2812

SSLAM: Enhancing Self-Supervised Models with Audio Mixtures for Polyphonic Soundscapes

Tony Alex, Sara Atito, Armin Mustafa et al.

ICLR 2025posterarXiv:2506.12222
10
citations
#2813

AdvPrefix: An Objective for Nuanced LLM Jailbreaks

Sicheng Zhu, Brandon Amos, Yuandong Tian et al.

NEURIPS 2025posterarXiv:2412.10321
10
citations
#2814

Neural Exploratory Landscape Analysis for Meta-Black-Box-Optimization

Zeyuan Ma, Jiacheng Chen, Hongshu Guo et al.

ICLR 2025posterarXiv:2408.10672
10
citations
#2815

MIMO: A Medical Vision Language Model with Visual Referring Multimodal Input and Pixel Grounding Multimodal Output

Yanyuan Chen, Dexuan Xu, Yu Huang et al.

CVPR 2025posterarXiv:2510.10011
10
citations
#2816

VisionPAD: A Vision-Centric Pre-training Paradigm for Autonomous Driving

Haiming Zhang, Wending Zhou, Shenzhen The Chinese University of Hongkong et al.

CVPR 2025posterarXiv:2411.14716
10
citations
#2817

Deep Linear Probe Generators for Weight Space Learning

Jonathan Kahana, Eliahu Horwitz, Imri Shuval et al.

ICLR 2025posterarXiv:2410.10811
10
citations
#2818

Towards Training-free Anomaly Detection with Vision and Language Foundation Models

Jinjin Zhang, Guodong Wang, yizhou jin et al.

CVPR 2025posterarXiv:2503.18325
10
citations
#2819

Vector-ICL: In-context Learning with Continuous Vector Representations

Yufan Zhuang, Chandan Singh, Liyuan Liu et al.

ICLR 2025posterarXiv:2410.05629
10
citations
#2820

LLM Can be a Dangerous Persuader: Empirical Study of Persuasion Safety in Large Language Models

Minqian Liu, Zhiyang Xu, Xinyi Zhang et al.

COLM 2025paperarXiv:2504.10430
10
citations
#2821

SplatFlow: Self-Supervised Dynamic Gaussian Splatting in Neural Motion Flow Field for Autonomous Driving

Su Sun, Cheng Zhao, Zhuoyang Sun et al.

CVPR 2025highlightarXiv:2411.15482
10
citations
#2822

Flexible Frame Selection for Efficient Video Reasoning

Shyamal Buch, Arsha Nagrani, Anurag Arnab et al.

CVPR 2025poster
10
citations
#2823

Ouroboros-Diffusion: Exploring Consistent Content Generation in Tuning-free Long Video Diffusion

Jingyuan Chen, Fuchen Long, Jie An et al.

AAAI 2025paperarXiv:2501.09019
10
citations
#2824

GTR: Improving Large 3D Reconstruction Models through Geometry and Texture Refinement

Peiye Zhuang, Songfang Han, Chaoyang Wang et al.

ICLR 2025posterarXiv:2406.05649
10
citations
#2825

S4-Driver: Scalable Self-Supervised Driving Multimodal Large Language Model with Spatio-Temporal Visual Representation

Yichen Xie, Runsheng Xu, Tong He et al.

CVPR 2025poster
10
citations
#2826

Scalable and Certifiable Graph Unlearning: Overcoming the Approximation Error Barrier

Lu Yi, Zhewei Wei

ICLR 2025posterarXiv:2408.09212
10
citations
#2827

Seeing Your Speech Style: A Novel Zero-Shot Identity-Disentanglement Face-based Voice Conversion

Yan Rong, Li Liu

AAAI 2025paperarXiv:2409.00700
10
citations
#2828

Unifying Autoregressive and Diffusion-Based Sequence Generation

Nima Fathi, Torsten Scholak, Pierre-Andre Noel

COLM 2025paperarXiv:2504.06416
10
citations
#2829

Unleashing Hour-Scale Video Training for Long Video-Language Understanding

Jingyang Lin, Jialian Wu, Ximeng Sun et al.

NEURIPS 2025oralarXiv:2506.05332
10
citations
#2830

Confidence Estimation for Error Detection in Text-to-SQL Systems

Oleg Somov, Elena Tutubalina

AAAI 2025paperarXiv:2501.09527
10
citations
#2831

AdaSkip: Adaptive Sublayer Skipping for Accelerating Long-Context LLM Inference

Zhuomin He, Yizhen Yao, Pengfei Zuo et al.

AAAI 2025paperarXiv:2501.02336
10
citations
#2832

EnergyMoGen: Compositional Human Motion Generation with Energy-Based Diffusion Model in Latent Space

Jianrong Zhang, Hehe Fan, Yi Yang

CVPR 2025highlightarXiv:2412.14706
10
citations
#2833

Data-Driven Performance Guarantees for Classical and Learned Optimizers

Rajiv Sambharya, Bartolomeo Stellato

NEURIPS 2025posterarXiv:2404.13831
10
citations
#2834

WSI-LLaVA: A Multimodal Large Language Model for Whole Slide Image

Yuci Liang, Xinheng Lyu, Meidan Ding et al.

ICCV 2025posterarXiv:2412.02141
10
citations
#2835

Forensics-Bench: A Comprehensive Forgery Detection Benchmark Suite for Large Vision Language Models

Jin Wang, Chenghui Lv, Xian Li et al.

CVPR 2025posterarXiv:2503.15024
10
citations
#2836

Law of the Weakest Link: Cross Capabilities of Large Language Models

Ming Zhong, Aston Zhang, Xuewei Wang et al.

ICLR 2025posterarXiv:2409.19951
10
citations
#2837

ID-Patch: Robust ID Association for Group Photo Personalization

Yimeng Zhang, Tiancheng Zhi, Jing Liu et al.

CVPR 2025posterarXiv:2411.13632
10
citations
#2838

Think Then React: Towards Unconstrained Action-to-Reaction Motion Generation

Wenhui Tan, Boyuan Li, Chuhao Jin et al.

ICLR 2025poster
10
citations
#2839

LoRe: Personalizing LLMs via Low-Rank Reward Modeling

Avinandan Bose, Zhihan Xiong, Yuejie Chi et al.

COLM 2025paperarXiv:2504.14439
10
citations
#2840

CLIP is Strong Enough to Fight Back: Test-time Counterattacks towards Zero-shot Adversarial Robustness of CLIP

Songlong Xing, Zhengyu Zhao, Nicu Sebe

CVPR 2025posterarXiv:2503.03613
10
citations
#2841

Latent-EnSF: A Latent Ensemble Score Filter for High-Dimensional Data Assimilation with Sparse Observation Data

Phillip Si, Peng Chen

ICLR 2025posterarXiv:2409.00127
10
citations
#2842

Scalable Surrogate Verification of Image-Based Neural Network Control Systems Using Composition and Unrolling

Feiyang Cai, Chuchu Fan, Stanley Bak

AAAI 2025paperarXiv:2405.18554
10
citations
#2843

BASKET: A Large-Scale Video Dataset for Fine-Grained Skill Estimation

Yulu Pan, Ce Zhang, Gedas Bertasius

CVPR 2025posterarXiv:2503.20781
10
citations
#2844

Di[M]O: Distilling Masked Diffusion Models into One-step Generator

Yuanzhi Zhu, Xi WANG, Stéphane Lathuilière et al.

ICCV 2025poster
10
citations
#2845

LLM Unlearning Reveals a Stronger-Than-Expected Coreset Effect in Current Benchmarks

Soumyadeep Pal, Changsheng Wang, James Diffenderfer et al.

COLM 2025paperarXiv:2504.10185
10
citations
#2846

EllieSQL: Cost-Efficient Text-to-SQL with Complexity-Aware Routing

Yizhang Zhu, Runzhi JIANG, Boyan Li et al.

COLM 2025paperarXiv:2503.22402
10
citations
#2847

3D Mesh Editing using Masked LRMs

William Gao, Dilin Wang, Yuchen Fan et al.

ICCV 2025posterarXiv:2412.08641
10
citations
#2848

MMKE-Bench: A Multimodal Editing Benchmark for Diverse Visual Knowledge

yuntao du, Kailin Jiang, Zhi Gao et al.

ICLR 2025posterarXiv:2502.19870
10
citations
#2849

ADAM: An Embodied Causal Agent in Open-World Environments

Shu Yu, Chaochao Lu

ICLR 2025posterarXiv:2410.22194
10
citations
#2850

Breaking the Data Barrier -- Building GUI Agents Through Task Generalization

Junlei Zhang, Zichen Ding, Chang Ma et al.

COLM 2025paperarXiv:2504.10127
10
citations
#2851

ConfTuner: Training Large Language Models to Express Their Confidence Verbally

Yibo Li, Miao Xiong, Jiaying Wu et al.

NEURIPS 2025posterarXiv:2508.18847
10
citations
#2852

Holistic Reasoning with Long-Context LMs: A Benchmark for Database Operations on Massive Textual Data

Seiji Maekawa, Hayate Iso, Nikita Bhutani

ICLR 2025posterarXiv:2410.11996
10
citations
#2853

Sample Efficient Preference Alignment in LLMs via Active Exploration

Viraj Mehta, Syrine Belakaria, Vikramjeet Das et al.

COLM 2025paperarXiv:2312.00267
10
citations
#2854

General Scene Adaptation for Vision-and-Language Navigation

Haodong Hong, Yanyuan Qiao, Sen Wang et al.

ICLR 2025posterarXiv:2501.17403
10
citations
#2855

Why and How LLMs Hallucinate: Connecting the Dots with Subsequence Associations

Yiyou Sun, Yu Gai, Lijie Chen et al.

NEURIPS 2025posterarXiv:2504.12691
10
citations
#2856

STCOcc: Sparse Spatial-Temporal Cascade Renovation for 3D Occupancy and Scene Flow Prediction

Zhimin Liao, Ping Wei, Shuaijia Chen et al.

CVPR 2025posterarXiv:2504.19749
10
citations
#2857

Diff2Flow: Training Flow Matching Models via Diffusion Model Alignment

Johannes Schusterbauer, Ming Gui, Frank Fundel et al.

CVPR 2025posterarXiv:2506.02221
10
citations
#2858

More RLHF, More Trust? On The Impact of Preference Alignment On Trustworthiness

Aaron J. Li, Satyapriya Krishna, Hima Lakkaraju

ICLR 2025posterarXiv:2404.18870
10
citations
#2859

Advancing Language Multi-Agent Learning with Credit Re-Assignment for Interactive Environment Generalization

Zhitao He, Zijun Liu, Peng Li et al.

COLM 2025paperarXiv:2502.14496
10
citations
#2860

Generalized Recorrupted-to-Recorrupted: Self-Supervised Learning Beyond Gaussian Noise

Brayan Monroy, Jorge Bacca, Julián Tachella

CVPR 2025posterarXiv:2412.04648
10
citations
#2861

First SFT, Second RL, Third UPT: Continual Improving Multi-Modal LLM Reasoning via Unsupervised Post-Training

Lai Wei, Yuting Li, Chen Wang et al.

NEURIPS 2025posterarXiv:2505.22453
10
citations
#2862

The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text

Nikhil Kandpal, Brian Lester, Colin Raffel et al.

NEURIPS 2025posterarXiv:2506.05209
10
citations
#2863

Aligning Human Motion Generation with Human Perceptions

Haoru Wang, Wentao Zhu, Luyi Miao et al.

ICLR 2025posterarXiv:2407.02272
10
citations
#2864

Towards Multiple Character Image Animation Through Enhancing Implicit Decoupling

Jingyun Xue, WANG HongFa, Qi Tian et al.

ICLR 2025posterarXiv:2406.03035
10
citations
#2865

EqNIO: Subequivariant Neural Inertial Odometry

Royina Karegoudra Jayanth, Yinshuang Xu, Ziyun Wang et al.

ICLR 2025posterarXiv:2408.06321
10
citations
#2866

Label-Free Backdoor Attacks in Vertical Federated Learning

Wei Shen, Wenke Huang, Guancheng Wan et al.

AAAI 2025paper
10
citations
#2867

ResCLIP: Residual Attention for Training-free Dense Vision-language Inference

Jinhong Deng, Yuhang Yang, Wen Li et al.

CVPR 2025posterarXiv:2411.15851
10
citations
#2868

A Single Goal is All You Need: Skills and Exploration Emerge from Contrastive RL without Rewards, Demonstrations, or Subgoals

Grace Liu, Michael Tang, Benjamin Eysenbach

ICLR 2025posterarXiv:2408.05804
10
citations
#2869

Deep MMD Gradient Flow without adversarial training

Alexandre Galashov, Valentin De Bortoli, Arthur Gretton

ICLR 2025posterarXiv:2405.06780
10
citations
#2870

MagicColor: Multi-instance Sketch Colorization

yinhan Zhang, Yue Ma, Bingyuan Wang et al.

ICCV 2025poster
10
citations
#2871

Gaussian Eigen Models for Human Heads

Wojciech Zielonka, Timo Bolkart, Thabo Beeler et al.

CVPR 2025posterarXiv:2407.04545
10
citations
#2872

Fluid Language Model Benchmarking

Valentin Hofmann, David Heineman, Ian Magnusson et al.

COLM 2025paperarXiv:2509.11106
10
citations
#2873

Steering the CensorShip: Uncovering Representation Vectors for LLM "Thought'' Control

Hannah Cyberey, David Evans

COLM 2025paper
10
citations
#2874

Context-Parametric Inversion: Why Instruction Finetuning May Not Actually Improve Context Reliance

Sachin Goyal, Christina Baek, Zico Kolter et al.

ICLR 2025poster
10
citations
#2875

Benchmarking Multimodal CoT Reward Model Stepwise by Visual Program

Minghe Gao, Xuqi Liu, Zhongqi Yue et al.

ICCV 2025posterarXiv:2504.06606
10
citations
#2876

Breach By A Thousand Leaks: Unsafe Information Leakage in 'Safe' AI Responses

David Glukhov, Ziwen Han, I Shumailov et al.

ICLR 2025posterarXiv:2407.02551
10
citations
#2877

Rethinking Visual Counterfactual Explanations Through Region Constraint

Bartlomiej Sobieski, Jakub Grzywaczewski, Bartłomiej Sadlej et al.

ICLR 2025posterarXiv:2410.12591
10
citations
#2878

Multiple Heads are Better than One: Mixture of Modality Knowledge Experts for Entity Representation Learning

Yichi Zhang, Zhuo Chen, Lingbing Guo et al.

ICLR 2025posterarXiv:2405.16869
10
citations
#2879

The 3D-PC: a benchmark for visual perspective taking in humans and machines

Drew Linsley, Peisen Zhou, Alekh Ashok et al.

ICLR 2025posterarXiv:2406.04138
10
citations
#2880

Reason-RFT: Reinforcement Fine-Tuning for Visual Reasoning of Vision Language Models

Huajie Tan, Yuheng Ji, Xiaoshuai Hao et al.

NEURIPS 2025posterarXiv:2503.20752
10
citations
#2881

EDEN: Enhanced Diffusion for High-quality Large-motion Video Frame Interpolation

Zihao Zhang, Haoran Chen, Haoyu Zhao et al.

CVPR 2025posterarXiv:2503.15831
10
citations
#2882

Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering

Federico Cocchi, Nicholas Moratelli, Marcella Cornia et al.

CVPR 2025posterarXiv:2411.16863
10
citations
#2883

MAD-TD: Model-Augmented Data stabilizes High Update Ratio RL

Claas Voelcker, Marcel Hussing, ERIC EATON et al.

ICLR 2025posterarXiv:2410.08896
10
citations
#2884

Causal Inference over Visual-Semantic-Aligned Graph for Image Classification

Lei Meng, Xiangxian Li, Xiaoshuo Yan et al.

AAAI 2025paper
10
citations
#2885

Temporal Separation with Entropy Regularization for Knowledge Distillation in Spiking Neural Networks

Kairong Yu, Chengting Yu, Tianqing Zhang et al.

CVPR 2025posterarXiv:2503.03144
10
citations
#2886

DocVLM: Make Your VLM an Efficient Reader

Mor Shpigel Nacson, Aviad Aberdam, Roy Ganz et al.

CVPR 2025posterarXiv:2412.08746
10
citations
#2887

RGBAvatar: Reduced Gaussian Blendshapes for Online Modeling of Head Avatars

Linzhou Li, Yumeng Li, Yanlin Weng et al.

CVPR 2025highlightarXiv:2503.12886
10
citations
#2888

V2X-R: Cooperative LiDAR-4D Radar Fusion with Denoising Diffusion for 3D Object Detection

Xun Huang, Jinlong Wang, Qiming Xia et al.

CVPR 2025posterarXiv:2411.08402
10
citations
#2889

Dual Process Learning: Controlling Use of In-Context vs. In-Weights Strategies with Weight Forgetting

Suraj Anand, Michael Lepori, Jack Merullo et al.

ICLR 2025posterarXiv:2406.00053
10
citations
#2890

Are They the Same? Exploring Visual Correspondence Shortcomings of Multimodal LLMs

Yikang Zhou, Tao Zhang, Shilin Xu et al.

ICCV 2025posterarXiv:2501.04670
10
citations
#2891

PCDreamer: Point Cloud Completion Through Multi-view Diffusion Priors

Guangshun Wei, Yuan Feng, Long Ma et al.

CVPR 2025posterarXiv:2411.19036
10
citations
#2892

Harnessing Massive Satellite Imagery with Efficient Masked Image Modeling

Fengxiang Wang, Hongzhen Wang, Di Wang et al.

ICCV 2025posterarXiv:2406.11933
10
citations
#2893

Pareto Set Learning for Multi-Objective Reinforcement Learning

Erlong Liu, Yu-Chang Wu, Xiaobin Huang et al.

AAAI 2025paperarXiv:2501.06773
10
citations
#2894

Noise Stability Optimization for Finding Flat Minima: A Hessian-based Regularization Approach

Haotian Ju, Hongyang Zhang, Dongyue Li

ICLR 2025posterarXiv:2306.08553
10
citations
#2895

WildSAT: Learning Satellite Image Representations from Wildlife Observations

Rangel Daroya, Elijah Cole, Oisin Mac Aodha et al.

ICCV 2025posterarXiv:2412.14428
10
citations
#2896

VTON 360: High-Fidelity Virtual Try-On from Any Viewing Direction

Zijian He, Yuwei Ning, Yipeng Qin et al.

CVPR 2025posterarXiv:2503.12165
10
citations
#2897

SuperDec: 3D Scene Decomposition with Superquadrics Primitives

Elisabetta Fedele, Boyang Sun, Francis Engelmann et al.

ICCV 2025posterarXiv:2504.00992
10
citations
#2898

LoftUp: Learning a Coordinate-Based Feature Upsampler for Vision Foundation Models

Haiwen Huang, Anpei Chen, Volodymyr Havrylov et al.

ICCV 2025posterarXiv:2504.14032
10
citations
#2899

$Q\sharp$: Provably Optimal Distributional RL for LLM Post-Training

Jin Zhou, Kaiwen Wang, Jonathan Chang et al.

NEURIPS 2025posterarXiv:2502.20548
10
citations
#2900

Generative Monoculture in Large Language Models

Fan Wu, Emily Black, Varun Chandrasekaran

ICLR 2025posterarXiv:2407.02209
10
citations
#2901

DisCO: Reinforcing Large Reasoning Models with Discriminative Constrained Optimization

Gang Li, Ming Lin, Tomer Galanti et al.

NEURIPS 2025posterarXiv:2505.12366
10
citations
#2902

ParZC: Parametric Zero-Cost Proxies for Efficient NAS

Peijie Dong, Lujun Li, Zhenheng Tang et al.

AAAI 2025paperarXiv:2402.02105
10
citations
#2903

Video Summarization with Large Language Models

Min Jung Lee, Dayoung Gong, Minsu Cho

CVPR 2025posterarXiv:2504.11199
10
citations
#2904

Generalized Few-shot 3D Point Cloud Segmentation with Vision-Language Model

Zhaochong An, Guolei Sun, Yun Liu et al.

CVPR 2025posterarXiv:2503.16282
10
citations
#2905

DOVE: Efficient One-Step Diffusion Model for Real-World Video Super-Resolution

Zheng Chen, Zichen Zou, Kewei Zhang et al.

NEURIPS 2025posterarXiv:2505.16239
10
citations
#2906

Generalized and Efficient 2D Gaussian Splatting for Arbitrary-scale Super-Resolution

Du Chen, Liyi Chen, Zhengqiang ZHANG et al.

ICCV 2025posterarXiv:2501.06838
10
citations
#2907

70% Size, 100% Accuracy: Lossless LLM Compression for Efficient GPU Inference via Dynamic-Length Float (DFloat11)

Tianyi Zhang, Mohsen Hariri, Shaochen (Henry) Zhong et al.

NEURIPS 2025posterarXiv:2504.11651
10
citations
#2908

Knowledge Graph Completion with Relation-Aware Anchor Enhancement

Duanyang Yuan, Sihang Zhou, Xiaoshu Chen et al.

AAAI 2025paperarXiv:2504.06129
10
citations
#2909

ReCap: Better Gaussian Relighting with Cross-Environment Captures

Jingzhi Li, Zongwei Wu, Eduard Zamfir et al.

CVPR 2025posterarXiv:2412.07534
10
citations
#2910

Graph Neural Preconditioners for Iterative Solutions of Sparse Linear Systems

Jie Chen

ICLR 2025posterarXiv:2406.00809
10
citations
#2911

Revisiting Multimodal Fusion for 3D Anomaly Detection from an Architectural Perspective

Kaifang Long, Guoyang Xie, Lianbo Ma et al.

AAAI 2025paperarXiv:2412.17297
10
citations
#2912

DSP: Dynamic Sequence Parallelism for Multi-Dimensional Transformers

Xuanlei Zhao, Shenggan Cheng, Chang Chen et al.

ICML 2025posterarXiv:2403.10266
10
citations
#2913

MindLLM: A Subject-Agnostic and Versatile Model for fMRI-to-text Decoding

Weikang Qiu, Zheng Huang, Haoyu Hu et al.

ICML 2025posterarXiv:2502.15786
10
citations
#2914

Open-World Amodal Appearance Completion

Jiayang Ao, Yanbei Jiang, Qiuhong Ke et al.

CVPR 2025posterarXiv:2411.13019
10
citations
#2915

OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data

Yiren Song, Cheng Liu, Mike Zheng Shou

NEURIPS 2025posterarXiv:2505.18445
10
citations
#2916

Anyprefer: An Agentic Framework for Preference Data Synthesis

Yiyang Zhou, Zhaoyang Wang, Tianle Wang et al.

ICLR 2025posterarXiv:2504.19276
10
citations
#2917

Budgeted Online Continual Learning by Adaptive Layer Freezing and Frequency-based Sampling

Minhyuk Seo, Hyunseo Koh, Jonghyun Choi

ICLR 2025posterarXiv:2410.15143
10
citations
#2918

TokenFormer: Rethinking Transformer Scaling with Tokenized Model Parameters

Haiyang Wang, Yue Fan, Muhammad Ferjad Naeem et al.

ICLR 2025posterarXiv:2410.23168
10
citations
#2919

MINERVA: Evaluating Complex Video Reasoning

Arsha Nagrani, Sachit Menon, Ahmet Iscen et al.

ICCV 2025posterarXiv:2505.00681
10
citations
#2920

Efficiently Parameterized Neural Metriplectic Systems

Anthony Gruber, Kookjin Lee, Haksoo Lim et al.

ICLR 2025posterarXiv:2405.16305
10
citations
#2921

The Double-Ellipsoid Geometry of CLIP

Meir Yossef Levi, Guy Gilboa

ICML 2025posterarXiv:2411.14517
10
citations
#2922

Generate, but Verify: Reducing Hallucination in Vision-Language Models with Retrospective Resampling

Tsung-Han (Patrick) Wu, Heekyung Lee, Jiaxin Ge et al.

NEURIPS 2025posterarXiv:2504.13169
10
citations
#2923

A Large Recurrent Action Model: xLSTM enables Fast Inference for Robotics Tasks

Thomas Schmied, Thomas Adler, Vihang Patil et al.

ICML 2025posterarXiv:2410.22391
10
citations
#2924

HIIF: Hierarchical Encoding based Implicit Image Function for Continuous Super-resolution

Yuxuan Jiang, Ho Man Kwan, jasmine peng et al.

CVPR 2025posterarXiv:2412.03748
10
citations
#2925

MoS: Unleashing Parameter Efficiency of Low-Rank Adaptation with Mixture of Shards

Sheng Wang, Liheng Chen, Pengan CHEN et al.

ICLR 2025posterarXiv:2410.00938
10
citations
#2926

DOTA: Distributional Test-time Adaptation of Vision-Language Models

Zongbo Han, Jialong Yang, Guangyu Wang et al.

NEURIPS 2025posterarXiv:2409.19375
10
citations
#2927

Probing the Latent Hierarchical Structure of Data via Diffusion Models

Antonio Sclocchi, Alessandro Favero, Noam Levi et al.

ICLR 2025posterarXiv:2410.13770
10
citations
#2928

SlerpFace: Face Template Protection via Spherical Linear Interpolation

Zhizhou Zhong, Yuxi Mi, Yuge Huang et al.

AAAI 2025paperarXiv:2407.03043
10
citations
#2929

SpatialDreamer: Self-supervised Stereo Video Synthesis from Monocular Input

Zhen Lv, Yangqi Long, Congzhentao Huang et al.

CVPR 2025posterarXiv:2411.11934
10
citations
#2930

DisEnvisioner: Disentangled and Enriched Visual Prompt for Customized Image Generation

Jing He, Haodong Li, huyongzhe et al.

ICLR 2025posterarXiv:2410.02067
10
citations
#2931

Training-Free Guidance Beyond Differentiability: Scalable Path Steering with Tree Search in Diffusion and Flow Models

Yingqing Guo, Yukang Yang, Hui Yuan et al.

NEURIPS 2025posterarXiv:2502.11420
10
citations
#2932

CL-MoE: Enhancing Multimodal Large Language Model with Dual Momentum Mixture-of-Experts for Continual Visual Question Answering

Tianyu Huai, Jie Zhou, Xingjiao Wu et al.

CVPR 2025highlightarXiv:2503.00413
10
citations
#2933

GRPose: Learning Graph Relations for Human Image Generation with Pose Priors

Xiangchen Yin, Donglin Di, Lei Fan et al.

AAAI 2025paperarXiv:2408.16540
10
citations
#2934

Light-T2M: A Lightweight and Fast Model for Text-to-motion Generation

Ling-An Zeng, Guohong Huang, Gaojie Wu et al.

AAAI 2025paperarXiv:2412.11193
10
citations
#2935

X-Fi: A Modality-Invariant Foundation Model for Multimodal Human Sensing

Xinyan Chen, Jianfei Yang

ICLR 2025posterarXiv:2410.10167
10
citations
#2936

Trajectory Mamba: Efficient Attention-Mamba Forecasting Model Based on Selective SSM

Yizhou Huang, Yihua Cheng, Kezhi Wang

CVPR 2025posterarXiv:2503.10898
10
citations
#2937

Measuring memorization in RLHF for code completion

Jamie Hayes, I Shumailov, Billy Porter et al.

ICLR 2025posterarXiv:2406.11715
10
citations
#2938

Visual Test-time Scaling for GUI Agent Grounding

Tiange Luo, Lajanugen Logeswaran, Justin Johnson et al.

ICCV 2025highlightarXiv:2505.00684
10
citations
#2939

PhyMPGN: Physics-encoded Message Passing Graph Network for spatiotemporal PDE systems

Bocheng Zeng, Qi Wang, Mengtao Yan et al.

ICLR 2025oralarXiv:2410.01337
10
citations
#2940

Hybrid Global-Local Representation with Augmented Spatial Guidance for Zero-Shot Referring Image Segmentation

Ting Liu, Siyuan Li

CVPR 2025posterarXiv:2504.00356
10
citations
#2941

Harmony in Divergence: Towards Fast, Accurate, and Memory-efficient Zeroth-order LLM Fine-tuning

Qitao Tan, Jun Liu, Zheng Zhan et al.

NEURIPS 2025posterarXiv:2502.03304
10
citations
#2942

HOIGPT: Learning Long-Sequence Hand-Object Interaction with Language Models

Mingzhen Huang, Fu-Jen Chu, Bugra Tekin et al.

CVPR 2025posterarXiv:2503.19157
10
citations
#2943

Layered Image Vectorization via Semantic Simplification

Zhenyu Wang, Jianxi Huang, Zhida Sun et al.

CVPR 2025posterarXiv:2406.05404
10
citations
#2944

Radiology Report Generation via Multi-objective Preference Optimization

Ting Xiao, Lei Shi, Peng Liu et al.

AAAI 2025paperarXiv:2412.08901
10
citations
#2945

GUI-Xplore: Empowering Generalizable GUI Agents with One Exploration

Yuchen Sun, Shanhui Zhao, Tao Yu et al.

CVPR 2025posterarXiv:2503.17709
10
citations
#2946

Task-Agnostic Guided Feature Expansion for Class-Incremental Learning

Bowen Zheng, Da-Wei Zhou, Han-Jia Ye et al.

CVPR 2025posterarXiv:2503.00823
10
citations
#2947

VTON-HandFit: Virtual Try-on for Arbitrary Hand Pose Guided by Hand Priors Embedding

Yujie Liang, Xiaobin Hu, Boyuan Jiang et al.

CVPR 2025posterarXiv:2408.12340
10
citations
#2948

Prompt-A-Video: Prompt Your Video Diffusion Model via Preference-Aligned LLM

Yatai Ji, Jiacheng Zhang, Jie Wu et al.

ICCV 2025posterarXiv:2412.15156
10
citations
#2949

CCIN: Compositional Conflict Identification and Neutralization for Composed Image Retrieval

Likai Tian, Jian Zhao, Zechao Hu et al.

CVPR 2025highlight
10
citations
#2950

Extrapolating and Decoupling Image-to-Video Generation Models: Motion Modeling is Easier Than You Think

Zhenyi Lu, Xiaoye Qu, Zhenyi Lu et al.

CVPR 2025highlightarXiv:2503.00948
10
citations
#2951

Rewind-to-Delete: Certified Machine Unlearning for Nonconvex Functions

Siqiao Mu, Diego Klabjan

NEURIPS 2025posterarXiv:2409.09778
10
citations
#2952

Text-to-CAD Generation Through Infusing Visual Feedback in Large Language Models

Ruiyu Wang, Yu Yuan, Shizhao Sun et al.

ICML 2025posterarXiv:2501.19054
10
citations
#2953

Node-Time Conditional Prompt Learning in Dynamic Graphs

Xingtong Yu, Zhenghao Liu, Xinming Zhang et al.

ICLR 2025oralarXiv:2405.13937
10
citations
#2954

Mol-LLaMA: Towards General Understanding of Molecules in Large Molecular Language Model

Dongki Kim, Wonbin Lee, Sung Ju Hwang

NEURIPS 2025posterarXiv:2502.13449
10
citations
#2955

Spectral Image Tokenizer

Carlos Esteves, Mohammed Suhail, Ameesh Makadia

ICCV 2025posterarXiv:2412.09607
10
citations
#2956

VideoUFO: A Million-Scale User-Focused Dataset for Text-to-Video Generation

Wenhao Wang, Yi Yang

NEURIPS 2025posterarXiv:2503.01739
10
citations
#2957

CoreMatching: A Co-adaptive Sparse Inference Framework with Token and Neuron Pruning for Comprehensive Acceleration of Vision-Language Models

Qinsi Wang, Hancheng Ye, Ming-Yu Chung et al.

ICML 2025posterarXiv:2505.19235
10
citations
#2958

RomanTex: Decoupling 3D-aware Rotary Positional Embedded Multi-Attention Network for Texture Synthesis

yifei feng, Mx Yang, Shuhui Yang et al.

ICCV 2025posterarXiv:2503.19011
10
citations
#2959

Point2RBox-v2: Rethinking Point-supervised Oriented Object Detection with Spatial Layout Among Instances

Yi Yu, Botao Ren, Peiyuan Zhang et al.

CVPR 2025posterarXiv:2502.04268
10
citations
#2960

DRIFT: Dynamic Rule-Based Defense with Injection Isolation for Securing LLM Agents

Hao Li, Xiaogeng Liu, CHIU Chun et al.

NEURIPS 2025posterarXiv:2506.12104
10
citations
#2961

Reliable and Efficient Amortized Model-based Evaluation

Sang Truong, Yuheng Tu, Percy Liang et al.

ICML 2025posterarXiv:2503.13335
10
citations
#2962

Attention as a Hypernetwork

Simon Schug, Seijin Kobayashi, Yassir Akram et al.

ICLR 2025posterarXiv:2406.05816
10
citations
#2963

Beyond Graphs: Can Large Language Models Comprehend Hypergraphs?

Yifan Feng, Chengwu Yang, Xingliang Hou et al.

ICLR 2025posterarXiv:2410.10083
10
citations
#2964

CLEVER: A Curated Benchmark for Formally Verified Code Generation

Amitayush Thakur, Jasper Lee, George Tsoukalas et al.

NEURIPS 2025posterarXiv:2505.13938
10
citations
#2965

Tuning the Frequencies: Robust Training for Sinusoidal Neural Networks

Tiago Novello, Diana Aldana Moreno, André Araujo et al.

CVPR 2025highlightarXiv:2407.21121
10
citations
#2966

AlphaPO: Reward Shape Matters for LLM Alignment

Aman Gupta, Shao Tang, Qingquan Song et al.

ICML 2025posterarXiv:2501.03884
10
citations
#2967

The ML.ENERGY Benchmark: Toward Automated Inference Energy Measurement and Optimization

Jae-Won Chung, Jeff J. Ma, Ruofan Wu et al.

NEURIPS 2025spotlightarXiv:2505.06371
10
citations
#2968

ADBA: Approximation Decision Boundary Approach for Black-Box Adversarial Attacks

Feiyang Wang, Xingquan Zuo, Hai Huang et al.

AAAI 2025paper
10
citations
#2969

Needle Threading: Can LLMs Follow Threads Through Near-Million-Scale Haystacks?

Jonathan Roberts, Kai Han, Samuel Albanie

ICLR 2025posterarXiv:2411.05000
10
citations
#2970

Taylor Series-Inspired Local Structure Fitting Network for Few-shot Point Cloud Semantic Segmentation

Changshuo Wang, Shuting He, Xiang Fang et al.

AAAI 2025paperarXiv:2504.02454
10
citations
#2971

Unifying 2D and 3D Vision-Language Understanding

Ayush Jain, Alexander Swerdlow, Yuzhou Wang et al.

ICML 2025posterarXiv:2503.10745
10
citations
#2972

A Closer Look at TabPFN v2: Understanding Its Strengths and Extending Its Capabilities

Han-Jia Ye, Si-Yang Liu, Wei-Lun (Harry) Chao

NEURIPS 2025posterarXiv:2502.17361
10
citations
#2973

Unlearn and Burn: Adversarial Machine Unlearning Requests Destroy Model Accuracy

Yangsibo Huang, Daogao Liu, Lynn Chua et al.

ICLR 2025posterarXiv:2410.09591
10
citations
#2974

QERA: an Analytical Framework for Quantization Error Reconstruction

Cheng Zhang, Jeffrey T. H. Wong, Can Xiao et al.

ICLR 2025posterarXiv:2410.06040
10
citations
#2975

Transformer Copilot: Learning from The Mistake Log in LLM Fine-tuning

Jiaru Zou, Yikun Ban, Zihao Li et al.

NEURIPS 2025spotlightarXiv:2505.16270
10
citations
#2976

CameraCtrl II: Dynamic Scene Exploration via Camera-controlled Video Diffusion Models

Hao He, Ceyuan Yang, Shanchuan Lin et al.

ICCV 2025posterarXiv:2503.10592
10
citations
#2977

DreamRelation: Bridging Customization and Relation Generation

Qingyu Shi, Lu Qi, Jianzong Wu et al.

CVPR 2025posterarXiv:2410.23280
10
citations
#2978

Periodic Materials Generation using Text-Guided Joint Diffusion Model

KISHALAY DAS, Subhojyoti Khastagir, Pawan Goyal et al.

ICLR 2025posterarXiv:2503.00522
10
citations
#2979

FormalAlign: Automated Alignment Evaluation for Autoformalization

Jianqiao Lu, Yingjia Wan, Yinya Huang et al.

ICLR 2025posterarXiv:2410.10135
10
citations
#2980

PatchDPO: Patch-level DPO for Finetuning-free Personalized Image Generation

Qihan Huang, Weilong Dai, Jinlong Liu et al.

CVPR 2025posterarXiv:2412.03177
10
citations
#2981

Scaling Laws for Optimal Data Mixtures

Mustafa Shukor, Louis Bethune, Dan Busbridge et al.

NEURIPS 2025posterarXiv:2507.09404
10
citations
#2982

Boosting Latent Diffusion with Perceptual Objectives

Tariq Berrada, Pietro Astolfi, Melissa Hall et al.

ICLR 2025posterarXiv:2411.04873
10
citations
#2983

AdaRankGrad: Adaptive Gradient Rank and Moments for Memory-Efficient LLMs Training and Fine-Tuning

Yehonathan Refael, Jonathan Svirsky, Boris Shustin et al.

ICLR 2025posterarXiv:2410.17881
10
citations
#2984

UCF-Crime-DVS: A Novel Event-Based Dataset for Video Anomaly Detection with Spiking Neural Networks

Yuanbin Qian, Shuhan Ye, Chong Wang et al.

AAAI 2025paperarXiv:2503.12905
10
citations
#2985

Attention layers provably solve single-location regression

Pierre Marion, Raphaël Berthier, Gérard Biau et al.

ICLR 2025posterarXiv:2410.01537
10
citations
#2986

Efficient Attention-Sharing Information Distillation Transformer for Lightweight Single Image Super-Resolution

Karam Park, Jae Woong Soh, Nam Ik Cho

AAAI 2025paperarXiv:2501.15774
10
citations
#2987

Nautilus: Locality-aware Autoencoder for Scalable Mesh Generation

Yuxuan Wang, Xuanyu Yi, Haohan Weng et al.

ICCV 2025posterarXiv:2501.14317
10
citations
#2988

X-Dancer: Expressive Music to Human Dance Video Generation

Zeyuan Chen, Hongyi Xu, Guoxian Song et al.

ICCV 2025highlightarXiv:2502.17414
10
citations
#2989

Topological Blindspots: Understanding and Extending Topological Deep Learning Through the Lens of Expressivity

Yam Eitan, Yoav Gelberg, Guy Bar-Shalom et al.

ICLR 2025posterarXiv:2408.05486
10
citations
#2990

DM-Adapter: Domain-Aware Mixture-of-Adapters for Text-Based Person Retrieval

Yating Liu, Zimo Liu, Xiangyuan Lan et al.

AAAI 2025paperarXiv:2503.04144
10
citations
#2991

RayFlow: Instance-Aware Diffusion Acceleration via Adaptive Flow Trajectories

Huiyang Shao, Xin Xia, Yuhong Yang et al.

CVPR 2025posterarXiv:2503.07699
10
citations
#2992

VSTAR: Generative Temporal Nursing for Longer Dynamic Video Synthesis

Yumeng Li, William H Beluch, Margret Keuper et al.

ICLR 2025oralarXiv:2403.13501
10
citations
#2993

Model-based RL as a Minimalist Approach to Horizon-Free and Second-Order Bounds

Zhiyong Wang, Dongruo Zhou, John C.S. Lui et al.

ICLR 2025posterarXiv:2408.08994
10
citations
#2994

Hydra-SGG: Hybrid Relation Assignment for One-stage Scene Graph Generation

Minghan Chen, Guikun Chen, Wenguan Wang et al.

ICLR 2025posterarXiv:2409.10262
10
citations
#2995

Open-World Reinforcement Learning over Long Short-Term Imagination

Jiajian Li, Qi Wang, Yunbo Wang et al.

ICLR 2025posterarXiv:2410.03618
10
citations
#2996

PLeaS - Merging Models with Permutations and Least Squares

Anshul Nasery, Jonathan Hayase, Pang Wei Koh et al.

CVPR 2025posterarXiv:2407.02447
10
citations
#2997

Efficient Gaussian Splatting for Monocular Dynamic Scene Rendering via Sparse Time-Variant Attribute Modeling

Hanyang Kong, Xingyi Yang, Xinchao Wang

AAAI 2025paperarXiv:2502.20378
10
citations
#2998

Fast training and sampling of Restricted Boltzmann Machines

Nicolas BEREUX, Aurélien Decelle, Cyril Furtlehner et al.

ICLR 2025posterarXiv:2405.15376
10
citations
#2999

Overcoming False Illusions in Real-World Face Restoration with Multi-Modal Guided Diffusion Model

Keda TAO, Jinjin Gu, Yulun Zhang et al.

ICLR 2025posterarXiv:2410.04161
10
citations
#3000

On the Crucial Role of Initialization for Matrix Factorization

Bingcong Li, Liang Zhang, Aryan Mokhtari et al.

ICLR 2025posterarXiv:2410.18965
10
citations