Most Cited 2025 "non-stationary data" Papers

22,274 papers found

#2401

Rope to Nope and Back Again: A New Hybrid Attention Strategy

Bowen Yang, Bharat Venkitesh, Dwaraknath Gnaneshwar Talupuru et al.

NEURIPS 2025 arXiv:2501.18795
20
citations
#2402

Do as We Do, Not as You Think: the Conformity of Large Language Models

Zhiyuan Weng, Guikun Chen, Wenguan Wang

ICLR 2025 arXiv:2501.13381
20
citations
#2403

Revelio: Interpreting and leveraging semantic information in diffusion models

Dahye Kim, Xavier Thomas, Deepti Ghadiyaram

ICCV 2025 arXiv:2411.16725
20
citations
#2404

Breaking the Low-Rank Dilemma of Linear Attention

Qihang Fan, Huaibo Huang, Ran He

CVPR 2025 arXiv:2411.07635
20
citations
#2405

Occlusion-Embedded Hybrid Transformer for Light Field Super-Resolution

Zeyu Xiao, Zhuoyuan Li, Wei Jia

AAAI 2025 paper
20
citations
#2406

Towards Better Alignment: Training Diffusion Models with Reinforcement Learning Against Sparse Rewards

Zijing Hu, Fengda Zhang, Long Chen et al.

CVPR 2025 arXiv:2503.11240
20
citations
#2407

HaWoR: World-Space Hand Motion Reconstruction from Egocentric Videos

Jinglei Zhang, Jiankang Deng, Chao Ma et al.

CVPR 2025 highlight arXiv:2501.02973
20
citations
#2408

Forking Paths in Neural Text Generation

Eric Bigelow, Ari Holtzman, Hidenori Tanaka et al.

ICLR 2025 arXiv:2412.07961
20
citations
#2409

Intervening Anchor Token: Decoding Strategy in Alleviating Hallucinations for MLLMs

Barrett Tang, Zile Huang, Chengzhi Liu et al.

ICLR 2025
20
citations
#2410

Taming Teacher Forcing for Masked Autoregressive Video Generation

Deyu Zhou, Quan Sun, Yuang Peng et al.

CVPR 2025 arXiv:2501.12389
20
citations
#2411

Nova: Generative Language Models for Assembly Code with Hierarchical Attention and Contrastive Learning

Nan Jiang, Chengxiao Wang, Kevin Liu et al.

ICLR 2025 arXiv:2311.13721
20
citations
#2412

Text2PDE: Latent Diffusion Models for Accessible Physics Simulation

Anthony Zhou, Zijie Li, Michael Schneier et al.

ICLR 2025 oral arXiv:2410.01153
20
citations
#2413

Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models

Mateusz Pach, Shyamgopal Karthik, Quentin Bouniot et al.

NEURIPS 2025 arXiv:2504.02821
20
citations
#2414

Inference-Time Alignment of Diffusion Models with Direct Noise Optimization

Zhiwei Tang, Jiangweizhi Peng, Jiasheng Tang et al.

ICML 2025 arXiv:2405.18881
20
citations
#2415

Do Vision & Language Decoders use Images and Text equally? How Self-consistent are their Explanations?

Letitia Parcalabescu, Anette Frank

ICLR 2025 arXiv:2404.18624
20
citations
#2416

KAN-AD: Time Series Anomaly Detection with Kolmogorov–Arnold Networks

Quan Zhou, Changhua Pei, Fei Sun et al.

ICML 2025 arXiv:2411.00278
20
citations
#2417

Fantastic Targets for Concept Erasure in Diffusion Models and Where To Find Them

Anh Bui, Thuy-Trang Vu, Long Vuong et al.

ICLR 2025 arXiv:2501.18950
20
citations
#2418

Sort-free Gaussian Splatting via Weighted Sum Rendering

Qiqi Hou, Randall Rauwendaal, Zifeng Li et al.

ICLR 2025 arXiv:2410.18931
20
citations
#2419

OmniGuard: Hybrid Manipulation Localization via Augmented Versatile Deep Image Watermarking

Xuanyu Zhang, Zecheng Tang, Zhipei Xu et al.

CVPR 2025 arXiv:2412.01615
20
citations
#2420

Learning Continually by Spectral Regularization

Alex Lewandowski, Michał Bortkiewicz, Saurabh Kumar et al.

ICLR 2025 arXiv:2406.06811
20
citations
#2421

Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment

Ziang Yan, Zhilin Li, Yinan He et al.

CVPR 2025 arXiv:2412.19326
20
citations
#2422

SynCity: Training-Free Generation of 3D Cities

Paul Engstler, Aleksandar Shtedritski, Iro Laina et al.

ICCV 2025
20
citations
#2423

WonderTurbo: Generating Interactive 3D World in 0.72 Seconds

Chaojun Ni, Xiaofeng Wang, Zheng Zhu et al.

ICCV 2025 arXiv:2504.02261
20
citations
#2424

InsViE-1M: Effective Instruction-based Video Editing with Elaborate Dataset Construction

Yuhui Wu, Liyi Chen, Ruibin Li et al.

ICCV 2025 arXiv:2503.20287
20
citations
#2425

You Only Sample Once: Taming One-Step Text-to-Image Synthesis by Self-Cooperative Diffusion GANs

Yihong Luo, Xiaolong Chen, Xinghua Qu et al.

ICLR 2025 arXiv:2403.12931
20
citations
#2426

UniK3D: Universal Camera Monocular 3D Estimation

Luigi Piccinelli, Christos Sakaridis, Mattia Segu et al.

CVPR 2025 arXiv:2503.16591
20
citations
#2427

SynFlowNet: Design of Diverse and Novel Molecules with Synthesis Constraints

Miruna Cretu, Charles Harris, Ilia Igashov et al.

ICLR 2025 arXiv:2405.01155
20
citations
#2428

InstructAvatar: Text-Guided Emotion and Motion Control for Avatar Generation

Yuchi Wang, Junliang Guo, Jianhong Bai et al.

AAAI 2025 paper arXiv:2405.15758
20
citations
#2429

MindOmni: Unleashing Reasoning Generation in Vision Language Models with RGPO

Yicheng Xiao, Lin Song, Yukang Chen et al.

NEURIPS 2025 arXiv:2505.13031
20
citations
#2430

From Low Rank Gradient Subspace Stabilization to Low-Rank Weights: Observations, Theories, and Applications

Ajay Jaiswal, Yifan Wang, Lu Yin et al.

ICML 2025 arXiv:2407.11239
20
citations
#2431

τ-bench: A Benchmark for Tool-Agent-User Interaction in Real-World Domains

Shunyu Yao, Noah Shinn, Pedram Razavi et al.

ICLR 2025
20
citations
#2432

CLEAR: Conv-Like Linearization Revs Pre-Trained Diffusion Transformers Up

Songhua Liu, Zhenxiong Tan, Xinchao Wang

NEURIPS 2025 arXiv:2412.16112
20
citations
#2433

Rethinking Light Decoder-based Solvers for Vehicle Routing Problems

Ziwei Huang, Jianan Zhou, Zhiguang Cao et al.

ICLR 2025 arXiv:2503.00753
20
citations
#2434

ShiftwiseConv: Small Convolutional Kernel with Large Kernel Effect

Dachong Li, Li Li, Zhuangzhuang Chen et al.

CVPR 2025 arXiv:2401.12736
19
citations
#2435

4D LangSplat: 4D Language Gaussian Splatting via Multimodal Large Language Models

Wanhua Li, Renping Zhou, Jiawei Zhou et al.

CVPR 2025 arXiv:2503.10437
19
citations
#2436

FreeSplatter: Pose-free Gaussian Splatting for Sparse-view 3D Reconstruction

Jiale Xu, Shenghua Gao, Ying Shan

ICCV 2025 arXiv:2412.09573
19
citations
#2437

Spectral Motion Alignment for Video Motion Transfer Using Diffusion Models

Geon Yeong Park, Hyeonho Jeong, Sang Wan Lee et al.

AAAI 2025 paper arXiv:2403.15249
19
citations
#2438

Capturing the Temporal Dependence of Training Data Influence

Jiachen (Tianhao) Wang, Dawn Song, James Y Zou et al.

ICLR 2025 oral arXiv:2412.09538
19
citations
#2439

Around the World in 80 Timesteps: A Generative Approach to Global Visual Geolocation

Nicolas Dufour, Vicky Kalogeiton, David Picard et al.

CVPR 2025 arXiv:2412.06781
19
citations
#2440

CAD-Recode: Reverse Engineering CAD Code from Point Clouds

Danila Rukhovich, Elona Dupont, Dimitrios Mallis et al.

ICCV 2025 arXiv:2412.14042
19
citations
#2441

MoonCast: High-Quality Zero-Shot Podcast Generation

Zeqian Ju, Dongchao Yang, Shen Kai et al.

NEURIPS 2025 oral arXiv:2503.14345
19
citations
#2442

Generative Omnimatte: Learning to Decompose Video into Layers

Yao-Chih Lee, Erika Lu, Sarah Rumbley et al.

CVPR 2025 highlight arXiv:2411.16683
19
citations
#2443

TabDPT: Scaling Tabular Foundation Models on Real Data

Junwei Ma, Valentin Thomas, Rasa Hosseinzadeh et al.

NEURIPS 2025 arXiv:2410.18164
19
citations
#2444

Commit0: Library Generation from Scratch

Wenting Zhao, Nan Jiang, Celine Lee et al.

ICLR 2025 arXiv:2412.01769
19
citations
#2445

Self-Evolved Reward Learning for LLMs

Chenghua Huang, Zhizhen Fan, Lu Wang et al.

ICLR 2025 arXiv:2411.00418
19
citations
#2446

DKDM: Data-Free Knowledge Distillation for Diffusion Models with Any Architecture

Qianlong Xiang, Miao Zhang, Yuzhang Shang et al.

CVPR 2025 arXiv:2409.03550
19
citations
#2447

Block Verification Accelerates Speculative Decoding

Ziteng Sun, Uri Mendlovic, Yaniv Leviathan et al.

ICLR 2025 arXiv:2403.10444
19
citations
#2448

Multi-Domain Graph Foundation Models: Robust Knowledge Transfer via Topology Alignment

Shuo Wang, Bokui Wang, Zhixiang Shen et al.

ICML 2025 arXiv:2502.02017
19
citations
#2449

TexGaussian: Generating High-quality PBR Material via Octree-based 3D Gaussian Splatting

Bojun Xiong, Jialun Liu, JiaKui Hu et al.

CVPR 2025 arXiv:2411.19654
19
citations
#2450

Universal Length Generalization with Turing Programs

Kaiying Hou, David Brandfonbrener, Sham Kakade et al.

ICML 2025 arXiv:2407.03310
19
citations
#2451

TAPNext: Tracking Any Point (TAP) as Next Token Prediction

Artem Zholus, Carl Doersch, Yi Yang et al.

ICCV 2025 arXiv:2504.05579
19
citations
#2452

Analyzing and Boosting the Power of Fine-Grained Visual Recognition for Multi-modal Large Language Models

Hulingxiao He, Geng Li, Zijun Geng et al.

ICLR 2025 arXiv:2501.15140
19
citations
#2453

Look Inside for More: Internal Spatial Modality Perception for 3D Anomaly Detection

Hanzhe Liang, Guoyang Xie, Chengbin Hou et al.

AAAI 2025 paper arXiv:2412.13461
19
citations
#2454

GuardReasoner-VL: Safeguarding VLMs via Reinforced Reasoning

Yue Liu, Shengfang Zhai, Mingzhe Du et al.

NEURIPS 2025 arXiv:2505.11049
19
citations
#2455

Progress or Regress? Self-Improvement Reversal in Post-training

Ting Wu, Xuefeng Li, Pengfei Liu

ICLR 2025 arXiv:2407.05013
19
citations
#2456

Standing on the Shoulders of Giants: Reprogramming Visual-Language Model for General Deepfake Detection

Kaiqing Lin, Yuzhen Lin, Weixiang Li et al.

AAAI 2025 paper arXiv:2409.02664
19
citations
#2457

Model Swarms: Collaborative Search to Adapt LLM Experts via Swarm Intelligence

Shangbin Feng, Zifeng Wang, Yike Wang et al.

ICML 2025 arXiv:2410.11163
19
citations
#2458

From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data

Zheyang Xiong, Vasilis Papageorgiou, Kangwook Lee et al.

ICLR 2025 arXiv:2406.19292
19
citations
#2459

Segmenting Maxillofacial Structures in CBCT Volumes

Federico Bolelli, Kevin Marchesini, Niels van Nistelrooij et al.

CVPR 2025
19
citations
#2460

AGLLDiff: Guiding Diffusion Models Towards Unsupervised Training-free Real-world Low-light Image Enhancement

Yunlong Lin, Tian Ye, Sixiang Chen et al.

AAAI 2025 paper arXiv:2407.14900
19
citations
#2461

Feat2GS: Probing Visual Foundation Models with Gaussian Splatting

Yue Chen, Xingyu Chen, Anpei Chen et al.

CVPR 2025 arXiv:2412.09606
19
citations
#2462

Score Forgetting Distillation: A Swift, Data-Free Method for Machine Unlearning in Diffusion Models

Tianqi Chen, Shujian Zhang, Mingyuan Zhou

ICLR 2025 arXiv:2409.11219
19
citations
#2463

Streamlining Redundant Layers to Compress Large Language Models

Xiaodong Chen, Yuxuan Hu, Jing Zhang et al.

ICLR 2025 arXiv:2403.19135
19
citations
#2464

RainbowPO: A Unified Framework for Combining Improvements in Preference Optimization

Hanyang Zhao, Genta Winata, Anirban Das et al.

ICLR 2025 arXiv:2410.04203
19
citations
#2465

Effective Interplay between Sparsity and Quantization: From Theory to Practice

Simla Harma, Ayan Chakraborty, Elizaveta Kostenok et al.

ICLR 2025 arXiv:2405.20935
19
citations
#2466

PersonalVideo: High ID-Fidelity Video Customization without Dynamic and Semantic Degradation

Hengjia Li, Haonan Qiu, Shiwei Zhang et al.

ICCV 2025 arXiv:2411.17048
19
citations
#2467

Temporal Query Network for Efficient Multivariate Time Series Forecasting

Shengsheng Lin, Haojun Chen, Haijie Wu et al.

ICML 2025 oral arXiv:2505.12917
19
citations
#2468

3DitScene: Editing Any Scene via Language-guided Disentangled Gaussian Splatting

Qihang Zhang, Yinghao Xu, Chaoyang Wang et al.

ICLR 2025 arXiv:2405.18424
19
citations
#2469

The Surprising Agreement Between Convex Optimization Theory and Learning-Rate Scheduling for Large Model Training

Fabian Schaipp, Alexander Hägele, Adrien Taylor et al.

ICML 2025 arXiv:2501.18965
19
citations
#2470

Two Effects, One Trigger: On the Modality Gap, Object Bias, and Information Imbalance in Contrastive Vision-Language Models

Simon Schrodi, David T. Hoffmann, Max Argus et al.

ICLR 2025 arXiv:2404.07983
19
citations
#2471

QJL: 1-Bit Quantized JL Transform for KV Cache Quantization with Zero Overhead

Amir Zandieh, Majid Daliri, Insu Han

AAAI 2025 paper arXiv:2406.03482
19
citations
#2472

Rethinking Transformer-Based Blind-Spot Network for Self-Supervised Image Denoising

Junyi Li, Zhilu Zhang, Wangmeng Zuo

AAAI 2025 paper arXiv:2404.07846
19
citations
#2473

Beyond Chemical QA: Evaluating LLM's Chemical Reasoning with Modular Chemical Operations

Li Hao, He Cao, Bin Feng et al.

NEURIPS 2025 arXiv:2505.21318
19
citations
#2474

Unleashing the Temporal-Spatial Reasoning Capacity of GPT for Training-Free Audio and Language Referenced Video Object Segmentation

Shaofei Huang, Rui Ling, Hongyu Li et al.

AAAI 2025 paper arXiv:2408.15876
19
citations
#2475

The Crucial Role of Samplers in Online Direct Preference Optimization

Ruizhe Shi, Runlong Zhou, Simon Du

ICLR 2025 arXiv:2409.19605
19
citations
#2476

BirdSet: A Large-Scale Dataset for Audio Classification in Avian Bioacoustics

Lukas Rauch, Raphael Schwinger, Moritz Wirth et al.

ICLR 2025 arXiv:2403.10380
19
citations
#2477

KBQA-o1: Agentic Knowledge Base Question Answering with Monte Carlo Tree Search

Haoran Luo, Haihong E, Yikai Guo et al.

ICML 2025 arXiv:2501.18922
19
citations
#2478

EVEv2: Improved Baselines for Encoder-Free Vision-Language Models

Haiwen Diao, Xiaotong Li, Yufeng Cui et al.

ICCV 2025 highlight arXiv:2502.06788
19
citations
#2479

NExT-Mol: 3D Diffusion Meets 1D Language Modeling for 3D Molecule Generation

Zhiyuan Liu, Yanchen Luo, Han Huang et al.

ICLR 2025 arXiv:2502.12638
19
citations
#2480

ReferDINO: Referring Video Object Segmentation with Visual Grounding Foundations

Tianming Liang, Kun-Yu Lin, Chaolei Tan et al.

ICCV 2025 arXiv:2501.14607
19
citations
#2481

Accelerating neural network training: An analysis of the AlgoPerf competition

Priya Kasimbeg, Frank Schneider, Runa Eschenhagen et al.

ICLR 2025 arXiv:2502.15015
19
citations
#2482

Aligning Vision to Language: Annotation-Free Multimodal Knowledge Graph Construction for Enhanced LLMs Reasoning

Junming Liu, Siyuan Meng, Yanting Gao et al.

ICCV 2025 arXiv:2503.12972
19
citations
#2483

Mellow: a small audio language model for reasoning

Soham Deshmukh, Satvik Dixit, Rita Singh et al.

NEURIPS 2025 arXiv:2503.08540
19
citations
#2484

CrAM: Credibility-Aware Attention Modification in LLMs for Combating Misinformation in RAG

Boyi Deng, Wenjie Wang, Fengbin Zhu et al.

AAAI 2025 paper arXiv:2406.11497
19
citations
#2485

SlowFast-VGen: Slow-Fast Learning for Action-Driven Long Video Generation

Yining Hong, Beide Liu, Maxine Wu et al.

ICLR 2025 oral arXiv:2410.23277
19
citations
#2486

Adversaries Can Misuse Combinations of Safe Models

Erik Jones, Anca Dragan, Jacob Steinhardt

ICML 2025 arXiv:2406.14595
19
citations
#2487

Learning to Reason for Long-Form Story Generation

Alexander Gurung, Mirella Lapata

COLM 2025 paper
19
citations
#2488

ABQ-LLM: Arbitrary-Bit Quantized Inference Acceleration for Large Language Models

Chao Zeng, Songwei Liu, Yusheng Xie et al.

AAAI 2025 paper arXiv:2408.08554
19
citations
#2489

MaskGaussian: Adaptive 3D Gaussian Representation from Probabilistic Masks

Yifei Liu, Zhihang Zhong, Yifan Zhan et al.

CVPR 2025 arXiv:2412.20522
19
citations
#2490

A Controlled Study on Long Context Extension and Generalization in LLMs

Yi Lu, Jing Nathan Yan, Songlin Yang et al.

COLM 2025 paper arXiv:2409.12181
19
citations
#2491

Hydra-NeXt: Robust Closed-Loop Driving with Open-Loop Training

Zhenxin Li, Shihao Wang, Shiyi Lan et al.

ICCV 2025 arXiv:2503.12030
19
citations
#2492

Design Principle Transfer in Neural Architecture Search via Large Language Models

Xun Zhou, Xingyu Wu, Liang Feng et al.

AAAI 2025 paper arXiv:2408.11330
19
citations
#2493

Learn from Downstream and Be Yourself in Multimodal Large Language Models Fine-Tuning

Wenke Huang, Jian Liang, Zekun Shi et al.

ICML 2025 arXiv:2411.10928
19
citations
#2494

COAT: Compressing Optimizer states and Activations for Memory-Efficient FP8 Training

Haocheng Xi, Han Cai, Ligeng Zhu et al.

ICLR 2025 arXiv:2410.19313
19
citations
#2495

Benchmarking Predictive Coding Networks – Made Simple

Luca Pinchetti, Chang Qi, Oleh Lokshyn et al.

ICLR 2025 arXiv:2407.01163
19
citations
#2496

Bilinear MLPs enable weight-based mechanistic interpretability

Michael Pearce, Thomas Dooms, Alice Rigg et al.

ICLR 2025 arXiv:2410.08417
19
citations
#2497

AE-NeRF: Augmenting Event-Based Neural Radiance Fields for Non-ideal Conditions and Larger Scenes

Chaoran Feng, Wangbo Yu, Xinhua Cheng et al.

AAAI 2025 paper arXiv:2501.02807
19
citations
#2498

AVHBench: A Cross-Modal Hallucination Benchmark for Audio-Visual Large Language Models

Kim Sung-Bin, Oh Hyun-Bin, Lee Jung-Mok et al.

ICLR 2025 arXiv:2410.18325
19
citations
#2499

Deep Video Discovery: Agentic Search with Tool Use for Long-form Video Understanding

Xiaoyi Zhang, Zhaoyang Jia, Zongyu Guo et al.

NEURIPS 2025 oral arXiv:2505.18079
19
citations
#2500

Neighboring Autoregressive Modeling for Efficient Visual Generation

Yefei He, Yuanyu He, Shaoxuan He et al.

ICCV 2025 arXiv:2503.10696
19
citations
#2501

Scaling Inference-Time Search with Vision Value Model for Improved Visual Comprehension

Xiyao Wang, Zhengyuan Yang, Linjie Li et al.

ICCV 2025 arXiv:2412.03704
19
citations
#2502

MotionCraft: Crafting Whole-Body Motion with Plug-and-Play Multimodal Controls

Yuxuan Bian, Ailing Zeng, Xuan Ju et al.

AAAI 2025 paper arXiv:2407.21136
19
citations
#2503

Factored-NeuS: Reconstructing Surfaces, Illumination, and Materials of Possibly Glossy Objects

Yue Fan, Ningjing Fan, Ivan Skorokhodov et al.

CVPR 2025 arXiv:2305.17929
19
citations
#2504

Rethinking Fine-Tuning when Scaling Test-Time Compute: Limiting Confidence Improves Mathematical Reasoning

Feng Chen, Allan Raventós, Nan Cheng et al.

NEURIPS 2025 arXiv:2502.07154
19
citations
#2505

Optimizing Anytime Reasoning via Budget Relative Policy Optimization

Penghui Qi, Zichen Liu, Tianyu Pang et al.

NEURIPS 2025 arXiv:2505.13438
19
citations
#2506

RLVR-World: Training World Models with Reinforcement Learning

Jialong Wu, Shaofeng Yin, Ningya Feng et al.

NEURIPS 2025 arXiv:2505.13934
19
citations
#2507

Interpretable Bilingual Multimodal Large Language Model for Diverse Biomedical Tasks

Lehan Wang, Haonan Wang, Honglong Yang et al.

ICLR 2025 arXiv:2410.18387
19
citations
#2508

Self-MoE: Towards Compositional Large Language Models with Self-Specialized Experts

Junmo Kang, Leonid Karlinsky, Hongyin Luo et al.

ICLR 2025 arXiv:2406.12034
19
citations
#2509

Interpreting CLIP with Hierarchical Sparse Autoencoders

Vladimir Zaigrajew, Hubert Baniecki, Przemysław Biecek

ICML 2025 arXiv:2502.20578
19
citations
#2510

Sparse Autoencoders for Hypothesis Generation

Rajiv Movva, Kenny Peng, Nikhil Garg et al.

ICML 2025 arXiv:2502.04382
19
citations
#2511

Provable Benefit of Annealed Langevin Monte Carlo for Non-log-concave Sampling

Wei Guo, Molei Tao, Yongxin Chen

ICLR 2025 arXiv:2407.16936
19
citations
#2512

PolyGuard: A Multilingual Safety Moderation Tool for 17 Languages

Priyanshu Kumar, Devansh Jain, Akhila Yerukola et al.

COLM 2025 paper arXiv:2504.04377
19
citations
#2513

Automated Benchmark Generation for Repository-Level Coding Tasks

Konstantinos Vergopoulos, Mark Müller, Martin Vechev

ICML 2025 arXiv:2503.07701
19
citations
#2514

SnapGen: Taming High-Resolution Text-to-Image Models for Mobile Devices with Efficient Architectures and Training

Jierun Chen, Dongting Hu, Xijie Huang et al.

CVPR 2025 highlight arXiv:2412.09619
19
citations
#2515

GenFusion: Closing the Loop between Reconstruction and Generation via Videos

Sibo Wu, Congrong Xu, Binbin Huang et al.

CVPR 2025 arXiv:2503.21219
19
citations
#2516

Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision

Orr Zohar, Xiaohan Wang, Yonatan Bitton et al.

ICLR 2025 arXiv:2407.06189
19
citations
#2517

Softmax is not Enough (for Sharp Size Generalisation)

Petar Veličković, Christos Perivolaropoulos, Federico Barbero et al.

ICML 2025 arXiv:2410.01104
19
citations
#2518

MatExpert: Decomposing Materials Discovery By Mimicking Human Experts

Qianggang Ding, Santiago Miret, Bang Liu

ICLR 2025 arXiv:2410.21317
19
citations
#2519

FreeScale: Unleashing the Resolution of Diffusion Models via Tuning-Free Scale Fusion

Haonan Qiu, Shiwei Zhang, Yujie Wei et al.

ICCV 2025 arXiv:2412.09626
19
citations
#2520

KGARevion: An AI Agent for Knowledge-Intensive Biomedical QA

Xiaorui Su, Yibo Wang, Shanghua Gao et al.

ICLR 2025 arXiv:2410.04660
19
citations
#2521

WebGen-Bench: Evaluating LLMs on Generating Interactive and Functional Websites from Scratch

Zimu Lu, Yunqiao Yang, Houxing Ren et al.

NEURIPS 2025 oral arXiv:2505.03733
19
citations
#2522

Zero-shot forecasting of chaotic systems

Yuanzhao Zhang, William Gilpin

ICLR 2025 arXiv:2409.15771
19
citations
#2523

Sensitivity-Aware Amortized Bayesian Inference

Lasse Elsemüller, Hans Olischläger, Marvin Schmitt et al.

ICLR 2025 arXiv:2310.11122
19
citations
#2524

NerfBaselines: Consistent and Reproducible Evaluation of Novel View Synthesis Methods

Jonas Kulhanek, Torsten Sattler

NEURIPS 2025 arXiv:2406.17345
19
citations
#2525

Tool-Planner: Task Planning with Clusters across Multiple Tools

Yanming Liu, Xinyue Peng, Jiannan Cao et al.

ICLR 2025 arXiv:2406.03807
19
citations
#2526

Scalable Image Tokenization with Index Backpropagation Quantization

Fengyuan Shi, Zhuoyan Luo, Yixiao Ge et al.

ICCV 2025 arXiv:2412.02692
19
citations
#2527

MarS: a Financial Market Simulation Engine Powered by Generative Foundation Model

Junjie Li, Yang Liu, Weiqing Liu et al.

ICLR 2025 arXiv:2409.07486
19
citations
#2528

Reinforce LLM Reasoning through Multi-Agent Reflection

Yurun Yuan, Tengyang Xie

ICML 2025 arXiv:2506.08379
19
citations
#2529

POI-Enhancer: An LLM-based Semantic Enhancement Framework for POI Representation Learning

Jiawei Cheng, Jingyuan Wang, Yichuan Zhang et al.

AAAI 2025 paper arXiv:2502.10038
19
citations
#2530

AlphaForge: A Framework to Mine and Dynamically Combine Formulaic Alpha Factors

Hao Shi, Weili Song, Xinting Zhang et al.

AAAI 2025 paper arXiv:2406.18394
19
citations
#2531

Efficiently Scaling LLM Reasoning Programs with Certaindex

Yichao Fu, Junda Chen, Siqi Zhu et al.

NEURIPS 2025
19
citations
#2532

MMR: A Large-scale Benchmark Dataset for Multi-target and Multi-granularity Reasoning Segmentation

Donggon Jang, Yucheol Cho, Suin Lee et al.

ICLR 2025 arXiv:2503.13881
19
citations
#2533

Lyra: An Efficient and Speech-Centric Framework for Omni-Cognition

Zhisheng Zhong, Chengyao Wang, Yuqi Liu et al.

ICCV 2025 arXiv:2412.09501
19
citations
#2534

TinyViM: Frequency Decoupling for Tiny Hybrid Vision Mamba

Xiaowen Ma, Zhen-Liang Ni, Xinghao Chen

ICCV 2025 arXiv:2411.17473
19
citations
#2535

Emergent Symbolic Mechanisms Support Abstract Reasoning in Large Language Models

Yukang Yang, Declan Campbell, Kaixuan Huang et al.

ICML 2025 arXiv:2502.20332
19
citations
#2536

DashGaussian: Optimizing 3D Gaussian Splatting in 200 Seconds

Youyu Chen, Junjun Jiang, Kui Jiang et al.

CVPR 2025 highlight arXiv:2503.18402
19
citations
#2537

Beyond Single Concept Vector: Modeling Concept Subspace in LLMs with Gaussian Distribution

Haiyan Zhao, Heng Zhao, Bo Shen et al.

ICLR 2025 arXiv:2410.00153
19
citations
#2538

Schedule On the Fly: Diffusion Time Prediction for Faster and Better Image Generation

Zilyu Ye, Zhiyang Chen, Tiancheng Li et al.

CVPR 2025 arXiv:2412.01243
19
citations
#2539

CarPlanner: Consistent Auto-regressive Trajectory Planning for Large-Scale Reinforcement Learning in Autonomous Driving

Dongkun Zhang, Jiaming Liang, Ke Guo et al.

CVPR 2025 arXiv:2502.19908
19
citations
#2540

DynaPrompt: Dynamic Test-Time Prompt Tuning

Zehao Xiao, Shilin Yan, Jack Hong et al.

ICLR 2025 arXiv:2501.16404
19
citations
#2541

Learning to Discretize Denoising Diffusion ODEs

Vinh Tong, Trung-Dung Hoang, Anji Liu et al.

ICLR 2025 arXiv:2405.15506
19
citations
#2542

Subobject-level Image Tokenization

Delong Chen, Samuel Cahyawijaya, Jianfeng Liu et al.

ICML 2025 arXiv:2402.14327
19
citations
#2543

NavMorph: A Self-Evolving World Model for Vision-and-Language Navigation in Continuous Environments

Xuan Yao, Junyu Gao, Changsheng Xu

ICCV 2025 arXiv:2506.23468
19
citations
#2544

Pre-training Auto-regressive Robotic Models with 4D Representations

Dantong Niu, Yuvan Sharma, Haoru Xue et al.

ICML 2025 arXiv:2502.13142
19
citations
#2545

Instant3dit: Multiview Inpainting for Fast Editing of 3D Objects

Amir Barda, Matheus Gadelha, Vladimir G. Kim et al.

CVPR 2025 arXiv:2412.00518
19
citations
#2546

Robotouille: An Asynchronous Planning Benchmark for LLM Agents

Gonzalo Gonzalez-Pumariega, Leong Yean, Neha Sunkara et al.

ICLR 2025 arXiv:2502.05227
19
citations
#2547

LumiNet: Latent Intrinsics Meets Diffusion Models for Indoor Scene Relighting

Xiaoyan Xing, Konrad Groh, Sezer Karaoglu et al.

CVPR 2025 arXiv:2412.00177
19
citations
#2548

Cross-View Completion Models are Zero-shot Correspondence Estimators

Honggyu An, Jin Hyeon Kim, Seonghoon Park et al.

CVPR 2025 highlight arXiv:2412.09072
19
citations
#2549

Memory Layers at Scale

Vincent-Pierre Berges, Barlas Oğuz, Daniel Haziza et al.

ICML 2025 arXiv:2412.09764
19
citations
#2550

Mitigating Overthinking in Large Reasoning Models via Manifold Steering

Yao Huang, Huanran Chen, Shouwei Ruan et al.

NEURIPS 2025 arXiv:2505.22411
19
citations
#2551

Two-stream Beats One-stream: Asymmetric Siamese Network for Efficient Visual Tracking

Jiawen Zhu, Huayi Tang, Xin Chen et al.

AAAI 2025 paper arXiv:2503.00516
19
citations
#2552

Design Principles and Challenges for Gaze + Pinch Interaction in XR

Ken Pfeuffer, Hans Gellersen, Mar Gonzalez-Franco

ISMAR 2025 paper
19
citations
#2553

AvatarGO: Zero-shot 4D Human-Object Interaction Generation and Animation

Yukang Cao, Liang Pan, Kai Han et al.

ICLR 2025 arXiv:2410.07164
19
citations
#2554

SAFE: Multitask Failure Detection for Vision-Language-Action Models

Qiao Gu, Yuanliang Ju, Shengxiang Sun et al.

NEURIPS 2025 arXiv:2506.09937
19
citations
#2555

Target Concrete Score Matching: A Holistic Framework for Discrete Diffusion

Ruixiang Zhang, Shuangfei Zhai, Yizhe Zhang et al.

ICML 2025 arXiv:2504.16431
19
citations
#2556

Combating Multimodal LLM Hallucination via Bottom-Up Holistic Reasoning

Shengqiong Wu, Hao Fei, Liangming Pan et al.

AAAI 2025 paper arXiv:2412.11124
19
citations
#2557

Mechanism Design for LLM Fine-tuning with Multiple Reward Models

Haoran Sun, Yurong Chen, Siwei Wang et al.

NEURIPS 2025 arXiv:2405.16276
19
citations
#2558

CXPMRG-Bench: Pre-training and Benchmarking for X-ray Medical Report Generation on CheXpert Plus Dataset

Xiao Wang, Fuling Wang, Yuehang Li et al.

CVPR 2025 arXiv:2410.00379
19
citations
#2559

Model Equality Testing: Which Model is this API Serving?

Irena Gao, Percy Liang, Carlos Guestrin

ICLR 2025 arXiv:2410.20247
19
citations
#2560

Hash3D: Training-free Acceleration for 3D Generation

Xingyi Yang, Songhua Liu, Xinchao Wang

CVPR 2025 arXiv:2404.06091
19
citations
#2561

DriveGPT4-V2: Harnessing Large Language Model Capabilities for Enhanced Closed-Loop Autonomous Driving

Zhenhua Xu, Yan Bai, Yujia Zhang et al.

CVPR 2025 highlight
19
citations
#2562

Auditing f-differential privacy in one run

Saeed Mahloujifar, Luca Melis, Kamalika Chaudhuri

ICML 2025 oral arXiv:2410.22235
19
citations
#2563

A Rainbow in Deep Network Black Boxes

Florentin Guth, Brice Ménard, Gaspar Rochette et al.

ICLR 2025 arXiv:2305.18512
19
citations
#2564

Adaptive Message Passing: A General Framework to Mitigate Oversmoothing, Oversquashing, and Underreaching

Federico Errica, Henrik Christiansen, Viktor Zaverkin et al.

ICML 2025 arXiv:2312.16560
19
citations
#2565

Magnetic Preference Optimization: Achieving Last-iterate Convergence for Language Model Alignment

Mingzhi Wang, Chengdong Ma, Qizhi Chen et al.

ICLR 2025 arXiv:2410.16714
19
citations
#2566

SensorLM: Learning the Language of Wearable Sensors

Yuwei Zhang, Kumar Ayush, Siyuan Qiao et al.

NEURIPS 2025 arXiv:2506.09108
19
citations
#2567

Systematic Outliers in Large Language Models

Yongqi An, Xu Zhao, Tao Yu et al.

ICLR 2025 arXiv:2502.06415
19
citations
#2568

Mitigating the Human-Robot Domain Discrepancy in Visual Pre-training for Robotic Manipulation

Jiaming Zhou, Teli Ma, Kun-Yu Lin et al.

CVPR 2025 arXiv:2406.14235
19
citations
#2569

Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning

Simran Kaur, Simon Park, Anirudh Goyal et al.

ICLR 2025 arXiv:2408.14774
19
citations
#2570

Rethinking Reward Model Evaluation: Are We Barking up the Wrong Tree?

Xueru Wen, Jie Lou, Yaojie Lu et al.

ICLR 2025 arXiv:2410.05584
19
citations
#2571

ClearSight: Visual Signal Enhancement for Object Hallucination Mitigation in Multimodal Large Language Models

Hao Yin, Guangzong Si, Zilei Wang

CVPR 2025 arXiv:2503.13107
19
citations
#2572

Unveiling the Ignorance of MLLMs: Seeing Clearly, Answering Incorrectly

Yexin Liu, Zhengyang Liang, Yueze Wang et al.

CVPR 2025 arXiv:2406.10638
19
citations
#2573

Flexible and Efficient Grammar-Constrained Decoding

Kanghee Park, Timothy Zhou, Loris D'Antoni

ICML 2025 arXiv:2502.05111
19
citations
#2574

Emoji Attack: Enhancing Jailbreak Attacks Against Judge LLM Detection

Zhipeng Wei, Yuqi Liu, N. Benjamin Erichson

ICML 2025 arXiv:2411.01077
19
citations
#2575

ITFormer: Bridging Time Series and Natural Language for Multi-Modal QA with Large-Scale Multitask Dataset

Yilin Wang, Peixuan Lei, Jie Song et al.

ICML 2025 oral arXiv:2506.20093
19
citations
#2576

Influence-Guided Diffusion for Dataset Distillation

Mingyang Chen, Jiawei Du, Bo Huang et al.

ICLR 2025
19
citations
#2577

Wavelet Diffusion Neural Operator

Peiyan Hu, Rui Wang, Xiang Zheng et al.

ICLR 2025 arXiv:2412.04833
19
citations
#2578

T2SG: Traffic Topology Scene Graph for Topology Reasoning in Autonomous Driving

Changsheng Lv, Mengshi Qi, Liang Liu et al.

CVPR 2025 arXiv:2411.18894
19
citations
#2579

Online Reward-Weighted Fine-Tuning of Flow Matching with Wasserstein Regularization

Jiajun Fan, Shuaike Shen, Chaoran Cheng et al.

ICLR 2025 arXiv:2502.06061
19
citations
#2580

IterComp: Iterative Composition-Aware Feedback Learning from Model Gallery for Text-to-Image Generation

Xinchen Zhang, Ling Yang, Guohao Li et al.

ICLR 2025 arXiv:2410.07171
19
citations
#2581

MMAD: A Comprehensive Benchmark for Multimodal Large Language Models in Industrial Anomaly Detection

Xi Jiang, Jian Li, Hanqiu Deng et al.

ICLR 2025 arXiv:2410.09453
18
citations
#2582

Perm: A Parametric Representation for Multi-Style 3D Hair Modeling

Chengan He, Xin Sun, Zhixin Shu et al.

ICLR 2025 arXiv:2407.19451
18
citations
#2583

Score as Action: Fine Tuning Diffusion Generative Models by Continuous-time Reinforcement Learning

Hanyang Zhao, Haoxian Chen, Ji Zhang et al.

ICML 2025 arXiv:2502.01819
18
citations
#2584

LangTime: A Language-Guided Unified Model for Time Series Forecasting with Proximal Policy Optimization

Wenzhe Niu, Zongxia Xie, Yanru Sun et al.

ICML 2025 oral arXiv:2503.08271
18
citations
#2585

FINECAPTION: Compositional Image Captioning Focusing on Wherever You Want at Any Granularity

Hang Hua, Qing Liu, Lingzhi Zhang et al.

CVPR 2025 arXiv:2411.15411
18
citations
#2586

Aioli: A Unified Optimization Framework for Language Model Data Mixing

Mayee Chen, Michael Hu, Nicholas Lourie et al.

ICLR 2025 arXiv:2411.05735
18
citations
#2587

Unveiling Visual Perception in Language Models: An Attention Head Analysis Approach

Jing Bi, Lianggong Bruce Wen, Zhang Liu et al.

CVPR 2025 arXiv:2412.18108
18
citations
#2588

4D-VLA: Spatiotemporal Vision-Language-Action Pretraining with Cross-Scene Calibration

Jiahui Zhang, Yurui Chen, Yueming Xu et al.

NEURIPS 2025 oral arXiv:2506.22242
18
citations
#2589

Baking Gaussian Splatting into Diffusion Denoiser for Fast and Scalable Single-stage Image-to-3D Generation and Reconstruction

Yuanhao Cai, He Zhang, Kai Zhang et al.

ICCV 2025 arXiv:2411.14384
18
citations
#2590

RoboScape: Physics-informed Embodied World Model

Yu Shang, Xin Zhang, Yinzhou Tang et al.

NEURIPS 2025 oral arXiv:2506.23135
18
citations
#2591

S³cMath: Spontaneous Step-Level Self-Correction Makes Large Language Models Better Mathematical Reasoners

Yuchen Yan, Jin Jiang, Yang Liu et al.

AAAI 2025 paper arXiv:2409.01524
18
citations
#2592

Highly Efficient Self-Adaptive Reward Shaping for Reinforcement Learning

Haozhe Ma, Zhengding Luo, Thanh Vinh Vo et al.

ICLR 2025 arXiv:2408.03029
18
citations
#2593

Decompositional Neural Scene Reconstruction with Generative Diffusion Prior

Junfeng Ni, Yu Liu, Ruijie Lu et al.

CVPR 2025 arXiv:2503.14830
18
citations
#2594

Digital Twin Catalog: A Large-Scale Photorealistic 3D Object Digital Twin Dataset

Zhao Dong, Ka Chen, Zhaoyang Lv et al.

CVPR 2025 highlight arXiv:2504.08541
18
citations
#2595

Is Artificial Intelligence Generated Image Detection a Solved Problem?

Ziqiang Li, Jiazhen Yan, Ziwen He et al.

NEURIPS 2025 arXiv:2505.12335
18
citations
#2596

Incentivizing Dual Process Thinking for Efficient Large Language Model Reasoning

Xiaoxue Cheng, Junyi Li, Zhenduo Zhang et al.

NEURIPS 2025 arXiv:2505.16315
18
citations
#2597

VideoShield: Regulating Diffusion-based Video Generation Models via Watermarking

Runyi Hu, Jie Zhang, Yiming Li et al.

ICLR 2025 oral arXiv:2501.14195
18
citations
#2598

Exploring the Limits of Vision-Language-Action Manipulation in Cross-task Generalization

Jiaming Zhou, Ke Ye, Jiayi Liu et al.

NEURIPS 2025 arXiv:2505.15660
18
citations
#2599

Adversarial Reasoning at Jailbreaking Time

Mahdi Sabbaghi, Paul Kassianik, George Pappas et al.

ICML 2025 arXiv:2502.01633
18
citations
#2600

Navigating Neural Space: Revisiting Concept Activation Vectors to Overcome Directional Divergence

Frederik Pahde, Maximilian Dreyer, Moritz Weckbecker et al.

ICLR 2025 arXiv:2202.03482
18
citations