Most Cited ICLR "object-centric abstractions" Papers

6,124 papers found • Page 2 of 31

#201

On the self-verification limitations of large language models on reasoning and planning tasks

Kaya Stechly, Karthik Valmeekam, Subbarao Kambhampati

ICLR 2025 • arXiv:2402.08115
100
citations
#202

RegMix: Data Mixture as Regression for Language Model Pre-training

Qian Liu, Xiaosen Zheng, Niklas Muennighoff et al.

ICLR 2025 • arXiv:2407.01492
100
citations
#203

Feast Your Eyes: Mixture-of-Resolution Adaptation for Multimodal Large Language Models

Gen Luo, Yiyi Zhou, Yuxin Zhang et al.

ICLR 2025 • arXiv:2403.03003
100
citations
#204

RM-Bench: Benchmarking Reward Models of Language Models with Subtlety and Style

Yantao Liu, Zijun Yao, Rui Min et al.

ICLR 2025 • arXiv:2410.16184
100
citations
#205

LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs

Yushi Bai, Jiajie Zhang, Xin Lv et al.

ICLR 2025 • arXiv:2408.07055
100
citations
#206

Semantic Image Inversion and Editing using Rectified Stochastic Differential Equations

Litu Rout, Yujia Chen, Nataniel Ruiz et al.

ICLR 2025 • arXiv:2410.10792
99
citations
#207

UniversalNER: Targeted Distillation from Large Language Models for Open Named Entity Recognition

Wenxuan Zhou, Sheng Zhang, Yu Gu et al.

ICLR 2024 • arXiv:2308.03279
98
citations
#208

Physics of Language Models: Part 2.1, Grade-School Math and the Hidden Reasoning Process

Tian Ye, Zicheng Xu, Yuanzhi Li et al.

ICLR 2025 • arXiv:2407.20311
98
citations
#209

Fine-Tuning Enhances Existing Mechanisms: A Case Study on Entity Tracking

Nikhil Prakash, Tamar Shaham, Tal Haklay et al.

ICLR 2024 • arXiv:2402.14811
97
citations
#210

DFormer: Rethinking RGBD Representation Learning for Semantic Segmentation

Bowen Yin, Xuying Zhang, Zhong-Yu Li et al.

ICLR 2024 • arXiv:2309.09668
96
citations
#211

Rethinking Model Ensemble in Transfer-based Adversarial Attacks

Huanran Chen, Yichi Zhang, Yinpeng Dong et al.

ICLR 2024 • arXiv:2303.09105
96
citations
#212

HyperAttention: Long-context Attention in Near-Linear Time

Insu Han, Rajesh Jayaram, Amin Karbasi et al.

ICLR 2024 • arXiv:2310.05869
94
citations
#213

CBraMod: A Criss-Cross Brain Foundation Model for EEG Decoding

Jiquan Wang, Sha Zhao, Zhiling Luo et al.

ICLR 2025 (oral) • arXiv:2412.07236
93
citations
#214

Noise-free Score Distillation

Oren Katzir, Or Patashnik, Daniel Cohen-Or et al.

ICLR 2024 • arXiv:2310.17590
93
citations
#215

Copilot4D: Learning Unsupervised World Models for Autonomous Driving via Discrete Diffusion

Lunjun Zhang, Yuwen Xiong, Ze Yang et al.

ICLR 2024 • arXiv:2311.01017
92
citations
#216

Decoding Natural Images from EEG for Object Recognition

Yonghao Song, Bingchuan Liu, Xiang Li et al.

ICLR 2024 (oral) • arXiv:2308.13234
92
citations
#217

Consistency-guided Prompt Learning for Vision-Language Models

Shuvendu Roy, Ali Etemad

ICLR 2024 • arXiv:2306.01195
91
citations
#218

DreamBench++: A Human-Aligned Benchmark for Personalized Image Generation

Yuang Peng, Yuxin Cui, Haomiao Tang et al.

ICLR 2025 • arXiv:2406.16855
91
citations
#219

Deconstructing Denoising Diffusion Models for Self-Supervised Learning

Xinlei Chen, Zhuang Liu, Saining Xie et al.

ICLR 2025 • arXiv:2401.14404
91
citations
#220

ColPali: Efficient Document Retrieval with Vision Language Models

Manuel Faysse, Hugues Sibille, Tony Wu et al.

ICLR 2025 • arXiv:2407.01449
91
citations
#221

SVDQuant: Absorbing Outliers by Low-Rank Component for 4-Bit Diffusion Models

Muyang Li, Yujun Lin, Zhekai Zhang et al.

ICLR 2025 • arXiv:2411.05007
90
citations
#222

When Attention Sink Emerges in Language Models: An Empirical View

Xiangming Gu, Tianyu Pang, Chao Du et al.

ICLR 2025 • arXiv:2410.10781
90
citations
#223

LVSM: A Large View Synthesis Model with Minimal 3D Inductive Bias

Haian Jin, Hanwen Jiang, Hao Tan et al.

ICLR 2025 • arXiv:2410.17242
90
citations
#224

Brain decoding: toward real-time reconstruction of visual perception

Yohann Benchetrit, Hubert Banville, Jean-Remi King

ICLR 2024 (oral) • arXiv:2310.19812
90
citations
#225

At Which Training Stage Does Code Data Help LLMs Reasoning?

Ma Yingwei, Yue Liu, Yue Yu et al.

ICLR 2024 (spotlight) • arXiv:2309.16298
90
citations
#226

Loopy: Taming Audio-Driven Portrait Avatar with Long-Term Motion Dependency

Jianwen Jiang, Chao Liang, Jiaqi Yang et al.

ICLR 2025 (oral) • arXiv:2409.02634
89
citations
#227

OmniEdit: Building Image Editing Generalist Models Through Specialist Supervision

Cong Wei, Zheyang Xiong, Weiming Ren et al.

ICLR 2025 • arXiv:2411.07199
89
citations
#228

Not All Language Model Features Are One-Dimensionally Linear

Josh Engels, Eric Michaud, Isaac Liao et al.

ICLR 2025 • arXiv:2405.14860
89
citations
#229

Mechanistically analyzing the effects of fine-tuning on procedurally defined tasks

Samyak Jain, Robert Kirk, Ekdeep Singh Lubana et al.

ICLR 2024 • arXiv:2311.12786
89
citations
#230

Training Socially Aligned Language Models on Simulated Social Interactions

Ruibo Liu, Ruixin Yang, Chenyan Jia et al.

ICLR 2024 • arXiv:2305.16960
88
citations
#231

Improved sampling via learned diffusions

Lorenz Richter, Julius Berner

ICLR 2024 • arXiv:2307.01198
88
citations
#232

Kolmogorov-Arnold Transformer

Xingyi Yang, Xinchao Wang

ICLR 2025 • arXiv:2409.10594
88
citations
#233

LiveBench: A Challenging, Contamination-Limited LLM Benchmark

Colin White, Samuel Dooley, Manley Roberts et al.

ICLR 2025 • arXiv:2406.19314
87
citations
#234

SILO Language Models: Isolating Legal Risk In a Nonparametric Datastore

Sewon Min, Suchin Gururangan, Eric Wallace et al.

ICLR 2024 (spotlight) • arXiv:2308.04430
87
citations
#235

AdaIR: Adaptive All-in-One Image Restoration via Frequency Mining and Modulation

Yuning Cui, Syed Waqas Zamir, Salman Khan et al.

ICLR 2025 • arXiv:2403.14614
86
citations
#236

Making Text Embedders Few-Shot Learners

Chaofan Li, Minghao Qin, Shitao Xiao et al.

ICLR 2025 • arXiv:2409.15700
86
citations
#237

Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want

Weifeng Lin, Xinyu Wei, Ruichuan An et al.

ICLR 2025 • arXiv:2403.20271
86
citations
#238

TimeMixer++: A General Time Series Pattern Machine for Universal Predictive Analysis

Shiyu Wang, Jiawei Li, Xiaoming Shi et al.

ICLR 2025 (oral) • arXiv:2410.16032
85
citations
#239

KoLA: Carefully Benchmarking World Knowledge of Large Language Models

Jifan Yu, Xiaozhi Wang, Shangqing Tu et al.

ICLR 2024 • arXiv:2306.09296
85
citations
#240

Finetuning Text-to-Image Diffusion Models for Fairness

Xudong Shen, Chao Du, Tianyu Pang et al.

ICLR 2024 • arXiv:2311.07604
85
citations
#241

How Many Pretraining Tasks Are Needed for In-Context Learning of Linear Regression?

Jingfeng Wu, Difan Zou, Zixiang Chen et al.

ICLR 2024 (spotlight) • arXiv:2310.08391
85
citations
#242

Predictive Inverse Dynamics Models are Scalable Learners for Robotic Manipulation

Yang Tian, Sizhe Yang, Jia Zeng et al.

ICLR 2025 • arXiv:2412.15109
85
citations
#243

Vision-LSTM: xLSTM as Generic Vision Backbone

Benedikt Alkin, Maximilian Beck, Korbinian Pöppel et al.

ICLR 2025 • arXiv:2406.04303
85
citations
#244

The Hedgehog & the Porcupine: Expressive Linear Attentions with Softmax Mimicry

Michael Zhang, Kush Bhatia, Hermann Kumbong et al.

ICLR 2024 • arXiv:2402.04347
84
citations
#245

Training-free Camera Control for Video Generation

Chen Hou, Zhibo Chen

ICLR 2025 • arXiv:2406.10126
84
citations
#246

Robust Watermarking Using Generative Priors Against Image Editing: From Benchmarking to Advances

Shilin Lu, Zihan Zhou, Jiayou Lu et al.

ICLR 2025 • arXiv:2410.18775
84
citations
#247

AHA: A Vision-Language-Model for Detecting and Reasoning Over Failures in Robotic Manipulation

Jiafei Duan, Wilbert Pumacay, Nishanth Kumar et al.

ICLR 2025 • arXiv:2410.00371
84
citations
#248

Human Feedback is not Gold Standard

Tom Hosking, Phil Blunsom, Max Bartolo

ICLR 2024 • arXiv:2309.16349
83
citations
#249

Detecting, Explaining, and Mitigating Memorization in Diffusion Models

Yuxin Wen, Yuchen Liu, Chen Chen et al.

ICLR 2024 • arXiv:2407.21720
83
citations
#250

Turning Up the Heat: Min-p Sampling for Creative and Coherent LLM Outputs

Minh Nguyen, Andrew Baker, Clement Neo et al.

ICLR 2025 • arXiv:2407.01082
82
citations
#251

Soft Merging of Experts with Adaptive Routing

Haokun Liu, Muqeeth Mohammed, Colin Raffel

ICLR 2025 • arXiv:2306.03745
82
citations
#252

DynaMath: A Dynamic Visual Benchmark for Evaluating Mathematical Reasoning Robustness of Vision Language Models

Chengke Zou, Xingang Guo, Rui Yang et al.

ICLR 2025 • arXiv:2411.00836
82
citations
#253

Unlocking Guidance for Discrete State-Space Diffusion and Flow Models

Hunter Nisonoff, Junhao Xiong, Stephan Allenspach et al.

ICLR 2025 • arXiv:2406.01572
82
citations
#254

Consistency Models Made Easy

Zhengyang Geng, Ashwini Pokle, Weijian Luo et al.

ICLR 2025 • arXiv:2406.14548
81
citations
#255

Batch Calibration: Rethinking Calibration for In-Context Learning and Prompt Engineering

Han Zhou, Xingchen Wan, Lev Proleev et al.

ICLR 2024 • arXiv:2309.17249
81
citations
#256

AntGPT: Can Large Language Models Help Long-term Action Anticipation from Videos?

Qi Zhao, Shijie Wang, Ce Zhang et al.

ICLR 2024 (oral) • arXiv:2307.16368
81
citations
#257

MLLMs Know Where to Look: Training-free Perception of Small Visual Details with Multimodal LLMs

Jiarui Zhang, Mahyar Khayatkhoei, Prateek Chhikara et al.

ICLR 2025 • arXiv:2502.17422
81
citations
#258

In-Context Pretraining: Language Modeling Beyond Document Boundaries

Weijia Shi, Sewon Min, Maria Lomeli et al.

ICLR 2024 (spotlight) • arXiv:2310.10638
81
citations
#259

BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models

Zhen Xiang, Fengqing Jiang, Zidi Xiong et al.

ICLR 2024 • arXiv:2401.12242
80
citations
#260

CFG++: Manifold-constrained Classifier Free Guidance for Diffusion Models

Hyungjin Chung, Jeongsol Kim, Geon Yeong Park et al.

ICLR 2025 • arXiv:2406.08070
80
citations
#261

PB-LLM: Partially Binarized Large Language Models

Zhihang Yuan, Yuzhang Shang, Zhen Dong

ICLR 2024 • arXiv:2310.00034
80
citations
#262

Real-Time Video Generation with Pyramid Attention Broadcast

Xuanlei Zhao, Xiaolong Jin, Kai Wang et al.

ICLR 2025 • arXiv:2408.12588
79
citations
#263

A Benchmark for Learning to Translate a New Language from One Grammar Book

Garrett Tanzer, Mirac Suzgun, Eline Visser et al.

ICLR 2024 (spotlight) • arXiv:2309.16575
79
citations
#264

Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting

Zilong (Ryan) Wang, Zifeng Wang, Long Le et al.

ICLR 2025 • arXiv:2407.08223
78
citations
#265

Towards Foundation Models for Knowledge Graph Reasoning

Mikhail Galkin, Xinyu Yuan, Hesham Mostafa et al.

ICLR 2024 • arXiv:2310.04562
78
citations
#266

MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs

Sheng-Chieh Lin, Chankyu Lee, Mohammad Shoeybi et al.

ICLR 2025 • arXiv:2411.02571
78
citations
#267

DreamTime: An Improved Optimization Strategy for Diffusion-Guided 3D Generation

Yukun Huang, Jianan Wang, Yukai Shi et al.

ICLR 2024 • arXiv:2306.12422
78
citations
#268

RoboCat: A Self-Improving Generalist Agent for Robotic Manipulation

Sergio Gómez Colmenarejo, Jost Springenberg, Jose Enrique Chen et al.

ICLR 2025
78
citations
#269

Escape Sky-high Cost: Early-stopping Self-Consistency for Multi-step Reasoning

Yiwei Li, Peiwen Yuan, Shaoxiong Feng et al.

ICLR 2024 • arXiv:2401.10480
78
citations
#270

Amortizing intractable inference in large language models

Edward Hu, Moksh Jain, Eric Elmoznino et al.

ICLR 2024 • arXiv:2310.04363
78
citations
#271

Language models scale reliably with over-training and on downstream tasks

Samir Yitzhak Gadre, Georgios Smyrnis, Vaishaal Shankar et al.

ICLR 2025 • arXiv:2403.08540
77
citations
#272

Curiosity-driven Red-teaming for Large Language Models

Zhang-Wei Hong, Idan Shenfeld, Johnson (Tsun-Hsuan) Wang et al.

ICLR 2024 • arXiv:2402.19464
77
citations
#273

Do I Know This Entity? Knowledge Awareness and Hallucinations in Language Models

Javier Ferrando, Oscar Obeso, Senthooran Rajamanoharan et al.

ICLR 2025 • arXiv:2411.14257
77
citations
#274

Large Multilingual Models Pivot Zero-Shot Multimodal Learning across Languages

Jinyi Hu, Yuan Yao, Chongyi Wang et al.

ICLR 2024 (spotlight) • arXiv:2308.12038
77
citations
#275

GraphRouter: A Graph-based Router for LLM Selections

Tao Feng, Yanzhen Shen, Jiaxuan You

ICLR 2025 • arXiv:2410.03834
77
citations
#276

Generalization v.s. Memorization: Tracing Language Models’ Capabilities Back to Pretraining Data

Xinyi Wang, Antonis Antoniades, Yanai Elazar et al.

ICLR 2025 • arXiv:2407.14985
77
citations
#277

Dissecting Adversarial Robustness of Multimodal LM Agents

Chen Wu, Rishi Shah, Jing Yu Koh et al.

ICLR 2025 • arXiv:2406.12814
76
citations
#278

Eliciting Human Preferences with Language Models

Belinda Li, Alex Tamkin, Noah Goodman et al.

ICLR 2025 (oral) • arXiv:2310.11589
76
citations
#279

LLM-grounded Video Diffusion Models

Long Lian, Baifeng Shi, Adam Yala et al.

ICLR 2024 (oral) • arXiv:2309.17444
76
citations
#280

Diffusion-Based Planning for Autonomous Driving with Flexible Guidance

Yinan Zheng, Ruiming Liang, Kexin Zheng et al.

ICLR 2025 • arXiv:2501.15564
75
citations
#281

Beyond Autoregression: Discrete Diffusion for Complex Reasoning and Planning

Jiacheng Ye, Jiahui Gao, Shansan Gong et al.

ICLR 2025 • arXiv:2410.14157
75
citations
#282

FakeShield: Explainable Image Forgery Detection and Localization via Multi-modal Large Language Models

Zhipei Xu, Xuanyu Zhang, Runyi Li et al.

ICLR 2025 • arXiv:2410.02761
75
citations
#283

Multiscale Positive-Unlabeled Detection of AI-Generated Texts

Yuchuan Tian, Hanting Chen, Xutao Wang et al.

ICLR 2024 (spotlight) • arXiv:2305.18149
74
citations
#284

Robustness of AI-Image Detectors: Fundamental Limits and Practical Attacks

Mehrdad Saberi, Vinu Sankar Sadasivan, Keivan Rezaei et al.

ICLR 2024 • arXiv:2310.00076
74
citations
#285

InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales

Zhepei Wei, Wei-Lin Chen, Yu Meng

ICLR 2025 • arXiv:2406.13629
74
citations
#286

Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis

Ziyue Jiang, Jinglin Liu, Yi Ren et al.

ICLR 2024 • arXiv:2307.07218
74
citations
#287

Improved Techniques for Optimization-Based Jailbreaking on Large Language Models

Xiaojun Jia, Tianyu Pang, Chao Du et al.

ICLR 2025 • arXiv:2405.21018
74
citations
#288

MAVIS: Mathematical Visual Instruction Tuning with an Automatic Data Engine

Renrui Zhang, Xinyu Wei, Dongzhi Jiang et al.

ICLR 2025 • arXiv:2407.08739
74
citations
#289

MMTEB: Massive Multilingual Text Embedding Benchmark

Kenneth Enevoldsen, Isaac Chung, Imene Kerboua et al.

ICLR 2025 • arXiv:2502.13595
74
citations
#290

OGBench: Benchmarking Offline Goal-Conditioned RL

Seohong Park, Kevin Frans, Benjamin Eysenbach et al.

ICLR 2025 • arXiv:2410.20092
74
citations
#291

Confronting Reward Model Overoptimization with Constrained RLHF

Ted Moskovitz, Aaditya Singh, DJ Strouse et al.

ICLR 2024 (spotlight) • arXiv:2310.04373
73
citations
#292

Towards 3D Molecule-Text Interpretation in Language Models

Sihang Li, Zhiyuan Liu, Yanchen Luo et al.

ICLR 2024 • arXiv:2401.13923
73
citations
#293

MaskBit: Embedding-free Image Generation via Bit Tokens

Mark Weber, Lijun Yu, Qihang Yu et al.

ICLR 2025 • arXiv:2409.16211
73
citations
#294

Language Models Learn to Mislead Humans via RLHF

Jiaxin Wen, Ruiqi Zhong, Akbir Khan et al.

ICLR 2025 • arXiv:2409.12822
73
citations
#295

Elucidating the Exposure Bias in Diffusion Models

Mang Ning, Mingxiao Li, Jianlin Su et al.

ICLR 2024 • arXiv:2308.15321
72
citations
#296

Planning in Natural Language Improves LLM Search for Code Generation

Evan Wang, Federico Cassano, Catherine Wu et al.

ICLR 2025 • arXiv:2409.03733
72
citations
#297

MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine

Yunfei Xie, Ce Zhou, Lang Gao et al.

ICLR 2025 • arXiv:2408.02900
72
citations
#298

BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games

Davide Paglieri, Bartłomiej Cupiał, Samuel Coward et al.

ICLR 2025 • arXiv:2411.13543
70
citations
#299

Programming Refusal with Conditional Activation Steering

Bruce W. Lee, Inkit Padhi, Karthikeyan Natesan Ramamurthy et al.

ICLR 2025 • arXiv:2409.05907
70
citations
#300

Fine-tuning can cripple your foundation model; preserving features may be the solution

Philip Torr, Puneet Dokania, Jishnu Mukhoti et al.

ICLR 2025 • arXiv:2308.13320
70
citations
#301

PromptTTS 2: Describing and Generating Voices with Text Prompt

Yichong Leng, Zhifang Guo, Kai Shen et al.

ICLR 2024 • arXiv:2309.02285
70
citations
#302

Weak to Strong Generalization for Large Language Models with Multi-capabilities

Yucheng Zhou, Jianbing Shen, Yu Cheng

ICLR 2025
70
citations
#303

Accelerating Diffusion Transformers with Token-wise Feature Caching

Chang Zou, Xuyang Liu, Ting Liu et al.

ICLR 2025 • arXiv:2410.05317
69
citations
#304

SolidGen: An Autoregressive Model for Direct B-rep Synthesis

Karl Willis, Joseph Lambourne, Nigel Morris et al.

ICLR 2024
69
citations
#305

EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Diffusion Models

Yefei He, Jing Liu, Weijia Wu et al.

ICLR 2024 (oral) • arXiv:2310.03270
69
citations
#306

Learning to Act without Actions

Dominik Schmidt, Minqi Jiang

ICLR 2024 (oral) • arXiv:2312.10812
69
citations
#307

Improving Text-to-Image Consistency via Automatic Prompt Optimization

Melissa Hall, Michal Drozdzal, Oscar Mañas et al.

ICLR 2025 • arXiv:2403.17804
69
citations
#308

METRA: Scalable Unsupervised RL with Metric-Aware Abstraction

Seohong Park, Oleh Rybkin, Sergey Levine

ICLR 2024 (oral) • arXiv:2310.08887
68
citations
#309

On the Learnability of Watermarks for Language Models

Chenchen Gu, Xiang Li, Percy Liang et al.

ICLR 2024 • arXiv:2312.04469
68
citations
#310

DriveTransformer: Unified Transformer for Scalable End-to-End Autonomous Driving

Xiaosong Jia, Junqi You, Zhiyuan Zhang et al.

ICLR 2025 (oral) • arXiv:2503.07656
67
citations
#311

VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents

Xiao Liu, Tianjie Zhang, Yu Gu et al.

ICLR 2025 • arXiv:2408.06327
67
citations
#312

CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models

Zheng Chong, Xiao Dong, Haoxiang Li et al.

ICLR 2025 • arXiv:2407.15886
67
citations
#313

HAMSTER: Hierarchical Action Models for Open-World Robot Manipulation

Yi Li, Yuquan Deng, Jesse Zhang et al.

ICLR 2025 • arXiv:2502.05485
67
citations
#314

Does Refusal Training in LLMs Generalize to the Past Tense?

Maksym Andriushchenko, Nicolas Flammarion

ICLR 2025 • arXiv:2407.11969
66
citations
#315

AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents

Ke Yang, Yao Liu, Sapana Chaudhary et al.

ICLR 2025 • arXiv:2410.13825
66
citations
#316

Scaling Laws for Precision

Tanishq Kumar, Zachary Ankner, Benjamin Spector et al.

ICLR 2025 • arXiv:2411.04330
66
citations
#317

Deep Temporal Graph Clustering

Meng Liu, Yue Liu, Ke Liang et al.

ICLR 2024 (oral) • arXiv:2305.10738
66
citations
#318

Multi-Source Diffusion Models for Simultaneous Music Generation and Separation

Giorgio Mariani, Irene Tallini, Emilian Postolache et al.

ICLR 2024 • arXiv:2302.02257
65
citations
#319

Self-Introspective Decoding: Alleviating Hallucinations for Large Vision-Language Models

Fushuo Huo, Wenchao Xu, Zhong Zhang et al.

ICLR 2025 • arXiv:2408.02032
65
citations
#320

ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation

Cheng Yang, Chufan Shi, Yaxin Liu et al.

ICLR 2025 • arXiv:2406.09961
65
citations
#321

CycleResearcher: Improving Automated Research via Automated Review

Yixuan Weng, Minjun Zhu, Guangsheng Bao et al.

ICLR 2025 • arXiv:2411.00816
65
citations
#322

Reasoning with Latent Thoughts: On the Power of Looped Transformers

Nikunj Saunshi, Nishanth Dikkala, Zhiyuan Li et al.

ICLR 2025 • arXiv:2502.17416
65
citations
#323

MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequences

Canyu Zhao, Mingyu Liu, Wen Wang et al.

ICLR 2025 • arXiv:2407.16655
64
citations
#324

MagicPIG: LSH Sampling for Efficient LLM Generation

Zhuoming Chen, Ranajoy Sadhukhan, Zihao Ye et al.

ICLR 2025 • arXiv:2410.16179
64
citations
#325

Monte Carlo guided Denoising Diffusion models for Bayesian linear inverse problems

Gabriel Cardoso, Yazid Janati el idrissi, Sylvain Le Corff et al.

ICLR 2024
63
citations
#326

FreDF: Learning to Forecast in the Frequency Domain

Hao Wang, Lichen Pan, Yuan Shen et al.

ICLR 2025 • arXiv:2402.02399
63
citations
#327

Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling

Hritik Bansal, Arian Hosseini, Rishabh Agarwal et al.

ICLR 2025 • arXiv:2408.16737
63
citations
#328

Evaluating the Zero-shot Robustness of Instruction-tuned Language Models

Jiuding Sun, Chantal Shaib, Byron Wallace

ICLR 2024 (spotlight) • arXiv:2306.11270
63
citations
#329

ImageFolder: Autoregressive Image Generation with Folded Tokens

Xiang Li, Kai Qiu, Hao Chen et al.

ICLR 2025 • arXiv:2410.01756
63
citations
#330

Towards Principled Evaluations of Sparse Autoencoders for Interpretability and Control

Aleksandar Makelov, Georg Lange, Neel Nanda

ICLR 2025 • arXiv:2405.08366
63
citations
#331

Arithmetic Without Algorithms: Language Models Solve Math with a Bag of Heuristics

Yaniv Nikankin, Anja Reusch, Aaron Mueller et al.

ICLR 2025 • arXiv:2410.21272
63
citations
#332

Grokking as the transition from lazy to rich training dynamics

Tanishq Kumar, Blake Bordelon, Samuel Gershman et al.

ICLR 2024 • arXiv:2310.06110
63
citations
#333

DSBench: How Far Are Data Science Agents from Becoming Data Science Experts?

Liqiang Jing, Zhehui Huang, Xiaoyang Wang et al.

ICLR 2025 • arXiv:2409.07703
62
citations
#334

Simple Guidance Mechanisms for Discrete Diffusion Models

Yair Schiff, Subham Sahoo, Hao Phung et al.

ICLR 2025 • arXiv:2412.10193
62
citations
#335

MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding

Ranajoy Sadhukhan, Jian Chen, Zhuoming Chen et al.

ICLR 2025 • arXiv:2408.11049
61
citations
#336

Scaling for Training Time and Post-hoc Out-of-distribution Detection Enhancement

Kai Xu, Rongyu Chen, Gianni Franchi et al.

ICLR 2024 • arXiv:2310.00227
61
citations
#337

Learning Dynamics of LLM Finetuning

Yi Ren, Danica Sutherland

ICLR 2025 • arXiv:2407.10490
61
citations
#338

Animate-X: Universal Character Image Animation with Enhanced Motion Representation

Shuai Tan, Biao Gong, Xiang Wang et al.

ICLR 2025 (oral) • arXiv:2410.10306
60
citations
#339

LaneSegNet: Map Learning with Lane Segment Perception for Autonomous Driving

Tianyu Li, Peijin Jia, Bangjun Wang et al.

ICLR 2024 • arXiv:2312.16108
60
citations
#340

Space Group Constrained Crystal Generation

Rui Jiao, Wenbing Huang, Yu Liu et al.

ICLR 2024 • arXiv:2402.03992
60
citations
#341

SafeDiffuser: Safe Planning with Diffusion Probabilistic Models

Wei Xiao, Johnson (Tsun-Hsuan) Wang, Chuang Gan et al.

ICLR 2025 • arXiv:2306.00148
60
citations
#342

Image and Video Tokenization with Binary Spherical Quantization

Yue Zhao, Yuanjun Xiong, Philipp Krähenbühl

ICLR 2025 • arXiv:2406.07548
60
citations
#343

Ground-A-Video: Zero-shot Grounded Video Editing using Text-to-image Diffusion Models

Hyeonho Jeong, Jong Chul Ye

ICLR 2024 (oral) • arXiv:2310.01107
60
citations
#344

Toward effective protection against diffusion-based mimicry through score distillation

Haotian Xue, Chumeng Liang, Xiaoyu Wu et al.

ICLR 2024 • arXiv:2311.12832
60
citations
#345

Geographic Location Encoding with Spherical Harmonics and Sinusoidal Representation Networks

Marc Rußwurm, Konstantin Klemmer, Esther Rolf et al.

ICLR 2024 (spotlight) • arXiv:2310.06743
59
citations
#346

The Blessing of Randomness: SDE Beats ODE in General Diffusion-based Image Editing

Shen Nie, Hanzhong Guo, Cheng Lu et al.

ICLR 2024 • arXiv:2311.01410
59
citations
#347

Repetition Improves Language Model Embeddings

Jacob Springer, Suhas Kotha, Daniel Fried et al.

ICLR 2025 • arXiv:2402.15449
59
citations
#348

See What You Are Told: Visual Attention Sink in Large Multimodal Models

Seil Kang, Jinyeong Kim, Junhyeok Kim et al.

ICLR 2025 • arXiv:2503.03321
59
citations
#349

Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturbation

Tiansheng Huang, Sihao Hu, Fatih Ilhan et al.

ICLR 2025 • arXiv:2409.01586
59
citations
#350

Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation

Hyungjoo Chae, Namyoung Kim, Kai Ong et al.

ICLR 2025 • arXiv:2410.13232
59
citations
#351

AI Sandbagging: Language Models can Strategically Underperform on Evaluations

Teun van der Weij, Felix Hofstätter, Oliver Jaffe et al.

ICLR 2025 • arXiv:2406.07358
59
citations
#352

RazorAttention: Efficient KV Cache Compression Through Retrieval Heads

Hanlin Tang, Yang Lin, Jing Lin et al.

ICLR 2025 • arXiv:2407.15891
59
citations
#353

CATCH: Channel-Aware Multivariate Time Series Anomaly Detection via Frequency Patching

Xingjian Wu, Xiangfei Qiu, Zhengyu Li et al.

ICLR 2025 • arXiv:2410.12261
59
citations
#354

Matryoshka Multimodal Models

Mu Cai, Jianwei Yang, Jianfeng Gao et al.

ICLR 2025 • arXiv:2405.17430
58
citations
#355

Towards Semantic Equivalence of Tokenization in Multimodal LLM

Shengqiong Wu, Hao Fei, Xiangtai Li et al.

ICLR 2025 • arXiv:2406.05127
58
citations
#356

Scaling Transformers for Low-Bitrate High-Quality Speech Coding

Julian Parker, Anton Smirnov, Jordi Pons et al.

ICLR 2025 • arXiv:2411.19842
57
citations
#357

Language Model Inversion

John X. Morris, Wenting Zhao, Justin Chiu et al.

ICLR 2024 • arXiv:2311.13647
57
citations
#358

Magnushammer: A Transformer-Based Approach to Premise Selection

Maciej Mikuła, Szymon Tworkowski, Szymon Antoniak et al.

ICLR 2024 • arXiv:2303.04488
57
citations
#359

Simple is Effective: The Roles of Graphs and Large Language Models in Knowledge-Graph-Based Retrieval-Augmented Generation

Mufei Li, Siqi Miao, Pan Li

ICLR 2025 • arXiv:2410.20724
57
citations
#360

Cut the Crap: An Economical Communication Pipeline for LLM-based Multi-Agent Systems

Guibin Zhang, Yanwei Yue, Zhixun Li et al.

ICLR 2025 (oral) • arXiv:2410.02506
56
citations
#361

BEND: Benchmarking DNA Language Models on Biologically Meaningful Tasks

Frederikke Marin, Felix Teufel, Marc Horlacher et al.

ICLR 2024 • arXiv:2311.12570
56
citations
#362

What Matters When Repurposing Diffusion Models for General Dense Perception Tasks?

Guangkai Xu, Yongtao Ge, Mingyu Liu et al.

ICLR 2025 • arXiv:2403.06090
56
citations
#363

LLM-SR: Scientific Equation Discovery via Programming with Large Language Models

Parshin Shojaee, Kazem Meidani, Shashank Gupta et al.

ICLR 2025 • arXiv:2404.18400
55
citations
#364

Controlling Space and Time with Diffusion Models

Daniel Watson, Saurabh Saxena, Lala Li et al.

ICLR 2025 • arXiv:2407.07860
55
citations
#365

Self-Improvement in Language Models: The Sharpening Mechanism

Audrey Huang, Adam Block, Dylan Foster et al.

ICLR 2025 • arXiv:2412.01951
55
citations
#366

Hymba: A Hybrid-head Architecture for Small Language Models

Xin Dong, Yonggan Fu, Shizhe Diao et al.

ICLR 2025 • arXiv:2411.13676
55
citations
#367

AgentSquare: Automatic LLM Agent Search in Modular Design Space

Yu Shang, Yu Li, Keyu Zhao et al.

ICLR 2025 • arXiv:2410.06153
55
citations
#368

MuSc: Zero-Shot Industrial Anomaly Classification and Segmentation with Mutual Scoring of the Unlabeled Images

Xurui Li, Ziming Huang, Feng Xue et al.

ICLR 2024 • arXiv:2401.16753
55
citations
#369

SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints

Jianhong Bai, Menghan Xia, Xintao Wang et al.

ICLR 2025 • arXiv:2412.07760
55
citations
#370

Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent

Yangning Li, Yinghui Li, Xinyu Wang et al.

ICLR 2025 • arXiv:2411.02937
55
citations
#371

Simplifying Deep Temporal Difference Learning

Matteo Gallici, Mattie Fellows, Benjamin Ellis et al.

ICLR 2025 (oral) • arXiv:2407.04811
55
citations
#372

Progressive3D: Progressively Local Editing for Text-to-3D Content Creation with Complex Semantic Prompts

Xinhua Cheng, Tianyu Yang, Jianan Wang et al.

ICLR 2024 • arXiv:2310.11784
54
citations
#373

Not All Heads Matter: A Head-Level KV Cache Compression Method with Integrated Retrieval and Reasoning

Yu Fu, Zefan Cai, Abedelkadir Asi et al.

ICLR 2025 • arXiv:2410.19258
54
citations
#374

KOR-Bench: Benchmarking Language Models on Knowledge-Orthogonal Reasoning Tasks

Kaijing Ma, Xeron Du, Yunran Wang et al.

ICLR 2025 • arXiv:2410.06526
54
citations
#375

FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality

Zhengyao Lyu, Chenyang Si, Junhao Song et al.

ICLR 2025 (oral) • arXiv:2410.19355
54
citations
#376

UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling

Haoyu Lu, Yuqi Huo, Guoxing Yang et al.

ICLR 2024 • arXiv:2302.06605
54
citations
#377

How to Evaluate Reward Models for RLHF

Evan Frick, Tianle Li, Connor Chen et al.

ICLR 2025 • arXiv:2410.14872
54
citations
#378

MindSearch: Mimicking Human Minds Elicits Deep AI Searcher

Zehui Chen, Kuikun Liu, Qiuchen Wang et al.

ICLR 2025 • arXiv:2407.20183
53
citations
#379

CLoSD: Closing the Loop between Simulation and Diffusion for multi-task character control

Guy Tevet, Sigal Raab, Setareh Cohan et al.

ICLR 2025 • arXiv:2410.03441
53
citations
#380

Tell me about yourself: LLMs are aware of their learned behaviors

Jan Betley, Xuchan Bao, Martín Soto et al.

ICLR 2025 (oral) • arXiv:2501.11120
53
citations
#381

Proteina: Scaling Flow-based Protein Structure Generative Models

Tomas Geffner, Kieran Didi, Zuobai Zhang et al.

ICLR 2025 • arXiv:2503.00710
53
citations
#382

TorchTitan: One-stop PyTorch native solution for production ready LLM pretraining

Wanchao Liang, Tianyu Liu, Less Wright et al.

ICLR 2025
53
citations
#383

In-Context Learning Learns Label Relationships but Is Not Conventional Learning

Jannik Kossen, Yarin Gal, Tom Rainforth

ICLR 2024 • arXiv:2307.12375
53
citations
#384

Towards Realistic UAV Vision-Language Navigation: Platform, Benchmark, and Methodology

Xiangyu Wang, Donglin Yang, Ziqin Wang et al.

ICLR 2025 • arXiv:2410.07087
52
citations
#385

BOND: Aligning LLMs with Best-of-N Distillation

Pier Giuseppe Sessa, Robert Dadashi, Léonard Hussenot-Desenonges et al.

ICLR 2025 • arXiv:2407.14622
52
citations
#386

Physics-Informed Diffusion Models

Jan-Hendrik Bastek, WaiChing Sun, Dennis Kochmann

ICLR 2025 • arXiv:2403.14404
52
citations
#387

RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph

Siru Ouyang, Wenhao Yu, Kaixin Ma et al.

ICLR 2025 • arXiv:2410.14684
52
citations
#388

A Decade's Battle on Dataset Bias: Are We There Yet?

Zhuang Liu, Kaiming He

ICLR 2025 • arXiv:2403.08632
52
citations
#389

Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models

Seyedmorteza Sadat, Otmar Hilliges, Romann Weber

ICLR 2025 • arXiv:2410.02416
52
citations
#390

Does Spatial Cognition Emerge in Frontier Models?

Santhosh Kumar Ramakrishnan, Erik Wijmans, Philipp Krähenbühl et al.

ICLR 2025 • arXiv:2410.06468
51
citations
#391

Intriguing Properties of Generative Classifiers

Priyank Jaini, Kevin Clark, Robert Geirhos

ICLR 2024 (spotlight) • arXiv:2309.16779
51
citations
#392

FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference

Xunhao Lai, Jianqiao Lu, Yao Luo et al.

ICLR 2025 • arXiv:2502.20766
51
citations
#393

Inference Scaling for Long-Context Retrieval Augmented Generation

Zhenrui Yue, Honglei Zhuang, Aijun Bai et al.

ICLR 2025 • arXiv:2410.04343
51
citations
#394

Graph Neural Networks for Learning Equivariant Representations of Neural Networks

Miltiadis (Miltos) Kofinas, Boris Knyazev, Yan Zhang et al.

ICLR 2024 • arXiv:2403.12143
51
citations
#395

AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials

Yiheng Xu, Dunjie Lu, Zhennan Shen et al.

ICLR 2025 • arXiv:2412.09605
50
citations
#396

Model merging with SVD to tie the Knots

George Stoica, Pratik Ramesh, Boglarka Ecsedi et al.

ICLR 2025 • arXiv:2410.19735
50
citations
#397

Energy-Based Diffusion Language Models for Text Generation

Minkai Xu, Tomas Geffner, Karsten Kreis et al.

ICLR 2025 • arXiv:2410.21357
49
citations
#398

AutomaTikZ: Text-Guided Synthesis of Scientific Vector Graphics with TikZ

Jonas Belouadi, Anne Lauscher, Steffen Eger

ICLR 2024 • arXiv:2310.00367
49
citations
#399

From Zero to Turbulence: Generative Modeling for 3D Flow Simulation

Marten Lienen, David Lüdke, Jan Hansen-Palmus et al.

ICLR 2024 • arXiv:2306.01776
49
citations
#400

OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text

Qingyun Li, Zhe Chen, Weiyun Wang et al.

ICLR 2025 • arXiv:2406.08418
49
citations