Most Cited ICLR Highlight Papers for "stereo vision technologies"

6,124 papers found • Page 1 of 31

#1

SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis

Dustin Podell, Zion English, Kyle Lacey et al.

ICLR 2024 (spotlight) • arXiv:2307.01952 • 3,991 citations
#2

MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models

Deyao Zhu, Jun Chen, Xiaoqian Shen et al.

ICLR 2024 • arXiv:2304.10592 • 2,806 citations
#3

Let's Verify Step by Step

Hunter Lightman, Vineet Kosaraju, Yuri Burda et al.

ICLR 2024 • arXiv:2305.20050 • 2,488 citations
#4

SAM 2: Segment Anything in Images and Videos

Nikhila Ravi, Valentin Gabeur, Yuan-Ting Hu et al.

ICLR 2025 • arXiv:2408.00714 • 2,393 citations
#5

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

Clemencia Siro, Guy Gur-Ari, Gaurav Mishra et al.

ICLR 2025 (oral) • arXiv:2206.04615 • 2,226 citations
#6

FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning

Tri Dao

ICLR 2024 • arXiv:2307.08691 • 2,224 citations
#7

SWE-bench: Can Language Models Resolve Real-World GitHub Issues?

Carlos E Jimenez, John Yang, Alexander Wettig et al.

ICLR 2024 • arXiv:2310.06770 • 1,485 citations
#8

Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

Akari Asai, Zeqiu Wu, Yizhong Wang et al.

ICLR 2024 • arXiv:2310.11511 • 1,435 citations
#9

CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer

Zhuoyi Yang, Jiayan Teng, Wendi Zheng et al.

ICLR 2025 (oral) • arXiv:2408.06072 • 1,409 citations
#10

Efficient Streaming Language Models with Attention Sinks

Guangxuan Xiao, Yuandong Tian, Beidi Chen et al.

ICLR 2024 • arXiv:2309.17453 • 1,396 citations
#11

MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework

Sirui Hong, Mingchen Zhuge, Jonathan Chen et al.

ICLR 2024 • arXiv:2308.00352 • 1,367 citations
#12

Scaling Autoregressive Models for Content-Rich Text-to-Image Generation

Xin Li, Jing Yu Koh, Alexander Ku et al.

ICLR 2024 • 1,366 citations
#13

iTransformer: Inverted Transformers Are Effective for Time Series Forecasting

Yong Liu, Tengge Hu, Haoran Zhang et al.

ICLR 2024 (oral) • arXiv:2310.06625 • 1,356 citations
#14

AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning

Yuwei Guo, Ceyuan Yang, Anyi Rao et al.

ICLR 2024 (oral) • arXiv:2307.04725 • 1,330 citations
#15

MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts

Pan Lu, Hritik Bansal, Tony Xia et al.

ICLR 2024 • arXiv:2310.02255 • 1,235 citations
#16

ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs

Yujia Qin, Shihao Liang, Yining Ye et al.

ICLR 2024 (spotlight) • arXiv:2307.16789 • 1,197 citations
#17

WizardLM: Empowering Large Pre-Trained Language Models to Follow Complex Instructions

Can Xu, Qingfeng Sun, Kai Zheng et al.

ICLR 2024 • arXiv:2304.12244 • 1,162 citations
#18

LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code

Naman Jain, Han, Alex Gu et al.

ICLR 2025 • arXiv:2403.07974 • 1,108 citations
#19

Grounding Multimodal Large Language Models to the World

Zhiliang Peng, Wenhui Wang, Li Dong et al.

ICLR 2024 • arXiv:2306.14824 • 1,059 citations
#20

A Generalist Agent

Jackie Kay, Sergio Gómez Colmenarejo, Mahyar Bordbar et al.

ICLR 2024 • 978 citations
#21

Fine-tuning Aligned Language Models Compromises Safety, Even When Users Do Not Intend To!

Xiangyu Qi, Yi Zeng, Tinghao Xie et al.

ICLR 2024 • arXiv:2310.03693 • 966 citations
#22

Teaching Large Language Models to Self-Debug

Xinyun Chen, Maxwell Lin, Nathanael Schaerli et al.

ICLR 2024 • arXiv:2304.05128 • 959 citations
#23

WebArena: A Realistic Web Environment for Building Autonomous Agents

Shuyan Zhou, Frank F Xu, Hao Zhu et al.

ICLR 2024 • arXiv:2307.13854 • 916 citations
#24

DreamGaussian: Generative Gaussian Splatting for Efficient 3D Content Creation

Jiaxiang Tang, Jiawei Ren, Hang Zhou et al.

ICLR 2024 • arXiv:2309.16653 • 884 citations
#25

WizardCoder: Empowering Code Large Language Models with Evol-Instruct

Ziyang Luo, Can Xu, Pu Zhao et al.

ICLR 2024 • arXiv:2306.08568 • 881 citations
#26

MVDream: Multi-view Diffusion for 3D Generation

Yichun Shi, Peng Wang, Jianglong Ye et al.

ICLR 2024 • arXiv:2308.16512 • 880 citations
#27

ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate

Chi-Min Chan, Weize Chen, Yusheng Su et al.

ICLR 2024 • arXiv:2308.07201 • 766 citations
#28

Time-LLM: Time Series Forecasting by Reprogramming Large Language Models

Ming Jin, Shiyu Wang, Lintao Ma et al.

ICLR 2024 • arXiv:2310.01728 • 765 citations
#29

Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback

Javier Rando, Tony Wang, Stewart Slocum et al.

ICLR 2025 • arXiv:2307.15217 • 750 citations
#30

Large Language Models Cannot Self-Correct Reasoning Yet

Jie Huang, Xinyun Chen, Swaroop Mishra et al.

ICLR 2024 • arXiv:2310.01798 • 738 citations
#31

Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs

Miao Xiong, Zhiyuan Hu, Xinyang Lu et al.

ICLR 2024 • arXiv:2306.13063 • 715 citations
#32

LRM: Large Reconstruction Model for Single Image to 3D

Yicong Hong, Kai Zhang, Jiuxiang Gu et al.

ICLR 2024 • arXiv:2311.04400 • 711 citations
#33

A Simple and Effective Pruning Approach for Large Language Models

Mingjie Sun, Zhuang Liu, Anna Bair et al.

ICLR 2024 • arXiv:2306.11695 • 700 citations
#34

Training Diffusion Models with Reinforcement Learning

Kevin Black, Michael Janner, Yilun Du et al.

ICLR 2024 • arXiv:2305.13301 • 691 citations
#35

Large Language Models as Optimizers

Chengrun Yang, Xuezhi Wang, Yifeng Lu et al.

ICLR 2024 • arXiv:2309.03409 • 683 citations
#36

WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct

Haipeng Luo, Qingfeng Sun, Can Xu et al.

ICLR 2025 • arXiv:2308.09583 • 655 citations
#37

Vision Transformers Need Registers

Timothée Darcet, Maxime Oquab, Julien Mairal et al.

ICLR 2024 • arXiv:2309.16588 • 649 citations
#38

RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment

Jipeng Zhang, Hanze Dong, Tong Zhang et al.

ICLR 2025 • 642 citations
#39

SyncDreamer: Generating Multiview-consistent Images from a Single-view Image

Yuan Liu, Cheng Lin, Zijiao Zeng et al.

ICLR 2024 (spotlight) • arXiv:2309.03453 • 629 citations
#40

CRITIC: Large Language Models Can Self-Correct with Tool-Interactive Critiquing

Zhibin Gou, Zhihong Shao, Yeyun Gong et al.

ICLR 2024 • arXiv:2305.11738 • 621 citations
#41

AutoDAN: Generating Stealthy Jailbreak Prompts on Aligned Large Language Models

Xiaogeng Liu, Nan Xu, Muhao Chen et al.

ICLR 2024 • arXiv:2310.04451 • 604 citations
#42

Quantifying Language Models' Sensitivity to Spurious Features in Prompt Design or: How I learned to start worrying about prompt formatting

Melanie Sclar, Yejin Choi, Yulia Tsvetkov et al.

ICLR 2024 • arXiv:2310.11324 • 581 citations
#43

MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models

Longhui Yu, Weisen Jiang, Han Shi et al.

ICLR 2024 (spotlight) • arXiv:2309.12284 • 578 citations
#44

Safe RLHF: Safe Reinforcement Learning from Human Feedback

Juntao Dai, Xuehai Pan, Ruiyang Sun et al.

ICLR 2024 (spotlight) • arXiv:2310.12773 • 567 citations
#45

Language Model Beats Diffusion - Tokenizer is key to visual generation

Lijun Yu, José Lezama, Nitesh Bharadwaj Gundavarapu et al.

ICLR 2024 • arXiv:2310.05737 • 548 citations
#46

AgentBench: Evaluating LLMs as Agents

Xiao Liu, Hao Yu, Hanchen Zhang et al.

ICLR 2024 • arXiv:2308.03688 • 543 citations
#47

GAIA: a benchmark for General AI Assistants

Grégoire Mialon, Clémentine Fourrier, Thomas Wolf et al.

ICLR 2024 • arXiv:2311.12983 • 531 citations
#48

Towards Understanding Sycophancy in Language Models

Mrinank Sharma, Meg Tong, Tomek Korbak et al.

ICLR 2024 • arXiv:2310.13548 • 526 citations
#49

MAmmoTH: Building Math Generalist Models through Hybrid Instruction Tuning

Xiang Yue, Xingwei Qu, Ge Zhang et al.

ICLR 2024 (spotlight) • arXiv:2309.05653 • 522 citations
#50

AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors

Weize Chen, Yusheng Su, Jingwei Zuo et al.

ICLR 2024 • arXiv:2308.10848 • 503 citations
#51

Patches Are All You Need?

Asher Trockman, J. Zico Kolter

ICLR 2024 • arXiv:2201.09792 • 494 citations
#52

Eureka: Human-Level Reward Design via Coding Large Language Models

Yecheng Jason Ma, William Liang, Guanzhi Wang et al.

ICLR 2024 • arXiv:2310.12931 • 491 citations
#53

Show-o: One Single Transformer to Unify Multimodal Understanding and Generation

Jinheng Xie, Weijia Mao, Zechen Bai et al.

ICLR 2025 • arXiv:2408.12528 • 483 citations
#54

SALMONN: Towards Generic Hearing Abilities for Large Language Models

Changli Tang, Wenyi Yu, Guangzhi Sun et al.

ICLR 2024 • arXiv:2310.13289 • 462 citations
#55

Real-time Photorealistic Dynamic Scene Representation and Rendering with 4D Gaussian Splatting

Zeyu Yang, Hongye Yang, Zijie Pan et al.

ICLR 2024 (oral) • arXiv:2310.10642 • 460 citations
#56

Ferret: Refer and Ground Anything Anywhere at Any Granularity

Haoxuan You, Haotian Zhang, Zhe Gan et al.

ICLR 2024 (spotlight) • arXiv:2310.07704 • 457 citations
#57

YaRN: Efficient Context Window Extension of Large Language Models

Bowen Peng, Jeffrey Quesnelle, Honglu Fan et al.

ICLR 2024 • arXiv:2309.00071 • 440 citations
#58

TimeMixer: Decomposable Multiscale Mixing for Time Series Forecasting

Shiyu Wang, Haixu Wu, Xiaoming Shi et al.

ICLR 2024 (oral) • arXiv:2405.14616 • 438 citations
#59

GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models

Iman Mirzadeh, Keivan Alizadeh-Vahid, Hooman Shahrokhi et al.

ICLR 2025 • arXiv:2410.05229 • 436 citations
#60

Catastrophic Jailbreak of Open-source LLMs via Exploiting Generation

Yangsibo Huang, Samyak Gupta, Mengzhou Xia et al.

ICLR 2024 (spotlight) • arXiv:2310.06987 • 430 citations
#61

Magic123: One Image to High-Quality 3D Object Generation Using Both 2D and 3D Diffusion Priors

Guocheng Qian, Jinjie Mai, Abdullah Hamdi et al.

ICLR 2024 • arXiv:2306.17843 • 429 citations
#62

Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning

Mengzhou Xia, Tianyu Gao, Zhiyuan Zeng et al.

ICLR 2024 • arXiv:2310.06694 • 426 citations
#63

Mitigating Hallucination in Large Multi-Modal Models via Robust Instruction Tuning

Fuxiao Liu, Kevin Lin, Linjie Li et al.

ICLR 2024 • arXiv:2306.14565 • 422 citations
#64

InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation

Yi Wang, Yinan He, Yizhuo Li et al.

ICLR 2024 (spotlight) • arXiv:2307.06942 • 419 citations
#65

NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models

Chankyu Lee, Rajarshi Roy, Mengyao Xu et al.

ICLR 2025 • arXiv:2405.17428 • 419 citations
#66

Reasoning on Graphs: Faithful and Interpretable Large Language Model Reasoning

Linhao Luo, Yuan-Fang Li, Reza Haffari et al.

ICLR 2024 • arXiv:2310.01061 • 415 citations
#67

WildChat: 1M ChatGPT Interaction Logs in the Wild

Wenting Zhao, Xiang Ren, Jack Hessel et al.

ICLR 2024 (oral) • arXiv:2405.01470 • 411 citations
#68

BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions

Terry Yue Zhuo, Minh Chien Vu, Jenny Chim et al.

ICLR 2025 • arXiv:2406.15877 • 410 citations
#69

RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation

Songming Liu, Lingxuan Wu, Bangguo Li et al.

ICLR 2025 • arXiv:2410.07864 • 409 citations
#70

Causal Reasoning and Large Language Models: Opening a New Frontier for Causality

Chenhao Tan, Robert Ness, Amit Sharma et al.

ICLR 2025 • arXiv:2305.00050 • 403 citations
#71

GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher

Youliang Yuan, Wenxiang Jiao, Wenxuan Wang et al.

ICLR 2024 • arXiv:2308.06463 • 403 citations
#72

Llemma: An Open Language Model for Mathematics

Zhangir Azerbayev, Hailey Schoelkopf, Keiran Paster et al.

ICLR 2024 • arXiv:2310.10631 • 402 citations
#73

Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks

Maksym Andriushchenko, Francesco Croce, Nicolas Flammarion

ICLR 2025 • arXiv:2404.02151 • 401 citations
#74

Universal Guidance for Diffusion Models

Arpit Bansal, Hong-Min Chu, Avi Schwarzschild et al.

ICLR 2024 • arXiv:2302.07121 • 399 citations
#75

Prometheus: Inducing Fine-Grained Evaluation Capability in Language Models

Seungone Kim, Jamin Shin, Yejin Cho et al.

ICLR 2024 • arXiv:2310.08491 • 396 citations
#76

Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs

Suyu Ge, Yunan Zhang, Liyuan Liu et al.

ICLR 2024 • arXiv:2310.01801 • 390 citations
#77

TokenFlow: Consistent Diffusion Features for Consistent Video Editing

Michal Geyer, Omer Bar Tal, Shai Bagon et al.

ICLR 2024 • arXiv:2307.10373 • 389 citations
#78

OpenHands: An Open Platform for AI Software Developers as Generalist Agents

Xingyao Wang, Boxuan Li, Yufan Song et al.

ICLR 2025 • arXiv:2407.16741 • 387 citations
#79

Large Language Models Are Not Robust Multiple Choice Selectors

Chujie Zheng, Hao Zhou, Fandong Meng et al.

ICLR 2024 (oral) • arXiv:2309.03882 • 383 citations
#80

Instant3D: Fast Text-to-3D with Sparse-view Generation and Large Reconstruction Model

Jiahao Li, Hao Tan, Kai Zhang et al.

ICLR 2024 • arXiv:2311.06214 • 381 citations
#81

Finite Scalar Quantization: VQ-VAE Made Simple

Fabian Mentzer, David Minnen, Eirikur Agustsson et al.

ICLR 2024 • arXiv:2309.15505 • 379 citations
#82

Generative Verifiers: Reward Modeling as Next-Token Prediction

Lunjun Zhang, Arian Hosseini, Hritik Bansal et al.

ICLR 2025 • arXiv:2408.15240 • 375 citations
#83

LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment

Bin Zhu, Bin Lin, Munan Ning et al.

ICLR 2024 • arXiv:2310.01852 • 357 citations
#84

Effective Data Augmentation With Diffusion Models

Brandon Trabucco, Kyle Doherty, Max Gurinas et al.

ICLR 2024 • arXiv:2302.07944 • 356 citations
#85

LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset

Lianmin Zheng, Wei-Lin Chiang, Ying Sheng et al.

ICLR 2024 (spotlight) • arXiv:2309.11998 • 352 citations
#86

SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression

Tim Dettmers, Ruslan Svirschevski, Vage Egiazarian et al.

ICLR 2024 • arXiv:2306.03078 • 350 citations
#87

Learning Interactive Real-World Simulators

Sherry Yang, Yilun Du, Seyed Ghasemipour et al.

ICLR 2024 • arXiv:2310.06114 • 350 citations
#88

NaturalSpeech 2: Latent Diffusion Models are Natural and Zero-Shot Speech and Singing Synthesizers

Kai Shen, Zeqian Ju, Xu Tan et al.

ICLR 2024 (spotlight) • arXiv:2304.09116 • 344 citations
#89

Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think

Sihyun Yu, Sangkyung Kwak, Huiwon Jang et al.

ICLR 2025 • arXiv:2410.06940 • 342 citations
#90

OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models

Wenqi Shao, Mengzhao Chen, Zhaoyang Zhang et al.

ICLR 2024 (spotlight) • arXiv:2308.13137 • 341 citations
#91

DNABERT-2: Efficient Foundation Model and Benchmark For Multi-Species Genomes

Zhihan Zhou, Yanrong Ji, Weijian Li et al.

ICLR 2024 • arXiv:2306.15006 • 338 citations
#92

Safety-Tuned LLaMAs: Lessons From Improving the Safety of Large Language Models that Follow Instructions

Federico Bianchi, Mirac Suzgun, Giuseppe Attanasio et al.

ICLR 2024 • arXiv:2309.07875 • 338 citations
#93

What Makes Good Data for Alignment? A Comprehensive Study of Automatic Data Selection in Instruction Tuning

Wei Liu, Weihao Zeng, Keqing He et al.

ICLR 2024 • arXiv:2312.15685 • 337 citations
#94

PandaLM: An Automatic Evaluation Benchmark for LLM Instruction Tuning Optimization

Yidong Wang, Zhuohao Yu, Wenjin Yao et al.

ICLR 2024 • arXiv:2306.05087 • 336 citations
#95

ControlVideo: Training-free Controllable Text-to-video Generation

Yabo Zhang, Yuxiang Wei, Dongsheng Jiang et al.

ICLR 2024 • arXiv:2305.13077 • 335 citations
#96

RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval

Parth Sarthi, Salman Abdullah, Aditi Tuli et al.

ICLR 2024 • arXiv:2401.18059 • 333 citations
#97

Consistency Trajectory Models: Learning Probability Flow ODE Trajectory of Diffusion

Dongjun Kim, Chieh-Hsin Lai, WeiHsiang Liao et al.

ICLR 2024 • arXiv:2310.02279 • 333 citations
#98

Improved Techniques for Training Consistency Models

Yang Song, Prafulla Dhariwal

ICLR 2024 • arXiv:2310.14189 • 332 citations
#99

Human Motion Diffusion as a Generative Prior

Yonatan Shafir, Guy Tevet, Roy Kapon et al.

ICLR 2024 • arXiv:2303.01418 • 331 citations
#100

Directly Fine-Tuning Diffusion Models on Differentiable Rewards

Kevin Clark, Paul Vicol, Kevin Swersky et al.

ICLR 2024 • arXiv:2309.17400 • 330 citations
#101

Statistical Rejection Sampling Improves Preference Optimization

Tianqi Liu, Yao Zhao, Rishabh Joshi et al.

ICLR 2024 • arXiv:2309.06657 • 329 citations
#102

Detecting Pretraining Data from Large Language Models

Weijia Shi, Anirudh Ajith, Mengzhou Xia et al.

ICLR 2024 • arXiv:2310.16789 • 327 citations
#103

Scaling and evaluating sparse autoencoders

Leo Gao, Tom Dupre la Tour, Henk Tillman et al.

ICLR 2025 • arXiv:2406.04093 • 326 citations
#104

A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis

Izzeddin Gur, Hiroki Furuta, Austin Huang et al.

ICLR 2024 • arXiv:2307.12856 • 325 citations
#105

Training Language Models to Self-Correct via Reinforcement Learning

Aviral Kumar, Vincent Zhuang, Rishabh Agarwal et al.

ICLR 2025 • arXiv:2409.12917 • 324 citations
#106

InstaFlow: One Step is Enough for High-Quality Diffusion-Based Text-to-Image Generation

Xingchao Liu, Xiwen Zhang, Jianzhu Ma et al.

ICLR 2024 • arXiv:2309.06380 • 323 citations
#107

Vision-Language Foundation Models as Effective Robot Imitators

Xinghang Li, Minghuan Liu, Hanbo Zhang et al.

ICLR 2024 (spotlight) • arXiv:2311.01378 • 320 citations
#108

Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model

Chunting Zhou, Lili Yu, Arun Babu et al.

ICLR 2025 • arXiv:2408.11039 • 318 citations
#109

Depth Pro: Sharp Monocular Metric Depth in Less Than a Second

Alexey Bochkovskiy, Amaël Delaunoy, Hugo Germain et al.

ICLR 2025 • arXiv:2410.02073 • 316 citations
#110

Making Retrieval-Augmented Language Models Robust to Irrelevant Context

Ori Yoran, Tomer Wolfson, Ori Ram et al.

ICLR 2024 • arXiv:2310.01558 • 314 citations
#111

OpenChat: Advancing Open-source Language Models with Mixed-Quality Data

Guan Wang, Sijie Cheng, Xianyuan Zhan et al.

ICLR 2024 • arXiv:2309.11235 • 313 citations
#112

TD-MPC2: Scalable, Robust World Models for Continuous Control

Nicklas Hansen, Hao Su, Xiaolong Wang

ICLR 2024 (spotlight) • arXiv:2310.16828 • 308 citations
#113

SliceGPT: Compress Large Language Models by Deleting Rows and Columns

Saleh Ashkboos, Maximilian Croci, Marcelo Gennari do Nascimento et al.

ICLR 2024 • arXiv:2401.15024 • 307 citations
#114

AnomalyCLIP: Object-agnostic Prompt Learning for Zero-shot Anomaly Detection

Qihang Zhou, Guansong Pang, Yu Tian et al.

ICLR 2024 • arXiv:2310.18961 • 306 citations
#115

Safety Alignment Should be Made More Than Just a Few Tokens Deep

Xiangyu Qi, Ashwinee Panda, Kaifeng Lyu et al.

ICLR 2025 • arXiv:2406.05946 • 303 citations
#116

Personalize Segment Anything Model with One Shot

Renrui Zhang, Zhengkai Jiang, Ziyu Guo et al.

ICLR 2024 • arXiv:2305.03048 • 301 citations
#117

DoLa: Decoding by Contrasting Layers Improves Factuality in Large Language Models

Yung-Sung Chuang, Yujia Xie, Hongyin Luo et al.

ICLR 2024 • arXiv:2309.03883 • 296 citations
#118

Mixture-of-Agents Enhances Large Language Model Capabilities

Junlin Wang, Jue Wang, Ben Athiwaratkun et al.

ICLR 2025 • arXiv:2406.04692 • 294 citations
#119

Understanding the Effects of RLHF on LLM Generalisation and Diversity

Robert Kirk, Ishita Mediratta, Christoforos Nalmpantis et al.

ICLR 2024 • arXiv:2310.06452 • 287 citations
#120

DreamLLM: Synergistic Multimodal Comprehension and Creation

Runpei Dong, Chunrui Han, Yuang Peng et al.

ICLR 2024 (spotlight) • arXiv:2309.11499 • 287 citations
#121

Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers

Chenglei Si, Diyi Yang, Tatsunori Hashimoto

ICLR 2025 • arXiv:2409.04109 • 285 citations
#122

SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation

Chongyu Fan, Jiancheng Liu, Yihua Zhang et al.

ICLR 2024 (spotlight) • arXiv:2310.12508 • 284 citations
#123

Provable Robust Watermarking for AI-Generated Text

Xuandong Zhao, Prabhanjan Ananth, Lei Li et al.

ICLR 2024 • arXiv:2306.17439 • 279 citations
#124

RepoBench: Benchmarking Repository-Level Code Auto-Completion Systems

Tianyang Liu, Canwen Xu, Julian McAuley

ICLR 2024 • arXiv:2306.03091 • 278 citations
#125

Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing

Zhangchen Xu, Fengqing Jiang, Luyao Niu et al.

ICLR 2025 • arXiv:2406.08464 • 276 citations
#126

VeRA: Vector-based Random Matrix Adaptation

Dawid Kopiczko, Tijmen Blankevoort, Yuki Asano

ICLR 2024 • arXiv:2310.11454 • 276 citations
#127

MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion

Junyi Zhang, Charles Herrmann, Junhwa Hur et al.

ICLR 2025 • arXiv:2410.03825 • 276 citations
#128

Analyzing and Mitigating Object Hallucination in Large Vision-Language Models

Yiyang Zhou, Chenhang Cui, Jaehong Yoon et al.

ICLR 2024 • arXiv:2310.00754 • 275 citations
#129

The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning

Bill Yuchen Lin, Abhilasha Ravichander, Ximing Lu et al.

ICLR 2024 • arXiv:2312.01552 • 274 citations
#130

Building Cooperative Embodied Agents Modularly with Large Language Models

Hongxin Zhang, Weihua Du, Jiaming Shan et al.

ICLR 2024 • arXiv:2307.02485 • 273 citations
#131

Evaluating Large Language Models at Evaluating Instruction Following

Zhiyuan Zeng, Jiatong Yu, Tianyu Gao et al.

ICLR 2024 • arXiv:2310.07641 • 273 citations
#132

ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving

Zhibin Gou, Zhihong Shao, Yeyun Gong et al.

ICLR 2024 • arXiv:2309.17452 • 272 citations
#133

Fast-DetectGPT: Efficient Zero-Shot Detection of Machine-Generated Text via Conditional Probability Curvature

Guangsheng Bao, Yanbin Zhao, Zhiyang Teng et al.

ICLR 2024 • arXiv:2310.05130 • 269 citations
#134

SpinQuant: LLM Quantization with Learned Rotations

Zechun Liu, Changsheng Zhao, Igor Fedorov et al.

ICLR 2025 • arXiv:2405.16406 • 268 citations
#135

JudgeLM: Fine-tuned Large Language Models are Scalable Judges

Lianghui Zhu, Xinggang Wang, Xinlong Wang

ICLR 2025 • arXiv:2310.17631 • 265 citations
#136

Large Language Models as Tool Makers

Tianle Cai, Xuezhi Wang, Tengyu Ma et al.

ICLR 2024 • arXiv:2305.17126 • 263 citations
#137

EquiformerV2: Improved Equivariant Transformer for Scaling to Higher-Degree Representations

Yi-Lun Liao, Brandon Wood, Abhishek Das et al.

ICLR 2024 • arXiv:2306.12059 • 263 citations
#138

Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models

Samuel Marks, Can Rager, Eric Michaud et al.

ICLR 2025 • arXiv:2403.19647 • 263 citations
#139

Stochastic Controlled Averaging for Federated Learning with Communication Compression

Xinmeng Huang, Ping Li, Xiaoyun Li

ICLR 2024 (spotlight) • arXiv:2308.08165 • 262 citations
#140

Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts

Jian Xie, Kai Zhang, Jiangjie Chen et al.

ICLR 2024 (spotlight) • arXiv:2305.13300 • 261 citations
#141

MINT: Evaluating LLMs in Multi-turn Interaction with Tools and Language Feedback

Xingyao Wang, Zihan Wang, Jiateng Liu et al.

ICLR 2024 • arXiv:2309.10691 • 260 citations
#142

The Alignment Problem from a Deep Learning Perspective

Richard Ngo, Lawrence Chan, Sören Mindermann

ICLR 2024 • arXiv:2209.00626 • 258 citations
#143

Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents

Boyu Gou, Demi Ruohan Wang, Boyuan Zheng et al.

ICLR 2025 • arXiv:2410.05243 • 255 citations
#144

LoRA Learns Less and Forgets Less

Jonathan Frankle, Jose Javier Gonzalez Ortiz, Cody Blakeney et al.

ICLR 2025 • arXiv:2405.09673 • 252 citations
#145

Language Models Represent Space and Time

Wes Gurnee, Max Tegmark

ICLR 2024 (oral) • arXiv:2310.02207 • 251 citations
#146

To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning

Zayne Sprague, Fangcong Yin, Juan Rodriguez et al.

ICLR 2025 • arXiv:2409.12183 • 250 citations
#147

Fine-Tuning Language Models for Factuality

Katherine Tian, Eric Mitchell, Huaxiu Yao et al.

ICLR 2024 • arXiv:2311.08401 • 249 citations
#148

Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation

Hongtao Wu, Ya Jing, Chilam Cheang et al.

ICLR 2024 • arXiv:2312.13139 • 248 citations
#149

Can LLM-Generated Misinformation Be Detected?

Canyu Chen, Kai Shu

ICLR 2024 • arXiv:2309.13788 • 248 citations
#150

Generating with Confidence: Uncertainty Quantification for Black-box Large Language Models

Jimeng Sun, Shubhendu Trivedi, Zhen Lin

ICLR 2025 • arXiv:2305.19187 • 248 citations
#151

mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models

Jiabo Ye, Haiyang Xu, Haowei Liu et al.

ICLR 2025 • arXiv:2408.04840 • 243 citations
#152

Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training

Hong Liu, Zhiyuan Li, David Hall et al.

ICLR 2024 • arXiv:2305.14342 • 241 citations
#153

Self-Consuming Generative Models Go MAD

Sina Alemohammad, Josue Casco-Rodriguez, Lorenzo Luzi et al.

ICLR 2024 • arXiv:2307.01850 • 241 citations
#154

Q-Bench: A Benchmark for General-Purpose Foundation Models on Low-level Vision

Haoning Wu, Zicheng Zhang, Erli Zhang et al.

ICLR 2024 (spotlight) • arXiv:2309.14181 • 239 citations
#155

SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents

Xuhui Zhou, Hao Zhu, Leena Mathur et al.

ICLR 2024 (spotlight) • arXiv:2310.11667 • 239 citations
#156

When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method

Biao Zhang, Zhongtao Liu, Colin Cherry et al.

ICLR 2024 • arXiv:2402.17193 • 238 citations
#157

MERT: Acoustic Music Understanding Model with Large-Scale Self-supervised Training

Yizhi Li, Ruibin Yuan, Ge Zhang et al.

ICLR 2024 • arXiv:2306.00107 • 237 citations
#158

SaProt: Protein Language Modeling with Structure-aware Vocabulary

Jin Su, Chenchen Han, Yuyang Zhou et al.

ICLR 2024 (spotlight) • 237 citations
#159

LongLoRA: Efficient Fine-tuning of Long-Context Large Language Models

Yukang Chen, Shengju Qian, Haotian Tang et al.

ICLR 2024 • arXiv:2309.12307 • 235 citations
#160

Jailbreak in pieces: Compositional Adversarial Attacks on Multi-Modal Language Models

Erfan Shayegani, Yue Dong, Nael Abu-Ghazaleh

ICLR 2024 (spotlight) • arXiv:2307.14539 • 235 citations
#161

Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge

Jiayi Ye, Yanbo Wang, Yue Huang et al.

ICLR 2025 • arXiv:2410.02736 • 229 citations
#162

Pyramidal Flow Matching for Efficient Video Generative Modeling

Yang Jin, Zhicheng Sun, Ningyuan Li et al.

ICLR 2025 (oral) • arXiv:2410.05954 • 227 citations
#163

DMV3D: Denoising Multi-view Diffusion Using 3D Large Reconstruction Model

Yinghao Xu, Hao Tan, Fujun Luan et al.

ICLR 2024 (spotlight) • arXiv:2311.09217 • 227 citations
#164

Large Brain Model for Learning Generic Representations with Tremendous EEG Data in BCI

Wei-Bang Jiang, Liming Zhao, Bao-liang Lu

ICLR 2024 (spotlight) • arXiv:2405.18765 • 226 citations
#165

TEMPO: Prompt-based Generative Pre-trained Transformer for Time Series Forecasting

Defu Cao, Furong Jia, Sercan Arik et al.

ICLR 2024 (oral) • arXiv:2310.04948 • 225 citations
#166

Chain of Thought Empowers Transformers to Solve Inherently Serial Problems

Zhiyuan Li, Hong Liu, Denny Zhou et al.

ICLR 2024 • arXiv:2402.12875 • 224 citations
#167

Listen, Think, and Understand

Yuan Gong, Hongyin Luo, Alexander Liu et al.

ICLR 2024 • arXiv:2305.10790 • 224 citations
#168

INSIDE: LLMs' Internal States Retain the Power of Hallucination Detection

Chao Chen, Kai Liu, Ze Chen et al.

ICLR 2024 • arXiv:2402.03744 • 224 citations
#169

Data Filtering Networks

Alex Fang, Albin Madappally Jose, Amit Jain et al.

ICLR 2024 • arXiv:2309.17425 • 222 citations
#170

DiLu: A Knowledge-Driven Approach to Autonomous Driving with Large Language Models

Licheng Wen, Daocheng Fu, Xin Li et al.

ICLR 2024 • arXiv:2309.16292 • 222 citations
#171

RECOMP: Improving Retrieval-Augmented LMs with Context Compression and Selective Augmentation

Fangyuan Xu, Weijia Shi, Eunsol Choi

ICLR 2024 • 222 citations
#172

Generative Representational Instruction Tuning

Niklas Muennighoff, Hongjin Su, Liang Wang et al.

ICLR 2025 • arXiv:2402.09906 • 222 citations
#173

One For All: Towards Training One Graph Model For All Classification Tasks

Hao Liu, Jiarui Feng, Lecheng Kong et al.

ICLR 2024 (spotlight) • arXiv:2310.00149 • 221 citations
#174

Self-Play Preference Optimization for Language Model Alignment

Yue Wu, Zhiqing Sun, Rina Hughes et al.

ICLR 2025 • arXiv:2405.00675 • 221 citations
#175

On-Policy Distillation of Language Models: Learning from Self-Generated Mistakes

Rishabh Agarwal, Nino Vieillard, Yongchao Zhou et al.

ICLR 2024 • arXiv:2306.13649 • 218 citations
#176

MagicDrive: Street View Generation with Diverse 3D Geometry Control

Ruiyuan Gao, Kai Chen, Enze Xie et al.

ICLR 2024 • arXiv:2310.02601 • 218 citations
#177

Demystifying CLIP Data

Hu Xu, Saining Xie, Xiaoqing Tan et al.

ICLR 2024 (spotlight) • arXiv:2309.16671 • 216 citations
#178

Habitat 3.0: A Co-Habitat for Humans, Avatars, and Robots

Xavier Puig, Eric Undersander, Andrew Szot et al.

ICLR 2024 • arXiv:2310.13724 • 214 citations
#179

LongVILA: Scaling Long-Context Visual Language Models for Long Videos

Yukang Chen, Fuzhao Xue, Dacheng Li et al.

ICLR 2025 • arXiv:2408.10188 • 214 citations
#180

A Variational Perspective on Solving Inverse Problems with Diffusion Models

Morteza Mardani, Jiaming Song, Jan Kautz et al.

ICLR 2024 • arXiv:2305.04391 • 213 citations
#181

FITS: Modeling Time Series with 10k Parameters

Zhijian Xu, Ailing Zeng, Qiang Xu

ICLR 2024 (spotlight) • arXiv:2307.03756 • 212 citations
#182

RA-DIT: Retrieval-Augmented Dual Instruction Tuning

Victoria Lin, Xilun Chen, Mingda Chen et al.

ICLR 2024 • arXiv:2310.01352 • 210 citations
#183

Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding

Zilong Wang, Hao Zhang, Chun-Liang Li et al.

ICLR 2024 • arXiv:2401.04398 • 209 citations
#184

DragonDiffusion: Enabling Drag-style Manipulation on Diffusion Models

Chong Mou, Xintao Wang, Jiechong Song et al.

ICLR 2024 (spotlight) • arXiv:2307.02421 • 209 citations
#185

VILA-U: a Unified Foundation Model Integrating Visual Understanding and Generation

Yecheng Wu, Zhuoyang Zhang, Junyu Chen et al.

ICLR 2025 • arXiv:2409.04429 • 208 citations
#186

SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction

Xinyuan Chen, Yaohui Wang, Lingjun Zhang et al.

ICLR 2024 (oral) • arXiv:2310.20700 • 208 citations
#187

From Sparse to Soft Mixtures of Experts

Joan Puigcerver, Carlos Riquelme Ruiz, Basil Mustafa et al.

ICLR 2024 (spotlight) • arXiv:2308.00951 • 208 citations
#188

Identifying the Risks of LM Agents with an LM-Emulated Sandbox

Yangjun Ruan, Honghua Dong, Andrew Wang et al.

ICLR 2024 (spotlight) • arXiv:2309.15817 • 208 citations
#189

Hybrid LLM: Cost-Efficient and Quality-Aware Query Routing

Dujian Ding, Ankur Mallick, Chi Wang et al.

ICLR 2024 • arXiv:2404.14618 • 208 citations
#190

OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation

Kepan Nan, Rui Xie, Penghao Zhou et al.

ICLR 2025 • arXiv:2407.02371 • 207 citations
#191

AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents

Chris Rawles, Sarah Clinckemaillie, Yifan Chang et al.

ICLR 2025 • arXiv:2405.14573 • 207 citations
#192

Proving Test Set Contamination in Black-Box Language Models

Yonatan Oren, Nicole Meister, Niladri Chatterji et al.

ICLR 2024 • arXiv:2310.17623 • 203 citations
#193

Conformal Risk Control

Anastasios Angelopoulos, Stephen Bates, Adam Fisch et al.

ICLR 2024 (spotlight) • arXiv:2208.02814 • 203 citations
#194

EmerNeRF: Emergent Spatial-Temporal Scene Decomposition via Self-Supervision

Jiawei Yang, Boris Ivanovic, Or Litany et al.

ICLR 2024 (oral) • arXiv:2311.02077 • 203 citations
#195

PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization

Xinyuan Wang, Chenxi Li, Zhen Wang et al.

ICLR 2024 • arXiv:2310.16427 • 202 citations
#196

LoftQ: LoRA-Fine-Tuning-aware Quantization for Large Language Models

Yixiao Li, Yifan Yu, Chen Liang et al.

ICLR 2024 • arXiv:2310.08659 • 202 citations
#197

Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data

Jingyang Ou, Shen Nie, Kaiwen Xue et al.

ICLR 2025 • arXiv:2406.03736 • 202 citations
#198

Multilingual Jailbreak Challenges in Large Language Models

Yue Deng, Wenxuan Zhang, Sinno Pan et al.

ICLR 2024 • arXiv:2310.06474 • 201 citations
#199

TEST: Text Prototype Aligned Embedding to Activate LLM's Ability for Time Series

Chenxi Sun, Hongyan Li, Yaliang Li et al.

ICLR 2024 • arXiv:2308.08241 • 200 citations
#200

Think before you speak: Training Language Models With Pause Tokens

Sachin Goyal, Ziwei Ji, Ankit Singh Rawat et al.

ICLR 2024 • arXiv:2310.02226 • 200 citations