Most Cited ICLR "interaction energy map" Papers
6,124 papers found • Page 1 of 31
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
Clemencia Siro, Guy Gur-Ari, Gaurav Mishra et al.
Scaling Autoregressive Models for Content-Rich Text-to-Image Generation
Xin Li, Jing Yu Koh, Alexander Ku et al.
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
Akari Asai, Zeqiu Wu, Yizhong Wang et al.
CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer
Zhuoyi Yang, Jiayan Teng, Wendi Zheng et al.
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts
Pan Lu, Hritik Bansal, Tony Xia et al.
ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs
Yujia Qin, Shihao Liang, Yining Ye et al.
Grounding Multimodal Large Language Models to the World
Zhiliang Peng, Wenhui Wang, Li Dong et al.
LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code
Naman Jain, King Han, Alex Gu et al.
A Generalist Agent
Jackie Kay, Sergio Gómez Colmenarejo, Mahyar Bordbar et al.
MVDream: Multi-view Diffusion for 3D Generation
Yichun Shi, Peng Wang, Jianglong Ye et al.
Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback
Javier Rando, Tony Wang, Stewart Slocum et al.
RAFT: Reward rAnked FineTuning for Generative Foundation Model Alignment
Jipeng Zhang, Hanze Dong, Tong Zhang et al.
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
Haipeng Luo, Qingfeng Sun, Can Xu et al.
MetaMath: Bootstrap Your Own Mathematical Questions for Large Language Models
Longhui Yu, Weisen Jiang, Han Shi et al.
Language Model Beats Diffusion - Tokenizer is key to visual generation
Lijun Yu, José Lezama, Nitesh Bharadwaj Gundavarapu et al.
Patches Are All You Need?
Asher Trockman, J. Zico Kolter
AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors
Weize Chen, Yusheng Su, Jingwei Zuo et al.
Eureka: Human-Level Reward Design via Coding Large Language Models
Yecheng Jason Ma, William Liang, Guanzhi Wang et al.
Show-o: One Single Transformer to Unify Multimodal Understanding and Generation
Jinheng Xie, Weijia Mao, Zechen Bai et al.
SALMONN: Towards Generic Hearing Abilities for Large Language Models
Changli Tang, Wenyi Yu, Guangzhi Sun et al.
Real-time Photorealistic Dynamic Scene Representation and Rendering with 4D Gaussian Splatting
Zeyu Yang, Hongye Yang, Zijie Pan et al.
Catastrophic Jailbreak of Open-source LLMs via Exploiting Generation
Yangsibo Huang, Samyak Gupta, Mengzhou Xia et al.
YaRN: Efficient Context Window Extension of Large Language Models
Bowen Peng, Jeffrey Quesnelle, Honglu Fan et al.
InternVid: A Large-scale Video-Text Dataset for Multimodal Understanding and Generation
Yi Wang, Yinan He, Yizhuo Li et al.
BigCodeBench: Benchmarking Code Generation with Diverse Function Calls and Complex Instructions
Terry Yue Zhuo, Minh Chien Vu, Jenny Chim et al.
Causal Reasoning and Large Language Models: Opening a New Frontier for Causality
Chenhao Tan, Robert Ness, Amit Sharma et al.
Universal Guidance for Diffusion Models
Arpit Bansal, Hong-Min Chu, Avi Schwarzschild et al.
Prometheus: Inducing Fine-Grained Evaluation Capability in Language Models
Seungone Kim, Jamin Shin, Yejin Cho et al.
Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks
Maksym Andriushchenko, Francesco Croce, Nicolas Flammarion
Model Tells You What to Discard: Adaptive KV Cache Compression for LLMs
Suyu Ge, Yunan Zhang, Liyuan Liu et al.
Large Language Models Are Not Robust Multiple Choice Selectors
Chujie Zheng, Hao Zhou, Fandong Meng et al.
RDT-1B: a Diffusion Foundation Model for Bimanual Manipulation
Songming Liu, Lingxuan Wu, Bangguo Li et al.
OpenHands: An Open Platform for AI Software Developers as Generalist Agents
Xingyao Wang, Boxuan Li, Yufan Song et al.
Generative Verifiers: Reward Modeling as Next-Token Prediction
Lunjun Zhang, Arian Hosseini, Hritik Bansal et al.
LanguageBind: Extending Video-Language Pretraining to N-modality by Language-based Semantic Alignment
Bin Zhu, Bin Lin, Munan Ning et al.
Learning Interactive Real-World Simulators
Sherry Yang, Yilun Du, Seyed Ghasemipour et al.
ControlVideo: Training-free Controllable Text-to-video Generation
Yabo Zhang, Yuxiang Wei, Dongsheng Jiang et al.
Human Motion Diffusion as a Generative Prior
Yonatan Shafir, Guy Tevet, Roy Kapon et al.
Consistency Trajectory Models: Learning Probability Flow ODE Trajectory of Diffusion
Dongjun Kim, Chieh-Hsin Lai, Wei-Hsiang Liao et al.
OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models
Wenqi Shao, Mengzhao Chen, Zhaoyang Zhang et al.
Vision-Language Foundation Models as Effective Robot Imitators
Xinghang Li, Minghuan Liu, Hanbo Zhang et al.
OpenChat: Advancing Open-source Language Models with Mixed-Quality Data
Guan Wang, Sijie Cheng, Xianyuan Zhan et al.
Representation Alignment for Generation: Training Diffusion Transformers Is Easier Than You Think
Sihyun Yu, Sangkyung Kwak, Huiwon Jang et al.
Training Language Models to Self-Correct via Reinforcement Learning
Aviral Kumar, Vincent Zhuang, Rishabh Agarwal et al.
Directly Fine-Tuning Diffusion Models on Differentiable Rewards
Kevin Clark, Paul Vicol, Kevin Swersky et al.
Depth Pro: Sharp Monocular Metric Depth in Less Than a Second
Alexey Bochkovskiy, Amaël Delaunoy, Hugo Germain et al.
Scaling and evaluating sparse autoencoders
Leo Gao, Tom Dupre la Tour, Henk Tillman et al.
Transfusion: Predict the Next Token and Diffuse Images with One Multi-Modal Model
Chunting Zhou, Lili Yu, Arun Babu et al.
TD-MPC2: Scalable, Robust World Models for Continuous Control
Nicklas Hansen, Hao Su, Xiaolong Wang
SliceGPT: Compress Large Language Models by Deleting Rows and Columns
Saleh Ashkboos, Maximilian Croci, Marcelo Gennari do Nascimento et al.
AnomalyCLIP: Object-agnostic Prompt Learning for Zero-shot Anomaly Detection
Qihang Zhou, Guansong Pang, Yu Tian et al.
Safety Alignment Should be Made More Than Just a Few Tokens Deep
Xiangyu Qi, Ashwinee Panda, Kaifeng Lyu et al.
DreamLLM: Synergistic Multimodal Comprehension and Creation
Runpei Dong, Chunrui Han, Yuang Peng et al.
Mixture-of-Agents Enhances Large Language Model Capabilities
Junlin Wang, Jue Wang, Ben Athiwaratkun et al.
Can LLMs Generate Novel Research Ideas? A Large-Scale Human Study with 100+ NLP Researchers
Chenglei Si, Diyi Yang, Tatsunori Hashimoto
Provable Robust Watermarking for AI-Generated Text
Xuandong Zhao, Prabhanjan Ananth, Lei Li et al.
Understanding the Effects of RLHF on LLM Generalisation and Diversity
Robert Kirk, Ishita Mediratta, Christoforos Nalmpantis et al.
SalUn: Empowering Machine Unlearning via Gradient-based Weight Saliency in Both Image Classification and Generation
Chongyu Fan, Jiancheng Liu, Yihua Zhang et al.
MonST3R: A Simple Approach for Estimating Geometry in the Presence of Motion
Junyi Zhang, Charles Herrmann, Junhwa Hur et al.
Large Language Models as Tool Makers
Tianle Cai, Xuezhi Wang, Tengyu Ma et al.
Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
Zhangchen Xu, Fengqing Jiang, Luyao Niu et al.
JudgeLM: Fine-tuned Large Language Models are Scalable Judges
Lianghui Zhu, Xinggang Wang, Xinlong Wang
EquiformerV2: Improved Equivariant Transformer for Scaling to Higher-Degree Representations
Yi-Lun Liao, Brandon Wood, Abhishek Das et al.
Sparse Feature Circuits: Discovering and Editing Interpretable Causal Graphs in Language Models
Samuel Marks, Can Rager, Eric Michaud et al.
Adaptive Chameleon or Stubborn Sloth: Revealing the Behavior of Large Language Models in Knowledge Conflicts
Jian Xie, Kai Zhang, Jiangjie Chen et al.
SpinQuant: LLM Quantization with Learned Rotations
Zechun Liu, Changsheng Zhao, Igor Fedorov et al.
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
Zayne Sprague, Fangcong Yin, Juan Rodriguez et al.
SaProt: Protein Language Modeling with Structure-aware Vocabulary
Jin Su, Chenchen Han, Yuyang Zhou et al.
mPLUG-Owl3: Towards Long Image-Sequence Understanding in Multi-Modal Large Language Models
Jiabo Ye, Haiyang Xu, Haowei Liu et al.
Unleashing Large-Scale Video Generative Pre-training for Visual Robot Manipulation
Hongtao Wu, Ya Jing, Chilam Cheang et al.
LoRA Learns Less and Forgets Less
Jonathan Frankle, Jose Javier Gonzalez Ortiz, Cody Blakeney et al.
Generating with Confidence: Uncertainty Quantification for Black-box Large Language Models
Jimeng Sun, Shubhendu Trivedi, Zhen Lin
Language Models Represent Space and Time
Wes Gurnee, Max Tegmark
DMV3D: Denoising Multi-view Diffusion Using 3D Large Reconstruction Model
Yinghao Xu, Hao Tan, Fujun Luan et al.
SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents
Xuhui Zhou, Hao Zhu, Leena Mathur et al.
RECOMP: Improving Retrieval-Augmented LMs with Context Compression and Selective Augmentation
Fangyuan Xu, Weijia Shi, Eunsol Choi
Sophia: A Scalable Stochastic Second-order Optimizer for Language Model Pre-training
Hong Liu, Zhiyuan Li, David Hall et al.
Listen, Think, and Understand
Yuan Gong, Hongyin Luo, Alexander Liu et al.
Data Filtering Networks
Alex Fang, Albin Madappally Jose, Amit Jain et al.
Pyramidal Flow Matching for Efficient Video Generative Modeling
Yang Jin, Zhicheng Sun, Ningyuan Li et al.
Generative Representational Instruction Tuning
Niklas Muennighoff, Hongjin Su, Liang Wang et al.
SEINE: Short-to-Long Video Diffusion Model for Generative Transition and Prediction
Xinyuan Chen, Yaohui Wang, Lingjun Zhang et al.
Self-Play Preference Optimization for Language Model Alignment
Yue Wu, Zhiqing Sun, Rina Hughes et al.
A Variational Perspective on Solving Inverse Problems with Diffusion Models
Morteza Mardani, Jiaming Song, Jan Kautz et al.
Justice or Prejudice? Quantifying Biases in LLM-as-a-Judge
Jiayi Ye, Yanbo Wang, Yue Huang et al.
Habitat 3.0: A Co-Habitat for Humans, Avatars, and Robots
Xavier Puig, Eric Undersander, Andrew Szot et al.
Demystifying CLIP Data
Hu Xu, Saining Xie, Xiaoqing Tan et al.
OpenVid-1M: A Large-Scale High-Quality Dataset for Text-to-video Generation
Kepan Nan, Rui Xie, Penghao Zhou et al.
Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification
Aojun Zhou, Ke Wang, Zimu Lu et al.
LoftQ: LoRA-Fine-Tuning-aware Quantization for Large Language Models
Yixiao Li, Yifan Yu, Chen Liang et al.
Conformal Risk Control
Anastasios Angelopoulos, Stephen Bates, Adam Fisch et al.
Think before you speak: Training Language Models With Pause Tokens
Sachin Goyal, Ziwei Ji, Ankit Singh Rawat et al.
OctoPack: Instruction Tuning Code Large Language Models
Niklas Muennighoff, Qian Liu, Armel Zebaze et al.
Mixed-Type Tabular Data Synthesis with Score-based Diffusion in Latent Space
Hengrui Zhang, Jiani Zhang, Zhengyuan Shen et al.
Your Absorbing Discrete Diffusion Secretly Models the Conditional Distributions of Clean Data
Jingyang Ou, Shen Nie, Kaiwen Xue et al.
One Step Diffusion via Shortcut Models
Kevin Frans, Danijar Hafner, Sergey Levine et al.
Inverse Scaling: When Bigger Isn't Better
Joe Cavanagh, Andrew Gritsevskiy, Najoung Kim et al.
AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents
Chris Rawles, Sarah Clinckemaillie, Yifan Chang et al.
ReLoRA: High-Rank Training Through Low-Rank Updates
Vladislav Lialin, Sherin Muckatira, Namrata Shivagunde et al.
Advancing LLM Reasoning Generalists with Preference Trees
Lifan Yuan, Ganqu Cui, Hanbin Wang et al.
Revisiting Feature Prediction for Learning Visual Representations from Video
Quentin Garrido, Yann LeCun, Michael Rabbat et al.
Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts
Xiaoming Shi, Shiyu Wang, Yuqi Nie et al.
On the Reliability of Watermarks for Large Language Models
John Kirchenbauer, Jonas Geiping, Yuxin Wen et al.
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model
Jiahui Gao, Renjie Pi, Jipeng Zhang et al.
Can Large Language Models Infer Causation from Correlation?
Zhijing Jin, Jiarui Liu, Zhiheng Lyu et al.
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Marianne Arriola, Aaron Gokaslan, Justin Chiu et al.
Uni3D: Exploring Unified 3D Representation at Scale
Junsheng Zhou, Jinsheng Wang, Baorui Ma et al.
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
Guangxuan Xiao, Jiaming Tang, Jingwei Zuo et al.
Ring-A-Bell! How Reliable are Concept Removal Methods For Diffusion Models?
Yu-Lin Tsai, Chia-Yi Hsu, Chulin Xie et al.
Is Self-Repair a Silver Bullet for Code Generation?
Theo X. Olausson, Jeevana Priya Inala, Chenglong Wang et al.
The Unreasonable Ineffectiveness of the Deeper Layers
Andrey Gromov, Kushal Tirumala, Hassan Shapourian et al.
Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation
Niels Mündler, Jingxuan He, Slobodan Jenko et al.
Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory
Niloofar Mireshghallah, Hyunwoo Kim, Xuhui Zhou et al.
MUSE: Machine Unlearning Six-Way Evaluation for Language Models
Weijia Shi, Jaechan Lee, Yangsibo Huang et al.
Diffusion Models Are Real-Time Game Engines
Dani Valevski, Yaniv Leviathan, Moab Arar et al.
BooookScore: A systematic exploration of book-length summarization in the era of LLMs
Yapei Chang, Kyle Lo, Tanya Goyal et al.
MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer
Yuancheng Wang, Haoyue Zhan, Liwei Liu et al.
PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction
Peng Wang, Hao Tan, Sai Bi et al.
SweetDreamer: Aligning Geometric Priors in 2D diffusion for Consistent Text-to-3D
Weiyu Li, Rui Chen, Xuelin Chen et al.
Hypothesis Search: Inductive Reasoning with Language Models
Ruocheng Wang, Eric Zelikman, Gabriel Poesia et al.
JudgeBench: A Benchmark for Evaluating LLM-Based Judges
Sijun Tan, Siyuan Zhuang, Kyle Montgomery et al.
Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for LLM Problem-Solving
Yangzhen Wu, Zhiqing Sun, Shanda Li et al.
Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources
Xingxuan Li, Ruochen Zhao, Yew Ken Chia et al.
Gated Delta Networks: Improving Mamba2 with Delta Rule
Songlin Yang, Jan Kautz, Ali Hatamizadeh
Video Language Planning
Yilun Du, Sherry Yang, Pete Florence et al.
World Model on Million-Length Video And Language With Blockwise RingAttention
Hao Liu, Wilson Yan, Matei Zaharia et al.
Physics of Language Models: Part 3.2, Knowledge Manipulation
Zeyuan Allen-Zhu, Yuanzhi Li
WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild
Bill Yuchen Lin, Yuntian Deng, Khyathi Chandu et al.
Multimodal Web Navigation with Instruction-Finetuned Foundation Models
Hiroki Furuta, Kuang-Huei Lee, Ofir Nachum et al.
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal
Tinghao Xie, Xiangyu Qi, Yi Zeng et al.
Linearity of Relation Decoding in Transformer Language Models
Evan Hernandez, Arnab Sen Sharma, Tal Haklay et al.
Retrieval Head Mechanistically Explains Long-Context Factuality
Wenhao Wu, Yizhong Wang, Guangxuan Xiao et al.
AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models
Junfeng Fang, Houcheng Jiang, Kun Wang et al.
Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning
Ted Zadouri, Ahmet Üstün, Arash Ahmadian et al.
Diffusion Policy Policy Optimization
Allen Ren, Justin Lidard, Lars Ankile et al.
Scaling Diffusion Language Models via Adaptation from Autoregressive Models
Shansan Gong, Shivam Agarwal, Yizhe Zhang et al.
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark for Large Language Models
Bofei Gao, Feifan Song, Zhe Yang et al.
AFlow: Automating Agentic Workflow Generation
Jiayi Zhang, Jinyu Xiang, Zhaoyang Yu et al.
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
Huajian Xin, Z.Z. Ren, Junxiao Song et al.
Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning
Juan Rocamonde, Victoriano Montesinos, Elvis Nava et al.
MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning
Zayne Sprague, Xi Ye, Kaj Bostrom et al.
Large Language Models as Analogical Reasoners
Michihiro Yasunaga, Xinyun Chen, Yujia Li et al.
AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents
Maksym Andriushchenko, Alexandra Souly, Mateusz Dziemian et al.
MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering
Jun Shern Chan, Neil Chowdhury, Oliver Jaffe et al.
LLaMA-Omni: Seamless Speech Interaction with Large Language Models
Qingkai Fang, Shoutao Guo, Yan Zhou et al.
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solver
Zhenting Qi, Mingyuan Ma, Jiahang Xu et al.
Adapting Large Language Models via Reading Comprehension
Daixuan Cheng, Shaohan Huang, Furu Wei
MogaNet: Multi-order Gated Aggregation Network
Siyuan Li, Zedong Wang, Zicheng Liu et al.
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling
Shengpeng Ji, Ziyue Jiang, Wen Wang et al.
Interpreting Emergent Planning in Model-Free Reinforcement Learning
Thomas Bush, Stephen Chung, Usman Anwar et al.
IRIS: LLM-Assisted Static Analysis for Detecting Security Vulnerabilities
Ziyang Li, Saikat Dutta, Mayur Naik
AnyText: Multilingual Visual Text Generation and Editing
Yuxiang Tuo, Wangmeng Xiang, Jun-Yan He et al.
VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents
Shi Yu, Chaoyue Tang, Bokai Xu et al.
Automated Design of Agentic Systems
Shengran Hu, Cong Lu, Jeff Clune
Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction
Jing He, Haodong Li, Wei Yin et al.
GenSim: Generating Robotic Simulation Tasks via Large Language Models
Lirui Wang, Yiyang Ling, Zhecheng Yuan et al.
The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World
Weiyun Wang, Min Shi, Qingyun Li et al.
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations
Hadas Orgad, Michael Toker, Zorik Gekhman et al.
Teaching Arithmetic to Small Transformers
Nayoung Lee, Kartik Sreenivasan, Jason Lee et al.
Decomposed Diffusion Sampler for Accelerating Large-Scale Inverse Problems
Hyungjin Chung, Suhyeon Lee, Jong Chul Ye
Adjoint Matching: Fine-tuning Flow and Diffusion Generative Models with Memoryless Stochastic Optimal Control
Carles Domingo i Enrich, Michal Drozdzal, Brian Karrer et al.
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
Min Shi, Fuxiao Liu, Shihao Wang et al.
CHASE-SQL: Multi-Path Reasoning and Preference Optimized Candidate Selection in Text-to-SQL
Mohammadreza Pourreza, Hailong Li, Ruoxi Sun et al.
SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks
Rui-Jie Zhu, Qihang Zhao, Jason Eshraghian et al.
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Liliang Ren, Yang Liu, Yadong Lu et al.
Data Scaling Laws in Imitation Learning for Robotic Manipulation
Fanqi Lin, Yingdong Hu, Pingyue Sheng et al.
ToolACE: Winning the Points of LLM Function Calling
Weiwen Liu, Xu Huang, Xingshan Zeng et al.
VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control
Sherwin Bahmani, Ivan Skorokhodov, Aliaksandr Siarohin et al.
LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory
Di Wu, Hongwei Wang, Wenhao Yu et al.
MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding
Fei Wang, Xingyu Fu, James Y. Huang et al.
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
Zehan Qi, Xiao Liu, Iat Long Iong et al.
Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens
Lijie Fan, Tianhong Li, Siyang Qin et al.
Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws
Zeyuan Allen-Zhu, Yuanzhi Li
ScaleCrafter: Tuning-free Higher-Resolution Visual Generation with Diffusion Models
Yingqing He, Shaoshu Yang, Haoxin Chen et al.
LLMCarbon: Modeling the End-to-End Carbon Footprint of Large Language Models
Ahmad Faiz, Sotaro Kaneda, Ruhan Wang et al.
Scaling up Masked Diffusion Models on Text
Shen Nie, Fengqi Zhu, Chao Du et al.
Universal Jailbreak Backdoors from Poisoned Human Feedback
Javier Rando, Florian Tramer
Tamper-Resistant Safeguards for Open-Weight LLMs
Rishub Tamirisa, Bhrugu Bharathi, Long Phan et al.
Unpaired Image-to-Image Translation via Neural Schrödinger Bridge
Beomsu Kim, Gihyun Kwon, Kwanyoung Kim et al.
EIA: Environmental Injection Attack on Generalist Web Agents for Privacy Leakage
Zeyi Liao, Lingbo Mo, Chejian Xu et al.
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token
Shaolei Zhang, Qingkai Fang, Zhe Yang et al.
AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs
Xiaogeng Liu, Peiran Li, G. Edward Suh et al.
A Paradigm Shift in Machine Translation: Boosting Translation Performance of Large Language Models
Haoran Xu, Young Jin Kim, Amr Mohamed Nabil Aly Aly Sharaf et al.
MS-Diffusion: Multi-subject Zero-shot Image Personalization with Layout Guidance
Xierui Wang, Siming Fu, Qihan Huang et al.
CLIPSelf: Vision Transformer Distills Itself for Open-Vocabulary Dense Prediction
Size Wu, Wenwei Zhang, Lumin Xu et al.
Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization
Weiran Yao, Shelby Heinecke, Juan Carlos Niebles et al.
Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling
Kaiwen Zheng, Yongxin Chen, Hanzi Mao et al.
Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents
Hanrong Zhang, Jingyuan Huang, Kai Mei et al.
OmniRe: Omni Urban Scene Reconstruction
Ziyu Chen, Jiawei Yang, Jiahui Huang et al.
Understanding Catastrophic Forgetting in Language Models via Implicit Inference
Suhas Kotha, Jacob Springer, Aditi Raghunathan
Synapse: Trajectory-as-Exemplar Prompting with Memory for Computer Control
Longtao Zheng, Rundong Wang, Xinrun Wang et al.
HelpSteer2-Preference: Complementing Ratings with Preferences
Zhilin Wang, Alexander Bukharin, Olivier Delalleau et al.
VideoPhy: Evaluating Physical Commonsense for Video Generation
Hritik Bansal, Zongyu Lin, Tianyi Xie et al.
AuroraCap: Efficient, Performant Video Detailed Captioning and a New Benchmark
Wenhao Chai, Enxin Song, Yilun Du et al.
Autoregressive Video Generation without Vector Quantization
Haoge Deng, Ting Pan, Haiwen Diao et al.
SWE-bench Multimodal: Do AI Systems Generalize to Visual Software Domains?
John Yang, Carlos E Jimenez, Alex Zhang et al.
MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers
Yiwen Chen, Tong He, Di Huang et al.
Agent S: An Open Agentic Framework that Uses Computers Like a Human
Saaket Agashe, Jiuzhou Han, Shuyu Gan et al.
Unified Human-Scene Interaction via Prompted Chain-of-Contacts
Zeqi Xiao, Tai Wang, Jingbo Wang et al.
On the self-verification limitations of large language models on reasoning and planning tasks
Kaya Stechly, Karthik Valmeekam, Subbarao Kambhampati