Most Cited ICLR "training-free augmentation" Papers
6,124 papers found • Page 2 of 31
Conference
Mixed-Type Tabular Data Synthesis with Score-based Diffusion in Latent Space
Hengrui Zhang, Jiani Zhang, Zhengyuan Shen et al.
Think-on-Graph: Deep and Responsible Reasoning of Large Language Model on Knowledge Graph
Jiashuo Sun, Chengjin Xu, Lumingyuan Tang et al.
Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification
Aojun Zhou, Ke Wang, Zimu Lu et al.
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
Marianne Arriola, Aaron Gokaslan, Justin Chiu et al.
Function Vectors in Large Language Models
Eric Todd, Millicent Li, Arnab Sen Sharma et al.
Revisiting Feature Prediction for Learning Visual Representations from Video
Quentin Garrido, Yann LeCun, Michael Rabbat et al.
AdaMerging: Adaptive Model Merging for Multi-Task Learning
Enneng Yang, Zhenyi Wang, Li Shen et al.
One Step Diffusion via Shortcut Models
Kevin Frans, Danijar Hafner, Sergey Levine et al.
Nougat: Neural Optical Understanding for Academic Documents
Lukas Blecher, Guillem Cucurull Preixens, Thomas Scialom et al.
Time-MoE: Billion-Scale Time Series Foundation Models with Mixture of Experts
Xiaoming Shi, Shiyu Wang, Yuqi Nie et al.
OctoPack: Instruction Tuning Code Large Language Models
Niklas Muennighoff, Qian Liu, Armel Zebaze et al.
OmniControl: Control Any Joint at Any Time for Human Motion Generation
Yiming Xie, Varun Jampani, Lei Zhong et al.
Solving Inverse Problems with Latent Diffusion Models via Hard Data Consistency
Bowen Song, Soo Min Kwon, Zecheng Zhang et al.
Vocos: Closing the gap between time-domain and Fourier-based neural vocoders for high-quality audio synthesis
Hubert Siuzdak
MMICL: Empowering Vision-language Model with Multi-Modal In-Context Learning
Haozhe Zhao, Zefan Cai, Shuzheng Si et al.
Making LLaMA SEE and Draw with SEED Tokenizer
Yuying Ge, Sijie Zhao, Ziyun Zeng et al.
Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models
Huaixiu Steven Zheng, Swaroop Mishra, Xinyun Chen et al.
Reward Model Ensembles Help Mitigate Overoptimization
Thomas Coste, Usman Anwar, Robert Kirk et al.
Nearly $d$-Linear Convergence Bounds for Diffusion Models via Stochastic Localization
Joe Benton, Valentin De Bortoli, Arnaud Doucet et al.
ReLoRA: High-Rank Training Through Low-Rank Updates
Vladislav Lialin, Sherin Muckatira, Namrata Shivagunde et al.
Inverse Scaling: When Bigger Isn't Better
Joe Cavanagh, Andrew Gritsevskiy, Najoung Kim et al.
Towards Best Practices of Activation Patching in Language Models: Metrics and Methods
Fred Zhang, Neel Nanda
On the Reliability of Watermarks for Large Language Models
John Kirchenbauer, Jonas Geiping, Yuxin Wen et al.
Bias Runs Deep: Implicit Reasoning Biases in Persona-Assigned LLMs
Shashank Gupta, Vaishnavi Shrivastava, Ameet Deshpande et al.
Advancing LLM Reasoning Generalists with Preference Trees
Lifan Yuan, Ganqu Cui, Hanbin Wang et al.
DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads
Guangxuan Xiao, Jiaming Tang, Jingwei Zuo et al.
SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning
Ning Miao, Yee Whye Teh, Tom Rainforth
Gated Delta Networks: Improving Mamba2 with Delta Rule
Songlin Yang, Jan Kautz, Ali Hatamizadeh
MMAU: A Massive Multi-Task Audio Understanding and Reasoning Benchmark
Sakshi, Utkarsh Tyagi, Sonal Kumar et al.
Uni3D: Exploring Unified 3D Representation at Scale
Junsheng Zhou, Jinsheng Wang, Baorui Ma et al.
Beyond Memorization: Violating Privacy via Inference with Large Language Models
Robin Staab, Mark Vero, Mislav Balunovic et al.
Talk like a Graph: Encoding Graphs for Large Language Models
Bahare Fatemi, Jonathan Halcrow, Bryan Perozzi
G-LLaVA: Solving Geometric Problem with Multi-Modal Large Language Model
Jiahui Gao, Renjie Pi, Jipeng Zhang et al.
Ring-A-Bell! How Reliable are Concept Removal Methods For Diffusion Models?
Yu-Lin Tsai, Chia-Yi Hsu, Chulin Xie et al.
Diffusion Models Are Real-Time Game Engines
Dani Valevski, Yaniv Leviathan, Moab Arar et al.
The Unreasonable Ineffectiveness of the Deeper Layers
Andrey Gromov, Kushal Tirumala, Hassan Shapourian et al.
DreamCraft3D: Hierarchical 3D Generation with Bootstrapped Diffusion Prior
Jingxiang Sun, Bo Zhang, Ruizhi Shao et al.
Can Large Language Models Infer Causation from Correlation?
Zhijing Jin, Jiarui Liu, Zhiheng LYU et al.
What Algorithms can Transformers Learn? A Study in Length Generalization
Hattie Zhou, Arwen Bradley, Etai Littwin et al.
Inversion by Direct Iteration: An Alternative to Denoising Diffusion for Image Restoration
Peyman Milanfar, Mauricio Delbracio
MathCoder: Seamless Code Integration in LLMs for Enhanced Mathematical Reasoning
Ke Wang, Houxing Ren, Aojun Zhou et al.
OLMoE: Open Mixture-of-Experts Language Models
Niklas Muennighoff, Luca Soldaini, Dirk Groeneveld et al.
Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning
Amrith Setlur, Chirag Nagpal, Adam Fisch et al.
Latent Action Pretraining from Videos
Seonghyeon Ye, Joel Jang, Byeongguk Jeon et al.
SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression
Xin Wang, Yu Zheng, Zhongwei Wan et al.
Is Self-Repair a Silver Bullet for Code Generation?
Theo X. Olausson, Jeevana Priya Inala, Chenglong Wang et al.
MUSE: Machine Unlearning Six-Way Evaluation for Language Models
Weijia Shi, Jaechan Lee, Yangsibo Huang et al.
ZipIt! Merging Models from Different Tasks without Training
George Stoica, Daniel Bolya, Jakob Bjorner et al.
Self-Alignment with Instruction Backtranslation
Xian Li, Ping Yu, Chunting Zhou et al.
Can LLMs Keep a Secret? Testing Privacy Implications of Language Models via Contextual Integrity Theory
Niloofar Mireshghallah, Hyunwoo Kim, Xuhui Zhou et al.
BooookScore: A systematic exploration of book-length summarization in the era of LLMs
Yapei Chang, Kyle Lo, Tanya Goyal et al.
Diffusion-TS: Interpretable Diffusion for General Time Series Generation
Xinyu Yuan, Yan Qiao
JudgeBench: A Benchmark for Evaluating LLM-Based Judges
Sijun Tan, Siyuan Zhuang, Kyle Montgomery et al.
Self-contradictory Hallucinations of Large Language Models: Evaluation, Detection and Mitigation
Niels Mündler, Jingxuan He, Slobodan Jenko et al.
MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer
Yuancheng Wang, Haoyue Zhan, Liwei Liu et al.
RAIN: Your Language Models Can Align Themselves without Finetuning
Yuhui Li, Fangyun Wei, Jinjing Zhao et al.
Learning to Act from Actionless Videos through Dense Correspondences
Po-Chen Ko, Jiayuan Mao, Yilun Du et al.
Flow Matching on General Geometries
Ricky T. Q. Chen, Yaron Lipman
Chain of Hindsight aligns Language Models with Feedback
Hao Liu, Carmelo Sferrazza, Pieter Abbeel
Emu: Generative Pretraining in Multimodality
Quan Sun, Qiying Yu, Yufeng Cui et al.
Scaling Diffusion Language Models via Adaptation from Autoregressive Models
Shansan Gong, Shivam Agarwal, Yizhe Zhang et al.
MetaTool Benchmark for Large Language Models: Deciding Whether to Use Tools and Which to Use
Yue Huang, Jiawen Shi, Yuan Li et al.
Beyond Reverse KL: Generalizing Direct Preference Optimization with Diverse Divergence Constraints
Chaoqi Wang, Yibo Jiang, Chenghao Yang et al.
QA-LoRA: Quantization-Aware Low-Rank Adaptation of Large Language Models
Yuhui Xu, Lingxi Xie, Xiaotao Gu et al.
Simplifying, Stabilizing and Scaling Continuous-time Consistency Models
Cheng Lu, Yang Song
PF-LRM: Pose-Free Large Reconstruction Model for Joint Pose and Shape Prediction
Peng Wang, Hao Tan, Sai Bi et al.
Generative Judge for Evaluating Alignment
Junlong Li, Shichao Sun, Weizhe Yuan et al.
Let 2D Diffusion Model Know 3D-Consistency for Robust Text-to-3D Generation
Junyoung Seo, Wooseok Jang, Min-Seop Kwak et al.
Pathformer: Multi-scale Transformers with Adaptive Pathways for Time Series Forecasting
Peng Chen, Yingying ZHANG, Yunyao Cheng et al.
Can Sensitive Information Be Deleted From LLMs? Objectives for Defending Against Extraction Attacks
Vaidehi Ramesh Patil, Peter Hase, Mohit Bansal
Hypothesis Search: Inductive Reasoning with Language Models
Ruocheng Wang, Eric Zelikman, Gabriel Poesia et al.
AlphaEdit: Null-Space Constrained Knowledge Editing for Language Models
Junfeng Fang, Houcheng Jiang, Kun Wang et al.
Interpreting CLIP's Image Representation via Text-Based Decomposition
Yossi Gandelsman, Alexei Efros, Jacob Steinhardt
SweetDreamer: Aligning Geometric Priors in 2D diffusion for Consistent Text-to-3D
Weiyu LI, Rui Chen, Xuelin Chen et al.
AFlow: Automating Agentic Workflow Generation
Jiayi Zhang, Jinyu Xiang, Zhaoyang Yu et al.
TraceVLA: Visual Trace Prompting Enhances Spatial-Temporal Awareness for Generalist Robotic Policies
Ruijie Zheng, Yongyuan Liang, Shuaiyi Huang et al.
FreeNoise: Tuning-Free Longer Video Diffusion via Noise Rescheduling
Haonan Qiu, Menghan Xia, Yong Zhang et al.
Time Travel in LLMs: Tracing Data Contamination in Large Language Models
Shahriar Golchin, Mihai Surdeanu
WildBench: Benchmarking LLMs with Challenging Tasks from Real Users in the Wild
Bill Yuchen Lin, Yuntian Deng, Khyathi Chandu et al.
Chain-of-Knowledge: Grounding Large Language Models via Dynamic Knowledge Adapting over Heterogeneous Sources
Xingxuan Li, Ruochen Zhao, Yew Ken Chia et al.
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal
Tinghao Xie, Xiangyu Qi, Yi Zeng et al.
Retrieval Head Mechanistically Explains Long-Context Factuality
Wenhao Wu, Yizhong Wang, Guangxuan Xiao et al.
One Step of Gradient Descent is Provably the Optimal In-Context Learner with One Layer of Linear Self-Attention
Arvind Mahankali, Tatsunori Hashimoto, Tengyu Ma
FLASK: Fine-grained Language Model Evaluation based on Alignment Skill Sets
Seonghyeon Ye, Doyoung Kim, Sungdong Kim et al.
Guiding Instruction-based Image Editing via Multimodal Large Language Models
Tsu-Jui Fu, Wenze Hu, Xianzhi Du et al.
Omni-MATH: A Universal Olympiad Level Mathematic Benchmark for Large Language Models
Bofei Gao, Feifan Song, Zhe Yang et al.
DNA-GPT: Divergent N-Gram Analysis for Training-Free Detection of GPT-Generated Text
Xianjun Yang, Wei Cheng, Yue Wu et al.
World Model on Million-Length Video And Language With Blockwise RingAttention
Hao Liu, Wilson Yan, Matei Zaharia et al.
FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing
Yuren Cong, Mengmeng Xu, Christian Simon et al.
SE(3)-Stochastic Flow Matching for Protein Backbone Generation
Joey Bose, Tara Akhound-Sadegh, Guillaume Huguet et al.
MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans?
YiFan Zhang, Huanyu Zhang, Haochen Tian et al.
Generalization in diffusion models arises from geometry-adaptive harmonic representations
Zahra Kadkhodaie, Florentin Guth, Eero Simoncelli et al.
Video Language Planning
Yilun Du, Sherry Yang, Pete Florence et al.
Diffusion Policy Policy Optimization
Allen Ren, Justin Lidard, Lars Ankile et al.
Scaling Large Language Model-based Multi-Agent Collaboration
Chen Qian, Zihao Xie, YiFei Wang et al.
Inference Scaling Laws: An Empirical Analysis of Compute-Optimal Inference for LLM Problem-Solving
Yangzhen Wu, Zhiqing Sun, Shanda Li et al.
HQ-Edit: A High-Quality Dataset for Instruction-based Image Editing
MUDE HUI, Siwei Yang, Bingchen Zhao et al.
GoLLIE: Annotation Guidelines improve Zero-Shot Information-Extraction
Oscar Sainz, Iker García-Ferrero, Rodrigo Agerri et al.
MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning
Zayne Sprague, Xi Ye, Kaj Bostrom et al.
Small-scale proxies for large-scale Transformer training instabilities
Mitchell Wortsman, Peter Liu, Lechao Xiao et al.
Multimodal Web Navigation with Instruction-Finetuned Foundation Models
Hiroki Furuta, Kuang-Huei Lee, Ofir Nachum et al.
Physics of Language Models: Part 3.2, Knowledge Manipulation
Zeyuan Allen-Zhu, Yuanzhi Li
Pushing Mixture of Experts to the Limit: Extremely Parameter Efficient MoE for Instruction Tuning
Ted Zadouri, Ahmet Üstün, Arash Ahmadian et al.
AgentHarm: A Benchmark for Measuring Harmfulness of LLM Agents
Maksym Andriushchenko, Alexandra Souly, Mateusz Dziemian et al.
Linearity of Relation Decoding in Transformer Language Models
Evan Hernandez, Arnab Sen Sharma, Tal Haklay et al.
MLE-bench: Evaluating Machine Learning Agents on Machine Learning Engineering
Jun Shern Chan, Neil Chowdhury, Oliver Jaffe et al.
DeepSeek-Prover-V1.5: Harnessing Proof Assistant Feedback for Reinforcement Learning and Monte-Carlo Tree Search
Huajian Xin, Z.Z. Ren, Junxiao Song et al.
Does Writing with Language Models Reduce Content Diversity?
Vishakh Padmakumar, He He
Spider 2.0: Evaluating Language Models on Real-World Enterprise Text-to-SQL Workflows
Fangyu Lei, Jixuan Chen, Yuxiao Ye et al.
Improving LoRA in Privacy-preserving Federated Learning
Youbang Sun, Zitao Li, Yaliang Li et al.
OpenMathInstruct-2: Accelerating AI for Math with Massive Open-Source Instruction Data
Shubham Toshniwal, Wei Du, Ivan Moshkov et al.
Deep Compression Autoencoder for Efficient High-Resolution Diffusion Models
Junyu Chen, Han Cai, Junsong Chen et al.
Oryx MLLM: On-Demand Spatial-Temporal Understanding at Arbitrary Resolution
Zuyan Liu, Yuhao Dong, Ziwei Liu et al.
Zipformer: A faster and better encoder for automatic speech recognition
Zengwei Yao, Liyong Guo, Xiaoyu Yang et al.
Vision-Language Models are Zero-Shot Reward Models for Reinforcement Learning
Juan Rocamonde, Victoriano Montesinos, Elvis Nava et al.
Large Language Models as Analogical Reasoners
Michihiro Yasunaga, Xinyun Chen, Yujia Li et al.
DiffusionSat: A Generative Foundation Model for Satellite Imagery
Samar Khanna, Patrick Liu, Linqi Zhou et al.
Mixture of LoRA Experts
xun wu, Shaohan Huang, Furu Wei
LLaMA-Omni: Seamless Speech Interaction with Large Language Models
Qingkai Fang, Shoutao Guo, Yan Zhou et al.
Denoising Diffusion Bridge Models
Linqi Zhou, Aaron Lou, Samar Khanna et al.
IRIS: LLM-Assisted Static Analysis for Detecting Security Vulnerabilities
Ziyang Li, Saikat Dutta, Mayur Naik
Harnessing Explanations: LLM-to-LM Interpreter for Enhanced Text-Attributed Graph Representation Learning
Xiaoxin He, Xavier Bresson, Thomas Laurent et al.
Automated Design of Agentic Systems
Shengran Hu, Cong Lu, Jeff Clune
In-context Autoencoder for Context Compression in a Large Language Model
Tao Ge, Hu Jing, Lei Wang et al.
WavTokenizer: an Efficient Acoustic Discrete Codec Tokenizer for Audio Language Modeling
Shengpeng Ji, Ziyue Jiang, Wen Wang et al.
An LLM can Fool Itself: A Prompt-Based Adversarial Attack
Xilie Xu, Keyi Kong, Ning Liu et al.
Adjoint Matching: Fine-tuning Flow and Diffusion Generative Models with Memoryless Stochastic Optimal Control
Carles Domingo i Enrich, Michal Drozdzal, Brian Karrer et al.
Matcher: Segment Anything with One Shot Using All-Purpose Feature Matching
Yang Liu, Muzhi Zhu, Hengtao Li et al.
MogaNet: Multi-order Gated Aggregation Network
Siyuan Li, Zedong Wang, Zicheng Liu et al.
OpenWebMath: An Open Dataset of High-Quality Mathematical Web Text
Keiran Paster, Marco Dos Santos, Zhangir Azerbayev et al.
Learning Performance-Improving Code Edits
Alexander Shypula, Aman Madaan, Yimeng Zeng et al.
Spike-driven Transformer V2: Meta Spiking Neural Network Architecture Inspiring the Design of Next-generation Neuromorphic Chips
Man Yao, Jiakui Hu, Tianxiang Hu et al.
DistillSpec: Improving Speculative Decoding via Knowledge Distillation
Yongchao Zhou, Kaifeng Lyu, Ankit Singh Rawat et al.
Davidsonian Scene Graph: Improving Reliability in Fine-grained Evaluation for Text-to-Image Generation
Jaemin Cho, Yushi Hu, Jason Baldridge et al.
Manifold Preserving Guided Diffusion
Yutong He, Naoki Murata, Chieh-Hsin Lai et al.
Large Language Model Cascades with Mixture of Thought Representations for Cost-Efficient Reasoning
Murong Yue, Jie Zhao, Min Zhang et al.
LLMs Know More Than They Show: On the Intrinsic Representation of LLM Hallucinations
Hadas Orgad, Michael Toker, Zorik Gekhman et al.
Mutual Reasoning Makes Smaller LLMs Stronger Problem-Solver
Zhenting Qi, Mingyuan MA, Jiahang Xu et al.
LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory
Di Wu, Hongwei Wang, Wenhao Yu et al.
VisRAG: Vision-based Retrieval-augmented Generation on Multi-modality Documents
Shi Yu, Chaoyue Tang, Bokai Xu et al.
Lotus: Diffusion-based Visual Foundation Model for High-quality Dense Prediction
Jing He, Haodong Li, Wei Yin et al.
Adapting Large Language Models via Reading Comprehension
Daixuan Cheng, Shaohan Huang, Furu Wei
What's In My Big Data?
Yanai Elazar, Akshita Bhagia, Ian Magnusson et al.
Large Language Models to Enhance Bayesian Optimization
Tennison Liu, Nicolás Astorga, Nabeel Seedat et al.
AnyText: Multilingual Visual Text Generation and Editing
Yuxiang Tuo, Wangmeng Xiang, Jun-Yan He et al.
Interpreting Emergent Planning in Model-Free Reinforcement Learning
Thomas Bush, Stephen Chung, Usman Anwar et al.
Scaling up Masked Diffusion Models on Text
Shen Nie, Fengqi Zhu, Chao Du et al.
ToolACE: Winning the Points of LLM Function Calling
Weiwen Liu, Xu Huang, Xingshan Zeng et al.
Mol-Instructions: A Large-Scale Biomolecular Instruction Dataset for Large Language Models
Yin Fang, Xiaozhuan Liang, Ningyu Zhang et al.
Zoology: Measuring and Improving Recall in Efficient Language Models
Simran Arora, Sabri Eyuboglu, Aman Timalsina et al.
Data Scaling Laws in Imitation Learning for Robotic Manipulation
Fanqi Lin, Yingdong Hu, Pingyue Sheng et al.
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling
Liliang Ren, Yang Liu, Yadong Lu et al.
CHASE-SQL: Multi-Path Reasoning and Preference Optimized Candidate Selection in Text-to-SQL
Mohammadreza Pourreza, Hailong Li, Ruoxi Sun et al.
Teaching Arithmetic to Small Transformers
Nayoung Lee, Kartik Sreenivasan, Jason Lee et al.
WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning
Zehan Qi, Xiao Liu, Iat Long Iong et al.
GenSim: Generating Robotic Simulation Tasks via Large Language Models
Lirui Wang, Yiyang Ling, Zhecheng Yuan et al.
Decomposed Diffusion Sampler for Accelerating Large-Scale Inverse Problems
Hyungjin Chung, Suhyeon Lee, Jong Chul YE
Text-to-3D with Classifier Score Distillation
Xin Yu, Yuan-Chen Guo, Yangguang Li et al.
Scaling Laws of RoPE-based Extrapolation
Xiaoran Liu, Hang Yan, Chenxin An et al.
MuirBench: A Comprehensive Benchmark for Robust Multi-image Understanding
Fei Wang, XINGYU FU, James Y. Huang et al.
Navigating Text-To-Image Customization: From LyCORIS Fine-Tuning to Model Evaluation
Shih-Ying Yeh, Yu-Guan Hsieh, Zhidong Gao et al.
SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks
Rui-Jie Zhu, Qihang Zhao, Jason Eshraghian et al.
BRIGHT: A Realistic and Challenging Benchmark for Reasoning-Intensive Retrieval
Hongjin SU, Howard Yen, Mengzhou Xia et al.
No Pose, No Problem: Surprisingly Simple 3D Gaussian Splats from Sparse Unposed Images
Botao Ye, Sifei Liu, Haofei Xu et al.
RT-Trajectory: Robotic Task Generalization via Hindsight Trajectory Sketches
Jiayuan Gu, Sean Kirmani, Paul Wohlhart et al.
LUT-GEMM: Quantized Matrix Multiplication based on LUTs for Efficient Inference in Large-Scale Generative Language Models
Gunho Park, baeseong park, Minsub Kim et al.
Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens
Lijie Fan, Tianhong Li, Siyang Qin et al.
The All-Seeing Project: Towards Panoptic Visual Recognition and Understanding of the Open World
Weiyun Wang, Min Shi, Qingyun Li et al.
VD3D: Taming Large Video Diffusion Transformers for 3D Camera Control
Sherwin Bahmani, Ivan Skorokhodov, Aliaksandr Siarohin et al.
Advancing Pose-Guided Image Synthesis with Progressive Conditional Diffusion Models
Fei Shen, Hu Ye, Jun Zhang et al.
Data Mixing Laws: Optimizing Data Mixtures by Predicting Language Modeling Performance
Jiasheng Ye, Peiju Liu, Tianxiang Sun et al.
The Truth is in There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction
Pratyusha Sharma, Jordan Ash, Dipendra Kumar Misra
LLaVA-Mini: Efficient Image and Video Large Multimodal Models with One Vision Token
Shaolei Zhang, Qingkai Fang, Yang et al.
VLM2Vec: Training Vision-Language Models for Massive Multimodal Embedding Tasks
Ziyan Jiang, Rui Meng, Xinyi Yang et al.
Cameras as Rays: Pose Estimation via Ray Diffusion
Jason Zhang, Amy Lin, Moneish Kumar et al.
Masked Diffusion Models are Secretly Time-Agnostic Masked Models and Exploit Inaccurate Categorical Sampling
Kaiwen Zheng, Yongxin Chen, Hanzi Mao et al.
Retrieval meets Long Context Large Language Models
Peng Xu, Wei Ping, Xianchao Wu et al.
Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents
Hanrong Zhang, Jingyuan Huang, Kai Mei et al.
Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders
Min Shi, Fuxiao Liu, Shihao Wang et al.
Physics of Language Models: Part 3.3, Knowledge Capacity Scaling Laws
Zeyuan Allen-Zhu, Yuanzhi Li
AutoDAN-Turbo: A Lifelong Agent for Strategy Self-Exploration to Jailbreak LLMs
Xiaogeng Liu, Peiran Li, G. Edward Suh et al.
LLMCarbon: Modeling the End-to-End Carbon Footprint of Large Language Models
Ahmad Faiz, Sotaro Kaneda, Ruhan Wang et al.
Towards Lossless Dataset Distillation via Difficulty-Aligned Trajectory Matching
Ziyao Guo, Kai Wang, George Cazenavette et al.
Universal Jailbreak Backdoors from Poisoned Human Feedback
Javier Rando, Florian Tramer
Understanding Catastrophic Forgetting in Language Models via Implicit Inference
Suhas Kotha, Jacob Springer, Aditi Raghunathan
NEFTune: Noisy Embeddings Improve Instruction Finetuning
Neel Jain, Ping-yeh Chiang, Yuxin Wen et al.
Tamper-Resistant Safeguards for Open-Weight LLMs
Rishub Tamirisa, Bhrugu Bharathi, Long Phan et al.
HelpSteer2-Preference: Complementing Ratings with Preferences
Zhilin Wang, Alexander Bukharin, Olivier Delalleau et al.
FasterViT: Fast Vision Transformers with Hierarchical Attention
Ali Hatamizadeh, Greg Heinrich, Hongxu Yin et al.
A Sanity Check for AI-generated Image Detection
Shilin Yan, Ouxiang Li, Jiayin Cai et al.
LEGO-Prover: Neural Theorem Proving with Growing Libraries
Haiming Wang, Huajian Xin, Chuanyang Zheng et al.
SmartPlay : A Benchmark for LLMs as Intelligent Agents
Yue Wu, Xuan Tang, Tom Mitchell et al.
Text2Reward: Reward Shaping with Language Models for Reinforcement Learning
Tianbao Xie, Siheng Zhao, Chen Henry Wu et al.
EIA: ENVIRONMENTAL INJECTION ATTACK ON GENERALIST WEB AGENTS FOR PRIVACY LEAKAGE
Zeyi Liao, Lingbo Mo, Chejian Xu et al.
ScaleCrafter: Tuning-free Higher-Resolution Visual Generation with Diffusion Models
Yingqing He, Shaoshu Yang, Haoxin Chen et al.
Vision-RWKV: Efficient and Scalable Visual Perception with RWKV-Like Architectures
Yuchen Duan, Weiyun Wang, Zhe Chen et al.
Strategic Preys Make Acute Predators: Enhancing Camouflaged Object Detectors by Generating Camouflaged Objects
Chunming He, Kai Li, Yachao Zhang et al.
Synapse: Trajectory-as-Exemplar Prompting with Memory for Computer Control
Longtao Zheng, Rundong Wang, Xinrun Wang et al.
Autoregressive Video Generation without Vector Quantization
Haoge Deng, Ting Pan, Haiwen Diao et al.
Phenomenal Yet Puzzling: Testing Inductive Reasoning Capabilities of Language Models with Hypothesis Refinement
Linlu Qiu, Liwei Jiang, Ximing Lu et al.