Most Cited ICML 2024 Papers
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
Patrick Esser, Sumith Kulal, Andreas Blattmann et al.
Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model
Lianghui Zhu, Bencheng Liao, Qian Zhang et al.
Improving Factuality and Reasoning in Language Models through Multiagent Debate
Yilun Du, Shuang Li, Antonio Torralba et al.
Transformers are SSMs: Generalized Models and Efficient Algorithms Through Structured State Space Duality
Tri Dao, Albert Gu
MM-Vet: Evaluating Large Multimodal Models for Integrated Capabilities
Weihao Yu, Zhengyuan Yang, Linjie Li et al.
Chatbot Arena: An Open Platform for Evaluating LLMs by Human Preference
Wei-Lin Chiang, Lianmin Zheng, Ying Sheng et al.
Model Alignment as Prospect Theoretic Optimization
Kawin Ethayarajh, Winnie Xu, Niklas Muennighoff et al.
HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal
Mantas Mazeika, Long Phan, Xuwang Yin et al.
NExT-GPT: Any-to-Any Multimodal LLM
Shengqiong Wu, Hao Fei, Leigang Qu et al.
DoRA: Weight-Decomposed Low-Rank Adaptation
Shih-Yang Liu, Chien-Yi Wang, Hongxu Yin et al.
MusicRL: Aligning Music Generation to Human Preferences
Geoffrey Cideron, Sertan Girgin, Mauro Verzetti et al.
Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads
Tianle Cai, Yuhong Li, Zhengyang Geng et al.
Language Models are Super Mario: Absorbing Abilities from Homologous Models as a Free Lunch
Le Yu, Bowen Yu, Haiyang Yu et al.
RLAIF vs. RLHF: Scaling Reinforcement Learning from Human Feedback with AI Feedback
Harrison Lee, Samrat Phatale, Hassan Mansoor et al.
Self-Rewarding Language Models
Weizhe Yuan, Richard Yuanzhe Pang, Kyunghyun Cho et al.
A decoder-only foundation model for time-series forecasting
Abhimanyu Das, Weihao Kong, Rajat Sen et al.
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
Zixiang Chen, Yihe Deng, Huizhuo Yuan et al.
Unified Training of Universal Time Series Forecasting Transformers
Gerald Woo, Chenghao Liu, Akshat Kumar et al.
GPT-4V(ision) is a Generalist Web Agent, if Grounded
Boyuan Zheng, Boyu Gou, Jihyung Kil et al.
VideoPoet: A Large Language Model for Zero-Shot Video Generation
Dan Kondratyuk, Lijun Yu, Xiuye Gu et al.
Contrastive Preference Optimization: Pushing the Boundaries of LLM Performance in Machine Translation
Haoran Xu, Amr Sharaf, Yunmo Chen et al.
Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision
Collin Burns, Pavel Izmailov, Jan Kirchner et al.
LESS: Selecting Influential Data for Targeted Instruction Tuning
Mengzhou Xia, Sadhika Malladi, Suchin Gururangan et al.
Genie: Generative Interactive Environments
Jake Bruce, Michael Dennis, Ashley Edwards et al.
Q-Align: Teaching LMMs for Visual Scoring via Discrete Text-Defined Levels
Haoning Wu, Zicheng Zhang, Weixia Zhang et al.
How Language Model Hallucinations Can Snowball
Muru Zhang, Ofir Press, William Merrill et al.
GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Jiawei Zhao, Zhenyu Zhang, Beidi Chen et al.
KIVI: A Tuning-Free Asymmetric 2bit Quantization for KV Cache
Zirui Liu, Jiayi Yuan, Hongye Jin et al.
Promptbreeder: Self-Referential Self-Improvement via Prompt Evolution
Chrisantha Fernando, Dylan Banarse, Henryk Michalewski et al.
The Linear Representation Hypothesis and the Geometry of Large Language Models
Kiho Park, Yo Joong Choe, Victor Veitch
MOMENT: A Family of Open Time-series Foundation Models
Mononito Goswami, Konrad Szafer, Arjun Choudhry et al.
Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution
Aaron Lou, Chenlin Meng, Stefano Ermon
Executable Code Actions Elicit Better LLM Agents
Xingyao Wang, Yangyi Chen, Lifan Yuan et al.
LoRA+: Efficient Low Rank Adaptation of Large Models
Soufiane Hayou, Nikhil Ghosh, Bin Yu
EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty
Yuhui Li, Fangyun Wei, Chao Zhang et al.
The WMDP Benchmark: Measuring and Reducing Malicious Use with Unlearning
Nathaniel Li, Alexander Pan, Anjali Gopal et al.
Gated Linear Attention Transformers with Hardware-Efficient Training
Songlin Yang, Bailin Wang, Yikang Shen et al.
TravelPlanner: A Benchmark for Real-World Planning with Language Agents
Jian Xie, Kai Zhang, Jiangjie Chen et al.
Iterative Preference Learning from Human Feedback: Bridging Theory and Practice for RLHF under KL-constraint
Wei Xiong, Hanze Dong, Chenlu Ye et al.
NaturalSpeech 3: Zero-Shot Speech Synthesis with Factorized Codec and Diffusion Models
Zeqian Ju, Yuancheng Wang, Kai Shen et al.
An Embodied Generalist Agent in 3D World
Jiangyong Huang, Silong Yong, Xiaojian Ma et al.
AlphaZero-Like Tree-Search can Guide Large Language Model Decoding and Training
Ziyu Wan, Xidong Feng, Muning Wen et al.
MLLM-as-a-Judge: Assessing Multimodal LLM-as-a-Judge with Vision-Language Benchmark
Dongping Chen, Ruoxi Chen, Shilin Zhang et al.
LongRoPE: Extending LLM Context Window Beyond 2 Million Tokens
Yiran Ding, Li Lyna Zhang, Chengruidong Zhang et al.
SqueezeLLM: Dense-and-Sparse Quantization
Sehoon Kim, Coleman Hooper, Amir Gholaminejad et al.
Prismatic VLMs: Investigating the Design Space of Visually-Conditioned Language Models
Siddharth Karamcheti, Suraj Nair, Ashwin Balakrishna et al.
Break the Sequential Dependency of LLM Inference Using Lookahead Decoding
Yichao Fu, Peter Bailis, Ion Stoica et al.
Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study
Shusheng Xu, Wei Fu, Jiaxuan Gao et al.
QUEST: Query-Aware Sparsity for Efficient Long-Context LLM Inference
Jiaming Tang, Yilong Zhao, Kan Zhu et al.
Physics of Language Models: Part 3.1, Knowledge Storage and Extraction
Zeyuan Allen-Zhu, Yuanzhi Li
QuIP#: Even Better LLM Quantization with Hadamard Incoherence and Lattice Codebooks
Albert Tseng, Jerry Chee, Qingyao Sun et al.
Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design
Andrew Campbell, Jason Yim, Regina Barzilay et al.
3D-VLA: A 3D Vision-Language-Action Generative World Model
Haoyu Zhen, Xiaowen Qiu, Peihao Chen et al.
Better & Faster Large Language Models via Multi-token Prediction
Fabian Gloeckle, Badr Youbi Idrissi, Baptiste Roziere et al.
Spotting LLMs With Binoculars: Zero-Shot Detection of Machine-Generated Text
Abhimanyu Hans, Avi Schwarzschild, Valeriia Cherepanova et al.
In-context Vectors: Making In Context Learning More Effective and Controllable Through Latent Space Steering
Sheng Liu, Haotian Ye, Lei Xing et al.
CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution
Alex Gu, Baptiste Roziere, Hugh Leather et al.
GaussianPro: 3D Gaussian Splatting with Progressive Propagation
Kai Cheng, Xiaoxiao Long, Kaizhi Yang et al.
ULTRAFEEDBACK: Boosting Language Models with Scaled AI Feedback
Ganqu Cui, Lifan Yuan, Ning Ding et al.
Debating with More Persuasive LLMs Leads to More Truthful Answers
Akbir Khan, John Hughes, Dan Valentine et al.
Magicoder: Empowering Code Generation with OSS-Instruct
Yuxiang Wei, Zhe Wang, Jiawei Liu et al.
Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs
Ling Yang, Zhaochen Yu, Chenlin Meng et al.
Fast Timing-Conditioned Latent Audio Diffusion
Zach Evans, CJ Carr, Josiah Taylor et al.
Evolution of Heuristics: Towards Efficient Automatic Algorithm Design Using Large Language Model
Fei Liu, Tong Xialiang, Mingxuan Yuan et al.
AlphaFold Meets Flow Matching for Generating Protein Ensembles
Bowen Jing, Bonnie Berger, Tommi Jaakkola
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases
Zechun Liu, Changsheng Zhao, Forrest Iandola et al.
Nash Learning from Human Feedback
Remi Munos, Michal Valko, Daniele Calandriello et al.
RoboGen: Towards Unleashing Infinite Data for Automated Robot Learning via Generative Simulation
Yufei Wang, Zhou Xian, Feng Chen et al.
PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs
Soroush Nasiriany, Fei Xia, Wenhao Yu et al.
Data Engineering for Scaling Language Models to 128K Context
Yao Fu, Rameswar Panda, Xinyao Niu et al.
Assessing the Brittleness of Safety Alignment via Pruning and Low-Rank Modifications
Boyi Wei, Kaixuan Huang, Yangsibo Huang et al.
Monitoring AI-Modified Content at Scale: A Case Study on the Impact of ChatGPT on AI Conference Peer Reviews
Weixin Liang, Zachary Izzo, Yaohui Zhang et al.
Transolver: A Fast Transformer Solver for PDEs on General Geometries
Haixu Wu, Huakun Luo, Haowen Wang et al.
SciBench: Evaluating College-Level Scientific Problem-Solving Abilities of Large Language Models
Xiaoxuan Wang, Ziniu Hu, Pan Lu et al.
Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data
Fahim Tajwar, Anikait Singh, Archit Sharma et al.
Fundamental Limitations of Alignment in Large Language Models
Yotam Wolf, Noam Wies, Oshri Avnery et al.
tinyBenchmarks: evaluating LLMs with fewer examples
Felipe Maia Polo, Lucas Weber, Leshem Choshen et al.
Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models
Asma Ghandeharioun, Avi Caciularu, Adam Pearce et al.
Caduceus: Bi-Directional Equivariant Long-Range DNA Sequence Modeling
Yair Schiff, Chia Hsiang Kao, Aaron Gokaslan et al.
Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities
Zhifeng Kong, Arushi Goel, Rohan Badlani et al.
MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation
Qian Huang, Jian Vora, Percy Liang et al.
A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity
Andrew Lee, Xiaoyan Bai, Itamar Pres et al.
MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI
Kaining Ying, Fanqing Meng, Jin Wang et al.
Repeat After Me: Transformers are Better than State Space Models at Copying
Samy Jelassi, David Brandfonbrener, Sham Kakade et al.
OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models
Fuzhao Xue, Zian Zheng, Yao Fu et al.
Extreme Compression of Large Language Models via Additive Quantization
Vage Egiazarian, Andrei Panferov, Denis Kuznedelev et al.
COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability
Xingang Guo, Fangxu Yu, Huan Zhang et al.
Score identity Distillation: Exponentially Fast Distillation of Pretrained Diffusion Models for One-Step Generation
Mingyuan Zhou, Huangjie Zheng, Zhendong Wang et al.
Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity
Lu Yin, You Wu, Zhenyu Zhang et al.
Generalized Preference Optimization: A Unified Approach to Offline Alignment
Yunhao Tang, Zhaohan Guo, Zeyu Zheng et al.
Actions Speak Louder than Words: Trillion-Parameter Sequential Transducers for Generative Recommendations
Jiaqi Zhai, Yunxing Liao, Xing Liu et al.
Timer: Generative Pre-trained Transformers Are Large Time Series Models
Yong Liu, Haoran Zhang, Chenyu Li et al.
LLaGA: Large Language and Graph Assistant
Runjin Chen, Tong Zhao, Ajay Jaiswal et al.
ReMax: A Simple, Effective, and Efficient Reinforcement Learning Method for Aligning Large Language Models
Ziniu Li, Tian Xu, Yushun Zhang et al.
Video-of-Thought: Step-by-Step Video Reasoning from Perception to Cognition
Hao Fei, Shengqiong Wu, Wei Ji et al.
MathScale: Scaling Instruction Tuning for Mathematical Reasoning
Zhengyang Tang, Xingxing Zhang, Benyou Wang et al.
Stealing part of a production language model
Nicholas Carlini, Daniel Paleka, Krishnamurthy Dvijotham et al.
BiLLM: Pushing the Limit of Post-Training Quantization for LLMs
Wei Huang, Yangdong Liu, Haotong Qin et al.
Image Hijacks: Adversarial Images can Control Generative Models at Runtime
Luke Bailey, Euan Ong, Stuart Russell et al.
HALC: Object Hallucination Reduction via Adaptive Focal-Contrast Decoding
Zhaorun Chen, Zhuokai Zhao, Hongyin Luo et al.
SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models
Dongyang Liu, Renrui Zhang, Longtian Qiu et al.
BetterV: Controlled Verilog Generation with Discriminative Guidance
Zehua Pei, Huiling Zhen, Mingxuan Yuan et al.
WorkArena: How Capable are Web Agents at Solving Common Knowledge Work Tasks?
Alexandre Drouin, Maxime Gasse, Massimo Caccia et al.
Simple linear attention language models balance the recall-throughput tradeoff
Simran Arora, Sabri Eyuboglu, Michael Zhang et al.
A Minimaximalist Approach to Reinforcement Learning from Human Feedback
Gokul Swamy, Christoph Dann, Rahul Kidambi et al.
The Pitfalls of Next-Token Prediction
Gregor Bachmann, Vaishnavh Nagarajan
Language Agents with Reinforcement Learning for Strategic Play in the Werewolf Game
Zelai Xu, Chao Yu, Fei Fang et al.
SparseTSF: Modeling Long-term Time Series Forecasting with 1k Parameters
Shengsheng Lin, Weiwei Lin, Wentai Wu et al.
ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL
Yifei Zhou, Andrea Zanette, Jiayi Pan et al.
Chain of Code: Reasoning with a Language Model-Augmented Code Emulator
Chengshu Li, Jacky Liang, Andy Zeng et al.
Prompting4Debugging: Red-Teaming Text-to-Image Diffusion Models by Finding Problematic Prompts
Zhi-Yi Chin, Chieh Ming Jiang, Ching-Chun Huang et al.
Behavior Generation with Latent Actions
Seungjae Lee, Yibin Wang, Haritheja Etukuru et al.
The Illusion of State in State-Space Models
William Merrill, Jackson Petty, Ashish Sabharwal
QuRating: Selecting High-Quality Data for Training Language Models
Alexander Wettig, Aatmik Gupta, Saumya Malik et al.
TSLANet: Rethinking Transformers for Time Series Representation Learning
Emadeldeen Eldele, Mohamed Ragab, Zhenghua Chen et al.
Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment
Rui Yang, Xiaoman Pan, Feng Luo et al.
An LLM Compiler for Parallel Function Calling
Sehoon Kim, Suhong Moon, Ryan Tabrizi et al.
Safety Fine-Tuning at (Almost) No Cost: A Baseline for Vision Large Language Models
Yongshuo Zong, Ondrej Bohdal, Tingyang Yu et al.
Beyond Chinchilla-Optimal: Accounting for Inference in Language Model Scaling Laws
Nikhil Sardana, Jacob Portes, Alexandre (Sasha) Doubov et al.
WebLINX: Real-World Website Navigation with Multi-Turn Dialogue
Xing Han Lù, Zdeněk Kasner, Siva Reddy
Scaling Laws for Fine-Grained Mixture of Experts
Jan Ludziejewski, Jakub Krajewski, Kamil Adamczewski et al.
Token-level Direct Preference Optimization
Yongcheng Zeng, Guoqing Liu, Weiyu Ma et al.
RL-VLM-F: Reinforcement Learning from Vision Language Foundation Model Feedback
Yufei Wang, Zhanyi Sun, Jesse Zhang et al.
Controlled Decoding from Language Models
Sidharth Mudgal, Jong Lee, Harish Ganapathy et al.
Challenges in Training PINNs: A Loss Landscape Perspective
Pratik Rathore, Weimu Lei, Zachary Frangella et al.
Prodigy: An Expeditiously Adaptive Parameter-Free Learner
Konstantin Mishchenko, Aaron Defazio
MagicPose: Realistic Human Poses and Facial Expressions Retargeting with Identity-aware Diffusion
Di Chang, Yichun Shi, Quankai Gao et al.
HumanTOMATO: Text-aligned Whole-body Motion Generation
Shunlin Lu, Ling-Hao Chen, Ailing Zeng et al.
ODIN: Disentangled Reward Mitigates Hacking in RLHF
Lichang Chen, Chen Zhu, Jiuhai Chen et al.
AI Control: Improving Safety Despite Intentional Subversion
Ryan Greenblatt, Buck Shlegeris, Kshitij Sachan et al.
A Tale of Tails: Model Collapse as a Change of Scaling Laws
Elvis Dohmatob, Yunzhen Feng, Pu Yang et al.
RoboDreamer: Learning Compositional World Models for Robot Imagination
Siyuan Zhou, Yilun Du, Jiaben Chen et al.
Can Mamba Learn How To Learn? A Comparative Study on In-Context Learning Tasks
Jong Ho Park, Jaden Park, Zheyang Xiong et al.
Revisiting Zeroth-Order Optimization for Memory-Efficient LLM Fine-Tuning: A Benchmark
Yihua Zhang, Pingzhi Li, Junyuan Hong et al.
Stop Regressing: Training Value Functions via Classification for Scalable Deep RL
Jesse Farebrother, Jordi Orbay, Quan Vuong et al.
On Prompt-Driven Safeguarding for Large Language Models
Chujie Zheng, Fan Yin, Hao Zhou et al.
In-context Convergence of Transformers
Yu Huang, Yuan Cheng, Yingbin Liang
Momentor: Advancing Video Large Language Model with Fine-Grained Temporal Reasoning
Long Qian, Juncheng Li, Yu Wu et al.
Provably Robust DPO: Aligning Language Models with Noisy Feedback
Sayak Ray Chowdhury, Anush Kini, Nagarajan Natarajan
Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast
Xiangming Gu, Xiaosen Zheng, Tianyu Pang et al.
GLoRe: When, Where, and How to Improve LLM Reasoning via Global and Local Refinements
Alexander Havrilla, Sharath Chandra Raparthy, Christoforos Nalmpantis et al.
How Transformers Learn Causal Structure with Gradient Descent
Eshaan Nichani, Alex Damian, Jason Lee
Decomposing Uncertainty for Large Language Models through Input Clarification Ensembling
Bairu Hou, Yujian Liu, Kaizhi Qian et al.
Proactive Detection of Voice Cloning with Localized Watermarking
Robin San Roman, Pierre Fernandez, Hady Elsahar et al.
Algorithm of Thoughts: Enhancing Exploration of Ideas in Large Language Models
Bilgehan Sel, Ahmad Al-Tawaha, Vanshaj Khattar et al.
Iterated Denoising Energy Matching for Sampling from Boltzmann Densities
Tara Akhound-Sadegh, Jarrid Rector-Brooks, Joey Bose et al.
InfiAgent-DABench: Evaluating Agents on Data Analysis Tasks
Xueyu Hu, Ziyu Zhao, Shuang Wei et al.
MindEye2: Shared-Subject Models Enable fMRI-To-Image With 1 Hour of Data
Paul Scotti, Mihir Tripathy, Cesar Kadir Torrico Villanueva et al.
Flora: Low-Rank Adapters Are Secretly Gradient Compressors
Yongchang Hao, Yanshuai Cao, Lili Mou
GALA3D: Towards Text-to-3D Complex Scene Generation via Layout-guided Generative Gaussian Splatting
Xiaoyu Zhou, Xingjian Ran, Yajiao Xiong et al.
Diffusion Language Models Are Versatile Protein Learners
Xinyou Wang, Zaixiang Zheng, Fei Ye et al.
Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference
Piotr Nawrot, Adrian Łańcucki, Marcin Chochowski et al.
DsDm: Model-Aware Dataset Selection with Datamodels
Logan Engstrom, Axel Feldmann, Aleksander Madry
Position: Levels of AGI for Operationalizing Progress on the Path to AGI
Meredith Morris, Jascha Sohl-Dickstein, Noah Fiedel et al.
Large Language Models are Geographically Biased
Rohin Manvi, Samar Khanna, Marshall Burke et al.
Online Speculative Decoding
Xiaoxuan Liu, Lanxiang Hu, Peter Bailis et al.
AttnLRP: Attention-Aware Layer-Wise Relevance Propagation for Transformers
Reduan Achtibat, Sayed Mohammad Vakilzadeh Hatefi, Maximilian Dreyer et al.
DPOT: Auto-Regressive Denoising Operator Transformer for Large-Scale PDE Pre-Training
Zhongkai Hao, Chang Su, Songming Liu et al.
Localizing Task Information for Improved Model Merging and Compression
Ke Wang, Nikolaos Dimitriadis, Guillermo Ortiz-Jimenez et al.
SparQ Attention: Bandwidth-Efficient LLM Inference
Luka Ribar, Ivan Chelombiev, Luke Hudlass-Galley et al.
Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models
Christian Schlarmann, Naman Singh, Francesco Croce et al.
MaxMin-RLHF: Alignment with Diverse Human Preferences
Souradip Chakraborty, Jiahao Qiu, Hui Yuan et al.
Human Alignment of Large Language Models through Online Preference Optimisation
Daniele Calandriello, Zhaohan Guo, Remi Munos et al.
MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions
Kai Zhang, Yi Luan, Hexiang Hu et al.
Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning
Hao Zhao, Maksym Andriushchenko, Francesco Croce et al.
FlowMM: Generating Materials with Riemannian Flow Matching
Benjamin Kurt Miller, Ricky T. Q. Chen, Anuroop Sriram et al.
Representation Surgery for Multi-Task Model Merging
Enneng Yang, Li Shen, Zhenyi Wang et al.
IM-3D: Iterative Multiview Diffusion and Reconstruction for High-Quality 3D Generation
Luke Melas-Kyriazi, Iro Laina, Christian Rupprecht et al.
AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls
Yu Du, Fangyun Wei, Hongyang Zhang
Language Models with Conformal Factuality Guarantees
Christopher Mohri, Tatsunori Hashimoto
A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts
Kuang-Huei Lee, Xinyun Chen, Hiroki Furuta et al.
Robust Classification via a Single Diffusion Model
Huanran Chen, Yinpeng Dong, Zhengyi Wang et al.
Scalable High-Resolution Pixel-Space Image Synthesis with Hourglass Diffusion Transformers
Katherine Crowson, Stefan Baumann, Alex Birch et al.
Merging Multi-Task Models via Weight-Ensembling Mixture of Experts
Anke Tang, Li Shen, Yong Luo et al.
Evaluating Quantized Large Language Models
Shiyao Li, Xuefei Ning, Luning Wang et al.
SceneCraft: An LLM Agent for Synthesizing 3D Scenes as Blender Code
Ziniu Hu, Ahmet Iscen, Aashi Jain et al.
In-Context Language Learning: Architectures and Algorithms
Ekin Akyürek, Bailin Wang, Yoon Kim et al.
A Closer Look at the Limitations of Instruction Tuning
Sreyan Ghosh, Chandra Kiran Evuru, Sonal Kumar et al.
Boximator: Generating Rich and Controllable Motions for Video Synthesis
Jiawei Wang, Yuchen Zhang, Jiaxin Zou et al.
Guiding LLMs The Right Way: Fast, Non-Invasive Constrained Generation
Luca Beurer-Kellner, Marc Fischer, Martin Vechev
Video-LaVIT: Unified Video-Language Pre-training with Decoupled Visual-Motional Tokenization
Yang Jin, Zhicheng Sun, Kun Xu et al.
DS-Agent: Automated Data Science by Empowering Large Language Models with Case-Based Reasoning
Siyuan Guo, Cheng Deng, Ying Wen et al.
Low-Cost High-Power Membership Inference Attacks
Sajjad Zarifzadeh, Philippe Liu, Reza Shokri
Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference
Harry Dong, Xinyu Yang, Zhenyu Zhang et al.
Position: Graph Foundation Models Are Already Here
Haitao Mao, Zhikai Chen, Wenzhuo Tang et al.
Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations
Yanda Chen, Ruiqi Zhong, Narutatsu Ri et al.
Wukong: Towards a Scaling Law for Large-Scale Recommendation
Buyun Zhang, Liang Luo, Yuxin Chen et al.
NExT-Chat: An LMM for Chat, Detection and Segmentation
Ao Zhang, Yuan Yao, Wei Ji et al.
FiT: Flexible Vision Transformer for Diffusion Model
Zeyu Lu, ZiDong Wang, Di Huang et al.
A Dynamical Model of Neural Scaling Laws
Blake Bordelon, Alexander Atanasov, Cengiz Pehlevan
Rolling Diffusion Models
David Ruhe, Jonathan Heek, Tim Salimans et al.
video-SALMONN: Speech-Enhanced Audio-Visual Large Language Models
Guangzhi Sun, Wenyi Yu, Changli Tang et al.
How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis
Federico Bianchi, Patrick John Chia, Mert Yuksekgonul et al.
Watermark Stealing in Large Language Models
Nikola Jovanović, Robin Staab, Martin Vechev
D-Flow: Differentiating through Flows for Controlled Generation
Heli Ben-Hamu, Omri Puny, Itai Gat et al.
A Touch, Vision, and Language Dataset for Multimodal Alignment
Letian Fu, Gaurav Datta, Huang Huang et al.
DITTO: Diffusion Inference-Time T-Optimization for Music Generation
Zachary Novack, Julian McAuley, Taylor Berg-Kirkpatrick et al.
Accurate LoRA-Finetuning Quantization of LLMs via Information Retention
Haotong Qin, Xudong Ma, Xingyu Zheng et al.
Guidance with Spherical Gaussian Constraint for Conditional Diffusion
Lingxiao Yang, Shutong Ding, Yifan Cai et al.
VideoPrism: A Foundational Visual Encoder for Video Understanding
Long Zhao, Nitesh Bharadwaj Gundavarapu, Liangzhe Yuan et al.