Most Cited ICLR "blackbox model verification" Papers
6,124 papers found • Page 3 of 31
Energy-Based Diffusion Language Models for Text Generation
Minkai Xu, Tomas Geffner, Karsten Kreis et al.
AutomaTikZ: Text-Guided Synthesis of Scientific Vector Graphics with TikZ
Jonas Belouadi, Anne Lauscher, Steffen Eger
From Zero to Turbulence: Generative Modeling for 3D Flow Simulation
Marten Lienen, David Lüdke, Jan Hansen-Palmus et al.
Local Search GFlowNets
Minsu Kim, Taeyoung Yun, Emmanuel Bengio et al.
An Image is Worth More Than 16x16 Patches: Exploring Transformers on Individual Pixels
Duy-Kien Nguyen, Mahmoud Assran, Unnat Jain et al.
MMIU: Multimodal Multi-image Understanding for Evaluating Large Vision-Language Models
Fanqing Meng, Jin Wang, Chuanhao Li et al.
BioDiscoveryAgent: An AI Agent for Designing Genetic Perturbation Experiments
Yusuf Roohani, Andrew Lee, Qian Huang et al.
GPAvatar: Generalizable and Precise Head Avatar from Image(s)
Xuangeng Chu, Yu Li, Ailing Zeng et al.
Physics of Language Models: Part 2.2, How to Learn From Mistakes on Grade-School Math Problems
Tian Ye, Zicheng Xu, Yuanzhi Li et al.
Frozen Transformers in Language Models Are Effective Visual Encoder Layers
Ziqi Pang, Ziyang Xie, Yunze Man et al.
Causal Order: The Key to Leveraging Imperfect Experts in Causal Inference
Aniket Vashishtha, Abbavaram Gowtham Reddy, Abhinav Kumar et al.
Rectified Diffusion: Straightness Is Not Your Need in Rectified Flow
Fu-Yun Wang, Ling Yang, Zhaoyang Huang et al.
Soft Contrastive Learning for Time Series
Seunghan Lee, Taeyoung Park, Kibok Lee
Ctrl-Adapter: An Efficient and Versatile Framework for Adapting Diverse Controls to Any Diffusion Model
Han Lin, Jaemin Cho, Abhay Zala et al.
RNNs are not Transformers (Yet): The Key Bottleneck on In-Context Retrieval
Kaiyue Wen, Xingyu Dang, Kaifeng Lyu
Simplifying Transformer Blocks
Bobby He, Thomas Hofmann
How Two-Layer Neural Networks Learn, One (Giant) Step at a Time
Yatin Dandi, Florent Krzakala, Bruno Loureiro et al.
TabM: Advancing tabular deep learning with parameter-efficient ensembling
Yury Gorishniy, Akim Kotelnikov, Artem Babenko
LoRA-Pro: Are Low-Rank Adapters Properly Optimized?
Zhengbo Wang, Jian Liang, Ran He et al.
Promptriever: Instruction-Trained Retrievers Can Be Prompted Like Language Models
Orion Weller, Ben Van Durme, Dawn Lawrie et al.
Learning How Hard to Think: Input-Adaptive Allocation of LM Computation
Mehul Damani, Idan Shenfeld, Andi Peng et al.
Learn Your Reference Model for Real Good Alignment
Alexey Gorbatovski, Boris Shaposhnikov, Alexey Malakhov et al.
Inference-Aware Fine-Tuning for Best-of-N Sampling in Large Language Models
Yinlam Chow, Guy Tennenholtz, Izzeddin Gur et al.
Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems
Zhenting Qi, Hanlin Zhang, Eric P Xing et al.
What does the Knowledge Neuron Thesis Have to do with Knowledge?
Jingcheng Niu, Andrew Liu, Zining Zhu et al.
ALLaM: Large Language Models for Arabic and English
M Saiful Bari, Yazeed Alnumay, Norah Alzahrani et al.
Eliminating Position Bias of Language Models: A Mechanistic Approach
Ziqi Wang, Hanlin Zhang, Xiner Li et al.
Towards Interpreting Visual Information Processing in Vision-Language Models
Clement Neo, Luke Ong, Philip Torr et al.
Jumanji: a Diverse Suite of Scalable Reinforcement Learning Environments in JAX
Clément Bonnet, Daniel Luo, Donal Byrne et al.
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization
Zhuoqun Li, Xuanang Chen, Haiyang Yu et al.
Smooth ECE: Principled Reliability Diagrams via Kernel Smoothing
Jaroslaw Blasiok, Preetum Nakkiran
GAIA: Zero-shot Talking Avatar Generation
Tianyu He, Junliang Guo, Runyi Yu et al.
BESA: Pruning Large Language Models with Blockwise Parameter-Efficient Sparsity Allocation
Peng Xu, Wenqi Shao, Mengzhao Chen et al.
Str2Str: A Score-based Framework for Zero-shot Protein Conformation Sampling
Jiarui Lu, Bozitao Zhong, Zuobai Zhang et al.
ODEFormer: Symbolic Regression of Dynamical Systems with Transformers
Stéphane d'Ascoli, Sören Becker, Philippe Schwaller et al.
JoMA: Demystifying Multilayer Transformers via Joint Dynamics of MLP and Attention
Yuandong Tian, Yiping Wang, Zhenyu Zhang et al.
Shot2Story: A New Benchmark for Comprehensive Understanding of Multi-shot Videos
Mingfei Han, Linjie Yang, Xiaojun Chang et al.
Visual Agents as Fast and Slow Thinkers
Guangyan Sun, Mingyu Jin, Zhenting Wang et al.
PARTNR: A Benchmark for Planning and Reasoning in Embodied Multi-agent Tasks
Matthew Chang, Gunjan Chhablani, Alexander Clegg et al.
Vision Language Models are In-Context Value Learners
Yecheng Jason Ma, Joey Hejna, Chuyuan Fu et al.
Group Preference Optimization: Few-Shot Alignment of Large Language Models
Siyan Zhao, John Dang, Aditya Grover
FaithEval: Can Your Language Model Stay Faithful to Context, Even If "The Moon is Made of Marshmallows"
Yifei Ming, Senthil Purushwalkam, Shrey Pandit et al.
Real-Fake: Effective Training Data Synthesis Through Distribution Matching
Jianhao Yuan, Jie Zhang, Shuyang Sun et al.
Can LLMs Separate Instructions From Data? And What Do We Even Mean By That?
Egor Zverev, Sahar Abdelnabi, Soroush Tabesh et al.
Generator Matching: Generative modeling with arbitrary Markov processes
Peter Holderrieth, Marton Havasi, Jason Yim et al.
Depthwise Hyperparameter Transfer in Residual Networks: Dynamics and Scaling Limit
Blake Bordelon, Lorenzo Noci, Mufan Li et al.
TEOChat: A Large Vision-Language Assistant for Temporal Earth Observation Data
Jeremy Irvin, Emily Liu, Joyce Chen et al.
MOOSE-Chem: Large Language Models for Rediscovering Unseen Chemistry Scientific Hypotheses
Zonglin Yang, Wanhao Liu, Ben Gao et al.
Fiddler: CPU-GPU Orchestration for Fast Inference of Mixture-of-Experts Models
Keisuke Kamahori, Tian Tang, Yile Gu et al.
SEPT: Towards Efficient Scene Representation Learning for Motion Prediction
Zhiqian Lan, Yuxuan Jiang, Yao Mu et al.
Duolando: Follower GPT with Off-Policy Reinforcement Learning for Dance Accompaniment
Siyao Li, Tianpei Gu, Zhitao Yang et al.
OMNI-EPIC: Open-endedness via Models of human Notions of Interestingness with Environments Programmed in Code
Maxence Faldor, Jenny Zhang, Antoine Cully et al.
Preble: Efficient Distributed Prompt Scheduling for LLM Serving
Vikranth Srivatsa, Zijian He, Reyna Abhyankar et al.
Gaining Wisdom from Setbacks: Aligning Large Language Models via Mistake Analysis
Kai Chen, Chunwei Wang, Kuo Yang et al.
Learning Diverse Attacks on Large Language Models for Robust Red-Teaming and Safety Tuning
Seanie Lee, Minsu Kim, Lynn Cherif et al.
RRM: Robust Reward Model Training Mitigates Reward Hacking
Tianqi Liu, Wei Xiong, Jie Ren et al.
On the Optimization and Generalization of Multi-head Attention
Christos Thrampoulidis, Rouzbeh Ghaderi, Hossein Taheri et al.
Articulate-Anything: Automatic Modeling of Articulated Objects via a Vision-Language Foundation Model
Long Le, Jason Xie, William Liang et al.
Depth Any Video with Scalable Synthetic Data
Honghui Yang, Di Huang, Wei Yin et al.
Data Shapley in One Training Run
Jiachen (Tianhao) Wang, Prateek Mittal, Dawn Song et al.
ACE: All-round Creator and Editor Following Instructions via Diffusion Transformer
Zhen Han, Zeyinzi Jiang, Yulin Pan et al.
LLM-Assisted Code Cleaning For Training Accurate Code Generators
Naman Jain, Tianjun Zhang, Wei-Lin Chiang et al.
Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis
Jinbin Bai, Tian Ye, Wei Chow et al.
How Efficient is LLM-Generated Code? A Rigorous & High-Standard Benchmark
Ruizhong Qiu, Weiliang Zeng, James Ezick et al.
Selective Aggregation for Low-Rank Adaptation in Federated Learning
Pengxin Guo, Shuang Zeng, Yanran Wang et al.
Improved Probabilistic Image-Text Representations
Sanghyuk Chun
Catastrophic Failure of LLM Unlearning via Quantization
Zhiwei Zhang, Fali Wang, Xiaomin Li et al.
RMB: Comprehensively benchmarking reward models in LLM alignment
Enyu Zhou, Guodong Zheng, Binghai Wang et al.
AffineQuant: Affine Transformation Quantization for Large Language Models
Yuexiao Ma, Huixia Li, Xiawu Zheng et al.
Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms
Zhangheng Li, Keen You, Haotian Zhang et al.
Fine-Tuning Discrete Diffusion Models via Reward Optimization with Applications to DNA and Protein Design
Chenyu Wang, Masatoshi Uehara, Yichun He et al.
Two-stage LLM Fine-tuning with Less Specialization and More Generalization
Yihan Wang, Si Si, Daliang Li et al.
Towards Diverse Behaviors: A Benchmark for Imitation Learning with Human Demonstrations
Xiaogang Jia, Denis Blessing, Xinkai Jiang et al.
Robust LLM safeguarding via refusal feature adversarial training
Lei Yu, Virginie Do, Karen Hambardzumyan et al.
Frame-Voyager: Learning to Query Frames for Video Large Language Models
Sicheng Yu, Chengkai Jin, Huanyu Wang et al.
Correcting the Mythos of KL-Regularization: Direct Alignment without Overoptimization via Chi-Squared Preference Optimization
Audrey Huang, Wenhao Zhan, Tengyang Xie et al.
Towards Realistic Data Generation for Real-World Super-Resolution
Long Peng, Wenbo Li, Renjing Pei et al.
To Code, or Not To Code? Exploring Impact of Code in Pre-training
Viraat Aryabumi, Yixuan Su, Raymond Ma et al.
Trust or Escalate: LLM Judges with Provable Guarantees for Human Agreement
Jaehun Jung, Faeze Brahman, Yejin Choi
Accelerating Auto-regressive Text-to-Image Generation with Training-free Speculative Jacobi Decoding
Yao Teng, Han Shi, Xian Liu et al.
ETA: Evaluating Then Aligning Safety of Vision Language Models at Inference Time
Yi Ding, Bolian Li, Ruqi Zhang
Curriculum reinforcement learning for quantum architecture search under hardware errors
Yash J. Patel, Akash Kundu, Mateusz Ostaszewski et al.
Real2Code: Reconstruct Articulated Objects via Code Generation
Mandi Zhao, Yijia Weng, Dominik Bauer et al.
Point-SAM: Promptable 3D Segmentation Model for Point Clouds
Yuchen Zhou, Jiayuan Gu, Tung Chiang et al.
On the expressiveness and spectral bias of KANs
Yixuan Wang, Jonathan Siegel, Ziming Liu et al.
T-MARS: Improving Visual Representations by Circumventing Text Feature Learning
Pratyush Maini, Sachin Goyal, Zachary Lipton et al.
Benchmarking and Improving Generator-Validator Consistency of Language Models
Xiang Li, Vaishnavi Shrivastava, Siyan Li et al.
Asynchronous RLHF: Faster and More Efficient Off-Policy RL for Language Models
Michael Noukhovitch, Shengyi Huang, Sophie Xhonneux et al.
MAPE-PPI: Towards Effective and Efficient Protein-Protein Interaction Prediction via Microenvironment-Aware Protein Embedding
Lirong Wu, Yijun Tian, Yufei Huang et al.
3DTrajMaster: Mastering 3D Trajectory for Multi-Entity Motion in Video Generation
Xiao Fu, Xian Liu, Xintao Wang et al.
Agents' Room: Narrative Generation through Multi-step Collaboration
Fantine Huot, Reinald Kim Amplayo, Jennimaria Palomaki et al.
CG-Bench: Clue-grounded Question Answering Benchmark for Long Video Understanding
Guo Chen, Yicheng Liu, Yifei Huang et al.
DePT: Decomposed Prompt Tuning for Parameter-Efficient Fine-tuning
Zhengxiang Shi, Aldo Lipani
MIA-Bench: Towards Better Instruction Following Evaluation of Multimodal LLMs
Yusu Qian, Hanrong Ye, Jean-Philippe Fauconnier et al.
Few-Shot Detection of Machine-Generated Text using Style Representations
Rafael Rivera Soto, Kailin Koch, Aleem Khan et al.
On Targeted Manipulation and Deception when Optimizing LLMs for User Feedback
Marcus Williams, Micah Carroll, Adhyyan Narang et al.
TestGenEval: A Real World Unit Test Generation and Test Completion Benchmark
Kush Jain, Gabriel Synnaeve, Baptiste Roziere
Trajectory attention for fine-grained video motion control
Zeqi Xiao, Wenqi Ouyang, Yifan Zhou et al.
Does CLIP’s generalization performance mainly stem from high train-test similarity?
Prasanna Mayilvahanan, Thaddäus Wiedemer, Evgenia Rusak et al.
Self-Evolving Multi-Agent Collaboration Networks for Software Development
Yue Hu, Yuzhu Cai, Yaxin Du et al.
SANA: Efficient High-Resolution Text-to-Image Synthesis with Linear Diffusion Transformers
Enze Xie, Junsong Chen, Junyu Chen et al.
Theory on Mixture-of-Experts in Continual Learning
Hongbo Li, Sen Lin, Lingjie Duan et al.
On the Role of Attention Heads in Large Language Model Safety
Zhenhong Zhou, Haiyang Yu, Xinghua Zhang et al.
Human-inspired Episodic Memory for Infinite Context LLMs
Zafeirios Fountas, Martin A Benfeghoul, Adnan Oomerjee et al.
Do Vision-Language Models Represent Space and How? Evaluating Spatial Frame of Reference under Ambiguities
Zheyuan Zhang, Fengyuan Hu, Jayjun Lee et al.
Scaling Speech-Text Pre-training with Synthetic Interleaved Data
Aohan Zeng, Zhengxiao Du, Mingdao Liu et al.
WOODS: Benchmarks for Out-of-Distribution Generalization in Time Series
Irina Rish, Kartik Ahuja, Mohammad Javad Darvishi Bayazi et al.
Streaming Video Question-Answering with In-context Video KV-Cache Retrieval
Shangzhe Di, Zhelun Yu, Guanghao Zhang et al.
A-Bench: Are LMMs Masters at Evaluating AI-generated Images?
Zicheng Zhang, Haoning Wu, Chunyi Li et al.
Looking Inward: Language Models Can Learn About Themselves by Introspection
Felix Jedidja Binder, James Chua, Tomek Korbak et al.
Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents
Kexun Zhang, Weiran Yao, Zuxin Liu et al.
Theory, Analysis, and Best Practices for Sigmoid Self-Attention
Jason Ramapuram, Federico Danieli, Eeshan Gunesh Dhekane et al.
Dual RL: Unification and New Methods for Reinforcement and Imitation Learning
Harshit Sikchi, Qinqing Zheng, Amy Zhang et al.
SWIFT: On-the-Fly Self-Speculative Decoding for LLM Inference Acceleration
Heming Xia, Yongqi Li, Jun Zhang et al.
Test-time Alignment of Diffusion Models without Reward Over-optimization
Sunwoo Kim, Minkyu Kim, Dongmin Park
Competition Dynamics Shape Algorithmic Phases of In-Context Learning
Core Francisco Park, Ekdeep Singh Lubana, Hidenori Tanaka
The Semantic Hub Hypothesis: Language Models Share Semantic Representations Across Languages and Modalities
Zhaofeng Wu, Xinyan Yu, Dani Yogatama et al.
EG4D: Explicit Generation of 4D Object without Score Distillation
Qi Sun, Zhiyang Guo, Ziyu Wan et al.
Attention with Markov: A Curious Case of Single-layer Transformers
Ashok Makkuva, Marco Bondaschi, Adway Girish et al.
STanHop: Sparse Tandem Hopfield Model for Memory-Enhanced Time Series Prediction
Yu-Hsuan Wu, Jerry Hu, Weijian Li et al.
Provable Offline Preference-Based Reinforcement Learning
Wenhao Zhan, Masatoshi Uehara, Nathan Kallus et al.
MR-GSM8K: A Meta-Reasoning Benchmark for Large Language Model Evaluation
Zhongshen Zeng, Pengguang Chen, Shu Liu et al.
Building Interactable Replicas of Complex Articulated Objects via Gaussian Splatting
Yu Liu, Baoxiong Jia, Ruijie Lu et al.
Diffusion Feedback Helps CLIP See Better
Wenxuan Wang, Quan Sun, Fan Zhang et al.
PolyGCL: Graph Contrastive Learning via Learnable Spectral Polynomial Filters
Jingyu Chen, Runlin Lei, Zhewei Wei
DiffSplat: Repurposing Image Diffusion Models for Scalable Gaussian Splat Generation
Chenguo Lin, Panwang Pan, Bangbang Yang et al.
PINNACLE: PINN Adaptive ColLocation and Experimental points selection
Gregory Kang Ruey Lau, Apivich Hemachandra, See-Kiong Ng et al.
Quality-Diversity through AI Feedback
Herbie Bradley, Andrew Dai, Hannah Teufel et al.
Dynamic-LLaVA: Efficient Multimodal Large Language Models via Dynamic Vision-language Context Sparsification
Wenxuan Huang, Zijie Zhai, Yunhang Shen et al.
Watermark Anything With Localized Messages
Tom Sander, Pierre Fernandez, Alain Oliviero Durmus et al.
Strong Model Collapse
Elvis Dohmatob, Yunzhen Feng, Arjun Subramonian et al.
PaPaGei: Open Foundation Models for Optical Physiological Signals
Arvind Pillai, Dimitris Spathis, Fahim Kawsar et al.
AirPhyNet: Harnessing Physics-Guided Neural Networks for Air Quality Prediction
Kethmi Hirushini Hettige, Jiahao Ji, Shili Xiang et al.
SG-I2V: Self-Guided Trajectory Control in Image-to-Video Generation
Koichi Namekata, Sherwin Bahmani, Ziyi Wu et al.
Uni-Sign: Toward Unified Sign Language Understanding at Scale
Zecheng Li, Wengang Zhou, Weichao Zhao et al.
When Do Prompting and Prefix-Tuning Work? A Theory of Capabilities and Limitations
Aleksandar Petrov, Philip Torr, Adel Bibi
Stable Neural Stochastic Differential Equations in Analyzing Irregular Time Series Data
YongKyung Oh, Dongyoung Lim, Sungil Kim
TESTAM: A Time-Enhanced Spatio-Temporal Attention Model with Mixture of Experts
Hyunwook Lee, Sungahn Ko
Combining Induction and Transduction for Abstract Reasoning
Wen-Ding Li, Keya Hu, Carter Larsen et al.
Multi-modal Agent Tuning: Building a VLM-Driven Agent for Efficient Tool Usage
Zhi Gao, Bofei Zhang, Pengxiang Li et al.
Making RL with Preference-based Feedback Efficient via Randomization
Runzhe Wu, Wen Sun
No Training, No Problem: Rethinking Classifier-Free Guidance for Diffusion Models
Seyedmorteza Sadat, Manuel Kansy, Otmar Hilliges et al.
Variational Best-of-N Alignment
Afra Amini, Tim Vieira, Elliott Ash et al.
Modeling Future Conversation Turns to Teach LLMs to Ask Clarifying Questions
Michael Zhang, W. Bradley Knox, Eunsol Choi
On Scaling Up 3D Gaussian Splatting Training
Hexu Zhao, Haoyang Weng, Daohan Lu et al.
Looped Transformers for Length Generalization
Ying Fan, Yilun Du, Kannan Ramchandran et al.
Sparse Autoencoders Do Not Find Canonical Units of Analysis
Patrick Leask, Bart Bussmann, Michael Pearce et al.
SymmCD: Symmetry-Preserving Crystal Generation with Diffusion Models
Daniel Levy, Siba Smarak Panigrahi, Sékou-Oumar Kaba et al.
Training-Free Activation Sparsity in Large Language Models
James Liu, Pragaash Ponnusamy, Tianle Cai et al.
VLAS: Vision-Language-Action Model with Speech Instructions for Customized Robot Manipulation
Wei Zhao, Pengxiang Ding, Min Zhang et al.
How Does Critical Batch Size Scale in Pre-training?
Hanlin Zhang, Depen Morwani, Nikhil Vyas et al.
Large Language Models Assume People are More Rational than We Really are
Ryan Liu, Jiayi Geng, Joshua Peterson et al.
Synthetic continued pretraining
Zitong Yang, Neil Band, Shuangping Li et al.
A Unified and General Framework for Continual Learning
Zhenyi Wang, Yan Li, Li Shen et al.
Deconstructing What Makes a Good Optimizer for Autoregressive Language Models
Rosie Zhao, Depen Morwani, David Brandfonbrener et al.
DiscoveryBench: Towards Data-Driven Discovery with Large Language Models
Bodhisattwa Prasad Majumder, Harshit Surana, Dhruv Agarwal et al.
Addressing Loss of Plasticity and Catastrophic Forgetting in Continual Learning
Mohamed Elsayed, A. Rupam Mahmood
Large Language Models are Efficient Learners of Noise-Robust Speech Recognition
Yuchen Hu, Chen Chen, Chao-Han Huck Yang et al.
ConceptPrune: Concept Editing in Diffusion Models via Skilled Neuron Pruning
Ruchika Chavhan, Da Li, Timothy Hospedales
MBR and QE Finetuning: Training-time Distillation of the Best and Most Expensive Decoding Methods
Mara Finkelstein, Markus Freitag
OpenTab: Advancing Large Language Models as Open-domain Table Reasoners
Kezhi Kong, Jiani Zhang, Zhengyuan Shen et al.
PolaFormer: Polarity-aware Linear Attention for Vision Transformers
Weikang Meng, Yadan Luo, Xin Li et al.
TOMATO: Assessing Visual Temporal Reasoning Capabilities in Multimodal Foundation Models
Ziyao Shangguan, Chuhan Li, Yuxuan Ding et al.
Multi-resolution HuBERT: Multi-resolution Speech Self-Supervised Learning with Masked Unit Prediction
Jiatong Shi, Hirofumi Inaguma, Xutai Ma et al.
FreeVS: Generative View Synthesis on Free Driving Trajectory
Qitai Wang, Lue Fan, Yuqi Wang et al.
FairMT-Bench: Benchmarking Fairness for Multi-turn Dialogue in Conversational LLMs
Zhiting Fan, Ruizhe Chen, Tianxiang Hu et al.
DartControl: A Diffusion-Based Autoregressive Motion Model for Real-Time Text-Driven Motion Control
Kaifeng Zhao, Gen Li, Siyu Tang
ExACT: Teaching AI Agents to Explore with Reflective-MCTS and Exploratory Learning
Xiao Yu, Baolin Peng, Vineeth Vajipey et al.
Towards Energy Efficient Spiking Neural Networks: An Unstructured Pruning Framework
Xinyu Shi, Jianhao Ding, Zecheng Hao et al.
Reconstructive Visual Instruction Tuning
Haochen Wang, Anlin Zheng, Yucheng Zhao et al.
AgentStudio: A Toolkit for Building General Virtual Agents
Longtao Zheng, Zhiyuan Huang, Zhenghai Xue et al.
How to Fine-Tune Vision Models with SGD
Ananya Kumar, Ruoqi Shen, Sebastien Bubeck et al.
Preserving Diversity in Supervised Fine-Tuning of Large Language Models
Ziniu Li, Congliang Chen, Tian Xu et al.
Sequential Controlled Langevin Diffusions
Junhua Chen, Lorenz Richter, Julius Berner et al.
Multi-granularity Correspondence Learning from Long-term Noisy Videos
Yijie Lin, Jie Zhang, Zhenyu Huang et al.
Flow Matching with General Discrete Paths: A Kinetic-Optimal Perspective
Neta Shaul, Itai Gat, Marton Havasi et al.
PAD: Personalized Alignment of LLMs at Decoding-time
Ruizhe Chen, Xiaotian Zhang, Meng Luo et al.
EmbodiedSAM: Online Segment Any 3D Thing in Real Time
Xiuwei Xu, Huangxing Chen, Linqing Zhao et al.
Adversarial Perturbations Cannot Reliably Protect Artists From Generative AI
Robert Hönig, Javier Rando, Nicholas Carlini et al.
Do Generated Data Always Help Contrastive Learning?
Yifei Wang, Jizhe Zhang, Yisen Wang
Adaptive Data Optimization: Dynamic Sample Selection with Scaling Laws
Yiding Jiang, Allan Zhou, Zhili Feng et al.
SafeDreamer: Safe Reinforcement Learning with World Models
Weidong Huang, Jiaming Ji, Chunhe Xia et al.
$R^2$-Guard: Robust Reasoning Enabled LLM Guardrail via Knowledge-Enhanced Logical Reasoning
Mintong Kang, Bo Li
Preference Optimization for Reasoning with Pseudo Feedback
Fangkai Jiao, Geyang Guo, Xingxing Zhang et al.
ForecastBench: A Dynamic Benchmark of AI Forecasting Capabilities
Ezra Karger, Houtan Bastani, Chen Yueh-Han et al.
HiSplat: Hierarchical 3D Gaussian Splatting for Generalizable Sparse-View Reconstruction
Shengji Tang, Weicai Ye, Peng Ye et al.
LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation
Fangxun Shu, Yue Liao, Lei Zhang et al.
Computational Limits of Low-Rank Adaptation (LoRA) Fine-Tuning for Transformer Models
Jerry Yao-Chieh Hu, Maojiang Su, En-Jui Kuo et al.
AGILE3D: Attention Guided Interactive Multi-object 3D Segmentation
Yuanwen Yue, Sabarinath Mahadevan, Jonas Schult et al.
Compositional Entailment Learning for Hyperbolic Vision-Language Models
Avik Pal, Max van Spengler, Guido D'Amely di Melendugno et al.
LongGenBench: Benchmarking Long-Form Generation in Long Context LLMs
Yuhao Wu, Ming Shan Hee, Zhiqiang Hu et al.
A Closer Look at Machine Unlearning for Large Language Models
Xiaojian Yuan, Tianyu Pang, Chao Du et al.
Proactive Agent: Shifting LLM Agents from Reactive Responses to Active Assistance
Yaxi Lu, Shenzhi Yang, Cheng Qian et al.
Add-it: Training-Free Object Insertion in Images With Pretrained Diffusion Models
Yoad Tewel, Rinon Gal, Dvir Samuel et al.
Merging LoRAs like Playing LEGO: Pushing the Modularity of LoRA to Extremes Through Rank-Wise Clustering
Ziyu Zhao, Tao Shen, Didi Zhu et al.
Think while You Generate: Discrete Diffusion with Planned Denoising
Sulin Liu, Juno Nam, Andrew Campbell et al.
Text4Seg: Reimagining Image Segmentation as Text Generation
Mengcheng Lan, Chaofeng Chen, Yue Zhou et al.
Dynamic Diffusion Transformer
Wangbo Zhao, Yizeng Han, Jiasheng Tang et al.
MMWorld: Towards Multi-discipline Multi-faceted World Model Evaluation in Videos
Xuehai He, Weixi Feng, Kaizhi Zheng et al.
Efficient Evolutionary Search Over Chemical Space with Large Language Models
Haorui Wang, Marta Skreta, Cher-Tian Ser et al.