Most Cited ICLR "target-aware projection" Papers
6,124 papers found • Page 4 of 31
DisenBooth: Identity-Preserving Disentangled Tuning for Subject-Driven Text-to-Image Generation
Hong Chen, Yipeng Zhang, Simin Wu et al.
Plug-and-Play Policy Planner for Large Language Model Powered Dialogue Agents
Yang Deng, Wenxuan Zhang, Wai Lam et al.
BALROG: Benchmarking Agentic LLM and VLM Reasoning On Games
Davide Paglieri, Bartłomiej Cupiał, Samuel Coward et al.
PhysBench: Benchmarking and Enhancing Vision-Language Models for Physical World Understanding
Wei Chow, Jiageng Mao, Boyi Li et al.
Mega-TTS 2: Boosting Prompting Mechanisms for Zero-Shot Speech Synthesis
Ziyue Jiang, Jinglin Liu, Yi Ren et al.
Polynormer: Polynomial-Expressive Graph Transformer in Linear Time
Chenhui Deng, Zichao Yue, Zhiru Zhang
Dynamic Sparse No Training: Training-Free Fine-tuning for Sparse LLMs
Yuxin Zhang, Lirui Zhao, Mingbao Lin et al.
The Geometry of Categorical and Hierarchical Concepts in Large Language Models
Kiho Park, Yo Joong Choe, Yibo Jiang et al.
FreDF: Learning to Forecast in the Frequency Domain
Hao Wang, Lichen Pan, Yuan Shen et al.
Overthinking the Truth: Understanding how Language Models Process False Demonstrations
Danny Halawi, Jean-Stanislas Denain, Jacob Steinhardt
Relay Diffusion: Unifying diffusion process across resolutions for image synthesis
Jiayan Teng, Wendi Zheng, Ming Ding et al.
MedTrinity-25M: A Large-scale Multimodal Dataset with Multigranular Annotations for Medicine
Yunfei Xie, Ce Zhou, Lang Gao et al.
Planning in Natural Language Improves LLM Search for Code Generation
Evan Wang, Federico Cassano, Catherine Wu et al.
PromptTTS 2: Describing and Generating Voices with Text Prompt
Yichong Leng, Zhifang Guo, Kai Shen et al.
EfficientDM: Efficient Quantization-Aware Fine-Tuning of Low-Bit Diffusion Models
Yefei He, Jing Liu, Weijia Wu et al.
In-Context Learning through the Bayesian Prism
Madhur Panwar, Kabir Ahuja, Navin Goyal
Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models
Zachary Ankner, Cody Blakeney, Kartik Sreenivasan et al.
Inducing High Energy-Latency of Large Vision-Language Models with Verbose Images
Kuofeng Gao, Yang Bai, Jindong Gu et al.
Improving Text-to-Image Consistency via Automatic Prompt Optimization
Melissa Hall, Michal Drozdzal, Oscar Mañas et al.
Frequency-Aware Transformer for Learned Image Compression
Han Li, Shaohui Li, Wenrui Dai et al.
Simple Hierarchical Planning with Diffusion
Chang Chen, Fei Deng, Kenji Kawaguchi et al.
Scalable Diffusion for Materials Generation
Sherry Yang, Kwanghwan Cho, Amil Merchant et al.
TorchRL: A data-driven decision-making library for PyTorch
Albert Bou, Matteo Bettini, Sebastian Dittert et al.
Transformers as Decision Makers: Provable In-Context Reinforcement Learning via Supervised Pretraining
Licong Lin, Yu Bai, Song Mei
SWE-Search: Enhancing Software Agents with Monte Carlo Tree Search and Iterative Refinement
Antonis Antoniades, Albert Örwall, Kexun Zhang et al.
DriveTransformer: Unified Transformer for Scalable End-to-End Autonomous Driving
Xiaosong Jia, Junqi You, Zhiyuan Zhang et al.
GameGen-X: Interactive Open-world Game Video Generation
Haoxuan Che, Xuanhua He, Quande Liu et al.
VisualAgentBench: Towards Large Multimodal Models as Visual Foundation Agents
Xiao Liu, Tianjie Zhang, Yu Gu et al.
InstructScene: Instruction-Driven 3D Indoor Scene Synthesis with Semantic Graph Prior
Chenguo Lin, Yadong Mu
On the Learnability of Watermarks for Language Models
Chenchen Gu, Xiang Li, Percy Liang et al.
Weak to Strong Generalization for Large Language Models with Multi-capabilities
Yucheng Zhou, Jianbing Shen, Yu Cheng
Circumventing Concept Erasure Methods For Text-To-Image Generative Models
Minh Pham, Kelly Marshall, Niv Cohen et al.
Composed Image Retrieval with Text Feedback via Multi-grained Uncertainty Regularization
Yiyang Chen, Zhedong Zheng, Wei Ji et al.
Arithmetic Without Algorithms: Language Models Solve Math with a Bag of Heuristics
Yaniv Nikankin, Anja Reusch, Aaron Mueller et al.
MagicPIG: LSH Sampling for Efficient LLM Generation
Zhuoming Chen, Ranajoy Sadhukhan, Zihao Ye et al.
Accelerating Diffusion Transformers with Token-wise Feature Caching
Chang Zou, Xuyang Liu, Ting Liu et al.
ChartMimic: Evaluating LMM's Cross-Modal Reasoning Capability via Chart-to-Code Generation
Cheng Yang, Chufan Shi, Yaxin Liu et al.
How do Language Models Bind Entities in Context?
Jiahai Feng, Jacob Steinhardt
CycleResearcher: Improving Automated Research via Automated Review
Yixuan Weng, Minjun Zhu, Guangsheng Bao et al.
AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents
Ke Yang, Yao Liu, Sapana Chaudhary et al.
SolidGen: An Autoregressive Model for Direct B-rep Synthesis
Karl Willis, Joseph Lambourne, Nigel Morris et al.
Deep Confident Steps to New Pockets: Strategies for Docking Generalization
Gabriele Corso, Arthur Deng, Nicholas Polizzi et al.
Guiding Masked Representation Learning to Capture Spatio-Temporal Relationship of Electrocardiogram
Yeongyeon Na, Minje Park, Yunwon Tae et al.
Does Refusal Training in LLMs Generalize to the Past Tense?
Maksym Andriushchenko, Nicolas Flammarion
Learn-by-interact: A Data-Centric Framework For Self-Adaptive Agents in Realistic Environments
Hongjin Su, Ruoxi Sun, Jinsung Yoon et al.
Long Context Compression with Activation Beacon
Peitian Zhang, Zheng Liu, Shitao Xiao et al.
CatVTON: Concatenation Is All You Need for Virtual Try-On with Diffusion Models
Zheng Chong, Xiao Dong, Haoxiang Li et al.
Grokking as the transition from lazy to rich training dynamics
Tanishq Kumar, Blake Bordelon, Samuel Gershman et al.
Multi-Source Diffusion Models for Simultaneous Music Generation and Separation
Giorgio Mariani, Irene Tallini, Emilian Postolache et al.
Recursive Generalization Transformer for Image Super-Resolution
Zheng Chen, Yulun Zhang, Jinjin Gu et al.
ReDeEP: Detecting Hallucination in Retrieval-Augmented Generation via Mechanistic Interpretability
Zhongxiang Sun, Xiaoxue Zang, Kai Zheng et al.
Scaling for Training Time and Post-hoc Out-of-distribution Detection Enhancement
Kai Xu, Rongyu Chen, Gianni Franchi et al.
QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models
Jing Liu, Ruihao Gong, Xiuying Wei et al.
Scaling Laws for Precision
Tanishq Kumar, Zachary Ankner, Benjamin Spector et al.
ViDiT-Q: Efficient and Accurate Quantization of Diffusion Transformers for Image and Video Generation
Tianchen Zhao, Tongcheng Fang, Haofeng Huang et al.
ImageFolder: Autoregressive Image Generation with Folded Tokens
Xiang Li, Kai Qiu, Hao Chen et al.
Self-Introspective Decoding: Alleviating Hallucinations for Large Vision-Language Models
Fushuo Huo, Wenchao Xu, Zhong Zhang et al.
CATCH: Channel-Aware Multivariate Time Series Anomaly Detection via Frequency Patching
Xingjian Wu, Xiangfei Qiu, Zhengyu Li et al.
Tensor Programs VI: Feature Learning in Infinite Depth Neural Networks
Greg Yang, Dingli Yu, Chen Zhu et al.
Attention Satisfies: A Constraint-Satisfaction Lens on Factual Errors of Language Models
Mert Yuksekgonul, Varun Chandrasekaran, Erik Jones et al.
Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence
Weize Chen, Ziming You, Ran Li et al.
Deep Temporal Graph Clustering
Meng Liu, Yue Liu, Ke Liang et al.
Process Reward Model with Q-value Rankings
Wendi Li, Yixuan Li
MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning
Haotian Zhang, Mingfei Gao, Zhe Gan et al.
Successor Heads: Recurring, Interpretable Attention Heads In The Wild
Rhys Gould, Euan Ong, George Ogden et al.
Looped Transformers are Better at Learning Learning Algorithms
Liu Yang, Kangwook Lee, Robert Nowak et al.
Towards Seamless Adaptation of Pre-trained Models for Visual Place Recognition
Feng Lu, Lijun Zhang, Xiangyuan Lan et al.
AI Sandbagging: Language Models can Strategically Underperform on Evaluations
Teun van der Weij, Felix Hofstätter, Oliver Jaffe et al.
TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning
Xiangyu Zeng, Kunchang Li, Chenting Wang et al.
Learning Dynamics of LLM Finetuning
Yi Ren, Danica Sutherland
Interpreting and Editing Vision-Language Representations to Mitigate Hallucinations
Nick Jiang, Anish Kachinthaya, Suzanne Petryk et al.
Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models
Andy K Zhang, Neil Perry, Riya Dulepet et al.
Matryoshka Diffusion Models
Jiatao Gu, Shuangfei Zhai, Yizhe Zhang et al.
Space Group Constrained Crystal Generation
Rui Jiao, Wenbing Huang, Yu Liu et al.
C-TPT: Calibrated Test-Time Prompt Tuning for Vision-Language Models via Text Feature Dispersion
Hee Suk Yoon, Eunseop Yoon, Joshua Tian Jin Tee et al.
Image and Video Tokenization with Binary Spherical Quantization
Yue Zhao, Yuanjun Xiong, Philipp Krähenbühl
Alleviating Exposure Bias in Diffusion Models through Sampling with Shifted Time Steps
Mingxiao Li, Tingyu Qu, Ruicong Yao et al.
Evaluating the Zero-shot Robustness of Instruction-tuned Language Models
Jiuding Sun, Chantal Shaib, Byron Wallace
An Emulator for Fine-tuning Large Language Models using Small Language Models
Eric Mitchell, Rafael Rafailov, Archit Sharma et al.
Smaller, Weaker, Yet Better: Training LLM Reasoners via Compute-Optimal Sampling
Hritik Bansal, Arian Hosseini, Rishabh Agarwal et al.
GIM: Learning Generalizable Image Matcher From Internet Videos
Xuelun Shen, Zhipeng Cai, Wei Yin et al.
On the Foundations of Shortcut Learning
Katherine Hermann, Hossein Mobahi, Thomas FEL et al.
Web Agents with World Models: Learning and Leveraging Environment Dynamics in Web Navigation
Hyungjoo Chae, Namyoung Kim, Kai Ong et al.
DSBench: How Far Are Data Science Agents from Becoming Data Science Experts?
Liqiang Jing, Zhehui Huang, Xiaoyang Wang et al.
MovieDreamer: Hierarchical Generation for Coherent Long Visual Sequences
Canyu Zhao, Mingyu Liu, Wen Wang et al.
Don't Trust: Verify -- Grounding LLM Quantitative Reasoning with Autoformalization
Jin Zhou, Charles Staats, Wenda Li et al.
TabR: Tabular Deep Learning Meets Nearest Neighbors
Yury Gorishniy, Ivan Rubachev, Nikolay Kartashev et al.
GraphCare: Enhancing Healthcare Predictions with Personalized Knowledge Graphs
Pengcheng Jiang, Cao Xiao, Adam Cross et al.
Variational Bayesian Last Layers
James Harrison, John Willes, Jasper Snoek
Robust agents learn causal world models
Jonathan Richens, Tom Everitt
Towards Principled Evaluations of Sparse Autoencoders for Interpretability and Control
Aleksandar Makelov, Georg Lange, Neel Nanda
Cut the Crap: An Economical Communication Pipeline for LLM-based Multi-Agent Systems
Guibin Zhang, Yanwei Yue, Zhixun Li et al.
Animate-X: Universal Character Image Animation with Enhanced Motion Representation
Shuai Tan, Biao Gong, Xiang Wang et al.
Making Pre-trained Language Models Great on Tabular Prediction
Jiahuan Yan, Bo Zheng, Hongxia Xu et al.
CLaM-TTS: Improving Neural Codec Language Model for Zero-Shot Text-to-Speech
Jaehyeon Kim, Keon Lee, Seungjun Chung et al.
ImagenHub: Standardizing the evaluation of conditional image generation models
Max Ku, Tianle Li, Kai Zhang et al.
MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding
Ranajoy Sadhukhan, Jian Chen, Zhuoming Chen et al.
Dynamic-SUPERB Phase-2: A Collaboratively Expanding Benchmark for Measuring the Capabilities of Spoken Language Models with 180 Tasks
Chien-yu Huang, Wei-Chih Chen, Shu-wen Yang et al.
Masked Audio Generation using a Single Non-Autoregressive Transformer
Alon Ziv, Itai Gat, Gael Le Lan et al.
Tell Your Model Where to Attend: Post-hoc Attention Steering for LLMs
Qingru Zhang, Chandan Singh, Liyuan Liu et al.
Sentence-level Prompts Benefit Composed Image Retrieval
Yang Bai, Xinxing Xu, Yong Liu et al.
Monte Carlo guided Denoising Diffusion models for Bayesian linear inverse problems
Gabriel Cardoso, Yazid Janati el idrissi, Sylvain Le Corff et al.
Diffusion Generative Flow Samplers: Improving learning signals through partial trajectory optimization
Dinghuai Zhang, Ricky T. Q. Chen, Chenghao Liu et al.
Consistency Models as a Rich and Efficient Policy Class for Reinforcement Learning
Zihan Ding, Chi Jin
ViDA: Homeostatic Visual Domain Adapter for Continual Test Time Adaptation
Jiaming Liu, Senqiao Yang, Peidong Jia et al.
Toward effective protection against diffusion-based mimicry through score distillation
Haotian Xue, Chumeng Liang, Xiaoyu Wu et al.
Multiagent Finetuning: Self Improvement with Diverse Reasoning Chains
Vighnesh Subramaniam, Yilun Du, Joshua B Tenenbaum et al.
Matryoshka Multimodal Models
Mu Cai, Jianwei Yang, Jianfeng Gao et al.
Generative Pre-training for Speech with Flow Matching
Alexander Liu, Matthew Le, Apoorv Vyas et al.
PINNsFormer: A Transformer-Based Framework For Physics-Informed Neural Networks
Zhiyuan Zhao, Xueying Ding, B. Aditya Prakash
Scaling Transformers for Low-Bitrate High-Quality Speech Coding
Julian Parker, Anton Smirnov, Jordi Pons et al.
Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF
Shicong Cen, Jincheng Mei, Katayoon Goshvadi et al.
FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference
Xunhao Lai, Jianqiao Lu, Yao Luo et al.
RazorAttention: Efficient KV Cache Compression Through Retrieval Heads
Hanlin Tang, Yang Lin, Jing Lin et al.
PnP-Flow: Plug-and-Play Image Restoration with Flow Matching
Ségolène Martin, Anne Gagneux, Paul Hagemann et al.
DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer
Junyuan Hong, Jiachen (Tianhao) Wang, Chenhui Zhang et al.
Negative Label Guided OOD Detection with Pretrained Vision-Language Models
Xue Jiang, Feng Liu, Zhen Fang et al.
SAFREE: Training-Free and Adaptive Guard for Safe Text-to-Image And Video Generation
Jaehong Yoon, Shoubin Yu, Vaidehi Ramesh Patil et al.
SKILL-MIX: a Flexible and Expandable Family of Evaluations for AI Models
Dingli Yu, Simran Kaur, Arushi Gupta et al.
AgentSquare: Automatic LLM Agent Search in Modular Design Space
Yu Shang, Yu Li, Keyu Zhao et al.
SafeDiffuser: Safe Planning with Diffusion Probabilistic Models
Wei Xiao, Johnson (Tsun-Hsuan) Wang, Chuang Gan et al.
BEND: Benchmarking DNA Language Models on Biologically Meaningful Tasks
Frederikke Marin, Felix Teufel, Marc Horlacher et al.
CodeChain: Towards Modular Code Generation Through Chain of Self-revisions with Representative Sub-modules
Hung Le, Hailin Chen, Amrita Saha et al.
Reti-Diff: Illumination Degradation Image Restoration with Retinex-based Latent Diffusion Model
Chunming He, Chengyu Fang, Yulun Zhang et al.
Compressing LLMs: The Truth is Rarely Pure and Never Simple
Ajay Jaiswal, Zhe Gan, Xianzhi Du et al.
Proteina: Scaling Flow-based Protein Structure Generative Models
Tomas Geffner, Kieran Didi, Zuobai Zhang et al.
ScienceAgentBench: Toward Rigorous Assessment of Language Agents for Data-Driven Scientific Discovery
Ziru Chen, Shijie Chen, Yuting Ning et al.
Building Math Agents with Multi-Turn Iterative Preference Learning
Wei Xiong, Chengshuai Shi, Jiaming Shen et al.
See What You Are Told: Visual Attention Sink in Large Multimodal Models
Seil Kang, Jinyeong Kim, Junhyeok Kim et al.
LEAP: Liberate Sparse-View 3D Modeling from Camera Poses
Hanwen Jiang, Zhenyu Jiang, Yue Zhao et al.
LaneSegNet: Map Learning with Lane Segment Perception for Autonomous Driving
Tianyu Li, Peijin Jia, Bangjun Wang et al.
Geographic Location Encoding with Spherical Harmonics and Sinusoidal Representation Networks
Marc Rußwurm, Konstantin Klemmer, Esther Rolf et al.
The Blessing of Randomness: SDE Beats ODE in General Diffusion-based Image Editing
Shen Nie, Hanzhong Guo, Cheng Lu et al.
Language Model Inversion
John X. Morris, Wenting Zhao, Justin Chiu et al.
Booster: Tackling Harmful Fine-tuning for Large Language Models via Attenuating Harmful Perturbation
Tiansheng Huang, Sihao Hu, Fatih Ilhan et al.
Differentially Private Synthetic Data via Foundation Model APIs 1: Images
Zinan Lin, Sivakanth Gopi, Janardhan Kulkarni et al.
Repetition Improves Language Model Embeddings
Jacob Springer, Suhas Kotha, Daniel Fried et al.
Lemur: Integrating Large Language Models in Automated Program Verification
Haoze Wu, Clark Barrett, Nina Narodytska
Not All Heads Matter: A Head-Level KV Cache Compression Method with Integrated Retrieval and Reasoning
Yu Fu, Zefan Cai, Abedelkadir Asi et al.
Self-Improvement in Language Models: The Sharpening Mechanism
Audrey Huang, Adam Block, Dylan Foster et al.
Ground-A-Video: Zero-shot Grounded Video Editing using Text-to-image Diffusion Models
Hyeonho Jeong, Jong Chul Ye
COLLIE: Systematic Construction of Constrained Text Generation Tasks
Shunyu Yao, Howard Chen, Austin Hanjie et al.
Massive Editing for Large Language Models via Meta Learning
Chenmien Tan, Ge Zhang, Jie Fu
Exploring Target Representations for Masked Autoencoders
Xingbin Liu, Jinghao Zhou, Tao Kong et al.
On Diffusion Modeling for Anomaly Detection
Victor Livernoche, Vineet Jain, Yashar Hezaveh et al.
Tell me about yourself: LLMs are aware of their learned behaviors
Jan Betley, Xuchan Bao, Martín Soto et al.
NeuroLM: A Universal Multi-task Foundation Model for Bridging the Gap between Language and EEG Signals
Wei-Bang Jiang, Yansen Wang, Bao-liang Lu et al.
Magnushammer: A Transformer-Based Approach to Premise Selection
Maciej Mikuła, Szymon Tworkowski, Szymon Antoniak et al.
How to Evaluate Reward Models for RLHF
Evan Frick, Tianle Li, Connor Chen et al.
Multi-View Causal Representation Learning with Partial Observability
Dingling Yao, Danru Xu, Sébastien Lachapelle et al.
Towards Semantic Equivalence of Tokenization in Multimodal LLM
Shengqiong Wu, Hao Fei, Xiangtai Li et al.
FasterCache: Training-Free Video Diffusion Model Acceleration with High Quality
Zhengyao Lyu, Chenyang Si, Junhao Song et al.
LLaRA: Supercharging Robot Learning Data for Vision-Language Policy
Xiang Li, Cristina Mata, Jongwoo Park et al.
OWL: A Large Language Model for IT Operations
Hongcheng Guo, Jian Yang, Jiaheng Liu et al.
Simple is Effective: The Roles of Graphs and Large Language Models in Knowledge-Graph-Based Retrieval-Augmented Generation
Mufei Li, Siqi Miao, Pan Li
What Matters When Repurposing Diffusion Models for General Dense Perception Tasks?
Guangkai Xu, Yongtao Ge, Mingyu Liu et al.
EdgeRunner: Auto-regressive Auto-encoder for Artistic Mesh Generation
Jiaxiang Tang, Max Li, Zekun Hao et al.
MuSc: Zero-Shot Industrial Anomaly Classification and Segmentation with Mutual Scoring of the Unlabeled Images
Xurui Li, Ziming Huang, Feng Xue et al.
Hymba: A Hybrid-head Architecture for Small Language Models
Xin Dong, Yonggan Fu, Shizhe Diao et al.
LLM-SR: Scientific Equation Discovery via Programming with Large Language Models
Parshin Shojaee, Kazem Meidani, Shashank Gupta et al.
Controlling Space and Time with Diffusion Models
Daniel Watson, Saurabh Saxena, Lala Li et al.
Physics-Informed Diffusion Models
Jan-Hendrik Bastek, WaiChing Sun, Dennis Kochmann
An Unforgeable Publicly Verifiable Watermark for Large Language Models
Aiwei Liu, Leyi Pan, Xuming Hu et al.
LLMOPT: Learning to Define and Solve General Optimization Problems from Scratch
Caigao Jiang, Xiang Shu, Hong Qian et al.
Dichotomy of Early and Late Phase Implicit Biases Can Provably Induce Grokking
Kaifeng Lyu, Jikai Jin, Zhiyuan Li et al.
Safe Offline Reinforcement Learning with Feasibility-Guided Diffusion Model
Yinan Zheng, Jianxiong Li, Dongjie Yu et al.
Raidar: geneRative AI Detection viA Rewriting
Chengzhi Mao, Carl Vondrick, Hao Wang et al.
Eliminating Oversaturation and Artifacts of High Guidance Scales in Diffusion Models
Seyedmorteza Sadat, Otmar Hilliges, Romann Weber
CLoSD: Closing the Loop between Simulation and Diffusion for multi-task character control
Guy Tevet, Sigal Raab, Setareh Cohan et al.
Benchmarking Multimodal Retrieval Augmented Generation with Dynamic VQA Dataset and Self-adaptive Planning Agent
Yangning Li, Yinghui Li, Xinyu Wang et al.
Prototypical Information Bottlenecking and Disentangling for Multimodal Cancer Survival Prediction
Yilan Zhang, Yingxue Xu, Jianqi Chen et al.
TabM: Advancing tabular deep learning with parameter-efficient ensembling
Yury Gorishniy, Akim Kotelnikov, Artem Babenko
Progressive3D: Progressively Local Editing for Text-to-3D Content Creation with Complex Semantic Prompts
Xinhua Cheng, Tianyu Yang, Jianan Wang et al.
Controlled Text Generation via Language Model Arithmetic
Jasper Dekoninck, Marc Fischer, Luca Beurer-Kellner et al.
Simplifying Deep Temporal Difference Learning
Matteo Gallici, Mattie Fellows, Benjamin Ellis et al.
From Molecules to Materials: Pre-training Large Generalizable Models for Atomic Property Prediction
Nima Shoghi, Adeesh Kolluru, John Kitchin et al.
OpenNeRF: Open Set 3D Neural Scene Segmentation with Pixel-Wise Features and Rendered Novel Views
Francis Engelmann, Fabian Manhardt, Michael Niemeyer et al.
RepoGraph: Enhancing AI Software Engineering with Repository-level Code Graph
Siru Ouyang, Wenhao Yu, Kaixin Ma et al.
On Penalty Methods for Nonconvex Bilevel Optimization and First-Order Stochastic Approximation
Jeongyeol Kwon, Dohyun Kwon, Stephen Wright et al.
In-Context Learning Learns Label Relationships but Is Not Conventional Learning
Jannik Kossen, Yarin Gal, Tom Rainforth
LLM Blueprint: Enabling Text-to-Image Generation with Complex and Detailed Prompts
Hanan Gani, Shariq Bhat, Muzammal Naseer et al.
Towards Interpreting Visual Information Processing in Vision-Language Models
Clement Neo, Luke Ong, Philip Torr et al.
Seer: Language Instructed Video Prediction with Latent Diffusion Models
Xianfan Gu, Chuan Wen, Weirui Ye et al.
Towards Realistic UAV Vision-Language Navigation: Platform, Benchmark, and Methodology
Xiangyu Wang, Donglin Yang, Ziqin Wang et al.
UniAdapter: Unified Parameter-Efficient Transfer Learning for Cross-modal Modeling
Haoyu Lu, Yuqi Huo, Guoxing Yang et al.
Test-Time Training on Nearest Neighbors for Large Language Models
Moritz Hardt, Yu Sun
LLM Unlearning via Loss Adjustment with Only Forget Data
Yaxuan Wang, Jiaheng Wei, Yuhao Liu et al.
SynCamMaster: Synchronizing Multi-Camera Video Generation from Diverse Viewpoints
Jianhong Bai, Menghan Xia, Xintao Wang et al.
Beyond Weisfeiler-Lehman: A Quantitative Framework for GNN Expressiveness
Bohang Zhang, Jingchu Gai, Yiheng Du et al.
Tuning LayerNorm in Attention: Towards Efficient Multi-Modal LLM Finetuning
Bingchen Zhao, Haoqin Tu, Chen Wei et al.
SALMON: Self-Alignment with Instructable Reward Models
Zhiqing Sun, Yikang Shen, Hongxin Zhang et al.
Model merging with SVD to tie the Knots
George Stoica, Pratik Ramesh, Boglarka Ecsedi et al.
KOR-Bench: Benchmarking Language Models on Knowledge-Orthogonal Reasoning Tasks
Kaijing Ma, Xeron Du, Yunran Wang et al.
Energy-Based Diffusion Language Models for Text Generation
Minkai Xu, Tomas Geffner, Karsten Kreis et al.
MVSFormer++: Revealing the Devil in Transformer's Details for Multi-View Stereo
Chenjie Cao, Xinlin Ren, Yanwei Fu
SEAL: Safety-enhanced Aligned LLM Fine-tuning via Bilevel Data Selection
Han Shen, Pin-Yu Chen, Payel Das et al.
LoRA-Pro: Are Low-Rank Adapters Properly Optimized?
Zhengbo Wang, Jian Liang, Ran He et al.
MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation
Chenxi Wang, Xiang Chen, Ningyu Zhang et al.
Correcting the Mythos of KL-Regularization: Direct Alignment without Overoptimization via Chi-Squared Preference Optimization
Audrey Huang, Wenhao Zhan, Tengyang Xie et al.
AgentTrek: Agent Trajectory Synthesis via Guiding Replay with Web Tutorials
Yiheng Xu, Dunjie Lu, Zhennan Shen et al.