2025 "large language models" Papers
267 papers found • Page 5 of 6
Revising and Falsifying Sparse Autoencoder Feature Explanations
George Ma, Samuel Pfrommer, Somayeh Sojoudi
RHYTHM: Reasoning with Hierarchical Temporal Tokenization for Human Mobility
Haoyu He, Haozheng Luo, Yan Chen et al.
Risk-aware Direct Preference Optimization under Nested Risk Measure
Lijun Zhang, Lin Li, Yajie Qi et al.
RoboTron-Nav: A Unified Framework for Embodied Navigation Integrating Perception, Planning, and Prediction
Yufeng Zhong, Chengjian Feng, Feng Yan et al.
Robust Hallucination Detection in LLMs via Adaptive Token Selection
Mengjia Niu, Hamed Haddadi, Guansong Pang
ROUTE: Robust Multitask Tuning and Collaboration for Text-to-SQL
Yang Qin, Chao Chen, Zhihang Fu et al.
RSAVQ: Riemannian Sensitivity-Aware Vector Quantization for Large Language Models
Zukang Xu, Xing Hu, Qiang Wu et al.
rStar-Coder: Scaling Competitive Code Reasoning with a Large-Scale Verified Dataset
Yifei Liu, Li Lyna Zhang, Yi Zhu et al.
Scaling and context steer LLMs along the same computational path as the human brain
Joséphine Raugel, Jérémy Rapin, Stéphane d'Ascoli et al.
Self-Evolving Pseudo-Rehearsal for Catastrophic Forgetting with Task Similarity in LLMs
Jun Wang, Liang Ding, Shuai Wang et al.
Self Iterative Label Refinement via Robust Unlabeled Learning
Hikaru Asano, Tadashi Kozuno, Yukino Baba
Self-Updatable Large Language Models by Integrating Context into Model Parameters
Yu Wang, Xinshuang Liu, Xiusi Chen et al.
Self-Verification Provably Prevents Model Collapse in Recursive Synthetic Training
Shi Fu, Yingjie Wang, Yuzhu Chen et al.
SeRL: Self-play Reinforcement Learning for Large Language Models with Limited Data
Wenkai Fang, Shunyu Liu, Yang Zhou et al.
ShiQ: Bringing back Bellman to LLMs
Pierre Clavier, Nathan Grinsztajn, Raphaël Avalos et al.
SimPER: A Minimalist Approach to Preference Alignment without Hyperparameters
Teng Xiao, Yige Yuan, Zhengyu Chen et al.
SIMS: Simulating Stylized Human-Scene Interactions with Retrieval-Augmented Script Generation
Wenjia Wang, Liang Pan, Zhiyang Dou et al.
Simulating Society Requires Simulating Thought
Chance Jiajie Li, Jiayi Wu, Zhenze Mo et al.
S'MoRE: Structural Mixture of Residual Experts for Parameter-Efficient LLM Fine-tuning
Hanqing Zeng, Yinglong Xia, Zhuokai Zhao et al.
SORRY-Bench: Systematically Evaluating Large Language Model Safety Refusal
Tinghao Xie, Xiangyu Qi, Yi Zeng et al.
Sparse MeZO: Less Parameters for Better Performance in Zeroth-Order LLM Fine-Tuning
Yong Liu, Zirui Zhu, Chaoyu Gong et al.
SPARTUN3D: Situated Spatial Understanding of 3D World in Large Language Model
Yue Zhang, Zhiyang Xu, Ying Shen et al.
Speculative RAG: Enhancing Retrieval Augmented Generation through Drafting
Zilong (Ryan) Wang, Zifeng Wang, Long Le et al.
SpikeLLM: Scaling up Spiking Neural Network to Large Language Models via Saliency-based Spiking
Xingrun Xing, Boyan Gao, Zheng Liu et al.
SSTAG: Structure-Aware Self-Supervised Learning Method for Text-Attributed Graphs
Ruyue Liu, Rong Yin, Xiangzhen Bo et al.
SteerConf: Steering LLMs for Confidence Elicitation
Ziang Zhou, Tianyuan Jin, Jieming Shi et al.
Stop DDoS Attacking the Research Community with AI-Generated Survey Papers
Jianghao Lin, Rong Shan, Jiachen Zhu et al.
Streaming Attention Approximation via Discrepancy Theory
Ekaterina Kochetkova, Kshiteej Jitesh Sheth, Insu Han et al.
StructRAG: Boosting Knowledge Intensive Reasoning of LLMs via Inference-time Hybrid Information Structurization
Zhuoqun Li, Xuanang Chen, Haiyang Yu et al.
SWE-bench Goes Live!
Linghao Zhang, Shilin He, Chaoyun Zhang et al.
SynLogic: Synthesizing Verifiable Reasoning Data at Scale for Learning Logical Reasoning and Beyond
Junteng Liu, Yuanxiang Fan, Jiang Zhuo et al.
System Prompt Optimization with Meta-Learning
Yumin Choi, Jinheon Baek, Sung Ju Hwang
TCM-Ladder: A Benchmark for Multimodal Question Answering on Traditional Chinese Medicine
Jiacheng Xie, Yang Yu, Ziyang Zhang et al.
The Common Pile v0.1: An 8TB Dataset of Public Domain and Openly Licensed Text
Nikhil Kandpal, Brian Lester, Colin Raffel et al.
ThinkBench: Dynamic Out-of-Distribution Evaluation for Robust LLM Reasoning
Shulin Huang, Linyi Yang, Yan Song et al.
Think Thrice Before You Act: Progressive Thought Refinement in Large Language Models
Chengyu Du, Jinyi Han, Yizhou Ying et al.
Timely Clinical Diagnosis through Active Test Selection
Silas Ruhrberg Estévez, Nicolás Astorga, Mihaela van der Schaar
To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning
Zayne Sprague, Fangcong Yin, Juan Rodriguez et al.
TorchTitan: One-stop PyTorch native solution for production ready LLM pretraining
Wanchao Liang, Tianyu Liu, Less Wright et al.
Towards Effective Evaluations and Comparisons for LLM Unlearning Methods
Qizhou Wang, Bo Han, Puning Yang et al.
Towards Understanding Safety Alignment: A Mechanistic Perspective from Safety Neurons
Jianhui Chen, Xiaozhi Wang, Zijun Yao et al.
Training-Free Activation Sparsity in Large Language Models
James Liu, Pragaash Ponnusamy, Tianle Cai et al.
Training-Free Bayesianization for Low-Rank Adapters of Large Language Models
Haizhou Shi, Yibin Wang, Ligong Han et al.
TrajAgent: An LLM-Agent Framework for Trajectory Modeling via Large-and-Small Model Collaboration
Yuwei Du, Jie Feng, Jie Zhao et al.
Trajectory-LLM: A Language-based Data Generator for Trajectory Prediction in Autonomous Driving
Kairui Yang, Zihao Guo, Gengjie Lin et al.
Tree of Preferences for Diversified Recommendation
Hanyang Yuan, Ning Tang, Tongya Zheng et al.
Triplets Better Than Pairs: Towards Stable and Effective Self-Play Fine-Tuning for LLMs
Yibo Wang, Hai-Long Sun, Guangda Huzhang et al.
Truth over Tricks: Measuring and Mitigating Shortcut Learning in Misinformation Detection
Herun Wan, Jiaying Wu, Minnan Luo et al.
TSENOR: Highly-Efficient Algorithm for Finding Transposable N:M Sparse Masks
Xiang Meng, Mehdi Makni, Rahul Mazumder
TTRL: Test-Time Reinforcement Learning
Yuxin Zuo, Kaiyan Zhang, Li Sheng et al.