2025 Poster "supervised fine-tuning" Papers
17 papers found
Can Large Language Models Help Multimodal Language Analysis? MMLA: A Comprehensive Benchmark
Hanlei Zhang, Zhuohang Li, Hua Xu et al.
InstructRAG: Instructing Retrieval-Augmented Generation via Self-Synthesized Rationales
Zhepei Wei, Wei-Lin Chen, Yu Meng
Logical Consistency of Large Language Models in Fact-Checking
Bishwamittra Ghosh, Sarah Hasan, Naheed Anjum Arafat et al.
Longer Context, Deeper Thinking: Uncovering the Role of Long-Context Ability in Reasoning
Wang Yang, Zirui Liu, Hongye Jin et al.
Multi-Token Prediction Needs Registers
Anastasios Gerontopoulos, Spyridon Gidaris, Nikos Komodakis
Preserving Diversity in Supervised Fine-Tuning of Large Language Models
Ziniu Li, Congliang Chen, Tian Xu et al.
Reinforcement Learning with Backtracking Feedback
Bilgehan Sel, Vaishakh Keshava, Phillip Wallis et al.
Repetition Improves Language Model Embeddings
Jacob Springer, Suhas Kotha, Daniel Fried et al.
RoboRefer: Towards Spatial Referring with Reasoning in Vision-Language Models for Robotics
Enshen Zhou, Jingkun An, Cheng Chi et al.
Robot-R1: Reinforcement Learning for Enhanced Embodied Reasoning in Robotics
Dongyoung Kim, Huiwon Jang, Sumin Park et al.
Steering Information Utility in Key-Value Memory for Language Model Post-Training
Chunyuan Deng, Ruidi Chang, Hanjie Chen
The Promise of RL for Autoregressive Image Editing
Saba Ahmadi, Rabiul Awal, Ankur Sikarwar et al.
Topology of Reasoning: Understanding Large Reasoning Models through Reasoning Graph Properties
Gouki Minegishi, Hiroki Furuta, Takeshi Kojima et al.
Unveiling the Secret Recipe: A Guide For Supervised Fine-Tuning Small LLMs
Aldo Pareja, Nikhil Shivakumar Nayak, Hao Wang et al.
VPO: Aligning Text-to-Video Generation Models with Prompt Optimization
Jiale Cheng, Ruiliang Lyu, Xiaotao Gu et al.
WebDancer: Towards Autonomous Information Seeking Agency
Jialong Wu, Baixuan Li, Runnan Fang et al.
Words or Vision: Do Vision-Language Models Have Blind Faith in Text?
Ailin Deng, Tri Cao, Zhirui Chen et al.