Poster papers matching "instruction tuning"

64 papers found • Page 1 of 2

3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination
Jianing "Jed" Yang, Xuweiyi Chen, Nikhil Madaan et al.
CVPR 2025 · arXiv:2406.05132 · 30 citations

3D-LLaVA: Towards Generalist 3D LMMs with Omni Superpoint Transformer
Jiajun Deng, Tianyu He, Li Jiang et al.
CVPR 2025 · arXiv:2501.01163 · 40 citations

3D-MolT5: Leveraging Discrete Structural Information for Molecule-Text Modeling
Qizhi Pei, Rui Yan, Kaiyuan Gao et al.
ICLR 2025 · arXiv:2406.05797 · 6 citations

A Stronger Mixture of Low-Rank Experts for Fine-Tuning Foundation Models
Mengyang Sun, Yihao Wang, Tao Feng et al.
ICML 2025 · arXiv:2502.15828 · 6 citations

Can Large Language Models Help Multimodal Language Analysis? MMLA: A Comprehensive Benchmark
Hanlei Zhang, Zhuohang Li, Hua Xu et al.
NeurIPS 2025 · arXiv:2504.16427 · 2 citations

Chain of Execution Supervision Promotes General Reasoning in Large Language Models
Nuo Chen, Zehua Li, Keqin Bao et al.
NeurIPS 2025 · arXiv:2510.23629

ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities
Peng Xu, Wei Ping, Xianchao Wu et al.
ICLR 2025 · arXiv:2407.14482 · 37 citations

ChatReID: Open-ended Interactive Person Retrieval via Hierarchical Progressive Tuning for Vision Language Models
Ke Niu, Haiyang Yu, Mengyang Zhao et al.
ICCV 2025 · arXiv:2502.19958 · 8 citations

Decoupling Angles and Strength in Low-rank Adaptation
Massimo Bini, Leander Girrbach, Zeynep Akata
ICLR 2025 · arXiv:2503.18225 · 8 citations

DELIFT: Data Efficient Language model Instruction Fine-Tuning
Ishika Agarwal, Krishnateja Killamsetty, Lucian Popa et al.
ICLR 2025 · arXiv:2411.04425 · 9 citations

Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation
Yuying Ge, Yizhuo Li, Yixiao Ge et al.
CVPR 2025 · arXiv:2412.04432 · 9 citations

Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want
Weifeng Lin, Xinyu Wei, Ruichuan An et al.
ICLR 2025 · arXiv:2403.20271 · 87 citations

Emerging Safety Attack and Defense in Federated Instruction Tuning of Large Language Models
Rui Ye, Jingyi Chai, Xiangrui Liu et al.
ICLR 2025 · arXiv:2406.10630 · 18 citations

EmoEdit: Evoking Emotions through Image Manipulation
Jingyuan Yang, Jiawei Feng, Weibin Luo et al.
CVPR 2025 · arXiv:2405.12661 · 20 citations

Facilitating Multi-turn Function Calling for LLMs via Compositional Instruction Tuning
Mingyang Chen, sunhaoze, Tianpeng Li et al.
ICLR 2025 · arXiv:2410.12952 · 26 citations

Fact-R1: Towards Explainable Video Misinformation Detection with Deep Reasoning
Fanrui Zhang, Dian Li, Qiang Zhang et al.
NeurIPS 2025 · arXiv:2505.16836 · 4 citations

Federated Continual Instruction Tuning
Haiyang Guo, Fanhu Zeng, Fei Zhu et al.
ICCV 2025 · arXiv:2503.12897 · 7 citations

Fine-tuning with Reserved Majority for Noise Reduction
Shuyang Jiang, Yusheng Liao, Ya Zhang et al.
ICLR 2025 · 2 citations

FlowerTune: A Cross-Domain Benchmark for Federated Fine-Tuning of Large Language Models
Yan Gao, Massimo R. Scamarcia, Javier Fernandez-Marques et al.
NeurIPS 2025 · arXiv:2506.02961 · 4 citations

Generative Representational Instruction Tuning
Niklas Muennighoff, Hongjin Su, Liang Wang et al.
ICLR 2025 · arXiv:2402.09906 · 222 citations

HMVLM: Human Motion-Vision-Language Model via MoE LoRA
Lei Hu, Yongjing Ye, Shihong Xia
NeurIPS 2025

HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding
Chenxin Tao, Shiqian Su, Xizhou Zhu et al.
CVPR 2025 · arXiv:2412.16158 · 5 citations

Human Motion Instruction Tuning
Lei Li, Sen Jia, Jianhao Wang et al.
CVPR 2025 · arXiv:2411.16805 · 14 citations

Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning
Simran Kaur, Simon Park, Anirudh Goyal et al.
ICLR 2025 · arXiv:2408.14774 · 19 citations

Learning Dynamics of LLM Finetuning
Yi Ren, Danica Sutherland
ICLR 2025 · arXiv:2407.10490 · 67 citations

Mimir: Improving Video Diffusion Models for Precise Text Understanding
Shuai Tan, Biao Gong, Yutong Feng et al.
CVPR 2025 · arXiv:2412.03085 · 16 citations

MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning
Xinyan Chen, Renrui Zhang, Dongzhi Jiang et al.
NeurIPS 2025 · arXiv:2506.05331 · 24 citations

Mitigating Object Hallucination in MLLMs via Data-augmented Phrase-level Alignment
Pritam Sarkar, Sayna Ebrahimi, Ali Etemad et al.
ICLR 2025 · arXiv:2405.18654 · 22 citations

Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning
Xiaochuan Li, Zichun Yu, Chenyan Xiong
ICLR 2025 · arXiv:2410.14208 · 5 citations

OmniBench: Towards The Future of Universal Omni-Language Models
Yizhi Li, Ge Zhang, Yinghao Ma et al.
NeurIPS 2025 · arXiv:2409.15272 · 53 citations

Online Video Understanding: OVBench and VideoChat-Online
Zhenpeng Huang, Xinhao Li, Jiaqi Li et al.
CVPR 2025 · arXiv:2501.00584 · 12 citations

OrderChain: Towards General Instruct-Tuning for Stimulating the Ordinal Understanding Ability of MLLM
Jinhong Wang, Shuo Tong, Jintai Chen et al.
ICCV 2025 · arXiv:2504.04801

PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions
Weifeng Lin, Xinyu Wei, Renrui Zhang et al.
ICLR 2025 · arXiv:2409.15278 · 26 citations

ProSec: Fortifying Code LLMs with Proactive Security Alignment
Xiangzhe Xu, Zian Su, Jinyao Guo et al.
ICML 2025 · arXiv:2411.12882 · 13 citations

RoomTour3D: Geometry-Aware Video-Instruction Tuning for Embodied Navigation
Mingfei Han, Liang Ma, Kamila Zhumakhanova et al.
CVPR 2025 · arXiv:2412.08591 · 12 citations

Scale Efficient Training for Large Datasets
Qing Zhou, Junyu Gao, Qi Wang
CVPR 2025 · arXiv:2503.13385 · 3 citations

Scaling Instruction-tuned LLMs to Million-token Contexts via Hierarchical Synthetic Data Generation
Linda He, Jue Wang, Maurice Weber et al.
ICLR 2025 · arXiv:2504.12637 · 2 citations

Synthetic Visual Genome
Jae Sung Park, Zixian Ma, Linjie Li et al.
CVPR 2025 · arXiv:2506.07643 · 2 citations

Teaching VLMs to Localize Specific Objects from In-context Examples
Sivan Doveh, Nimrod Shabtay, Eli Schwartz et al.
ICCV 2025 · arXiv:2411.13317 · 3 citations

TOGA: Temporally Grounded Open-Ended Video QA with Weak Supervision
Ayush Gupta, Anirban Roy, Rama Chellappa et al.
ICCV 2025 · arXiv:2506.09445

Training Language Models on Synthetic Edit Sequences Improves Code Synthesis
Ulyana Piterbarg, Lerrel Pinto, Rob Fergus
ICLR 2025 · arXiv:2410.02749 · 7 citations

T-SHIRT: Token-Selective Hierarchical Data Selection for Instruction Tuning
Yanjun Fu, Faisal Hamman, Sanghamitra Dutta
NeurIPS 2025 · arXiv:2506.01317 · 7 citations

Why Knowledge Distillation Works in Generative Models: A Minimal Working Explanation
Sungmin Cha, Kyunghyun Cho
NeurIPS 2025 · arXiv:2505.13111 · 4 citations

WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
Haipeng Luo, Qingfeng Sun, Can Xu et al.
ICLR 2025 · arXiv:2308.09583 · 655 citations

A Closer Look at the Limitations of Instruction Tuning
Sreyan Ghosh, Chandra Kiran Evuru, Sonal Kumar et al.
ICML 2024 · arXiv:2402.05119 · 83 citations

Dolphins: Multimodal Language Model for Driving
Yingzi Ma, Yulong Cao, Jiachen Sun et al.
ECCV 2024 · arXiv:2312.00438 · 128 citations

eCeLLM: Generalizing Large Language Models for E-commerce from Large-scale, High-quality Instruction Data
Peng, Xinyi Ling, Ziru Chen et al.
ICML 2024 · arXiv:2402.08831 · 46 citations

Evaluating Model Bias Requires Characterizing its Mistakes
Isabela Albuquerque, Jessica Schrouff, David Warde-Farley et al.
ICML 2024 · arXiv:2407.10633 · 3 citations

Executable Code Actions Elicit Better LLM Agents
Xingyao Wang, Yangyi Chen, Lifan Yuan et al.
ICML 2024 · arXiv:2402.01030 · 344 citations

Facial Affective Behavior Analysis with Instruction Tuning
Yifan Li, Anh Dao, Wentao Bao et al.
ECCV 2024 · arXiv:2404.05052 · 24 citations