Poster "instruction tuning" Papers
64 papers found • Page 1 of 2
Conference
3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination
Jianing "Jed" Yang, Xuweiyi Chen, Nikhil Madaan et al.
3D-LLaVA: Towards Generalist 3D LMMs with Omni Superpoint Transformer
Jiajun Deng, Tianyu He, Li Jiang et al.
3D-MolT5: Leveraging Discrete Structural Information for Molecule-Text Modeling
Qizhi Pei, Rui Yan, Kaiyuan Gao et al.
A Stronger Mixture of Low-Rank Experts for Fine-Tuning Foundation Models
Mengyang Sun, Yihao Wang, Tao Feng et al.
Can Large Language Models Help Multimodal Language Analysis? MMLA: A Comprehensive Benchmark
Hanlei Zhang, zhuohang li, Hua Xu et al.
Chain of Execution Supervision Promotes General Reasoning in Large Language Models
Nuo Chen, Zehua Li, Keqin Bao et al.
ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities
Peng Xu, Wei Ping, Xianchao Wu et al.
ChatReID: Open-ended Interactive Person Retrieval via Hierarchical Progressive Tuning for Vision Language Models
Ke Niu, Haiyang Yu, Mengyang Zhao et al.
Decoupling Angles and Strength in Low-rank Adaptation
Massimo Bini, Leander Girrbach, Zeynep Akata
DELIFT: Data Efficient Language model Instruction Fine-Tuning
Ishika Agarwal, Krishnateja Killamsetty, Lucian Popa et al.
Divot: Diffusion Powers Video Tokenizer for Comprehension and Generation
Yuying Ge, Yizhuo Li, Yixiao Ge et al.
Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want
Weifeng Lin, Xinyu Wei, Ruichuan An et al.
Emerging Safety Attack and Defense in Federated Instruction Tuning of Large Language Models
Rui Ye, Jingyi Chai, Xiangrui Liu et al.
EmoEdit: Evoking Emotions through Image Manipulation
Jingyuan Yang, Jiawei Feng, Weibin Luo et al.
Facilitating Multi-turn Function Calling for LLMs via Compositional Instruction Tuning
Mingyang Chen, sunhaoze, Tianpeng Li et al.
Fact-R1: Towards Explainable Video Misinformation Detection with Deep Reasoning
Fanrui Zhang, Dian Li, Qiang Zhang et al.
Federated Continual Instruction Tuning
Haiyang Guo, Fanhu Zeng, Fei Zhu et al.
Fine-tuning with Reserved Majority for Noise Reduction
Shuyang Jiang, Yusheng Liao, Ya Zhang et al.
FlowerTune: A Cross-Domain Benchmark for Federated Fine-Tuning of Large Language Models
Yan Gao, Massimo R. Scamarcia, Javier Fernandez-Marques et al.
Generative Representational Instruction Tuning
Niklas Muennighoff, Hongjin SU, Liang Wang et al.
HMVLM:Human Motion-Vision-Language Model via MoE LoRA
Lei Hu, Yongjing Ye, Shihong Xia
HoVLE: Unleashing the Power of Monolithic Vision-Language Models with Holistic Vision-Language Embedding
Chenxin Tao, Shiqian Su, Xizhou Zhu et al.
Human Motion Instruction Tuning
Lei Li, Sen Jia, Jianhao Wang et al.
Instruct-SkillMix: A Powerful Pipeline for LLM Instruction Tuning
Simran Kaur, Simon Park, Anirudh Goyal et al.
Learning Dynamics of LLM Finetuning
YI REN, Danica Sutherland
Mimir: Improving Video Diffusion Models for Precise Text Understanding
Shuai Tan, Biao Gong, Yutong Feng et al.
MINT-CoT: Enabling Interleaved Visual Tokens in Mathematical Chain-of-Thought Reasoning
Xinyan Chen, Renrui Zhang, Dongzhi JIANG et al.
Mitigating Object Hallucination in MLLMs via Data-augmented Phrase-level Alignment
Pritam Sarkar, Sayna Ebrahimi, Ali Etemad et al.
Montessori-Instruct: Generate Influential Training Data Tailored for Student Learning
Xiaochuan Li, Zichun Yu, Chenyan Xiong
OmniBench: Towards The Future of Universal Omni-Language Models
Yizhi Li, Ge Zhang, Yinghao Ma et al.
Online Video Understanding: OVBench and VideoChat-Online
Zhenpeng Huang, Xinhao Li, Jiaqi Li et al.
OrderChain: Towards General Instruct-Tuning for Stimulating the Ordinal Understanding Ability of MLLM
Jinhong Wang, Shuo Tong, Jintai CHEN et al.
PixWizard: Versatile Image-to-Image Visual Assistant with Open-Language Instructions
Weifeng Lin, Xinyu Wei, Renrui Zhang et al.
ProSec: Fortifying Code LLMs with Proactive Security Alignment
Xiangzhe Xu, Zian Su, Jinyao Guo et al.
RoomTour3D: Geometry-Aware Video-Instruction Tuning for Embodied Navigation
Mingfei Han, Liang Ma, Kamila Zhumakhanova et al.
Scale Efficient Training for Large Datasets
Qing Zhou, Junyu Gao, Qi Wang
Scaling Instruction-tuned LLMs to Million-token Contexts via Hierarchical Synthetic Data Generation
Linda He, Jue Wang, Maurice Weber et al.
Synthetic Visual Genome
Jae Sung Park, Zixian Ma, Linjie Li et al.
Teaching VLMs to Localize Specific Objects from In-context Examples
Sivan Doveh, Nimrod Shabtay, Eli Schwartz et al.
TOGA: Temporally Grounded Open-Ended Video QA with Weak Supervision
Ayush Gupta, Anirban Roy, Rama Chellappa et al.
Training Language Models on Synthetic Edit Sequences Improves Code Synthesis
Ulyana Piterbarg, Lerrel Pinto, Rob Fergus
T-SHIRT: Token-Selective Hierarchical Data Selection for Instruction Tuning
Yanjun Fu, Faisal Hamman, Sanghamitra Dutta
Why Knowledge Distillation Works in Generative Models: A Minimal Working Explanation
Sungmin Cha, Kyunghyun Cho
WizardMath: Empowering Mathematical Reasoning for Large Language Models via Reinforced Evol-Instruct
Haipeng Luo, Qingfeng Sun, Can Xu et al.
A Closer Look at the Limitations of Instruction Tuning
Sreyan Ghosh, Chandra Kiran Evuru, Sonal Kumar et al.
Dolphins: Multimodal Language Model for Driving
Yingzi Ma, Yulong Cao, Jiachen Sun et al.
eCeLLM: Generalizing Large Language Models for E-commerce from Large-scale, High-quality Instruction Data
Peng, Xinyi Ling, Ziru Chen et al.
Evaluating Model Bias Requires Characterizing its Mistakes
Isabela Albuquerque, Jessica Schrouff, David Warde-Farley et al.
Executable Code Actions Elicit Better LLM Agents
Xingyao Wang, Yangyi Chen, Lifan Yuan et al.
Facial Affective Behavior Analysis with Instruction Tuning
Yifan Li, Anh Dao, Wentao Bao et al.