CVPR 2025 "large language models" Papers
10 papers found
A Simple yet Effective Layout Token in Large Language Models for Document Understanding
Zhaoqing Zhu, Chuwei Luo, Zirui Shao et al.
CVPR 2025 | poster | arXiv:2503.18434
7 citations
CXPMRG-Bench: Pre-training and Benchmarking for X-ray Medical Report Generation on CheXpert Plus Dataset
Xiao Wang, Fuling Wang, Yuehang Li et al.
CVPR 2025 | poster | arXiv:2410.00379
16 citations
DivPrune: Diversity-based Visual Token Pruning for Large Multimodal Models
Saeed Ranjbar Alvar, Gursimran Singh, Mohammad Akbari et al.
CVPR 2025 | poster | arXiv:2503.02175
48 citations
DSV-LFS: Unifying LLM-Driven Semantic Cues with Visual Features for Robust Few-Shot Segmentation
Amin Karimi, Charalambos Poullis
CVPR 2025 | poster | arXiv:2503.04006
4 citations
Imagine and Seek: Improving Composed Image Retrieval with an Imagined Proxy
You Li, Fan Ma, Yi Yang
CVPR 2025 | poster | arXiv:2411.16752
9 citations
InteractAnything: Zero-shot Human Object Interaction Synthesis via LLM Feedback and Object Affordance Parsing
Jinlu Zhang, Yixin Chen, Zan Wang et al.
CVPR 2025 | highlight | arXiv:2505.24315
7 citations
IterIS: Iterative Inference-Solving Alignment for LoRA Merging
Hongxu Chen, Zhen Wang, Runshi Li et al.
CVPR 2025 | poster | arXiv:2411.15231
5 citations
Learning from Neighbors: Category Extrapolation for Long-Tail Learning
Shizhen Zhao, Xin Wen, Jiahui Liu et al.
CVPR 2025 | poster | arXiv:2410.15980
5 citations
Rethinking Vision-Language Model in Face Forensics: Multi-Modal Interpretable Forged Face Detector
Xiao Guo, Xiufeng Song, Yue Zhang et al.
CVPR 2025 | poster | arXiv:2503.20188
24 citations
Video Summarization with Large Language Models
Min Jung Lee, Dayoung Gong, Minsu Cho
CVPR 2025 | poster | arXiv:2504.11199
8 citations