Pandeng Li

Papers: 9

Total Citations: 76

Papers (9)

Towards Balanced Alignment: Modal-Enhanced Semantic Modeling for Video Moment Retrieval

Hybrid-Level Instruction Injection for Video Token Compression in Multi-modal Large Language Models

UFO: A Unified Approach to Fine-grained Visual Perception via Open-ended Language Interface

NeurIPS 2025, arXiv

AlignZeg: Mitigating Objective Misalignment for Zero-shot Semantic Segmentation

CAPability: A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Thoroughness

CLIP-Adapted Region-to-Text Learning for Generative Open-Vocabulary Semantic Segmentation

Progressive Spatio-Temporal Prototype Matching for Text-Video Retrieval

Dual-Stream Knowledge-Preserving Hashing for Unsupervised Video Retrieval

MomentDiff: Generative Video Moment Retrieval from Random to Real