Xiaofang Wang

Papers: 4

Total Citations: 7

Papers (4)

Building a Mind Palace: Structuring Environment-Grounded Semantic Graphs for Effective Long Video Analysis with LLMs

Accelerating Multimodal Large Language Models by Searching Optimal Vision Token Reduction

Apollo: An Exploration of Video Understanding in Large Multimodal Models

ControlRoom3D: Room Generation using Semantic Proxy Rooms