Weijia Li

Papers: 14

Total Citations: 139

Papers (14)

LEGION: Learning to Ground and Explain for Synthetic Image Detection

Cross-view image geo-localization with Panorama-BEV Co-Retrieval Network

UrBench: A Comprehensive Benchmark for Evaluating Large Multimodal Models in Multi-View Urban Scenarios

SG-BEV: Satellite-Guided BEV Fusion for Cross-View Semantic Segmentation

Where am I? Cross-View Geo-localization with Natural Language Descriptions

Efficient Multi-modal Large Language Models via Progressive Consistency Distillation

NeurIPS 2025, arXiv

Can Large Multimodal Models Understand Agricultural Scenes? Benchmarking with AgroMind

NeurIPS 2025, arXiv

BLINK-Twice: You see, but do you observe? A Reasoning Benchmark on Visual Perception

NeurIPS 2025, arXiv

Leveraging BEV Paradigm for Ground-to-Aerial Image Synthesis

VHM: Versatile and Honest Vision Language Model for Remote Sensing Image Analysis

Scene4U: Hierarchical Layered 3D Scene Reconstruction from Single Panoramic Image for Your Immerse Exploration

Building Bridges across Spatial and Temporal Resolutions: Reference-Based Super-Resolution via Change Priors and Conditional Diffusion Model

AutoOS: Make Your OS More Powerful by Exploiting Large Language Models

3D Building Reconstruction from Monocular Remote Sensing Images with Multi-level Supervisions