Le Xue

Papers: 5

Total Citations: 309

Papers (5)

ULIP-2: Towards Scalable Multimodal Pre-training for 3D Understanding

Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization

LLAVIDAL: A Large LAnguage VIsion Model for Daily Activities of Living

X-InstructBLIP: A Framework for Aligning Image, 3D, Audio, Video to LLMs and its Emergent Cross-modal Reasoning

SegAnyPET: Universal Promptable Segmentation from Positron Emission Tomography Images