Papers by Ze Huang
3 papers found
4D-VLA: Spatiotemporal Vision-Language-Action Pretraining with Cross-Scene Calibration
Jiahui Zhang, Yurui Chen, Yueming Xu et al.
NeurIPS 2025 (oral)
From Flatland to Space: Teaching Vision-Language Models to Perceive and Reason in 3D
Jiahui Zhang, Yurui Chen, Yueming Xu et al.
NeurIPS 2025 (poster)
WoVoGen: World Volume-aware Diffusion for Controllable Multi-camera Driving Scene Generation
Jiachen Lu, Ze Huang, Zeyu Yang et al.
ECCV 2024 (poster)