Tai Wang

16
Papers
278
Total Citations

Papers (16)

LLaVA-3D: A Simple yet Effective Pathway to Empowering LMMs with 3D Capabilities

ICCV 2025
127
citations

Unified Human-Scene Interaction via Prompted Chain-of-Contacts

ICLR 2024
100
citations

GenNBV: Generalizable Next-Best-View Policy for Active 3D Reconstruction

CVPR 2024
34
citations

OST-Bench: Evaluating the Capabilities of MLLMs in Online Spatio-temporal Scene Understanding

NeurIPS 2025
6
citations

GENMANIP: LLM-driven Simulation for Generalizable Instruction-Following Manipulation

CVPR 2025
5
citations

Rethinking the Embodied Gap in Vision-and-Language Navigation: A Holistic Study of Physical and Visual Disparities

ICCV 2025
4
citations

VFlowOpt: A Token Pruning Framework for LMMs with Visual Information Flow-Guided Optimization

ICCV 2025
2
citations

GeoMIM: Towards Better 3D Knowledge Transfer via Masked Image Modeling for Multi-view 3D Understanding

ICCV 2023 (arXiv)
0
citations

RoboGround: Robotic Manipulation with Grounded Vision-Language Priors

CVPR 2025
0
citations

Monocular 3D Object Detection with Depth from Motion

ECCV 2022
0
citations

GLEAM: Learning Generalizable Exploration Policy for Active Mapping in Complex 3D Indoor Scene

ICCV 2025
0
citations

EmbodiedScan: A Holistic Multi-Modal 3D Perception Suite Towards Embodied AI

CVPR 2024
0
citations

Cylindrical and Asymmetrical 3D Convolution Networks for LiDAR Segmentation

CVPR 2021 (arXiv)
0
citations

MV-JAR: Masked Voxel Jigsaw and Reconstruction for LiDAR-Based Self-Supervised Pre-Training

CVPR 2023
0
citations

MonoDETR: Depth-guided Transformer for Monocular 3D Object Detection

ICCV 2023 (arXiv)
0
citations

Scene as Occupancy

ICCV 2023 (arXiv)
0
citations