2025 "scene understanding" Papers

67 papers found • Page 2 of 2

Promptable 3-D Object Localization with Latent Diffusion Models

Cheng-Yao Hong, Li-Heng Wang, Tyng-Luh Liu

NEURIPS 2025poster

PSI: A Benchmark for Human Interpretation and Response in Traffic Interactions

TAOTAO JING, Tina Chen, Renran Tian et al.

NEURIPS 2025poster

Relation3D : Enhancing Relation Modeling for Point Cloud Instance Segmentation

Edward LOO, Jiacheng Deng

CVPR 2025posterarXiv:2506.17891
4
citations

RoboTron-Drive: All-in-One Large Multimodal Model for Autonomous Driving

Zhijian Huang, Chengjian Feng, Baihui Xiao et al.

ICCV 2025posterarXiv:2412.07689
11
citations

Robust 3D Object Detection using Probabilistic Point Clouds from Single-Photon LiDARs

Bhavya Goyal, Felipe Gutierrez-Barragan, Wei Lin et al.

ICCV 2025posterarXiv:2508.00169
2
citations

SceneSplat: Gaussian Splatting-based Scene Understanding with Vision-Language Pretraining

Yue Li, Qi Ma, Runyi Yang et al.

ICCV 2025posterarXiv:2503.18052
21
citations

SegMAN: Omni-scale Context Modeling with State Space Models and Local Attention for Semantic Segmentation

Yunxiang Fu, Meng Lou, Yizhou Yu

CVPR 2025posterarXiv:2412.11890
22
citations

Spiking Vision Transformer with Saccadic Attention

Shuai Wang, Malu Zhang, Dehao Zhang et al.

ICLR 2025oralarXiv:2502.12677
15
citations

Stepping Out of Similar Semantic Space for Open-Vocabulary Segmentation

Yong Liu, Song-Li Wu, Sule Bai et al.

ICCV 2025posterarXiv:2506.16058
2
citations

Supercharging Floorplan Localization with Semantic Rays

Yuval Grader, Hadar Averbuch-Elor

ICCV 2025posterarXiv:2507.09291
2
citations

TACO: Taming Diffusion for in-the-wild Video Amodal Completion

Ruijie Lu, Yixin Chen, Yu Liu et al.

ICCV 2025posterarXiv:2503.12049
10
citations

Task-Oriented Human Grasp Synthesis via Context- and Task-Aware Diffusers

An Lun Liu, Yu-Wei Chao, Yi-Ting Chen

ICCV 2025posterarXiv:2507.11287

The 3D-PC: a benchmark for visual perspective taking in humans and machines

Drew Linsley, Peisen Zhou, Alekh Ashok et al.

ICLR 2025posterarXiv:2406.04138
10
citations

TopoPoint: Enhance Topology Reasoning via Endpoint Detection in Autonomous Driving

Yanping Fu, Xinyuan Liu, Tianyu Li et al.

NEURIPS 2025posterarXiv:2505.17771
4
citations

Towards Efficient Foundation Model for Zero-shot Amodal Segmentation

Zhaochen Liu, Limeng Qiao, Xiangxiang Chu et al.

CVPR 2025poster
3
citations

Unveiling the Invisible: Reasoning Complex Occlusions Amodally with AURA

Zhixuan Li, Hyunse Yoon, Sanghoon Lee et al.

ICCV 2025posterarXiv:2503.10225
3
citations

Visual Jenga: Discovering Object Dependencies via Counterfactual Inpainting

Anand Bhattad, Konpat Preechakul, Alexei Efros

NEURIPS 2025posterarXiv:2503.21770
8
citations