Poster "scene understanding" Papers

46 papers found

2D Gaussian Splatting-based Sparse-view Transparent Object Depth Reconstruction via Physics Simulation for Scene Update

Jeongyun Kim, Seunghoon Jeong, Giseop Kim et al.

ICCV 2025posterarXiv:2507.11069
1
citations

3D-LLaVA: Towards Generalist 3D LMMs with Omni Superpoint Transformer

Jiajun Deng, Tianyu He, Li Jiang et al.

CVPR 2025posterarXiv:2501.01163
39
citations

ADAPT: Attentive Self-Distillation and Dual-Decoder Prediction Fusion for Continual Panoptic Segmentation

Ze Yang, Shichao Dong, Ruibo Li et al.

ICLR 2025poster

A Dataset for Semantic Segmentation in the Presence of Unknowns

Zakaria Laskar, Tomas Vojir, Matej Grcic et al.

CVPR 2025posterarXiv:2503.22309

Auto-Vocabulary Semantic Segmentation

Osman Ülger, Maksymilian Kulicki, Yuki Asano et al.

ICCV 2025posterarXiv:2312.04539
4
citations

Causal-Entity Reflected Egocentric Traffic Accident Video Synthesis

Lei-lei Li, Jianwu Fang, Junbin Xiao et al.

ICCV 2025posterarXiv:2506.23263

DenseGrounding: Improving Dense Language-Vision Semantics for Ego-centric 3D Visual Grounding

Henry Zheng, Hao Shi, Qihang Peng et al.

ICLR 2025posterarXiv:2505.04965
8
citations

Distilling Multi-modal Large Language Models for Autonomous Driving

Deepti Hegde, Rajeev Yasarla, Hong Cai et al.

CVPR 2025posterarXiv:2501.09757
27
citations

Exploring the Effectiveness of Object-Centric Representations in Visual Question Answering: Comparative Insights with Foundation Models

Amir Mohammad Karimi Mamaghan, Samuele Papa, Karl H. Johansson et al.

ICLR 2025posterarXiv:2407.15589
12
citations

From Objects to Anywhere: A Holistic Benchmark for Multi-level Visual Grounding in 3D Scenes

Tianxu Wang, Zhuofan Zhang, Ziyu Zhu et al.

NeurIPS 2025posterarXiv:2506.04897

HouseLayout3D: A Benchmark and Training-free Baseline for 3D Layout Estimation in the Wild

Valentin Bieri, Marie-Julie Rakotosaona, Keisuke Tateno et al.

NeurIPS 2025posterarXiv:2512.02450

HUMOTO: A 4D Dataset of Mocap Human Object Interactions

Jiaxin Lu, Chun-Hao Huang, Uttaran Bhattacharya et al.

ICCV 2025posterarXiv:2504.10414
6
citations

Knowledge Transfer from Interaction Learning

Yilin Gao, Kangyi Chen, Zhongxing Peng et al.

ICCV 2025posterarXiv:2509.18733

Learning 3D Scene Analogies with Neural Contextual Scene Maps

Junho Kim, Gwangtak Bae, Eun Sun Lee et al.

ICCV 2025posterarXiv:2503.15897
1
citations

MirrorVerse: Pushing Diffusion Models to Realistically Reflect the World

Ankit Dhiman, Manan Shah, R. Venkatesh Babu

CVPR 2025posterarXiv:2504.15397
1
citations

MLLM-For3D: Adapting Multimodal Large Language Model for 3D Reasoning Segmentation

Jiaxin Huang, Runnan Chen, Ziwen Li et al.

NeurIPS 2025posterarXiv:2503.18135
8
citations

MLLMs Need 3D-Aware Representation Supervision for Scene Understanding

Xiaohu Huang, Jingjing Wu, Qunyi Xie et al.

NeurIPS 2025posterarXiv:2506.01946
17
citations

mmWalk: Towards Multi-modal Multi-view Walking Assistance

Kedi Ying, Ruiping Liu, Chongyan Chen et al.

NeurIPS 2025posterarXiv:2510.11520

PolarFree: Polarization-based Reflection-Free Imaging

Mingde Yao, Menglu Wang, King Man Tam et al.

CVPR 2025posterarXiv:2503.18055
4
citations

Prioritizing Perception-Guided Self-Supervision: A New Paradigm for Causal Modeling in End-to-End Autonomous Driving

Yi Huang, Zhan Qu, Lihui Jiang et al.

NeurIPS 2025posterarXiv:2511.08214
1
citations

Promptable 3-D Object Localization with Latent Diffusion Models

Cheng-Yao Hong, Li-Heng Wang, Tyng-Luh Liu

NeurIPS 2025poster

SceneSplat: Gaussian Splatting-based Scene Understanding with Vision-Language Pretraining

Yue Li, Qi Ma, Runyi Yang et al.

ICCV 2025posterarXiv:2503.18052
20
citations

SegMAN: Omni-scale Context Modeling with State Space Models and Local Attention for Semantic Segmentation

Yunxiang Fu, Meng Lou, Yizhou Yu

CVPR 2025posterarXiv:2412.11890
22
citations

Stepping Out of Similar Semantic Space for Open-Vocabulary Segmentation

Yong Liu, Song-Li Wu, Sule Bai et al.

ICCV 2025posterarXiv:2506.16058
2
citations

Supercharging Floorplan Localization with Semantic Rays

Yuval Grader, Hadar Averbuch-Elor

ICCV 2025posterarXiv:2507.09291
2
citations

Task-Oriented Human Grasp Synthesis via Context- and Task-Aware Diffusers

An Lun Liu, Yu-Wei Chao, Yi-Ting Chen

ICCV 2025posterarXiv:2507.11287

TopoPoint: Enhance Topology Reasoning via Endpoint Detection in Autonomous Driving

Yanping Fu, Xinyuan Liu, Tianyu Li et al.

NeurIPS 2025posterarXiv:2505.17771
4
citations

Visual Jenga: Discovering Object Dependencies via Counterfactual Inpainting

Anand Bhattad, Konpat Preechakul, Alexei Efros

NeurIPS 2025posterarXiv:2503.21770
8
citations

3D Small Object Detection with Dynamic Spatial Pruning

Xiuwei Xu, Zhihao Sun, Ziwei Wang et al.

ECCV 2024posterarXiv:2305.03716
9
citations

CarFormer: Self-Driving with Learned Object-Centric Representations

Shadi Hamdan, Fatma Guney

ECCV 2024posterarXiv:2407.15843
11
citations

CountFormer: Multi-View Crowd Counting Transformer

Hong Mo, Xiong Zhang, Jianchao Tan et al.

ECCV 2024posterarXiv:2407.02047
9
citations

EgoPet: Egomotion and Interaction Data from an Animal's Perspective

Amir Bar, Arya Bakhtiar, Danny L Tran et al.

ECCV 2024posterarXiv:2404.09991
10
citations

FastCAD: Real-Time CAD Retrieval and Alignment from Scans and Videos

Florian Langer, Jihong Ju, Georgi Dikov et al.

ECCV 2024posterarXiv:2403.15161
5
citations

General Geometry-aware Weakly Supervised 3D Object Detection

Guowen Zhang, Junsong Fan, Liyi Chen et al.

ECCV 2024posterarXiv:2407.13748
5
citations

GS-LRM: Large Reconstruction Model for 3D Gaussian Splatting

Kai Zhang, Sai Bi, Hao Tan et al.

ECCV 2024posterarXiv:2404.19702
246
citations

MC-PanDA: Mask Confidence for Panoptic Domain Adaptation

Ivan Martinovic, Josip Šarić, Siniša Šegvić

ECCV 2024posterarXiv:2407.14110
2
citations

MTMamba: Enhancing Multi-Task Dense Scene Understanding by Mamba-Based Decoders

Baijiong Lin, Weisen Jiang, Pengguang Chen et al.

ECCV 2024posterarXiv:2407.02228
26
citations

NeRF-MAE: Masked AutoEncoders for Self-Supervised 3D Representation Learning for Neural Radiance Fields

Muhammad Zubair Irshad, Sergey Zakharov, Vitor Guizilini et al.

ECCV 2024posterarXiv:2404.01300
22
citations

Omni-Recon: Harnessing Image-based Rendering for General-Purpose Neural Radiance Fields

Yonggan Fu, Huaizhi Qu, Zhifan Ye et al.

ECCV 2024posterarXiv:2403.11131

Open Panoramic Segmentation

Junwei Zheng, Ruiping Liu, Yufan Chen et al.

ECCV 2024posterarXiv:2407.02685
15
citations

OpenPSG: Open-set Panoptic Scene Graph Generation via Large Multimodal Models

Zijian Zhou, Zheng Zhu, Holger Caesar et al.

ECCV 2024posterarXiv:2407.11213
13
citations

Random Walk on Pixel Manifolds for Anomaly Segmentation of Complex Driving Scenes

Zelong Zeng, Kaname Tomite

ECCV 2024posterarXiv:2404.17961
1
citations

SAM-E: Leveraging Visual Foundation Model with Sequence Imitation for Embodied Manipulation

Junjie Zhang, Chenjia Bai, Haoran He et al.

ICML 2024poster

SAM-guided Graph Cut for 3D Instance Segmentation

Haoyu Guo, He Zhu, Sida Peng et al.

ECCV 2024posterarXiv:2312.08372
32
citations

Scene-Graph ViT: End-to-End Open-Vocabulary Visual Relationship Detection

Tim Salzmann, Markus Ryll, Alex Bewley et al.

ECCV 2024posterarXiv:2403.14270
8
citations

Training-Free Model Merging for Multi-target Domain Adaptation

Wenyi Li, Huan-ang Gao, Mingju Gao et al.

ECCV 2024posterarXiv:2407.13771
11
citations