Autonomous Driving
Self-driving vehicles and related perception
Top Papers
Is Ego Status All You Need for Open-Loop End-to-End Autonomous Driving?
Zhiqi Li, Zhiding Yu, Shiyi Lan et al.
Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction
Yiheng Xu, Zekun Wang, Junli Wang et al.
Generative End-to-End Autonomous Driving
Wenzhao Zheng, Ruiqi Song, Xianda Guo et al.
Street Gaussians: Modeling Dynamic Urban Scenes with Gaussian Splatting
Yunzhi Yan, Haotong Lin, Chenxu Zhou et al.
VLP: Vision Language Planning for Autonomous Driving
Chenbin Pan, Burhan Yaman, Tommaso Nesti et al.
NeuRAD: Neural Rendering for Autonomous Driving
Adam Tonderski, Carl Lindström, Georg Hess et al.
Dolphins: Multimodal Language Model for Driving
Yingzi Ma, Yulong Cao, Jiachen Sun et al.
Generalized Predictive Model for Autonomous Driving
Jiazhi Yang, Shenyuan Gao, Yihang Qiu et al.
TUMTraf V2X Cooperative Perception Dataset
Walter Zimmer, Gerhard Arya Wardana, Suren Sritharan et al.
WebDancer: Towards Autonomous Information Seeking Agency
Jialong Wu, Baixuan Li, Runnan Fang et al.
Diffusion-Based Planning for Autonomous Driving with Flexible Guidance
Yinan Zheng, Ruiming Liang, Kexin ZHENG et al.
ORION: A Holistic End-to-End Autonomous Driving Framework by Vision-Language Instructed Action Generation
Haoyu Fu, Diankun Zhang, Zongchuang Zhao et al.
LaneSegNet: Map Learning with Lane Segment Perception for Autonomous Driving
Tianyu Li, Peijin Jia, Bangjun Wang et al.
DriveArena: A Closed-loop Generative Simulation Platform for Autonomous Driving
Xuemeng Yang, Licheng Wen, Tiantian Wei et al.
Towards Realistic UAV Vision-Language Navigation: Platform, Benchmark, and Methodology
Xiangyu Wang, Donglin Yang, ziqin wang et al.
AvatarGPT: All-in-One Framework for Motion Understanding Planning Generation and Beyond
Zixiang Zhou, Yu Wan, Baoyuan Wang
Digital Life Project: Autonomous 3D Characters with Social Intelligence
Zhongang Cai, Jianping Jiang, Zhongfei Qing et al.
SimLingo: Vision-Only Closed-Loop Autonomous Driving with Language-Action Alignment
Katrin Renz, Long Chen, Elahe Arani et al.
SEPT: Towards Efficient Scene Representation Learning for Motion Prediction
Zhiqian Lan, Yuxuan Jiang, Yao Mu et al.
Collaborative Semantic Occupancy Prediction with Hybrid Feature Fusion in Connected Automated Vehicles
Rui Song, Chenwei Liang, Hu Cao et al.
DrivingGPT: Unifying Driving World Modeling and Planning with Multi-modal Autoregressive Transformers
Yuntao Chen, Yuqi Wang, Zhaoxiang Zhang
MagicDrive-V2: High-Resolution Long Video Generation for Autonomous Driving with Adaptive Control
Ruiyuan Gao, Kai Chen, Bo Xiao et al.
End-to-End Autonomous Driving Through V2X Cooperation
Haibao Yu, Wenxian Yang, Jiaru Zhong et al.
RAD: Training an End-to-End Driving Policy via Large-Scale 3DGS-based Reinforcement Learning
Hao Gao, Shaoyu Chen, Bo Jiang et al.
Think2Drive: Efficient Reinforcement Learning by Thinking with Latent World Model for Autonomous Driving (in CARLA-v2)
Qifeng Li, Xiaosong Jia, Shaobo Wang et al.
NeuroNCAP: Photorealistic Closed-loop Safety Testing for Autonomous Driving
William Ljungbergh, Adam Tonderski, Joakim Johnander et al.
Don't Shake the Wheel: Momentum-Aware Planning in End-to-End Autonomous Driving
Ziying Song, Caiyan Jia, Lin Liu et al.
SplatAD: Real-Time Lidar and Camera Rendering with 3D Gaussian Splatting for Autonomous Driving
Georg Hess, Carl Lindström, Maryam Fatemi et al.
Adaptive Fusion of Single-View and Multi-View Depth for Autonomous Driving
JunDa Cheng, Wei Yin, Kaixuan Wang et al.
SlowTrack: Increasing the Latency of Camera-Based Perception in Autonomous Driving Using Adversarial Examples
Chen Ma, Ningfei Wang, Qi Alfred Chen et al.
UnO: Unsupervised Occupancy Fields for Perception and Forecasting
Ben Agro, Quinlan Sykora, Sergio Casas et al.
DrivingForward: Feed-forward 3D Gaussian Splatting for Driving Scene Reconstruction from Flexible Surround-view Input
Qijian Tian, Xin Tan, Yuan Xie et al.
Robust Autonomy Emerges from Self-Play
Marco Cusumano-Towner, David Hafner, Alexander Hertzberg et al.
SeFlow: A Self-Supervised Scene Flow Method in Autonomous Driving
Qingwen Zhang, Yi Yang, Peizheng Li et al.
FreeVS: Generative View Synthesis on Free Driving Trajectory
Qitai Wang, Lue Fan, Yuqi Wang et al.
Dispider: Enabling Video LLMs with Active Real-Time Interaction via Disentangled Perception, Decision, and Reaction
Rui Qian, Shuangrui Ding, Xiaoyi Dong et al.
Visual Agentic AI for Spatial Reasoning with a Dynamic API
Damiano Marsili, Rohun Agrawal, Yisong Yue et al.
DeSiRe-GS: 4D Street Gaussians for Static-Dynamic Decomposition and Surface Reconstruction for Urban Driving Scenes
Chensheng Peng, Chengwei Zhang, Yixiao Wang et al.
STAMP: Scalable Task- And Model-agnostic Collaborative Perception
Xiangbo Gao, Runsheng Xu, Jiachen Li et al.
PPAD: Iterative Interactions of Prediction and Planning for End-to-end Autonomous Driving
Zhili Chen, Maosheng Ye, Shuangjie Xu et al.
Distilling Multi-modal Large Language Models for Autonomous Driving
Deepti Hegde, Rajeev Yasarla, Hong Cai et al.
Safe-Sim: Safety-Critical Closed-Loop Traffic Simulation with Diffusion-Controllable Adversaries
WEI-JER Chang, Francesco Pittaluga, Masayoshi TOMIZUKA et al.
CityWalker: Learning Embodied Urban Navigation from Web-Scale Videos
Xinhao Liu, Jintong Li, Yicheng Jiang et al.
CityNav: A Large-Scale Dataset for Real-World Aerial Navigation
Jungdae Lee, Taiki Miyanishi, Shuhei Kurita et al.
Decoupled Spatio-Temporal Consistency Learning for Self-Supervised Tracking
Yaozong Zheng, Bineng Zhong, Qihua Liang et al.
LISO: Lidar-only Self-Supervised 3D Object Detection
Stefan Baur, Frank Moosmann, Andreas Geiger
MUSES: The Multi-Sensor Semantic Perception Dataset for Driving under Uncertainty
Tim Broedermann, David Brüggemann, Christos Sakaridis et al.
Epona: Autoregressive Diffusion World Model for Autonomous Driving
Kaiwen Zhang, Zhenyu Tang, Xiaotao Hu et al.
PACER+: On-Demand Pedestrian Animation Controller in Driving Scenarios
Jingbo Wang, Zhengyi Luo, Ye Yuan et al.
Reliability in Semantic Segmentation: Can We Use Synthetic Data?
Thibaut Loiseau, Tuan Hung Vu, Mickael Chen et al.
Bridging Past and Future: End-to-End Autonomous Driving with Historical Prediction and Planning
Bozhou Zhang, Nan Song, Xin Jin et al.
AnyTouch: Learning Unified Static-Dynamic Representation across Multiple Visuo-tactile Sensors
Ruoxuan Feng, Jiangyu Hu, Wenke Xia et al.
PreSight: Enhancing Autonomous Vehicle Perception with City-Scale NeRF Priors
Tianyuan Yuan, Mao Yucheng, Jiawei Yang et al.
V2XPnP: Vehicle-to-Everything Spatio-Temporal Fusion for Multi-Agent Perception and Prediction
Zewei Zhou, Hao Xiang, Zhaoliang Zheng et al.
Navigation Instruction Generation with BEV Perception and Large Language Models
Sheng Fan, Rui Liu, Wenguan Wang et al.
HENet: Hybrid Encoding for End-to-end Multi-task 3D Perception from Multi-view Cameras
Zhongyu Xia, ZhiWei Lin, Xinhao Wang et al.
InsMapper: Exploring Inner-instance Information for Vectorized HD Mapping
Zhenhua Xu, Kwan-Yee K. Wong, Hengshuang ZHAO
UniM2AE: Multi-modal Masked Autoencoders with Unified 3D Representation for 3D Perception in Autonomous Driving
Jian Zou, Tianyu Huang, Guanglei Yang et al.
3DRealCar: An In-the-wild RGB-D Car Dataset with 360-degree Views
Xiaobiao Du, Yida Wang, Haiyang Sun et al.
Adaptive VIO: Deep Visual-Inertial Odometry with Online Continual Learning
Youqi Pan, Wugen Zhou, Yingdian Cao et al.
Multiagent Multitraversal Multimodal Self-Driving: Open MARS Dataset
Yiming Li, Zhiheng Li, Nuo Chen et al.
Weakly Supervised Semantic Segmentation for Driving Scenes
Dongseob Kim, Seungho Lee, Junsuk Choe et al.
Can LVLMs Obtain a Driver’s License? A Benchmark Towards Reliable AGI for Autonomous Driving
Yuhang Lu, Yichen Yao, Jiadong Tu et al.
DriveGPT4-V2: Harnessing Large Language Model Capabilities for Enhanced Closed-Loop Autonomous Driving
Zhenhua Xu, Yan Bai, Yujia Zhang et al.
Evidential Active Recognition: Intelligent and Prudent Open-World Embodied Perception
Lei Fan, Mingfu Liang, Yunxuan Li et al.
Day-Night Cross-domain Vehicle Re-identification
Hongchao Li, Jingong Chen, AIHUA ZHENG et al.
SOLVE: Synergy of Language-Vision and End-to-End Networks for Autonomous Driving
Xuesong Chen, Linjiang Huang, Tao Ma et al.
JarvisIR: Elevating Autonomous Driving Perception with Intelligent Image Restoration
yunlong lin, Zixu Lin, Haoyu Chen et al.
SAFE: Multitask Failure Detection for Vision-Language-Action Models
Qiao Gu, Yuanliang Ju, Shengxiang Sun et al.
Real-Time Simulated Avatar from Head-Mounted Sensors
Zhengyi Luo, Jinkun Cao, Rawal Khirodkar et al.
Simulating Human-like Daily Activities with Desire-driven Autonomy
Yiding Wang, Yuxuan Chen, Fangwei Zhong et al.
On the Road to Portability: Compressing End-to-End Motion Planner for Autonomous Driving
Kaituo Feng, Changsheng Li, Dongchun Ren et al.
3426 Regulating Intermediate 3D Features for Vision-Centric Autonomous Driving
Junkai Xu, Liang Peng, Haoran Cheng et al.
NEST: A Neuromodulated Small-world Hypergraph Trajectory Prediction Model for Autonomous Driving
Chengyue Wang, Haicheng Liao, Bonan Wang et al.
CALICO: Self-Supervised Camera-LiDAR Contrastive Pre-training for BEV Perception
Jiachen Sun, Haizhong Zheng, Qingzhao Zhang et al.
Perception-as-Control: Fine-grained Controllable Image Animation with 3D-aware Motion Representation
Yingjie Chen, Yifang Men, Yuan Yao et al.
DV-3DLane: End-to-end Multi-modal 3D Lane Detection with Dual-view Representation
Yueru Luo, Shuguang Cui, Zhen Li
UniOcc: A Unified Benchmark for Occupancy Forecasting and Prediction in Autonomous Driving
Yuping Wang, Xiangyu Huang, Xiaokang Sun et al.
NuPlanQA: A Large-Scale Dataset and Benchmark for Multi-View Driving Scene Understanding in Multi-Modal Large Language Models
Sung-Yeon Park, Can Cui, Yunsheng Ma et al.
Fine-Grained Evaluation of Large Vision-Language Models in Autonomous Driving
Yue Li, Meng Tian, Zhenyu Lin et al.
Enhancing Multi-Robot Semantic Navigation Through Multimodal Chain-of-Thought Score Collaboration
Zhixuan Shen, Haonan Luo, Kexun Chen et al.
OLiDM: Object-aware LiDAR Diffusion Models for Autonomous Driving
Tianyi Yan, Junbo Yin, Xianpeng Lang et al.
Generative Planning with 3D-Vision Language Pre-training for End-to-End Autonomous Driving
Tengpeng Li, Hanli Wang, Xianfei Li et al.
RoboTron-Drive: All-in-One Large Multimodal Model for Autonomous Driving
Zhijian Huang, Chengjian Feng, Baihui Xiao et al.
UniMLVG: Unified Framework for Multi-view Long Video Generation with Comprehensive Control Capabilities for Autonomous Driving
Rui Chen, Zehuan Wu, Yichen Liu et al.
SAMFusion: Sensor-Adaptive Multimodal Fusion for 3D Object Detection in Adverse Weather
Edoardo Palladin, Roland Dietze, Praveen Narayanan et al.
Unifying Appearance Codes and Bilateral Grids for Driving Scene Gaussian Splatting
Nan Wang, Lixing Xiao, Yuantao Chen et al.
HiP-AD: Hierarchical and Multi-Granularity Planning with Deformable Attention for Autonomous Driving in a Single Decoder
Yingqi Tang, Zhuoran Xu, Zhaotie Meng et al.
S4-Driver: Scalable Self-Supervised Driving Multimodal Large Language Model with Spatio-Temporal Visual Representation
Yichen Xie, Runsheng Xu, Tong He et al.
CarFormer: Self-Driving with Learned Object-Centric Representations
Shadi Hamdan, Fatma Guney
VisionPAD: A Vision-Centric Pre-training Paradigm for Autonomous Driving
Haiming Zhang, Wending Zhou, Shenzhen The Chinese University of Hongkong et al.
SIRA: Scalable Inter-frame Relation and Association for Radar Perception
Ryoma Yataka, Pu Wang, Petros Boufounos et al.
RepVF: A Unified Vector Fields Representation for Multi-task 3D Perception
Shen Jianbing, Chunliang Li, Wencheng Han et al.
DriveEditor: A Unified 3D Information-Guided Framework for Controllable Object Editing in Driving Scenes
Yiyuan Liang, Zhiying Yan, Liqun Chen et al.
UAVScenes: A Multi-Modal Dataset for UAVs
Sijie Wang, Siqi Li, Yawei Zhang et al.
RoScenes: A Large-scale Multi-view 3D Dataset for Roadside Perception
Xiaosu Zhu, Hualian Sheng, Sijia Cai et al.
SOLAMI: Social Vision-Language-Action Modeling for Immersive Interaction with 3D Autonomous Characters
Jianping Jiang, Weiye Xiao, Zhengyu Lin et al.
Vanishing-Point-Guided Video Semantic Segmentation of Driving Scenes
Diandian Guo, Deng-Ping Fan, Tongyu Lu et al.
Event-Aided Time-To-Collision Estimation for Autonomous Driving
Jinghang Li, Bangyan Liao, Xiuyuan LU et al.
FSD-BEV: Foreground Self-Distillation for Multi-view 3D Object Detection
Zheng Jiang, Jinqing Zhang, Yanan Zhang et al.