2025 "multimodal dataset" Papers

12 papers found

CarbonSense: A Multimodal Dataset and Baseline for Carbon Flux Modelling

Matthew Fortier, Mats L. Richter, Oliver Sonnentag et al.

ICLR 2025, poster, arXiv:2406.04940, 2 citations

CrypticBio: A Large Multimodal Dataset for Visually Confusing Species

Georgiana Manolache, Gerard Schouten, Joaquin Vanschoren

NeurIPS 2025, oral

Extremely Simple Multimodal Outlier Synthesis for Out-of-Distribution Detection and Segmentation

Moru Liu, Hao Dong, Jessica Kelly et al.

NeurIPS 2025, poster, arXiv:2505.16985, 3 citations

MM-CamObj: A Comprehensive Multimodal Dataset for Camouflaged Object Scenarios

Jiacheng Ruan, Wenzhen Yuan, Zehao Lin et al.

AAAI 2025, paper, arXiv:2409.16084, 11 citations

MONITRS: Multimodal Observations of Natural Incidents Through Remote Sensing

Shreelekha Revankar, Utkarsh Mall, Cheng Perng Phoo et al.

NeurIPS 2025, oral, arXiv:2507.16228

Perceiving and Acting in First-Person: A Dataset and Benchmark for Egocentric Human-Object-Human Interactions

Liang Xu, Chengqun Yang, Zili Lin et al.

ICCV 2025, poster, arXiv:2508.04681, 1 citation

RoboSense: Large-scale Dataset and Benchmark for Egocentric Robot Perception and Navigation in Crowded and Unstructured Environments

Haisheng Su, Feixiang Song, Cong Ma et al.

CVPR 2025, poster, arXiv:2408.15503, 5 citations

ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models

Hongbo Liu, Jingwen He, Yi Jin et al.

NeurIPS 2025, poster, arXiv:2506.21356, 7 citations

STING-BEE: Towards Vision-Language Model for Real-World X-ray Baggage Security Inspection

Divya Velayudhan, Abdelfatah Ahmed, Mohamad Alansari et al.

CVPR 2025, highlight, arXiv:2504.02823, 2 citations

TAU-106K: A New Dataset for Comprehensive Understanding of Traffic Accident

Yixuan Zhou, Long Bai, Sijia Cai et al.

ICLR 2025, oral, 3 citations

Towards Omnimodal Expressions and Reasoning in Referring Audio-Visual Segmentation

Kaining Ying, Henghui Ding, Guangquan Jie et al.

ICCV 2025, poster, arXiv:2507.22886, 5 citations

Whose View of Safety? A Deep DIVE Dataset for Pluralistic Alignment of Text-to-Image Models

Charvi Rastogi, Tian Huey Teh, Pushkar Mishra et al.

NeurIPS 2025, spotlight, arXiv:2507.13383, 3 citations