2025 "multimodal dataset" Papers
12 papers found
CarbonSense: A Multimodal Dataset and Baseline for Carbon Flux Modelling
Matthew Fortier, Mats L. Richter, Oliver Sonnentag et al.
ICLR 2025, poster, arXiv:2406.04940
2 citations
CrypticBio: A Large Multimodal Dataset for Visually Confusing Species
Georgiana Manolache, Gerard Schouten, Joaquin Vanschoren
NeurIPS 2025, oral
Extremely Simple Multimodal Outlier Synthesis for Out-of-Distribution Detection and Segmentation
Moru Liu, Hao Dong, Jessica Kelly et al.
NeurIPS 2025, poster, arXiv:2505.16985
3 citations
MM-CamObj: A Comprehensive Multimodal Dataset for Camouflaged Object Scenarios
Jiacheng Ruan, Wenzhen Yuan, Zehao Lin et al.
AAAI 2025, paper, arXiv:2409.16084
11 citations
MONITRS: Multimodal Observations of Natural Incidents Through Remote Sensing
Shreelekha Revankar, Utkarsh Mall, Cheng Perng Phoo et al.
NeurIPS 2025, oral, arXiv:2507.16228
Perceiving and Acting in First-Person: A Dataset and Benchmark for Egocentric Human-Object-Human Interactions
Liang Xu, Chengqun Yang, Zili Lin et al.
ICCV 2025, poster, arXiv:2508.04681
1 citation
RoboSense: Large-scale Dataset and Benchmark for Egocentric Robot Perception and Navigation in Crowded and Unstructured Environments
Haisheng Su, Feixiang Song, Cong Ma et al.
CVPR 2025, poster, arXiv:2408.15503
5 citations
ShotBench: Expert-Level Cinematic Understanding in Vision-Language Models
Hongbo Liu, Jingwen He, Yi Jin et al.
NeurIPS 2025, poster, arXiv:2506.21356
7 citations
STING-BEE: Towards Vision-Language Model for Real-World X-ray Baggage Security Inspection
Divya Velayudhan, Abdelfatah Ahmed, Mohamad Alansari et al.
CVPR 2025, highlight, arXiv:2504.02823
2 citations
TAU-106K: A New Dataset for Comprehensive Understanding of Traffic Accident
Yixuan Zhou, Long Bai, Sijia Cai et al.
ICLR 2025, oral
3 citations
Towards Omnimodal Expressions and Reasoning in Referring Audio-Visual Segmentation
Kaining Ying, Henghui Ding, Guangquan Jie et al.
ICCV 2025, poster, arXiv:2507.22886
5 citations
Whose View of Safety? A Deep DIVE Dataset for Pluralistic Alignment of Text-to-Image Models
Charvi Rastogi, Tian Huey Teh, Pushkar Mishra et al.
NeurIPS 2025, spotlight, arXiv:2507.13383
3 citations