"large-scale dataset curation" Papers
3 papers found
CPSea: Large-scale cyclic peptide-protein complex dataset for machine learning in cyclic peptide design
Ziyi Yang, Hanyuan Xie, Yinjun Jia et al.
NeurIPS 2025, poster
OmniCorpus: A Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text
Qingyun Li, Zhe Chen, Weiyun Wang et al.
ICLR 2025, poster, arXiv:2406.08418
48 citations
Data Roaming and Quality Assessment for Composed Image Retrieval
Matan Levy, Rami Ben-Ari, Nir Darshan et al.
AAAI 2024, paper, arXiv:2303.09429