2025 "zero-shot learning" Papers

53 papers found • Page 1 of 2

Advancing Generalizable Tumor Segmentation with Anomaly-Aware Open-Vocabulary Attention Maps and Frozen Foundation Diffusion Models

Yankai Jiang, Peng Zhang, Donglin Yang et al.

CVPR 2025 • poster • arXiv:2505.02753

AIMS.au: A Dataset for the Analysis of Modern Slavery Countermeasures in Corporate Statements

Adriana-Eufrosina Bora, Pierre-Luc St-Charles, Mirko Bronzi et al.

ICLR 2025 • poster • arXiv:2502.07022
2 citations

AmorLIP: Efficient Language-Image Pretraining via Amortization

Haotian Sun, Yitong Li, Yuchen Zhuang et al.

NeurIPS 2025 • poster • arXiv:2505.18983
2 citations

A Unified Reasoning Framework for Holistic Zero-Shot Video Anomaly Analysis

Dongheng Lin, Mengxue Qu, Kunyang Han et al.

NeurIPS 2025 • oral • arXiv:2511.00962

Beyond Words: Augmenting Discriminative Richness via Diffusions in Unsupervised Prompt Learning

Hairui Ren, Fan Tang, He Zhao et al.

CVPR 2025 • poster • arXiv:2504.11930

Boosting Skeleton-based Zero-Shot Action Recognition with Training-Free Test-Time Adaptation

Jingmin Zhu, Anqi Zhu, Hossein Rahmani et al.

NeurIPS 2025 • poster • arXiv:2512.11458

Can LLMs Understand Time Series Anomalies?

Zihao Zhou, Rose Yu

ICLR 2025 • poster • arXiv:2410.05440
32 citations

CLIBD: Bridging Vision and Genomics for Biodiversity Monitoring at Scale

ZeMing Gong, Austin Wang, Xiaoliang Huo et al.

ICLR 2025 • poster • arXiv:2405.17537
18 citations

CPath-Omni: A Unified Multimodal Foundation Model for Patch and Whole Slide Image Analysis in Computational Pathology

Yuxuan Sun, Yixuan Si, Chenglu Zhu et al.

CVPR 2025 • poster • arXiv:2412.12077
22 citations

CrypticBio: A Large Multimodal Dataset for Visually Confusing Species

Georgiana Manolache, Gerard Schouten, Joaquin Vanschoren

NeurIPS 2025 • oral

Dense Video Object Captioning from Disjoint Supervision

Xingyi Zhou, Anurag Arnab, Chen Sun et al.

ICLR 2025 • oral • arXiv:2306.11729
7 citations

Diorama: Unleashing Zero-shot Single-view 3D Indoor Scene Modeling

Qirui Wu, Denys Iliash, Daniel Ritchie et al.

ICCV 2025 • highlight • arXiv:2411.19492
2 citations

DocVLM: Make Your VLM an Efficient Reader

Mor Shpigel Nacson, Aviad Aberdam, Roy Ganz et al.

CVPR 2025 • poster • arXiv:2412.08746
10 citations

FlickerFusion: Intra-trajectory Domain Generalizing Multi-agent Reinforcement Learning

Woosung Koh, Wonbeen Oh, Siyeol Kim et al.

ICLR 2025 • poster

FlySearch: Exploring how vision-language models explore

Adam Pardyl, Dominik Matuszek, Mateusz Przebieracz et al.

NeurIPS 2025 • poster • arXiv:2506.02896
3 citations

GVDepth: Zero-Shot Monocular Depth Estimation for Ground Vehicles based on Probabilistic Cue Fusion

Karlo Koledic, Luka Petrovic, Ivan Marković et al.

ICCV 2025 • poster • arXiv:2412.06080
1 citation

HALL-E: Hierarchical Neural Codec Language Model for Minute-Long Zero-Shot Text-to-Speech Synthesis

Yuto Nishimura, Takumi Hirose, Masanari Ohi et al.

ICLR 2025 • poster • arXiv:2410.04380
5 citations

Imagine and Seek: Improving Composed Image Retrieval with an Imagined Proxy

You Li, Fan Ma, Yi Yang

CVPR 2025 • poster • arXiv:2411.16752
9 citations

InstructHOI: Context-Aware Instruction for Multi-Modal Reasoning in Human-Object Interaction Detection

Jinguo Luo, Weihong Ren, Quanlong Zheng et al.

NeurIPS 2025 • spotlight

Knowledge Transfer from Interaction Learning

Yilin Gao, Kangyi Chen, Zhongxing Peng et al.

ICCV 2025 • poster • arXiv:2509.18733

Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator

Chaehun Shin, Jooyoung Choi, Heeseung Kim et al.

CVPR 2025 • poster • arXiv:2411.15466
36 citations

LATINO-PRO: LAtent consisTency INverse sOlver with PRompt Optimization

Alessio Spagnoletti, Jean Prost, Andres Almansa et al.

ICCV 2025 • poster • arXiv:2503.12615
9 citations

LD-RPS: Zero-Shot Unified Image Restoration via Latent Diffusion Recurrent Posterior Sampling

Li Huaqiu, Yong Wang, Tongwen Huang et al.

ICCV 2025 • poster • arXiv:2507.00790
3 citations

Locality-Aware Zero-Shot Human-Object Interaction Detection

Sanghyun Kim, Deunsol Jung, Minsu Cho

CVPR 2025 • poster • arXiv:2505.19503

MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer

Yuancheng Wang, Haoyue Zhan, Liwei Liu et al.

ICLR 2025 • poster • arXiv:2409.00750
156 citations

MetaOOD: Automatic Selection of OOD Detection Models

Yuehan Qin, Yichi Zhang, Yi Nian et al.

ICLR 2025 • poster • arXiv:2410.03074
16 citations

MIRA: Medical Time Series Foundation Model for Real-World Health Data

Hao Li, Bowen Deng, Chang Xu et al.

NeurIPS 2025 • oral • arXiv:2506.07584
4 citations

MotionDiff: Training-free Zero-shot Interactive Motion Editing via Flow-assisted Multi-view Diffusion

Yikun Ma, Yiqing Li, Jiawei Wu et al.

ICCV 2025 • poster • arXiv:2503.17695
1 citation

Multimodal Unsupervised Domain Generalization by Retrieving Across the Modality Gap

Christopher Liao, Christian So, Theodoros Tsiligkaridis et al.

ICLR 2025 • poster • arXiv:2402.04416
1 citation

Multitask Learning with Stochastic Interpolants

Hugo Negrel, Florentin Coeurdoux, Michael Albergo et al.

NeurIPS 2025 • spotlight • arXiv:2508.04605

Noisy Test-Time Adaptation in Vision-Language Models

Chentao Cao, Zhun Zhong, (Andrew) Zhanke Zhou et al.

ICLR 2025 • poster • arXiv:2502.14604
4 citations

PostCast: Generalizable Postprocessing for Precipitation Nowcasting via Unsupervised Blurriness Modeling

Junchao Gong, Siwei Tu, Weidong Yang et al.

ICLR 2025 • oral • arXiv:2410.05805
7 citations

RaySt3R: Predicting Novel Depth Maps for Zero-Shot Object Completion

Bardienus Duisterhof, Jan Oberst, Bowen Wen et al.

NeurIPS 2025 • poster • arXiv:2506.05285
4 citations

Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval

Yuanmin Tang, Jue Zhang, Xiaoting Qin et al.

CVPR 2025 • highlight • arXiv:2412.11077
15 citations

Reconstruct, Inpaint, Test-Time Finetune: Dynamic Novel-view Synthesis from Monocular Videos

Kaihua Chen, Tarasha Khurana, Deva Ramanan

NeurIPS 2025 • poster • arXiv:2507.12646
2 citations

RESAnything: Attribute Prompting for Arbitrary Referring Segmentation

Ruiqi Wang, Hao Zhang

NeurIPS 2025 • poster • arXiv:2505.02867
2 citations

scGeneScope: A Treatment-Matched Single Cell Imaging and Transcriptomics Dataset and Benchmark for Treatment Response Modeling

Joel Dapello, Marcel Nassar, Ridvan Eksi et al.

NeurIPS 2025 • poster

SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding

Rong Li, Shijie Li, Lingdong Kong et al.

CVPR 2025 • poster • arXiv:2412.04383
40 citations

Semantic Surgery: Zero-Shot Concept Erasure in Diffusion Models

Lexiang Xiong, Liu Chengyu, Jingwen Ye et al.

NeurIPS 2025 • poster • arXiv:2510.22851

Should VLMs be Pre-trained with Image Data?

Sedrick Keh, Jean Mercat, Samir Yitzhak Gadre et al.

ICLR 2025 • poster • arXiv:2503.07603

SPAZER: Spatial-Semantic Progressive Reasoning Agent for Zero-shot 3D Visual Grounding

Zhao Jin, Rong-Cheng Tu, Jingyi Liao et al.

NeurIPS 2025 • poster • arXiv:2506.21924
2 citations

Teaching Human Behavior Improves Content Understanding Abilities Of VLMs

Somesh Singh, Harini S I, Yaman Singla et al.

ICLR 2025 • poster
2 citations

The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs

Hong Li, Nanxi Li, Yuanjie Chen et al.

ICLR 2025 • poster • arXiv:2410.01417
3 citations

TikZero: Zero-Shot Text-Guided Graphics Program Synthesis

Jonas Belouadi, Eddy Ilg, Margret Keuper et al.

ICCV 2025 • highlight • arXiv:2503.11509
2 citations

Translation of Text Embedding via Delta Vector to Suppress Strongly Entangled Content in Text-to-Image Diffusion Models

Eunseo Koh, SeungHoo Hong, Tae-Young Kim et al.

ICCV 2025 • poster • arXiv:2508.10407

TS-RAG: Retrieval-Augmented Generation based Time Series Foundation Models are Stronger Zero-Shot Forecaster

Kanghui Ning, Zijie Pan, Yu Liu et al.

NeurIPS 2025 • poster • arXiv:2503.07649
11 citations

Unleashing the Potential of Multimodal LLMs for Zero-Shot Spatio-Temporal Video Grounding

Zaiquan Yang, Yuhao Liu, Gerhard Hancke et al.

NeurIPS 2025 • oral • arXiv:2509.15178
2 citations

Vision Transformers with Self-Distilled Registers

Zipeng Yan, Yinjie Chen, Chong Zhou et al.

NeurIPS 2025 • spotlight • arXiv:2505.21501
4 citations

Visual and Semantic Prompt Collaboration for Generalized Zero-Shot Learning

Huajie Jiang, Zhengxian Li, Xiaohan Yu et al.

CVPR 2025 • poster • arXiv:2503.23030
1 citation

X-Dyna: Expressive Dynamic Human Image Animation

Di Chang, Hongyi Xu, You Xie et al.

CVPR 2025 • highlight • arXiv:2501.10021
14 citations