2025 "zero-shot learning" Papers
53 papers found • Page 1 of 2
Advancing Generalizable Tumor Segmentation with Anomaly-Aware Open-Vocabulary Attention Maps and Frozen Foundation Diffusion Models
Yankai Jiang, Peng Zhang, Donglin Yang et al.
AIMS.au: A Dataset for the Analysis of Modern Slavery Countermeasures in Corporate Statements
Adriana-Eufrosina Bora, Pierre-Luc St-Charles, Mirko Bronzi et al.
AmorLIP: Efficient Language-Image Pretraining via Amortization
Haotian Sun, Yitong Li, Yuchen Zhuang et al.
A Unified Reasoning Framework for Holistic Zero-Shot Video Anomaly Analysis
Dongheng Lin, Mengxue Qu, Kunyang Han et al.
Beyond Words: Augmenting Discriminative Richness via Diffusions in Unsupervised Prompt Learning
Hairui Ren, Fan Tang, He Zhao et al.
Boosting Skeleton-based Zero-Shot Action Recognition with Training-Free Test-Time Adaptation
Jingmin Zhu, Anqi Zhu, Hossein Rahmani et al.
Can LLMs Understand Time Series Anomalies?
Zihao Zhou, Rose Yu
CLIBD: Bridging Vision and Genomics for Biodiversity Monitoring at Scale
ZeMing Gong, Austin Wang, Xiaoliang Huo et al.
CPath-Omni: A Unified Multimodal Foundation Model for Patch and Whole Slide Image Analysis in Computational Pathology
Yuxuan Sun, Yixuan Si, Chenglu Zhu et al.
CrypticBio: A Large Multimodal Dataset for Visually Confusing Species
Georgiana Manolache, Gerard Schouten, Joaquin Vanschoren
Dense Video Object Captioning from Disjoint Supervision
Xingyi Zhou, Anurag Arnab, Chen Sun et al.
Diorama: Unleashing Zero-shot Single-view 3D Indoor Scene Modeling
Qirui Wu, Denys Iliash, Daniel Ritchie et al.
DocVLM: Make Your VLM an Efficient Reader
Mor Shpigel Nacson, Aviad Aberdam, Roy Ganz et al.
FlickerFusion: Intra-trajectory Domain Generalizing Multi-agent Reinforcement Learning
Woosung Koh, Wonbeen Oh, Siyeol Kim et al.
FlySearch: Exploring how vision-language models explore
Adam Pardyl, Dominik Matuszek, Mateusz Przebieracz et al.
GVDepth: Zero-Shot Monocular Depth Estimation for Ground Vehicles based on Probabilistic Cue Fusion
Karlo Koledic, Luka Petrovic, Ivan Marković et al.
HALL-E: Hierarchical Neural Codec Language Model for Minute-Long Zero-Shot Text-to-Speech Synthesis
Yuto Nishimura, Takumi Hirose, Masanari Ohi et al.
Imagine and Seek: Improving Composed Image Retrieval with an Imagined Proxy
You Li, Fan Ma, Yi Yang
InstructHOI: Context-Aware Instruction for Multi-Modal Reasoning in Human-Object Interaction Detection
Jinguo Luo, Weihong Ren, Quanlong Zheng et al.
Knowledge Transfer from Interaction Learning
Yilin Gao, Kangyi Chen, Zhongxing Peng et al.
Large-Scale Text-to-Image Model with Inpainting is a Zero-Shot Subject-Driven Image Generator
Chaehun Shin, Jooyoung Choi, Heeseung Kim et al.
LATINO-PRO: LAtent consisTency INverse sOlver with PRompt Optimization
Alessio Spagnoletti, Jean Prost, Andres Almansa et al.
LD-RPS: Zero-Shot Unified Image Restoration via Latent Diffusion Recurrent Posterior Sampling
Li Huaqiu, Yong Wang, Tongwen Huang et al.
Locality-Aware Zero-Shot Human-Object Interaction Detection
Sanghyun Kim, Deunsol Jung, Minsu Cho
MaskGCT: Zero-Shot Text-to-Speech with Masked Generative Codec Transformer
Yuancheng Wang, Haoyue Zhan, Liwei Liu et al.
MetaOOD: Automatic Selection of OOD Detection Models
Yuehan Qin, Yichi Zhang, Yi Nian et al.
MIRA: Medical Time Series Foundation Model for Real-World Health Data
Hao Li, Bowen Deng, Chang Xu et al.
MotionDiff: Training-free Zero-shot Interactive Motion Editing via Flow-assisted Multi-view Diffusion
Yikun Ma, Yiqing Li, Jiawei Wu et al.
Multimodal Unsupervised Domain Generalization by Retrieving Across the Modality Gap
Christopher Liao, Christian So, Theodoros Tsiligkaridis et al.
Multitask Learning with Stochastic Interpolants
Hugo Negrel, Florentin Coeurdoux, Michael Albergo et al.
Noisy Test-Time Adaptation in Vision-Language Models
Chentao Cao, Zhun Zhong, (Andrew) Zhanke Zhou et al.
PostCast: Generalizable Postprocessing for Precipitation Nowcasting via Unsupervised Blurriness Modeling
Junchao Gong, Siwei Tu, Weidong Yang et al.
RaySt3R: Predicting Novel Depth Maps for Zero-Shot Object Completion
Bardienus Duisterhof, Jan Oberst, Bowen Wen et al.
Reason-before-Retrieve: One-Stage Reflective Chain-of-Thoughts for Training-Free Zero-Shot Composed Image Retrieval
Yuanmin Tang, Jue Zhang, Xiaoting Qin et al.
Reconstruct, Inpaint, Test-Time Finetune: Dynamic Novel-view Synthesis from Monocular Videos
Kaihua Chen, Tarasha Khurana, Deva Ramanan
RESAnything: Attribute Prompting for Arbitrary Referring Segmentation
Ruiqi Wang, Hao Zhang
scGeneScope: A Treatment-Matched Single Cell Imaging and Transcriptomics Dataset and Benchmark for Treatment Response Modeling
Joel Dapello, Marcel Nassar, Ridvan Eksi et al.
SeeGround: See and Ground for Zero-Shot Open-Vocabulary 3D Visual Grounding
Rong Li, Shijie Li, Lingdong Kong et al.
Semantic Surgery: Zero-Shot Concept Erasure in Diffusion Models
Lexiang Xiong, Liu Chengyu, Jingwen Ye et al.
Should VLMs be Pre-trained with Image Data?
Sedrick Keh, Jean Mercat, Samir Yitzhak Gadre et al.
SPAZER: Spatial-Semantic Progressive Reasoning Agent for Zero-shot 3D Visual Grounding
Zhao Jin, Rong-Cheng Tu, Jingyi Liao et al.
Teaching Human Behavior Improves Content Understanding Abilities Of VLMs
SOMESH SINGH, Harini S I, Yaman Singla et al.
The Labyrinth of Links: Navigating the Associative Maze of Multi-modal LLMs
HONG LI, Nanxi Li, Yuanjie Chen et al.
TikZero: Zero-Shot Text-Guided Graphics Program Synthesis
Jonas Belouadi, Eddy Ilg, Margret Keuper et al.
Translation of Text Embedding via Delta Vector to Suppress Strongly Entangled Content in Text-to-Image Diffusion Models
Eunseo Koh, SeungHoo Hong, Tae-Young Kim et al.
TS-RAG: Retrieval-Augmented Generation based Time Series Foundation Models are Stronger Zero-Shot Forecaster
Kanghui Ning, Zijie Pan, Yu Liu et al.
Unleashing the Potential of Multimodal LLMs for Zero-Shot Spatio-Temporal Video Grounding
Zaiquan Yang, Yuhao LIU, Gerhard Hancke et al.
Vision Transformers with Self-Distilled Registers
Zipeng Yan, Yinjie Chen, Chong Zhou et al.
Visual and Semantic Prompt Collaboration for Generalized Zero-Shot Learning
Huajie Jiang, Zhengxian Li, Xiaohan Yu et al.
X-Dyna: Expressive Dynamic Human Image Animation
Di Chang, Hongyi Xu, You Xie et al.