"zero-shot learning" Papers

84 papers found • Page 2 of 2

CORECODE: A Common Sense Annotated Dialogue Dataset with Benchmark Tasks for Chinese Large Language Models

Dan Shi, Chaobin You, Jian-Tao Huang et al.

AAAI 2024 • paper • arXiv:2312.12853
2 citations

Data Adaptive Traceback for Vision-Language Foundation Models in Image Classification

Long-Fei Li, Peng Zhao, Zhi-Hua Zhou

AAAI 2024 • paper • arXiv:2407.08787
4 citations

Data-Free Generalized Zero-Shot Learning

Bowen Tang, Jing Zhang, Yan Long et al.

AAAI 2024 • paper • arXiv:2401.15657

DeCoOp: Robust Prompt Tuning with Out-of-Distribution Detection

Zhi Zhou, Ming Yang, Jiang-Xin Shi et al.

ICML 2024 • poster

ELF: Encoding Speaker-Specific Latent Speech Feature for Speech Synthesis

Jungil Kong, Junmo Lee, Jeongmin Kim et al.

ICML 2024 • poster

FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance

Jiedong Zhuang, Jiaqi Hu, Lianrui Mu et al.

ECCV 2024 • poster • arXiv:2407.05578
7 citations

GenerateCT: Text-Conditional Generation of 3D Chest CT Volumes

Ibrahim Ethem Hamamci, Sezgin Er, Anjany Sekuboyina et al.

ECCV 2024 • poster • arXiv:2305.16037
52 citations

GroundVLP: Harnessing Zero-Shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection

Haozhan Shen, Tiancheng Zhao, Mingwei Zhu et al.

AAAI 2024 • paper • arXiv:2312.15043

HiFi-123: Towards High-fidelity One Image to 3D Content Generation

Wangbo Yu, Li Yuan, Yanpei Cao et al.

ECCV 2024 • poster • arXiv:2310.06744
34 citations

Image Captioning with Multi-Context Synthetic Data

AAAI 2024 • paper • arXiv:2305.18072

Improving Diffusion Models for Inverse Problems Using Optimal Posterior Covariance

Xinyu Peng, Ziyang Zheng, Wenrui Dai et al.

ICML 2024 • poster

InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions

Ryota Tanaka, Taichi Iki, Kyosuke Nishida et al.

AAAI 2024 • paper • arXiv:2401.13313

Interactive Visual Task Learning for Robots

AAAI 2024 • paper • arXiv:2312.13219

LangCell: Language-Cell Pre-training for Cell Identity Understanding

Suyuan Zhao, Jiahuan Zhang, Yushuai Wu et al.

ICML 2024 • poster

Omni-Recon: Harnessing Image-based Rendering for General-Purpose Neural Radiance Fields

Yonggan Fu, Huaizhi Qu, Zhifan Ye et al.

ECCV 2024 • poster • arXiv:2403.11131

OmniSSR: Zero-shot Omnidirectional Image Super-Resolution using Stable Diffusion Model

Runyi Li, Xuhan SHENG, Weiqi Li et al.

ECCV 2024 • poster • arXiv:2404.10312
11 citations

PEARL: Zero-shot Cross-task Preference Alignment and Robust Reward Learning for Robotic Manipulation

Runze Liu, Yali Du, Fengshuo Bai et al.

ICML 2024 • poster

PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs

Soroush Nasiriany, Fei Xia, Wenhao Yu et al.

ICML 2024 • poster

Pix2Gif: Motion-Guided Diffusion for GIF Generation

Hitesh Kandala, Jianfeng Gao, Jianwei Yang

ECCV 2024 • poster • arXiv:2403.04634
5 citations

Prompting Segmentation with Sound Is Generalizable Audio-Visual Source Localizer

Yaoting Wang, Liu Weisong, Guangyao Li et al.

AAAI 2024 • paper • arXiv:2309.07929
38 citations

Propose, Assess, Search: Harnessing LLMs for Goal-Oriented Planning in Instructional Videos

Mohaiminul Islam, Tushar Nagarajan, Huiyu Wang et al.

ECCV 2024 • poster • arXiv:2409.20557
10 citations

Revisiting the Role of Language Priors in Vision-Language Models

Zhiqiu Lin, Xinyue Chen, Deepak Pathak et al.

ICML 2024 • poster

SelfVC: Voice Conversion With Iterative Refinement using Self Transformations

Paarth Neekhara, Shehzeen Hussain, Rafael Valle et al.

ICML 2024 • poster

StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis

AAAI 2024 • paper • arXiv:2312.10741

Task Contamination: Language Models May Not Be Few-Shot Anymore

Changmao Li, Jeffrey Flanigan

AAAI 2024 • paper • arXiv:2312.16337
130 citations

Tilt your Head: Activating the Hidden Spatial-Invariance of Classifiers

Johann Schmidt, Sebastian Stober

ICML 2024 • poster

Towards Causal Foundation Model: on Duality between Optimal Balancing and Attention

Jiaqi Zhang, Joel Jennings, Agrin Hilmkil et al.

ICML 2024 • poster

Training-free Video Temporal Grounding using Large-scale Pre-trained Models

Minghang Zheng, Xinhao Cai, Qingchao Chen et al.

ECCV 2024 • poster • arXiv:2408.16219
20 citations

VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual Context

yunxin li, Baotian Hu, Haoyuan Shi et al.

ICML 2024 • poster

Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models

Jinhao Li, Haopeng Li, Sarah Erfani et al.

ICML 2024 • poster

Wikiformer: Pre-training with Structured Information of Wikipedia for Ad-Hoc Retrieval

Weihang Su, Qingyao Ai, Xiangsheng Li et al.

AAAI 2024 • paper • arXiv:2312.10661
22 citations

Zero-1-to-3: Domain-Level Zero-Shot Cognitive Diagnosis via One Batch of Early-Bird Students towards Three Diagnostic Objectives

Weibo Gao, Qi Liu, Hao Wang et al.

AAAI 2024 • paper • arXiv:2312.13434
29 citations

Zero-Shot Image Feature Consensus with Deep Functional Maps

Xinle Cheng, Congyue Deng, Adam Harley et al.

ECCV 2024 • poster • arXiv:2403.12038
8 citations

Zero-shot Text-guided Infinite Image Synthesis with LLM guidance

Soyeong Kwon, TAEGYEONG LEE, Taehwan Kim

ECCV 2024 • poster • arXiv:2407.12642
3 citations