"zero-shot learning" Papers
84 papers found • Page 2 of 2
CORECODE: A Common Sense Annotated Dialogue Dataset with Benchmark Tasks for Chinese Large Language Models
Dan Shi, Chaobin You, Jian-Tao Huang et al.
Data Adaptive Traceback for Vision-Language Foundation Models in Image Classification
Long-Fei Li, Peng Zhao, Zhi-Hua Zhou
Data-Free Generalized Zero-Shot Learning
Bowen Tang, Jing Zhang, Yan Long et al.
DeCoOp: Robust Prompt Tuning with Out-of-Distribution Detection
Zhi Zhou, Ming Yang, Jiang-Xin Shi et al.
ELF: Encoding Speaker-Specific Latent Speech Feature for Speech Synthesis
Jungil Kong, Junmo Lee, Jeongmin Kim et al.
FALIP: Visual Prompt as Foveal Attention Boosts CLIP Zero-Shot Performance
Jiedong Zhuang, Jiaqi Hu, Lianrui Mu et al.
GenerateCT: Text-Conditional Generation of 3D Chest CT Volumes
Ibrahim Ethem Hamamci, Sezgin Er, Anjany Sekuboyina et al.
GroundVLP: Harnessing Zero-Shot Visual Grounding from Vision-Language Pre-training and Open-Vocabulary Object Detection
Haozhan Shen, Tiancheng Zhao, Mingwei Zhu et al.
HiFi-123: Towards High-fidelity One Image to 3D Content Generation
Wangbo Yu, Li Yuan, Yanpei Cao et al.
Image Captioning with Multi-Context Synthetic Data
Improving Diffusion Models for Inverse Problems Using Optimal Posterior Covariance
Xinyu Peng, Ziyang Zheng, Wenrui Dai et al.
InstructDoc: A Dataset for Zero-Shot Generalization of Visual Document Understanding with Instructions
Ryota Tanaka, Taichi Iki, Kyosuke Nishida et al.
Interactive Visual Task Learning for Robots
LangCell: Language-Cell Pre-training for Cell Identity Understanding
Suyuan Zhao, Jiahuan Zhang, Yushuai Wu et al.
Omni-Recon: Harnessing Image-based Rendering for General-Purpose Neural Radiance Fields
Yonggan Fu, Huaizhi Qu, Zhifan Ye et al.
OmniSSR: Zero-shot Omnidirectional Image Super-Resolution using Stable Diffusion Model
Runyi Li, Xuhan Sheng, Weiqi Li et al.
PEARL: Zero-shot Cross-task Preference Alignment and Robust Reward Learning for Robotic Manipulation
Runze Liu, Yali Du, Fengshuo Bai et al.
PIVOT: Iterative Visual Prompting Elicits Actionable Knowledge for VLMs
Soroush Nasiriany, Fei Xia, Wenhao Yu et al.
Pix2Gif: Motion-Guided Diffusion for GIF Generation
Hitesh Kandala, Jianfeng Gao, Jianwei Yang
Prompting Segmentation with Sound Is Generalizable Audio-Visual Source Localizer
Yaoting Wang, Weisong Liu, Guangyao Li et al.
Propose, Assess, Search: Harnessing LLMs for Goal-Oriented Planning in Instructional Videos
Mohaiminul Islam, Tushar Nagarajan, Huiyu Wang et al.
Revisiting the Role of Language Priors in Vision-Language Models
Zhiqiu Lin, Xinyue Chen, Deepak Pathak et al.
SelfVC: Voice Conversion With Iterative Refinement using Self Transformations
Paarth Neekhara, Shehzeen Hussain, Rafael Valle et al.
StyleSinger: Style Transfer for Out-of-Domain Singing Voice Synthesis
Task Contamination: Language Models May Not Be Few-Shot Anymore
Changmao Li, Jeffrey Flanigan
Tilt your Head: Activating the Hidden Spatial-Invariance of Classifiers
Johann Schmidt, Sebastian Stober
Towards Causal Foundation Model: on Duality between Optimal Balancing and Attention
Jiaqi Zhang, Joel Jennings, Agrin Hilmkil et al.
Training-free Video Temporal Grounding using Large-scale Pre-trained Models
Minghang Zheng, Xinhao Cai, Qingchao Chen et al.
VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual Context
Yunxin Li, Baotian Hu, Haoyuan Shi et al.
Visual-Text Cross Alignment: Refining the Similarity Score in Vision-Language Models
Jinhao Li, Haopeng Li, Sarah Erfani et al.
Wikiformer: Pre-training with Structured Information of Wikipedia for Ad-Hoc Retrieval
Weihang Su, Qingyao Ai, Xiangsheng Li et al.
Zero-1-to-3: Domain-Level Zero-Shot Cognitive Diagnosis via One Batch of Early-Bird Students towards Three Diagnostic Objectives
Weibo Gao, Qi Liu, Hao Wang et al.
Zero-Shot Image Feature Consensus with Deep Functional Maps
Xinle Cheng, Congyue Deng, Adam Harley et al.
Zero-shot Text-guided Infinite Image Synthesis with LLM guidance
Soyeong Kwon, Taegyeong Lee, Taehwan Kim