TFCustom: Customized Image Generation with Time-Aware Frequency Feature Guidance

Citations: 5 | Authors: 8 | Data Points: 2

Abstract

Subject-driven image personalization has seen notable advancements, especially with the ReferenceNet paradigm, which excels at integrating reference image features for creative and commercial applications. However, current ReferenceNet implementations mainly function as latent-level feature extractors, which restricts the delivery of suitable features to the denoising backbone across timesteps and results in suboptimal image consistency. In this paper, we revisit reference feature extraction and propose TFCustom, a framework that attends to reference image features at different temporal and frequency levels. We introduce a synchronized ReferenceNet to extract reference features while optimizing noise injection and denoising. We also propose a time-aware frequency refinement module that combines high- and low-frequency filters with time embeddings to adaptively select which reference features to inject. Additionally, we introduce a reward-based loss to improve the similarity between reference objects and generated images. Experimental results show that TFCustom outperforms existing methods in single-object and multi-object reference generation, with significant improvements in textual details.
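
The time-aware frequency refinement idea can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the TFCustom implementation: the abstract does not specify the filters or the gating, so a box blur stands in for the low-pass filter, the residual stands in for the high-frequency branch, and a fixed sigmoid of the timestep stands in for a gate learned from time embeddings. The choice to favor low frequencies at noisy (large `t`) steps is likewise a guess, and the names `box_blur`, `split_frequencies`, `time_gate`, and `refine` are hypothetical.

```python
import math

def box_blur(feat, radius=1):
    """Stand-in low-pass filter: mean over a (2r+1)x(2r+1) window, clipped at borders."""
    h, w = len(feat), len(feat[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            rows = range(max(0, i - radius), min(h, i + radius + 1))
            cols = range(max(0, j - radius), min(w, j + radius + 1))
            vals = [feat[r][c] for r in rows for c in cols]
            out[i][j] = sum(vals) / len(vals)
    return out

def split_frequencies(feat, radius=1):
    """Split a 2D feature map into a low-frequency part and a high-frequency residual."""
    low = box_blur(feat, radius)
    high = [[x - l for x, l in zip(f_row, l_row)] for f_row, l_row in zip(feat, low)]
    return low, high

def time_gate(t, num_steps, sharpness=8.0):
    """Hypothetical gate in (0, 1): close to 1 at large t, close to 0 at small t.
    A learned function of the time embedding would replace this in a real model."""
    return 1.0 / (1.0 + math.exp(-sharpness * (t / num_steps - 0.5)))

def refine(feat, t, num_steps=1000):
    """Blend the two frequency bands with a timestep-dependent weight."""
    low, high = split_frequencies(feat)
    g = time_gate(t, num_steps)
    return [[g * l + (1.0 - g) * h for l, h in zip(l_row, h_row)]
            for l_row, h_row in zip(low, high)]

if __name__ == "__main__":
    reference = [[1.0, 2.0], [3.0, 4.0]]
    for step in (900, 500, 100):
        print(step, refine(reference, step))
```

In this sketch the two bands always sum back to the input, so the gate only redistributes emphasis between coarse structure and fine detail as the timestep changes.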

Citation History

Jan 25, 2026: 5 citations
Jan 31, 2026: 5 citations