Text-Free Learning of a Natural Language Interface for Pretrained Face Generators

On 9 Sep, 2022

September, 2022

Abstract

We propose Fast text2StyleGAN, a natural language interface that adapts pre-trained GANs for text-guided human face synthesis. By leveraging recent advances in Contrastive Language-Image Pre-training (CLIP), our method requires no text data during training. Fast text2StyleGAN is formulated as a conditional variational autoencoder (CVAE), which provides extra control and diversity in the generated images at test time. Our model does not require re-training or fine-tuning of the GANs or CLIP when encountering new text prompts. Unlike prior work, we do not rely on optimization at test time, which makes our method orders of magnitude faster. Empirically, on the FFHQ dataset, our method generates images from natural language descriptions with varying levels of detail faster and more accurately than prior work.
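
The CVAE formulation in the abstract can be illustrated with a small toy sketch. It is not the paper's implementation: the class name `ToyCVAE`, the layer shapes, the random weights, and the placeholder condition vector are all hypothetical. It also assumes one plausible reading of the text-free setup, namely that the condition is a CLIP image embedding during training and a CLIP text embedding at test time, since CLIP places both in a shared space. The sketch shows only how a conditional decoder can turn one fixed condition into several different generator latent codes by drawing different noise vectors, with no test-time optimization loop.

```python
import math
import random


def linear(weights, vec):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]


def random_matrix(rows, cols, rng):
    """Small random weights standing in for trained parameters."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]


class ToyCVAE:
    """Hypothetical linear CVAE over generator latent codes w, conditioned on c."""

    def __init__(self, latent_dim, cond_dim, z_dim, seed=0):
        rng = random.Random(seed)
        self.z_dim = z_dim
        # Encoder: [w; c] -> mean and log-variance of q(z | w, c).
        self.enc_mu = random_matrix(z_dim, latent_dim + cond_dim, rng)
        self.enc_logvar = random_matrix(z_dim, latent_dim + cond_dim, rng)
        # Decoder: [z; c] -> reconstructed generator latent code.
        self.dec = random_matrix(latent_dim, z_dim + cond_dim, rng)

    def encode(self, w, c):
        h = w + c  # list concatenation of latent code and condition
        return linear(self.enc_mu, h), linear(self.enc_logvar, h)

    def reparameterize(self, mu, logvar, rng):
        # z = mu + sigma * eps, with eps ~ N(0, I)
        return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
                for m, lv in zip(mu, logvar)]

    def decode(self, z, c):
        return linear(self.dec, z + c)

    def sample(self, c, rng):
        # Test time: draw z from the prior and run a single forward pass.
        z = [rng.gauss(0.0, 1.0) for _ in range(self.z_dim)]
        return self.decode(z, c)


def kl_to_standard_normal(mu, logvar):
    """KL(q(z | w, c) || N(0, I)) for a diagonal Gaussian posterior."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, logvar))


model = ToyCVAE(latent_dim=8, cond_dim=4, z_dim=2)
rng = random.Random(1)

# Placeholder for a CLIP text embedding (assumption: a real one would come from CLIP).
condition = [0.5, -0.5, 0.5, -0.5]

# Two draws for the same prompt give two different latent codes.
w_a = model.sample(condition, rng)
w_b = model.sample(condition, rng)

# One training-style pass: encode, reparameterize, decode.
mu, logvar = model.encode(w_a, condition)
w_rec = model.decode(model.reparameterize(mu, logvar, rng), condition)
```

In a full system, the decoded vector would be passed to the frozen pre-trained generator to produce an image. Because sampling is a single forward pass, the cost per prompt does not depend on any iterative search.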

Attachment:

Text-Free Learning of a Natural Language Interface for Pretrained Face Generators.pdf

Resource Type:

Academic Paper

Tags:

Machine Learning

Artificial Intelligence

Language Models

Natural Language Processing

NLP

Contrastive Language-Image Pre-training

Image and Video Processing

Computer Vision and Pattern Recognition
