Paint by Word

We investigate the problem of zero-shot semantic image painting. Instead of painting modifications into an image using only concrete colors or a finite set of semantic concepts, we ask how to create semantic paint based on open full-text descriptions: our goal is to be able to point to a location in a synthesized image and apply an arbitrary new concept such as “rustic” or “opulent” or “happy dog.” To do this, our method combines a state-of-the-art generative model of realistic images with a state-of-the-art text-image semantic similarity network. We find that, to make large changes, it is important to use non-gradient methods to explore latent space, and it is important to relax the computations of the GAN to target changes to a specific region. We conduct user studies to compare our methods to several baselines.
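To make the idea of non-gradient latent search concrete, here is a minimal sketch of a simple elitist evolution strategy that perturbs a latent vector and keeps whichever candidate scores best inside a masked region. This is not the paper's implementation: `toy_generator` and `toy_score` are hypothetical stand-ins for the generative model and the text-image similarity network, the optimizer is a deliberately simple one chosen for illustration, and masking the score only approximates region targeting (it does not relax the GAN's internal computations as the abstract describes).

```python
import numpy as np

SIDE = 8  # toy image is SIDE x SIDE; the latent has 2 * SIDE entries


def toy_generator(z):
    """Hypothetical stand-in for a GAN generator: latent vector -> small 'image'."""
    return np.tanh(np.outer(z[:SIDE], z[SIDE:2 * SIDE]))


def toy_score(image, mask, target):
    """Hypothetical stand-in for a text-image similarity score.

    Only pixels where mask == 1 contribute, so the search is driven by
    the painted region alone. Higher is better.
    """
    diff = (image - target) * mask
    return -float(np.sum(diff ** 2))


def evolve_latent(z0, mask, target, steps=200, pop=16, sigma=0.1, seed=0):
    """Elitist evolution strategy: uses score values only, never gradients."""
    rng = np.random.default_rng(seed)
    best_z = z0.copy()
    best_s = toy_score(toy_generator(best_z), mask, target)
    for _ in range(steps):
        # Sample a population of perturbed latents around the current best.
        candidates = best_z + sigma * rng.standard_normal((pop, z0.size))
        scores = [toy_score(toy_generator(c), mask, target) for c in candidates]
        i = int(np.argmax(scores))
        # Keep a candidate only if it improves on the current best.
        if scores[i] > best_s:
            best_z, best_s = candidates[i], scores[i]
    return best_z, best_s


# Usage: "paint" a 4x4 region in the middle of the toy image.
z0 = np.random.default_rng(1).standard_normal(2 * SIDE)
mask = np.zeros((SIDE, SIDE))
mask[2:6, 2:6] = 1.0
target = np.full((SIDE, SIDE), 0.5)

start = toy_score(toy_generator(z0), mask, target)
z, final = evolve_latent(z0, mask, target)
```

Because a candidate is only accepted when it improves the score, `final` is never worse than `start`. With a real generator and similarity network, each score evaluation would be a full forward pass, so the population size and number of steps become the main cost to tune.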

https://arxiv.org/abs/2103.10951

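In this paper, we study the problem of zero-shot semantic image painting. Rather than painting discrete colors or a limited set of semantic concepts onto an image, we ask how to paint semantically from open full-text descriptions: our goal is to point to a region of an image and paint arbitrary content into it, such as “rustic,” “opulent,” or a specific pattern. To accomplish this, our method combines an existing state-of-the-art image generation model with a text-image semantic similarity network. We find that, to make large changes, it is important to relax the GAN's computations so that changes target a specific region. We compare our method against several baselines.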
