Navigating the GAN Parameter Space for Semantic Image Editing


Generative Adversarial Networks (GANs) are currently an indispensable tool for visual editing, being a standard component of image-to-image translation and image restoration pipelines. Furthermore, GANs are especially useful for controllable generation since their latent spaces contain a wide range of interpretable directions, well suited for semantic editing operations. By gradually changing latent codes along these directions, one can produce impressive visual effects, unattainable without GANs. 
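The latent-space editing described above can be sketched in miniature. The snippet below is an illustrative toy, not the paper's code: it stands in a linear map for the generator and a coordinate axis for an "interpretable direction" (names like `generate`, `direction`, and `alpha` are assumptions), showing how gradually moving a latent code along a fixed direction interpolates an edit.

```python
import numpy as np

# Toy stand-in for a GAN generator: a fixed linear map from latent space to
# "image" space. Real GANs (e.g. StyleGAN2) use deep networks instead.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 4))   # toy generator weights

def generate(z, weights=W):
    """Map a latent code z to an output sample."""
    return weights @ z

z = rng.standard_normal(4)        # a latent code
direction = np.eye(4)[0]          # a hypothetical interpretable direction

# Gradually move the latent code along the direction; each frame is a
# slightly stronger version of the same semantic edit.
frames = [generate(z + alpha * direction) for alpha in np.linspace(0.0, 3.0, 5)]
```

Because the toy generator is linear, consecutive frames differ by a constant offset; with a real generator the outputs change nonlinearly but, for a well-chosen direction, still smoothly along one semantic attribute.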
In this paper, we significantly expand the range of visual effects achievable with the state-of-the-art models, like StyleGAN2. In contrast to existing works, which mostly operate by latent codes, we discover interpretable directions in the space of the generator parameters. By several simple methods, we explore this space and demonstrate that it also contains a plethora of interpretable directions, which are an excellent source of non-trivial semantic manipulations. The discovered manipulations cannot be achieved by transforming the latent codes and can be used to edit both synthetic and real images. We release our code and models and hope they will serve as a handy tool for further efforts on GAN-based image editing.
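The paper's key move, searching for directions in the generator's *parameter* space rather than its latent space, can also be sketched with the same toy setup. Everything here is an illustrative assumption (the linear "generator", the random `param_direction`), not the authors' method: the point is only that the latent code stays fixed while the weights are shifted.

```python
import numpy as np

# Miniature sketch of parameter-space editing: instead of moving the latent
# code, move the generator's weights along a direction in weight space.
rng = np.random.default_rng(1)
W = rng.standard_normal((8, 4))                    # base toy generator weights
param_direction = rng.standard_normal((8, 4))      # a direction in parameter space
param_direction /= np.linalg.norm(param_direction)

def generate(z, weights):
    return weights @ z

z = rng.standard_normal(4)  # the latent code is held fixed throughout

# Each output comes from a slightly perturbed *generator*, not a new latent code.
edited = [generate(z, W + t * param_direction) for t in np.linspace(0.0, 2.0, 5)]
```

Since the latent code never changes, any variation across `edited` is produced entirely by the weight shift, which is why such manipulations cannot be reproduced by transforming latent codes alone.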

https://arxiv.org/abs/2011.13786

GANs have become an indispensable tool for visual editing and a standard component of image-to-image translation and image restoration pipelines. They are especially valuable for controllable generation, since their latent spaces contain a wide range of interpretable directions that semantic editing tasks can exploit. By gradually moving latent codes along these directions, one can produce striking visual effects that would be unattainable without GANs. In this paper, we greatly expand the range of visual effects achievable with state-of-the-art models such as StyleGAN2. Unlike most existing approaches, which operate on latent codes, we search for interpretable directions in the space of the generator's parameters. With a few simple methods we explore this space and show that it, too, contains a wealth of interpretable directions, a rich source of non-trivial semantic manipulations. The discovered manipulations cannot be achieved by transforming latent codes, and they can be applied to both synthetic and real images.
