SWAGAN: A Style-based Wavelet-driven Generative Model

In recent years, considerable progress has been made in the visual quality of Generative Adversarial Networks (GANs). Even so, these networks still suffer from degradation in quality for high-frequency content, stemming from a spectrally biased architecture, and similarly unfavorable loss functions. To address this issue, we present a novel general-purpose Style and WAvelet based GAN (SWAGAN) that implements progressive generation in the frequency domain. SWAGAN incorporates wavelets throughout its generator and discriminator architectures, enforcing a frequency-aware latent representation at every step of the way. This approach yields enhancements in the visual quality of the generated images, and considerably increases computational performance. We demonstrate the advantage of our method by integrating it into the StyleGAN2 framework, and verifying that content generation in the wavelet domain leads to higher quality images with more realistic high-frequency content. Furthermore, we verify that our model’s latent space retains the qualities that allow StyleGAN to serve as a basis for a multitude of editing tasks, and show that our frequency-aware approach also induces improved downstream visual quality.

https://arxiv.org/abs/2102.06108

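Recently, the quality of images generated by GANs has improved markedly. For high-frequency content, however, GAN results still leave room for improvement; the problem is caused by the network's bias toward particular parts of the spectrum and by ill-suited loss functions. To address this, we propose SWAGAN, a general-purpose style- and wavelet-based GAN that performs generation in the frequency domain. SWAGAN feeds wavelets into both its generator and its discriminator, forcing the network to learn a frequency-aware latent representation. This architecture improves the quality of the generated images and saves compute. We show the benefit of combining our method with StyleGAN2, and verify that generating content in the wavelet domain yields higher-quality images that retain realistic high-frequency features. We also verify that our model's latent space keeps the properties that let StyleGAN serve as a basis for subsequent image-editing tasks, which shows that our frequency-aware approach also improves downstream results.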
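
To make "generation in the wavelet domain" concrete, here is a minimal sketch of what a single-level 2D wavelet decomposition does to an image. The abstract does not say which wavelet SWAGAN uses or how the bands are wired into the network, so the Haar wavelet, the band names, and the function names `haar_dwt2` and `haar_idwt2` are all assumptions made for illustration only. The point is simply that an image splits into one low-frequency band plus three high-frequency detail bands at half resolution, and that the split is exactly invertible.

```python
def haar_dwt2(img):
    """One level of an orthonormal 2D Haar transform (illustrative assumption).

    `img` is a list of rows with even height and width.
    Returns four half-resolution bands: ll (low-frequency average) and
    lh, hl, hh (high-frequency detail between rows, columns, diagonals).
    """
    ll, lh, hl, hh = [], [], [], []
    for r in range(0, len(img), 2):
        row_ll, row_lh, row_hl, row_hh = [], [], [], []
        for c in range(0, len(img[0]), 2):
            # The 2x2 block of pixels: a b / p q
            a, b = img[r][c], img[r][c + 1]
            p, q = img[r + 1][c], img[r + 1][c + 1]
            row_ll.append((a + b + p + q) / 2)  # smooth content
            row_lh.append((a + b - p - q) / 2)  # top row vs bottom row
            row_hl.append((a - b + p - q) / 2)  # left column vs right column
            row_hh.append((a - b - p + q) / 2)  # diagonal difference
        ll.append(row_ll)
        lh.append(row_lh)
        hl.append(row_hl)
        hh.append(row_hh)
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Invert haar_dwt2, rebuilding the full-resolution image."""
    img = []
    for r in range(len(ll)):
        top, bottom = [], []
        for c in range(len(ll[0])):
            s, v, h, d = ll[r][c], lh[r][c], hl[r][c], hh[r][c]
            top += [(s + v + h + d) / 2, (s + v - h - d) / 2]
            bottom += [(s - v + h - d) / 2, (s - v - h + d) / 2]
        img.append(top)
        img.append(bottom)
    return img


# Usage: decompose a small test image, then reconstruct it.
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
bands = haar_dwt2(image)
restored = haar_idwt2(*bands)
```

In a perfectly flat image all three detail bands are zero, so any fine texture or sharp edge has to be produced explicitly in those bands. That is one way to read the abstract's claim that working with wavelets makes the representation frequency-aware.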
