Semantic Image Synthesis via Efficient Class-Adaptive Normalization

Spatially-adaptive normalization (SPADE) has recently been remarkably successful in conditional semantic image synthesis. It modulates the normalized activations with spatially-varying transformations learned from semantic layouts, which prevents the semantic information from being washed away. Despite its impressive performance, a more thorough understanding of where its advantages come from is still needed to help reduce the significant computation and parameter overhead introduced by this structure. In this paper, from a return-on-investment point of view, we conduct an in-depth analysis of the effectiveness of spatially-adaptive normalization and observe that its modulation parameters benefit more from semantic-awareness than from spatial-adaptiveness, especially for high-resolution input masks. Inspired by this observation, we propose class-adaptive normalization (CLADE), a lightweight but equally effective variant that is adaptive only to semantic class. To further improve spatial-adaptiveness, we introduce an intra-class positional map encoding, calculated from semantic layouts, to modulate the normalization parameters of CLADE, and propose a truly spatially-adaptive variant of CLADE, namely CLADE-ICPE. Through extensive experiments on multiple challenging datasets, we demonstrate that the proposed CLADE generalizes to different SPADE-based methods and achieves generation quality comparable to SPADE, while being much more efficient, with fewer extra parameters and lower computational cost.

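SPADE has achieved remarkable results in conditional semantic image synthesis. It modulates the normalized activations with spatially varying transformations learned from segmentation labels, which prevents semantic information from being lost during generation. Beyond its strong performance, a deeper investigation of the model helps improve its computational efficiency. In this paper, we study the efficiency of spatially-adaptive normalization from a return-on-investment perspective and find that the model gains more from semantic awareness than from spatial adaptiveness, and that this gap is more pronounced for high-resolution inputs. Based on this finding, we propose CLADE, a lightweight but equally effective model that depends only on the semantic labels. To further improve spatial adaptiveness, we propose an intra-class positional map, computed from the semantic segmentation map, to modulate the normalization parameters of CLADE. Experiments on different datasets show that CLADE matches the performance of SPADE while using fewer parameters and less computation.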
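
To make the idea of normalization that is "adaptive only to semantic class" more concrete, here is a minimal pure-Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the function name `class_adaptive_norm`, the `[num_classes][C]` layout of the `gamma` and `beta` tables, and the choice of per-channel statistics over a single image are all hypothetical. Each pixel simply looks up the scale and shift of its class in the semantic mask, rather than using values predicted per pixel by a network.

```python
from math import sqrt


def class_adaptive_norm(x, mask, gamma, beta, eps=1e-5):
    """Normalize each channel, then scale/shift every pixel by its class.

    x     : activations as nested lists, shape [C][H][W]
    mask  : integer class id per pixel, shape [H][W]
    gamma : per-class scale, shape [num_classes][C] (assumed layout)
    beta  : per-class shift, shape [num_classes][C] (assumed layout)
    """
    out = []
    for c, channel in enumerate(x):
        # Parameter-free normalization: zero mean, unit variance per channel.
        # (Assumption: statistics over one image; the source does not say
        # which statistics are used.)
        values = [v for row in channel for v in row]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        inv_std = 1.0 / sqrt(var + eps)

        # Class-adaptive modulation: look up gamma/beta by the pixel's class.
        out.append([
            [gamma[mask[i][j]][c] * (v - mean) * inv_std + beta[mask[i][j]][c]
             for j, v in enumerate(row)]
            for i, row in enumerate(channel)
        ])
    return out


# Toy example: one channel, a 2x2 feature map, two semantic classes.
x = [[[1.0, 2.0], [3.0, 4.0]]]
mask = [[0, 0], [1, 1]]          # top row is class 0, bottom row is class 1
gamma = [[1.0], [0.0]]           # class 1 scale is 0, so its output is beta
beta = [[0.0], [10.0]]
y = class_adaptive_norm(x, mask, gamma, beta)
```

In this sketch the modulation parameters amount to two small tables indexed by class, so all pixels of the same class share one scale and one shift per channel. That sharing is the property the abstract refers to as being adaptive to semantic class rather than to spatial position.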

