Tag Archives: Pose Synthesis

HumanGAN: A Generative Model of Human Images

Generative adversarial networks achieve great performance in photorealistic image synthesis in various domains, including human images. However, they usually employ latent vectors that encode the sampled outputs globally. This does not allow convenient control of semantically-relevant individual parts of the image, and is not able to draw samples that only differ in partial aspects, such as clothing style. We address these limitations and present a generative model for images of dressed humans offering control over pose, local body part appearance and garment style. This is the first method to solve various aspects of human image generation such as global appearance sampling, pose transfer, parts and garment transfer, and parts sampling jointly in a unified framework. As our model encodes part-based latent appearance vectors in a normalized pose-independent space and warps them to different poses, it preserves body and clothing appearance under varying posture. Experiments show that our flexible and general generative method outperforms task-specific baselines for pose-conditioned image generation, pose transfer and part sampling in terms of realism and output resolution.

https://arxiv.org/abs/2103.06902

Generative adversarial networks achieve strong results in photorealistic image synthesis across many domains, including human images. However, they usually rely on latent vectors that encode the sampled output globally, which makes it inconvenient to edit semantically meaningful individual parts of the image and impossible to draw samples that differ only in partial aspects, such as clothing style. The authors address these limitations with a generative model for images of dressed humans that offers control over pose, local body-part appearance, and garment style. It is the first method to handle several aspects of human image generation jointly in a unified framework: global appearance sampling, pose transfer, part and garment transfer, and part sampling. Because the model encodes part-based latent appearance vectors in a normalized, pose-independent space and then warps them to different poses, body and clothing appearance are preserved under changing posture. Experiments show that the model outperforms task-specific baselines on pose-conditioned image generation, pose transfer, and part sampling.
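The part-based, pose-independent latent idea can be illustrated with a toy sketch (a hypothetical simplification, not the paper's actual architecture; part names, grid size, and latent dimension are all made up here): each body part gets its own appearance latent, and changing the pose only changes where each latent lands on a feature grid, so appearance survives reposing and a single part (e.g. torso clothing) can be resampled without touching the rest.

```python
import numpy as np

# Each body part owns an appearance latent sampled in a
# pose-independent space; "warping" just places that latent at a
# pose-dependent location on a coarse feature grid.
PARTS = ["head", "torso", "arms", "legs"]
LATENT_DIM = 8

def sample_part_latents(rng):
    """Sample one appearance latent per body part."""
    return {p: rng.standard_normal(LATENT_DIM) for p in PARTS}

def warp_to_pose(latents, pose):
    """Scatter each part latent onto a 4x4 feature grid at the grid
    cell the pose assigns to that part (pose: part -> (row, col))."""
    grid = np.zeros((4, 4, LATENT_DIM))
    for part, z in latents.items():
        r, c = pose[part]
        grid[r, c] = z
    return grid

rng = np.random.default_rng(0)
z = sample_part_latents(rng)
pose_a = {"head": (0, 1), "torso": (1, 1), "arms": (1, 2), "legs": (2, 1)}
pose_b = {"head": (0, 2), "torso": (1, 2), "arms": (1, 3), "legs": (2, 2)}

feat_a = warp_to_pose(z, pose_a)
feat_b = warp_to_pose(z, pose_b)

# Same latent, different location: appearance is preserved across poses.
assert np.allclose(feat_a[0, 1], feat_b[0, 2])

# Resampling only the torso latent (e.g. new clothing style) leaves
# every other part untouched.
z2 = dict(z, torso=rng.standard_normal(LATENT_DIM))
feat_c = warp_to_pose(z2, pose_a)
assert np.allclose(feat_c[0, 1], feat_a[0, 1])
assert not np.allclose(feat_c[1, 1], feat_a[1, 1])
```

In the real model the "grid" is a learned feature map and the warp is a dense spatial transformation, but the separation of *what a part looks like* from *where the pose puts it* is the same.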

Liquid Warping GAN with Attention: A Unified Framework for Human Image Synthesis

We tackle human image synthesis, including human motion imitation, appearance transfer, and novel view synthesis, within a unified framework. It means that the model, once being trained, can be used to handle all these tasks. The existing task-specific methods mainly use 2D keypoints to estimate the human body structure. However, they only express the position information with no abilities to characterize the personalized shape of the person and model the limb rotations. In this paper, we propose to use a 3D body mesh recovery module to disentangle the pose and shape. It can not only model the joint location and rotation but also characterize the personalized body shape. To preserve the source information, such as texture, style, color, and face identity, we propose an Attentional Liquid Warping GAN with Attentional Liquid Warping Block (AttLWB) that propagates the source information in both image and feature spaces to the synthesized reference. Specifically, the source features are extracted by a denoising convolutional auto-encoder for characterizing the source identity well. Furthermore, our proposed method can support a more flexible warping from multiple sources. To further improve the generalization ability of the unseen source images, a one/few-shot adversarial learning is applied. In detail, it firstly trains a model in an extensive training set. Then, it finetunes the model by one/few-shot unseen image(s) in a self-supervised way to generate high-resolution (512 x 512 and 1024 x 1024) results. Also, we build a new dataset, namely iPER dataset, for the evaluation of human motion imitation, appearance transfer, and novel view synthesis. Extensive experiments demonstrate the effectiveness of our methods in terms of preserving face identity, shape consistency, and clothes details.

https://arxiv.org/abs/2011.09055

This paper addresses human image synthesis in a unified framework, covering motion imitation, appearance transfer, and novel view synthesis: once trained, a single model handles all of these tasks. Existing task-specific methods mainly use 2D keypoints to estimate body structure, but keypoints express only position and can model neither a person's individual body shape nor limb rotations. The authors instead use a 3D body mesh recovery module to disentangle pose and shape, which models both joint locations and rotations while characterizing the personalized body shape. To preserve source information such as texture, style, color, and face identity, they further propose an Attentional Liquid Warping GAN with an Attentional Liquid Warping Block (AttLWB) that propagates source information to the synthesized result in both the image and feature spaces.
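The pose/shape disentanglement argument can be shown with a minimal numeric sketch (illustrative only; real mesh recovery uses SMPL-style models with thousands of vertices and full 3D joint rotations, whereas this toy uses three 2D "vertices" and one joint): shape coefficients deform a template mesh, and pose is a separate set of joint rotations applied afterwards, so changing one leaves the other intact — something a flat list of 2D keypoints cannot express.

```python
import numpy as np

# Toy parametric "body": a 3-vertex chain in 2D. Shape is a single
# coefficient beta that widens the template; pose is a rotation theta
# of the top vertex ("limb") about the middle joint.
TEMPLATE = np.array([[0.0, 0.0], [0.0, 1.0], [0.0, 2.0]])
SHAPE_DIRS = np.array([[0.1, 0.0], [0.2, 0.0], [0.3, 0.0]])

def body_model(beta, theta):
    """Apply shape offsets first, then rotate the top vertex about the
    middle joint by angle theta (pose)."""
    verts = TEMPLATE + beta * SHAPE_DIRS
    c, s = np.cos(theta), np.sin(theta)
    R = np.array([[c, -s], [s, c]])
    pivot = verts[1]
    verts[2] = pivot + R @ (verts[2] - pivot)
    return verts

# Same shape, two poses: only the rotated limb moves.
beta = 0.5
v_rest = body_model(beta, 0.0)
v_bent = body_model(beta, np.pi / 2)
assert np.allclose(v_rest[:2], v_bent[:2])   # shape part unchanged
assert not np.allclose(v_rest[2], v_bent[2])  # pose part changed

# Same pose, two shapes: every vertex shifts, rotation untouched.
v_wide = body_model(2.0, 0.0)
assert not np.allclose(v_rest, v_wide)
```

Because pose lives in the rotation parameters and shape in the blend coefficients, a driving video can supply theta while the source person keeps their own beta — which is exactly what 2D keypoints, carrying only positions, cannot guarantee.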