Tag Archives: Style Transfer

High-Resolution Photorealistic Image Translation in Real-Time: A Laplacian Pyramid Translation Network

Existing image-to-image translation (I2IT) methods are either constrained to low-resolution images or suffer from long inference times due to the heavy computational burden of convolutions on high-resolution feature maps. In this paper, we focus on speeding up high-resolution photorealistic I2IT tasks based on closed-form Laplacian pyramid decomposition and reconstruction. Specifically, we reveal that attribute transformations, such as illumination and color manipulation, relate more to the low-frequency component, while the content details can be adaptively refined on the high-frequency components. We consequently propose a Laplacian Pyramid Translation Network (LPTN) to simultaneously perform these two tasks, where we design a lightweight network for translating the low-frequency component at reduced resolution and a progressive masking strategy to efficiently refine the high-frequency ones. Our model avoids most of the heavy computation consumed by processing high-resolution feature maps and faithfully preserves the image details. Extensive experimental results on various tasks demonstrate that the proposed method can translate 4K images in real-time on a normal GPU while achieving transformation performance comparable to existing methods.


Existing I2IT methods are limited to low-resolution images or suffer from long inference times. In this paper, we accomplish high-resolution I2IT via closed-form Laplacian pyramid decomposition and reconstruction. We find that illumination and color changes relate mostly to the low-frequency components of an image, while its content relates to the high-frequency ones. We propose a Laplacian Pyramid Translation Network (LPTN): a lightweight network that translates the low-frequency component at reduced resolution and refines the high-frequency components with a progressive masking strategy. Our model avoids most of the heavy computation while preserving as much image detail as possible. In experiments, our model achieves real-time style transfer on 4K images.
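The closed-form decomposition the paper builds on can be sketched in a few lines of numpy. Note this is a generic Laplacian pyramid, not LPTN itself: a simple 2x2 average-pooling downsampler stands in for the usual Gaussian filter, and the translation networks that operate on each band are omitted.

```python
import numpy as np

def downsample(img):
    # Halve resolution with 2x2 average pooling (a stand-in for
    # the Gaussian filtering + subsampling used in practice).
    h, w = img.shape[0] // 2 * 2, img.shape[1] // 2 * 2
    img = img[:h, :w]
    return 0.25 * (img[0::2, 0::2] + img[1::2, 0::2]
                   + img[0::2, 1::2] + img[1::2, 1::2])

def upsample(img, shape):
    # Nearest-neighbour upsampling back to the finer resolution.
    up = np.repeat(np.repeat(img, 2, axis=0), 2, axis=1)
    return up[:shape[0], :shape[1]]

def laplacian_decompose(img, levels=3):
    # Split an image into high-frequency residuals plus one
    # low-frequency base; reconstruction is exact (closed-form).
    pyramid, current = [], img.astype(np.float64)
    for _ in range(levels):
        low = downsample(current)
        high = current - upsample(low, current.shape)  # band residual
        pyramid.append(high)
        current = low
    pyramid.append(current)  # low-frequency component
    return pyramid

def laplacian_reconstruct(pyramid):
    # Invert the decomposition: add each residual back in turn.
    current = pyramid[-1]
    for high in reversed(pyramid[:-1]):
        current = high + upsample(current, high.shape)
    return current
```

LPTN's efficiency comes from this structure: the expensive attribute translation runs only on the small low-frequency component (`pyramid[-1]`), while the high-frequency residuals are refined cheaply.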

K-Hairstyle: A Large-scale Korean hairstyle dataset for virtual hair editing and hairstyle classification

The hair and beauty industry is one of the fastest growing industries. This has led to the development of various applications, such as virtual hair dyeing or hairstyle translation, to satisfy the needs of customers. Although there are several public hair datasets available for these applications, they consist of a limited number of low-resolution images, which restricts their usefulness for high-quality hair editing. Therefore, we introduce a novel large-scale Korean hairstyle dataset, K-hairstyle, containing 256,679 high-resolution images. In addition, K-hairstyle provides various hair attributes annotated by expert Korean hair stylists, as well as hair segmentation masks. We validate the effectiveness of our dataset on several applications, such as hairstyle translation, hair classification, and hair retrieval. Furthermore, we will release K-hairstyle soon.


The hair and beauty industry is one of the fastest-growing industries in recent years, and its growth has driven the development of applications such as virtual hair dyeing and hairstyle transfer. Although several public hairstyle datasets already exist, they all suffer from small size or low resolution, which limits progress in hair editing. We therefore introduce K-hairstyle, a large-scale Korean hairstyle dataset of 256,679 high-resolution images. In addition, the dataset contains a variety of hairstyle attribute labels annotated by Korean hair stylists, as well as segmentation masks. We test and validate our dataset on applications such as hairstyle transfer, hairstyle classification, and hairstyle retrieval.

Improving Object Detection in Art Images Using Only Style Transfer

Despite recent advances in object detection using deep learning neural networks, these neural networks still struggle to identify objects in art images such as paintings and drawings. This challenge is known as the cross-depiction problem, and it stems in part from the tendency of neural networks to prioritize identification of an object's texture over its shape. In this paper we propose and evaluate a process for training neural networks to localize objects, specifically people, in art images. We generate a large dataset for training and validation by modifying the images in the COCO dataset using AdaIN style transfer. This dataset is used to fine-tune a Faster R-CNN object detection network, which is then tested on the existing People-Art testing dataset. The result is a significant improvement on the state of the art and a new way forward for creating datasets to train neural networks to process art images.


Although deep learning has made great strides in object detection, these networks perform poorly on artworks such as paintings. This is largely because neural networks tend to make inferences from an object's texture rather than its shape. In this paper we propose and validate a pipeline for training detectors to localize people in artworks. We use AdaIN style transfer to build a large dataset from COCO, then evaluate on the People-Art test set. The results show that our method effectively improves the detection performance of existing detectors on artworks.
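The AdaIN operation used to stylize the COCO images is a simple statistic-matching step on encoder features. A minimal numpy sketch, assuming content and style feature maps of shape (C, H, W) have already been extracted; the encoder, decoder, and detector fine-tuning are omitted:

```python
import numpy as np

def adain(content_feat, style_feat, eps=1e-5):
    # Adaptive Instance Normalization: align the per-channel mean
    # and standard deviation of the content features to those of
    # the style features. Features have shape (C, H, W).
    c_mean = content_feat.mean(axis=(1, 2), keepdims=True)
    c_std = content_feat.std(axis=(1, 2), keepdims=True)
    s_mean = style_feat.mean(axis=(1, 2), keepdims=True)
    s_std = style_feat.std(axis=(1, 2), keepdims=True)
    normalized = (content_feat - c_mean) / (c_std + eps)
    return normalized * s_std + s_mean
```

Because AdaIN transfers only channel-wise statistics (texture), not spatial structure, stylized COCO images keep their object shapes and original bounding-box labels, which is what makes them usable as detector training data.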

Perceptual Losses for Real-Time Style Transfer and Super-Resolution

We consider image transformation problems, where an input image is transformed into an output image. Recent methods for such problems typically train feed-forward convolutional neural networks using a per-pixel loss between the output and ground-truth images. Parallel work has shown that high-quality images can be generated by defining and optimizing perceptual loss functions based on high-level features extracted from pretrained networks. We combine the benefits of both approaches, and propose the use of perceptual loss functions for training feed-forward networks for image transformation tasks. We show results on image style transfer, where a feed-forward network is trained to solve the optimization problem proposed by Gatys et al. in real time. Compared to the optimization-based method, our network gives similar qualitative results but is three orders of magnitude faster. We also experiment with single-image super-resolution, where replacing a per-pixel loss with a perceptual loss gives visually pleasing results.
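The two perceptual loss terms can be sketched in numpy, assuming the feature maps have already been extracted from a pretrained network such as VGG. The Gram normalization below follows the common style-reconstruction formulation; exact normalization constants vary between implementations.

```python
import numpy as np

def gram_matrix(feat):
    # feat: (C, H, W) feature map. Returns the C x C Gram matrix of
    # channel correlations, normalized by the feature-map size.
    c, h, w = feat.shape
    flat = feat.reshape(c, h * w)
    return flat @ flat.T / (c * h * w)

def feature_reconstruction_loss(feat_out, feat_target):
    # Content loss: mean squared error in feature space, which
    # penalizes differences in high-level content, not raw pixels.
    return np.mean((feat_out - feat_target) ** 2)

def style_reconstruction_loss(feat_out, feat_target):
    # Style loss: squared Frobenius distance between Gram matrices,
    # which matches texture statistics regardless of spatial layout.
    g_out, g_target = gram_matrix(feat_out), gram_matrix(feat_target)
    return np.sum((g_out - g_target) ** 2)
```

At training time these losses are computed on features from several layers of the fixed pretrained network and backpropagated through it into the feed-forward transformation network, which is the only part that is updated.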