ArrowGAN: Learning to Generate Videos by Learning Arrow of Time

Training GANs on videos is more challenging than on images because videos have a distinguished extra dimension: time. While recent methods have designed dedicated architectures that account for time, the generated videos are still far from indistinguishable from real ones. In this paper, we introduce the ArrowGAN framework, in which the discriminator learns to classify the arrow of time as an auxiliary task and the generator learns to synthesize forward-running videos. We argue that the auxiliary task should be chosen carefully with the target domain in mind. Building on the ArrowGAN framework, we further explore categorical ArrowGAN using recent techniques from conditional image generation, achieving state-of-the-art performance on categorical video generation. Our extensive experiments validate the effectiveness of the arrow of time as a self-supervisory task, and demonstrate that every component of categorical ArrowGAN improves the video inception score and the Fréchet video distance on three datasets: Weizmann, UCFsports, and UCF-101.
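The core idea above — training the discriminator on an auxiliary arrow-of-time classification task over real clips and their time-reversed copies — can be sketched in a few lines. The sketch below is a minimal illustration, not the paper's actual implementation: the helper names, the hinge form of the adversarial loss, and the `lam` weighting of the auxiliary term are all assumptions for exposition.

```python
import numpy as np

rng = np.random.default_rng(0)

def make_arrow_batch(clips):
    """Randomly time-reverse half of a batch of clips.

    clips: array of shape (N, T, H, W). Returns (batch, labels) where
    label 1 = forward-running, 0 = time-reversed. (Hypothetical helper,
    not from the paper.)
    """
    n = clips.shape[0]
    labels = rng.integers(0, 2, size=n)            # 1 = forward, 0 = reversed
    batch = clips.copy()
    batch[labels == 0] = batch[labels == 0, ::-1]  # reverse along the time axis
    return batch, labels

def binary_xent(logits, labels):
    """Numerically stable sigmoid cross-entropy, averaged over the batch."""
    return np.mean(np.maximum(logits, 0) - logits * labels
                   + np.log1p(np.exp(-np.abs(logits))))

def d_loss(adv_real, adv_fake, aot_logits, aot_labels, lam=1.0):
    """Discriminator loss: hinge adversarial term plus a lam-weighted
    arrow-of-time classification term (illustrative assumption; the
    paper's exact objective may differ)."""
    hinge = (np.mean(np.maximum(0.0, 1.0 - adv_real))
             + np.mean(np.maximum(0.0, 1.0 + adv_fake)))
    return hinge + lam * binary_xent(aot_logits, aot_labels)
```

The generator side is symmetric: since only forward-running videos are labeled as the "real" arrow-of-time class, the generator is pushed to synthesize clips whose dynamics run forward in time.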


