DRIT++: Diverse Image-to-Image Translation via Disentangled Representations



Learning diverse image-to-image translation from unpaired data

Abstract
Image-to-image translation aims to learn the mapping between two visual domains. There are two main challenges for this task: 1) the lack of aligned training pairs and 2) multiple possible outputs from a single input image. In this work, we present an approach based on disentangled representation for generating diverse outputs without paired training images. To synthesize diverse outputs, we propose to embed images onto two spaces: a domain-invariant content space capturing shared information across domains and a domain-specific attribute space. Our model takes the content features extracted from a given input and attribute vectors sampled from the attribute space to synthesize diverse outputs at test time. To handle unpaired training data, we introduce a cross-cycle consistency loss based on disentangled representations. Qualitative results show that our model can generate diverse and realistic images on a wide range of tasks without paired training data. For quantitative evaluation, we measure realism with a user study and the Fréchet inception distance, and measure diversity with the perceptual distance metric, Jensen-Shannon divergence, and the number of statistically-different bins.
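The cross-cycle consistency idea can be illustrated with a minimal PyTorch sketch: encode each image into a content code and an attribute code, swap attributes across domains to translate, then swap back and require the reconstructions to match the original inputs. The toy encoder/generator modules, image size, and feature dimensions below are illustrative assumptions for exposition, not the architecture used in the paper.

import torch
import torch.nn as nn

class Encoder(nn.Module):
    """Toy encoder: maps a 3x64x64 image to a flat feature vector."""
    def __init__(self, dim):
        super().__init__()
        self.net = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, dim))
    def forward(self, x):
        return self.net(x)

class Generator(nn.Module):
    """Toy generator: maps (content, attribute) codes back to an image."""
    def __init__(self, c_dim, a_dim):
        super().__init__()
        self.net = nn.Linear(c_dim + a_dim, 3 * 64 * 64)
    def forward(self, c, a):
        return self.net(torch.cat([c, a], dim=1)).view(-1, 3, 64, 64)

# Shared content encoder E_c and attribute encoder E_a (a simplification;
# the paper uses per-domain encoders with partial weight sharing).
E_c, E_a = Encoder(256), Encoder(8)
G_x, G_y = Generator(256, 8), Generator(256, 8)   # per-domain generators

# Unpaired batches from domains X and Y (random tensors as stand-ins).
x, y = torch.randn(4, 3, 64, 64), torch.randn(4, 3, 64, 64)

# First translation: swap attributes across domains.
c_x, a_x = E_c(x), E_a(x)
c_y, a_y = E_c(y), E_a(y)
u = G_x(c_y, a_x)   # domain-X image with y's content and x's attribute
v = G_y(c_x, a_y)   # domain-Y image with x's content and y's attribute

# Second translation: swap back; after two swaps we should recover x and y.
x_hat = G_x(E_c(v), E_a(u))
y_hat = G_y(E_c(u), E_a(v))
loss_cc = (x - x_hat).abs().mean() + (y - y_hat).abs().mean()

In the full model, this cross-cycle term is trained jointly with the adversarial, self-reconstruction, latent-regression, and KL losses described in the paper.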

Paper

Citation

Hsin-Ying Lee*, Hung-Yu Tseng*, Jia-Bin Huang, Maneesh Kumar Singh, and Ming-Hsuan Yang, "Diverse Image-to-Image Translation via Disentangled Representations", in European Conference on Computer Vision, 2018.


Qi Mao*, Hsin-Ying Lee*, Hung-Yu Tseng*, Siwei Ma, and Ming-Hsuan Yang, "Mode Seeking Generative Adversarial Networks for Diverse Image Synthesis", in IEEE Conference on Computer Vision and Pattern Recognition, 2019.


Hsin-Ying Lee*, Hung-Yu Tseng*, Qi Mao*, Jia-Bin Huang, Yu-Ding Lu, Maneesh Kumar Singh, and Ming-Hsuan Yang, "DRIT++: Diverse Image-to-Image Translation via Disentangled Representations", in arXiv preprint, 2019.

* indicates equal contributions



Bibtex
@inproceedings{DRIT,
            author = {Lee, Hsin-Ying and Tseng, Hung-Yu and Huang, Jia-Bin and Singh, Maneesh Kumar and Yang, Ming-Hsuan},
            booktitle = {European Conference on Computer Vision},
            title = {Diverse Image-to-Image Translation via Disentangled Representations},
            year = {2018}}
@inproceedings{MSGAN,
            author = {Mao, Qi and Lee, Hsin-Ying and Tseng, Hung-Yu and Ma, Siwei and Yang, Ming-Hsuan},
            booktitle = {IEEE Conference on Computer Vision and Pattern Recognition},
            title = {Mode Seeking Generative Adversarial Networks for Diverse Image Synthesis},
            year = {2019}}
@article{DRIT_plus,
            author = {Lee, Hsin-Ying and Tseng, Hung-Yu and Mao, Qi and Huang, Jia-Bin and Lu, Yu-Ding and Singh, Maneesh Kumar and Yang, Ming-Hsuan},
            title = {DRIT++: Diverse Image-to-Image Translation via Disentangled Representations},
            journal = {arXiv preprint arXiv:1905.01270},
            year = {2019}}
Models
Disentangled Representation Image-to-Image Translation (DRIT)

Multi-Domain DRIT
Results
Cat → Dog
Dog → Cat
Summer → Winter
Winter → Summer
