DRIT++: Diverse Image-to-Image Translation via Disentangled Representations

Hsin-Ying Lee; Hung-Yu Tseng; Jia-Bin Huang; Maneesh Singh; Ming-Hsuan Yang; Qi Mao; Yu-Ding Lu

arxiv: 1905.01270 · v2 · pith:ZKLJZCBTnew · submitted 2019-05-02 · 💻 cs.CV



keywords diverse · outputs · training · attribute · disentangled · images · space · content


Image-to-image translation aims to learn the mapping between two visual domains. There are two main challenges for this task: 1) lack of aligned training pairs and 2) multiple possible outputs from a single input image. In this work, we present an approach based on disentangled representation for generating diverse outputs without paired training images. To synthesize diverse outputs, we propose to embed images onto two spaces: a domain-invariant content space capturing shared information across domains and a domain-specific attribute space. Our model takes the encoded content features extracted from a given input and attribute vectors sampled from the attribute space to synthesize diverse outputs at test time. To handle unpaired training data, we introduce a cross-cycle consistency loss based on disentangled representations. Qualitative results show that our model can generate diverse and realistic images on a wide range of tasks without paired training data. For quantitative evaluations, we measure realism with a user study and Fréchet inception distance, and measure diversity with the perceptual distance metric, Jensen-Shannon divergence, and the number of statistically-different bins.
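The abstract's two ideas, swapping disentangled codes across domains and sampling attribute vectors at test time, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: "images" are plain vectors, the encoders and generator are trivial slicing and concatenation stand-ins rather than the paper's networks, and the names `encode_content`, `encode_attribute`, `generate`, `CONTENT_DIM`, and `ATTR_DIM` are hypothetical. The loss below is one plausible reading of a cross-cycle consistency loss (swap attributes, swap back, compare with an L1 distance), not a reproduction of the authors' implementation.

```python
import numpy as np

# Hypothetical sizes for the toy content and attribute codes.
CONTENT_DIM = 4
ATTR_DIM = 2


def encode_content(image):
    """Toy stand-in for a content encoder: keep the first CONTENT_DIM entries."""
    return image[:CONTENT_DIM]


def encode_attribute(image):
    """Toy stand-in for an attribute encoder: keep the remaining entries."""
    return image[CONTENT_DIM:]


def generate(content, attribute):
    """Toy stand-in for a generator: concatenate content and attribute codes."""
    return np.concatenate([content, attribute])


def cross_cycle_loss(x, y):
    """L1 reconstruction error after two rounds of attribute swapping."""
    c_x, a_x = encode_content(x), encode_attribute(x)
    c_y, a_y = encode_content(y), encode_attribute(y)

    # First translation: exchange content codes between the two inputs.
    u = generate(c_y, a_x)  # content of y rendered with the attribute of x
    v = generate(c_x, a_y)  # content of x rendered with the attribute of y

    # Second translation: encode the translated results and swap back.
    x_hat = generate(encode_content(v), encode_attribute(u))
    y_hat = generate(encode_content(u), encode_attribute(v))

    return float(np.abs(x - x_hat).sum() + np.abs(y - y_hat).sum())


def sample_translations(x, num_samples, rng):
    """Test-time diversity: fixed content code, randomly sampled attributes."""
    c_x = encode_content(x)
    attributes = rng.standard_normal((num_samples, ATTR_DIM))
    return np.stack([generate(c_x, a) for a in attributes])


rng = np.random.default_rng(0)
x = rng.standard_normal(CONTENT_DIM + ATTR_DIM)
y = rng.standard_normal(CONTENT_DIM + ATTR_DIM)

loss = cross_cycle_loss(x, y)
outputs = sample_translations(x, 3, rng)
```

Because the toy encoders and generator invert each other exactly, the loss is zero by construction; with learned networks it would be a training signal rather than an identity. The sampled outputs share the content part of `x` and differ only in the attribute part, which mirrors how a single input can map to several outputs.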


Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Latent space projections and atlases: A cautionary tale in deep neuroimaging using autoencoders
stat.AP 2025-09 unverdicted novelty 5.0

A simple convolutional autoencoder on ADNI brain scans learns latent spaces linked to Alzheimer's progression; the new LRCP framework plus SHAP analysis identifies which atlas regions carry the clinically relevant inf...