On Training Deep Generative Models with Latent Variables

Author: Djordje Miladinovic

Abstract: Humans understand the world through concepts: they form high-level abstractions that represent sensory information in a simple way. Conceptual thinking is one of the central aspects of human intelligence, as it enables knowledge reuse, simplifies the understanding of cause-effect relationships, and empowers creativity. We argue that further progress in the quest for artificial intelligence critically depends on developing machine learning algorithms that can infer concepts from data and fantasize new data based on those concepts. Deep generative models with latent variables (DGLs) provide a unified framework for both (i) representation learning and (ii) data synthesis. Despite remarkable recent progress in this area, many practical challenges prevent DGLs from reaching their full potential. The goal of this thesis is to highlight those challenges and to propose novel algorithmic solutions to them.

The first part of this thesis studies DGLs in the context of sequential data such as text. DGLs for sequences are typically trained via maximum likelihood estimation (MLE), but this tends to yield models with uninformative latent variables. To regularize this degenerate MLE training, we propose an importance-weighted dropout scheme, implemented using an adversarial approach. In contrast to standard dropout, our method achieves a better trade-off between representation learning and sequence modeling.

In the second part, we discuss DGLs for images in the form of variational autoencoders (VAEs). VAEs are generally regarded as inferior to other generative models in terms of both density estimation and image-generation quality. By …
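As background for the second part, the following is a minimal sketch of a VAE trained by maximizing the evidence lower bound (ELBO), written in PyTorch. The architecture, layer sizes, and Bernoulli-style likelihood are illustrative assumptions, not the specific models studied in the thesis.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class VAE(nn.Module):
    """Minimal Gaussian-prior VAE: the encoder infers a latent code (representation
    learning), and the decoder synthesizes data from it (data synthesis)."""
    def __init__(self, x_dim=784, z_dim=32, h_dim=256):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(x_dim, h_dim), nn.ReLU())
        self.mu = nn.Linear(h_dim, z_dim)       # posterior mean
        self.logvar = nn.Linear(h_dim, z_dim)   # posterior log-variance
        self.dec = nn.Sequential(nn.Linear(z_dim, h_dim), nn.ReLU(),
                                 nn.Linear(h_dim, x_dim))

    def forward(self, x):
        h = self.enc(x)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: z = mu + sigma * eps, eps ~ N(0, I)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
        return self.dec(z), mu, logvar

def neg_elbo(x_logits, x, mu, logvar):
    """Negative ELBO = reconstruction loss + KL(q(z|x) || N(0, I))."""
    rec = F.binary_cross_entropy_with_logits(x_logits, x, reduction='sum')
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kl

# Usage on a hypothetical batch of flattened images with values in [0, 1]:
model = VAE()
x = torch.rand(16, 784)
logits, mu, logvar = model(x)
loss = neg_elbo(logits, x, mu, logvar)  # minimize this to maximize the ELBO
```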
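For the first part's setting, a common baseline regularizer for sequence VAEs is word dropout: replacing some decoder-input tokens with an unknown token so the decoder cannot rely purely on autoregression and must use the latent code. The sketch below shows only this standard, uniform variant; the thesis's importance-weighted scheme drops tokens non-uniformly via an adversarial approach and is not reproduced here. `unk_id` and the drop rate `p` are placeholder parameters.

```python
import torch

def word_dropout(tokens: torch.Tensor, unk_id: int, p: float = 0.3) -> torch.Tensor:
    """Uniformly replace a fraction p of decoder-input token ids with <unk>,
    weakening the decoder so the latent variable stays informative."""
    mask = torch.rand(tokens.shape, device=tokens.device) < p
    dropped = tokens.clone()
    dropped[mask] = unk_id
    return dropped

# Example: corrupt a batch of token ids before feeding them to the decoder.
batch = torch.randint(low=2, high=1000, size=(4, 12))  # hypothetical vocabulary ids
noisy = word_dropout(batch, unk_id=1, p=0.3)
```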
