PAC-MAN Re-created with AI by NVIDIA

Read More

#gans, #nvidia

Non-Adversarial Video Synthesis with Learned Priors

Most existing work in video synthesis focuses on generating videos through adversarial learning. Despite their success, these methods often require an input reference frame or fail to generate diverse videos from the given data distribution, with little to no uniformity in the quality of the videos they produce. Unlike these methods, we focus on generating videos from latent noise vectors, without any reference input frames. To this end, we develop a novel approach that jointly optimizes the input latent space, the weights of a recurrent neural network, and a generator through non-adversarial learning. Optimizing the input latent space along with the network weights allows us to generate videos in a controlled environment, i.e., we can faithfully generate all videos the model has seen during learning as well as new, unseen videos. Extensive experiments on three challenging and diverse datasets demonstrate that our approach generates videos of superior quality compared to existing state-of-the-art methods. Read More
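To make the core idea concrete, below is a minimal non-adversarial PyTorch sketch of jointly optimizing per-video latent codes together with the weights of a recurrent generator; every module name, shape, and hyperparameter here is an illustrative assumption, not the authors' implementation.

```python
import torch
import torch.nn as nn

# Sketch: one learnable latent code per training video, optimized jointly with
# the weights of a recurrent generator under a plain reconstruction loss
# (no discriminator). All shapes and modules are illustrative assumptions.
class RecurrentGenerator(nn.Module):
    def __init__(self, z_dim=64, hidden=128, frame_dim=3 * 32 * 32):
        super().__init__()
        self.rnn = nn.GRU(z_dim, hidden, batch_first=True)  # unrolls the latent over time
        self.decode = nn.Linear(hidden, frame_dim)          # hidden state -> flattened frame

    def forward(self, z, num_frames=16):
        z_seq = z.unsqueeze(1).expand(-1, num_frames, -1)   # repeat the code across time
        h, _ = self.rnn(z_seq)
        return torch.sigmoid(self.decode(h))                # (batch, frames, frame_dim)

videos = torch.rand(8, 16, 3 * 32 * 32)        # toy "dataset" of 8 short clips in [0, 1]
latents = nn.Parameter(torch.randn(8, 64))     # one learnable code per video
gen = RecurrentGenerator()

# Key point: the latent codes enter the optimizer alongside the network weights.
opt = torch.optim.Adam(list(gen.parameters()) + [latents], lr=1e-3)
for step in range(200):
    opt.zero_grad()
    loss = nn.functional.mse_loss(gen(latents), videos)     # non-adversarial objective
    loss.backward()
    opt.step()
```

After training, each stored code reproduces its own video, and sampling near the learned codes yields new, unseen ones, which is the controlled generation the abstract describes.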

Code

#gans, #recurrent-neural-networks

Reliable Fidelity and Diversity Metrics for Generative Models

Devising indicative evaluation metrics for the image generation task remains an open problem. The most widely used metric for measuring the similarity between real and generated images has been the Fréchet Inception Distance (FID) score. Because it does not differentiate the fidelity and diversity aspects of the generated images, recent papers have introduced variants of precision and recall metrics to diagnose those properties separately. In this paper, we show that even the latest versions of the precision and recall metrics are still not reliable: they fail to detect the match between two identical distributions, they are not robust against outliers, and their evaluation hyperparameters are selected arbitrarily. We propose density and coverage metrics that solve these issues. We show analytically and experimentally that density and coverage provide more interpretable and reliable signals for practitioners than the existing metrics. Read More
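The proposed metrics have compact definitions, which the NumPy sketch below follows: density counts how many real k-NN balls each fake sample falls into, and coverage measures the fraction of real k-NN balls hit by at least one fake sample. The `real`/`fake` arrays stand in for embedded features, and `k=5` is an arbitrary choice for illustration.

```python
import numpy as np
from scipy.spatial.distance import cdist

def density_coverage(real, fake, k=5):
    """Density and coverage for generative model evaluation.

    real, fake: (n_samples, n_features) arrays of embedded features.
    """
    # Radius of each real sample's k-NN ball, measured among the real samples
    # themselves (column 0 of the sorted distances is the point's own zero).
    real_dists = np.sort(cdist(real, real), axis=1)
    radii = real_dists[:, k]                      # distance to k-th nearest real neighbor

    d = cdist(real, fake)                         # (n_real, n_fake) cross distances
    inside = d <= radii[:, None]                  # is fake j inside real i's ball?

    density = inside.sum() / (k * fake.shape[0])  # can exceed 1 around dense real modes
    coverage = inside.any(axis=1).mean()          # fraction of real balls that are hit
    return density, coverage

real = np.random.randn(1000, 64)
fake = np.random.randn(1000, 64)
print(density_coverage(real, fake))               # roughly (1, high) for matching distributions
```

Note how coverage, unlike recall-style metrics built around fake-sample neighborhoods, anchors the manifold estimate on real samples only, which is what makes it robust to fake outliers.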

#gans

A Beginner’s Guide to Generative Adversarial Networks (GANs)

Generative adversarial networks (GANs) are deep neural network architectures composed of two nets pitted against each other (hence the “adversarial”).

GANs were introduced in a paper by Ian Goodfellow and other researchers at the University of Montreal, including Yoshua Bengio, in 2014. Referring to GANs, Facebook’s AI research director Yann LeCun called adversarial training “the most interesting idea in the last 10 years in ML.”

GANs’ potential is huge, because they can learn to mimic any distribution of data. That is, GANs can be taught to create worlds eerily similar to our own in any domain: images, music, speech, prose. They are robot artists in a sense, and their output is impressive – poignant even. Read More
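To see what pitting one network against the other means in code, here is a toy PyTorch training loop on 1-D data; the tiny architectures, data distribution, and hyperparameters are illustrative only.

```python
import torch
import torch.nn as nn

# Toy GAN on 1-D data: the generator maps noise to samples, the discriminator
# scores samples as real or fake, and each net trains against the other.
G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(1000):
    real = torch.randn(64, 1) * 2 + 3            # "real" data: samples from N(3, 2)
    fake = G(torch.randn(64, 8))                 # generated samples from noise

    # Discriminator step: label real as 1, generated as 0.
    opt_d.zero_grad()
    loss_d = bce(D(real), torch.ones(64, 1)) + bce(D(fake.detach()), torch.zeros(64, 1))
    loss_d.backward()
    opt_d.step()

    # Generator step: try to fool the discriminator into outputting 1 on fakes.
    opt_g.zero_grad()
    loss_g = bce(D(fake), torch.ones(64, 1))
    loss_g.backward()
    opt_g.step()
```

The same two-player loop scales from this 1-D toy to images, music, and text: only the networks and the data change.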

#gans

Going Beyond GAN? New DeepMind VAE Model Generates High Fidelity Human Faces

Generative adversarial networks (GANs) have become AI researchers’ “go-to” technique for generating photo-realistic synthetic images. Now, DeepMind researchers say that there may be a better option.

In a new paper, the Google-owned research company introduces its VQ-VAE 2 model for large-scale image generation. The model is said to yield results competitive with the state-of-the-art generative model BigGAN in synthesizing high-resolution images, while delivering broader diversity and overcoming some of GANs’ inherent shortcomings. Read More
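While the article stays high level, the building block behind VQ-VAE models is vector quantization with a straight-through gradient. The sketch below shows only that step (the codebook and commitment losses are omitted), with all names and sizes as assumptions rather than DeepMind's code.

```python
import torch
import torch.nn as nn

# Minimal sketch of the vector-quantization step at the core of VQ-VAE models:
# each encoder output is snapped to its nearest codebook entry, and the
# straight-through trick lets gradients flow back to the encoder.
class VectorQuantizer(nn.Module):
    def __init__(self, num_codes=512, dim=64):
        super().__init__()
        self.codebook = nn.Embedding(num_codes, dim)

    def forward(self, z_e):                        # z_e: (batch, dim) encoder outputs
        # Distance to every codebook vector; pick the nearest code for each input.
        d = torch.cdist(z_e, self.codebook.weight)
        idx = d.argmin(dim=1)
        z_q = self.codebook(idx)
        # Straight-through estimator: the forward pass uses z_q, but the backward
        # pass copies gradients from z_q to z_e as if quantization were identity.
        z_q = z_e + (z_q - z_e).detach()
        return z_q, idx

vq = VectorQuantizer()
z_q, idx = vq(torch.randn(16, 64))                 # quantized vectors and code indices
```

The discrete code indices are what VQ-VAE 2's hierarchical priors are trained over, which is how it sidesteps the training instabilities of adversarial objectives.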

#deep-learning, #gans, #image-recognition

5 New Generative Adversarial Network (GAN) Architectures For Image Synthesis

AI image synthesis has made impressive progress since Generative Adversarial Networks (GANs) were introduced in 2014. GANs were originally only capable of generating small, blurry, black-and-white pictures, but now we can generate high-resolution, realistic and colorful pictures that you can hardly distinguish from real photographs.

Here we summarize five recently introduced GAN architectures used for image synthesis. Read More

#gans

Efficient Video Generation on Complex Datasets

Generative models of natural images have progressed towards high fidelity samples by the strong leveraging of scale. We attempt to carry this success to the field of video modeling by showing that large Generative Adversarial Networks trained on the complex Kinetics-600 dataset are able to produce video samples of substantially higher complexity than previous work. Our proposed model, Dual Video Discriminator GAN (DVD-GAN), scales to longer and higher-resolution videos by leveraging a computationally efficient decomposition of its discriminator. We evaluate on the related tasks of video synthesis and video prediction, and achieve a new state-of-the-art Fréchet Inception Distance on prediction for Kinetics-600, as well as a state-of-the-art Inception Score for synthesis on the UCF-101 dataset, alongside establishing a strong baseline for synthesis on Kinetics-600. Read More
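The "computationally efficient decomposition" is the paper's central trick: a spatial critic scores a few full-resolution frames while a temporal critic scores the whole clip at reduced spatial resolution, so no network ever processes the full video at full resolution. The rough sketch below illustrates that split; every module, shape, and sampling choice is a toy stand-in, not the DeepMind architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Rough sketch of a dual video discriminator: a 2-D spatial critic scores a few
# individual frames at full resolution, while a 3-D temporal critic scores the
# entire clip after spatial downsampling. Modules and shapes are illustrative.
class DualVideoDiscriminator(nn.Module):
    def __init__(self, channels=3, k_frames=8, downsample=2):
        super().__init__()
        self.k_frames, self.downsample = k_frames, downsample
        self.spatial = nn.Sequential(                       # per-frame 2-D critic
            nn.Conv2d(channels, 32, 4, 2, 1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 1))
        self.temporal = nn.Sequential(                      # whole-clip 3-D critic
            nn.Conv3d(channels, 32, 4, 2, 1), nn.ReLU(),
            nn.AdaptiveAvgPool3d(1), nn.Flatten(), nn.Linear(32, 1))

    def forward(self, video):                               # video: (B, C, T, H, W)
        b, c, t, h, w = video.shape
        # Spatial critic never sees the full clip: sample k frames at random.
        frames = video[:, :, torch.randperm(t)[: self.k_frames]]
        frames = frames.permute(0, 2, 1, 3, 4).reshape(-1, c, h, w)
        s_score = self.spatial(frames).view(b, -1).mean(dim=1)
        # Temporal critic sees every frame, but at reduced spatial resolution.
        small = F.interpolate(video, scale_factor=(1, 1 / self.downsample, 1 / self.downsample))
        t_score = self.temporal(small).squeeze(1)
        return s_score + t_score

d = DualVideoDiscriminator()
print(d(torch.randn(2, 3, 16, 64, 64)).shape)               # torch.Size([2])
```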

#gans, #image-recognition

AI Can Now Create Artificial People – What Does That Mean For Humans?

When DataGrid, Inc. announced it successfully developed an AI system capable of generating high-quality photorealistic Japanese faces, it was impressive. But now the company has gone even further. Its artificial intelligence (AI) system can now create not only faces and hair from a variety of ethnicities, but bodies that can move and wear any outfit. While these images are fictitious, they are incredibly photorealistic. Read More

#fake, #gans

IBM’s AI automatically generates creative captions for images

Writing photo captions is a monotonous — but necessary — chore begrudgingly undertaken by editors everywhere. Fortunately for them, though, AI might soon be able to handle the bulk of the work. In a paper (“Adversarial Semantic Alignment for Improved Image Captions”) appearing at the 2019 Conference on Computer Vision and Pattern Recognition (CVPR) in Long Beach, California this week, a team of scientists at IBM Research describes a model capable of autonomously crafting diverse, creative, and convincingly humanlike captions. Read More

#gans, #image-recognition, #nlp

Adversarial Semantic Alignment for Improved Image Captions

In this paper we study image captioning as conditional GAN training, proposing both a context-aware LSTM captioner and a co-attentive discriminator that enforces semantic alignment between images and captions. We empirically examine the viability of two training methods, Self-critical Sequence Training (SCST) and Gumbel Straight-Through (ST), and demonstrate that SCST shows more stable gradient behavior and improved results over Gumbel ST, even without accessing discriminator gradients directly. We also address the problem of automatic evaluation for captioning models: we introduce a new semantic score and show its correlation with human judgment. As an evaluation paradigm, we argue that an important criterion for a captioner is the ability to generalize to compositions of objects that do not usually co-occur together. To this end, we introduce a small captioned Out of Context (OOC) test set. The OOC set, combined with our semantic score, constitutes our proposed new diagnostic toolkit for the captioning community. When evaluated on the OOC and MS-COCO benchmarks, SCST-based training shows strong performance in both semantic score and human evaluation, promising to be a valuable new approach for efficient discrete GAN training. Read More
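The SCST objective the authors favor is simple to state: the reward of a sampled caption is baselined by the reward of the model's own greedy decode, so no learned critic or direct discriminator gradient is needed. The schematic below sketches that loss; `model` and `reward` are hypothetical placeholders, not the paper's code.

```python
import torch

# Schematic sketch of Self-critical Sequence Training (SCST): the reward of a
# sampled caption is baselined by the reward of the model's own greedy decode.
# `model` and `reward` are placeholders, not the paper's implementation.
def scst_loss(model, images, reward):
    sampled, log_probs = model.sample(images)          # stochastic decode + per-token log p
    with torch.no_grad():
        greedy, _ = model.sample(images, greedy=True)  # baseline: the argmax decode
        advantage = reward(sampled) - reward(greedy)   # e.g. CIDEr or a semantic score
    # REINFORCE with the greedy reward as baseline: a positive advantage raises
    # the probability of the sampled caption, a negative one lowers it.
    return -(advantage.unsqueeze(1) * log_probs).sum(dim=1).mean()
```

Because the baseline is the model itself, updates push sampled captions to beat the model's current best guess, which is what gives SCST its stable gradients without differentiating through discrete text.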

#gans, #image-recognition, #nlp