The Modern Mathematics of Deep Learning

We describe the new field of mathematical analysis of deep learning. This field emerged around a list
of research questions that were not answered within the classical framework of learning theory. These
questions concern: the outstanding generalization power of overparametrized neural networks, the role of
depth in deep architectures, the apparent absence of the curse of dimensionality, the surprisingly successful
optimization performance despite the non-convexity of the problem, understanding what features are
learned, why deep architectures perform exceptionally well in physical problems, and how fine aspects of an architecture affect the behavior of a learning task. We present an overview of modern
approaches that yield partial answers to these questions. For selected approaches, we describe the main
ideas in more detail. Read More

#deep-learning

GPT-3 Scared You? Meet Wu Dao 2.0: A Monster of 1.75 Trillion Parameters

We’re living in exciting times for AI. OpenAI shocked the world a year ago with GPT-3. Two weeks ago, Google presented LaMDA and MUM, two AIs that will revolutionize chatbots and search engines, respectively. And just a few days ago, on the 1st of June, the Beijing Academy of Artificial Intelligence (BAAI) conference presented Wu Dao 2.0.

Wu Dao 2.0 is now the largest neural network ever created and probably the most powerful. Its potential and limits have yet to be fully disclosed, but expectations are high, and rightly so.

In this article, I’ll review the available information about Wu Dao 2.0: what it is, what it can do, and what its creators promise for the future. Enjoy! Read More

#nlp, #china-ai

TextStyleBrush: Transfer of text aesthetics from a single example

We present a novel approach for disentangling the content of a text image from all aspects of its appearance. The derived appearance representation can then be applied to new content, enabling one-shot transfer of the source style. We learn this disentanglement in a self-supervised manner. Our method processes entire word boxes without requiring segmentation of text from background, per-character processing, or assumptions on string length. We show results in text domains that were previously handled by specialized methods, e.g., scene text and handwritten text. To these ends, we make a number of technical contributions: (1) we disentangle the style and content of a textual image into a non-parametric, fixed-dimensional vector; (2) we propose a novel approach inspired by StyleGAN, conditioned on the example style at different resolutions and on the content; (3) we present novel self-supervised training criteria that preserve both source style and target content using a pre-trained font classifier and text recognizer; and (4) we introduce Imgur5K, a new challenging dataset of handwritten word images. We offer numerous qualitative, photo-realistic results and show that our method surpasses prior work in quantitative tests on scene-text and handwriting datasets, as well as in a user study. Read More

#image-recognition, #gans
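
The core idea of the abstract — encode "style" from a single example as a fixed-dimensional vector, then re-inject it into new content — can be illustrated with a deliberately tiny stand-in. The sketch below is not TextStyleBrush; it uses simple per-image intensity statistics as a two-dimensional "style vector" and an AdaIN-like renormalization as the "transfer", with all function names (`style_vector`, `apply_style`) being hypothetical helpers invented for this illustration:

```python
import numpy as np

def style_vector(img: np.ndarray) -> np.ndarray:
    # Toy "style" code: the image's mean and std of pixel intensities.
    # A crude stand-in for the paper's fixed-dimensional style vector.
    return np.array([img.mean(), img.std()])

def apply_style(content_img: np.ndarray, style_vec: np.ndarray) -> np.ndarray:
    # Strip the content image's own statistics, then re-inject the
    # source style statistics (an AdaIN-like renormalization).
    mu, sigma = style_vec
    normed = (content_img - content_img.mean()) / (content_img.std() + 1e-8)
    return normed * sigma + mu

rng = np.random.default_rng(0)
style_example = rng.normal(loc=0.7, scale=0.1, size=(32, 32))  # one style example
new_content = rng.normal(loc=0.2, scale=0.3, size=(32, 32))    # unrelated content

# One-shot "transfer": the stylized output keeps the new content's
# spatial pattern but adopts the example's intensity statistics.
stylized = apply_style(new_content, style_vector(style_example))
print(stylized.mean(), stylized.std())
```

The real system replaces these scalar statistics with a learned, high-dimensional style embedding and a StyleGAN-like generator, but the contract is the same: one example in, a reusable style code out.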