Can You Learn an Algorithm? Generalizing from Easy to Hard Problems with Recurrent Networks

Deep neural networks are powerful machines for visual pattern recognition, but reasoning tasks that are easy for humans may still be difficult for neural models. Humans possess the ability to extrapolate reasoning strategies learned on simple problems to solve harder examples, often by thinking for longer. For example, a person who has learned to solve small mazes can easily extend the very same search techniques to solve much larger mazes by spending more time. In computers, this behavior is often achieved through the use of algorithms, which scale to arbitrarily hard problem instances at the cost of more computation. In contrast, the sequential computing budget of feed-forward neural networks is limited by their depth, and networks trained on simple problems have no way of extending their reasoning to accommodate harder problems. In this work, we show that recurrent networks trained to solve simple problems with few recurrent steps can indeed solve much more complex problems simply by performing additional recurrences during inference. We demonstrate this algorithmic behavior of recurrent networks on prefix sum computation, mazes, and chess. In all three domains, networks trained on simple problem instances are able to extend their reasoning abilities at test time simply by “thinking for longer.” Read More
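
To make the idea concrete, here is a minimal sketch (ours, not the authors' code) of a weight-tied recurrent block whose number of iterations can simply be increased at test time; the layer sizes and maze-like inputs are illustrative assumptions.

import torch
import torch.nn as nn

class RecurrentSolver(nn.Module):
    def __init__(self, channels=32):
        super().__init__()
        self.encode = nn.Conv2d(3, channels, 3, padding=1)        # project input to a hidden state
        self.recur = nn.Conv2d(channels, channels, 3, padding=1)  # shared (weight-tied) recurrent block
        self.decode = nn.Conv2d(channels, 2, 3, padding=1)        # per-pixel prediction head

    def forward(self, x, iterations=20):
        h = torch.relu(self.encode(x))
        for _ in range(iterations):            # more iterations = more "thinking"
            h = torch.relu(self.recur(h))
        return self.decode(h)

model = RecurrentSolver()
easy = torch.randn(1, 3, 9, 9)     # e.g. a small maze seen during training
hard = torch.randn(1, 3, 33, 33)   # a much larger maze at test time
_ = model(easy, iterations=20)     # recurrence budget used during training
_ = model(hard, iterations=100)    # extra recurrences at inference, no retraining

Because the recurrent block is weight-tied, running it for more steps adds computation without adding parameters, which is what lets the same trained network scale to harder instances.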

#training, #recurrent-neural-networks

MIT 6.S191: Recurrent Neural Networks

Read More

#recurrent-neural-networks

Liquid Time-constant Networks

We introduce a new class of time-continuous recurrent neural network models. Instead of declaring a learning system’s dynamics by implicit nonlinearities, we construct networks of linear first-order dynamical systems modulated via nonlinear interlinked gates. The resulting models represent dynamical systems with varying (i.e., liquid) time-constants coupled to their hidden state, with outputs being computed by numerical differential equation solvers. These neural networks exhibit stable and bounded behavior, yield superior expressivity within the family of neural ordinary differential equations, and give rise to improved performance on time-series prediction tasks. To demonstrate these properties, we first take a theoretical approach to find bounds over their dynamics, and compute their expressive power by the trajectory length measure in a latent trajectory space. We then conduct a series of time-series prediction experiments to manifest the approximation capability of Liquid Time-Constant Networks (LTCs) compared to classical and modern RNNs. Read More
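
As a rough illustration of the dynamics described above (a sketch under our own assumptions, with a placeholder gate and scalar parameters, not the authors' implementation), each hidden unit follows a first-order ODE whose effective time constant is modulated by a nonlinear gate, and the state can be advanced with a fused explicit Euler step:

import numpy as np

def gate(x, I, W, U, b):
    # positive, bounded nonlinearity coupling hidden state and input,
    # so the effective time constant stays positive
    return 1.0 / (1.0 + np.exp(-(W @ x + U @ I + b)))

def ltc_step(x, I, params, tau=1.0, A=1.0, dt=0.1):
    W, U, b = params
    f = gate(x, I, W, U, b)
    # dx/dt = -(1/tau + f) * x + f * A, advanced with a fused Euler update
    return (x + dt * f * A) / (1.0 + dt * (1.0 / tau + f))

Note that the effective time constant 1 / (1/tau + f) depends on the state and the input, which is the “liquid” behavior the abstract refers to.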

#neural-networks, #recurrent-neural-networks

Pay Attention to Evolution: Time Series Forecasting with Deep Graph-Evolution Learning

Time-series forecasting is one of the most active research topics in predictive analysis. A still open gap in that literature is that statistical and ensemble learning approaches systematically present lower predictive performance than deep learning methods, as they generally disregard the data sequence aspect entangled with multivariate data represented in more than one time series. Conversely, this work presents a novel neural network architecture for time-series forecasting that combines the power of graph evolution with deep recurrent learning on distinct data distributions; we named our method Recurrent Graph Evolution Neural Network (ReGENN). The idea is to infer multiple multivariate relationships between co-occurring time series by assuming that the temporal data depends not only on inner variables and intra-temporal relationships (i.e., observations from itself) but also on outer variables and inter-temporal relationships (i.e., observations from other-selves). An extensive set of experiments was conducted comparing ReGENN with dozens of ensemble methods and classical statistical ones, showing sound improvement of up to 64.87%. We also present an analysis of the intermediate weights arising from ReGENN, showing that by looking at inter- and intra-temporal relationships simultaneously, time-series forecasting is majorly improved if paying attention to how multiple multivariate data synchronously evolve. Read More
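
As a generic illustration of the pattern described above, combining graph-based mixing across co-occurring series with recurrence over each series’ own history, here is a small sketch; it is not ReGENN itself, and the learned adjacency, GRU, and linear head are our own placeholder choices.

import torch
import torch.nn as nn

class GraphRecurrentForecaster(nn.Module):
    def __init__(self, n_series, hidden=64):
        super().__init__()
        self.adj = nn.Parameter(torch.randn(n_series, n_series))      # learned series-to-series graph
        self.rnn = nn.GRU(input_size=1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                                    # x: (batch, time, n_series)
        mixed = x @ torch.softmax(self.adj, dim=0)           # inter-temporal mixing across "other-selves"
        b, t, n = mixed.shape
        seq = mixed.permute(0, 2, 1).reshape(b * n, t, 1)    # one sequence per series
        _, h = self.rnn(seq)                                  # intra-temporal recurrence over each series
        return self.head(h[-1]).view(b, n)                    # one-step-ahead forecast per series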

#recurrent-neural-networks

Simplifying GRUs, LSTM and RNNs in General

This article discusses how sequence models work and some of their applications.

Sequence models are a special class of deep neural networks that have applications in machine translation, speech recognition, image captioning, music generation, etc. Sequence problems can be of varying types where the input X and output Y might both be sequences with either the same length or different lengths. It can also be that only one of X or Y is a sequence. Read More
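
The input/output cases mentioned above map directly onto how a recurrent layer’s outputs are read out; below is a minimal sketch with illustrative shapes (the GRU and linear heads are chosen for the example, not taken from the article).

import torch
import torch.nn as nn

rnn = nn.GRU(input_size=8, hidden_size=16, batch_first=True)
x = torch.randn(4, 10, 8)                     # batch of 4 sequences, 10 time steps, 8 features

outputs, h_n = rnn(x)
y_many_to_many = nn.Linear(16, 5)(outputs)    # one prediction per time step (X and Y both sequences)
y_many_to_one = nn.Linear(16, 5)(h_n[-1])     # one prediction per sequence (only X is a sequence)

# One-to-many (only Y is a sequence, e.g. image captioning or music generation) is typically
# handled by feeding a single vector as the initial state and unrolling the RNN on its own outputs.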

#recurrent-neural-networks

A solution to the learning dilemma for recurrent networks of spiking neurons

Recurrently connected networks of spiking neurons underlie the astounding information processing capabilities of the brain. Yet in spite of extensive research, how they can learn through synaptic plasticity to carry out complex network computations remains unclear. We argue that two pieces of this puzzle were provided by experimental data from neuroscience. A mathematical result tells us how these pieces need to be combined to enable biologically plausible online network learning through gradient descent, in particular deep reinforcement learning. This learning method, called e-prop, approaches the performance of backpropagation through time (BPTT), the best-known method for training recurrent neural networks in machine learning. In addition, it suggests a method for powerful on-chip learning in energy-efficient spike-based hardware for artificial intelligence. Read More
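
In one sentence, and in notation of our own rather than the paper’s (a hedged summary), the core of e-prop is that the gradient computed by BPTT for a recurrent weight is approximated by a sum over time of products of two quantities that are available online:

\[
\Delta W_{ji} \;\propto\; \sum_{t} L_j^{t}\, e_{ji}^{t}
\]

where $e_{ji}^{t}$ is an eligibility trace of the synapse from neuron $i$ to neuron $j$ that can be updated forward in time, and $L_j^{t}$ is an online learning signal for neuron $j$, so no backward pass through time is required.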

#performance, #recurrent-neural-networks

Non-Adversarial Video Synthesis with Learned Priors

Most of the existing works in video synthesis focus on generating videos using adversarial learning. Despite their success, these methods often require an input reference frame or fail to generate diverse videos from the given data distribution, with little to no uniformity in the quality of videos that can be generated. Different from these methods, we focus on the problem of generating videos from latent noise vectors, without any reference input frames. To this end, we develop a novel approach that jointly optimizes the input latent space, the weights of a recurrent neural network and a generator through non-adversarial learning. Optimizing for the input latent space along with the network weights allows us to generate videos in a controlled environment, i.e., we can faithfully generate all videos the model has seen during the learning process as well as new unseen videos. Extensive experiments on three challenging and diverse datasets demonstrate that our approach generates superior-quality videos compared to existing state-of-the-art methods. Read More
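
The non-adversarial recipe described above can be pictured as jointly optimizing per-video latent codes together with the recurrent network and the generator under a reconstruction loss; the sketch below is our own illustration with placeholder sizes, layers, and loss, not the paper’s architecture.

import torch
import torch.nn as nn

n_videos, n_frames, z_dim, h_dim = 100, 16, 64, 128
latents = nn.Parameter(torch.randn(n_videos, z_dim))        # learnable input latent space
rnn = nn.GRU(input_size=z_dim, hidden_size=h_dim, batch_first=True)
generator = nn.Sequential(nn.Linear(h_dim, 3 * 32 * 32))    # maps each hidden state to a frame

optimizer = torch.optim.Adam([latents, *rnn.parameters(), *generator.parameters()], lr=1e-3)

def synthesize(z):                                           # z: (batch, z_dim)
    seq = z.unsqueeze(1).repeat(1, n_frames, 1)              # repeat the latent over time
    h, _ = rnn(seq)                                          # recurrent temporal dynamics
    return generator(h).view(-1, n_frames, 3, 32, 32)        # one frame per step

def train_step(videos, idx):                                 # videos: (batch, n_frames, 3, 32, 32)
    recon = synthesize(latents[idx])
    loss = ((recon - videos) ** 2).mean()                    # non-adversarial (reconstruction) objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

Because the latent codes themselves are optimized, every training video corresponds to a known point in latent space, and new videos can be sampled from that space without a discriminator.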

Code

#gans, #recurrent-neural-networks

Machine Learning Can’t Handle Long-Term Time-Series Data

More precisely, today’s machine learning (ML) systems cannot infer a fractal structure from time series data.

This may come as a surprise because computers seem like they can understand time series data. After all, aren’t self-driving cars, AlphaStar and recurrent neural networks all evidence that today’s ML can handle time series data?

Nope. Read More

#recurrent-neural-networks

The Coming Revolution in Recurrent Neural Nets (RNNs)

Recurrent Neural Nets (RNNs) and their cousins LSTMs are at the very core of the most common application of AI: natural language processing (NLP). There are far more real-world applications of RNN-NLP than of any other form of AI, including image recognition and processing with Convolutional Neural Nets (CNNs).

In a sense, the army of data scientists has split into two groups, each pursuing the separate applications that might be developed from these two techniques. In application, there is essentially no overlap, since image processing deals with data that is static (even if only for a second), while RNN-NLP has always interpreted speech and text as time-series data.

It turns out, though, that while RNNs/LSTMs remain the go-to technique for most NLP, the more we try to expand time-series applications, the more trouble we run into. What’s on the horizon may not be so much a modification of RNNs as a hard fork to several other innovative new AI methods. Read More

#neural-networks, #nlp, #recurrent-neural-networks

Radar-based Road User Classification and Novelty Detection with Recurrent Neural Network Ensembles

Radar-based road user classification is an important yet still challenging task towards autonomous driving applications. The resolution of conventional automotive radar sensors results in a sparse data representation which is tough to recover by subsequent signal processing. In this article, classifier ensembles originating from a one-vs-one binarization paradigm are enriched by one-vs-all correction classifiers. They are utilized to efficiently classify individual traffic participants and also to identify hidden object classes which have not been presented to the classifiers during training. For each classifier of the ensemble, an individual feature set is determined from a total set of 98 features. Thereby, the overall classification performance can be improved when compared to previous methods and, additionally, novel classes can be identified much more accurately. Furthermore, the proposed structure gives new insights into the importance of features for the recognition of individual classes, which is crucial for the development of new algorithms and sensor requirements. Read More
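
To illustrate the ensemble idea (an illustrative sketch only, not the authors’ pipeline or feature selection; the classifier type, threshold, and novelty rule are our own assumptions): one-vs-one classifiers vote on the class label, while one-vs-all “correction” classifiers can flag samples that none of the known classes claims, i.e. candidate novel classes.

from itertools import combinations
import numpy as np
from sklearn.linear_model import LogisticRegression

class OvOEnsembleWithNovelty:
    def __init__(self, novelty_threshold=0.5):
        self.threshold = novelty_threshold

    def fit(self, X, y):
        # assumes integer class labels
        self.classes_ = np.unique(y)
        # pairwise one-vs-one classifiers
        self.ovo = {}
        for a, b in combinations(self.classes_, 2):
            mask = np.isin(y, [a, b])
            self.ovo[(a, b)] = LogisticRegression(max_iter=1000).fit(X[mask], y[mask])
        # one-vs-all correction classifiers, one per known class
        self.ova = {c: LogisticRegression(max_iter=1000).fit(X, (y == c).astype(int))
                    for c in self.classes_}
        return self

    def predict(self, X):
        votes = np.zeros((len(X), len(self.classes_)))
        for clf in self.ovo.values():
            pred = clf.predict(X)
            votes[np.arange(len(X)), np.searchsorted(self.classes_, pred)] += 1
        winners = self.classes_[votes.argmax(axis=1)]
        # correction step: if no one-vs-all classifier is confident, report an unknown class (-1)
        ova_conf = np.column_stack([self.ova[c].predict_proba(X)[:, 1] for c in self.classes_])
        return np.where(ova_conf.max(axis=1) >= self.threshold, winners, -1)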

#recurrent-neural-networks