Breakthrough Research In Reinforcement Learning From 2019

Reinforcement learning (RL) continues to be less valuable for business applications than supervised learning, and even unsupervised learning. It is successfully applied only in areas where huge amounts of simulated data can be generated, like robotics and games.

However, many experts recognize RL as a promising path towards Artificial General Intelligence (AGI), or true intelligence. Thus, research teams from top institutions and tech leaders are seeking ways to make RL algorithms more sample-efficient and stable.

We’ve selected and summarized 10 research papers that we think are representative of the latest research trends in reinforcement learning. Read More

#reinforcement-learning

On the Measure of Intelligence

To make deliberate progress towards more intelligent and more human-like artificial systems, we need to be following an appropriate feedback signal: we need to be able to define and evaluate intelligence in a way that enables comparisons between two systems, as well as comparisons with humans. Over the past hundred years, there has been an abundance of attempts to define and measure intelligence, across both the fields of psychology and AI. We summarize and critically assess these definitions and evaluation approaches, while making apparent the two historical conceptions of intelligence that have implicitly guided them. We note that in practice, the contemporary AI community still gravitates towards benchmarking intelligence by comparing the skill exhibited by AIs and humans at specific tasks, such as board games and video games. We argue that solely measuring skill at any given task falls short of measuring intelligence, because skill is heavily modulated by prior knowledge and experience: unlimited priors or unlimited training data allow experimenters to “buy” arbitrary levels of skills for a system, in a way that masks the system’s own generalization power. We then articulate a new formal definition of intelligence based on Algorithmic Information Theory, describing intelligence as skill-acquisition efficiency and highlighting the concepts of scope, generalization difficulty, priors, and experience, as critical pieces to be accounted for in characterizing intelligent systems. Using this definition, we propose a set of guidelines for what a general AI benchmark should look like. Finally, we present a new benchmark closely following these guidelines, the Abstraction and Reasoning Corpus (ARC), built upon an explicit set of priors designed to be as close as possible to innate human priors. We argue that ARC can be used to measure a human-like form of general fluid intelligence and that it enables fair general intelligence comparisons between AI systems and humans.
Read More

#artificial-intelligence, #human, #reinforcement-learning

An Introduction to Deep Reinforcement Learning

Deep reinforcement learning is the combination of reinforcement learning (RL) and deep learning. This field of research has been able to solve a wide range of complex decision-making tasks that were previously out of reach for a machine. Thus, deep RL opens up many new applications in domains such as healthcare, robotics, smart grids, finance, and many more. This manuscript provides an introduction to deep reinforcement learning models, algorithms and techniques. Particular focus is on the aspects related to generalization and how deep RL can be used for practical applications. We assume the reader is familiar with basic machine learning concepts. Read More

#deep-learning, #reinforcement-learning

Solving Rubik’s Cube with a Robot Hand

We demonstrate that models trained only in simulation can be used to solve a manipulation problem of unprecedented complexity on a real robot. This is made possible by two key components: a novel algorithm, which we call automatic domain randomization (ADR) and a robot platform built for machine learning. ADR automatically generates a distribution over randomized environments of ever-increasing difficulty. Control policies and vision state estimators trained with ADR exhibit vastly improved sim2real transfer. For control policies, memory-augmented models trained on an ADR-generated distribution of environments show clear signs of emergent meta-learning at test time. The combination of ADR with our custom robot platform allows us to solve a Rubik’s cube with a humanoid robot hand, which involves both control and state estimation problems. Videos summarizing our results are available: https://openai.com/blog/solving-rubiks-cube/ Read More
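The idea of a randomization range that grows as the policy improves can be illustrated with a small sketch. This is a simplified illustration under stated assumptions, not the implementation from the paper: the names `ParamRange` and `adr_update`, the step size, and the two thresholds are all hypothetical, and a real system would measure the success rate by running the policy in simulation.

```python
import random
from dataclasses import dataclass


@dataclass
class ParamRange:
    """Sampling range for one simulator parameter (e.g. a friction scale)."""
    low: float
    high: float
    step: float = 0.1  # hypothetical amount to widen or narrow by

    def sample(self, rng):
        # Draw one environment parameter uniformly from the current range.
        return rng.uniform(self.low, self.high)

    def widen(self):
        # Make the environment distribution harder (more varied).
        self.low -= self.step
        self.high += self.step

    def narrow(self):
        # Make the distribution easier, without crossing the midpoint.
        centre = (self.low + self.high) / 2
        self.low = min(self.low + self.step, centre)
        self.high = max(self.high - self.step, centre)


def adr_update(param, success_rate, expand_at=0.8, shrink_at=0.4):
    """Adjust the range from a measured success rate (thresholds are assumed)."""
    if success_rate >= expand_at:
        param.widen()
    elif success_rate <= shrink_at:
        param.narrow()
    return param


if __name__ == "__main__":
    rng = random.Random(0)
    friction = ParamRange(low=1.0, high=1.0)  # start with no randomization
    adr_update(friction, success_rate=0.9)    # policy is doing well: widen
    print(friction.low, friction.high, friction.sample(rng))
```

Starting from a single point and widening only when the policy succeeds is what makes the difficulty "ever-increasing": the range never has to be hand-tuned in advance.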

#reinforcement-learning, #robotics

DeepMind Has Quietly Open Sourced Three New Impressive Reinforcement Learning Frameworks

Deep reinforcement learning (DRL) has been at the center of some of the biggest breakthroughs of artificial intelligence (AI) in the last few years. However, despite all its progress, DRL methods remain incredibly difficult to apply in mainstream solutions given the lack of tooling and libraries. Consequently, DRL remains mostly a research activity that hasn’t seen a lot of adoption into real-world machine learning solutions. Addressing that problem requires better tools and frameworks. Among the current generation of AI leaders, DeepMind stands alone as the company that has done the most to advance DRL research and development. Recently, the Alphabet subsidiary has been releasing a series of new open source technologies that can help to streamline the adoption of DRL methods. Read More

#reinforcement-learning

OpenSpiel: A Framework for Reinforcement Learning in Games

OpenSpiel is a collection of environments and algorithms for research in general reinforcement learning and search/planning in games. OpenSpiel supports n-player (single- and multi-agent) zero-sum, cooperative and general-sum, one-shot and sequential, strictly turn-taking and simultaneous-move, perfect and imperfect information games, as well as traditional multiagent environments such as (partially- and fully-observable) grid worlds and social dilemmas. OpenSpiel also includes tools to analyze learning dynamics and other common evaluation metrics. This document serves both as an overview of the code base and an introduction to the terminology, core concepts, and algorithms across the fields of reinforcement learning, computational game theory, and search. Read More

#reinforcement-learning

How teaching AI to be curious helps machines learn for themselves

When playing a video game, what motivates you to carry on?

This question is perhaps too broad to yield a single answer, but if you had to sum up why you accept that next quest, jump into a new level, or cave and play just one more turn, the simplest explanation might be “curiosity”: just to see what happens next. And as it turns out, curiosity is a very effective motivator when teaching AI to play video games, too.

IN A GAME WITHOUT REWARDS, TEACHING AI IS DIFFICULT

Research published this week by artificial intelligence lab OpenAI explains how an AI agent with a sense of curiosity outperformed its predecessors playing the classic 1984 Atari game Montezuma’s Revenge. Read More
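One common way to make "curiosity" concrete is to pay the agent a bonus whenever the outcome of an action surprises it. The sketch below is only an illustration of that general idea, not the method from the OpenAI research mentioned above, which the excerpt does not describe. The class name `CuriosityBonus` and the state and action labels are hypothetical, and the tabular "remember the last outcome" predictor stands in for what would normally be a learned model.

```python
class CuriosityBonus:
    """Intrinsic reward that is high when an outcome was not predicted."""

    def __init__(self):
        # (state, action) -> the next state seen last time; this acts as
        # a very simple predictor of "what happens next".
        self.predicted_next = {}

    def reward(self, state, action, next_state):
        key = (state, action)
        # The agent is surprised if it had no prediction or a wrong one.
        surprised = self.predicted_next.get(key) != next_state
        # Update the predictor so the same outcome is no longer novel.
        self.predicted_next[key] = next_state
        return 1.0 if surprised else 0.0


if __name__ == "__main__":
    bonus = CuriosityBonus()
    print(bonus.reward("room_1", "left", "room_2"))  # first visit
    print(bonus.reward("room_1", "left", "room_2"))  # same outcome again
```

Because the bonus disappears once a transition becomes predictable, an agent maximizing it is pushed towards parts of the game it has not yet seen, which matters in games where the built-in score is rarely awarded.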

#reinforcement-learning

DeepMind’s losses and the future of Artificial Intelligence

Alphabet’s DeepMind lost $572 million last year. What does it mean?

DeepMind, likely the world’s largest research-focused artificial intelligence operation, is losing a lot of money fast, more than $1 billion in the past three years. DeepMind also has more than $1 billion in debt due in the next 12 months.

Does this mean that AI is falling apart? Read More

#artificial-intelligence, #reinforcement-learning

Inside DeepMind's epic mission to solve science's trickiest problem

DeepMind is best known for its breakthroughs in machine learning and deep learning that have resulted in highly publicised events in which neural networks combined with algorithms have mastered computer games, beaten chess grandmasters and caused Lee Sedol, the world champion of Go – widely agreed to be the most complex game man has created – to declare: “From the beginning of the game, there was not a moment in time when I thought that I was winning.”

For Demis Hassabis, Shane Legg, and Mustafa Suleyman, the proof points offered by gameplay will define the next ten years: namely, to use data and machine learning to solve some of the hardest problems in science. Read More

#deep-learning, #reinforcement-learning, #strategy

Hierarchical Imitation and Reinforcement Learning

We study how to effectively leverage expert feedback to learn sequential decision-making policies. We focus on problems with sparse rewards and long time horizons, which typically pose significant challenges in reinforcement learning. We propose an algorithmic framework, called hierarchical guidance, that leverages the hierarchical structure of the underlying problem to integrate different modes of expert interaction. Our framework can incorporate different combinations of imitation learning (IL) and reinforcement learning (RL) at different levels, leading to dramatic reductions in both expert effort and cost of exploration. Using long-horizon benchmarks, including Montezuma’s Revenge, we demonstrate that our approach can learn significantly faster than hierarchical RL, and be significantly more label-efficient than standard IL. We also theoretically analyze labeling cost for certain instantiations of our framework. Read More

#human, #observational-learning, #reinforcement-learning