Learning sensorimotor control policies from high-dimensional images relies crucially on the quality of the underlying visual representations. Prior work shows that structured latent spaces, such as visual keypoints, often outperform unstructured representations for robotic control. However, most of these representations, whether structured or unstructured, are learned in 2D space even though control tasks are usually performed in a 3D environment. In this work, we propose a framework to learn such a 3D geometric structure directly from images in an end-to-end unsupervised manner. The input images are embedded into latent 3D keypoints via a differentiable encoder, which is trained to optimize both a multi-view consistency loss and the downstream task objective. The discovered 3D keypoints tend to meaningfully capture robot joints as well as object movements in a consistent manner across both time and 3D space. The proposed approach outperforms prior state-of-the-art methods across a variety of reinforcement learning benchmarks.
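As a rough illustration of what a multi-view consistency objective can look like, here is a minimal NumPy sketch. It is not the authors' implementation: the function names, the assumption that each camera's camera-to-world transform is known, and the squared-deviation penalty are all hypothetical choices made for this example. The idea is that keypoints predicted independently from each view should land on the same 3D points once mapped into a shared world frame.

```python
import numpy as np


def to_world(kp_cam, cam_to_world):
    """Map (K, 3) camera-frame keypoints to the world frame.

    cam_to_world is an assumed-known 4x4 homogeneous transform (hypothetical input).
    """
    ones = np.ones((kp_cam.shape[0], 1))
    homo = np.concatenate([kp_cam, ones], axis=1)  # (K, 4) homogeneous coordinates
    return (homo @ cam_to_world.T)[:, :3]


def multiview_consistency_loss(kps_cam, cams_to_world):
    """Mean squared distance of each view's world-frame keypoints
    from the per-keypoint average across views.

    kps_cam: list of (K, 3) arrays, one per camera view.
    cams_to_world: list of 4x4 arrays, one per camera view.
    """
    world = np.stack(
        [to_world(k, T) for k, T in zip(kps_cam, cams_to_world)]
    )  # (V, K, 3)
    mean = world.mean(axis=0, keepdims=True)  # (1, K, 3) cross-view consensus
    return float(((world - mean) ** 2).sum(axis=-1).mean())
```

In a trainable version the same computation would be written in an autodiff framework so that gradients flow back into the encoder, and this term would be added to the task objective with some weighting. The loss is zero exactly when all views agree on every keypoint's world position.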
Daily Archives: November 11, 2021
Fortune Brainstorm A.I. 2021: Big breakthroughs
Dr. Andrew Ng, Founder and CEO of Landing A.I. and DeepLearning.AI, is interviewed by Brian O’Keefe of FORTUNE at Brainstorm A.I., a recent conference bringing together top executives from the world’s biggest tech companies, thought leaders, and innovators to explore key issues shaping the A.I. revolution.