Google’s Supermodel: DeepMind Perceiver is a step on the road to an AI machine that could process anything and everything

The Perceiver is something of a way-station on the road to what Google AI lead Jeff Dean has described as one model that could handle any task and "learn" faster, with less data.

Arguably one of the premier events that has brought AI to popular attention in recent years was the invention of the Transformer by Ashish Vaswani and colleagues at Google in 2017. The Transformer led to a host of language programs, such as Google's BERT and OpenAI's GPT-3, that can produce surprisingly human-seeming sentences, giving the impression that machines can write like a person.

Now, scientists at DeepMind in the U.K., which is owned by Google, want to take the benefits of the Transformer beyond text, applying it to other kinds of data including images, sound, video, and spatial data of the kind a car records with LiDAR.

The Perceiver, unveiled this week by DeepMind in a paper posted on arXiv, adapts the Transformer with some tweaks that let it consume all of those types of input and perform the various tasks, such as image recognition, for which separate kinds of neural networks are usually built.
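The paper's central adaptation is a small, fixed-size "latent" array that cross-attends to the full, flattened input array (pixels, audio samples, or point-cloud features), so compute no longer scales quadratically with the size of the raw input. Below is a minimal PyTorch sketch of that idea; it is not DeepMind's code, and the class name, dimensions, and classification head are illustrative assumptions.

```python
import torch
import torch.nn as nn

class PerceiverSketch(nn.Module):
    """A toy sketch of the Perceiver idea: latent array cross-attends to inputs."""

    def __init__(self, input_dim=64, latent_dim=256, num_latents=128,
                 num_heads=8, num_self_attn_layers=4, num_classes=1000):
        super().__init__()
        # Fixed-size latent array, learned and shared across all inputs.
        self.latents = nn.Parameter(torch.randn(num_latents, latent_dim))
        self.input_proj = nn.Linear(input_dim, latent_dim)
        # Cross-attention: the small latent array queries the (arbitrarily long) input array.
        self.cross_attn = nn.MultiheadAttention(latent_dim, num_heads, batch_first=True)
        # A standard Transformer stack that operates only on the latent array.
        self.self_attn = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(latent_dim, num_heads, batch_first=True),
            num_layers=num_self_attn_layers)
        self.classifier = nn.Linear(latent_dim, num_classes)

    def forward(self, x):
        # x: (batch, num_input_elements, input_dim), e.g. flattened pixels,
        # audio samples, or LiDAR points with positional features attached.
        kv = self.input_proj(x)
        q = self.latents.unsqueeze(0).expand(x.shape[0], -1, -1)
        latents, _ = self.cross_attn(q, kv, kv)
        latents = self.self_attn(latents)
        # Pool the latents and classify, e.g. for image recognition.
        return self.classifier(latents.mean(dim=1))

# Example: a batch of 2 "images" flattened to 32 x 32 = 1,024 input elements each.
model = PerceiverSketch()
logits = model(torch.randn(2, 32 * 32, 64))
print(logits.shape)  # torch.Size([2, 1000])
```

Because self-attention runs only over the fixed number of latents, the same sketch accepts a 1,024-element image, a longer audio clip, or a point cloud without changing the architecture, which is the property that lets one model serve tasks that usually get separate networks.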

#big7, #nlp, #image-recognition