Artificial intelligence (AI) is a multidisciplinary field of science and engineering whose goal is to create intelligent machines.
We believe that AI will be a force multiplier on technological progress in our increasingly digital, data-driven world. This is because everything around us today, ranging from culture to consumer products, is a product of intelligence.
The State of AI Report is now in its third year. New to the 2020 edition are several invited content contributions from a range of well-known and up-and-coming companies and research groups. Consider this Report a compilation of the most interesting things we’ve seen, with the goal of triggering an informed conversation about the state of AI and its implications for the future. Read More
Monthly Archives: October 2020
AI Training Method Exceeds GPT-3 Performance with 99.9% Fewer Parameters
A team of scientists at LMU Munich has developed Pattern-Exploiting Training (PET), a deep-learning training technique for natural language processing (NLP) models. Using PET, the team trained a Transformer NLP model with 223M parameters that outperformed the 175B-parameter GPT-3 by over 3 percentage points on the SuperGLUE benchmark. Read More
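At its core, PET reformulates a classification task as a cloze (fill-in-the-blank) question that a masked language model can answer directly. Here is a toy sketch of that pattern-plus-verbalizer idea; the pattern, label words, and model choice are made up for illustration, and the full method also fine-tunes on reformulated examples and distills an ensemble of models.

```python
# Toy sketch of PET's cloze reformulation (illustrative pattern/verbalizer,
# not the authors' exact setup). Requires the `transformers` library.
from transformers import pipeline

fill = pipeline("fill-mask", model="roberta-base")

def classify_sentiment(review: str) -> str:
    # Pattern: rewrite the input so the label becomes a masked token.
    prompt = f"{review} All in all, it was <mask>."
    # Verbalizer: map natural-language label words back to task labels.
    verbalizer = {"great": "positive", "terrible": "negative"}
    scores = {}
    for cand in fill(prompt, targets=[" great", " terrible"]):
        scores[verbalizer[cand["token_str"].strip()]] = cand["score"]
    return max(scores, key=scores.get)

print(classify_sentiment("The plot was gripping and the acting superb."))
```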
TernaryBERT: Quantization Meets Distillation
A BERTology contribution by Huawei
The ongoing trend of building ever larger models like BERT and GPT-3 has been accompanied by a complementary effort to reduce their size at little or no cost in accuracy. Compact yet effective models are built via distillation (Pre-trained Distillation, DistilBERT, MobileBERT, TinyBERT), quantization (Q-BERT, Q8BERT), or parameter pruning.
On September 27, Huawei introduced TernaryBERT, a model that leverages both distillation and quantization to achieve accuracy comparable to the original BERT model with a ~15x decrease in size. What is truly remarkable about TernaryBERT is that its weights are ternarized, i.e. each takes one of three values: -1, 0, or 1 (and can hence be stored in only two bits). Read More
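For a feel of what ternarization does, here is a minimal sketch in the style of Ternary Weight Networks (thresholding plus a learned scale); TernaryBERT’s actual pipeline additionally uses quantization-aware training and distillation from the full-precision model.

```python
# Minimal TWN-style ternarization sketch: approximate w by alpha * t,
# where every entry of t is -1, 0, or +1. Not Huawei's exact method.
import torch

def ternarize(w: torch.Tensor):
    delta = 0.7 * w.abs().mean()     # heuristic threshold from the TWN paper
    t = torch.zeros_like(w)
    t[w > delta] = 1.0
    t[w < -delta] = -1.0
    mask = t != 0
    # Scale alpha: mean magnitude of the weights that survived thresholding.
    alpha = w[mask].abs().mean() if mask.any() else w.new_tensor(0.0)
    return alpha, t

w = torch.randn(4, 4)
alpha, t = ternarize(w)
print(alpha)                         # single full-precision scale factor
print(t)                             # 2-bit-storable ternary matrix
```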
Algorithms are not enough
The next breakthrough in AI requires a rethinking of our hardware
Today’s AI has a problem: it is expensive. A single forward pass through Resnet-152, a modern computer vision model, takes an estimated 10 billion floating point operations, and even that figure is dwarfed by modern language models. Training GPT-3, the recent natural language model from OpenAI, is estimated to require 300 billion trillion floating point operations, which translates to at least $5M on commercial GPUs. Compare this to the human brain, which can recognize faces, answer questions, and drive cars with as little as a banana and a cup of coffee. Read More
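As a sanity check on that price tag, a back-of-the-envelope calculation follows; the sustained throughput and hourly rate below are assumptions for illustration, not figures from the article.

```python
# Back-of-the-envelope GPT-3 training cost. All constants are assumptions.
total_flops = 3e23            # "300 billion trillion" floating point operations
gpu_flops = 1e14              # assumed sustained throughput per GPU, FLOP/s
dollars_per_gpu_hour = 1.50   # assumed commercial cloud price

gpu_hours = total_flops / gpu_flops / 3600
print(f"{gpu_hours:,.0f} GPU-hours = ${gpu_hours * dollars_per_gpu_hour:,.0f}")
# -> 833,333 GPU-hours = $1,250,000 under these optimistic assumptions;
#    realistic utilization and pricing push the total toward the $5M quoted.
```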
Google Teases Large Scale Reinforcement Learning Infrastructure
“The new infrastructure reduces the training time from eight hours down to merely one hour compared to a strong baseline.”
The current state-of-the-art reinforcement learning techniques require many iterations over many samples from the environment to learn a target task. For instance, OpenAI’s Dota 2 agent learns from batches of 2 million frames every 2 seconds. Infrastructure that handles RL at this scale must not only be good at collecting a large number of samples, but also be able to iterate quickly over these extensive amounts of samples during training. Read More
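Such systems typically split the work between many actors, which collect experience in parallel, and a learner, which consumes it in large batches. The toy sketch below shows that actor/learner shape with a shared queue; it is a structural illustration only, not Google’s infrastructure or API.

```python
# Toy actor/learner split: parallel actors feed a queue, the learner
# drains large batches. Illustrative structure, not a real RL system.
import queue
import random
import threading

experience = queue.Queue(maxsize=10_000)

def actor(steps: int = 2_000):
    """Each actor rolls out its own environment copy and ships transitions."""
    state = 0.0
    for _ in range(steps):
        action = random.choice([-1.0, 1.0])      # stand-in for a policy
        next_state, reward = state + action, -abs(state)
        experience.put((state, action, reward, next_state))
        state = next_state

def learner(num_batches: int = 10, batch_size: int = 256):
    """A real learner would run a gradient update on each batch."""
    for i in range(num_batches):
        batch = [experience.get() for _ in range(batch_size)]
        print(f"update {i}: consumed {len(batch)} transitions")

for _ in range(4):                               # four parallel actors
    threading.Thread(target=actor, daemon=True).start()
learner()
```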
Super Learner versus Deep Neural Network
Deep learning has taken a prominent place in tasks involving predictive modelling and pattern recognition. With its automatic feature extraction and feed-forward architecture, deep learning can extract low-level features and compose them into high-level representations in big data applications. However, deep neural networks have drawbacks, including many hyperparameters to tune simultaneously, slow convergence on smaller datasets, and difficulty explaining why a particular decision was made. While traditional machine learning algorithms can address these drawbacks, they are typically not capable of achieving the performance levels registered by deep neural networks. To improve performance, ensemble methods are used to combine multiple base learners. Read More
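The super learner is essentially stacked generalization: base learners are combined by a meta-learner fit on their out-of-fold predictions. A minimal sketch with scikit-learn follows; the dataset and the choice of base and meta learners are arbitrary examples.

```python
# Minimal "super learner" (stacked ensemble) sketch with scikit-learn.
# Base learners, meta-learner, and dataset are illustrative choices.
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)

super_learner = StackingClassifier(
    estimators=[
        ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
        ("svm", SVC(probability=True, random_state=0)),
    ],
    # The meta-learner is trained on out-of-fold predictions (cv=5),
    # which is the cross-validated combination the super learner relies on.
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,
)
print(cross_val_score(super_learner, X, y, cv=5).mean())
```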
Backpropagation made easy
Backpropagation is fundamental to machine learning, yet it can seem daunting. In fact, it is easier than it looks.
It doesn’t take a math genius to learn Machine Learning (ML). Basically, all you need is first-year college calculus, linear algebra, and probability theory, and you are good to go. But behind the seemingly benign first impression of ML lies a good deal of mathematical theory. For many people, the first real obstacle in learning ML is back-propagation (BP). It is the method we use to compute the gradients of the parameters in a neural network (NN), and it is a necessary step in the Gradient Descent algorithm used to train a model. Read More
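To make the chain rule concrete, here is a hand-written backward pass for a one-hidden-layer network in NumPy; the architecture and hyperparameters are arbitrary, but the gradient algebra is the standard derivation.

```python
# Backpropagation by hand for a one-hidden-layer regression network,
# so each chain-rule step is visible. Arbitrary toy data and sizes.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(16, 3))              # 16 samples, 3 features
y = rng.normal(size=(16, 1))              # regression targets

W1, b1 = rng.normal(size=(3, 4)), np.zeros(4)
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)
lr = 0.05

for step in range(200):
    # Forward pass.
    z1 = X @ W1 + b1
    h = np.tanh(z1)
    y_hat = h @ W2 + b2
    loss = ((y_hat - y) ** 2).mean()

    # Backward pass: apply the chain rule layer by layer.
    d_yhat = 2 * (y_hat - y) / len(X)     # dL/d y_hat for mean squared error
    dW2 = h.T @ d_yhat
    db2 = d_yhat.sum(axis=0)
    dh = d_yhat @ W2.T                    # push the gradient through layer 2
    dz1 = dh * (1 - np.tanh(z1) ** 2)     # tanh'(z) = 1 - tanh(z)^2
    dW1 = X.T @ dz1
    db1 = dz1.sum(axis=0)

    # Gradient descent update.
    for p, g in ((W1, dW1), (b1, db1), (W2, dW2), (b2, db2)):
        p -= lr * g

print(f"final loss: {loss:.4f}")
```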
Researchers spot origins of stereotyping in AI language technologies
A team of researchers has identified a set of cultural stereotypes that are introduced into artificial intelligence models for language early in their development—a finding that adds to our understanding of the factors that influence results yielded by search engines and other AI-driven tools. Read More
9 Soft Skills Every Employee Will Need In The Age Of Artificial Intelligence (AI)
Technical skills and data literacy are obviously important in this age of AI, big data, and automation. But that doesn’t mean we should ignore the human side of work – skills in areas where robots don’t do so well. I believe these softer skills will become even more critical for success as the nature of work evolves, and as machines take on more of the easily automated aspects of work. In other words, the work of humans is going to become altogether more, well, human. Read More