In recent years, large language models have evolved at a remarkable pace. BERT became one of the most popular and effective models, capable of solving a wide range of NLP tasks with high accuracy. After BERT, a number of other models appeared on the scene, demonstrating outstanding results as well.
One clear trend is that, over time, large language models (LLMs) have grown more complex, with the number of parameters and the amount of training data increasing exponentially. Deep learning research has shown that this kind of scaling usually leads to better results. Unfortunately, it also creates serious difficulties, and scalability has become the main obstacle to training, storing and using these models effectively.
As a consequence, new LLMs have recently been developed to tackle these scalability issues. In this article, we will discuss ALBERT, which was introduced in 2020 with the objective of significantly reducing the number of BERT parameters.