Over the last two years, the Natural Language Processing community has witnessed an acceleration of progress across a wide range of tasks and applications. 🚀 This progress was enabled by a paradigm shift in how we classically build an NLP system: for a long time, we used pre-trained word embeddings such as word2vec or GloVe to initialize the first layer of a neural network, followed by a task-specific architecture trained in a supervised way on a single dataset.
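To make that classical recipe concrete, here is a minimal sketch in PyTorch (an illustrative choice; the post does not prescribe a framework): a first layer initialized from pre-trained word vectors, topped by a task-specific encoder and classifier trained from scratch on a single supervised dataset. The class name, shapes, and the random stand-in for GloVe weights are hypothetical.

```python
# Minimal sketch of the "classical" pipeline: a frozen pre-trained embedding
# layer (e.g. GloVe vectors) feeding a task-specific architecture trained in a
# supervised way on a single dataset. Names and shapes are illustrative.
import torch
import torch.nn as nn

vocab_size, embed_dim, num_classes = 20_000, 300, 2

# In practice these weights would be loaded from GloVe / word2vec files;
# a random tensor stands in for them here.
pretrained_vectors = torch.randn(vocab_size, embed_dim)

class ClassicalClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        # First layer initialized with pre-trained word embeddings.
        self.embedding = nn.Embedding.from_pretrained(pretrained_vectors, freeze=True)
        # Task-specific architecture trained from scratch on the target dataset.
        self.encoder = nn.LSTM(embed_dim, 128, batch_first=True)
        self.classifier = nn.Linear(128, num_classes)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)      # (batch, seq, embed_dim)
        _, (hidden, _) = self.encoder(embedded)   # final hidden state
        return self.classifier(hidden[-1])        # (batch, num_classes)

model = ClassicalClassifier()
logits = model(torch.randint(0, vocab_size, (4, 16)))  # dummy batch of token ids
```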
Recently, several works have demonstrated that we can learn hierarchical, contextualized representations on web-scale datasets 📖 by leveraging unsupervised (or self-supervised) signals such as language modeling, and transfer this pre-training to downstream tasks (Transfer Learning). Excitingly, this shift has led to significant advances on a wide range of downstream applications, from Question Answering to Natural Language Inference and Syntactic Parsing…
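For contrast, here is a hedged sketch of the newer recipe, assuming the Hugging Face `transformers` library and a BERT checkpoint purely as illustrative stand-ins: the entire pre-trained language model, not just its first layer, is transferred and then fine-tuned end-to-end on the supervised downstream task.

```python
# Sketch of the transfer-learning paradigm: a Transformer pre-trained with a
# language-modeling objective on web-scale text, then fine-tuned on a
# downstream task. Checkpoint name, label count, and hyperparameters are
# illustrative assumptions, not prescribed by the post.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # new task head on top of the pre-trained encoder
)

# All pre-trained weights are transferred and fine-tuned end-to-end
# on the supervised downstream dataset (a single dummy example here).
inputs = tokenizer("A tiny example sentence.", return_tensors="pt")
logits = model(**inputs).logits

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
loss = torch.nn.functional.cross_entropy(logits, torch.tensor([1]))
loss.backward()
optimizer.step()
```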
“Which papers can I read to catch up with the latest trends in modern NLP?”