Domain Adaptation of A Large Language Model

Large language models (LLMs) like BERT are usually pre-trained on general-domain corpora such as Wikipedia and BookCorpus. When we apply them to more specialized domains such as medicine, performance often drops compared to models adapted for those domains.

In this article, we will explore how to adapt a pre-trained LLM like DeBERTa base to the medical domain using the HuggingFace Transformers library. Specifically, we will cover an effective technique called intermediate pre-training, where we further pre-train the LLM on data from our target domain. This adapts the model to the new domain and improves its performance.
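To make the idea of further pre-training more concrete, here is a minimal sketch of the data-preparation step, assuming the objective is masked language modeling (the usual choice for BERT-style models; the excerpt itself does not name the objective). The function `mask_tokens`, the toy vocabulary, and the example sentence are all hypothetical and use plain Python only. In the Transformers library, this step is normally handled by `DataCollatorForLanguageModeling` with its `mlm_probability` parameter.

```python
import random

MASK_TOKEN = "[MASK]"  # placeholder mask symbol, as used by BERT-style tokenizers


def mask_tokens(tokens, vocab, mlm_probability=0.15, seed=None):
    """Return (inputs, labels) for masked language modeling.

    Each token is selected with probability `mlm_probability`. A selected
    token is replaced by MASK_TOKEN 80% of the time, by a random vocabulary
    token 10% of the time, and left unchanged 10% of the time. Labels hold
    the original token at selected positions and None everywhere else.
    """
    rng = random.Random(seed)
    inputs = list(tokens)
    labels = [None] * len(tokens)
    for i, token in enumerate(tokens):
        if rng.random() >= mlm_probability:
            continue  # position not selected; the model is not scored here
        labels[i] = token
        roll = rng.random()
        if roll < 0.8:
            inputs[i] = MASK_TOKEN
        elif roll < 0.9:
            inputs[i] = rng.choice(vocab)
        # otherwise keep the original token in the input
    return inputs, labels


# Hypothetical in-domain sentence and toy vocabulary, for illustration only.
example_tokens = "the patient was treated with intravenous antibiotics".split()
example_vocab = ["patient", "dose", "antibiotics", "fever", "treated"]
example_inputs, example_labels = mask_tokens(example_tokens, example_vocab, seed=0)
```

During intermediate pre-training, the model would be trained to predict the original token at every position where the label is set, using text drawn from the target domain rather than the general-domain corpus.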

This is a simple yet effective technique to tune LLMs to your domain and gain significant improvements in downstream task performance. — Read More

#devops

Exploring GPTs: ChatGPT in a trench coat?

The biggest announcement from last week’s OpenAI DevDay (and there were a LOT of announcements) was GPTs. Users of ChatGPT Plus can now create their own, custom GPT chat bots that other Plus subscribers can then talk to.

My initial impression of GPTs was that they’re not much more than ChatGPT in a trench coat—a fancy wrapper for standard GPT-4 with some pre-baked prompts.

Now that I’ve spent more time with them I’m beginning to see glimpses of something more than that. The combination of features they provide can add up to some very interesting results. — Read More

#chatbots

500 chatbots read the news and discussed it on social media. Guess how that went.

On a simulated day in July of a 2020 that didn’t happen, 500 chatbots read the news — real news, our news, from the real July 1, 2020. ABC News reported that Alabama students were throwing “COVID parties.” On CNN, President Donald Trump called Black Lives Matter a “symbol of hate.” The New York Times had a story about the baseball season being canceled because of the pandemic.

Then the 500 robots logged into something very much (but not totally) like Twitter, and discussed what they had read. Meanwhile, in our world, the not-simulated world, a bunch of scientists were watching. — Read More

#chatbots

Google DeepMind wants to define what counts as artificial general intelligence

AGI, or artificial general intelligence, is one of the hottest topics in tech today. It’s also one of the most controversial. A big part of the problem is that few people agree on what the term even means. Now a team of Google DeepMind researchers has put out a paper that cuts through the cross talk with not just one new definition for AGI but a whole taxonomy of them.

In broad terms, AGI typically means artificial intelligence that matches (or outmatches) humans on a range of tasks. But specifics about what counts as human-like, what tasks, and how many all tend to get waved away: AGI is AI, but better.

To come up with the new definition, the Google DeepMind team started with prominent existing definitions of AGI and drew out what they believe to be their essential common features. 

The team also outlines five ascending levels of AGI: emerging (which in their view includes cutting-edge chatbots like ChatGPT and Bard), competent, expert, virtuoso, and superhuman (performing a wide range of tasks better than all humans, including tasks humans cannot do at all, such as decoding other people’s thoughts, predicting future events, and talking to animals). They note that no level beyond emerging AGI has been achieved. — Read More

Read the Paper

#human