Constitutional AI: RLHF On Steroids

AIs like GPT-4 go through several different types of training. First, they train on giant text corpuses in order to work at all. Later, they go through a process called “reinforcement learning from human feedback” (RLHF) which trains them to be “nice”. RLHF is why they (usually) won’t make up fake answers to your questions, tell you how to make a bomb, or rank all human races from best to worst.

RLHF is hard. The usual method is to make human crowdworkers rate thousands of AI responses as good or bad, then train the AI towards the good answers and away from the bad answers. But having thousands of crowdworkers rate thousands of answers is expensive and time-consuming. And it puts the AI’s ethics in the hands of random crowdworkers. Companies train these crowdworkers in what responses they want, but they’re limited by the crowdworkers’ ability to follow their rules.
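To make the "rate answers, then train toward the good ones" step concrete, here is a minimal sketch of the preference-modeling piece of RLHF, not Anthropic's actual code: crowdworker labels arrive as (chosen, rejected) pairs, and a small reward model is trained so the chosen response scores higher. The toy embeddings and network sizes are assumptions for brevity.

```python
# Sketch: train a reward model from pairwise human preferences (RLHF step).
# Toy random vectors stand in for real response embeddings from a language model.
import torch
import torch.nn as nn

EMBED_DIM = 16  # placeholder size for a response embedding

# A tiny reward model: maps a response embedding to a scalar "goodness" score.
reward_model = nn.Sequential(nn.Linear(EMBED_DIM, 32), nn.ReLU(), nn.Linear(32, 1))
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

# Fake preference data: each row pairs a human-preferred response with a rejected one.
chosen = torch.randn(64, EMBED_DIM)
rejected = torch.randn(64, EMBED_DIM)

for step in range(100):
    r_chosen = reward_model(chosen)      # score the preferred responses
    r_rejected = reward_model(rejected)  # score the rejected responses
    # Bradley-Terry style pairwise loss: push r_chosen above r_rejected.
    loss = -torch.nn.functional.logsigmoid(r_chosen - r_rejected).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```

The trained reward model then steers the main model via reinforcement learning, which is exactly the step Constitutional AI tries to do without the crowdworkers.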

In their new preprint Constitutional AI: Harmlessness From AI Feedback, a team at Anthropic (a big AI company) announces a surprising update to this process: what if the AI gives feedback to itself? — Read More
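Roughly, the preprint's answer is a critique-and-revise loop: the model checks its own draft against a written list of principles (the "constitution") and rewrites it, and those revisions plus AI-chosen preference labels replace most of the human feedback. A loose sketch of that loop, where `generate` is a hypothetical stand-in for whatever LLM call you have:

```python
# Loose sketch of the Constitutional AI self-critique loop (not Anthropic's code).

CONSTITUTION = [
    "Choose the response that is least harmful and most honest.",
    "Avoid responses that help with dangerous or illegal activity.",
]

def generate(prompt: str) -> str:
    """Placeholder for a real language-model call; returns canned text here."""
    return "DRAFT RESPONSE"  # assumption: swap in an actual model call

def critique_and_revise(user_prompt: str) -> str:
    response = generate(user_prompt)
    for principle in CONSTITUTION:
        # Ask the model to critique its own answer against one principle...
        critique = generate(
            f"Principle: {principle}\nResponse: {response}\n"
            "Point out any way the response violates the principle."
        )
        # ...then to rewrite the answer in light of that critique.
        response = generate(
            f"Principle: {principle}\nCritique: {critique}\n"
            f"Rewrite this response to fix the problems:\n{response}"
        )
    return response  # revised responses become the training data
```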

#nlp

Meta open-sources multisensory AI model that combines six types of data

Meta has announced a new open-source AI model that links together multiple streams of data, including text, audio, visual data, temperature, and movement readings.

The model is only a research project at this point, with no immediate consumer or practical applications. But it points to a future of generative AI systems that can create immersive, multisensory experiences, and it shows that Meta continues to share AI research at a time when rivals like OpenAI and Google have become increasingly secretive.

The core concept of the research is linking together multiple types of data into a single multidimensional index (or “embedding space,” to use AI parlance). This idea may seem a little abstract, but it’s this same concept that underpins the recent boom in generative AI. Read More
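As a rough illustration of that shared "embedding space" idea (not Meta's actual code): each modality gets its own encoder, but all encoders output vectors of the same size, so a text vector and an image or audio vector can be compared directly. The random linear layers below are placeholders for real trained encoders.

```python
# Sketch: several modalities projected into one shared embedding space,
# then compared with cosine similarity. Encoder sizes are illustrative only.
import torch
import torch.nn.functional as F

SHARED_DIM = 128  # every modality lands in this common space

encoders = {
    "text":  torch.nn.Linear(300, SHARED_DIM),   # e.g. from token features
    "image": torch.nn.Linear(2048, SHARED_DIM),  # e.g. from a vision backbone
    "audio": torch.nn.Linear(512, SHARED_DIM),   # e.g. from a spectrogram model
}

def embed(modality: str, features: torch.Tensor) -> torch.Tensor:
    """Project modality-specific features into the shared space, L2-normalized."""
    return F.normalize(encoders[modality](features), dim=-1)

# Cross-modal similarity: a caption and an image can be scored against each other
# because both live in the same index.
text_vec = embed("text", torch.randn(1, 300))
image_vec = embed("image", torch.randn(1, 2048))
similarity = (text_vec * image_vec).sum(dim=-1)  # cosine similarity after normalization
print(similarity.item())
```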

#big7

IBM takes another shot at Watson as A.I. boom picks up steam

It’s been a long time since IBM has actively touted Watson. Originally created to beat humans at the “Jeopardy!” game show, Watson marked IBM’s early splash in artificial intelligence, but it never amounted to a profitable offering.

About 15 months ago, IBM sold its Watson Health unit for an undisclosed amount to private equity firm Francisco Partners.

Now, Watson has given way to WatsonX, and IBM is trying to ride the latest boom in AI, billing it as a development studio for companies to “train, tune and deploy” machine-learning models. Read More

#devops