How Meta trains large language models at scale

As we continue to focus our AI research and development on solving increasingly complex problems, one of the most significant and challenging shifts we’ve experienced is the sheer scale of computation required to train large language models (LLMs).

Traditionally, our AI model training has involved training a massive number of models that each required a comparatively small number of GPUs. This was the case for our recommendation models (e.g., our feed and ranking models), which would ingest vast amounts of information to make accurate recommendations that power most of our products.

With the advent of generative AI (GenAI), we’ve seen a shift towards fewer jobs, but incredibly large ones. Supporting GenAI at scale has meant rethinking how our software, hardware, and network infrastructure come together. — Read More

#big7

What happened when 20 comedians got AI to write their routines

AI is good at lots of things: spotting patterns in data, creating fantastical images, and condensing thousands of words into just a few paragraphs. But can it be a useful tool for writing comedy?

New research suggests that it can, but only to a very limited extent. It’s an intriguing finding that hints at the ways AI can—and cannot—assist with creative endeavors more generally.  — Read More

#humor

ChatGPT has caused a massive drop in demand for online digital freelancers

Many employees, especially those working in creative fields, are understandably worried by the prospect of AI stealing their jobs – and new research has found it may not be an unfounded fear. 

A report from the Imperial College Business School, Harvard Business School, and the German Institute for Economic Research found that demand for digital freelancers in writing and coding has declined by 21% since the launch of ChatGPT in November 2022. — Read More

Read the Paper

#strategy

Teams of LLM Agents can Exploit Zero-Day Vulnerabilities

LLM agents have become increasingly sophisticated, especially in the realm of cybersecurity. Researchers have shown that LLM agents can exploit real-world vulnerabilities when given a description of the vulnerability, and can solve toy capture-the-flag problems. However, these agents still perform poorly on real-world vulnerabilities that are unknown to the agent ahead of time (zero-day vulnerabilities).

In this work, we show that teams of LLM agents can exploit real-world, zero-day vulnerabilities. Prior agents struggle with exploring many different vulnerabilities and with long-range planning when used alone. To resolve this, we introduce HPTSA, a system of agents with a planning agent that can launch subagents. The planning agent explores the system and determines which subagents to call, resolving long-term planning issues when trying different vulnerabilities. We construct a benchmark of 15 real-world vulnerabilities and show that our team of agents improves over prior work by up to 4.5x. — Read More
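The planner-plus-subagents structure described above can be sketched in a few lines. This is a hypothetical illustration, not the paper's actual code: the class names, the keyword-matching "exploit" check, and the dispatch loop are all stand-ins for the LLM-driven components HPTSA actually uses.

```python
from dataclasses import dataclass, field

@dataclass
class Subagent:
    """A specialist for one vulnerability class (stand-in for an LLM agent)."""
    vuln_type: str

    def attempt(self, target: str) -> bool:
        # Illustrative placeholder for an LLM-driven exploit attempt:
        # we simply check whether the target name mentions this class.
        return self.vuln_type in target

@dataclass
class PlanningAgent:
    """Explores targets and decides which specialist subagent to launch."""
    subagents: list = field(default_factory=list)

    def run(self, targets):
        # Long-range planning reduced to its skeleton: for each target,
        # try each specialist until one succeeds, recording the result.
        results = {}
        for target in targets:
            results[target] = next(
                (a.vuln_type for a in self.subagents if a.attempt(target)),
                None,  # no subagent succeeded
            )
        return results

planner = PlanningAgent(
    subagents=[Subagent("sqli"), Subagent("xss"), Subagent("csrf")]
)
print(planner.run(["shop-sqli-demo", "blog-xss-demo", "safe-site"]))
```

The key design idea this sketch captures is the division of labor: the planner owns exploration and sequencing, while each subagent stays narrowly focused on one vulnerability class, which is what the paper credits for the improvement over single-agent baselines.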

#cyber