Securing the future of AI agents

AI agents are transforming our relationship with technology. By autonomously executing complex tasks, from cyber defence to scientific discovery and product development, these systems are unlocking a new era of productivity. In the U.S. alone, AI agents could create $2.9 trillion in economic value by 2030.

As these agents become more capable, they also require more sophisticated safeguards. That’s why we developed our AI Control Roadmap: a framework for building and managing the advanced AI we deploy within Google. This “defense-in-depth” approach, which could serve as a model for the wider industry, goes beyond traditional model alignment, adding a crucial layer of system-level security that provides assurance even if alignment is imperfect. — Read More

#trust

Reinforcement learning towards broadly and persistently beneficial models

As AI systems become more capable and autonomous in high-stakes settings like health, science, education, and coding, they will need to remain helpful, honest, transparent, and safe in situations they have not seen before. This requires generalizing to new contexts, new pressures, and longer and more complex interactions, across domains that differ from those seen during training.

We find that reinforcement learning on realistic scenarios targeting beneficial traits can produce broad improvements across dozens of benchmarks measuring aligned and beneficial behavior. These alignment gains generalize beyond the domains used for training and persist under adversarial pressure. — Read More

#strategy

Anthropic Thinks “FOOM” Is Near

Famous AI Doomer Eliezer Yudkowsky first wrote about “Recursive Self Improvement” (RSI from here on) back in December 2008 on LessWrong. For those who don’t know what this means, it is the hypothetical tipping point where AI systems are able to meaningfully contribute to, or even take over, their own training and enhancement. In short, one generation of AIs can give rise to its successors.

Dario Amodei, and Anthropic more broadly, have bought this narrative, hook, line, and sinker. — Read More

#singularity

A New Era of Midjourney

Today we’re gonna announce something a little weird and a little crazy, but also spectacular and filled with hope.

… We’re building a bold new kind of machine to reimagine the foundations of healthcare and our relationships to our bodies.

… It starts by stepping into a shallow pool of golden light. You then begin to descend into the water. Your body passes through a ring of underwater sensors, each acting like a dolphin, using its echolocation. The sensors send ultrasonic sound waves through your body from every angle. With enough waves, and enough angles, we form an image of what’s happening inside your body. — Read More

#human