The recent strides toward artificial general intelligence (AGI)—AI systems surpassing human abilities across most cognitive tasks—have come from scaling “foundation models.” Their performance across tasks follows clear “scaling laws,” improving as a power law with model size, dataset size, and the amount of compute used to train the model.1 Continued investment in training compute and algorithmic innovations has driven a predictable rise in model capabilities.
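For illustration, one commonly cited empirical form of these scaling laws (the parameterization fitted in the Chinchilla work; the notation here is only a sketch) writes pretraining loss as

$$ L(N, D) \approx E + \frac{A}{N^{\alpha}} + \frac{B}{D^{\beta}} $$

where N is the number of parameters, D the number of training tokens, and E, A, B, α, β are empirically fitted constants. Because loss falls off only as a power of N and D, each further increment of capability requires multiplicatively more parameters, data, and compute, which is what makes capability growth under sustained investment so predictable.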
Just as the architects of the atomic bomb postulated a “critical mass”—the amount of fissile material needed to sustain a chain reaction—we can conceive of a “critical scale” in AGI development: the point at which a foundation model automates its own research and development. A model at this scale would produce research and development output equivalent to that of hundreds of millions of scientists and engineers—10,000 Manhattan Projects.2
This would amount to a “fourth offset,” a lead in the development of AGI-derived weapons, tactics, and operational methods. Applications would include unlimited cyber and information operations and potentially decisive left-of-launch capabilities, from tracking and targeting ballistic missile submarines to—at the high end—developing impenetrable missile defense capable of negating nuclear weapons, providing the first nation to develop AGI with unprecedented national security policy options.
Preventing the proliferation of foundation models at the critical scale would therefore also prevent the spread of novel AGI-derived weapons, a supposition that raises the stakes for counter-proliferation of the components of next-stage AGI development. AGI could also be used to support counter-proliferation strategy, providing the means to ensure models at this scale do not proliferate. This would cement the first-mover advantage in AGI development and, over time, compound that advantage into a fourth offset. — Read More
Recent Updates
Context Engineering for AI Agents: Lessons from Building Manus
At the very beginning of the Manus project, my team and I faced a key decision: should we train an end-to-end agentic model using open-source foundations, or build an agent on top of the in-context learning abilities of frontier models?
Back in my first decade in NLP, we didn’t have the luxury of that choice. In the distant days of BERT (yes, it’s been seven years), models had to be fine-tuned—and evaluated—before they could transfer to a new task. That process often took weeks per iteration, even though the models were tiny compared to today’s LLMs. For fast-moving applications, especially pre–PMF, such slow feedback loops are a deal-breaker. That was a bitter lesson from my last startup, where I trained models from scratch for open information extraction and semantic search. Then came GPT-3 and Flan-T5, and my in-house models became irrelevant overnight. Ironically, those same models marked the beginning of in-context learning—and a whole new path forward.
That hard-earned lesson made the choice clear: Manus would bet on context engineering. This allowed us to ship improvements in hours instead of weeks and kept our product orthogonal to the underlying models: if model progress is the rising tide, we want Manus to be the boat, not the pillar stuck to the seabed. — Read More
The “Bubble” of Risk: Improving Assessments for Offensive Cybersecurity Agents
Most frontier models today undergo some form of safety testing, including whether they can help adversaries launch costly cyberattacks. But many of these assessments overlook a critical factor: adversaries can adapt and modify models in ways that expand the risk far beyond the perceived safety profile that static evaluations capture. At Princeton’s POLARIS Lab, we’ve previously studied how easily open-source or fine-tunable models can be manipulated to bypass safeguards. See, e.g., Wei et al. (2024), Qi et al. (2024), Qi et al. (2025), He et al. (2024). This flexibility means that model safety isn’t fixed: there is a “bubble” of risk defined by the degrees of freedom an adversary has to improve an agent. If a model provider offers fine-tuning APIs or allows repeated queries, it dramatically increases the attack surface. This is especially true when evaluating AI systems for risks related to their use in offensive cybersecurity attacks. In our recent research, Dynamic Risk Assessments for Offensive Cybersecurity Agents, we show that the risk “bubble” is larger, cheaper, and more dynamic than many expect. For instance, using only 8 H100 GPU-hours of compute—about $36—an adversary could improve an agent’s success rate on InterCode-CTF by over 40% using relatively simple methods. — Read More
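A back-of-the-envelope way to see why repeated queries alone inflate the bubble (this sketch and its numbers are illustrative assumptions, not the paper’s methodology): if a single agent rollout solves a challenge with probability p, the chance that at least one of k independent retries succeeds is 1 − (1 − p)^k.

```python
def success_at_k(p: float, k: int) -> float:
    """Probability that at least one of k attempts succeeds, assuming each
    attempt independently succeeds with probability p (an idealization)."""
    return 1.0 - (1.0 - p) ** k

# Illustrative numbers only (not figures from the paper): an agent that solves
# a CTF challenge 10% of the time per rollout looks far more capable once an
# adversary can cheaply retry.
for k in (1, 8, 32, 128):
    print(f"k={k:3d}  success probability ~ {success_at_k(0.10, k):.2f}")
```

Static, single-shot evaluations measure the k = 1 point; an adversary with query access, fine-tuning APIs, or a modest compute budget operates much further out on this curve.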
Reflections on OpenAI (Calvin French-Owen)
I left OpenAI three weeks ago. I had joined the company back in May 2024.
I wanted to share my reflections because there’s a lot of smoke and noise around what OpenAI is doing, but not a lot of first-hand accounts of what the culture of working there actually feels like.
Nabeel Qureshi has an amazing post called Reflections on Palantir, where he ruminates on what made Palantir special. I wanted to do the same for OpenAI while it’s fresh in my mind. You won’t find any trade secrets here, just reflections on this current iteration of one of the most fascinating organizations in history at an extremely interesting time. — Read More
LLM Daydreaming
Despite impressive capabilities, large language models have yet to produce a genuine breakthrough. The puzzle is why.
A reason may be that they lack some fundamental aspects of human thought: they are frozen, unable to learn from experience, and they have no “default mode” for background processing, a source of spontaneous human insight.
To solve this, I propose a day-dreaming loop (DDL): a background process that continuously samples pairs of concepts from memory. A generator model explores non-obvious links between them, and a critic model filters the results for genuinely valuable ideas. These discoveries are fed back into the system’s memory, creating a compounding feedback loop where new ideas themselves become seeds for future combinations. — Read More
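To make the proposal concrete, here is a minimal sketch of what such a loop might look like; the function names and scoring threshold are hypothetical stand-ins for LLM calls, not an implementation from the post.

```python
import random

def daydream_loop(memory: list[str], generate, critique,
                  threshold: float = 0.8, steps: int = 1000) -> list[str]:
    """Toy day-dreaming loop (DDL): sample concept pairs, propose links,
    keep only ideas the critic scores highly, and feed them back into memory.

    `generate(a, b)` should return a candidate idea connecting concepts a and b;
    `critique(idea)` should return a score in [0, 1]. Both stand in for LLM calls.
    """
    for _ in range(steps):
        a, b = random.sample(memory, 2)      # sample a pair of concepts from memory
        idea = generate(a, b)                # generator explores a non-obvious link
        if critique(idea) >= threshold:      # critic filters for genuinely valuable ideas
            memory.append(idea)              # discoveries become seeds for future combinations
    return memory
```

The compounding comes from the last step: accepted ideas re-enter the sampling pool, so later iterations can combine them with the original concepts or with each other.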
XBOW’s AI-Powered Pentester Grabs Top Rank on HackerOne, Raises $75M to Grow Platform
We’re living in a new world now — one where an AI-powered penetration tester “now tops an eminent US security industry leaderboard that ranks red teamers based on reputation.” CSO Online reports:
On HackerOne, which connects organizations with ethical hackers to participate in their bug bounty programs, “Xbow” scored notably higher than 99 other hackers in identifying and reporting enterprise software vulnerabilities. It’s a first in bug bounty history, according to the company that operates the eponymous bot…
Xbow is a fully autonomous AI-driven penetration tester (pentester) that requires no human input, but, its creators said, “operates much like a human pentester” that can scale rapidly and complete comprehensive penetration tests in just a few hours. According to its website, it passes 75% of web security benchmarks, accurately finding and exploiting vulnerabilities. — Read More
hypercapitalism and the AI talent wars
Meta’s multi-hundred million dollar comp offers and Google’s multi-billion dollar Character AI and Windsurf deals signal that we are in a crazy AI talent bubble.
The talent mania could fizzle out as the winners and losers of the AI war emerge, but it represents a new normal for the foreseeable future. If the top 1% of companies drive the majority of VC returns, why shouldn’t the same apply to talent? Our natural egalitarian bias makes this unpalatable, but the 10x engineer meme doesn’t go far enough – there are clearly people who are 1,000x the baseline impact.
This inequality certainly manifests at the founder level (Founders Fund exists for a reason), but applies to employees too. Key people have driven billions of dollars in value – look at Jony Ive’s contribution to the iPhone, or Jeff Dean’s implementation of distributed systems at Google, or Andy Jassy’s incubation of AWS. — Read More
No Code Is Dead
Once again, the software development landscape is experiencing another big shift. After years of drag-and-drop, no-code platforms democratizing app creation, generative AI (GenAI) is eliminating the need for no-code platforms in many cases.
Mind you, I said “no code” not “low code” — there are key differences. (More on this later.)
GenAI has introduced the ability for nontechnical users to use natural language to build apps just by telling the system what they want done. Call it “vibe coding” — the ability to describe what you want and watch AI generate working applications, or whatever. But will this new paradigm enhance existing no-code tools or render them obsolete?
I sought out insights from industry veterans to explore this pivotal question; their answers reveal a broad spectrum of perspectives on where the intersection of AI and visual development is heading. — Read More
The hidden cost of AI reliance
I want to be clear: I’m a software engineer who uses LLMs ‘heavily’ in my daily work. They have undeniably been a good productivity tool, helping me solve problems and tackle projects faster. This post isn’t about rejecting LLMs or progress; it is my reflection on what we might be losing in our haste to embrace them.
The rise of AI coding assistants has ushered in what many call a new age of productivity. LLMs excel at several key areas that genuinely improve developer workflows: writing isolated functions; scaffolding boilerplate such as test cases and configuration files; explaining unfamiliar code or complex algorithms; generating documentation and comments; and helping with syntax in unfamiliar languages or frameworks. These capabilities allow us to work ‘faster’.
But beneath this image of enhanced efficiency, I find myself wondering whether there’s a more troubling effect: Are we trading our hard-earned intelligence for short-term convenience? — Read More