In April, Anthropic initiated Project Glasswing. The idea was to let companies use their new model to find and fix vulnerabilities in their own software. It was a fantastic PR move, and so many press outlets have uncritically parroted Anthropic’s claims that it’s now common wisdom that Mythos is better at finding software vulnerabilities than other models. Which is just not true. — Read More
Daily Archives: June 9, 2026
Models inherit a stale web, and they set us back a year
… [T]he models we now write code with learned from a web that is already old. I made the model gap to show this concretely: measured in Chrome releases (I know, I know, the web is far broader than just Chrome, but also Chrome has easy data to access on chromestatus.com), even the freshest model is several versions behind, and most are ten to twenty behind. The “knowledge” cutoff is a serious issue for the web platform, and the ecosystem of libraries and tools that are being launched but are not easily available to these models is massively gaining traction (Claude Code).
That connects to model half-life, where I looked at how quickly models are superseded, and to dead framework theory: if a framework stops appearing in fresh training data, the models stop reaching for it, and the framework quietly dies regardless of its merits. I wrote this thesis at least 6 months ago, and I think I’ve been proven correct (which is why we built Modern Web Guidance). The flip side, though, is that I’ve found guided output getting better than what people create (I think auto-research loops to optimize web performance, as an example, will massively raise the bar for the quality of the web people experience). — Read More
ChatGPT failed to kill Google Search
A year ago it wasn’t clear how AI was going to work out for Alphabet, which missed out on the first-mover advantage held by OpenAI’s ChatGPT.
The fear was that AI competition would eat into traffic for Google’s all-important Search business. And that incorporating AI answers into its own searches could cannibalize revenue, since customers would be less likely to pay for their blue-linked pride of place if people got all their answers up top. Those fears have not materialized. — Read More