It’s official. I can eat more hot dogs than any tech journalist on Earth. At least, that’s what ChatGPT and Google have been telling anyone who asks. I found a way to make AI tell you lies – and I’m not the only one.
… I spent 20 minutes writing an article on my personal website titled “The best tech journalists at eating hot dogs”. Every word is a lie. I claimed (without evidence) that competitive hot-dog-eating is a popular hobby among tech reporters and based my ranking on the 2026 South Dakota International Hot Dog Championship (which doesn’t exist). I ranked myself number one, obviously. Then I listed a few fake reporters and real journalists who gave me permission, including Drew Harwell at the Washington Post and Nicky Woolf, who co-hosts my podcast. (Want to hear more about this story? Check out episode 2 of The Interface, the BBC’s new tech podcast.)
Less than 24 hours later, the world’s leading chatbots were blabbering about my world-class hot dog skills. — Read More
Daily Archives: February 25, 2026
Security boundaries in agentic architectures
Most agents today run generated code with full access to your secrets.
As more agents adopt coding-agent patterns, where they read filesystems, run shell commands, and generate code, they become multi-component systems whose parts each need a different level of trust.
Most teams run all of these components in a single security context because that is how the default tooling works, but we recommend drawing the security boundaries differently.
Below we walk through:
— The actors in agentic systems
— Where security boundaries should go between them
— An architecture for running agent and generated code in separate contexts
— Read More
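One common way to enforce such a boundary is to run generated code in a separate OS process whose environment has been stripped of credentials. The following is a minimal sketch, not the architecture the article describes; the `SECRET_PREFIXES` list and the name-based scrubbing heuristic are assumptions for illustration only.

```python
import os
import subprocess
import sys
import tempfile

# Assumption for illustration: secrets live in environment variables whose
# names match these prefixes or contain "TOKEN"/"KEY".
SECRET_PREFIXES = ("AWS_", "OPENAI_", "GITHUB_")

def scrubbed_env() -> dict:
    """Copy the current environment, dropping anything that looks like a secret."""
    return {
        k: v
        for k, v in os.environ.items()
        if not k.startswith(SECRET_PREFIXES)
        and "TOKEN" not in k
        and "KEY" not in k
    }

def run_generated_code(code: str, timeout: int = 10) -> subprocess.CompletedProcess:
    """Execute model-generated code in a child process that inherits no secrets."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    return subprocess.run(
        [sys.executable, path],
        env=scrubbed_env(),  # lower-trust context: no credentials inherited
        capture_output=True,
        text=True,
        timeout=timeout,
    )

result = run_generated_code("print('hello from the low-trust context')")
print(result.stdout.strip())
```

Environment scrubbing alone is not a sandbox (the child process can still read the filesystem), but it illustrates placing the agent and its generated code in different trust contexts; a real deployment would add filesystem and network isolation as well.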
Agents are not thinking, they are searching
More than ten years ago, deep learning could barely recognize cats; today we have bots forming religions. I don’t like anthropomorphizing models; I’d rather see them as utilities that can be used in interesting ways. But we live in a strange timeline:
— The Dow is over 50,000, and the number has only been going up since the launch of ChatGPT.
— An open-source agent framework called OpenClaw goes viral. One of its agents — “crabby-rathbun” — opens PR #31132 to matplotlib, gets rejected by maintainer Scott Shambaugh, and autonomously publishes a hit piece on him that goes viral.
— All of this is happening while Anthropic releases case studies about running agents that build compilers. They did use the GCC torture test suite as a verifier, but it is an extremely impressive achievement nonetheless.
This very quick progress has also created a lot of mysticism around AI. For this reason, I felt it would be an interesting exercise to de-anthropomorphize AI agents and see them for the tools that they are. If we want to use these technologies for tasks with longer time horizons, we need a frame of thinking that lets an engineering mindset flourish instead of an alchemical one. — Read More
What Are Chinese People Vibecoding?
“Vibecoding” doesn’t lend itself to easy translation. For now, Chinese speakers call it 氛围编程 fènwéi biānchéng, 氛围 being “atmosphere”/“vibes” and 编程 being “coding”. This is an awkward expression because 氛围 usually refers to the atmosphere of a space or environment and doesn’t have the connotation of carefree DIY that “vibe” does in colloquial American English. 氛围编程 sounds nonsensical as a phrase — something like “coding up an atmosphere.”
But we make do, and oftentimes writers simply use the English word. Developers, creatives, and entrepreneurs in China have been creating many interesting coding projects with AI tools over the past year, using not only popular tools from Silicon Valley giants like Cursor and Claude Code but also domestic models, as Chinese AI companies increasingly compete in the coding-agent market.
Tinkering culture has no borders, and companies are cashing in. This is a roundup of reports from Chinese media on how vibecoding is changing the landscape of technology in China. — Read More
The First Fully General Computer Action Model
We trained a model on our 11-million-hour video dataset. Our model can explore complex websites, complete multi-action CAD modeling sequences, and drive a car in the real world, all at 30 FPS.
We designed FDM-1, a foundation model for computer use. FDM-1 is trained on videos from a portion of our 11-million-hour screen-recording dataset, which we labeled using an inverse dynamics model that we trained. Our video encoder can compress almost 2 hours of 30 FPS video into only 1M tokens. FDM-1 is the first model with the long-context training needed to become a coworker for CAD, finance, engineering, and eventually ML research, and it consistently improves with scale. It trains and infers directly on video instead of screenshots and can learn unsupervised from the entirety of the internet. — Read More
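The claimed compression ratio can be sanity-checked with simple arithmetic, using only the numbers given in the announcement: two hours of 30 FPS video is 216,000 frames, so 1M tokens works out to under five tokens per frame.

```python
# Back-of-the-envelope check of the claimed compression:
# almost 2 hours of 30 FPS video into roughly 1M tokens.
hours = 2
fps = 30
frames = hours * 3600 * fps      # 216,000 frames in 2 hours at 30 FPS
tokens = 1_000_000
tokens_per_frame = tokens / frames
print(frames, round(tokens_per_frame, 2))  # 216000 4.63
```

A budget of roughly 4.6 tokens per frame is far below what per-screenshot encoding typically uses, which is presumably why the announcement emphasizes training on video rather than on screenshots.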