2026: The Year The IDE Died

Read More
#videos

The Anthropic Hive Mind

… If you run some back-of-envelope math on how hard it is to get into Anthropic, as an industry professional, and compare it to your odds of making it as a HS or college player into the National Football League, you’ll find the odds are comparable. Everyone I’ve met from Anthropic is the best of the best of the best, to an even crazier degree than Google was at its peak. (Evidence: Google hired me. I was the scrapest of the byest.)

…Everyone you talk to from Anthropic will eventually mention the chaos. It is not run like any other company of this size. Every other company quickly becomes “professional” and compartmentalized and accountable and grown-up and whatnot at their size. … Anthropic is completely run by vibes. — Read More

#strategy

AI Pioneer: The Bubble Is Real And Could Trigger an AI Winter | Andrew Ng

Read More
#videos

13 thoughts on Anthropic, OpenAI and the Department of War

When I went to bed last night, it appeared that Secretary of War Pete Hegseth (it still feels surreal to type that phrase) had potentially undermined American competitiveness by instructing the federal government not to use Claude and designating the company behind it, Anthropic, as a supply chain risk, a move that could force divestment in Anthropic from Nvidia, Amazon, Google and other companies that contract with the federal government. Was the military going to be stuck using Elon Musk’s Grok, a model that has its uses but is decidedly not on the lead lap and is reportedly considered too unreliable for classified settings?

Nope. Instead, I awoke to news that the Pentagon had reached an agreement with Anthropic rival OpenAI. (And also that we were bombing Iran.) This is at least a little bit more rational, which is not to say that you should feel happy about any of this. The story is complicated and is still developing; Anthropic will take its case to court and the government could TACO out. (For instance, by signing the deal with OpenAI but unbanning Claude.)

Nevertheless, the intersection of AI and politics falls squarely into the Silver Bulletin wheelhouse, something I’m sure we’ll be covering more and more. — Read More

#dod

AI chatbots chose nuclear escalation in 95% of simulated war games, study finds

At least one AI model in every war game escalated the conflict by threatening to use nuclear weapons, the study found.

Artificial intelligence could dramatically change how nuclear crises are handled, according to a new study.

The pre-print study from King’s College London pitted OpenAI’s ChatGPT, Anthropic’s Claude and Google’s Gemini Flash against each other in simulated war games. Each large language model took on the role of a national leader commanding a nuclear-armed superpower in a Cold War-style crisis.

In every game, at least one model attempted to escalate the conflict by threatening to detonate a nuclear weapon. — Read More

#strategy

Large-Scale Online Deanonymization with LLMs

TL;DR: We show that LLM agents can figure out who you are from your anonymous online posts. Across Hacker News, Reddit, LinkedIn, and anonymized interview transcripts, our method identifies users with high precision – and scales to tens of thousands of candidates.

While it has been known that individuals can be uniquely identified by surprisingly few attributes, this was often practically limited. Data is often only available in unstructured form and deanonymization used to require human investigators to search and reason based on clues. We show that from a handful of comments, LLMs can infer where you live, what you do, and your interests – then search for you on the web. In our new research, we show that this is not only possible but increasingly practical. — Read More

Read the Paper

#privacy

The Architecture Behind Open-Source LLMs

In December 2024, DeepSeek released V3 with the claim that they had trained a frontier-class model for $5.576 million. They used an attention mechanism called Multi-Head Latent Attention that slashed memory usage. An expert routing strategy avoided the usual performance penalty. Aggressive FP8 training cut costs further.

Within months, Moonshot AI’s Kimi K2 team openly adopted DeepSeek’s architecture as their starting point, scaled it to a trillion parameters, invented a new optimizer to solve a training stability challenge that emerged at that scale, and competed with it across major benchmarks.

Then, in February 2026, Zhipu AI’s GLM-5 integrated DeepSeek’s sparse attention mechanism into their own design while contributing a novel reinforcement learning framework.

This is how the open-weight ecosystem actually works: teams build on each other’s innovations in public, and the pace of progress compounds. To understand why, you need to look at the architecture. — Read More

#architecture

The February Reset: Three Labs, Four Models, and the End of “One Best AI”

February 5th, 2026. Anthropic ships Claude Opus 4.6. Same day, OpenAI drops GPT-5.3-Codex. Twelve days later, Anthropic follows with Sonnet 4.6. Two days after that, Google fires back with Gemini 3.1 Pro.

Four frontier models. Three labs. Fourteen days.

When the dust settled, something genuinely new had happened: no single model won. Not on benchmarks. Not on user preference. Not on price. Not on coding. For the first time in the frontier AI race, the leaderboard fractured into distinct lanes, and the “which model is best?” question stopped having a coherent answer.

This article maps who won what, where each model fails, and how the February shakeup changes the way you should think about your model stack. No cheerleading for any provider. Just the numbers and the trade-offs. — Read More

#architecture

MCP is dead. Long live the CLI

I’m going to make a bold claim: MCP is already dying. We may not fully realize it yet, but the signs are there. OpenClaw doesn’t support it. Pi doesn’t support it. And for good reason.

When Anthropic announced the Model Context Protocol, the industry collectively lost its mind. Every company scrambled to ship MCP servers as proof they were “AI first.” Massive resources poured into new endpoints, new wire formats, new authorization schemes, all so LLMs could talk to services they could already talk to.

I’ll admit, I never fully understood the need for it. You know what LLMs are really good at? Figuring things out on their own. Give them a CLI and some docs and they’re off to the races.

I tried to avoid writing this for a long time, but I’m convinced MCP provides no real-world benefit, and that we’d be better off without it. Let me explain. — Read More

#devops

The third era of AI software development

When we started building Cursor a few years ago, most code was written one keystroke at a time. Tab autocomplete changed that and opened the first era of AI-assisted coding.

Then agents arrived, and developers shifted to directing agents through synchronous prompt-and-response loops. That was the second era. Now a third era is arriving. It is defined by agents that can tackle larger tasks independently, over longer timescales, with less human direction.

As a result, Cursor is no longer primarily about writing code. It is about helping developers build the factory that creates their software. This factory is made up of fleets of agents that they interact with as teammates: providing initial direction, equipping them with the tools to work independently, and reviewing their work.

Many of us at Cursor are already working this way. More than one-third of the PRs we merge are now created by agents that run on their own computers in the cloud. A year from now, we think the vast majority of development work will be done by these kinds of agents. — Read More

#devops