Software engineers aren’t being replaced; they’re moving from typing code to orchestrating agents, proving that infrastructure matters more than model size.
Boris Cherny, creator of Anthropic’s Claude Code, says he hasn’t written a line of code by hand in months. He shipped 22 pull requests one day, 27 the next, all AI-generated. Company-wide, Anthropic reports that 70 to 90% of its code is now written by AI. CEO Dario Amodei has predicted that AI could handle “most, maybe all” of what software engineers do within months.
And yet Anthropic typically has dozens of software engineering openings, one reportedly carrying $570K in total compensation. As one observer noted, the company is simultaneously predicting the end of the profession and paying top dollar to hire into it. — Read More
Import AI 455: AI systems are about to start building themselves.
AI systems are about to start building themselves. What does that mean?
I’m writing this post because when I look at all the publicly available information, I reluctantly come to the view that there’s a better-than-even (60%+) chance that no-human-involved AI R&D – an AI system powerful enough that it could plausibly autonomously build its own successor – happens by the end of 2028.
This is a big deal.
I don’t know how to wrap my head around it. — Read More
A Final Answer: Is AI Really a Bubble?
With Anthropic having hit a $44 billion run rate, up from “just” $9 billion three months ago, and on a trend line to a $100 billion run rate by the end of the year, the company is putting its business on par with some of the most cash-generating business models of all time. OpenAI’s growth with Codex is just as impressive.
And while I have my counterarguments, one way or another, AI has found some sort of product-market fit, and people have finally put to rest the idea that AI is a bubble.
Well, wrong.
The economic picture in AI is much more complicated than meets the eye; it’s bubbly in ways that people in San Francisco, too smart for their own good, fail to identify. — Read More
AI Outperforms Doctors in Emergency Room Tasks, New Harvard Study Shows
An advanced AI agent has outperformed human physicians on a series of demanding tests that assess the ability to correctly diagnose patient illnesses in clinical settings, a Harvard-led study found. OpenAI’s “o1 preview,” the company’s first model capable of step-by-step reasoning, proved that it could conduct real-world triage in emergency rooms, recommend appropriate diagnostic tests, and perform case-management tasks at a level that matched or surpassed the ability of even well-trained human doctors.
The study, led by Harvard researchers with collaborators at Stanford and published today in Science, suggests an urgent need for controlled trials of the technology, the authors say, to determine how it can be most effectively deployed. — Read More
The Last Software Engineer
For more than a decade, I have taught software engineers how to implement testing, React, Remix, MCP, and more.
I built courses around practice. I would simulate a real work environment: a product manager gives you a task, you read the docs, you work in the codebase, you build the feature, and then you compare your solution with mine.
That was valuable because implementation was valuable.
It still is. But it is becoming less scarce.
AI coding agents are slowly eating away at the tasks software engineers have done for decades. — Read More
What If We Prompted AI for Outcomes Instead of Outputs?
I’ve been to a lot of meetups about AI in the last year. Across all of those there’s been a common refrain that gets repeated by the experts and the newly empowered noobs alike. “If you don’t know how to get what you need out of your AI tool, just ask it.” It’s one of the most powerful aspects of the AI revolution. You can’t ask a hammer how to build a cabinet. You can ask Claude how to build the web app you’ve imagined for the last 20 years.
In all of these cases though, the prompt is always focused on creating a specific thing, an output. However, there’s a question worth sitting with — one we’ve started discussing internally lately. What would it look like to prompt AI for an outcome instead of an output? — Read More
Apple UX Principle: How Simplicity Drives Apple’s 5–10% Conversion Rates
The Apple UX Principle is often misunderstood as a design style defined by minimalism and clean interfaces. In reality, what Apple Inc. has built is far more strategic. It is a system designed to influence how people think, feel, and ultimately decide.
This case study explores how Apple applies five core UX principles (usability, communication, functionality, aesthetics, and emotional connection) to create product experiences that consistently outperform industry benchmarks. More specifically, it examines how these principles contribute to Apple’s estimated 5–10% conversion rates, significantly higher than the typical ecommerce average of 2–3%.
The goal is not to replicate Apple’s design, but to understand the mechanisms behind its performance. — Read More
AI evals are becoming the new compute bottleneck
AI evaluation has crossed a cost threshold that changes who can do it. The Holistic Agent Leaderboard (HAL) recently spent about $40,000 to run 21,730 agent rollouts across 9 models and 9 benchmarks. A single GAIA run on a frontier model can cost $2,829 before caching. Exgentic's $22,000 sweep across agent configurations found a 33× cost spread on identical tasks, isolating scaffold choice as a first-order cost driver, and UK-AISI recently scaled agentic steps into the millions to study inference-time compute. In scientific ML, The Well costs about 960 H100-hours to evaluate one new architecture and 3,840 H100-hours for a full four-baseline sweep. While compression techniques have been proposed for static benchmarks, new agent benchmarks are noisy, scaffold-sensitive, and only partly compressible. Training-in-the-loop benchmarks are expensive by construction, and when you try to add reliability to these evals, repeated runs further multiply the cost. — Read More
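To make the scale of these numbers concrete, here is a rough back-of-envelope sketch in Python using only the figures cited above. The per-rollout cost, the 33× spread, and the repeat multiplier are illustrative arithmetic, not official estimates from HAL or any of the labs mentioned.

```python
# Back-of-envelope arithmetic on the eval costs cited above.
# All inputs come from the article's figures; outputs are illustrative only.

hal_total_usd = 40_000   # HAL's reported spend
hal_rollouts = 21_730    # agent rollouts in that sweep

# Average cost of a single agent rollout in the HAL sweep.
cost_per_rollout = hal_total_usd / hal_rollouts
print(f"~${cost_per_rollout:.2f} per rollout")  # roughly $1.84

# A 33x cost spread on identical tasks means scaffold choice alone can
# swing the bill for the same work by more than an order of magnitude.
cheap_run = cost_per_rollout
expensive_run = cheap_run * 33
print(f"same task: ${cheap_run:.2f} vs ${expensive_run:.2f}")

# Reliability multiplies cost linearly: repeating every rollout k times
# to reduce noise scales the whole sweep's budget by k.
repeats = 5
print(f"{repeats} repeated runs of the HAL sweep: ~${hal_total_usd * repeats:,}")
```

The linear repeat multiplier is the point the article closes on: because agent benchmarks are noisy, the runs needed for statistical confidence turn an expensive eval into a prohibitive one.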
McKinsey and Google Cloud launch the McKinsey Google Transformation Group to scale enterprise impact for the AI era
McKinsey and Google Cloud today announced the McKinsey Google Transformation Group, expanding the two organizations’ long-standing partnership to accelerate enterprise outcomes by enabling AI transformations across domains and industries.
The new group combines McKinsey’s strategy and industry expertise, transformation experience, and technology delivery capabilities with Google Cloud’s AI stack—including compute accelerators, multimodal Gemini models, and Gemini Enterprise—to help clients turn AI ambition into sustained business value. The organizations will deliver this value through joint teams, cofunded value assessments, and outcome-based models, creating a more seamless, end-to-end experience while reducing up-front investment and aligning to measurable results. — Read More
The World Can’t Keep Up With AI Labs
Late last year a new AI psychosis kicked off. This time it was coding agents.
People started saying this is a new era in programming, blah blah blah.
A few months later, we’ve got more than just claims. We’ve got numbers. And they say something unusual is happening in the market.
Coding agents are the first AI product people are paying for at volume and on a recurring basis, because they directly speed up people's work. It’s too early to claim businesses are replacing whole processes with agents across the board. But compute demand has started growing faster than anyone can build it out.
Here’s why this moment is different, why nobody’s ready, and what I took from it personally. — Read More