The Economics of Generative AI: Two Years Later

I am excited to update my original analysis from 2024: The Economics of Generative AI. This is the analysis I come back to more than anything else I’ve written because it’s a reminder of the “physics” of the AI industry.

Two years ago, I found that the Gen AI value chain was inverted: the compute layer captured ~83% of all revenue and ~87% of all gross profit. The application layer, despite being closest to end customers, earned almost nothing. I predicted this would flip over time, following the pattern of every prior platform shift.

In the two years since, the AI ecosystem has grown roughly 5x, from ~$90B to ~$435B in annualized revenue. What’s remarkable is how little the shape of the economics has changed.

The bottom line upfront: Semi is a one-player game. Apps is a two-player game. Infra is the only competitive layer. The most profitable strategy in AI is still selling the shovels. 🙂 — Read More

#investing

Meta-Harness: End-to-End Optimization of Model Harnesses

The performance of large language model (LLM) systems depends not only on model weights, but also on their harness: the code that determines what information to store, retrieve, and present to the model. Yet harnesses are still designed largely by hand, and existing text optimizers are poorly matched to this setting because they compress feedback too aggressively. We introduce Meta-Harness, an outer-loop system that searches over harness code for LLM applications. It uses an agentic proposer that accesses the source code, scores, and execution traces of all prior candidates through a filesystem. On online text classification, Meta-Harness improves over a state-of-the-art context management system by 7.7 points while using 4x fewer context tokens. On retrieval-augmented math reasoning, a single discovered harness improves accuracy on 200 IMO-level problems by 4.7 points on average across five held-out models. On agentic coding, discovered harnesses surpass the best hand-engineered baselines on TerminalBench-2. Together, these results show that richer access to prior experience can enable automated harness engineering. — Read More
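To make the outer loop concrete, here is a toy sketch of the idea, assuming a filesystem archive where each candidate harness is stored with its source, score, and trace, and a proposer that can read all prior candidates before editing the best one. All names here are hypothetical stand-ins, not the paper's actual API; the "harness" is reduced to a single window-size parameter so the loop runs end to end.

```python
import json
import tempfile
from pathlib import Path

def evaluate(harness_src: str) -> float:
    # Stand-in for running a candidate harness on a task suite; here the
    # "harness" is just a window-size parameter and the score peaks at window=8.
    window = int(harness_src.split("=")[1])
    return 1.0 - abs(window - 8) / 8.0

def propose(archive_dir: Path) -> str:
    # Agentic-proposer stand-in: read every prior candidate's source, score,
    # and trace from the filesystem, then edit the best one (here: nudge the
    # window upward; a real proposer would be an LLM editing harness code).
    records = [json.loads(p.read_text()) for p in archive_dir.glob("*.json")]
    best = max(records, key=lambda r: r["score"])
    window = int(best["source"].split("=")[1])
    return f"window={window + 1}"

def search(steps: int = 6) -> float:
    # Outer loop: seed the archive, then propose -> evaluate -> record.
    archive = Path(tempfile.mkdtemp())
    seed = "window=2"
    (archive / "0.json").write_text(json.dumps(
        {"source": seed, "score": evaluate(seed), "trace": "seed run"}))
    for i in range(1, steps + 1):
        src = propose(archive)
        (archive / f"{i}.json").write_text(json.dumps(
            {"source": src, "score": evaluate(src), "trace": f"step {i}"}))
    return max(json.loads(p.read_text())["score"]
               for p in archive.glob("*.json"))
```

The point of the sketch is the access pattern: unlike optimizers that compress feedback into a single summary, the proposer sees the full history of candidates and their execution traces on disk.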

#devops

Backpropagation is simpler than you think (once you see this)

Backpropagation is one of those terms that gets thrown around so much in AI that people assume everyone already understands it.

But most explanations stop at “the network adjusts its weights using gradients” and leave you nodding along without actually knowing what is being computed or why.

In this blog, I’m going to fix that.

We’ll start from scratch and work all the way to a complete, clean idea of every gradient you need. — Read More
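As a taste of what "the network adjusts its weights using gradients" actually means, here is a minimal pure-Python example (my own toy, not the post's code): a one-neuron model y = w·x + b with squared-error loss, where the chain rule yields dL/dw and dL/db in one backward pass, checked against finite differences.

```python
def forward(w, b, x, t):
    y = w * x + b          # prediction
    loss = (y - t) ** 2    # squared-error loss against target t
    return y, loss

def backward(w, b, x, t):
    # Backprop: apply the chain rule from the loss back to each weight.
    y, _ = forward(w, b, x, t)
    dL_dy = 2 * (y - t)    # d(loss)/dy
    dL_dw = dL_dy * x      # chain rule: dL/dw = dL/dy * dy/dw, dy/dw = x
    dL_db = dL_dy * 1.0    # dy/db = 1
    return dL_dw, dL_db

def numeric_grad(w, b, x, t, eps=1e-6):
    # Finite-difference check: backprop should agree with this to ~1e-6.
    gw = (forward(w + eps, b, x, t)[1] - forward(w - eps, b, x, t)[1]) / (2 * eps)
    gb = (forward(w, b + eps, x, t)[1] - forward(w, b - eps, x, t)[1]) / (2 * eps)
    return gw, gb
```

The same chain-rule bookkeeping, applied layer by layer, is all that backprop does at scale.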

#training

Quantum computers need vastly fewer resources than thought to break vital encryption

Building a utility-scale quantum computer that can crack one of the most vital cryptosystems—elliptic curves—doesn’t require nearly the resources anticipated just a year or two ago, two independent whitepapers have concluded. In one, researchers demonstrated the use of neutral atoms as reconfigurable qubits with free access to one another, and showed this approach could allow a quantum computer to break 256-bit elliptic-curve cryptography (ECC) in 10 days with 100 times less overhead than previously estimated. In a second paper, Google researchers demonstrated how to break the ECC securing blockchains for bitcoin and other cryptocurrencies in under nine minutes while achieving a 20-fold resource reduction.

Taken together, the papers are the latest sign of meaningful progress toward utility-scale, cryptographically relevant quantum computing (CRQC). — Read More

#quantum

Architectural Governance at AI Speed

GenAI has slashed the effort required to produce code, and rapid prototyping is increasingly common. As a result, the software development lifecycle is now constrained by an organization’s ability to bring ideas into alignment and maintain cohesion across the system.

Historically, organizations have relied on manual processes and human oversight to achieve architectural cohesion. Startups rely on key individuals to catch misalignment between architectural intent and implementation. Enterprise-level organizations attempt to maintain cohesion through change boards and proliferating ADRs and documentation. In both contexts, identifying misalignment is slow because it requires synchronous dependence on a central authority. In the startup case, development teams are stuck waiting for busy experts. In the enterprise case, they have to wait on review boards and sift through documented guidance with the hope that what they find has not become obsolete.

GenAI exacerbates this by accelerating the production of work that’s subject to review. Where previously only developers were producing code over days or weeks, executives and product managers can now vibe-code functional prototypes in minutes or hours. As a result, development teams are left with an impossible choice: be beholden to the pace of manual oversight at the cost of velocity, or push forward without knowing whether they are aligned.

Over time, these small pushes compound into architectural fragmentation, which the organization responds to with more process and stricter guidelines, which further increase the difficulty of releasing software in alignment. This is a vicious cycle that slows delivery and blunts innovation. — Read More

#architecture