A lot of people say AI will make us all “managers” or “editors”…but I think this is a dangerously incomplete view!
Personally, I’m trying to code like a surgeon.
A surgeon isn’t a manager; they do the actual work! But their skills and time are highly leveraged by a support team that handles prep, secondary tasks, and admin. The surgeon focuses on the important stuff they are uniquely good at.
My current goal with AI coding tools is to spend 100% of my time doing stuff that matters. — Read More
Tag Archives: DevOps
Reasoning Is Not Model Improvement
When OpenAI released o1 in 2024 and called it a “reasoning model,” the industry celebrated a breakthrough. Finally, AI that could think step-by-step, solve complex problems, handle graduate-level mathematics.
But look closer at what’s actually happening under the hood. When you ask o1 to multiply two large numbers, it doesn’t calculate. It generates Python code, executes it in a sandbox, and returns the result. Unlike GPT-3, which at least attempted arithmetic internally (and often failed), o1 explicitly delegates computation to external tools.
This pattern extends everywhere. The autonomy in agentic AI? Chained tool calls like web searches, API invocations, database queries. The breakthrough isn’t in the model’s intelligence. It’s in the orchestration layer coordinating external systems. Everything from reasoning to agentic AI is just a sophisticated application of code generation. These are not model improvements. They’re engineering workarounds for models that stopped improving.
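To make the "orchestration layer" idea concrete, here is a minimal sketch of the pattern the excerpt describes: the model emits a tool call and the surrounding code does the actual computation. Everything here is an assumption for illustration. `fake_model`, `calculator`, `TOOLS`, and `run` are hypothetical names, the "model" is a regex stub, and none of this reflects how o1 or any real product is implemented.

```python
import ast
import operator
import re

# Hypothetical tool: evaluates integer arithmetic without using eval().
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
}


def calculator(expression: str) -> int:
    """Evaluate an expression made of integers, +, - and * only."""
    def evaluate(node: ast.AST) -> int:
        if isinstance(node, ast.Constant) and isinstance(node.value, int):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](evaluate(node.left), evaluate(node.right))
        raise ValueError("unsupported expression")

    return evaluate(ast.parse(expression, mode="eval").body)


# Registry of tools the orchestrator is allowed to call (assumed shape).
TOOLS = {"calculator": calculator}


def fake_model(prompt: str) -> dict:
    """Stand-in for a language model: it never computes, it only delegates."""
    match = re.search(r"(\d+)\s*\*\s*(\d+)", prompt)
    if match:
        return {"tool": "calculator", "input": f"{match.group(1)} * {match.group(2)}"}
    return {"answer": "I don't know."}


def run(prompt: str) -> str:
    """Orchestration layer: route the model's tool call to real code."""
    call = fake_model(prompt)
    if "tool" in call:
        return str(TOOLS[call["tool"]](call["input"]))
    return call["answer"]


if __name__ == "__main__":
    print(run("What is 123456789 * 987654321?"))
```

In this sketch the correct product comes entirely from `calculator`; the stub model only decides which tool to call and with what input, which is the division of labor the excerpt is pointing at.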
This matters because the entire AI industry (from unicorn valuations to trillion-dollar GDP projections) depends on continued model improvement. What we’re getting instead is increasingly elaborate plumbing for fundamentally stagnant foundations. — Read More
Claude Code is unreasonably good at building MVPs
The most valuable code I’ve written in the past six months is code I fully intend to throw away. This isn’t some zen programming philosophy or agile methodology talking point. It’s the practical reality of using Claude Code to build prototypes and MVPs.
Here’s what’s fundamentally changed: the time from “what if we built X” to “here’s a working version of X” has collapsed from weeks or months down to hours or days. That compression doesn’t just make development faster. It changes what kinds of ideas are worth exploring in the first place. — Read More
Google’s URL Context Grounding: Another Nail in RAG’s Coffin?
Google’s hot streak in AI-related releases continues unabated. Just a few days ago, it released a new tool for Gemini called URL context grounding.
URL context grounding can be used stand-alone or combined with Google search grounding to conduct deep dives into internet content.
In a nutshell, it’s a way to programmatically have Gemini read, understand and answer questions about content and data contained in individual web URLs (including those pointing to PDFs) without the need to perform what we know as traditional RAG processing. — Read More
The Claude Code SDK and the Birth of HaaS (Harness as a Service)
As tasks require more autonomous behavior from agents, the core primitive for working with AI is shifting from the LLM API (chat style endpoints) to the Harness API (customizable runtimes). I call this Harness as a Service (HaaS). Quickly build, customize, and share agents via a rich ecosystem of agent harnesses. Today we’ll cover how to customize harnesses to build usable agents quickly + the future of agent development in a world of open harnesses. — Read More
Building a Resilient Event Publisher with Dual Failure Capture
When we set out to rebuild Klaviyo’s event infrastructure, our goal wasn’t just to handle more scale; it was to make the system rock solid. In Part 1 of this series, we shared how we migrated from RabbitMQ to a Kafka-based architecture to process 170,000 events per second at peak without losing data. In Part 2, we dived into how we made event consumers resilient.
This post, Part 3, is all about the Event Publisher, the entry point into our event pipeline. The publisher has an important job: It needs to accept events from hundreds of thousands of concurrent clients, serialize them, keep up with unpredictable traffic spikes, and most importantly, ensure that no event is ever lost. If the publisher isn’t resilient, the rest of the pipeline can’t rely on a steady and complete flow of events. — Read More
Scaling Engineering Teams: Lessons from Google, Facebook, and Netflix
After spending over a decade in engineering leadership roles at some of the world’s most chaotic innovation factories—Google, Facebook, and Netflix—I’ve learned one universal truth: scaling engineering teams is like raising teenagers. They grow fast, develop personalities of their own, and if you don’t set boundaries, suddenly they’re setting the house on fire at 3am.
The difference between teams that thrive at scale and those that collapse into Slack-thread anarchy typically comes down to three key factors:
— Structured goal-setting
— A ruthless focus on code quality
— Intentional culture building
Let me share some lessons I learned from scaling teams at Google, Facebook, and Netflix. — Read More
Agile is Out, Architecture is Back
Software development has always been defined by its extremes. In the early days, we planned everything. Specs were sacred. Architecture diagrams came before a single line of code. And every change felt like steering a cargo ship — slow, bureaucratic, and heavily documented.
Then came Agile, and the pendulum swung hard in the other direction. We embraced speed, iteration, and imperfection. “Working software over comprehensive documentation” became the battle cry of a new generation. Shipping fast was more important than getting it right the first time. And to be fair, that shift unlocked enormous productivity. It changed the culture of software for good.
Now, we’re entering a new era — one driven by AI tools that can generate code from a sentence. Tools like GitHub Copilot and Claude Code are reshaping what it means to be a developer. It’s not just about writing code anymore — it’s about designing the environment in which code gets written.
And that pendulum? It’s swinging back again. — Read More
AI Focus: Interception
This is a very quick post. I had an idea as I was walking the dog this evening, and I wanted to build a functioning demo and write about it within a couple of hours.
While the post and idea started this evening, the genesis of the idea has been brewing for a while and goes back over a year to August 2024, when I wrote about being sucked into a virtual internet. WebSim has been on my mind for a while, because I loved the idea of being able to simulate my own version of the web using the browser directly and not via another web page. And a couple of weeks ago, I managed to work out how to get Puppeteer to intercept requests and respond with content generated via an LLM. — Read More
The Last Programmers
We’re witnessing the final generation of people who translate ideas into code by hand.
I quit my job at Amazon in May to join a startup called Icon.
… I felt like I was reaching the ceiling of what I could learn about AI and building good products within Amazon’s constraints. That’s why I joined Icon. At Icon, we move at a completely different speed. We ship features in days that would have taken Amazon months to approve.
… The interesting part is watching how my teammates work. One of them hasn’t looked at actual code in weeks. Instead, he writes design documents in plain English and trusts AI to handle the implementation. When something needs fixing, he edits the document, not the code.
It made me realize something profound: we’re living through the end of an era where humans translate ideas into code by hand. Within a few years, that skill will be as relevant as knowing how to shoe a horse. — Read More