The New Calculus of AI-based Coding

Over the past three months, a team of experienced, like-minded engineers and I have been building something really cool within Amazon Bedrock. While I’m pretty excited about what we are building, there is something else distinctive about our team: most of our code is written by AI agents such as Amazon Q or Kiro. Before you roll your eyes: no, we’re not vibe coding. I don’t believe that’s the right way to build robust software.

Instead, we use an approach where a human and an AI agent collaborate to produce the code changes. For our team, every commit has an engineer’s name attached to it, and that engineer ultimately needs to review and stand behind the code. We use steering rules to set up constraints for how the AI agent should operate within our codebase, and writing in Rust has been a great benefit. The Rust compiler is famous for its focus on correctness and safety, catching many problems at compile time and emitting clear error messages that help the agent iterate. In contrast to vibe coding, I prefer the term “agentic coding.” Much less exciting, but in our industry, boring is usually good. — Read More

#devops

Introducing vibe coding in Google AI Studio

We’ve been building a better foundation for AI Studio, and this week we introduced a totally new AI-powered vibe coding experience in Google AI Studio. This redesigned experience is meant to take you from prompt to working AI app in minutes, without making you juggle API keys or figure out how to tie models together. — Read More

#devops

Code like a surgeon

A lot of people say AI will make us all “managers” or “editors”…but I think this is a dangerously incomplete view!

Personally, I’m trying to code like a surgeon.

A surgeon isn’t a manager; they do the actual work! But their skills and time are highly leveraged by a support team that handles prep, secondary tasks, and admin. The surgeon focuses on the important stuff they are uniquely good at.

My current goal with AI coding tools is to spend 100% of my time doing stuff that matters.  — Read More

#devops

Reasoning Is Not Model Improvement

When OpenAI released o1 in 2024 and called it a “reasoning model,” the industry celebrated a breakthrough. Finally, AI that could think step-by-step, solve complex problems, handle graduate-level mathematics.

But look closer at what’s actually happening under the hood. When you ask o1 to multiply two large numbers, it doesn’t calculate. It generates Python code, executes it in a sandbox, and returns the result. Unlike GPT-3, which at least attempted arithmetic internally (and often failed), o1 explicitly delegates computation to external tools.

This pattern extends everywhere. The autonomy in agentic AI? Chained tool calls like web searches, API invocations, database queries. The breakthrough isn’t in the model’s intelligence. It’s in the orchestration layer coordinating external systems. Everything from reasoning to agentic AI is just a sophisticated application of code generation. These are not model improvements. They’re engineering workarounds for models that stopped improving.
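
To make that concrete, here is a minimal, hypothetical sketch of the pattern being described: the model emits tool calls, and an orchestration layer outside the model executes them (a code sandbox here, but equally a web search or a database query). The tool names and dispatch table are illustrative, not any particular vendor’s API.

```python
import subprocess
import sys

def run_python(code: str) -> str:
    """Execute model-generated code in a subprocess (a stand-in for a sandbox)."""
    result = subprocess.run([sys.executable, "-c", code],
                            capture_output=True, text=True, timeout=5)
    return result.stdout.strip() or result.stderr.strip()

# The "orchestration layer": a dispatch table of external tools the harness can call.
TOOLS = {
    "python": run_python,
    # "web_search": ..., "sql": ...   # other tools chained in agentic workflows
}

def orchestrate(tool_calls: list[dict]) -> list[str]:
    """Run each tool call the model emits and collect the results."""
    return [TOOLS[call["tool"]](call["input"]) for call in tool_calls]

# Asked to multiply two large numbers, the model emits a tool call rather than
# doing the arithmetic in its weights:
calls = [{"tool": "python", "input": "print(123456789 * 987654321)"}]
print(orchestrate(calls))   # ['121932631112635269']
```

The answer comes from the tool, and the routing lives in plain code around the model, which is exactly the argument about plumbing versus model improvement.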

This matters because the entire AI industry (from unicorn valuations to trillion-dollar GDP projections) depends on continued model improvement. What we’re getting instead is increasingly elaborate plumbing for fundamentally stagnant foundations. — Read More

#devops

Claude Code is unreasonably good at building MVPs

The most valuable code I’ve written in the past six months is code I fully intend to throw away. This isn’t some zen programming philosophy or agile methodology talking point. It’s the practical reality of using Claude Code to build prototypes and MVPs.

Here’s what’s fundamentally changed: the time from “what if we built X” to “here’s a working version of X” has collapsed from weeks or months down to hours or days. That compression doesn’t just make development faster. It changes what kinds of ideas are worth exploring in the first place. — Read More

#devops

Google’s URL Context Grounding: Another Nail in RAG’s Coffin?

Google’s hot streak in AI-related releases continues unabated. Just a few days ago, it released a new tool for Gemini called URL context grounding. 

URL context grounding can be used stand-alone or combined with Google search grounding to conduct deep dives into internet content.

In a nutshell, it’s a way to programmatically have Gemini read, understand and answer questions about content and data contained in individual web URLs (including those pointing to PDFs) without the need to perform what we know as traditional RAG processing. — Read More
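
For readers who want to try it, here is a minimal sketch using the google-genai Python SDK as I understand it; the model name, the URL, and the exact tool-configuration shape are assumptions, so check the official docs before relying on it.

```python
from google import genai
from google.genai import types

client = genai.Client()  # assumes the API key is set in the environment

response = client.models.generate_content(
    model="gemini-2.5-flash",  # illustrative model choice
    contents="Summarize the key findings in https://example.com/report.pdf",
    config=types.GenerateContentConfig(
        tools=[
            types.Tool(url_context=types.UrlContext()),  # let Gemini read the URLs in the prompt
            # types.Tool(google_search=types.GoogleSearch()),  # optionally combine with search grounding
        ]
    ),
)
print(response.text)
```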

#devops

The Claude Code SDK and the Birth of HaaS (Harness as a Service)

As tasks require more autonomous behavior from agents, the core primitive for working with AI is shifting from the LLM API (chat-style endpoints) to the Harness API (customizable runtimes). I call this Harness as a Service (HaaS): quickly build, customize, and share agents via a rich ecosystem of agent harnesses. Today we’ll cover how to customize harnesses to build usable agents quickly, plus the future of agent development in a world of open harnesses. — Read More
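
To illustrate what a “harness” means in practice, here is a deliberately generic, hypothetical sketch (not the Claude Code SDK’s actual interface): the primitive you configure is a runtime that owns the agent loop (system prompt, tools, turn budget) rather than a single chat-completion call. Every name below is illustrative.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Harness:
    """A customizable agent runtime: prompt, tools, and a turn budget."""
    system_prompt: str
    tools: dict[str, Callable[[str], str]] = field(default_factory=dict)
    max_turns: int = 8

    def run(self, task: str, call_model: Callable[[str], dict]) -> str:
        context = f"{self.system_prompt}\n\nTask: {task}"
        for _ in range(self.max_turns):
            step = call_model(context)              # the model decides the next action
            if step["type"] == "final":
                return step["content"]
            tool_output = self.tools[step["tool"]](step["input"])
            context += f"\n[{step['tool']}] {tool_output}"  # results feed back into the loop
        return "turn budget exhausted"

# Exercise the loop with a stub "model" that answers immediately.
harness = Harness(system_prompt="You are a code-review agent.",
                  tools={"grep": lambda query: f"3 matches for {query!r}"})
print(harness.run("Find TODOs in the repo",
                  call_model=lambda ctx: {"type": "final", "content": "No TODOs found."}))
```

Swap in a real model call and real tools and the shape holds; the point of HaaS is that this loop, not the raw chat endpoint, becomes the thing you build, customize, and share.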

#devops

Building a Resilient Event Publisher with Dual Failure Capture

When we set out to rebuild Klaviyo’s event infrastructure, our goal wasn’t just to handle more scale; it was to make the system rock solid. In Part 1 of this series, we shared how we migrated from RabbitMQ to a Kafka-based architecture to process 170,000 events per second at peak without losing data. In Part 2, we dived into how we made event consumers resilient.

This post, Part 3, is all about the Event Publisher, the entry point into our event pipeline. The publisher has an important job: It needs to accept events from hundreds of thousands of concurrent clients, serialize them, keep up with unpredictable traffic spikes, and most importantly, ensure that no event is ever lost. If the publisher isn’t resilient, the rest of the pipeline can’t rely on a steady and complete flow of events. — Read More
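
The post covers Klaviyo’s specific design; purely as an illustration of the “never lose an event” requirement, here is one common shape for failure capture around an asynchronous Kafka producer, using confluent-kafka’s delivery callbacks. The topic name, fallback paths, and the idea of a second, independent capture sink are my assumptions, not Klaviyo’s implementation.

```python
import json
from confluent_kafka import Producer

producer = Producer({
    "bootstrap.servers": "localhost:9092",  # assumption: local broker for illustration
    "enable.idempotence": True,             # avoid duplicates on retries
})

def capture_failure(payload: bytes, err) -> None:
    """Persist events that Kafka could not accept so they can be replayed later."""
    try:
        with open("failed_events.log", "ab") as f:  # durable local fallback
            f.write(payload + b"\n")
    except OSError:
        # Second, independent capture path if the local disk write also fails.
        print(json.dumps({"lost_event": payload.decode(), "error": str(err)}))

def on_delivery(err, msg):
    """Delivery report callback: runs asynchronously once the broker acks or rejects."""
    if err is not None:
        capture_failure(msg.value(), err)

def publish(event: dict) -> None:
    payload = json.dumps(event).encode()
    try:
        producer.produce("events", value=payload, on_delivery=on_delivery)
    except BufferError as exc:
        # Local send queue is full (e.g. a traffic spike): capture instead of dropping.
        capture_failure(payload, exc)
    producer.poll(0)  # serve delivery callbacks without blocking the request path

publish({"type": "email_opened", "profile_id": "123"})
producer.flush(5)
```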

#devops

Scaling Engineering Teams: Lessons from Google, Facebook, and Netflix

After spending over a decade in engineering leadership roles at some of the world’s most chaotic innovation factories—Google, Facebook, and Netflix—I’ve learned one universal truth: scaling engineering teams is like raising teenagers. They grow fast, develop personalities of their own, and if you don’t set boundaries, suddenly they’re setting the house on fire at 3am.

The difference between teams that thrive at scale and those that collapse into Slack-thread anarchy typically comes down to three key factors:

— Structured goal-setting
— A ruthless focus on code quality
— Intentional culture building

Let me share some lessons I learned from scaling teams at Google, Facebook, and Netflix. — Read More

#devops

Agile is Out, Architecture is Back

Software development has always been defined by its extremes. In the early days, we planned everything. Specs were sacred. Architecture diagrams came before a single line of code. And every change felt like steering a cargo ship — slow, bureaucratic, and heavily documented.

Then came Agile, and the pendulum swung hard in the other direction. We embraced speed, iteration, and imperfection. “Working software over comprehensive documentation” became the battle cry of a new generation. Shipping fast was more important than getting it right the first time. And to be fair, that shift unlocked enormous productivity. It changed the culture of software for good.

Now, we’re entering a new era — one driven by AI tools that can generate code from a sentence. Tools like GitHub Copilot and Claude Code are reshaping what it means to be a developer. It’s not just about writing code anymore — it’s about designing the environment in which code gets written.

And that pendulum? It’s swinging back again. — Read More

#devops