Building Claude from Scratch: 62 Components Behind Anthropic’s Thinking Engine

In practice, when building agentic systems, AI models are rarely the bottleneck anymore; the harness around them is. Anthropic spent two years building that harness for Claude: the orchestration code that picks the right tools and grades its own work before declaring success. Claude itself is built around 62 carefully composed components spanning machine learning patterns such as compute-optimal allocation, deliberative alignment, and bi-temporal memory, alongside agentic patterns like the OODA loop, plan-and-execute, architect-editor splits, and many others.

Those 62 components that define Claude’s thinking approach are distributed across 4 main principles: Cognition, Orchestration, Reliability, and Grounding and Trust. — Read More

#devops

Ben’s Builds #3 – an email app

What did I build this week?

An email app…

I use Gmail. I've used Superhuman for years and I like it a lot: it's fast, keyboard-first, clean, and good software. But like many SaaS products, it keeps adding features I don't need, and more importantly, I don't need to be paying for email.

I wanted a split inbox and rules to organize my emails.

Kicking off with Codex: — Read More

#devops

The Neural Shortcut to Language

Speech is often viewed as a massive leap in brain complexity, but new research suggests that evolving complex vocalizations might be much simpler than we thought. By comparing the brains of ordinary lab mice with Alston’s singing mice, a Central American species famous for its rapid-fire vocal duets, researchers discovered that the difference isn’t a bigger brain or new regions.

Instead, evolution simply tripled the number of neurons connecting the mouth-movement center to just two key areas. This “minimalist” neural adaptation may mirror the same evolutionary trick that eventually gave humans the gift of language. — Read More

#human

The April every AI plan broke

April was a strange month for anyone who’s been tracking AI pricing. I keep a running file of the meaningful packaging and pricing moves from the major labs. By the third week of April my notes for the month had outgrown the page and started spilling into a separate document. Five major announcements, three of the four biggest providers, all in three weeks, all pointing in roughly the same direction.

… Five panicked moves in three weeks, from three of the four biggest commercial AI providers in the world, with one common thread:

The original design of their subscription plans is being challenged by evolving product capabilities and usage patterns. — Read More

#strategy

The Roadmap to Mastering Tool Calling in AI Agents

Most AI agent failures do not trace back to bad reasoning. The model understands the task, then calls the wrong tool, passes malformed arguments, gets back an unhandled error, and produces a wrong answer anyway. The reasoning layer gets the attention; the tool layer is where production incidents actually happen.

Tool calling — also called function calling — is what bridges a language model’s reasoning to real-world action. Without it, agents are capped by training data: no live queries, no external systems, no side effects. With it, an agent can search the web, call APIs, run code, retrieve documents, and trigger transactions in any system that exposes an interface.
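The failure modes the article lists (wrong tool, malformed arguments, unhandled errors) all live in the dispatch layer between the model's output and the real function. A minimal sketch of that layer, with hypothetical tool names and the common `{"name": ..., "arguments": "<json string>"}` call shape used by most function-calling APIs, might look like this:

```python
import json

# Hypothetical tool; the name and behavior are illustrative only.
def search_web(query: str) -> str:
    return f"results for {query}"  # stand-in for a real search backend

TOOLS = {"search_web": search_web}

def dispatch(tool_call: dict) -> str:
    """Validate and execute a model-emitted tool call, surfacing
    every failure as a string the agent loop can recover from."""
    name = tool_call.get("name")
    if name not in TOOLS:
        return f"error: unknown tool {name!r}"  # wrong tool picked
    try:
        args = json.loads(tool_call.get("arguments", "{}"))
    except json.JSONDecodeError as e:
        return f"error: malformed arguments ({e})"  # bad JSON from the model
    try:
        return TOOLS[name](**args)
    except TypeError as e:
        return f"error: bad argument names ({e})"  # schema mismatch

print(dispatch({"name": "search_web", "arguments": '{"query": "MCP spec"}'}))
```

The point of returning errors as strings rather than raising is that the agent loop can feed them back to the model for a retry instead of crashing mid-task.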

Getting this right means understanding the full stack, not just the happy path. — Read More

#devops

The Architecture Of Local-First Web Development

Last October, I was sitting in a hotel room in Lisbon, the night before I was supposed to demo a project management tool my team had spent four months building. The hotel Wi-Fi was doing that thing where it connects but nothing actually loads. And I watched our app, this thing I was genuinely proud of, render a blank screen with a spinner. Then a timeout error. Then nothing.

I pulled out my phone, tethered to cellular, and got a shaky connection. The app loaded, but every click was a two-second wait. Create a task? Spinner. Move a task between columns? Spinner. I sat there thinking: we built a front end in React, a back end in Node, a Postgres database, a Redis cache, a GraphQL API with six resolvers just for the task board. All that infrastructure, and the damn thing can’t show me my own data without a round-trip to a server 3,000 miles away.

That was the night I started seriously looking at local-first architecture. Not because I read a blog post or saw a tweet. Because I was embarrassed. — Read More

#architecture

Notes from inside China’s AI labs

The Chinese companies building language models are set up as the perfect fast followers for the technology, building on long-standing cultural traditions in education and work, along with subtly different approaches to building technology companies. When you look at the outputs (the latest, biggest models enabling agentic workflows) and the ingredients (excellent scientists, large-scale data, and accelerated computing), the Chinese and American labs look largely similar. The lasting differences emerge in how these are organized and conditioned.

I've long thought that a reason the Chinese labs are so good at catching up and keeping up with the frontier is that they're culturally aligned for this task, but without talking to people directly I felt it wasn't my place to attribute substantial influence to this hunch. Speaking with many wonderful, humble, and open scientists at the leading Chinese labs has crystallized a lot of my beliefs. — Read More

#china-ai

N-Day Research with AI: Using Ollama and n8n

I have been working on N-day research for the past year, focusing specifically on Microsoft components. During this time, I developed several tools to support and streamline my research.

… Since there is a growing trend toward AI-driven analysis, I wanted to evaluate whether an AI model could analyze patched and vulnerable functions and independently identify the underlying vulnerability. This approach could be especially useful for initial triage and enabling faster analysis.
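The workflow described (handing a patched and a vulnerable function to a locally deployed model for triage) can be sketched against Ollama's default HTTP endpoint. The model name here is an assumption (use whatever you have pulled), and only the payload construction is fixed; the actual call requires a running Ollama daemon:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def build_diff_prompt(vuln_fn: str, patched_fn: str) -> dict:
    """Assemble a non-streaming /api/generate payload asking the model
    to describe what the patch fixes. "llama3" is a placeholder model."""
    prompt = (
        "Compare the two functions and describe the vulnerability "
        "the patch fixes.\n"
        f"--- vulnerable ---\n{vuln_fn}\n"
        f"--- patched ---\n{patched_fn}\n"
    )
    return {"model": "llama3", "prompt": prompt, "stream": False}

def analyze(vuln_fn: str, patched_fn: str) -> str:
    """Send the prompt to the local model; needs Ollama running."""
    payload = json.dumps(build_diff_prompt(vuln_fn, patched_fn)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Keeping the payload builder separate from the network call makes it easy to drop the same prompt into an n8n HTTP node instead of calling it from Python.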

So, I decided to experiment with the tools I already have and extend my workflow further. I started by deploying a local LLM and building from there. — Read More

#cyber

The AWS MCP Server is now generally available

I have been building with AI agents and MCP tools for a while now, and one question kept coming up: how do you give an agent real, authenticated access to AWS without handing it the keys to the kingdom? Today, there is an answer.

I’m happy to announce the general availability of the AWS MCP Server, a managed remote Model Context Protocol (MCP) server that gives AI agents and coding assistants secure, authenticated access to all AWS services through a small, fixed set of tools. — Read More
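Under the hood, any MCP client talks to a server like this one in JSON-RPC 2.0; the `tools/list` method below comes from the MCP specification, while the endpoint URL and auth headers you pair it with are deployment-specific. A minimal sketch, with a hypothetical tool name in the test data:

```python
import json

def tools_list_request(request_id: int) -> bytes:
    """Build the JSON-RPC 2.0 message an MCP client sends to
    enumerate a server's tools (method name per the MCP spec)."""
    msg = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/list",
        "params": {},
    }
    return json.dumps(msg).encode()

def parse_tool_names(response_body: bytes) -> list:
    """Pull the tool names out of a tools/list response."""
    result = json.loads(response_body)["result"]
    return [tool["name"] for tool in result.get("tools", [])]
```

In practice you would send these bytes over the server's HTTP transport with your AWS credentials attached rather than hand-rolling the protocol; the sketch just shows what the managed server is speaking underneath.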

#devops

What’s new in IAM: Security, governance, and runtime defense

The AI era demands a fundamental shift in security, and that includes identity and access management (IAM). Traditional controls simply aren’t built for autonomous AI agents that interact with sensitive data at machine speed, a reality we address with our new IAM advancements for the agentic enterprise era.

At Google Cloud Next, we introduced a new security and governance paradigm for managing agent identity and access, engineered as built-in Google Cloud capabilities to secure the rapidly expanding world of AI agents. This comprehensive framework focuses on foundational Agent Identity and an Agent Gateway with Identity-Aware Proxy, while integrating robust agent access management, agent guardrails, and runtime defense to enable a secure cloud environment for your organization. — Read More

#devops