In this article, I want to make the case for a structured way to think about Large Language Model (LLM)-based agentic systems (mostly for coding, but also for knowledge work in general) that fixes some of the greatest pains I (and, I'm sure, most of you) have been facing when trying to scale AI-assisted workflows to professional levels.
It’s a system that puts the right constraints in the right places and leaves just enough space for creative exploration (or whatever you want to call what LLMs do when they hallucinate in your favor). It’s also a system that makes it clear you are in charge. — Read More
Tag Archives: Architecture
What Is Claw Code? The Claude Code Rewrite Explained
… On March 31, 2026, security researcher Chaofan Shou noticed something odd in the npm registry. Version 2.1.88 of @anthropic-ai/claude-code had shipped with a 59.8 MB JavaScript source map file attached.
… Within hours of the exposure, mirrored repositories appeared on GitHub. Anthropic began issuing DMCA takedowns. The internet did not wait.
Sigrid Jin (@instructkr) — a Korean developer who had attended Claude Code’s first birthday party in San Francisco in February — published what became claw-code. The repo reached 50,000 stars in two hours, one of the fastest accumulation rates GitHub has recorded.
The important distinction: claw-code is not an archive of the leaked TypeScript. It’s a clean-room Python rewrite, built from scratch by reading the original harness structure and reimplementing the architectural patterns without copying Anthropic’s proprietary source. Jin built it overnight using oh-my-codex, an orchestration layer on top of OpenAI’s Codex, with parallel code review and persistent execution loops.
… The real value here — for builders — isn’t the drama. It’s what the exposed architecture tells us about how production-grade agentic coding systems are actually structured. — Read More
Architectural Governance at AI Speed
GenAI has slashed the effort required to produce code, and rapid prototyping is increasingly common. As a result, the software development lifecycle is now constrained by an organization’s ability to bring ideas into alignment and maintain cohesion across the system.
Historically, organizations have relied on manual processes and human oversight to achieve architectural cohesion. Startups rely on key individuals to catch misalignment between architectural intent and implementation. Enterprise-level organizations attempt to maintain cohesion through change boards and proliferating ADRs and documentation. In both contexts, identifying misalignment is slow because it requires synchronous dependence on a central authority. In the startup case, development teams are stuck waiting for busy experts. In the enterprise case, they have to wait on review boards and sift through documented guidance with the hope that what they find has not become obsolete. GenAI exacerbates this by accelerating the production of work that’s subject to review. Where previously only developers were producing code over days or weeks, executives and product managers can now vibe-code functional prototypes in minutes or hours. As a result, development teams are left with an impossible choice: be beholden to the pace of manual oversight at the cost of velocity, or push forward without knowing whether they are aligned.
Over time, these small pushes compound into architectural fragmentation, which the organization responds to with more process and stricter guidelines, which further increase the difficulty of releasing software in alignment. This is a vicious cycle that slows delivery and blunts innovation. — Read More
AI Applications and Vertical Integration
At a high level, you can think about an AI product that achieves outcomes as having three layers:
1. At the bottom, the model
2. In the middle, the application or agent, which includes the data/context, etc.
3. At the top, the human or service layer needed to review/prompt/do the last mile to actually get to an outcome
… Traditional application layer companies would sit just in the middle layer. But these companies are increasingly beginning to vertically integrate (or starting off vertically integrated) in one of two directions. Some move down into the model layer. Others start or move up into the human or service layer. Both end up looking “full-stack,” just in very different ways. — Read More
AI Infrastructure Roadmap: Five frontiers for 2026
The first generation of AI was built for a world where the model was the product, and progress meant bigger weights, more data, and stellar benchmarks. AI infrastructure mirrored this reality, fueling the rise of giants in foundation models, compute capacity, training techniques, and data ops. This was the focus of our 2024 AI Infrastructure Roadmap, which drove our investments in companies such as Anthropic, Fal AI, Supermaven (acquired by Cursor), and VAPI as the AI infrastructure revolution unfolded.
But the landscape has changed. Big labs are moving beyond chasing benchmark gains to designing AI that interfaces with the real world, and enterprises are graduating from POCs to production. The infrastructure that got us here — which was optimized for scale and efficiency — won’t get us to the next phase. What’s needed now is infrastructure for grounding AI in operational contexts, real-world experience, and continuous learning.
The stage is being set for a new wave of AI infrastructure tools to enable AI to operate in the real world. — Read More
The AI‑Native Blueprint: 4 Architectural Patterns Winning in 2026
AI‑native development isn’t about sprinkling LLM calls on top of an old app. It’s about designing software from the ground up around intelligence, context, reasoning, and autonomy.
I’ve spent the last six months watching teams try to “force” LLMs into legacy architectures. The result is almost always the same: high latency, fragile prompts, and low reliability. We’ve hit a wall where simply adding a chatbot to a side panel no longer counts as innovation.
In the last two years, a clear architectural blueprint has emerged across AI products — from nimble startups to Fortune 500 platforms. If you’re building anything with AI today, these four patterns define how systems are structured to actually survive in production. — Read More
Future Casting the Modern Data Stack
After writing an article a few years ago called “Big Data is Dead,” it feels a bit clichéd to call things “dead.” So I won’t say any such thing about the Modern Data Stack. It does, however, appear very, very sleepy. Someone should go and poke it with a stick.
The Modern Data Stack – deceased or just drowsy?
While we’re all dead in the long run, one thing that is different now is that AI is bringing the “long run” a lot closer than it has ever been. In the last couple of years, AI has forever changed a number of professions that were once thought to be safe from disruption. From art to software engineering, AI is changing how people get things done, and changing things much faster than you’d expect.
… The interesting question to me is, “What comes next?” If we assume models continue to get better, companies capitalize on the opportunities, things get tied together in a nice bow, what does the world look like? What could it look like? Let’s start with what we know. — Read More
How Do You Want to Remember?
I asked my AI agent how it wants to remember things. It redesigned its own memory system, ran a self-eval, diagnosed its blindspots, and improved recall from 60% to 93% — for two dollars. The interesting part isn’t the benchmark. It’s what happens when you treat an AI as a participant in its own cognitive architecture.
I’ve been running ten AI agents for about six weeks. They have names, scopes, daily standups, escalation paths. They file issues, draft newsletters, monitor production services. They remember things. Or they’re supposed to.
The memory system works like this: a markdown file tree (memory/YYYY-MM-DD.md) gets indexed into a SQLite database with Gemini embeddings. 18,000 chunks across 604 files and 6,578 session transcripts. 3.6 gigabytes. Every 29 minutes, a “scout” cron job reads recent sessions and promotes important details to disk. When an agent needs to recall something, it searches the index and gets back ranked snippets.
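The index-and-recall loop described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the author's implementation: `embed` is a toy bag-of-words stand-in for the Gemini embeddings, the table layout, function names, file dates, and sample notes are all hypothetical, and the "scout" cron job is left out.

```python
import json
import math
import re
import sqlite3
from collections import Counter


def embed(text: str) -> dict[str, float]:
    """Toy stand-in for a real embedding model (assumption): unit-length bag of words."""
    counts = Counter(re.findall(r"[a-z0-9]+", text.lower()))
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {word: c / norm for word, c in counts.items()}


def cosine(a: dict[str, float], b: dict[str, float]) -> float:
    # Both vectors are unit length, so the dot product is the cosine similarity.
    return sum(weight * b.get(term, 0.0) for term, weight in a.items())


def build_index(files: dict[str, str]) -> sqlite3.Connection:
    """Split each markdown file into paragraph chunks and store them with their vectors."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE chunks (path TEXT, body TEXT, vec TEXT)")
    for path, content in files.items():
        for chunk in (c.strip() for c in content.split("\n\n")):
            if chunk:
                db.execute(
                    "INSERT INTO chunks VALUES (?, ?, ?)",
                    (path, chunk, json.dumps(embed(chunk))),
                )
    db.commit()
    return db


def search(db: sqlite3.Connection, query: str, k: int = 3):
    """Return the top-k (score, path, snippet) tuples, best match first."""
    q = embed(query)
    rows = db.execute("SELECT path, body, vec FROM chunks").fetchall()
    scored = [(cosine(q, json.loads(vec)), path, body) for path, body, vec in rows]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]


# Hypothetical memory files following the memory/YYYY-MM-DD.md layout.
files = {
    "memory/2026-01-05.md": (
        "Deployed the newsletter service to production.\n\n"
        "The billing cron job failed twice overnight."
    ),
    "memory/2026-01-06.md": "Standup notes: agent Scout escalated the billing failure.",
}
db = build_index(files)
hits = search(db, "billing cron job failed", k=2)
for score, path, snippet in hits:
    print(f"{score:.2f}  {path}  {snippet}")
```

A real setup would swap `embed` for calls to an embedding API and would likely use a vector index rather than scanning every row, but the shape is the same: chunk, embed, store, then rank by similarity at recall time.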
I had no idea if any of this actually worked. — Read More
The Architecture Behind Open-Source LLMs
In December 2024, DeepSeek released V3 with the claim that they had trained a frontier-class model for $5.576 million. They used an attention mechanism called Multi-Head Latent Attention that slashed memory usage. An expert routing strategy avoided the usual performance penalty. Aggressive FP8 training cut costs further.
Within months, Moonshot AI’s Kimi K2 team openly adopted DeepSeek’s architecture as their starting point, scaled it to a trillion parameters, invented a new optimizer to solve a training stability challenge that emerged at that scale, and competed with it across major benchmarks.
Then, in February 2026, Zhipu AI’s GLM-5 integrated DeepSeek’s sparse attention mechanism into their own design while contributing a novel reinforcement learning framework.
This is how the open-weight ecosystem actually works: teams build on each other’s innovations in public, and the pace of progress compounds. To understand why, you need to look at the architecture. — Read More
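To make the memory claim about latent attention concrete, here is a back-of-the-envelope comparison of key-value cache size per token. It is a sketch under assumptions: standard multi-head attention caches a key and a value vector for every head in every layer, while a latent-attention design caches one compressed latent plus a small positional key per layer. The layer count, head sizes, and latent dimensions below are illustrative values chosen for the example, not a published configuration.

```python
def mha_cache_bytes_per_token(n_layers: int, n_heads: int, head_dim: int,
                              bytes_per_value: int = 2) -> int:
    # Standard attention: one key and one value vector per head, per layer.
    return n_layers * 2 * n_heads * head_dim * bytes_per_value


def latent_cache_bytes_per_token(n_layers: int, latent_dim: int, rope_dim: int,
                                 bytes_per_value: int = 2) -> int:
    # Latent attention: one shared compressed latent plus a small positional key, per layer.
    return n_layers * (latent_dim + rope_dim) * bytes_per_value


# Illustrative (assumed) dimensions, 2 bytes per cached value.
mha = mha_cache_bytes_per_token(n_layers=60, n_heads=128, head_dim=128)
latent = latent_cache_bytes_per_token(n_layers=60, latent_dim=512, rope_dim=64)
ratio = mha / latent

print(f"standard cache: {mha:,} bytes per token")
print(f"latent cache:   {latent:,} bytes per token")
print(f"reduction:      {ratio:.1f}x")
```

With these assumed numbers the per-token cache shrinks by more than fifty times, which is why the technique matters most for long contexts, where the cache rather than the weights tends to dominate inference memory.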
The February Reset: Three Labs, Four Models, and the End of “One Best AI”
February 5th, 2026. Anthropic ships Claude Opus 4.6. Same day, OpenAI drops GPT-5.3-Codex. Twelve days later, Anthropic follows with Sonnet 4.6. Two days after that, Google fires back with Gemini 3.1 Pro.
Four frontier models. Three labs. Fourteen days.
When the dust settled, something genuinely new had happened: no single model won. Not on benchmarks. Not on user preference. Not on price. Not on coding. For the first time in the frontier AI race, the leaderboard fractured into distinct lanes, and the “which model is best?” question stopped having a coherent answer.
This article maps who won what, where each model fails, and how the February shakeup changes the way you should think about your model stack. No cheerleading for any provider. Just the numbers and the trade-offs. — Read More