There is currently a social media and industry zeitgeist dialed in on CLIs, just as there was a moment for MCP only a few short months ago.
While it is true that there are token savings to be had by using a CLI, many folks have not considered how agents using custom CLIs run into the same context problem as MCP, except now without structure and with many other sacrifices.
In much of the discourse, there is a lack of distinction between local MCP over stdio and server MCP over HTTP; the latter is a very different use case.
… The oversight made by many is that individual usage of coding agents looks very different from organizational adoption of coding agents, where the emphasis is on visibility, telemetry, security, quality, and the ability of a team with varying degrees of skill and experience to operationalize and maintain agent-coded systems.
For enterprise and org-level use cases, MCP is the present and future and teams need to be able to cut through the hype of the moment. — Read More
Tag Archives: DevOps
He Wrote 200 Lines of Code and Walked Away (What happened Next will blow your Mind)
Let me tell you a story that’s going to mess with your head a little bit.
A developer named Liyuanhao sat down and wrote 200 lines of code in Rust.
That’s it. Just a tiny, bare-bones script.
But what happened after he hit run is the kind of thing you have to read twice just to make sure you aren’t imagining things.
He named the project yoyo — a self-evolving coding agent. And then, and this is the part that genuinely gets me, he stepped away entirely. He took his hands off the keyboard.
He gave it one single instruction: evolve until you rival Claude Code. Then, he just sat back and watched. — Read More
How A Regular Person Can Utilize AI Agents
Let’s do this again, redux! I’ll explain how to use AI agents for easy language learning, to create an easier version of my morning briefing, and finally, a far easier version of my briefing transcription -> summary -> action pipeline. In the process, my goal is to help readers remix the general principles for their own (mostly safe) agents.
My last piece about AI agents was my most popular and widely shared article to date. Usually, one writes a “Part 1” that’s easier and a “Part 2” that’s more complex. This is the exact opposite.
… So, in this revisit, I have these goals:
— Explain the general principles of creating agents (more slowly)
— Use methods that are more accessible to non-technical users.
— Give a framework for remixing these methods for readers’ own ideas/agents.
Ironically, this piece took longer than my last one. Instead of just sharing my workflows, this piece is designed to let you use these agents with step-by-step instructions, from scratch, and have them adapted to you (not me). — Read More
Andrej Karpathy’s new open source ‘autoresearch’ lets you run hundreds of AI experiments a night — with revolutionary implications
Over the weekend, Andrej Karpathy, the influential former Tesla AI lead and co-founder and former member of OpenAI who coined the term “vibe coding,” posted on X about his new open source project, autoresearch.
It wasn’t a finished model or a massive corporate product: it was, by his own admission, a simple, 630-line script made available on GitHub under a permissive, enterprise-friendly MIT License. But the ambition was massive: automating the scientific method with AI agents while we humans sleep. — Read More
Perplexity turns your Mac mini into a 24/7 AI agent
Two weeks after launching Perplexity Computer, a cloud-based AI agent that can orchestrate 20 frontier models to execute multi-step workflows autonomously, the company used its inaugural Ask 2026 developer conference in San Francisco on Wednesday to dramatically widen the platform’s reach.
The centrepiece of the announcement is Personal Computer: software that runs continuously on a user-supplied Mac mini, merging local files, apps, and sessions with Perplexity’s cloud-based Computer system. — Read More
The 8 Levels of Agentic Engineering
AI’s coding ability is outpacing our ability to wield it effectively. That’s why all the SWE-bench score maxxing isn’t syncing with the productivity metrics engineering leadership actually cares about. When Anthropic’s team ships a product like Cowork in 10 days and another team can’t move past a broken POC using the same models, the difference is that one team has closed the gap between capability and practice and the other hasn’t.
That gap doesn’t close overnight. It closes in levels. 8 of them. Most of you reading this are likely past the first few, and you should be eager to reach the next one because each subsequent level is a huge leap in output, and every improvement in model capability amplifies those gains further.
Level 1: Tab Complete
Level 2: Agent IDE
Level 3: Context Engineering
Level 4: Compounding Engineering
Level 5: MCP and Skills
Level 6: Harness Engineering
Level 7: Background Agents
Level 8: Autonomous Agent Teams
— Read More
The Capability Maturity Model for AI in Design
Matt Davey, who is Chief Experience Officer at 1Password, created a useful capability maturity model for AI in design. His original model has 5 levels (Limited, Reactive, Developing, Embedded, and Leading), each of which differs along 6 characteristics (Leadership on AI, Strategy & Budgeting, AI Culture & Talent, AI Learning & Enablement, AI Agents & Automation, and AI Product Design). Thus, the model covers both the use of AI within the design process and the use of AI in the resulting product. I recommend you read the full thing, but here is a summary of Davey’s 5 capability maturity levels for AI in design.
As discussed below, I added Maturity Level 6, Symbiotic, for a more complete capability maturity ladder.
For a summary of this article, watch my short overview explainer video (YouTube, 6 min.). — Read More
You Need to Rewrite Your CLI for AI Agents
I built a CLI for Google Workspace — agents first. Not “built a CLI, then noticed agents were using it.” From Day One, the design assumptions were shaped by the fact that AI agents would be the primary consumers of every command, every flag, and every byte of output.
CLIs are increasingly the lowest-friction interface for AI agents to reach external systems. Agents don’t need GUIs. They need deterministic, machine-readable output, self-describing schemas they can introspect at runtime, and safety rails against their own hallucinations. — Read More
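To make the three requirements in the excerpt above concrete, here is a minimal sketch of what an agent-first command might look like. It is not the author's Google Workspace CLI: the command name `files.list`, the flags, and the schema layout are all hypothetical, chosen only to illustrate deterministic JSON output, a schema the agent can request at runtime, and validation that rejects malformed (possibly hallucinated) input before anything is executed.

```python
import argparse
import json
import re

# Hypothetical schema for one illustrative command. Every name and field
# here is an assumption made for this sketch, not a real tool's contract.
SCHEMA = {
    "command": "files.list",
    "params": {
        "folder_id": {"type": "string", "pattern": "^[A-Za-z0-9_-]+$", "required": True},
        "limit": {"type": "integer", "minimum": 1, "maximum": 100, "required": False},
    },
}


def validate(params):
    """Return a list of error strings; an empty list means the input is acceptable."""
    errors = []
    spec = SCHEMA["params"]
    folder_id = params.get("folder_id")
    if not folder_id or not re.fullmatch(spec["folder_id"]["pattern"], folder_id):
        errors.append("folder_id must match " + spec["folder_id"]["pattern"])
    limit = params.get("limit")
    low, high = spec["limit"]["minimum"], spec["limit"]["maximum"]
    if not isinstance(limit, int) or not low <= limit <= high:
        errors.append(f"limit must be an integer between {low} and {high}")
    return errors


def run(argv):
    """Run the command on argv and return (exit_code, json_text)."""
    parser = argparse.ArgumentParser(prog="demo-cli")
    parser.add_argument("--schema", action="store_true", help="print the command schema as JSON")
    parser.add_argument("--folder-id")
    parser.add_argument("--limit", type=int, default=10)
    args = parser.parse_args(argv)

    # Self-description: the agent can ask what the command accepts.
    if args.schema:
        return 0, json.dumps(SCHEMA, sort_keys=True)

    # Safety rail: refuse malformed input with a structured error.
    errors = validate({"folder_id": args.folder_id, "limit": args.limit})
    if errors:
        return 2, json.dumps({"ok": False, "errors": errors}, sort_keys=True)

    # Stubbed result; a real tool would call the remote API at this point.
    result = {"ok": True, "folder_id": args.folder_id, "limit": args.limit, "items": []}
    return 0, json.dumps(result, sort_keys=True)
```

Calling `run(["--schema"])` returns the schema as JSON, while `run(["--folder-id", "../secret"])` returns exit code 2 and a JSON error object rather than free-form text. Sorting keys keeps the output byte-stable across runs, which is one simple way to get the determinism the excerpt asks for.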
Not Prompts, Blueprints
I hate to micromanage & I’ve been micromanaging AI.
A few months ago, I’d use Claude for a familiar workflow : capturing notes from a meeting, drafting a follow-up email, updating the CRM, writing the investment memo. Micromanagement at 10x speed. The agent would finish a step, then wait. I’d scan the output, type the next instruction, wait again. Prompt, response, prompt, response. I was the bottleneck in my own system.
A year ago, this was necessary. The models couldn’t hold a complex task in their heads. Now they can.
But this leverage requires planning. Now I sketch the workflow before I touch the machine. — Read More
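One way to picture "sketching the workflow first" is to write the steps and their dependencies down as data, then hand the agent a single ordered plan instead of prompting step by step. The sketch below is an illustration only, not the author's method: the step names are taken from the workflow described in the excerpt, while the dependency choices and the helper functions `plan` and `render_prompt` are assumptions.

```python
from graphlib import TopologicalSorter

# Hypothetical blueprint: each step maps to the set of steps it depends on.
# The dependencies are assumed for illustration.
BLUEPRINT = {
    "capture_notes": set(),
    "draft_followup_email": {"capture_notes"},
    "update_crm": {"capture_notes"},
    "write_investment_memo": {"capture_notes", "update_crm"},
}


def plan(blueprint):
    """Return the steps in an order that respects every dependency."""
    return list(TopologicalSorter(blueprint).static_order())


def render_prompt(blueprint):
    """Turn the blueprint into one numbered instruction block for an agent."""
    lines = ["Complete these steps in order without waiting for confirmation:"]
    for number, step in enumerate(plan(blueprint), start=1):
        lines.append(f"{number}. {step.replace('_', ' ')}")
    return "\n".join(lines)
```

`render_prompt(BLUEPRINT)` yields one instruction block that starts with capturing notes and places the memo after the CRM update. The relative order of independent steps, such as the email and the CRM update, is left to the sorter.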
MCP is dead. Long live the CLI
I’m going to make a bold claim: MCP is already dying. We may not fully realize it yet, but the signs are there. OpenClaw doesn’t support it. Pi doesn’t support it. And for good reason.
When Anthropic announced the Model Context Protocol, the industry collectively lost its mind. Every company scrambled to ship MCP servers as proof they were “AI first.” Massive resources poured into new endpoints, new wire formats, new authorization schemes, all so LLMs could talk to services they could already talk to.
I’ll admit, I never fully understood the need for it. You know what LLMs are really good at? Figuring things out on their own. Give them a CLI and some docs and they’re off to the races.
I tried to avoid writing this for a long time, but I’m convinced MCP provides no real-world benefit, and that we’d be better off without it. Let me explain. — Read More