AI’s coding ability is outpacing our ability to wield it effectively. That’s why all the SWE-bench score maxxing isn’t syncing with the productivity metrics engineering leadership actually cares about. When Anthropic’s team ships a product like Cowork in 10 days and another team can’t move past a broken POC using the same models, the difference is that one team has closed the gap between capability and practice and the other hasn’t.
That gap doesn’t close overnight. It closes in levels, eight of them. Most of you reading this are likely past the first few, and you should be eager to reach the next one: each level is a huge leap in output over the one before, and every improvement in model capability amplifies those gains further.
Level 1: Tab Complete
Level 2: Agent IDE
Level 3: Context Engineering
Level 4: Compounding Engineering
Level 5: MCP and Skills
Level 6: Harness Engineering
Level 7: Background Agents
Level 8: Autonomous Agent Teams