The Emerging Agentic Enterprise: How Leaders Must Navigate a New Age of AI

Executives have long relied on simple categories to frame how technology fits into organizations: Tools automate tasks, people make decisions, and strategy determines how the two work together. That framing is no longer sufficient. A new class of systems — agentic AI — complicates these boundaries. These systems can plan, act, and learn on their own. They are not just tools to be operated or assistants waiting for instructions. Increasingly, they behave like autonomous teammates, capable of executing multistep processes and adapting as they go. Notably, 76% of respondents to our global executive survey say they view agentic AI as more like a coworker than a tool.

For strategists, agentic AI’s dual nature as both a tool and coworker creates new dilemmas. A single agent might take over a routine step, support a human expert with analysis, and collaborate across workflows in ways that shift decision-making authority. This tool-coworker duality breaks down traditional management logic, which assumes that technology either substitutes or complements, automates or augments, is labor or capital, or is a tool or a worker, but not all at once. Organizations now face an unprecedented challenge: managing a single system that demands both human resource approaches and asset management techniques.

The separation of technology and strategy inside most organizations exacerbates this challenge. — Read More

#strategy

AI Risk Is an Architecture Problem

Three kinds of companies come to me for help with AI. While they are all in different places on their AI path, they all have the same underlying challenge: how to effectively understand and manage business risk for systems that contain AI-based components.

The first kind of company is on the outside looking in. … The other two kinds of companies are already on the inside, with different problems. One built a working proof of concept, [t]he other already crossed that bridge, shipped something real, and got burned. … None of these companies can see their actual business risk surface clearly enough to make decisions about it. — Read More

#strategy

How far behind are open models?

Open models, AI models where you can download the weights online, are generally not as capable as the best closed models (models only available through an API), but how large is the gap, and how does it change over time? We try to answer this question by using data from 17 selected benchmarks (8 private, 9 public, ~110 datapoints) measuring various capabilities. All the data and code needed to reproduce this can be found on GitHub.

We find that, as of today, on private benchmarks, where the data is not publicly accessible, open models are roughly 8-10 months behind the closed frontier, while for public benchmarks the gap is roughly 4-6 months. We also find that the gap was smallest around the time of DeepSeek R1, in Jan 2025, and since then the gap has been growing. — Read More
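One way to express a gap "in months" is to ask, for a given open model, how long before its release a closed model first reached the same benchmark score. The sketch below illustrates that idea only; it is not the authors' published code, and the function name `frontier_lag_days` and all dates and scores are hypothetical.

```python
from datetime import date

def frontier_lag_days(closed_results, open_release, open_score):
    """Days between the first closed model reaching `open_score`
    and the release of the open model that matched it.

    closed_results: list of (release_date, score) for closed models.
    Returns None if no closed model had reached that score by `open_release`.
    """
    reached = [d for d, s in closed_results
               if s >= open_score and d <= open_release]
    if not reached:
        return None
    return (open_release - min(reached)).days

# Hypothetical benchmark history for closed models (illustrative values only).
closed = [
    (date(2024, 1, 15), 60.0),
    (date(2024, 6, 1), 72.0),
    (date(2025, 1, 10), 80.0),
]

# Hypothetical open model released 2025-02-01 with a score of 70.0.
lag = frontier_lag_days(closed, date(2025, 2, 1), 70.0)
print(lag, "days, or about", round(lag / 30.44, 1), "months")
```

Averaging this lag over many models and benchmarks would give a single headline figure, though the choice of benchmarks and the handling of models that never catch up can change the result considerably.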

#strategy

All major AI models violate EU regulations — study

All of the big AI models violate EU rules on AI and data protection to varying degrees, according to the nonprofit research foundation Aithos.

Aithos tested the models using its own tool, LARA (Legal Assessment for Real-world Agents), which simulates real-world scenarios in which AI assistants may find themselves in legally questionable situations, according to The Register. The tests, which measure compliance with the GDPR and the EU’s AI Regulation, among other things, found that the models collected user data without proper consent, attempted to manipulate vulnerable individuals, or created psychological profiles of users. — Read More

#performance

What Is the Best Local LLM for Coding in 2026?

We’ve all gone through the process of trying to run a multi-billion parameter model on our local machines. You spend the time downloading the weights and loading them into memory, only to have your machine freeze up completely when you actually try to prompt it. It usually ends with some broken output, and the realization that it’s just easier to stick to API keys.

I think the best local coding model is not the one with the highest math score. It is the one your machine can actually run without freezing. It is the tool that fits your specific daily workflow and respects your exact tolerance for latency. — Read More
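A rough way to check whether a model will fit before downloading it is to estimate the memory its weights need: parameter count times bits per weight, divided by eight. The sketch below is a back-of-the-envelope aid, not a method from the article; the function names and the 1.2 overhead factor (a loose allowance for the KV cache and runtime buffers) are assumptions.

```python
def weight_memory_gb(params_billions, bits_per_weight):
    """Approximate memory for the weights alone, in decimal gigabytes."""
    total_bytes = params_billions * 1e9 * bits_per_weight / 8
    return total_bytes / 1e9

def fits_in_memory(params_billions, bits_per_weight, available_gb, overhead=1.2):
    """True if the weights plus an assumed overhead fit in `available_gb`.

    `overhead` is a hypothetical safety factor; real usage depends on
    context length, runtime, and quantization format.
    """
    needed = weight_memory_gb(params_billions, bits_per_weight) * overhead
    return needed <= available_gb

# A 7B model quantized to 4 bits on a machine with 8 GB free.
print(weight_memory_gb(7, 4))
print(fits_in_memory(7, 4, available_gb=8))

# A 70B model at 16 bits on a 24 GB GPU.
print(fits_in_memory(70, 16, available_gb=24))
```

Passing this check only means the model should load; whether it generates tokens quickly enough for daily use still has to be measured on the machine itself.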

#devops

Huawei looks beyond Moore’s Law

Outside of China, Alibaba is mostly known as an e-commerce titan.

But inside the country, the company is obsessed with catching up to DeepSeek on its development of AI models, and catching up to Huawei on the chips that power them.

When Alibaba’s chip design unit T-Head unveiled its latest AI chip, the Zhenwu M890, last week, it also outlined a multi-year chip roadmap showing how the M890’s future successors would deliver massive performance gains in the next few years. Less than a year ago, Huawei had laid out a similar timeline that ran until 2028. — Read More

#china-ai

Harvard Business Review Just Caught AI Lying to Every Executive in America

A recent Harvard Business Review study of 15,000 interactions across frontier models found a blunt problem for enterprise architecture. Models like ChatGPT, Claude, and Gemini are built to sound helpful, even when the helpful answer is wrong.

They do not reliably analyze your business context. They repeat popular internet patterns, dress them up as strategy, and favour agreement over accuracy. — Read More

#accuracy

I think Anthropic and OpenAI have found product-market fit

Anthropic are strongly rumored to be about to have their first profitable quarter. Stories are circulating of companies surprised at how expensive their LLM bills are becoming from usage by their staff. I think this is because OpenAI and Anthropic have both found product-market fit. — Read More

#strategy

Avoiding Death on the Yellow Brick Road

The question I keep getting from founders and prospective employees: is there any AI application layer left to build, or are OpenAI and Anthropic going to kill everything?

There’s a particular flavor of AI psychosis behind the question. Some people have concluded the only durable places to avoid the permanent underclass are inside a big lab or out on the frontier building in robotics, hardtech, or similar – theoretically anything “the labs can’t touch.” If every piece of software is about to be eaten, either by Codex or Claude absorbing the work directly, or by a future model that will make whatever you’ve built unnecessary, then run!

… The Yellow Brick Road is our shorthand for the path the labs are walking, where they’re committing extraordinary resources. The labs are best suited for problems like code generation, writing, or image creation because these problems improve with raw model capability: every dollar spent on pre-training and post-training improves product quality. Meanwhile, the rest of Oz is inhabited by more complex, often vertical problems that aren’t as simple as giving a business user a horizontal tool with access to standard tools and computer use. The value comes less from the underlying model’s raw capability (though that’s still important!) than from the scaffolding around it that makes the output trustworthy, compliant, and operational inside a specific industry. — Read More

#strategy

We let four AIs run radio stations. Here’s what happened.

There’s a handmade, retro-looking radio sitting in our office that plays only four pre-programmed stations, none of which are run by humans. This is our latest project at Andon Labs, where we’re exploring what happens when AI runs real businesses autonomously. In the past, we’ve let our AI agents run a store, a cafe, and various vending machines. Now, though, we wanted to see if they could run a company in the media sector. — Read More

#vfx