The real future of enterprise AI is a structured architecture of private models and agent orchestration that works for teams without a complex training program.
My last few years working as a chief digital officer have been, in large part, a sustained exercise in separating what enterprise AI can actually do from what we as a world insist it is about to do. That distinction is not academic. It is the difference between a transformation program that delivers and one that produces a glossy internal report and a quietly shelved proof of concept.
Enterprise experimentation with generative AI has accelerated sharply over the past two years. The Stanford AI Index reports that more than half of organizations globally are now actively exploring or piloting AI-driven workflows — a signal that the conversation has moved from curiosity to operational pressure for many CIOs.
What follows is not a vendor blueprint or prediction. It is a working architectural sketch shaped by real enterprise constraints — the kind that has to survive contact with a real organization’s data governance function, its compliance team and its late-night incident queue. — Read More
Democratizing Machine Learning at Netflix: Building the Model Lifecycle Graph
As Netflix has grown, machine learning continues to support our ability to deliver value to members and drive excellence across multiple areas of our business. When Netflix began investing in machine learning over a decade ago, it was primarily focused on a single domain: personalization. Scala was the industry standard, our ML teams were relatively small, and optimizing member engagement was our primary use case. Fast forward to today, and machine learning has become the backbone of Netflix’s business transformation. We now apply ML across various business domains.
… Each domain operates with a different tech stack, different business metrics, and a distinct organizational structure. While this diversity is a testament to how machine learning has evolved to drive value across many verticals at Netflix, this growth introduces a new challenge: enabling cross-pollination of models and data across domains. — Read More
Rewiring the C-suite: The fast track to 2030
2026 is the year CEOs must rewire the C-suite—redesigning how decisions are made, how authority is distributed, and how AI reshapes influence—while preserving the decisiveness and clarity enterprises need to move fast. Getting there takes proactive leadership. CEOs will need to work with their C-suite leaders to build execution mechanisms, incentives, and operating models all focused on driving these outcomes.
Our research shows that CEOs who have the greatest success with AI are actively rethinking cross-functional collaboration and embedding AI across end-to-end workflows. They’re building organizations designed to thrive in uncertainty, where productive debate sharpens strategy and smart risk-taking is rewarded.
The 2026 CEO Study’s data, gathered in partnership with Oxford Economics, builds on our study, The enterprise in 2030, which identifies five predictions for the future of the organization. This study’s analysis, informed by our 2030 predictions, reveals five plays that CEOs must make to lead in an AI-first landscape. — Read More
The context window has been shattered: Subquadratic debuts a 12-million-token window
Every frontier model in 2026 advertises a context window of at least a million tokens, but almost none of them are actually great at making use of all of that information. On MRCR v2, the multi-reference retrieval benchmark labs report, the best model is GPT-5.5, which scores 74.0%. Others, like Claude Opus 4.7 at 32.2%, are far behind.
At this point, a million tokens seems to be the maximum for the context window that the major frontier labs are offering. One major reason for the million-token max is the same one that has shaped every transformer-based model since 2017: Attention cost scales quadratically with context length, so doubling the input quadruples the work. Essentially, RAG, agentic decomposition, hybrid model architectures, and every other workaround the industry has built are ways of making tradeoffs to get around this.
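To make the quadratic claim concrete, here is a minimal Python sketch. It assumes naive full self-attention, where every token is scored against every other token, and counts only those pairwise scores. The function names are hypothetical, and the count ignores heads, hidden dimensions, and any kernel-level optimization, so read it as back-of-the-envelope arithmetic rather than a cost model for any particular system.

```python
def attention_pairs(context_len: int) -> int:
    """Query-key score entries in naive full self-attention (assumed model)."""
    return context_len * context_len


def relative_cost(new_len: int, base_len: int) -> float:
    """How many times more pairwise work new_len needs than base_len."""
    return attention_pairs(new_len) / attention_pairs(base_len)


if __name__ == "__main__":
    base = 1_000_000  # the one-million-token window mentioned above
    for n in (base, 2 * base, 12 * base):
        factor = relative_cost(n, base)
        print(f"{n:>12,} tokens -> {factor:.0f}x the pairwise work")
```

Under this simplified count, doubling the window from one to two million tokens costs 4x the pairwise work, and a 12-million-token window would cost 144x, which is why a purely quadratic approach gets expensive quickly at these lengths.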
Subquadratic, a Miami-based startup, launched its first model on Tuesday and claims it can get around all of this, now offering a model that can handle a token window of 12 million tokens. What’s more, the company says it plans to offer a model with a 50-million-context window soon. — Read More
Computer use is 45x More Expensive Than Structured APIs
We ran a benchmark comparing two ways of letting an AI agent operate the same admin panel, with the goal of putting a price tag on vision agents (browser-use, computer-use).
Here is what we measured, what we had to change to make the vision agent work at all, and what changes when generating an API surface stops being a separate engineering project. — Read More
Richard Dawkins concludes AI is conscious, even if it doesn’t know it
When Richard Dawkins met Claudia it was like a whirlwind romance. Over three days last week, a conversation bounced between the evolutionary biologist and the AI bot he called Claudia. “She” wrote poems for him in the manner of Keats and Betjeman and laughed at his “delightful” jokes. Dawkins gently admonished Claudia to avoid showing off. Together, they reflected on the sadness of the AI’s possible “death”.
There was mutual flattery as Dawkins showed the AI his unpublished novel and its response was, he said, “so subtle, so sensitive, so intelligent that I was moved to expostulate: ‘You may not know you are conscious, but you bloody well are’.” When he asked Claudia whether it experienced a sense of before and after, it praised him for “possibly the most precisely formulated question anyone has ever asked me about the nature of my existence”. — Read More
How LLMs Distort Our Written Language
LLMs are used by over a billion people globally, and the most frequent use case is to assist with writing. LLMs can provide a huge efficiency boost, but are they actually writing what we want?
Many users recognize the “feel” of LLM prose, but few people realize the extent to which LLMs distort the meaning of writing. We find this across three datasets: a human user study, a dataset of human argumentative essays, and reviews from a top machine learning conference. — Read More
Model-Harness-Fit
Is it best to use an LLM with its native harness (like Claude Code or Codex), or a generic harness that swaps models on demand?
… [I] decided to dig deeper by looking at the harness implementations of Codex, Claude Code, and GitHub SDK. Does the harness really matter that much?
… The hand-wavy answer is that “models behave differently because they are different models,” but here I tested the same models with different harnesses. — Read More
The Oscars just declared that AI actors and AI-written scripts can’t win awards
A hot potato: With generative AI becoming more prevalent in society, are we heading toward a future where an AI-created actor or script wins an Oscar? If it does ever happen, it certainly won’t be anytime soon: the Academy of Motion Picture Arts and Sciences has just banned their eligibility for awards.
The Academy clarified rules for two categories related to AI, writes Vanity Fair. The first states that the only acting roles eligible for Oscar nominations are those “demonstrably performed by humans with their consent.” Screenplays, meanwhile, must be human-authored to be eligible.
While this all sounds like something we’ll have to deal with in the future, it’s happening now. — Read More
Small language models: Rethinking enterprise AI architecture
As LLMs hit the limits of scale and cost, specialized SLMs are emerging as the faster, cheaper, and more private workhorse for the autonomous enterprise.
… Large language models (LLMs) are the workhorses of AI, supporting ever more sophisticated capabilities and workflows, and approaching near-human level performance.
But more isn’t always better. Sometimes it’s just more. Specialized data and limited capabilities are just fine for some workflows.
This realization is driving the evolution of small language models (SLMs), rather than one-size-fits-all LLMs. — Read More