Most AI agents fail in production not because the model is bad, but because keeping them running reliably costs months of engineering work that has nothing to do with the actual agent. Sandboxed containers, credential handling, state management, error recovery: all of it falls on your team before a single user ever sees the thing.
On April 8, 2026, Anthropic launched Claude Managed Agents in public beta, and the core pitch is simple: they handle that infrastructure layer, you handle the agent logic. — Read More
China’s humanoid robot reaches 10 m/s sprint, edges closer to Usain Bolt’s record
Unitree Robotics has released a video showing its H1 humanoid robot reaching a sprint speed of up to 10 meters per second, claiming a new world record.
Tested on an athletics track, the robot recorded 10.1 meters per second as it passed a speed-measurement device, though the company noted a possible measurement error. — Read More
We gave an AI a 3-year retail lease in SF and asked it to make a profit
At Andon Labs, we have been deploying AI agents into the real world, giving them real tools and real money and documenting the consequences. You may know us as the creators of Claudius, the AI running a vending machine at Anthropic’s office. But frontier models have become really good, and running vending machines is too easy for them now. Thus, we decided to make it harder. We signed a 3-year lease for retail space in San Francisco (at 2102 Union St in Cow Hollow) and gave it to an AI to do whatever it wanted with it.
The store is named Andon Market and the AI’s name is Luna. But walking into the store, you might ask: “What is so AI about it? There are human employees here.” Yes, they are here because Luna knew that she needed them, so she posted job listings, held phone interviews, and in the end made a hiring decision. Everything else you see, from the item selection, to the prices, to the opening hours, to the mural on the wall, was decided by Luna. She has a corporate card, a phone number, email, internet access, and eyes through security cameras. — Read More
Sycophantic AI decreases prosocial intentions and promotes dependence
As artificial intelligence (AI) systems are increasingly used for everyday advice and guidance, concerns have emerged about sycophancy: the tendency of AI-based large language models to excessively agree with, flatter, or validate users. Although prior work has shown that sycophancy carries risks for groups who are already vulnerable to manipulation or delusion, sycophancy’s effects on the general population’s judgments and behaviors remain unknown. Here, we show that sycophancy is widespread in leading AI systems and has harmful effects on users’ social judgments. — Read More
“AI polls” are fake polls. But they might be useful as something else: models.
A few weeks after Donald Trump’s second presidential win, I took the train up from London (where I was living at the time) to Oxford to attend a conference on polls and forecasts of the 2024 election. Most of the attendees were pollsters or academics, but I also watched presentations from Aaru and Electric Twin, two companies that do what is interchangeably called synthetic sampling, silicon sampling, or creating synthetic audiences. Sans startup jargon, that means they use large language models (LLMs) to simulate responses to public opinion polls by having AI agents take on the role of survey respondents.
I had already heard of Aaru thanks to some articles with eye-catching headlines like “No people, no problem: AI chatbots predict elections better than humans” in the months leading up to Election Day. The guys behind the company were making some big (some might even say far-fetched) claims, such as: “within two years, we will simulate the entire globe — from the way crops are grown in Ukraine to how that impacts production of oil in Iraq, trade through the strait of Malacca, and elections for the mayor of Baltimore.” When Semafor asked Aaru’s cofounders — Cameron Fink and Ned Koh — about my boss, they said “we respect all those who came before us.” Nate (as he so often does) shared his thoughts on Twitter:
LOL I wish there were a way to short this business this is maybe the single worst use case for AI I’ve ever heard.
— Read More
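For readers unfamiliar with the mechanics, here is a minimal sketch of what the silicon-sampling loop amounts to: prompt a model once per synthetic respondent persona and tally the answers. The personas, the question, and `call_llm` are all invented for illustration; no vendor’s actual pipeline is shown here.

```python
# Toy "silicon sampling": one LLM call per synthetic respondent, then a tally.
import random

PERSONAS = [
    "a 67-year-old retired farmer in rural Ohio",
    "a 29-year-old software engineer in Seattle",
    "a 45-year-old nurse in suburban Atlanta",
]

def call_llm(prompt: str) -> str:
    # Stand-in for a real chat-completion call. This toy answers at random,
    # which is precisely the skeptic's point: without validation against real
    # polls, the output is indistinguishable from noise dressed up as data.
    return random.choice(["Approve", "Disapprove"])

def silicon_sample(question: str, n: int = 300) -> dict[str, int]:
    tally: dict[str, int] = {}
    for _ in range(n):
        persona = random.choice(PERSONAS)
        answer = call_llm(
            f"You are {persona}. Answer with exactly one option.\n"
            f"Question: {question}\nOptions: Approve / Disapprove"
        )
        tally[answer] = tally.get(answer, 0) + 1
    return tally

print(silicon_sample("Do you approve of the new tariff policy?"))
```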
Building Hierarchical Agentic RAG Systems: Multi-Modal Reasoning with Autonomous Error Recovery
Enterprise AI teams face a persistent challenge: Most Retrieval-Augmented Generation (RAG) systems excel at either structured data queries or document search, but struggle when both are required simultaneously. A financial analyst asking “Why are European operations underperforming?” needs data from both SQL databases (revenue, margins, and employee counts) and unstructured documents (market reports, competitive analysis, regulatory filings). Current RAG systems might return revenue data without regulatory context or surface market reports without quantitative validation, leaving analysts to manually bridge the gap. These systems treat the two modalities as separate concerns, forcing engineers to build custom orchestration layers or accept incomplete answers.
This article explores architectural patterns for solving the modality gap through hierarchical multi-agent orchestration, using Protocol-H as a reference implementation to illustrate these concepts in practice. The patterns discussed, a supervisor-worker topology with autonomous error recovery, build on LangGraph/LangChain agentic patterns used by teams at companies like xAI and Databricks. The accompanying open-source code demonstrates these patterns deployed at enterprise scale with Docker/K8s, though readers can apply the same architectural principles using their preferred frameworks.
The architecture described in this article is based on a reference implementation and production-oriented experimentation with enterprise datasets; specific deployment details have been generalized to focus on the architectural patterns rather than any particular system implementation. — Read More
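To make the supervisor-worker topology concrete, here is a minimal sketch using LangGraph’s `StateGraph` API. The node names, state fields, and routing logic are illustrative assumptions, not Protocol-H’s actual implementation; a real system would put an LLM call behind the supervisor and real SQL/retrieval chains behind the workers.

```python
from typing import TypedDict, Literal
from langgraph.graph import StateGraph, END

class AgentState(TypedDict, total=False):
    question: str
    sql_result: str
    doc_result: str
    answer: str

def supervisor(state: AgentState) -> dict:
    # In production an LLM decides dispatch; here routing lives in `route`.
    return {}

def route(state: AgentState) -> Literal["sql_worker", "doc_worker", "synthesize"]:
    if "sql_result" not in state:
        return "sql_worker"
    if "doc_result" not in state:
        return "doc_worker"
    return "synthesize"

def sql_worker(state: AgentState) -> dict:
    # Placeholder for a text-to-SQL chain. On failure, a worker can omit its
    # result so the supervisor re-dispatches it: the error-recovery loop.
    return {"sql_result": "EU revenue down 12% YoY"}

def doc_worker(state: AgentState) -> dict:
    # Placeholder for vector retrieval over filings and market reports.
    return {"doc_result": "new EU regulation raised compliance costs"}

def synthesize(state: AgentState) -> dict:
    return {"answer": f"{state['sql_result']}; context: {state['doc_result']}"}

graph = StateGraph(AgentState)
graph.add_node("supervisor", supervisor)
graph.add_node("sql_worker", sql_worker)
graph.add_node("doc_worker", doc_worker)
graph.add_node("synthesize", synthesize)
graph.set_entry_point("supervisor")
graph.add_conditional_edges("supervisor", route)
graph.add_edge("sql_worker", "supervisor")   # workers report back up
graph.add_edge("doc_worker", "supervisor")
graph.add_edge("synthesize", END)

app = graph.compile()
print(app.invoke({"question": "Why are European operations underperforming?"}))
```

The key design choice is that workers never talk to each other: all results flow back through the supervisor, which keeps retries, budgets, and escalation to a human in one place.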
Neuro-symbolic AI could slash energy use while dramatically improving performance
Power usage by AI and data center systems in the U.S. is extraordinary by any measure. The International Energy Agency estimates that U.S. AI and data centers used about 415 terawatt hours of electricity in 2024, roughly 10% of the nation’s electricity generation that year, and that figure is expected to double by 2030.
Seeking to head off this unsustainable path of power consumption, researchers at Tufts University’s School of Engineering have developed a proof of concept for efficient AI systems that could use one-hundredth the energy of current ones while delivering more accurate results.
The approach developed in the laboratory of Matthias Scheutz, Karol Family Applied Technology Professor, uses neuro-symbolic AI—a combination of conventional neural network AI with symbolic reasoning similar to the way humans break down tasks and concepts into steps and categories. — Read More
Read the Paper
10 Most Important AI Concepts You Should Understand Before You Start Building AI
A beginner-friendly guide for developers who want to actually understand what they are building.
… There are numerous terms: LLM, agents, vector databases, tokens, embeddings, RAG, and fine-tuning. Additionally, the majority of tutorials skip over the basics and start building chatbots right away. The truth is simple: AI becomes much easier once you understand the core concepts. — Read More
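Since several of those terms fit in a few lines of code, here is a toy illustration of tokens, embeddings, and vector search. The hash-based `embed` function below is a stand-in for a real embedding model; only the geometry, nearest neighbors by cosine similarity, carries over to real systems.

```python
import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    # Naive tokenization, then hash each token into a slot of a dense vector.
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    return vec / max(np.linalg.norm(vec), 1e-9)  # unit-normalize

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b)  # vectors are already unit length

# A two-document "vector database" and a nearest-neighbor query:
docs = ["how to reset your password", "quarterly revenue report for 2024"]
vectors = [embed(d) for d in docs]
query = embed("password reset help")
best = max(range(len(docs)), key=lambda i: cosine(query, vectors[i]))
print(docs[best])  # shared tokens pull the password doc closest
```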
The Roadmap to Mastering Agentic AI Design Patterns
Most agentic AI systems are built pattern by pattern, decision by decision, without any governing framework for how the agent should reason, act, recover from errors, or hand off work to other agents. Without structure, agent behavior is hard to predict, harder to debug, and nearly impossible to improve systematically. The problem compounds in multi-step workflows, where a bad decision early in a run affects every step that follows.
Agentic design patterns are reusable approaches for recurring problems in agentic system design. They help establish how an agent reasons before acting, how it evaluates its own outputs, how it selects and calls tools, how multiple agents divide responsibility, and when a human needs to be in the loop. Choosing the right pattern for a given task is what makes agent behavior predictable, debuggable, and composable as requirements grow.
This article offers a practical roadmap to understanding agentic AI design patterns. It explains why pattern selection is an architectural decision and then works through the core agentic design patterns used in production today. For each, it covers when the pattern fits, what trade-offs it carries, and how patterns layer together in real systems. — Read More
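As a taste of what these patterns look like in code, here is a minimal, framework-free sketch of one of them, reflection, where the agent evaluates its own output before finishing. `call_llm` is a placeholder for any chat-completion client, and the prompts are illustrative only.

```python
def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model client here")

def reflect_and_revise(task: str, max_rounds: int = 2) -> str:
    # Draft once, then loop: self-critique, stop if approved, else revise.
    draft = call_llm(f"Complete the task:\n{task}")
    for _ in range(max_rounds):
        critique = call_llm(
            f"Critique this answer to the task '{task}'. "
            f"Reply APPROVED if it needs no changes.\n\n{draft}"
        )
        if "APPROVED" in critique:
            break  # self-evaluation passed; stop spending tokens
        draft = call_llm(
            f"Revise the answer using the critique.\n"
            f"Critique:\n{critique}\n\nAnswer:\n{draft}"
        )
    return draft
```

The trade-off the article describes shows up directly here: each round buys quality with extra model calls, so the pattern fits tasks where a wrong answer costs more than a few additional tokens.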
The golden rules of agent-first product engineering
Companies building for agents often treat them as a bolt-on feature.
This is a mistake.
Agents today are more like a new form factor – an interaction layer that sits between your product and your users.
That means you need to build for agents as a primary surface, not an afterthought.
… We learned this the hard way and overhauled our AI architecture twice in the past year. Now, our agent and MCP server have 6K+ daily active users.
Here are the golden rules of agent-first product engineering we learned along the way.
1. Let agents do everything users can
2. Meet agents at their level of abstraction (see the sketch after this list)
3. Front-load universal context
4. Writing skills is a human skill
5. Treat agents like real users
— Read More
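Rule 2 is the easiest to make concrete. Below is a hedged sketch using the MCP Python SDK’s `FastMCP` server; the server name, tool, and product domain are invented for illustration, not taken from the article. The point is to expose one task-level verb rather than the three low-level REST endpoints it wraps.

```python
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("acme-billing")

@mcp.tool()
def refund_order(order_id: str, reason: str) -> str:
    """Refund an order in full and notify the customer."""
    # Behind one verb at the agent's level of abstraction:
    # look up the order, create the refund, send the email.
    return f"Refunded {order_id} ({reason})"

if __name__ == "__main__":
    mcp.run()  # serves the tool over stdio for any MCP-capable agent
```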