The dominant paradigm in AI today, insofar as it is used in production-ready settings, is organized around language and code. The scaling laws governing large language models are well-characterized, the commercial flywheel of data, compute, and algorithmic improvement is spinning, and the returns to incremental capability gains remain large and mostly legible. This paradigm has earned the capital and attention it commands.
But a set of adjacent and related fields has been making meaningful strides in its gestation phase. These areas of activity include VLAs, WAMs, and other approaches to generalist robotics models, physical and scientific reasoning in the pursuit of AI scientists, and novel interfaces for human-computer interaction (including BCIs and neurotech) that take advantage of advances in AI to rethink how we interact with machines. Beyond technical progress, each of these areas has seen the beginnings of an influx of talent, capital, and founder activity. The technical primitives for extending frontier AI into the physical world are maturing concurrently, and the pace of progress over the past eighteen months suggests that these fields could soon enter a scaling regime of their own. — Read More
Author Archives: Rick's Cafe AI
Being a Staff+ Data Scientist in 2026
I became a data scientist in 2013 when the title was young. It was so new that most companies had no idea what a data scientist should be doing, only that they desperately needed one or they would be left behind. Sound familiar?
I’ve tried to survey the job description of data science a couple of times with varying degrees of success, most recently to go with some informal recommendations for creating data science degree programs. Together with a group of colleagues, we tried to summarize what data scientists do and the data science subtypes of maker, oracle, detective, and generalist. But in the face of changing expectations this doesn’t feel like enough anymore. It’s time for a refresh. — Read More
Managing context in long-run agentic applications
In complex, long-running agentic systems, maintaining alignment and coherent reasoning between agents requires careful design. In this second article of our series, we explore these challenges and the mechanisms we built to keep teams of agents working productively over long time spans. We present a range of complementary techniques that balance the conflicting requirements of continuity and creativity.
… Language model APIs are stateless: to provide continuity between requests, the caller must provide the complete message history with each request. Agent frameworks solve the state management problem for users by accumulating message history between API calls. This fills the agent’s context window, which provides a hard limit on how much information the agent can handle. Even approaching an agent’s context window limit can degrade the quality of responses. For short-run applications, no extra context window management is typically required.
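The statelessness described above can be made concrete with a minimal sketch. Everything here is illustrative: `call_model` is a hypothetical stand-in for any chat-completion API, and the message budget and naive oldest-first trimming are assumptions, not the article's actual mechanism.

```python
# Minimal sketch of the stateless-API problem: the caller (or an agent
# framework) must resend the full message history on every request, and
# that history grows until it presses against the context window.

MAX_CONTEXT_MESSAGES = 50  # illustrative budget, not a real model limit

def call_model(messages):
    # Placeholder for a real inference request; echoes for demonstration.
    return {"role": "assistant", "content": f"({len(messages)} msgs seen)"}

class Agent:
    def __init__(self, system_prompt):
        # The system prompt is the one message we never drop.
        self.history = [{"role": "system", "content": system_prompt}]

    def ask(self, user_text):
        self.history.append({"role": "user", "content": user_text})
        # Naive management: near the budget, drop the oldest non-system
        # messages. Long-running investigations need something smarter,
        # which is the problem the article's context channels address.
        if len(self.history) > MAX_CONTEXT_MESSAGES:
            keep = MAX_CONTEXT_MESSAGES - 1
            self.history = [self.history[0]] + self.history[-keep:]
        reply = call_model(self.history)
        self.history.append(reply)
        return reply["content"]
```

For short-run use the trimming branch never fires; over hundreds of turns it fires constantly, silently discarding early context — which is exactly why multi-agent systems need deliberately curated views rather than a sliding window.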
Complex security investigations can span hundreds of inference requests and generate megabytes of output, requiring special handling. Multi-agent applications, like ours, add further complexities. For each agent to optimally execute its role, it requires a tailored view of the investigation state. Each view must be carefully balanced. If agents are not anchored to the wider team, the investigation will be disconnected and incoherent. Conversely, sharing too much information stifles creativity and encourages confirmation bias.
Our solution uses three complementary context channels: Director’s Journal, Critic’s Review, and Critic’s Timeline. — Read More
Stop Treating AI Memory Like a Search Problem
Back in October, my AI assistant stored a memory with an importance score of 8/10. Content: “Investigating Bun.js as a potential runtime swap.”
I never actually switched to Bun. To be fair, it was a two-day curiosity that went nowhere. But the memory persisted for six months, resurfacing each time I asked about my build process and quietly nudging the AI toward a Bun solution.
There was nothing wrong with the system; it was doing exactly what it was supposed to do. That was the issue. — Read More
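One direction the anecdote above gestures at is letting a memory's effective importance decay with age instead of treating the stored score as permanent. This is a hedged sketch, not the article's proposal; the exponential scheme and the 30-day half-life are illustrative assumptions.

```python
import math

def effective_importance(stored_score, age_days, half_life_days=30.0):
    """Decay a stored importance score exponentially with age.

    half_life_days is an assumed tuning knob: after each half-life,
    the effective score falls by half.
    """
    return stored_score * math.exp(-math.log(2) * age_days / half_life_days)
```

Under these assumptions, the six-month-old "Investigating Bun.js" memory stored at 8/10 would surface at roughly 8 × 2⁻⁶ = 0.125, well below any plausible retrieval threshold, instead of pushing with its original confidence.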
Meta Is Warned That Facial Recognition Glasses Will Arm Sexual Predators
More than 70 civil liberties, domestic violence, reproductive rights, LGBTQ+, labor, and immigrant advocacy organizations are demanding that Meta abandon plans to deploy face recognition on its Ray-Ban and Oakley smart glasses, warning that the feature—reportedly known inside the company as “Name Tag”—would hand stalkers, abusers, and federal agents the ability to silently identify strangers in public.
The coalition, which includes the ACLU, the Electronic Privacy Information Center, Fight for the Future, Access Now, and the Leadership Conference on Civil and Human Rights, is demanding Meta kill the feature before launch, after internal documents surfaced showing the company hoped to use the current “dynamic political environment” as cover for the rollout, betting that civil society groups would have their resources “focused on other concerns.” — Read More
Before he wrote AI 2027, he predicted the world in 2026. How did he do?
Daniel Kokotajlo is the founder of the AI Futures Project and the lead author of the influential AI 2027 report: a detailed, narrative prediction of the next few years of AI development, culminating in the rise of superhuman agents capable of wresting control from humanity.
But AI 2027 wasn’t his first foray into long-form prediction. In August of 2021, Daniel wrote an essay called “What 2026 Looks Like.” This essay came out before the launch of ChatGPT, let alone the explosion of AI across the global economy. Now that it’s 2026, I thought it was time to evaluate Daniel’s predictions — and it brings me no joy to say that they are frighteningly accurate. — Read More
Mythos Won’t Kill Threat Hunting
Last week, a coalition of CISOs, SANS, OWASP, and the Cloud Security Alliance published a strategy briefing called “The AI Vulnerability Storm: Building a ‘Mythos-ready’ Security Program.” If you haven’t read it yet, you should. The author list alone is stacked: Gadi Evron, Rob T. Lee, Jen Easterly, Bruce Schneier, Chris Inglis, Heather Adkins, Rob Joyce. It’s the kind of document that doesn’t happen unless people are genuinely worried.
The headline is hard to ignore. Anthropic’s Claude Mythos can autonomously discover thousands of zero-day vulnerabilities across major operating systems and browsers. A 72% exploit success rate. It found a 27-year-old OpenBSD bug nobody caught. Where Opus 4.6 generated two working Firefox exploits, Mythos generated 181 under identical conditions. The time between vulnerability discovery and a working exploit now looks like hours, not weeks.
The briefing lays out a 90-day plan for CISOs. — Read More
Steve Jobs’s 10-80-10 Rule Is Even More Useful in the AI Era
This column is about how a principle known as the 10-80-10 rule can help you manage teams in the age of AI. But to really get a sense of how this rule works, it’s helpful to take an unlikely detour into the evolution of Steve Jobs’s management style, and how the legendary Apple boss went from micromanager to big believer in the 10-80-10 approach, where you:
— Spend the first 10 percent of the time communicating your vision for the thing.
— Allow others to spend the next 80 percent of the time moving the thing forward.
— Spend another 10 percent of the time polishing the thing, and helping others understand why and how you’re tweaking.
— Read More
OpenAI opens powerful cyber tools to verified users
OpenAI laid out a new plan on Tuesday to expand access to AI models with advanced cyber capabilities while implementing controls on who can use them.
Why it matters: The roadmap coincides with the release of a new model variant, GPT-5.4-Cyber, designed to assist with defensive cybersecurity tasks and be more permissive for vetted users. — Read More
8 Tips for Writing Agent Skills
Skills have become one of the most used extension points in agents. They’re flexible, easy to make, and simple to distribute. But this flexibility also makes it hard to know what good looks like and what works. What types of skills are worth making? What’s the secret to writing a good skill? When should you share them with others?
I have been using skills extensively and have many in active use. Here are some tips I’ve learned along the way. — Read More