The main problem with standard RAG systems isn’t the retrieval or the generation. It’s that nothing sits in the middle deciding whether the retrieval was actually good enough before the generation happens.
Standard RAG is a pipeline where information flows in one direction, from query to retrieval to response, with no checkpoint and no second chance. This works fine for simple questions with obvious answers.
However, the moment a query gets ambiguous, or the answer is spread across multiple documents, or the first retrieval pulls back something that looks good but isn’t, RAG starts losing value.
Agentic RAG attempts to fix this problem. It is based on a single question: what if the system could pause and think before answering? — Read More
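The checkpoint idea above can be sketched in a few lines: a grading step sits between retrieval and generation, and a weak retrieval triggers a query rewrite and a second attempt. This is a minimal illustration, not any particular framework's API; every function here is a hypothetical stand-in.

```python
# Sketch of an agentic RAG loop: retrieve -> grade -> (rewrite and retry) -> generate.
# All components are toy stand-ins for a real retriever, grader, and LLM.
CORPUS = {
    "refund policy": "Refunds are issued within 14 days of purchase.",
    "shipping policy": "Orders ship within 2 business days.",
}

def retrieve(query: str) -> str:
    # Toy retriever: exact-key lookup; a real system would query a vector store.
    return CORPUS.get(query, "")

def grade(query: str, context: str) -> float:
    # Toy relevance grader: 1.0 if anything came back, else 0.0.
    # In practice this would be an LLM judge or cross-encoder.
    return 1.0 if context else 0.0

def rewrite(query: str) -> str:
    # Toy query rewriter: normalize phrasing toward the corpus vocabulary.
    return query.lower().replace("returns", "refund policy")

def generate(query: str, context: str) -> str:
    return f"Answer based on: {context}" if context else "I don't know."

def agentic_rag(query: str, max_retries: int = 2) -> str:
    for _ in range(max_retries + 1):
        context = retrieve(query)
        if grade(query, context) >= 0.5:   # the checkpoint standard RAG lacks
            return generate(query, context)
        query = rewrite(query)             # the second chance
    return generate(query, "")
```

The point of the sketch is the control flow, not the components: the loop is free to decide that retrieval was not good enough and to try again before any answer is generated.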
The LiteLLM Supply Chain Attack: A Complete Technical Breakdown Of The AI Ecosystem’s Darkest Hour
On March 24, 2026, the artificial intelligence development community experienced an unprecedented security catastrophe. LiteLLM, an essential open-source Python library used to route and manage API calls across hundreds of large language models, was weaponized in a highly sophisticated supply chain attack. Threat actors known as TeamPCP successfully published two malicious versions of the package (1.82.7 and 1.82.8) directly to the Python Package Index (PyPI).
With LiteLLM averaging 97 million monthly downloads and serving as a foundational dependency for industry titans like Stripe, Netflix, and Google alongside major AI frameworks such as CrewAI, DSPy, and MLflow, the magnitude of this compromise is staggering. — Read More
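For teams consuming the package, one generic defensive pattern (a sketch, not official remediation guidance from the LiteLLM maintainers) is to exclude the two compromised releases named above in the dependency specification:

```
# requirements.txt -- pin out the known-bad releases (illustrative only)
litellm!=1.82.7,!=1.82.8
```

Stricter setups go further and install only from a fully hash-pinned lockfile, so that a tampered artifact fails verification at install time rather than executing.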
The Death of model.fit(): What Data Scientists Actually Do in the Age of AI Agents
A few months ago, I joined a team building two AI-agent products.
My first week, I opened a Jupyter notebook out of habit. Then I closed it. There was no training set, no features to engineer, no model.fit(X_train, y_train) waiting to be called. The agents orchestrated foundation models. The “intelligence” came from a model someone else trained. The entire codebase was TypeScript. No notebooks, no model, no Python. The toolbox I’d spent years filling was, on its surface, irrelevant.
So what, exactly, was I supposed to do?
The answer turned out to be hiding in a simple framework.
Every AI agent has three layers. The foundation model provides raw intelligence. The engineering provides the body: tools, APIs, orchestration, and product surfaces. But the behavior of the agent – what it actually does when a user shows up – is shaped by the context, prompts, policies, schemas, and guardrails that surround the model. That’s the brain of the system. Not the neural network itself, but the cognitive architecture built on top of it.
Someone needs to own the quality of that brain and make it legible: understand its failure modes, measure its consistency, map its weaknesses, and create the feedback loops that systematically make it smarter. That someone, it turns out, is the data scientist. Not as a model trainer, but as the team’s methodologist. — Read More
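The "methodologist" loop described above is measurable with very little machinery. A minimal sketch of one such measurement, consistency, replays the same input several times and scores agreement with the modal answer; `call_agent` is a hypothetical stand-in for a real agent endpoint.

```python
# Sketch: measure an agent's answer consistency on a fixed prompt.
from collections import Counter

def call_agent(prompt: str, seed: int) -> str:
    # Stand-in for a real agent call; here we fake mild nondeterminism
    # so the harness has something to measure.
    return "refund approved" if seed % 3 else "needs human review"

def consistency(prompt: str, n: int = 9) -> float:
    answers = [call_agent(prompt, seed) for seed in range(n)]
    _, count = Counter(answers).most_common(1)[0]
    return count / n  # fraction of runs agreeing with the modal answer

score = consistency("Can I return this after 20 days?")
```

Tracking a handful of such metrics per release is one concrete way to "own the quality of the brain" without ever calling `model.fit()`.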
Future Casting the Modern Data Stack
After writing an article a few years ago called “Big Data is Dead,” it feels a bit clichéd to call things “dead.” So I won’t say any such thing about the Modern Data Stack. It does, however, appear very, very sleepy. Someone should go and poke it with a stick.
The Modern Data Stack – deceased or just drowsy?
While we’re all dead in the long run, one thing that is different now is that AI is bringing the “long run” a lot closer than it has ever been. In the last couple of years, AI has forever changed a number of professions that were once thought to be safe from disruption. From art to software engineering, AI is changing how people get things done, and changing things much faster than you’d expect.
… The interesting question to me is, “What comes next?” If we assume models continue to get better, companies capitalize on the opportunities, and things get tied together in a nice bow, what does the world look like? What could it look like? Let’s start with what we know. — Read More
Announcing Arm AGI CPU: The silicon foundation for the agentic AI cloud era
Today, Arm is announcing the Arm AGI CPU, a new class of production-ready silicon built on the Arm Neoverse platform and designed to power the next generation of AI infrastructure.
For the first time in our more than 35-year history, Arm is delivering its own silicon products – extending the Arm Neoverse platform beyond IP and Arm Compute Subsystems (CSS) to give customers greater choice in how they deploy Arm compute – from building custom silicon to integrating platform-level solutions or deploying Arm-designed processors. It reflects both the rapid evolution of AI infrastructure and growing demand from the ecosystem for production-ready Arm platforms that can be deployed at pace and scale. — Read More