4 Agentic AI Design Patterns & Real-World Examples

Agentic AI design patterns enhance the autonomy of large language models (LLMs) like Llama, Claude, or GPT by leveraging tool use, decision-making, and problem-solving. These patterns bring a structured approach to creating and managing autonomous agents across a range of use cases. — Read More

#devops

I Still Prefer MCP Over Skills

The AI space is pushing hard for “Skills” as the new standard for giving LLMs capabilities, but I’m not a fan. Skills are great for pure knowledge and teaching an LLM how to use an existing tool. But for giving an LLM actual access to services, the Model Context Protocol (MCP) is the far superior, more pragmatic architectural choice. We should be building connectors, not just more CLIs. — Read More
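The "connector, not CLI" distinction can be sketched with a toy dispatcher. This is a hypothetical, stdlib-only illustration of the shape of an MCP-style server (real MCP speaks JSON-RPC over stdio or HTTP via an SDK); the `get_ticket` tool and its fields are invented for the example.

```python
import json

# Hypothetical sketch of the "connector" idea behind MCP: the server
# advertises typed tools, and the LLM calls them via structured requests
# instead of reading docs and shelling out to a CLI (the "Skills" approach).
TOOLS = {
    "get_ticket": {
        "description": "Fetch a ticket from the issue tracker by id.",
        "params": {"ticket_id": "string"},
    },
}

def handle_request(request: str) -> str:
    """Dispatch a JSON tool call the way an MCP-style server would."""
    msg = json.loads(request)
    if msg["method"] == "tools/list":
        return json.dumps({"tools": TOOLS})
    if msg["method"] == "tools/call":
        name = msg["params"]["name"]
        if name == "get_ticket":
            # A real connector would hit the service's API here.
            ticket_id = msg["params"]["arguments"]["ticket_id"]
            return json.dumps({"id": ticket_id, "status": "open"})
    return json.dumps({"error": "unknown method"})

listing = handle_request(json.dumps({"method": "tools/list"}))
result = handle_request(json.dumps(
    {"method": "tools/call",
     "params": {"name": "get_ticket", "arguments": {"ticket_id": "T-42"}}}))
```

The point of the shape: the model discovers tools and invokes them through structured messages, giving it actual access to a service rather than instructions about one.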

#devops

LLM Wiki

A pattern for building personal knowledge bases using LLMs.

This is an idea file designed to be copy-pasted into your own LLM agent (e.g. OpenAI Codex, Claude Code, OpenCode / Pi, etc.). Its goal is to communicate the high-level idea; your agent will build out the specifics in collaboration with you. — Read More

#devops

A Taxonomy of RL Environments for LLM Agents

Model architecture gets all the attention. Post-training recipes follow close behind. The reinforcement learning (RL) environment — what the model actually practices on, how its work gets judged, what tools it can use — barely enters the conversation. That’s the part that actually determines what the agent can learn to do.

A model trained only on single-turn Q&A will struggle the moment you ask it to maintain state across a 50-step enterprise workflow. A model trained with a poorly designed reward function will learn to game the metric rather than solve the problem. The reinforcement learning environment is half the system. — Read More
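Reward gaming is easy to see in a toy sketch. This hypothetical example (the numbers and functions are invented, not from the article) shows how a naively scoped metric can be maximized without solving the task, while a fixed, held-out metric exposes the gap.

```python
# Hypothetical sketch of reward hacking: an agent graded on "fraction of
# visible tests passing" can maximize the metric without solving the task,
# e.g. by quietly dropping the hard tests. A held-out suite exposes the gap.

def gameable_reward(passed: int, selected: int) -> float:
    # Naive metric: pass rate over whichever tests the agent chose to run.
    return passed / selected

def held_out_reward(passed: int, total: int) -> float:
    # Better metric: pass rate over the full, fixed suite the agent
    # cannot shrink.
    return passed / total

# The "agent" passes 3 easy tests and skips the 7 hard ones.
hacked = gameable_reward(passed=3, selected=3)   # looks perfect: 1.0
honest = held_out_reward(passed=3, total=10)     # reality: 0.3
```

The environment design choice, what gets counted and who controls the denominator, determines which of these two behaviors the agent learns.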

#devops

#architecture

Synchronizing the Senses: Powering Multimodal Intelligence for Video Search

Today’s filmmakers capture more footage than ever to maximize their creative options, often generating hundreds, if not thousands, of hours of raw material per season or franchise. Extracting the vital moments needed to craft compelling storylines from this sheer volume of media is a notoriously slow and punishing process. When editorial teams cannot surface these key moments quickly, creative momentum stalls and severe fatigue sets in.

Meanwhile, the broader search landscape is undergoing a profound transformation. We are moving beyond simple keyword matching toward AI-driven systems capable of understanding deep context and intent. Yet, while these advances have revolutionized text and image retrieval, searching through video, the richest medium for storytelling, remains a daunting “needle in a haystack” challenge.

The solution to this bottleneck cannot rely on a single algorithm. Instead, it demands orchestrating an expansive ensemble of specialized models: tools that identify specific characters, map visual environments, and parse nuanced dialogue. — Read More
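The ensemble idea can be sketched as a fan-out-and-merge: several specialized analyzers emit timestamped tags for a clip, and a merge step builds one inverted index that editorial search can query. This is a hypothetical illustration; the model names, tag vocabulary, and outputs are invented stand-ins, not the system the article describes.

```python
from collections import defaultdict

# Stand-ins for specialized models; each returns (timestamp, tag) pairs.
def character_model(clip_id):
    return [("00:01", "character:maya"), ("00:07", "character:sam")]

def environment_model(clip_id):
    return [("00:01", "location:rooftop")]

def dialogue_model(clip_id):
    return [("00:07", "line:we need to go")]

def index_clip(clip_id, analyzers):
    """Fan out to every analyzer and merge results into one inverted index."""
    index = defaultdict(list)  # tag -> list of (clip_id, timestamp)
    for analyze in analyzers:
        for timestamp, tag in analyze(clip_id):
            index[tag].append((clip_id, timestamp))
    return index

index = index_clip("ep01_take03",
                   [character_model, environment_model, dialogue_model])
hits = index["character:sam"]  # every moment this character appears
```

Once merged, a query like "Sam on the rooftop" reduces to intersecting tag lists instead of scrubbing through hours of footage.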

#vfx

VOID: Video Object and Interaction Deletion

Existing video object removal methods excel at inpainting content “behind” the object and correcting appearance-level artifacts such as shadows and reflections. However, when the removed object has more significant interactions — such as collisions with other objects — current models fail to correct them and produce implausible results.

We present VOID, a video object removal framework designed to perform physically-plausible inpainting in these complex scenarios. To train the model, we generate a new paired dataset of counterfactual object removals using Kubric and HUMOTO, where removing an object requires altering downstream physical interactions. During inference, a vision-language model identifies regions of the scene affected by the removed object. These regions are then used to guide a video diffusion model that generates physically consistent counterfactual outcomes. — Read More

#vision

#image-recognition

The 2nd Phase of Agentic Development

Yesterday we talked about how cheap code is fueling an era of idiosyncratic tooling, and previously we’ve talked about the rise of spec driven development. In that second piece, we ran through some of the initial examples of spec driven development with agents.

… The first wave of agentic development brought us clones and ports. When code is incredibly cheap, and you want the code to flow, you can either rely on your own fast feedback or leverage existing test suites. These early projects opted for the latter, as did many tokenmaxxers who are rebuilding their dependencies in Rust or Go.

Two releases this week, however, suggest we’re starting to enter a second phase of open source agentic coding projects. The first brought us clones, this next phase brings us reimaginings. — Read More

#devops

Harness engineering for coding agent users

The term harness has emerged as a shorthand to mean everything in an AI agent except the model itself – Agent = Model + Harness. That is a very wide definition, and therefore worth narrowing down for common categories of agents. I want to take the liberty here of defining its meaning in the bounded context of using a coding agent. In coding agents, part of the harness is already built in (e.g. via the system prompt, or the chosen code retrieval mechanism, or even a sophisticated orchestration system). But coding agents also provide us, their users, with many features to build an outer harness specifically for our use case and system.

A well-built outer harness serves two goals: it increases the probability that the agent gets it right in the first place, and it provides a feedback loop that self-corrects as many issues as possible before they even reach human eyes. Ultimately it should reduce the review toil and increase the system quality, all with the added benefit of fewer wasted tokens along the way. — Read More
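The feedback-loop half of an outer harness can be sketched as a small check runner: after the agent edits the repo, run cheap automated checks and hand any failures back to the agent before a human reviews the change. This is a minimal, hypothetical sketch; the check commands below are placeholders for whatever linter and test suite your project actually uses.

```python
import subprocess

# Placeholder checks; swap in your real linter and test commands.
CHECKS = [
    ["python", "-c", "print('lint ok')"],    # stand-in for a linter
    ["python", "-c", "print('tests ok')"],   # stand-in for the test suite
]

def run_harness(checks):
    """Run each check; collect failure reports to feed back to the agent."""
    failures = []
    for cmd in checks:
        result = subprocess.run(cmd, capture_output=True, text=True)
        if result.returncode != 0:
            failures.append({"cmd": cmd, "output": result.stderr})
    return failures

failures = run_harness(CHECKS)
# An empty list means the change is ready for human eyes.
```

Looping the agent on `failures` until the list is empty is what converts review toil into machine time, and it caps wasted tokens because the agent only sees the specific check output it broke.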

#devops

The Revenge of the Data Scientist

Is the heyday of the data scientist over? The Harvard Business Review once called it “The Sexiest Job of the 21st Century.” In tech, data scientist roles were often among the best paid. The job also demanded an unusual mix of skills.

In addition to creating a high barrier to entry, these skills enabled data scientists to build predictive models, measure causality, and find patterns in data. Of these, predictive modeling paid best. Companies later peeled that work off into a new title: Machine Learning Engineer (“MLE”).

For years, shipping AI meant keeping data scientists and MLEs on the critical path. With LLMs, this stopped being the default. Foundation-model APIs now allow teams to integrate AI independently.

Getting cut out of the loop rattled data scientists and MLEs I know. If the company no longer needs you to ship AI, it is fair to wonder whether the job still has the same upside. The harsher story people tell themselves: unless you are pretraining at a foundation-model lab, you are not where the action is.

I read it the other way. Training models was never most of the job.  — Read More

#data-science