As with any tool, understanding how coding agents work under the hood can help you make better decisions about how to apply them.
A coding agent is a piece of software that acts as a harness for an LLM, extending that LLM with additional capabilities that are powered by invisible prompts and implemented as callable tools. — Read More
Author Archives: Rick's Cafe AI
AI Model Basics for Beginners
Free AI/ML Resources Everyone Should Learn From in 2026
AI and ML have gained a lot of popularity. Every company wants to stay ahead of the curve and introduce AI in its daily operations. Although we have multiple models from ChatGPT, Claude, Cursor, DeepSeek, and other models available in the market today, which amaze the world with their knowledge and data that they share.
However, to learn and grow, we need resources that can help us understand the basics, the technicalities, and most importantly, how to apply these concepts in real-world scenarios.
Below are multiple free resources I’ve gathered to help you master AI/ML concepts effortlessly. — Read More
The Future of Software Engineering with Anthropic
Sivesh and I recently hosted a roundtable on the future of software engineering with Anthropic’s Ash Prabaker and we were joined by engineering leaders from Stripe, NVIDIA, Microsoft, Google DeepMind, xAI, Apple, Scale AI, as well as the legend Peter Steinberger of OpenClaw/OpenAI.
… A major thread throughout the discussion was “closed-loop” development. One participant described a setup at their company where bug reports are automatically triaged by an agent, bucketed by severity, checked against an eval set, and then a fix PR is opened — much of it running with minimal human touch. The room broadly agreed that this kind of loop is where compounding gains actually come from: better coding tools improve the models, better models improve the coding tools. Several people noted their companies are prioritizing coding specifically because of this dynamic.
… The room converged on long-horizon tasks as the real frontier problem. One participant noted that product engineering has started to go exponential for them, but closing the loop on more complex research workflows isn’t there yet. The open questions everyone shared: what do you actually assign an agent for a four- or five-hour run? How do you observe it? How do you keep a human in the loop without babysitting? Nobody had a clean answer. — Read More
Can LLMs Be Computers?
Language models can solve tough math problems at research grade but struggle on simple computational tasks that involve reasoning over many steps and long context. Even multiplying two numbers or solving small Sudokus is nearly impossible unless they rely on external tools.
But what does it take for an LLM itself to be as reliable and efficient as a computer?
We answer this by literally building a computer inside a transformer. We turn arbitrary C code into tokens that the model itself can execute reliably for millions of steps in seconds. — Read More
The context problem: Why enterprise AI needs more than foundation models
Ask an AI coding assistant to, say, “build a React component with a dropdown menu,” and you’ll probably get something impressive in seconds—clean code, proper hooks, accessible markup. It’s the kind of demo that makes CTOs lean forward in their chairs.
Now ask that same AI about your company’s internal API for user authentication. Ask it to integrate with your legacy billing system. Ask it why your team deprecated a particular approach last quarter. Watch it hallucinate with confidence, suggesting endpoints that don’t exist, recommending patterns your architecture explicitly forbids, and generally ignoring the hard-won institutional knowledge that makes your systems actually, you know, work.
This is the enterprise AI paradox: Foundation models know everything about public libraries but precious little about the specifics that matter for your business. They’re trained on millions of open source repositories, but they’ve never seen your codebase. They can regurgitate best practices from popular engineering blogs, but they fail to grasp why those practices might be impossible in your environment. Without context—the community-vetted, institutional knowledge behind business decisions—AI assistants remain dangerously confident when they shouldn’t be. — Read More
How Karpathy’s Autoresearch Works And What You Can Learn From It
Most “autonomous AI research” demos look impressive for the same reason magic tricks do: you only see the interesting part. An agent edits some code, runs an experiment, and shows a better result. What you usually do not see is the part that actually determines whether the system is useful: what is the harness optimizing for, how stable is the evaluation, and what happens when the agent fails?
That is why Karpathy’s Autoresearch is worth paying attention to.X
Autoresearch is not trying to be a general-purpose AI scientist. It is a small, tightly constrained system for one specific job: let an agent modify a training script, run a bounded experiment, measure the result, keep the change if it helps, and discard it if it does not. The repo is tiny, but the design behind it is one of the cleanest examples I have seen of how to build a useful autonomous improvement harness. — Read More
The “Night Shift” Agentic Workflow
Since December, 2025, I’ve been integrating AI agents into my coding workflow.
Previous attempts at agentic workflows have left me exhausted, overwhelmed, and feeling out of touch with the systems I was building. They also degraded quality too much.
My current agentic workflow is about 5x faster, better quality, I understand the system better, and I’m having fun again.
I call this the Night Shift workflow. — Read More
MCP is Dead; Long Live MCP!
There is currently a social media and industry zeitgeist dialed-in on CLIs…just as there was a moment for MCP but just a few short months ago
While it is true that there are token savings to be had by using a CLI, many folks have not considered how agents using custom CLIs run into the same context problem as MCP, except now without structure and many other sacrifices
In much of the discourse, there is a lack of distinction between local MCP over stdio versus server MCP over HTTP; the latter is a very different use case
… The oversight made by many is that individual usage of coding agents looks very different from organizational adoption of coding agents where there is an emphasis on visibility, telemetry, security, quality, and being able to operationalize and maintain agent-coded systems by a team of varying degrees of skill and experience.
For enterprise and org-level use cases, MCP is the present and future and teams need to be able to cut through the hype of the moment. — Read More
5 design skills to sharpen in the AI era
AI is reshaping the way products are made: It’s accelerating exploration, lowering barriers to entry, and widening the circle of who can participate in the design process. In response, teams are honing new skills to meet the moment. In our recent report State of the Designer 2026, we asked the design community which skills matter most to them in the age of AI. Here, we’re sharing what those skills are—and how to perfect them. — Read More