Is it just me, or are the code generation AIs we’re all using fundamentally broken?
For months, I’ve watched developers praise AI coding tools while silently cleaning up their messes, afraid to admit how much babysitting they actually need.
I realized that AI IDEs don’t actually understand codebases — they’re just sophisticated autocomplete tools with good marketing. The emperor has no clothes, and I’m tired of pretending otherwise.
After two years of frustration watching my AI assistants constantly “forget” where files were located, create duplicates, and use completely incorrect patterns, I finally built what the big AI companies couldn’t (or wouldn’t).
I decided to find out: What if I could make AI actually understand how my codebase works? — Read More
Vibe Coding: Pairing vs. Delegation
In The Vibe Coding Handbook: How To Engineer Production-Grade Software With GenAI, Chat, Agents, and Beyond, Steve Yegge and I describe a spectrum of coding modalities with GenAI. On one extreme is “pairing,” where you are working with the AI to achieve a goal. It really is like pair programming with another person, if that person were both a “summer intern who believes in conspiracy theories” (as Simon Willison put it) and the world’s best software architect.
On the other extreme is “delegating” (which I think many will associate with “agentic coding”), where you ask the AI to do something, and it does so without any human interaction.
… These dimensions dictate the frequency of reporting and feedback you need. — Read More
Code is the new no-code
Most people can’t code. So if you’re running a business, for years you’ve had only two options when you wanted to improve your productivity with the tools and systems you used.
1. Buy better software
2. Pay someone to build better software
For years, we’ve been promised a third option: a future where anyone could build software without learning to code. Just drag-and-drop some blocks, connect a few nodes, and voilà — you’ve built a fully functional app without writing a single line of code! — Read More
Not all AI-assisted programming is vibe coding (but vibe coding rocks)
Vibe coding is having a moment. The term was coined by Andrej Karpathy just a few weeks ago (on February 6th) and has since been featured in the New York Times, Ars Technica, the Guardian and countless online discussions.
I’m concerned that the definition is already escaping its original intent. I’m seeing people apply the term “vibe coding” to all forms of code written with the assistance of AI. I think that both dilutes the term and gives a false impression of what’s possible with responsible AI-assisted programming.
Vibe coding is not the same thing as writing code with the help of LLMs!
… When I talk about vibe coding I mean building software with an LLM without reviewing the code it writes. — Read More
Vibe Coding and the Future of Software Engineering
Vibe coding (or vibeware) is making the rounds on X now. To the best of my knowledge, Andrej Karpathy started the “meme” in this X entry. I find it well written and hilarious, and it seems to have taken off.
Karpathy: “There’s a new kind of coding I call “vibe coding”, where you fully give in to the vibes, embrace exponentials, and forget that the code even exists. It’s possible because the LLMs (e.g. Cursor Composer w Sonnet) are getting too good. Also I just talk to Composer with SuperWhisper so I barely even touch the keyboard. I ask for the dumbest things like “decrease the padding on the sidebar by half” because I’m too lazy to find it. I “Accept All” always, I don’t read the diffs anymore. When I get error messages I just copy paste them in with no comment, usually that fixes it. The code grows beyond my usual comprehension, I’d have to really read through it for a while. Sometimes the LLMs can’t fix a bug so I just work around it or ask for random changes until it goes away. It’s not too bad for throwaway weekend projects, but still quite amusing. I’m building a project or webapp, but it’s not really coding – I just see stuff, say stuff, run stuff, and copy paste stuff, and it mostly works.” — Read More
Building DeepSeek R1 from Scratch Using Python
The entire training process of DeepSeek R1 is essentially the application of different forms of reinforcement learning on top of their base model (i.e., DeepSeek V3).
Starting with a tiny base model that runs locally, we’ll build everything from scratch following the DeepSeek R1 tech report, covering the theory alongside each step. — Read More
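The core of that RL recipe is scoring sampled answers with a simple rule-based reward and normalizing each reward against its sampling group (the GRPO-style advantage). Here is a minimal, illustrative sketch of just that scoring-and-normalization step — the function names and exact-match reward are my assumptions for illustration, not DeepSeek’s actual code:

```python
# Sketch of the group-relative advantage step used in GRPO-style RL
# (the idea behind DeepSeek R1's training). Assumes a rule-based reward:
# 1.0 if the sampled answer matches the reference, else 0.0.
# All names here are illustrative, not DeepSeek's actual implementation.
from statistics import mean, stdev

def rule_based_reward(answer: str, reference: str) -> float:
    """Reward 1.0 for an exact match with the reference answer, else 0.0."""
    return 1.0 if answer.strip() == reference.strip() else 0.0

def group_relative_advantages(rewards: list[float]) -> list[float]:
    """Normalize each reward against its own sampling group."""
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    if sigma == 0.0:
        return [0.0 for _ in rewards]  # no learning signal when all samples tie
    return [(r - mu) / sigma for r in rewards]

# Sample a group of candidate answers for one prompt, score, and normalize:
samples = ["42", "41", "42", "7"]
rewards = [rule_based_reward(s, "42") for s in samples]
advantages = group_relative_advantages(rewards)
```

Answers that beat their group average get positive advantages and are reinforced; the rest are pushed down — which is why no learned reward model is needed for verifiable tasks.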
How to Build an LLM Chat App: The New Litmus Test for Junior Devs
Ah yes, building an LLM chat app—the junior dev’s favorite flex for “I’m a real developer now.” “Hur dur, it’s just an API call!” Sure, buddy. But let’s actually unpack this because, spoiler alert, it’s way more complicated than you think.
… Dismissing the complexity of an LLM chat app feels good, especially if you’re still in tutorial hell. “Hur dur, just use the OpenAI API!” But here’s the thing: that mindset is how you build an app that dies the second 100 people try to use it. Don’t just take my word for it — smarter people than both of us have written about system design for high-concurrency apps. Rate limits, bandwidth, and server meltdowns are real, folks. Check out some classic system design resources if you don’t believe me (e.g., AWS scaling docs or concurrency breakdowns on Medium). — Read More
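One concrete piece of that “it’s more than an API call” complexity is rate limiting the calls you forward to the model provider. A minimal sketch of a token-bucket limiter, the standard pattern here — this is an illustrative toy, not production code; a real service would also need per-user buckets, persistence, and backpressure:

```python
# Minimal token-bucket rate limiter of the kind an LLM chat app needs
# in front of the model API. Illustrative sketch only.
import time

class TokenBucket:
    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity    # start full
        self.last = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        """Refill based on elapsed time, then try to spend `cost` tokens."""
        now = time.monotonic()
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False  # caller should return HTTP 429 or queue the request

# A burst of 12 near-simultaneous requests against a capacity-10 bucket:
bucket = TokenBucket(rate=5.0, capacity=10.0)
results = [bucket.allow() for _ in range(12)]
```

The first ten requests in the burst succeed and the rest are rejected until the bucket refills — exactly the behavior that keeps request number 101 from melting your server.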
My LLM codegen workflow atm
I have been building so many small products using LLMs. It has been fun, and useful. However, there are pitfalls that can waste so much time. A while back a friend asked me how I was using LLMs to write software. I thought “oh boy. how much time do you have!” and thus this post.
I talk to many dev friends about this, and we all have a similar approach with various tweaks in either direction. — Read More
Data Formulator: Create Rich Visualizations with AI
Data Formulator is an application from Microsoft Research that uses large language models to transform data, expediting the practice of data visualization.
Data Formulator is an AI-powered tool for analysts to iteratively create rich visualizations. Unlike most chat-based AI tools, where users need to describe everything in natural language, Data Formulator combines user interface (UI) interactions and natural language (NL) inputs for easier interaction. This blended approach makes it easier for users to describe their chart designs while delegating data transformation to the AI. — Read More
Competitive Programming with Large Reasoning Models
We show that reinforcement learning applied to large language models (LLMs) significantly boosts performance on complex coding and reasoning tasks. Additionally, we compare two general-purpose reasoning models – OpenAI o1 and an early checkpoint of o3 – with a domain-specific system, o1-ioi, which uses hand-engineered inference strategies designed for competing in the 2024 International Olympiad in Informatics (IOI). We competed live at IOI 2024 with o1-ioi and, using hand-crafted test-time strategies, placed in the 49th percentile. Under relaxed competition constraints, o1-ioi achieved a gold medal. However, when evaluating later models such as o3, we find that o3 achieves gold without hand-crafted domain-specific strategies or relaxed constraints. Our findings show that although specialized pipelines such as o1-ioi yield solid improvements, the scaled-up, general-purpose o3 model surpasses those results without relying on hand-crafted inference heuristics. Notably, o3 achieves a gold medal at the 2024 IOI and obtains a Codeforces rating on par with elite human competitors. Overall, these results indicate that scaling general-purpose reinforcement learning, rather than relying on domain-specific techniques, offers a robust path toward state-of-the-art AI in reasoning domains, such as competitive programming. — Read More