Intent is the design superpower AI can’t replace

Everyone can vibe code now. That’s not an exaggeration. Product managers are prototyping. Engineers are creating UIs. Non-designers are producing interfaces that look reasonable, at least at first glance.

So the pressure on designers to keep up is real. Move faster. Use AI more. Show your value through speed and output. Be the person who executes quickest.

I understand the instinct. But it’s pointing in exactly the wrong direction.

Design was never supposed to win on execution.

… What’s not democratised, what AI genuinely cannot do, is understanding. Knowing which problem is actually worth solving. Knowing how users think about a product, where they get stuck, and what they’re really trying to do underneath the surface request. Knowing why a solution will land or fall flat before you’ve spent two weeks building it.

That’s the job. It always has been. — Read More

#augmented-intelligence

Our evaluation of OpenAI’s GPT-5.5 cyber capabilities

In April, our evaluation of an early snapshot of Anthropic’s Claude Mythos Preview found that it represented a step up in cyber performance over previous frontier models and was the first to complete our corporate network attack simulation end-to-end, a multi-step exercise we estimate would take a human around 20 hours. A key question was whether this reflected a breakthrough specific to one model, or part of a broader trend. Results from an early checkpoint of GPT-5.5 suggest the latter: a second model, from a different developer, now reaches a similar level of performance on our cyber evaluations. — Read More

#cyber

How to achieve truly serverless GPUs

We are in the age of inference. Billion- to trillion-parameter neural networks are run on specialized accelerators at quadrillions of operations per second to generate media, author software, and fold proteins at massive scale.

Inference workloads are more variable and less predictable than the training workloads that previously dominated. That makes them a natural fit for serverless computing, where applications are defined at a level above the (virtual) machine so that they can be more readily scaled up and down to handle variable load.

But serverless computing only works if new replicas can be spun up quickly — as fast as demand changes, which can be at the scale of seconds. Naïvely spinning up a new instance of, say, SGLang serving a billion-parameter LLM on a B200 can take tens of minutes or stall for hours on GPU availability.
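To see why naive cold starts are so slow, a back-of-envelope sketch helps. All numbers below are illustrative assumptions, not Modal’s measurements:

```python
# Rough arithmetic with assumed numbers: time to pull model weights alone,
# before container start, CUDA initialization, or GPU allocation.
params = 8e9                   # 8B-parameter model (assumption)
bytes_per_param = 2            # bf16 weights
weight_bytes = params * bytes_per_param  # 16 GB of weights
bandwidth = 0.5e9              # 0.5 GB/s from object storage (assumption)
seconds = weight_bytes / bandwidth
print(f"~{seconds:.0f} s just to download the weights")  # ~32 s
```

Even under generous assumptions, weight transfer alone blows past a seconds-scale budget, and real cold starts add image pulls, runtime initialization, and GPU scheduling on top.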

At Modal, we’ve done deep engineering work over the last five years to solve this problem. In this blog post, we walk through what we did. — Read More

#performance

Multi-Agent Systems: When 2 Agents Beat 1 (and When They Don’t)

You see the word multi-agent everywhere right now. People build systems with five different AI personas talking to each other in a simulated chat room just to scrape a website and write a blog post. They give them names like Researcher, Writer, and Editor and watch the terminal output scroll by as the agents debate with each other. It all looks impressive, but it is not how you should build software.

Adding more agents to a system does not automatically make it smarter. It actually multiplies your failure rate. Think about the basic math of probability. If you have a single agent that executes its task correctly 90% of the time, your naive system reliability is 0.90.

If you chain three of those agents together, you multiply those probabilities: 0.90 × 0.90 × 0.90 ≈ 0.73. Your baseline reliability just dropped to about 73%. You multiplied your latency and API cost, and made the final output no better.
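The compounding can be checked in a couple of lines, taking the text’s 90% per-agent success rate as given and assuming failures are independent:

```python
# Reliability of a pipeline of n agents, each independently correct
# with probability p; the chain succeeds only if every agent does.
p = 0.90
for n in (1, 2, 3):
    print(f"{n} agent(s): {p ** n:.1%}")  # 90.0%, 81.0%, 72.9%
```

In practice agent failures are often correlated (same model, same prompt style), so the real chain can be even worse than independence predicts.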

… We will see exactly why the single agent misses a critical billing logic flaw, and why the two-agent system catches it. — Read More

#devops

Why senior developers fail to communicate their expertise

…If you’re a senior developer, and if you’ve played with the agents and skills and models and all the other things that are blowing people’s minds, and if your intuition is still telling you something is off in how people are proclaiming your job obsolete, then here, in this post, I’m going to try to put words to your intuition (as a good copywriter does).

But wait a minute! Many seasoned and famous developers are also proclaiming the death of the developer.

How’s that? Whose intuition is right? And what’s causing this split? — Read More

#strategy

Turn Claude Code Into Your Personal Wall Street Analyst

In this guide, you will learn how to add Anthropic’s financial-services plugin marketplace to Claude Code, install market research skills, and use them to create market research reports, equity analysis, earnings reviews, and Excel spreadsheets from your own prompts.

This is not financial advice. The useful version of this workflow is a research assistant that organizes public market information, creates structured reports, and helps you decide what to inspect next. — Read More

#investing

SubQ AI Explained: How Good Is the 12M Context Window LLM?

On May 5, 2026, a tiny Miami-based startup called Subquadratic released a model named SubQ. The team is small, but they’ve raised $29M in seed funding and claim the model can process up to 12 million tokens in a single pass.

They have also made other crazy-sounding claims, such as that their model is up to 52 times more efficient than FlashAttention at 1M tokens and achieves coding performance similar to Claude Opus at roughly 1/20th of the cost.
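For context on the efficiency claim: full attention’s cost grows quadratically with sequence length, so any mechanism that scales closer to linearly wins by a factor that itself grows with context. A toy comparison, illustrative only, since SubQ’s actual architecture and constant factors are unknown:

```python
# Operation counts for full attention (~n^2) vs a linear-time mixer (~n),
# ignoring constant factors entirely; the ratio is just n.
for n in (100_000, 1_000_000, 12_000_000):
    quad, lin = n * n, n
    print(f"n={n:>10,}: quadratic/linear ratio = {quad // lin:,}x")
```

Note that the headline 52x at 1M tokens is far below the asymptotic ratio, which hints at large constant factors or a hybrid design rather than a purely linear mechanism; until independent benchmarks land, all of this is conjecture.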

These are big statements, so it makes sense to break this down and see what’s actually going on. In this piece, I’ll walk through what SubQ is, how the architecture works, and what the early details and developer communities suggest about these claims. — Read More

#performance

Interaction Models: A Scalable Approach to Human-AI Collaboration

AI labs often treat the ability for AI to work autonomously as the model’s most important capability. As a result, today’s models and interfaces aren’t optimized for humans to remain in the loop.

Autonomous interfaces are valuable, but in most real work users can’t fully specify their requirements upfront and walk away: good results come from a collaborative process where the human stays in the loop, clarifying and giving feedback along the way. Yet humans increasingly get pushed out, not because the work doesn’t need them, but because the interface has no room for them. People are most effective when they can collaborate with AI the way we do with other people: messaging, talking, listening, seeing, showing, and interjecting as needed, with the model able to do the same.

To resolve this, we need to move beyond the current turn-based interface for the models. — Read More

#augmented-intelligence

I’m going back to writing code by hand

Here is k10s: https://github.com/shvbsle/k10s/tree/archive/go-v0.4.0

234 commits. ~30 weekends. Built entirely on vibe-coded sessions with Claude, whenever my tokens lasted long enough to ship something.

I’m archiving my TUI tool and rewriting it from scratch.

…I built it in Go with Bubble Tea [1] and it worked.

For a while… 😦

[What] I learned over these 7 months is worth more than the 1690 lines of model.go I’m throwing away. 

….AI writes features, not architecture. The longer you let it drive without constraints, the worse the wreckage gets. The velocity makes you think you’re winning right up until the moment everything collapses simultaneously. — Read More

#devops

The Token Economy: Tokenmaxxing Is Stupid Until It Isn’t

Meta engineers reportedly have an internal leaderboard called Claudeonomics. It is available to all 85,000 employees, and 60 trillion “AI tokens” have been burned in 30 days. The company even created digital badges like “Token Legend” and “Cache Wizard.” Some employees are reportedly leaving agents running overnight to climb the rankings.

On a recent podcast, Jensen Huang said something that sounds wild but is far more plausible if you extrapolate current trends: if a $500,000 engineer isn’t burning at least $250,000 in tokens a year, he’d be “deeply alarmed.”

At the same time, economists are calling it a paradox. — Read More

#strategy