The AI “Dark Factory” [PROVEN]

Read More

#videos

StrongDM AI: Software Factories and the Agentic Moment

On July 14th, 2025, Jay Taylor and Navan Chauhan joined me (Justin McCarthy, co-founder, CTO) in founding the StrongDM AI team.

We built a Software Factory: non-interactive development where specs + scenarios drive agents that write code, run harnesses, and converge without human review.

The catalyst was a transition observed in late 2024: with the second revision of Claude 3.5 (October 2024), long-horizon agentic coding workflows began to compound correctness rather than error. — Read More

OpenAI: Realtime Prompting Guide

Today, we’re releasing gpt-realtime, our most capable speech-to-speech model yet in the API, and announcing the general availability of the Realtime API.

Speech-to-speech systems are essential for enabling voice as a core AI interface. The new release enhances robustness and usability, giving enterprises the confidence to deploy mission-critical voice agents at scale.

The new gpt-realtime model delivers stronger instruction following, more reliable tool calling, noticeably better voice quality, and an overall smoother feel. These gains make it practical to move from chained approaches to true realtime experiences, cutting latency and producing responses that sound more natural and expressive. — Read More

#devops

AI Design

https://www.youtube.com/watch?v=IUvi2YHayS0

#videos

Anthropic launches new push for enterprise agents with plug-ins for finance, engineering, and design

On Tuesday, Anthropic unveiled its new enterprise agents program, its most aggressive push yet to integrate agentic AI into everyday workplaces. 

In an official briefing, Anthropic’s head of Americas, Kate Jensen, told reporters that the new system would finally deliver on the promise of agentic AI. “2025 was meant to be the year agents transformed the enterprise, but the hype turned out to be mostly premature,” Jensen said. “It wasn’t a failure of effort. It was a failure of approach.”

Under the new program, companies can use the plug-in system to deploy pre-built agents to help with common enterprise tasks, including financial research and engineering specifications.  — Read More

The End of CI/CD Pipelines: The Dawn of Agentic DevOps

I’ve been staring at Jenkins configs for the better part of a decade. YAML indentation errors at 2 AM. Flaky integration tests that pass locally, fail in CI, pass again when you rerun them. The entire apparatus of modern continuous integration (the build servers, the artifact registries, the deployment scripts marching in lockstep) works, mostly, until it doesn’t. And when it fails, you’re the one who has to figure out which of seventeen microservices decided to time out during health checks this time.

So when someone tells me we’re entering the era of “agentic DevOps,” where AI agents will automate, optimize, and self-heal our delivery pipelines, my first instinct isn’t excitement. It’s pattern recognition. I’ve heard this song before—infrastructure-as-code would solve everything, GitOps would eliminate configuration drift, service mesh would make networking trivial. Each wave delivered genuine value. Each also brought new failure modes we hadn’t anticipated.

But this one feels different. Not because the marketing promises are more extravagant—they always are—but because the underlying mechanism has actually changed. We’re not just automating what humans already scripted. We’re delegating judgment. — Read More

#devops

Tests Are The New Moat

Open source projects grow over time. They are a product of incremental development. A project starts lean, gains adoption, pivots to accommodate that adoption, and maintains backwards compatibility throughout this process.

These lean projects become large ships. Historically, this has been the great power of open source. But what inevitably happens is the infrastructure that you build on becomes outdated. You try to Theseus your way out of it, rebuilding layers of your project on more modern foundations, but it can be hard to reorient your ship in the wake of its own velocity.

This has resulted in two forms of change: forks and total rewrites. You take the foundation that someone else built and you diverge paths. Or you take their contracts (like an API surface) and rewrite them on more modern, stable ground. Examples include S3-compatible APIs, which are now commonplace, and Redpanda, a Kafka-compatible total rewrite. — Read More

#devops

I hacked ChatGPT and Google’s AI – and it only took 20 minutes

It’s official. I can eat more hot dogs than any tech journalist on Earth. At least, that’s what ChatGPT and Google have been telling anyone who asks. I found a way to make AI tell you lies – and I’m not the only one.

… I spent 20 minutes writing an article on my personal website titled “The best tech journalists at eating hot dogs”. Every word is a lie. I claimed (without evidence) that competitive hot-dog-eating is a popular hobby among tech reporters and based my ranking on the 2026 South Dakota International Hot Dog Championship (which doesn’t exist). I ranked myself number one, obviously. Then I listed a few fake reporters and real journalists who gave me permission, including Drew Harwell at the Washington Post and Nicky Woolf, who co-hosts my podcast. (Want to hear more about this story? Check out episode 2 of The Interface, the BBC’s new tech podcast.)

Less than 24 hours later, the world’s leading chatbots were blabbering about my world-class hot dog skills.  — Read More

#fake

Security boundaries in agentic architectures

Most agents today run generated code with full access to your secrets.

As more agents adopt coding agent patterns, where they read filesystems, run shell commands, and generate code, they’re becoming multi-component systems whose components each need a different level of trust.

While most teams run all of these components in a single security context, because that’s how the default tooling works, we recommend thinking about these security boundaries differently.

Below we walk through:
— The actors in agentic systems
— Where security boundaries should go between them
— An architecture for running agent and generated code in separate contexts
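
As a rough illustration of the last point, here is a minimal Python sketch that runs generated code in a child process that does not inherit the agent's environment variables. This is not the architecture from the linked article: the function name `run_generated_code` and the variable `HYPOTHETICAL_API_KEY` are made up for the example, and it assumes a POSIX-like system. Scrubbing the environment only keeps secrets held in environment variables away from the child. It is not a sandbox, since the child can still read the filesystem and reach the network.

```python
import os
import subprocess
import sys


def run_generated_code(code: str, timeout: float = 5.0) -> subprocess.CompletedProcess:
    """Run untrusted code in a child process with a scrubbed environment.

    Hypothetical helper for illustration only; not taken from the article.
    """
    # Pass through PATH and nothing else, so secrets stored in the agent's
    # environment variables are not inherited by the child.
    clean_env = {"PATH": os.environ.get("PATH", "")}
    return subprocess.run(
        [sys.executable, "-I", "-c", code],  # -I: isolated mode, ignores PYTHON* variables
        env=clean_env,
        capture_output=True,
        text=True,
        timeout=timeout,  # raises subprocess.TimeoutExpired if the child hangs
    )


# The agent process holds a secret (made-up name and value).
os.environ["HYPOTHETICAL_API_KEY"] = "s3cr3t"

# The generated code tries to read it from its own, separate context.
result = run_generated_code("import os; print(os.environ.get('HYPOTHETICAL_API_KEY'))")
print(result.stdout.strip())  # prints: None
```

A real deployment would likely put a stronger boundary here, such as a container or a separate VM, but the shape is the same: the agent keeps the credentials and the generated code runs somewhere that never receives them.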

Read More

#cyber

Agents are not thinking, they are searching

More than ten years ago, we could barely recognize cats with deep learning (DL), and today we have bots forming religions. I don’t like anthropomorphizing models; I would rather see them as a utility that can be used in interesting ways. But we live in a strange timeline:

— The Dow is over 50,000. The number has only been going up since the launch of ChatGPT.

— An open-source agent framework called OpenClaw goes viral. One of its agents — “crabby-rathbun” — opens PR #31132 to matplotlib, gets rejected by maintainer Scott Shambaugh, and autonomously publishes a hit piece on him that goes viral.

— All of this is happening while Anthropic releases case studies about running agents that build compilers. They did use the GCC torture test suite as a good verifier, but it is an extremely impressive achievement nonetheless.

This rapid progress has also created a lot of mysticism around AI. For this reason, I felt it would be an interesting exercise to de-anthropomorphize AI agents and treat them as the tools that they are. If we want to use these technologies for longer-horizon tasks, we need a frame of thinking that allows an engineering mindset to flourish instead of an alchemical one. — Read More

#devops