PentAGI is an innovative tool for automated security testing that leverages cutting-edge artificial intelligence technologies. The project is designed for information security professionals, researchers, and enthusiasts who need a powerful and flexible solution for conducting penetration tests. — Read More
Recent Updates
Patterns for Reducing Friction in AI-Assisted Development
The practices that make human pair programming effective—onboarding, structured design discussion, shared standards—apply equally to working with AI coding assistants. I propose five patterns that bring this collaborative scaffolding to AI-assisted development, shifting the experience from correcting a tool to collaborating with a capable teammate.
PATTERNS
Knowledge Priming
Design-First Collaboration
Context Anchoring
Encoding Team Standards
Feedback Flywheel
— Read More
Claude Managed Agents: get to production 10x faster
Today, we’re launching Claude Managed Agents, a suite of composable APIs for building and deploying cloud-hosted agents at scale.
Until now, building agents meant spending development cycles on secure infrastructure, state management, permissioning, and reworking your agent loops for every model upgrade. Managed Agents pairs an agent harness tuned for performance with production infrastructure to go from prototype to launch in days rather than months.
Whether you’re building single-task runners or complex multi-agent pipelines, you can focus on the user experience, not the operational overhead. — Read More
Meta debuts the Muse Spark model in a ‘ground-up overhaul’ of its AI
Meta released an AI model on Wednesday called Muse Spark, which marks its “first step” toward an “overhaul of [its] AI efforts.”
Muse Spark is the inaugural model to come out of Meta Superintelligence Labs, which was created last year because CEO Mark Zuckerberg was reportedly unhappy with the progress of Meta and its Llama models and how they lagged behind OpenAI’s ChatGPT and Anthropic’s Claude. Meta recruited former Scale AI co-founder and CEO Alexandr Wang to lead Meta Superintelligence Labs and invested $14.3 billion in the data labeling company for a 49% stake.
Now, it’s time for Zuckerberg to see if his reconfigured AI team can woo users. — Read More
Cybersecurity in the Age of Instant Software
AI is rapidly changing how software is written, deployed, and used. Trends point to a future where AIs can write custom software quickly and easily: “instant software.” Taken to an extreme, it might become easier for a user to have an AI write an application on demand—a spreadsheet, for example—and delete it when you’re done using it than to buy one commercially. Future systems could include a mix: both traditional long-term software and ephemeral instant software that is constantly being written, deployed, modified, and deleted.
AI is changing cybersecurity as well. In particular, AI systems are getting better at finding and patching vulnerabilities in code. This has implications for both attackers and defenders, depending on the ways this and related technologies improve.
In this essay, I want to take an optimistic view of AI’s progress, and to speculate what AI-dominated cybersecurity in an age of instant software might look like. There are a number of unknowns that will factor into how the arms race between attacker and defender might play out. — Read More
Spec-Driven Development Is Waterfall in Markdown
SpecKit has 77,000 GitHub stars. AWS built an entire IDE around spec-driven development. Tessl raised $125 million on the promise that specs, not code, should be the source of truth.
The pitch was clean: stop vibe coding, write a proper specification, let the agent execute against it. Engineers loved it. It felt like rigor. It felt like the adults had finally entered the room.
Then someone actually tested it on a real project. Ten times slower. More ceremony. Same bugs.
The industry built an entire ecosystem around one idea: if we give AI agents a detailed enough spec, they’ll produce working software. It’s the same bet the industry made with outsourcing, with offshoring, with every model that tries to replace understanding with documentation. Write it down clearly enough and someone (or something) on the other side will execute it perfectly. — Read More
Project Glasswing: Securing critical software for the AI era
Today we’re announcing Project Glasswing, a new initiative that brings together Amazon Web Services, Anthropic, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks in an effort to secure the world’s most critical software.
We formed Project Glasswing because of capabilities we’ve observed in a new frontier model trained by Anthropic that we believe could reshape cybersecurity. Claude Mythos Preview is a general-purpose, unreleased frontier model that reveals a stark fact: AI models have reached a level of coding capability where they can surpass all but the most skilled humans at finding and exploiting software vulnerabilities. — Read More
Closing the knowledge gap with agent skills
Large language models (LLMs) have fixed knowledge because they are trained at a specific point in time. Software engineering practices are fast-paced and change often: new libraries are launched every day and best practices evolve quickly.
This leaves a knowledge gap that language models can’t solve on their own. At Google DeepMind we see this in a few ways: our models don’t know about themselves when they’re trained, and they aren’t necessarily aware of subtle changes in best practices (like thought circulation) or SDK changes.
Many solutions exist, from web search tools to dedicated MCP services, but more recently, agent skills have surfaced as an extremely lightweight but potentially effective way to close this gap.
While there are strategies that we, as model builders, can implement, we wanted to explore what is possible for any SDK maintainer. Read on for what we did to build the Gemini API developer skill and the results it had on performance. — Read More
SAFe Was Bad for Agility. For AI, It’s Catastrophic.
Last year, during an engagement with an insurance company, I worked with the product leadership team to understand why their 8-month AI initiative had stalled. They’d assembled a dedicated AI working group, run three PI planning cycles where AI use cases were formally assigned to Release Trains, and produced a 21-slide deck explaining their AI strategy.
They had not shipped a single AI-powered feature.
The working group was waiting on the Q3 plan to be ratified before beginning experimentation. The Release Trains were waiting on the working group’s recommendations. The 21-slide deck was in review with the PMO.
This wasn’t negligence or laziness. This also wasn’t a technology problem. This was SAFe working exactly as designed. — Read More
AI replaced 80% of Coding, Only these 7 skills are left.
Something strange is happening in software engineering right now.
Companies adopted AI to speed up code generation, and on the surface, it worked. AI can write syntax faster than any human ever could. It can generate boilerplate, suggest implementations, create tests, and even imitate design patterns in seconds.
That sounds like the beginning of the end for software engineering.
But that is not what is actually happening.
The real story is more interesting. — Read More