At the end of 2024, 25 percent of new code at Google was being written not by humans, but by generative large language models (LLMs)—a practice known as “vibe coding.” While the name may sound silly, vibe coding is a tectonic shift in the way software is built. Indeed, the quality of LLMs themselves is improving at a rapid pace in every dimension we can measure—and many we can’t. This rapid automation is transforming software engineering on two fronts simultaneously: Artificial intelligence (AI) is not only writing new code; it is also beginning to analyze, debug, and reason about existing human-written code.
As a result, traditional ways of evaluating security—counting bugs, reviewing code, and tracing human intent—are becoming obsolete. AI experts no longer know whether AI-generated code is safer, riskier, or simply vulnerable in different ways than human-written code. We must ask: Do AIs write code with more bugs, fewer bugs, or entirely new categories of bugs? And can AIs reliably discover vulnerabilities in legacy code that human reviewers miss, or do they overlook flaws humans find obvious? Whatever the answer, AI will never again be as inexperienced at code security analysis as it is today. And, as is typical in information security, we are leaping into the future without useful metrics to measure our position or velocity.