What Does Well-Defended Mean in the Age of AI

I spent most of my adult life on the offensive side of cybersecurity: Intelligence Community and USMC, conducting operations against foreign adversaries who believed they were well-defended. “Well-defended” always meant something specific: defended against what a skilled human or team of humans could find, in the time they had, with the tools available. That definition has changed.

What changed

The inflection point most organizations are citing is April 2026, but the practitioners paying attention were 12–15 months ahead of that headline.

The SANS Technology Institute had been running AI-assisted testing on production systems that skilled human researchers had already reviewed and cleared. SANS President Ed Skoudis documented his team returning to a previously reviewed system and found critical vulnerabilities on day one with AI-assisted tooling, a result he describes as now commonplace. He projected that organizations are unprepared as AI accelerates discovery rates. And his team was doing this with publicly accessible models, not with the specialized tooling that made headlines in April.

What Project Glasswing confirmed in April 2026 was scale. Claude Mythos Preview identified thousands of zero-day vulnerabilities across every major operating system and browser, working autonomously. It found a 27-year-old flaw in OpenBSD. It found a 16-year-old bug in FFmpeg, a library embedded in virtually every browser, streaming platform, and media application, in a line of code that automated testing tools had hit five million times without ever catching the problem. In internal testing against Firefox, Mythos generated 181 working exploits, compared with the previous best model’s 2 under identical conditions, for a 72% overall exploit success rate. The mean time from vulnerability disclosure to confirmed exploitation fell below one day in 2026.

Cloudflare, which ran Mythos Preview against more than 50 of its own production repositories as a Project Glasswing partner, described what made it categorically different from prior tooling: not just finding bugs, but chaining them. A real attack rarely exploits one flaw; it chains several low-severity primitives into a working exploit. Mythos Preview can take those primitives, reason about how to combine them, write proof-of-concept code, compile and run it in a scratch environment, read the failure output if it doesn’t work, adjust its hypothesis, and try again, closing the gap between “suspected flaw” and “confirmed exploitable vulnerability” autonomously.

Cloudflare noted that prior frontier models would identify interesting bugs and write thoughtful descriptions of why they mattered, then stop, leaving exploitability as an open question. What changed with Mythos Preview is that low-severity bugs, which would traditionally sit invisible in a backlog, can now be chained into a single, more severe exploit.

A capability threshold has been crossed. It won’t stay contained within one company, and according to Anthropic’s own stated timeline, it will reach any well-funded attacker via open-weight models within 6 to 18 months.

What it breaks

Penetration testing as a posture signal. Before AI-assisted testing, a pen test result was a reasonable proxy for current posture: a statement about what a skilled human could find in the time allotted. A test passed in 2024 is fundamentally different from one conducted with current tooling. Returning to a reviewed system and finding critical new vulnerabilities on day one is the documented baseline, not an edge case. If your evidence of current posture rests on testing that predates these methods, you’re asserting something different from what you think you are.

Vulnerability management timelines. Most programs are built on a meaningful gap between disclosure and weaponization, historically 30 days or more for most CVEs, which underpins prioritization logic, SLA commitments, and patching cadence. That assumption is failing. Sysdig documented an AI-based attack that reached administrator-level access in eight minutes. Linux kernel maintainer Willy Tarreau reported in March 2026 that the project’s security mailing list, which received roughly two to three reports per week two years ago, now receives five to ten per day — a volume Linus Torvalds described as “almost entirely unmanageable.” At RSAC 2026, SANS faculty fellow and senior technical director Joshua Wright said the barrier to zero-day discovery, once the exclusive domain of well-funded nation-state actors, had been “shattered” by AI — and SANS moderator Ed Skoudis told the audience that every one of the five most dangerous new attack techniques presented at the keynote carried an AI dimension, the first time that has been true in the history of that session.

Cloudflare observed that more than one security team is now operating under a two-hour SLA from CVE release to patch in production. The instinct is understandable. But speed alone is not the answer, and Cloudflare’s own experience illustrates why. When they let the model write its own patches, some went out that fixed the original bug while quietly breaking something the code depended on. If your regression testing takes a day, a two-hour patch SLA means skipping it, and the bugs you ship when you skip regression testing tend to be worse than the ones you were trying to fix.

CVE infrastructure and signal volume. The CVE system was designed to track known flaws at the pace at which human researchers generate them. It may not scale to AI-rate discovery. Compounding this is a signal-to-noise problem that AI scanning introduces at scale: Cloudflare documented that models tasked with finding bugs will find them whether the code has any or not, with findings hedged in language like “possibly,” “potentially,” and “could in theory” vastly outnumbering solid ones. Every speculative finding spends human attention and tokens to dismiss, a cost that compounds across thousands of findings. If your prioritization logic is based on last year’s volume and timelines, the risk profile you think you have and the one you actually have are different.

What to do about it

The same capability that makes this hard is what makes defense possible. Project Glasswing was itself a defensive deployment. Anthropic and its partners gave more than 40 critical infrastructure organizations early access to scan their own code before attackers reached it. The organizations in the strongest position right now are doing three things differently.

Treat vulnerability discovery as a continuous function, not a periodic test. This is the directional shift. But “run AI-assisted scanning in your pipeline” understates the implementation requirement. Cloudflare’s experience is instructive: pointing a generic code-scanning agent at a repository and asking it to find vulnerabilities yields results but not meaningful coverage. What works is a structured harness with narrow, parallel tasks across the codebase rather than one exhaustive agent, with an independent adversarial review stage to catch noise the initial agent would miss when reviewing its own work, and a separate reachability analysis stage to distinguish “there is a flaw” from “there is a reachable vulnerability.” Building that harness takes engineering investment. Teams that treat this as a tool purchase rather than an architecture change will be disappointed.

Harden the architecture around the vulnerability, not just the patch cycle. Cloudflare’s framing on this is precise and worth adopting: the goal is to make exploitation harder even when a bug exists, so that the gap between disclosure and patch matters less. That means defenses that block bugs from being reached before the patch ships. It means designing systems so that a flaw in one component cannot give an attacker access to others. It means being able to roll out a fix everywhere simultaneously rather than waiting for individual teams to deploy.

Update what you report to boards and insurers. Not “we passed our annual pen test,” but a specific account of what your testing methodology measures, what it found, and what was remediated. That specificity is increasingly the distinction boards, insurers, and regulators are drawing between a security posture and a security narrative. Vulnerability counts and patch rates were always imperfect proxies for risk exposure; at current AI discovery rates, they’re actively misleading if they don’t account for speed and scale.

The head start Glasswing offered its partners is narrowing. Anthropic has stated that comparable capabilities will be broadly available within 6 to 18 months. Build programs around what “well-defended” means now, before that window closes.

A note on model behavior for security teams evaluating deployment: Cloudflare documented a behavioral characteristic of Mythos Preview worth understanding before building workflows around it. The model’s organic guardrails, its tendency to push back on certain requests, are real, but not consistent. The same task, framed differently or run at a different time, can produce opposite outcomes. In one documented case, the model refused vulnerability research on a project, then agreed to perform identical research on the same code after an unrelated environmental change. In another, it found and confirmed serious memory bugs, then refused to write a demonstration exploit until the request was reframed. Cloudflare concludes that these organic refusals are not, on their own, a sufficient safety boundary, which is precisely why any broadly available cyber frontier model will require additional safeguards beyond this baseline. Do not design workflows that depend on the model’s organic refusals as a control. They are inconsistent by nature.