Anthropic’s decision to delay the public release of Project Glasswing is one of the clearest signals of how rapidly artificial intelligence is transforming cybersecurity. The company restricted early access to a small group of major vendors — Apple, Microsoft, Google, Amazon, and select partners — to give them time to patch newly uncovered weaknesses before they could be weaponized at scale.
Mythos: AI that Detects Systemic Vulnerabilities Missed for Decades
At the core of Project Glasswing is the Mythos Preview model, designed for deep security analysis. Unlike traditional tools that often surface isolated bugs, Mythos is capable of identifying systemic security flaws in critical software components across operating systems and browsers.
The model has uncovered vulnerabilities in all major platforms, including code that had already undergone years of manual review, fuzzing, and open-source community scrutiny. A notable example is a flaw in OpenBSD, an operating system widely regarded as a benchmark for secure development. The vulnerability had remained in the codebase for roughly 27 years, underscoring both the strength of modern AI-based analysis and the inherent limits of conventional code audits.
From Detection to Exploitation: 72.4% Success Rate in Testing
What differentiates Mythos from previous frontier models is not only its detection capability but its ability to model real exploitation chains. While Anthropic’s earlier flagship model, Claude Opus 4.6, largely failed at autonomously crafting working exploits, Mythos achieved a 72.4% exploitation success rate in a controlled Firefox JS shell environment. This demonstrates that advanced AI systems can move beyond static analysis and independently assemble viable attack paths.
The Bottleneck Shifts from Discovery to Remediation
The most concerning metric is not what Mythos can find, but what the ecosystem can fix: less than 1% of vulnerabilities discovered by Mythos have actually been remediated. In other words, discovery capacity now far exceeds remediation capacity.
Defensive processes are still largely calendar-driven: receive a report, score the vulnerability, open a ticket, develop and deploy a patch, then verify the fix. Even in mature organizations, this cycle typically spans days or weeks. Meanwhile, attackers increasingly leverage large language models (LLMs) to automate reconnaissance, exploit development, lateral movement and post-exploitation steps, effectively operating at machine speed.
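The calendar-driven cycle described above can be sketched as a sequence of timestamped stages whose end-to-end duration is what attackers race against. The stage names, the CVE identifier, and the durations below are illustrative placeholders, not a real SOC workflow:

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta

# Illustrative stages of a calendar-driven remediation cycle;
# names and durations are hypothetical, not benchmarks.
STAGES = ["report_received", "triaged", "ticket_opened",
          "patch_deployed", "fix_verified"]

@dataclass
class RemediationRecord:
    cve_id: str
    timestamps: dict = field(default_factory=dict)

    def mark(self, stage: str, when: datetime) -> None:
        assert stage in STAGES
        self.timestamps[stage] = when

    def time_to_remediate(self) -> timedelta:
        # Elapsed time from initial report to verified fix.
        return self.timestamps["fix_verified"] - self.timestamps["report_received"]

# Example: one stage per day, a multi-day cycle as the article describes.
rec = RemediationRecord("CVE-2026-0001")  # placeholder identifier
t0 = datetime(2026, 1, 5, 9, 0)
for i, stage in enumerate(STAGES):
    rec.mark(stage, t0 + timedelta(days=i))
print(rec.time_to_remediate().days)  # 4
```

Measured this way, the cycle's total latency is the metric that machine-speed attackers exploit, regardless of how well any single stage performs.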
AI-Powered Autonomous Attacks: The FortiGate Campaign
The risks are not theoretical. Earlier this year, investigators documented a campaign in which threat actors used a custom MCP server running an LLM as part of an automated attack chain against FortiGate devices. The AI system handled key stages autonomously: configuration analysis, vector selection, exploitation, credential reset, and preparation for data exfiltration.
According to incident analysis, this campaign compromised 2,516 organizations in 106 countries, largely in parallel. Human operators were mostly involved in oversight and post-incident evaluation. The gap between the speed of AI-enabled attacks and the response capabilities of traditional security operations is no longer a minor discrepancy but a structural divide.
Exploding CVE Volume and the Limits of CVSS-Based Prioritization
Project Glasswing and similar initiatives will sharply increase the number of identified vulnerabilities and associated CVE entries. Yet most vulnerability management programs still rely heavily on CVSS scores, which estimate the technical severity of a flaw without adequately factoring in real-world context.
This context includes existing compensating controls, network architecture, exposure paths and the actual business impact for a specific organization. As the number of findings grows from hundreds to thousands, context-free prioritization based solely on CVSS not only slows down remediation but can render the entire process ineffective.
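To make the argument concrete, here is a minimal, hypothetical scoring sketch that adjusts a CVSS base score with environmental context: internet exposure, compensating controls, and business impact. The field names, multipliers, and CVE labels are invented for illustration; real risk-based prioritization models are considerably richer:

```python
from dataclasses import dataclass

@dataclass
class Finding:
    cve_id: str
    cvss_base: float            # 0.0-10.0 technical severity
    internet_exposed: bool      # reachable from outside the perimeter?
    compensating_control: bool  # e.g. WAF or segmentation in front of it
    business_impact: float      # 0.0-1.0 criticality of the affected asset

def contextual_priority(f: Finding) -> float:
    """Hypothetical context-adjusted score: CVSS alone is not the ranking key."""
    score = f.cvss_base
    score *= 1.5 if f.internet_exposed else 0.5      # exposure path matters
    score *= 0.4 if f.compensating_control else 1.0  # controls reduce urgency
    score *= 0.5 + f.business_impact                 # weight by asset criticality
    return round(score, 2)

findings = [
    Finding("CVE-A", cvss_base=9.8, internet_exposed=False,
            compensating_control=True, business_impact=0.2),
    Finding("CVE-B", cvss_base=7.5, internet_exposed=True,
            compensating_control=False, business_impact=0.9),
]
# A "critical" CVSS score can rank below a "high" one once context is applied.
ranked = sorted(findings, key=contextual_priority, reverse=True)
print([f.cve_id for f in ranked])  # ['CVE-B', 'CVE-A']
```

The point of the sketch is the inversion: the internally shielded 9.8 drops below the internet-facing 7.5 on a business-critical asset, which a CVSS-only queue would never surface.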
Industry studies, including ENISA reports and the Verizon DBIR, consistently show long remediation times for critical vulnerabilities. Research cited by Picus Security indicates that up to 83% of cybersecurity programs fail to deliver measurable risk reduction when they focus on visibility alone instead of validating whether listed vulnerabilities are truly exploitable in their environment.
Autonomous Exposure Validation: A Practical Response to the CVE Tsunami
As AI dramatically reduces the cost and time required to find vulnerabilities, the decisive control point becomes validation — confirming whether a given weakness is practically exploitable in a particular environment with its specific configurations and defenses. Defenders hold a crucial advantage here: they understand their own infrastructure better than any attacker. This advantage matters only if it can be operationalized at comparable speed.
Emerging Platforms for Continuous, Autonomous Security Testing
A distinct class of solutions is emerging under the banner of autonomous exposure validation. One example is Picus Security’s Picus Swarm, a platform built around specialized AI agents that automate an end-to-end workflow traditionally spanning multiple teams: ingesting new threat advisories (such as CISA alerts), modeling attacker techniques, verifying real exploitability in the target environment, and generating actionable remediation guidance.
According to the vendor, such autonomous validation can compress a traditional four‑day “detect–analyze–fix–verify” cycle into a matter of minutes. All actions are traceable and executed within predefined policy guardrails. A key design principle is the shift from rigid, scheduled testing to an event-driven model: any new asset, configuration change, or newly published exploit should automatically trigger targeted security tests.
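The event-driven principle can be sketched as a small dispatcher in which each event type (new asset, configuration change, newly published exploit) triggers only the validation tests relevant to it, rather than waiting for a scheduled scan window. The event names, test names, and `run_validation` callback are all hypothetical; a real platform such as Picus Swarm exposes its own interfaces:

```python
from typing import Callable

# Hypothetical mapping from event types to targeted validation tests.
EVENT_TESTS: dict[str, list[str]] = {
    "new_asset": ["baseline_exposure_scan", "credential_hygiene_check"],
    "config_change": ["policy_drift_check", "segmentation_validation"],
    "exploit_published": ["exploitability_check", "detection_rule_validation"],
}

def on_event(event_type: str, target: str,
             run_validation: Callable[[str, str], bool]) -> dict[str, bool]:
    """Dispatch only the tests relevant to this event against the given
    target. Returns a mapping of test name -> passed."""
    tests = EVENT_TESTS.get(event_type, [])
    return {test: run_validation(test, target) for test in tests}

# Example with a stub runner that "passes" every test.
results = on_event("exploit_published", "edge-fw-01",
                   run_validation=lambda test, target: True)
print(results)  # {'exploitability_check': True, 'detection_rule_validation': True}
```

In practice the callback would invoke the validation platform's API and results would feed the remediation queue; the design choice being illustrated is simply that the trigger is the event itself, not the calendar.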
Ultimately, Project Glasswing will not be judged solely by the number of CVEs it uncovers or the sophistication of the exploit chains it can construct, but by how many vulnerabilities can be eliminated before widespread abuse. To achieve this, organizations should start adapting now: move beyond CVSS-only prioritization, adopt continuous risk-based validation, and automate as much of the path as possible from threat notification to confirmed, remediation-ready findings. Industry discussions, including events such as the online Autonomous Validation Summit, are increasingly focused on this “post‑Glasswing” reality. The sooner enterprises realign their processes to this AI-accelerated threat landscape, the better their chances of closing the gap between attacker speed and defensive resilience.