Claude Mythos Opens The Cybersecurity Pandora’s box

This is exactly what Anthropic claimed to have achieved with Claude Mythos, its newest and most powerful model, which, the company says, is too powerful to be released to the public.
In its announcement, Anthropic said the new model identified security problems in several operating systems (Linux, OpenBSD, FreeBSD), browsers (Firefox), and widely used software libraries (FFmpeg).
Making such a powerful tool available to anyone (including bad actors) would be irresponsible, so Anthropic only gave access to a small group of “launch partners” (among them AWS, Apple, Google, Microsoft, and the Linux Foundation) under Project Glasswing. The idea is to give important organizations and open source projects advance warning and tools to find more security problems, while Anthropic decides what to do with the wider release of Mythos.
The fine art of Doom Marketing
Of course, the idea is also to hype up the capabilities of the new model.
OpenAI already played the “Our new AI is so powerful, we can’t give it to you” card with GPT-2, a model that today anyone can train for under $100.
The tactic still works: the media and the wider public have bought Anthropic’s doom marketing wholesale. Fear sells, and an AI that can hack anyone is as bad as it gets (or as good as it gets, if you’re in marketing).
Where there’s smoke…
Just because it’s marketing doesn’t mean it’s not true.
For a while now, many security researchers have been increasingly impressed with AI cybersecurity capabilities.
In their testing of Mythos, the AI Security Institute (part of the UK government) “found significant improvement on cyber-attack simulations”.
Open source developers have seen an increasing number of security reports, too: Linux kernel developers (participants in Project Glasswing) said “All open source projects have real reports that are made with AI, but they’re good, and they’re real”. In a similar vein, the developer of the popular open source utility “curl”, who was very vocal about bad AI bug reports in the past, recently used AI to find 50 real bugs in the project.
Even the NSA, the feared U.S. cybersecurity agency, is reportedly using Mythos, despite Anthropic having been banned from U.S. government use just weeks earlier.
The scariest AI of them all?
Based on all the reports, there seems to be some substance to Anthropic’s doom marketing. But let’s stop panicking, breathe for a bit, and try to rationally unpack what might be happening.
The new model is certainly very capable, but it’s not obvious that it’s miles ahead of what’s already there. In fact, the researchers at Aisle tasked small local models with finding the same bugs with (limited) success, concluding that the most important part is the approach, not model capability.
Basically, you can ask the model to carefully review every single part of the codebase and find security bugs. The AI never gets tired of the tedious grind and is happy to spend a lot of time and burn a lot of tokens (and money) in the effort. And if there is something suspicious, there’s a high likelihood it’ll find it.
The researchers point out that more capable models will do better, but you don’t need an out-of-this-world capability to achieve these impressive results.
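The “tireless grind” approach above can be sketched as a trivial harness: walk the codebase, slice every source file into context-window-sized chunks, and hand each chunk to a model with the same audit prompt. Everything below is a hypothetical illustration, not any vendor’s actual tooling — the file extensions, chunk size, and prompt wording are placeholder assumptions, and a real harness would send the prompt to an actual model API and parse its findings.

```python
import os

def iter_review_chunks(root, exts=(".c", ".py"), max_lines=200):
    """Walk a codebase and yield (path, start_line, chunk) pieces small
    enough to fit in a model's context window. exts and max_lines are
    illustrative defaults, not anyone's real configuration."""
    for dirpath, _dirs, files in os.walk(root):
        for name in sorted(files):
            if not name.endswith(exts):
                continue
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as f:
                lines = f.readlines()
            # Slice the file into fixed-size windows of max_lines lines each.
            for start in range(0, len(lines), max_lines):
                chunk = "".join(lines[start:start + max_lines])
                yield path, start + 1, chunk

def review_prompt(path, start, chunk):
    # Hypothetical prompt; a real harness would send this to a model API,
    # collect the responses, and deduplicate/triage the reported findings.
    return (f"Audit the following code from {path} (starting at line {start}) "
            f"for memory-safety and logic bugs. Report only real issues.\n\n"
            f"{chunk}")
```

The point of the sketch is that the loop, not the model, does the heavy lifting: given enough chunks (and tokens, and money), even a modest model gets a look at every corner of the codebase.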
So, on one hand, we don’t need to be scared of Mythos. It’s likely an incremental improvement over previous models. On the other hand, this means everyone can already do this, and probably already is.
Now, you can panic.
GPT enters the Chat
As further proof, just a week after the Mythos announcement, OpenAI released GPT-5.4-Cyber, a dedicated AI model for cyber defense.
Available only to “verified individual defenders and teams responsible for defending critical software”, the new model shows that no great leap forward is required for such a tool.
In fact, both OpenAI and Anthropic have since released newer versions of their flagship models, GPT-5.5 and Claude Opus 4.7, respectively.
The AI Security Institute tested GPT-5.5 as well, and noted that “GPT-5.5 shows that rapid improvement on cyber tasks may be part of a more general trend”.
These models have been trained to refuse cybersecurity-related requests (unless you’re in the program), but the Chinese models are just a few months behind in general coding capabilities, and have no such guards.
Where do we go now?
To quote one of the security researchers, “vulnerability research is cooked”. There’s no going back; motivated actors can already do a lot with the current AI tools, and we’ll only get increasingly powerful ones in the future.
In the short run, this can look pretty bad: expect more exploits, hacks, and bugs across all kinds of software, from attacks on critical infrastructure to supply-chain compromises of popular software libraries.
In the long run, however, I believe this is a good thing: motivated attackers with a lot of money already have stashes of 0-days (unpublicized vulnerabilities). Now, more people will be able to use AI to find these problems in their own code and patch them, leading to more secure software overall.
This is why Anthropic’s Glasswing and OpenAI’s “Trusted Access for Cyber” programs are a good first step, even though they’re available only to select participants. In the future, using open-weights models in a similar manner will bring these capabilities to everyone, cheaply.
Buckle up, it’s gonna be a bumpy ride.


