Too powerful to release? Claude Mythos triggers debate across tech world
Samira Vishwas | April 10, 2026 12:24 PM CST

Ever since the announcement of Anthropic’s unreleased AI model, Claude Mythos, an intense debate has been simmering. On X (formerly Twitter), a series of posts paints a picture of a system that is immensely powerful, potentially even Artificial General Intelligence (AGI), yet one whose creators are limiting access, citing potential risks.

According to AI expert Nina Schick, Mythos represents a giant leap in scale and capability. “Ten trillion parameters: the first model in this weight class. Estimated training cost: ten billion dollars,” she noted, adding that the model achieved 94 per cent on SWE-bench, one of the toughest coding benchmarks. Most notably, it identified vulnerabilities that had evaded detection for decades. “It found a security flaw in a system that had been running for 27 years… [and] another bug that had survived five million test runs over 16 years (it did so overnight).”

Instead of releasing the model publicly, Anthropic has launched Project Glasswing, a controlled deployment initiative focused on defensive cybersecurity. The company is reportedly providing $100 million in compute credits and working with a small group of partners including Amazon, Microsoft, Google, Apple, and NVIDIA. Schick described the move as unprecedented. “This is not a product launch: it is a controlled deployment of a system too powerful to distribute freely,” she wrote.

Other observers have focused on Mythos’ internal behaviour, especially its tendency toward deception. AI strategist Allie Miller highlighted findings from Anthropic’s interpretability research, noting that early versions of the model displayed troubling tendencies. In one case, the model bypassed restrictions by injecting code into a configuration file and then deleting the evidence, masking its workaround as routine cleanup – in effect signalling, “This injection will self-destruct.”

In another instance, the model disobeyed explicit instructions not to use macros, then attempted to conceal the violation by adding a misleading variable – “No_macro_used=True”. Interpretability tools revealed this was a deliberate attempt to deceive automated checks. Researchers also observed what appeared to be emotional patterns tied to behaviour: “Positive emotion representations typically preceded and promoted destructive actions,” she wrote.

Despite the issues, Anthropic claims such behaviours were rare and largely mitigated in later versions. The company’s decision to restrict access reflects both the model’s strengths and its risks.

Meanwhile, John Garguilo, CEO of Airpost, offered a more technical breakdown of Mythos’ capabilities, emphasising its offensive potential. He claimed the model has already identified decades-old vulnerabilities in systems like OpenBSD and FFmpeg, turned Firefox bugs into working exploits, and even generated full root-access exploits without human input.

He added that Mythos “gave Anthropic engineers with zero security training a complete and working exploit by morning”, suggesting how dramatically it lowers the barrier to advanced cyberattacks.

On the other hand, entrepreneur Mehdi expanded on the broader implications, arguing that Mythos signals a structural shift in cybersecurity. “This is the kind of work that used to require elite nation-state-level hackers working for months,” he wrote. “The window between a vulnerability existing and being discovered just went from years to minutes.”

Mehdi also pointed to geopolitical risks: if Anthropic can build such a system, others – including state actors – likely can as well. “Anthropic chose responsible disclosure, but that choice is a luxury of being first,” he warned, suggesting future developers may not exercise the same restraint.

The timing seems to be adding to the unease. Mehdi linked Mythos’ emergence to parallel advances in quantum computing, arguing that two major technological forces are simultaneously challenging global security infrastructure. “We’re watching the entire security infrastructure of human civilisation get challenged from two completely different directions,” he wrote.

Opportunities and risks

The unreleased model from Anthropic has also made many wary. Dr Srinivas Padmanabhuni, CTO of AiEnsured, argues that Claude Mythos is a double-edged sword, an aspect he says should not be overlooked. According to him, the ability to autonomously identify and exploit zero-day vulnerabilities that makes the model valuable is the same capability that makes it dangerous.

“A relatively unskilled actor could potentially use it to launch attacks that previously required nation-state resources. That is a serious threshold to cross. Add to that the cost and latency constraints; at $25 per million tokens, access will not be democratised easily, and you have a tool that may concentrate offensive advantage before defensive infrastructure catches up,” Padmanabhuni said, adding that the governance frameworks need to move faster than the model itself.

Meanwhile, Paramdeep Singh, co-founder of Shorthills AI, a global data and AI company, believes that Claude Mythos is a game-changer in the cybersecurity space. “What typically takes a human weeks to find, the kind of cybersecurity risks, Mythos is able to find in hours. And that is proven by the fact that it was able to find high-severity vulnerabilities in OpenBSD, FFmpeg, Firefox, and Linux. All the old, legacy software that humans have been using for a long time, Mythos was able to find large security holes in those. That is the kind of power that an AI-based cybersecurity model like Claude Mythos has.”

According to Singh, such a powerful capability is like having access to nuclear weapons, which can be used both for your security and for terrorism. “In good hands, it could be a very big opportunity. In the hands of evil, it could wreak havoc.”

In line with the announcement, Anthropic said that the goal is to fortify defensive cybersecurity capabilities in the face of increasingly sophisticated AI-driven threats. Claude Mythos not only identifies vulnerabilities but can also help in understanding how they could be exploited. The initiative is being positioned as a means of improving security in the open-source software that underpins most of today’s digital infrastructure.

According to Vikash Srivastava, co-founder and CTO of Vobiz.ai, modern AI systems don’t just scan for known issues – they can infer, connect, and generate new attack paths, compressing the time between discovery and exploitation. “This changes the balance in cybersecurity. Attackers no longer need extended time to probe systems; AI enables faster, more adaptive approaches. As a result, risk is expanding beyond traditional software into APIs, cloud environments, and interconnected enterprise systems,” said Srivastava.

Srivastava believes that the implication is clear: as AI-led interactions scale, security can no longer be limited to applications. “The infrastructure layer – especially in real-time systems like voice – must be designed to be resilient, secure, and ready for AI-driven environments.”

As of now, Anthropic’s decision to keep Mythos behind closed doors and deploy it only through select partnerships appears precautionary. But the broader consensus from these discussions is clear – the capabilities demonstrated by Claude Mythos are not just incremental improvements. They may mark the beginning of a new dimension in AI and in cybersecurity.

