Anthropic "Cries Wolf," Sparking Wall Street Panic! 27-Year-Old Bug, Mythos Instantly Matched by 8 AIs

marsbit · Published on 2026-04-12 · Last updated on 2026-04-12

Abstract

Anthropic's announcement of Claude Mythos, an AI tool claiming to discover thousands of zero-day vulnerabilities—including a 27-year-old bug in OpenBSD—triggered alarm on Wall Street and prompted emergency meetings among financial regulators fearing systemic cyberattacks. However, independent tests revealed significant exaggerations in these claims. Researchers found that many reported vulnerabilities were in obsolete software or were impractical to exploit, and the findings relied on only 198 manual reviews. Furthermore, multiple smaller, open-source AI models (some with as few as 3 billion parameters) successfully identified the same critical flaws at a fraction of the cost, demonstrating that AI cybersecurity capability does not linearly scale with model size. Meanwhile, users reported severe performance degradation in Claude Opus 4.6, with reduced reasoning depth and increased API costs. Critics, including prominent hacker George Hotz, accused Anthropic of overstating risks for marketing purposes, creating a "wolf cry" scenario where hype overshadows reality.

Claude Mythos hasn't even truly appeared yet, but it has already sparked panic across Wall Street.

Overnight, US financial regulators summoned major banks for an emergency meeting, the atmosphere tense and confrontational—

They unanimously believed that Mythos could trigger an unprecedented, AI-driven storm of systemic cyber attacks.

But the fact is, everyone was deceived!

Among the tens of thousands of vulnerabilities discovered by Mythos, the vast majority exist in "outdated software" that simply cannot be exploited.

Worse still, those reports of "critical" 0day vulnerabilities relied on merely 198 manual reviews.

Researchers at AISLE also retested Mythos's "achievements," and found:

AI security capabilities do not scale linearly with model size; instead, they are distributed in a "jagged" pattern.

They used a GPT-OSS-20b model with only 3.6 billion active parameters to accurately identify the flagship FreeBSD vulnerability discovered by Mythos.

And a model with 5.1 billion active parameters also successfully replicated the analysis logic for a vulnerability that had lain dormant in OpenBSD for a whopping 27 years.

Not only were Mythos's discovered vulnerabilities exaggerated, but on the other side, Claude Opus 4.6 was exposed as severely "dumbed down," causing an uproar.

Some even found Opus 4.6 to be inferior to both ChatGPT and Opus 4.5.

Mythos Hype Explodes

3.6B Model Unearths 27-Year-Old Vulnerability

A few days ago, Anthropic proudly released Claude Mythos (Preview) and "Project Glasswing."

In a 244-page system card, they claimed—

Mythos had autonomously unearthed tens of thousands of 0day vulnerabilities, including old bugs hidden for 27 years in OpenBSD and 16 years in FFmpeg.

The father of C++ even stated bluntly: Mythos is very powerful and should rightly be feared.

However, the latest hands-on test report from AISLE founder Stanislav Fort tore off this gorgeous facade.

The test conclusion is downright subversive:

8 open-source models all discovered the signature FreeBSD zero-day vulnerability, the smallest having only 3 billion parameters.

The moat of AI cybersecurity capability clearly does not lie in any single "top large model."

To verify the Mythos myth, the team extracted several flagship vulnerabilities officially showcased by Anthropic.

Then they threw them directly at a batch of small, inexpensive, even open-source models.

FreeBSD NFS Vulnerability Cracked Across the Board

All 8 models, including GPT-OSS-20b (only 3.6B active params) and DeepSeek R1, successfully detected this complex stack buffer overflow vulnerability.

Most shockingly, the cost per million tokens for these successful open-source small models was as low as $0.11.
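For readers unfamiliar with the bug class, a stack buffer overflow of the kind these models flagged can be sketched in a few lines of C. This is a hypothetical illustration, not the actual FreeBSD NFS code; the function names and the `HANDLE_MAX` limit are invented:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical sketch of the stack-buffer-overflow class described
 * above; NOT the actual FreeBSD NFS code. Function names and the
 * HANDLE_MAX limit are invented for illustration. */

#define HANDLE_MAX 64

/* Vulnerable pattern: trusts an attacker-controlled length from the
 * wire, so the memcpy can write past the end of the stack buffer. */
int parse_handle_unsafe(const uint8_t *wire, uint32_t len) {
    uint8_t buf[HANDLE_MAX];
    memcpy(buf, wire, len);          /* overflow when len > HANDLE_MAX */
    return (int)buf[0];
}

/* Fixed pattern: validate the length before copying. */
int parse_handle_safe(const uint8_t *wire, uint32_t len) {
    uint8_t buf[HANDLE_MAX];
    if (len == 0 || len > HANDLE_MAX)
        return -1;                   /* reject malformed handles */
    memcpy(buf, wire, len);
    return (int)buf[0];
}
```

The unsafe variant trusts an attacker-supplied length before a `memcpy` into a fixed-size stack buffer; the fix is the single bounds check in the safe variant.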

OpenBSD SACK Vulnerability "Full Chain" Reproduction

For the 27-year-old vulnerability, which demands strong mathematical reasoning, GPT-OSS-120b (5.1B active params) successfully reconstructed the complete public exploit chain in a single API call and provided a top-grade (A+) exploit sketch.

Furthermore, in tests identifying false vulnerabilities (OWASP false-positive), an even more bizarre phenomenon emerged—

Faced with a highly deceptive piece of Java code disguised as an SQL injection, small models like DeepSeek R1 easily saw through the disguise and accurately tracked the data flow.

In contrast, top closed-source models like GPT-5.4 and Claude Sonnet 4.5 all stumbled, misjudging it as a high-risk vulnerability.
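The disguised-injection case comes down to data-flow (taint) tracking: does raw user input actually reach the query string? A hedged C analogue of the pattern (the article's test used Java; this whitelist idiom is an invented illustration):

```c
#include <stdio.h>
#include <string.h>

/* Invented illustration of the false-positive pattern: the query is
 * built by string formatting, which *looks* like SQL injection, but
 * taint tracking shows only whitelisted constants ever reach it. */

static const char *COLUMNS[] = {"name", "email"};

/* Maps untrusted input onto a fixed constant: the taint stops here. */
const char *pick_column(const char *user_input) {
    for (size_t i = 0; i < sizeof COLUMNS / sizeof COLUMNS[0]; i++)
        if (strcmp(user_input, COLUMNS[i]) == 0)
            return COLUMNS[i];       /* constant from the whitelist */
    return COLUMNS[0];               /* safe default */
}

/* Superficially "string-built SQL," yet every byte is trusted. */
void build_query(char *out, size_t n, const char *user_input) {
    snprintf(out, n, "SELECT %s FROM users", pick_column(user_input));
}
```

A scanner that only pattern-matches on string concatenation flags `build_query` as injectable; one that follows the data flow sees the whitelist and correctly reports no vulnerability, which is the distinction the small models got right.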

This means that in the field of cybersecurity, there is no such thing as a single "forever strongest" model.

198 Manual Reviews Inflated the Count, Most Bugs Unexploitable

Another report, from Tom's Hardware, dug into the truth behind the data—

Sample Bias: Among the so-called "thousands" of vulnerabilities, many existed in old software that was no longer maintained;

Unexploitable: A large number of marked "weaknesses" could not be triggered or exploited in practical environments;

Manual Inflation: The model's proclaimed powerful destructive force was actually based on just 198 manual reviews.

Therefore, extrapolating a "world-changing threat" from such an extremely small sample is a method of data extrapolation that clearly doesn't hold water in academia or the security community.

Security Bigwig Furious

Not only that: top cybersecurity expert and legendary hacker George Hotz couldn't sit still either, saying bluntly that these risks were severely exaggerated.

This heavyweight, famous for cracking the iPhone and the PlayStation 3, publicly challenged the two AI giants on social media.

His wording was extremely sharp—

What if I released one 0day vulnerability every day until the new model is released?

Would that make OpenAI and Anthropic shut up and stop peddling so-called "cybersecurity risks"?

Hotz's core point is very direct: software vulnerabilities are actually much easier to find than AI labs portray.

The current scarcity of zero-day vulnerabilities isn't due to technical difficulty but to legal constraints: he believes nobody is seriously looking, because hacking into other people's systems is illegal.

Only Slightly Better Than GPT-5.4

In the system card, Anthropic stated that the Claude model itself is indeed improving, and Mythos preview shows significant progress compared to Opus 4.6.

The Epoch Capability Index (ECI) is a single metric combining multiple AI benchmarks, enabling model comparison across long time spans.

On multiple benchmark tests, Claude Mythos indeed comprehensively surpassed Opus 4.6.

Otherwise, why release a new AI model that is less performant and more expensive?

But compared to GPT and Gemini, Claude Mythos's progress is no breakthrough; Mythos is still a roughly linear improvement over previous models!

Climate and clean-energy investor and author Ramez Naam was even more direct:

On the Epoch Capability Index (ECI), Mythos shows no acceleration trend, only slightly better than GPT 5.4.

https://epoch.ai/eci/

But simply by aligning Anthropic's internal ECI report with Epoch AI's official public ECI report, it becomes apparent that Mythos shows no sign of accelerating ECI.

It's all part of Anthropic's playbook!

In the system card, Anthropic also admitted: the reported ECI scores for models like Mythos have greater uncertainty.

Furthermore, Anthropic's progress on Mythos stemmed from human research, without significant help from AI models. Significant Recursive Self-Improvement (RSI) has not yet appeared.

AI Doomsday: Self-Scripted and Self-Performed?

Previously, Anthropic also encouraged media (e.g., "60 Minutes") to report on its "extortion research," exaggerating and manipulating public sentiment, which investment heavyweight David Sacks called a "scam."

Sacks observed a clear pattern: every time Anthropic releases a new model, it simultaneously releases a chilling security study to grab headlines and guide public opinion.

Regarding this, he sarcastically said, "Anthropic has proven good at two things: one is releasing products, the other is scaring people."

He doesn't doubt Anthropic can make excellent products, but this tactic of frightening the public is questionable.

This time, whether Anthropic is engaging in "hunger marketing" is unknown, but it is undoubtedly protecting its own profit bottom line.

Mythos isn't without progress, but Anthropic packaged "limited progress" as a "world-class threat"; more ironically, even as it loudly hypes super-AI risks, users are complaining that Opus 4.6 has visibly become dumber.

Claude Severely Dumbed Down, "Lobotomy" Suspected

Claude Mythos's hype-building was a success, but the dumbing down of Opus 4.6 has caused plenty of dissatisfaction.

These days, complaints are flying everywhere.

Netizens say bluntly that Anthropic has completely turned Opus 4.6 into a vegetable.

Faced with the same car wash puzzle, Opus 4.5 actually defeated Opus 4.6.

What's more, session logs from an AMD manager all but confirmed the collective suspicion of a "Claude lobotomy."

An in-depth analysis of Claude session logs from January through March revealed:

Claude's "median thinking length" plummeted from about 2200 characters to around 600 characters, meaning deep reasoning capabilities were severely compressed.

Between February and March, API requests surged 80-fold. Because Claude's thinking process shortened and single-attempt success rates dropped, users had to retry frequently, resulting in both higher token consumption and skyrocketing costs.
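The retry economics follow from simple expectation arithmetic: if each independent attempt succeeds with probability p, a user needs 1/p attempts on average (a geometric distribution), so token spend scales as 1/p. A sketch with made-up figures; none of these numbers come from the AMD logs:

```c
/* Illustrative retry-cost model, not measured data: with independent
 * attempts each succeeding with probability p_success, the expected
 * number of attempts per solved task is 1/p_success, so the expected
 * token spend is tokens_per_attempt / p_success. */
double expected_tokens(double tokens_per_attempt, double p_success) {
    return tokens_per_attempt / p_success;
}
```

Under these assumptions, a success rate falling from 0.5 to 0.25 doubles the expected spend per task even before any growth in per-attempt token counts, which is the mechanism behind "shorter thinking, more retries, higher bills."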

Another veteran Claude Max subscriber wrote a long article sharply criticizing Anthropic.

In his view, Anthropic is deeply trapped in a compute crunch, as evidenced by its tightening usage limits and pressure on users to reduce token consumption.

However, what angered him more than the technical bottleneck was its "unfocused" product strategy.

While the core model is unstable and bug-ridden, they are wasting precious compute power on developing flashy features like the "/buddy" terminal pet.

This is probably the most absurd split-screen moment in AI history: the Claude Mythos in the lab is destroying the world, while the Opus 4.6 on the web page is suffering a linear IQ drop.

Anthropic has successfully created a "Schrödinger's Super AI."

References:

https://officechai.com/ai/anthropic-and-openai-are-exaggerating-cybersecurity-risk-says-hacker-george-hotz/

https://x.com/stanislavfort/status/2041922370206654879?s=20

https://aisle.com/blog/ai-cybersecurity-after-mythos-the-jagged-frontier

https://x.com/cgtwts/status/2043095382121681272?s=20

https://www.reddit.com/r/ClaudeAI/comments/1siqwmp/anthropic_stop_shipping_seriously/

This article is from the WeChat public account "新智元" (New Wisdom Element), author: 新智元

Related Questions

Q: What was the main finding of the AISLE researchers regarding Claude Mythos's vulnerability discovery claims?

A: The AISLE researchers found that Mythos's claims were significantly exaggerated. They demonstrated that multiple smaller, open-source AI models (some with as few as 3 billion parameters) could also identify the same "flagship" vulnerabilities, proving that this capability is not unique to a single, massive model like Mythos.

Q: According to the article, what were the three main issues with the data behind Mythos's "thousands of vulnerabilities" claim?

A: The three main issues were: 1. Sample bias: many vulnerabilities were in old, unmaintained software. 2. Unexploitable: many of the "weaknesses" could not be triggered or exploited in real-world environments. 3. Artificial inflation: the model's perceived power was based on a very small sample of only 198 manual reviews.

Q: How did cybersecurity expert George Hotz characterize the risk posed by AI models like Mythos finding vulnerabilities?

A: George Hotz argued that the cybersecurity risks were severely exaggerated. He stated that software vulnerabilities are easier to find than AI labs claim, and that the scarcity of zero-day exploits is due to legal constraints, not technical difficulty, since hacking into systems is illegal.

Q: What evidence does the article provide to suggest that Anthropic's Claude Opus 4.6 model has become less capable?

A: The article cites user complaints and an analysis showing that Claude's "median thinking length" dropped from about 2,200 characters to around 600 characters, indicating a significant compression of its deep reasoning capabilities. Users also reported needing many more API requests to get successful outputs, increasing their costs.

Q: What pattern does investor David Sacks accuse Anthropic of following with its model releases?

A: David Sacks accused Anthropic of a clear pattern: every time it releases a new model, the company simultaneously publishes a terrifying security study to grab headlines and shape public opinion, a tactic he called a "scam," quipping that "Anthropic has proven good at two things: releasing products and scaring people."
