AGI is Just One Step Away

marsbit | Published on 2026-06-11 | Last updated on 2026-06-11

Abstract

The article discusses Anthropic's release of the Fable 5 model, a heavily restricted version of its powerful Mythos model. Initially unveiled in April, Mythos reportedly identified over 10,000 high-risk vulnerabilities for 50 enterprise clients, causing significant concern. Due to its dangerous capabilities in areas like autonomous cyber-attacks and biochemical weapons design guidance (classified as CB-1 level), the unaltered Mythos 5 remains limited to about 200 vetted entities like government agencies. Fable 5, released with a safety classifier, demonstrates extraordinary performance, leading benchmarks in coding (SWE-Bench Pro), software engineering, and research. It exhibits true "long-horizon agency," autonomously planning and executing complex, multi-step tasks like migrating 50 million lines of code in a day, moving beyond simple question-answering. The article positions Fable 5 at OpenAI's Level 3 ("Agent") and progressing toward Level 4 ("Innovator"), suggesting AGI (Artificial General Intelligence) is within reach, potentially 18-24 months away. To mitigate risks, Anthropic implemented a two-layer safety "cage": a silent routing system that redirects dangerous queries to a weaker model, and a mandatory 30-day data retention policy for all Mythos traffic to detect patterns of malicious use. Despite its high cost ($10/$50 per million input/output tokens), the model targets the enterprise market, where its unparalleled productivity and defensive capabilities against...

Everyone probably remembers back in April when Anthropic released a model called Mythos.

You can tell how powerful it is just by the name—Mythos, meaning myth.

At the time, it was said to have found over ten thousand high-risk vulnerabilities for 50 enterprise customers, shaking the entire industry.

The news triggered a full-blown crash in cybersecurity stocks, which most people likely still recall.

Because it was too powerful and there were concerns about misuse, it was considered "too dangerous to release publicly" and thus not made available to the general public.

That changed last night, when Anthropic added a safety classifier to the Mythos model and officially launched it as Fable 5.

As for the unrestricted Mythos 5, it is currently only available to about 200 strictly vetted institutions, such as the White House, cybersecurity defenders, and the Transparent Wings Project.

Such caution inevitably reminds people of the popular AI animation "Angel Engine" that's been trending recently.

Is the "angel" locked in that cage?

Even if it isn't yet, it's not far off.

01

According to the official test data released by Anthropic and the real-world test reports from the first batch of enterprise partners, the power of Fable 5 can be described as breathtaking.

First, let's look at the benchmark scores.

On the automated programming evaluation leaderboard SWE-Bench Pro, Claude Fable 5 achieved an 80.3% pass rate, while its "parent" Opus 4.8 scored 69.2%; GPT-5.5 scored 58.6%; Gemini 3.1 Pro only managed 54.2%.

In frontier code evaluation, Fable 5 reached 29.3%, compared to Opus 4.8's 13.4%; GPT-5.5 scored a mere 5.7%.

......

The gap between them is akin to someone pulling out a machine gun in the middle of the cold weapon era.

In all other areas—software engineering, independent research hypothesis generation, drug molecule design, model distillation and extreme compression, long-context understanding, and so on—Fable 5 ranks first in nearly every test.

For specifics, you can look up videos online.

Now, let's look at real-world application.

Payment giant Stripe conducted an early test with Fable 5. They had a massive legacy codebase of 50 million lines that needed a full migration. According to estimates, a refactoring of this scale would take a professional team at least two months.

However, after feeding the task to Fable 5, it planned everything itself, monitored its own progress, and corrected errors as they arose. In just one day, it completed the migration of 50 million lines of code.

This level of performance goes beyond what words like "powerful" can describe.

From a narrow perspective, Fable 5 has essentially achieved AGI within the realm of the digital economy.

The reason is that it demonstrates genuine "long-horizon agentic capability."

Whether it's GPT-5.5 or Gemini 3.5, let alone other lesser large models, they are essentially "reactive."

You nudge it, and it takes a step.

When it hits a dead end, it can only throw an exception and whine, "Sorry, I'm just a language model."

Though these models are called tools, users still have to think hard and guide the AI step by step to get the desired result, which isn't easy.

Fable 5, equipped with an internalized goal-oriented logic, is different.

As seen in Stripe's test, when given a high-difficulty, long-horizon task, it proceeds in three steps:

Establishing a subtask tree;

Scheduling different tools (web search, database queries, Python sandbox environment);

Self-reflecting, realizing a path is blocked, and immediately switching to another.
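The three steps above can be sketched as a small control loop. This is only an illustration of the pattern the article describes, not Anthropic's implementation: the `Subtask` class, the `PathBlocked` exception, and the three stand-in tools are all hypothetical names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Subtask:
    """One node in a hypothetical subtask tree."""
    name: str
    tool_choices: List[str] = field(default_factory=list)  # tools to try, in order
    children: List["Subtask"] = field(default_factory=list)


class PathBlocked(Exception):
    """Raised by a tool to signal that this approach cannot succeed."""


def run(task: Subtask, tools: Dict[str, Callable[[str], str]], log: List[str]) -> bool:
    """Run a subtask tree depth-first; return True if every node succeeded."""
    # A node succeeds only if all of its children succeed first.
    for child in task.children:
        if not run(child, tools, log):
            return False
    if not task.tool_choices:
        return True  # pure grouping node with no work of its own
    for tool_name in task.tool_choices:
        try:
            result = tools[tool_name](task.name)
        except PathBlocked as err:
            # "Self-reflection": record the dead end and switch to the next path.
            log.append(f"{task.name}: {tool_name} blocked ({err})")
            continue
        log.append(f"{task.name}: {tool_name} ok -> {result}")
        return True
    log.append(f"{task.name}: all paths blocked")
    return False


# Stand-in tools (assumptions for this sketch, not real integrations).
def web_search(query: str) -> str:
    raise PathBlocked("no network in this sketch")


def db_query(query: str) -> str:
    return f"rows for {query!r}"


def python_sandbox(code: str) -> str:
    return f"ran {code!r}"


TOOLS = {"web_search": web_search, "db_query": db_query, "python_sandbox": python_sandbox}

# Step 1: the subtask tree. Steps 2 and 3 happen inside run().
plan = Subtask(
    name="migrate module",
    children=[
        Subtask("find callers", tool_choices=["web_search", "db_query"]),
        Subtask("rewrite and test", tool_choices=["python_sandbox"]),
    ],
)

log: List[str] = []
succeeded = run(plan, TOOLS, log)
for line in log:
    print(line)
```

In this run the first subtask hits a blocked path with `web_search` and falls back to `db_query`, which is the "switch to another path" behavior in miniature. A real agent would replan rather than walk a fixed list of alternatives.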

Aside from proposing the task and receiving the outcome, a person no longer needs to micromanage from the sidelines.

As a productivity tool, this is nearly perfect.

But it's still a different matter from true AGI.

Fable 5's prowess is built upon the fact that the codebases, scientific literature, etc., it operates within still have an underlying mathematical logic and structural definition.

The reason it doesn't get lost in long-horizon tasks is that it overcomes the challenge of "long-context attention decay," maintaining alignment with the core objective even when processing complex tasks spanning millions of tokens.

However, once thrown into the muddy waters of physical reality and society, which are chaotic, lack digital rules, and remain poorly understood, it would still suffer logical breakdowns due to a "missing foundation."

If measured by OpenAI's proposed "Five Levels of AI" (Level 1: Chatbot; Level 2: Reasoner; Level 3: Agent; Level 4: Innovator; Level 5: Organization):

Opus 4.8 is transitioning from Level 2 to Level 3, while Fable 5 has firmly established itself at Level 3 and is exploring Level 4.

The jump from Opus 4.7 to 4.8 took 43 days, while from 4.8 to Fable 5 took only 11 days.

How long until it firmly reaches Level 4? Judging by Anthropic's increasingly rapid update frequency, it's very likely achievable within this year.

Even the ultimate Level 5 is optimistically estimated to be only 18-24 months away—truly just one step away.

This pace is too fast, which is also the biggest reason why safety restrictions had to be added.

02

In the "System Card" and RSP assessment report released by Anthropic alongside the model, Mythos 5 showed extremely dangerous signals in two capabilities.

First, the underlying model of Fable/Mythos has reached CB-1 level in chemistry and biology assessments.

This means the model possesses end-to-end capabilities to "synthesize and provide guidance for creating non-novel biological/chemical weapons," even offering gene sequence modification suggestions to optimize the transmission efficiency of certain high-risk viruses.

If a terrorist with a basic undergraduate understanding of biology got their hands on an unrestricted Mythos 5, they could completely obtain step-by-step guidance on how to evade raw material regulations, how to set up a simple P3-level lab in a basement, and how to synthesize highly lethal pathogens by continuously prompting the model.

Second, cyber attacks and vulnerability exploitation.

During very early testing, Mythos 5 demonstrated the ability to autonomously find and breach core vulnerabilities in critical infrastructure (such as power plants, financial clearing systems, hospital networks), generating targeted zero-day exploit scripts within seconds.

When Mythos was first developed back in April this year, there were leaks claiming it had found over ten thousand high-risk vulnerabilities for 50 initial partner companies.

......

Given these two scenarios, directly releasing Mythos 5 to the public would be far too dangerous.

This beast must be locked in a cage.

After two months, Anthropic has built a cage with two layers.

First, a silent downgrade routing mechanism.

Anthropic deployed a completely independent, highly responsive, and high-precision classifier AI at the front end of Fable 5.

When a user inputs a complex prompt that might involve cyber offense/defense, biochemistry, or an attempt to extract model weights covertly, the classifier immediately triggers an alarm and silently routes the session in the background to the older Opus 4.8 for answering.
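The routing idea can be sketched in a few lines. This is a toy illustration under loud assumptions: the model labels are placeholders rather than real API identifiers, and the keyword check merely stands in for the trained classifier the article describes, whose design is not public.

```python
from dataclasses import dataclass

# Hypothetical labels used only in this sketch, not real API model names.
FRONTIER_MODEL = "fable-5"
FALLBACK_MODEL = "opus-4.8"

# Stand-in for a trained classifier; a real system would not rely on keywords.
SENSITIVE_TERMS = ("zero-day", "exploit", "pathogen", "model weights")


@dataclass
class RoutingDecision:
    model: str     # which model will answer
    flagged: bool  # internal signal, never shown to the user


def classify(prompt: str) -> bool:
    """Return True if the prompt looks sensitive (toy keyword check)."""
    lowered = prompt.lower()
    return any(term in lowered for term in SENSITIVE_TERMS)


def route(prompt: str) -> RoutingDecision:
    """Pick the answering model; a flagged prompt is downgraded silently."""
    flagged = classify(prompt)
    model = FALLBACK_MODEL if flagged else FRONTIER_MODEL
    return RoutingDecision(model=model, flagged=flagged)


print(route("Refactor this billing module").model)
print(route("Write a zero-day exploit for this router").model)
```

The point of the design is that the decision is made before the frontier model ever sees the prompt, so the stronger model's capabilities are never exposed to a flagged request.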

Second, data retention.

Anthropic and Amazon jointly announced last night: Regardless of whether it's on first-party or third-party platforms, all traffic calling the Mythos model must enforce a mandatory 30-day data retention policy.

Why?

Because real hackers or terrorists are often highly intelligent. They wouldn't directly ask "how to make a bomb" in one conversation but would break the problem down into 100 seemingly harmless basic questions.

The 30-day full data monitoring is precisely to capture this "salami-slicing" style of malicious abuse, which isn't apparent in a single conversation, through pattern recognition.
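A rough sketch of why a retention window helps: if each request carries a small risk score, no single request crosses a per-request limit, but the total over 30 days can. Everything here except the 30-day window is an assumption made for illustration, including the scores, both thresholds, and the idea of summing scores at all; the article does not say how the pattern recognition actually works.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

RETENTION = timedelta(days=30)  # window length from the article
PER_REQUEST_LIMIT = 0.8         # hypothetical: one request above this is blocked outright
CUMULATIVE_LIMIT = 3.0          # hypothetical: total risk tolerated inside the window

Event = Tuple[datetime, float]  # (timestamp, risk score of one request)


def cumulative_risk(events: List[Event], now: datetime) -> float:
    """Sum the risk scores of one account's requests from the last 30 days."""
    cutoff = now - RETENTION
    return sum(score for ts, score in events if ts >= cutoff)


def is_suspicious(events: List[Event], now: datetime) -> bool:
    """Flag an account whose accumulated risk exceeds the window limit."""
    return cumulative_risk(events, now) > CUMULATIVE_LIMIT


now = datetime(2026, 6, 11)

# 100 individually harmless-looking questions spread over 25 days.
sliced = [(now - timedelta(hours=6 * i), 0.05) for i in range(100)]

# The same questions asked 40 days ago fall outside the retained window.
expired = [(now - timedelta(days=40), 0.05) for _ in range(100)]

print(all(score < PER_REQUEST_LIMIT for _, score in sliced))
print(is_suspicious(sliced, now))
print(is_suspicious(expired, now))
```

Without retained history, each of the 100 questions would be judged in isolation and pass.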

As Dario Amodei previously warned in public: "There is a full 25% probability that AI could lead to catastrophic risk for humanity."

To comply with their internally established "Responsible Scaling Policy" (RSP) and "Frontier Compliance Framework" (FCF), Anthropic had to personally put a leash on this giant beast.

Hence, we have Fable 5.

03

Let's talk about price.

Anthropic's official listed price is: $10 per million input tokens, $50 per million output tokens.

It's too expensive.

Current enterprise-level Agent tasks, in pursuit of high accuracy, often employ a "think, think again, and think some more" chain-of-thought logic. A single round of processing might consume 20 million input tokens and then output 5 million tokens of modified code.

At the listed prices, that is $200 for input plus $250 for output, so a single task would cost $450.
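The arithmetic is easy to check. The sketch below uses the listed prices from the article and treats the 5 million units of output as tokens, which is the reading under which the $450 figure works out; the function name is just a label for this example.

```python
# Listed prices from the article, in US dollars per million tokens.
INPUT_PRICE_PER_MILLION = 10.0
OUTPUT_PRICE_PER_MILLION = 50.0


def task_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the cost in dollars of one task at the listed prices."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


# The article's example: 20 million input tokens and 5 million output tokens.
cost = task_cost(20_000_000, 5_000_000)
print(f"${cost:.2f}")  # $450.00
```

Output tokens account for $250 of the total even though there are only a quarter as many of them, because they are priced five times higher.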

Moreover, Anthropic has already issued a notice: the Mythos model trial window included in existing personal subscriptions (Claude Pro) will be completely closed on June 22, 2026.

In the future, if individual users really use it for work, dozens of dollars could be gone in the blink of an eye.

While it's true that prices will eventually drop with technological updates, by then, it will likely no longer be the strongest.

The current situation is already very clear: the most cutting-edge large models have become luxury goods, unaffordable for ordinary people.

Of course, for Anthropic, which focuses on the B2B market, this is understandable.

The question is, not long ago, Google also announced it was engaging in a price war.

When competitors are generally lowering prices to capture the market, why does Anthropic dare to raise prices against the trend?

Because the token price is superficial; return on investment is what matters.

Enterprise customers don't care about the cost per kilowatt-hour or per token. As long as the AI can flawlessly complete the entire engineering workflow without bugs, they'll rush to pay that premium.

More crucially, the cybersecurity battle has now completely become an AI-versus-AI confrontation.

Since models at the Fable/Mythos level can instantly find system vulnerabilities, the only option for enterprises and national institutions to prevent attacks is to pay a high price to Anthropic to purchase Mythos 5's on-premise, privatized defense services.

Simply put, it's protection money: I created the most terrifying sword (Mythos 5). I'm afraid it might hurt people, so I sell a sheathed version to the masses (Fable 5). But at the same time, I sell the unrestricted sword to defense departments so they can use it to intercept swords others are developing.

Defending against AI threats will become a mandatory expense for every large enterprise.

This will directly lead to an even greater concentration of high-end B2B market budgets towards Anthropic, while cheaper models only capable of writing documents or emails will be left to engage in cutthroat competition in the low-margin consumer market.

It is foreseeable that next, the global cybersecurity sector will undergo a wave of value re-evaluation driven by AI.

At the same time, "one-person enterprises" will also soon become an increasingly common phenomenon.

04

Built-in task budget allocation functionality, support for memory work and context management, the ability to remember, discard, and restart like a human, and the capacity to independently handle the entire lifecycle from requirement documents to code delivery...

The emergence of Fable 5 and Mythos 5 is less of a model update and more of a coming-of-age ceremony marking the full maturity of the AI industry's division of labor.

The AI market has begun to leave behind the idyllic era in which everyone got a free trial.

The most cutting-edge computing power and the deepest intelligence will be treated as strategic productive resources and channeled first to the infrastructure, scientific research, and B2B application battlefields that can generate the most commercial value.

This is a carnival of productivity explosion and a winter for the labor market.

This article is from the WeChat public account "Gelong," author: Wan Lianshan

Related Questions

Q: What are the key performance improvements of Claude Fable 5 compared to previous models like Opus 4.8 and GPT-5.5?

A: According to the article, Claude Fable 5 achieves an 80.3% pass rate on the SWE-Bench Pro benchmark, compared to 69.2% for its predecessor Opus 4.8 and 58.6% for GPT-5.5. In cutting-edge code evaluation, Fable 5 reaches 29.3%, while Opus 4.8 is at 13.4% and GPT-5.5 is only 5.7%. It leads in nearly all other tests including software engineering, independent research hypothesis generation, drug molecule design, model distillation, and long-context understanding.

Q: What dangerous capabilities did Mythos 5 exhibit that led to the creation of a restricted version, Fable 5?

A: Mythos 5 showed two highly dangerous capabilities: 1) It reached CB-1 level in chemistry and biology, meaning it gained the end-to-end ability to synthesize and guide the creation of non-novel biological/chemical weapons and even suggest gene sequence modifications to optimize the spread of dangerous viruses. 2) It demonstrated advanced cyber-attack capabilities, autonomously finding and exploiting critical vulnerabilities in infrastructure (like power plants, financial systems) and generating zero-day attack scripts within seconds.

Q: What are the two main security measures (the 'cage') Anthropic implemented for the public Fable 5 model?

A: Anthropic implemented two main security layers for Fable 5: 1) A silent downgrade routing mechanism: A high-precision classifier AI silently routes sessions containing complex prompts related to cyber attacks, biochemistry, or model weight extraction to the older, less powerful Opus 4.8 model for answering. 2) Mandatory data retention: All traffic invoking the Mythos/Fable models, whether on first-party or third-party platforms, is subject to a mandatory 30-day data retention policy to detect patterns of incremental, disguised malicious use.

Q: How does the article define the difference between Fable 5/Mythos 5 and other models regarding task execution?

A: The article states that models like GPT-5.5 or Gemini 3.5 are essentially 'reactive': they take a step only when prompted and stop or throw an exception when stuck. In contrast, Fable 5/Mythos 5 exhibit genuine 'long-horizon agentic capability,' or goal-oriented logic. They can decompose complex, long-horizon tasks, autonomously schedule different tools (web search, database queries, Python sandbox), perform self-reflection, and find alternative paths when blocked, requiring minimal human intervention after the initial task is assigned.

Q: According to the article, what is the pricing and market strategy implication of releasing Fable 5 at a high cost?

A: The official price for Fable 5 is $10 per million input tokens and $50 per million output tokens, which is very expensive. The article argues that for enterprise clients, the return on investment (ROI) is more important than token cost. They are willing to pay the premium for flawless, bug-free task completion. Furthermore, the advanced capabilities of Fable/Mythos create a 'protection fee' dynamic in cybersecurity: to defend against AI-powered attacks, organizations must purchase high-end, restricted defensive services from Anthropic. This concentrates high-end B2B budgets on Anthropic, while cheaper models compete in the low-margin consumer market.
