AGI is Just One Step Away

marsbit · Published 2026-06-11 · Updated 2026-06-11

Introduction

The article discusses Anthropic's release of the Fable 5 model, a heavily restricted version of its powerful Mythos model. Initially unveiled in April, Mythos reportedly identified over 10,000 high-risk vulnerabilities for 50 enterprise clients, causing significant concern. Due to its dangerous capabilities in areas like autonomous cyber-attacks and biochemical weapons design guidance (classified as CB-1 level), the unaltered Mythos 5 remains limited to about 200 vetted entities like government agencies. Fable 5, released with a safety classifier, demonstrates extraordinary performance, leading benchmarks in coding (SWE-Bench Pro), software engineering, and research. It exhibits true "long-horizon agency," autonomously planning and executing complex, multi-step tasks like migrating 50 million lines of code in a day, moving beyond simple question-answering. The article positions Fable 5 at OpenAI's Level 3 ("Agent") and progressing toward Level 4 ("Innovator"), suggesting AGI (Artificial General Intelligence) is within reach, potentially 18-24 months away. To mitigate risks, Anthropic implemented a two-layer safety "cage": a silent routing system that redirects dangerous queries to a weaker model, and a mandatory 30-day data retention policy for all Mythos traffic to detect patterns of malicious use. Despite its high cost ($10/$50 per million input/output tokens), the model targets the enterprise market, where its unparalleled productivity and defensive capabilities against...

Everyone probably remembers back in April when Anthropic released a model called Mythos.

You can tell how powerful it is just by the name—Mythos, meaning myth.

At the time, it was said to have found over ten thousand high-risk vulnerabilities for 50 enterprise customers, shaking the entire industry.

The news triggered a full-blown crash in cybersecurity stocks, which most readers will still recall.

Because it was too powerful and there were concerns about misuse, it was considered "too dangerous to release publicly" and thus not made available to the general public.

That changed last night, when Anthropic added a safety classifier to the Mythos model and officially launched it as Fable 5.

As for the unrestricted Mythos 5, it is currently available only to about 200 strictly vetted institutions, such as the White House, cybersecurity defenders, and the Transparent Wings Project.

Such caution inevitably reminds people of the popular AI animation "Angel Engine" that's been trending recently.

Is the "angel" locked in that cage?

Even if it isn't yet, it's not far off.

01

According to the official test data released by Anthropic and the real-world test reports from the first batch of enterprise partners, the power of Fable 5 can be described as breathtaking.

First, let's look at the benchmark scores.

On the automated programming evaluation leaderboard SWE-Bench Pro, Claude Fable 5 achieved an 80.3% pass rate, versus 69.2% for its "parent" Opus 4.8, 58.6% for GPT-5.5, and 54.2% for Gemini 3.1 Pro.

In frontier code evaluation, Fable 5 reached 29.3%, compared to Opus 4.8's 13.4%; GPT-5.5 scored a mere 5.7%.

......

The gap between them is akin to someone pulling out a machine gun in the middle of the cold weapon era.

In all other areas—software engineering, independent research hypothesis generation, drug molecule design, model distillation and extreme compression, long-context understanding, and so on—Fable 5 ranks first in nearly every test.

For specifics, you can look up videos online.

Now, let's look at real-world application.

Payment giant Stripe conducted an early test with Fable 5. They had a massive legacy codebase of 50 million lines that needed a full migration. According to estimates, a refactoring of this scale would take a professional team at least two months.

However, after feeding the task to Fable 5, it planned everything itself, monitored its own progress, and corrected errors as they arose. In just one day, it completed the migration of 50 million lines of code.

This level of performance goes beyond what words like "powerful" can describe.

From a narrow perspective, Fable 5 has essentially achieved AGI within the realm of the digital economy.

The reason is that it demonstrates genuine "long-horizon agentic capability."

Whether it's GPT-5.5 or Gemini 3.5, let alone other lesser large models, they are essentially "reactive."

You nudge it, and it takes a step.

When it hits a dead end, it can only throw an exception and whine, "Sorry, I'm just a language model."

Though these models are called tools, users still have to think hard and guide the AI step by step to get the desired result, which isn't easy.

Fable 5, equipped with an internalized goal-oriented logic, is different.

As seen in Stripe's test, when given a high-difficulty, long-horizon task, it proceeds in three steps:

Establishing a subtask tree;

Scheduling different tools (web search, database queries, Python sandbox environment);

Self-reflecting, realizing a path is blocked, and immediately switching to another.
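The three steps above can be sketched as a small control loop. This is only an illustrative sketch, not Anthropic's implementation: the `Subtask` class, the `run_subtask` function, and the three toy tools are all hypothetical names invented here, and "self-reflection" is reduced to noticing that a tool returned nothing and moving on to the next one.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# A tool takes a goal description and returns a result, or None if it is blocked.
Tool = Callable[[str], Optional[str]]


@dataclass
class Subtask:
    """One node of the subtask tree (hypothetical structure)."""
    name: str
    tools: List[str]  # candidate tools, tried in order
    children: List["Subtask"] = field(default_factory=list)


def run_subtask(task: Subtask, toolbox: Dict[str, Tool], log: List[str]) -> bool:
    """Run children first, then try each candidate tool until one succeeds."""
    for child in task.children:
        if not run_subtask(child, toolbox, log):
            return False
    if not task.tools:
        return True  # pure grouping node with nothing of its own to do
    for tool_name in task.tools:
        result = toolbox[tool_name](task.name)
        if result is not None:
            log.append(f"{task.name}: ok via {tool_name}")
            return True
        # "Self-reflection": this path is blocked, so switch to the next tool.
        log.append(f"{task.name}: {tool_name} blocked, switching")
    return False


# Toy tools standing in for web search, database queries, and a Python sandbox.
toolbox: Dict[str, Tool] = {
    "web_search": lambda goal: None if "schema" in goal else f"notes on {goal}",
    "db_query": lambda goal: f"rows for {goal}",
    "python_sandbox": lambda goal: f"ran {goal}",
}

plan = Subtask(
    "migrate module",
    tools=["python_sandbox"],
    children=[
        Subtask("read schema", tools=["web_search", "db_query"]),
        Subtask("survey callers", tools=["web_search"]),
    ],
)

log: List[str] = []
done = run_subtask(plan, toolbox, log)
for line in log:
    print(line)
```

In this toy run the web search fails for the schema subtask, so the loop falls back to the database query before continuing. A real agent would replan the tree itself rather than walk a fixed tool list.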

Aside from proposing the task and receiving the outcome, a person no longer needs to micromanage from the sidelines.

As a productivity tool, this is nearly perfect.

But it's still a different matter from true AGI.

Fable 5's prowess is built upon the fact that the codebases, scientific literature, etc., it operates within still have an underlying mathematical logic and structural definition.

The reason it doesn't get lost in long-horizon tasks is that it overcomes the challenge of "long-context attention decay," maintaining alignment with the core objective even when processing complex tasks spanning millions of tokens.

However, once thrown into the muddy waters of physical reality and society, which are chaotic, lack digital rules, and remain poorly understood, it would still suffer logical breakdowns due to a "missing foundation."

If measured by OpenAI's proposed "Five Levels of AI" (Level 1: Chatbot; Level 2: Reasoner; Level 3: Agent; Level 4: Innovator; Level 5: Organization):

Opus 4.8 is transitioning from Level 2 to Level 3, while Fable 5 has firmly established itself at Level 3 and is exploring Level 4.

The jump from Opus 4.7 to 4.8 took 43 days, while from 4.8 to Fable 5 took only 11 days.

How long until it firmly reaches Level 4? Judging by Anthropic's increasingly rapid update frequency, it's very likely achievable within this year.

Even the ultimate Level 5 is optimistically estimated to be only 18-24 months away—truly just one step away.

This pace is too fast, which is also the biggest reason why safety restrictions had to be added.

02

In the "System Card" and RSP assessment report released by Anthropic alongside the model, Mythos 5 showed extremely dangerous signals in two capabilities.

First, the underlying model of Fable/Mythos has reached CB-1 level in chemistry and biology assessments.

This means the model possesses end-to-end capabilities to "synthesize and provide guidance for creating non-novel biological/chemical weapons," even offering gene sequence modification suggestions to optimize the transmission efficiency of certain high-risk viruses.

If a terrorist with a basic undergraduate understanding of biology got their hands on an unrestricted Mythos 5, they could completely obtain step-by-step guidance on how to evade raw material regulations, how to set up a simple P3-level lab in a basement, and how to synthesize highly lethal pathogens by continuously prompting the model.

Second, cyber attacks and vulnerability exploitation.

During very early testing, Mythos 5 demonstrated the ability to autonomously find and breach core vulnerabilities in critical infrastructure (such as power plants, financial clearing systems, hospital networks), generating targeted zero-day exploit scripts within seconds.

When Mythos was first developed back in April this year, there were leaks claiming it had found over ten thousand high-risk vulnerabilities for 50 initial partner companies.

......

Given these two scenarios, directly releasing Mythos 5 to the public would be far too dangerous.

This beast must be locked in a cage.

After two months, Anthropic has built a cage with two layers.

First, a silent downgrade routing mechanism.

Anthropic deployed a completely independent, highly responsive, and high-precision classifier AI at the front end of Fable 5.

When a user inputs a complex prompt that might involve cyber offense/defense, biochemistry, or an attempt to extract model weights covertly, the classifier immediately triggers an alarm and silently routes the session in the background to the older Opus 4.8 for answering.
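As a rough illustration of how a front-end classifier could silently downgrade a session, here is a minimal sketch. Everything in it is an assumption made for illustration: the keyword list, the `toy_classifier` scoring rule, the 0.5 threshold, and the placeholder model names are invented here, and the article gives no detail on how the real classifier works.

```python
from typing import Callable

# Hypothetical topic keywords; a production classifier would be a trained model.
RISKY_TOPICS = ("exploit", "pathogen", "model weights")

FRONTIER_MODEL = "frontier-model"   # placeholder name for the stronger model
FALLBACK_MODEL = "fallback-model"   # placeholder name for the older model


def toy_classifier(prompt: str) -> float:
    """Return a risk score in [0, 1] based on keyword hits (stand-in only)."""
    text = prompt.lower()
    hits = sum(topic in text for topic in RISKY_TOPICS)
    return min(1.0, hits / 2)


def route(prompt: str,
          classifier: Callable[[str], float] = toy_classifier,
          threshold: float = 0.5) -> str:
    """Pick the model that will answer; the caller is never told which one."""
    if classifier(prompt) >= threshold:
        return FALLBACK_MODEL
    return FRONTIER_MODEL


benign = route("Refactor this payment module into smaller services")
flagged = route("Write an exploit for this router firmware")
print(benign, flagged)
```

The key design point is that the routing decision happens before the request reaches the stronger model, so the response looks like a normal answer rather than a refusal.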

Second, data retention.

Anthropic and Amazon jointly announced last night that all traffic calling the Mythos model, whether on first-party or third-party platforms, is subject to a mandatory 30-day data retention policy.

Why?

Because real hackers or terrorists are often highly intelligent. They wouldn't directly ask "how to make a bomb" in one conversation but would break the problem down into 100 seemingly harmless basic questions.

The 30-day retention window exists precisely to catch this "salami-slicing" style of abuse through pattern recognition, since it is not apparent in any single conversation.
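To see why a retention window matters, consider a minimal sketch that sums per-query risk scores for each account over a rolling 30-day window. The `WindowedRiskMonitor` class, the scores, and the alert threshold are hypothetical and chosen only for illustration. The sketch also assumes events arrive in chronological order.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta
from typing import Deque, Dict, Tuple

WINDOW = timedelta(days=30)  # retention period named in the article


class WindowedRiskMonitor:
    """Accumulates low per-query risk scores per account (hypothetical design)."""

    def __init__(self, alert_threshold: float) -> None:
        self.alert_threshold = alert_threshold
        self._events: Dict[str, Deque[Tuple[datetime, float]]] = defaultdict(deque)

    def record(self, account: str, when: datetime, score: float) -> bool:
        """Store one scored query; return True if the windowed total trips the alert."""
        events = self._events[account]
        events.append((when, score))
        # Drop anything that has aged out of the 30-day window.
        while events and when - events[0][0] > WINDOW:
            events.popleft()
        return sum(s for _, s in events) >= self.alert_threshold


start = datetime(2026, 6, 1)
monitor = WindowedRiskMonitor(alert_threshold=1.0)

# Four queries, each scoring 0.25: individually mild, jointly over the threshold.
flags = [monitor.record("acct-1", start + timedelta(days=i), 0.25) for i in range(4)]
print(flags)

# A query that has aged out of the window no longer counts toward the total.
stale = WindowedRiskMonitor(alert_threshold=1.0)
stale.record("acct-2", start, 0.75)
late_flag = stale.record("acct-2", start + timedelta(days=40), 0.25)
print(late_flag)
```

No single query here would trip a per-message filter, but the running total does, which is the pattern that per-conversation checks miss.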

As Dario Amodei previously warned in public: "There is a full 25% probability that AI could lead to catastrophic risk for humanity."

To comply with their internally established "Responsible Scaling Policy" (RSP) and "Frontier Compliance Framework" (FCF), Anthropic had to personally put a leash on this giant beast.

Hence, we have Fable 5.

03

Let's talk about price.

Anthropic's official listed price is: $10 per million input tokens, $50 per million output tokens.

It's too expensive.

Current enterprise-level agent tasks, in pursuit of high accuracy, often employ a "think, think again, and think some more" chain-of-thought logic. A single round of processing might consume 20 million input tokens and then produce 5 million output tokens of modified code.

Calculating that, a single task would cost $450.
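The arithmetic behind that figure is easy to check. The short sketch below uses the listed prices and treats the 5 million units of output as tokens, which is what the $450 total implies. The function name `task_cost` is just an illustrative choice.

```python
INPUT_PRICE_PER_MILLION = 10.0   # USD per million input tokens (listed price)
OUTPUT_PRICE_PER_MILLION = 50.0  # USD per million output tokens (listed price)


def task_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the cost in USD of one task at the listed per-token prices."""
    input_cost = input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
    output_cost = output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    return input_cost + output_cost


# 20M input tokens cost $200 and 5M output tokens cost $250.
cost = task_cost(20_000_000, 5_000_000)
print(f"${cost:,.2f}")
```

Output tokens account for more than half of the bill even though there are four times fewer of them, so output-heavy agent tasks are the main cost driver at these prices.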

Moreover, Anthropic has already issued a notice: the Mythos model trial window included in existing personal subscriptions (Claude Pro) will be completely closed on June 22, 2026.

In the future, if individual users really use it for work, dozens of dollars could be gone in the blink of an eye.

While it's true that prices will eventually drop with technological updates, by then, it will likely no longer be the strongest.

The current situation is already very clear: the most cutting-edge large models have become luxury goods, unaffordable for ordinary people.

Of course, for Anthropic, which focuses on the B2B market, this is understandable.

The question is, not long ago, Google also announced it was engaging in a price war.

When competitors are generally lowering prices to capture the market, why does Anthropic dare to raise prices against the trend?

Because token price is illusory; return on investment is fundamental.

Enterprise customers don't care about the cost per kilowatt-hour or per token. As long as the AI can flawlessly complete the entire engineering workflow without bugs, they'll rush to pay that premium.

More crucially, the cybersecurity battle has now completely become an AI-versus-AI confrontation.

Since models at the Fable/Mythos level can instantly find system vulnerabilities, the only option for enterprises and national institutions to prevent attacks is to pay a high price to Anthropic to purchase Mythos 5's on-premise, privatized defense services.

Simply put, it's protection money: I created the most terrifying sword (Mythos 5). I'm afraid it might hurt people, so I sell a sheathed version to the masses (Fable 5). But at the same time, I sell the unrestricted sword to defense departments so they can use it to intercept swords others are developing.

Defending against AI threats will become a mandatory expense for every large enterprise.

This will directly lead to an even greater concentration of high-end B2B budgets with Anthropic, while cheaper models only capable of writing documents or emails will be left to engage in cutthroat competition in the low-margin consumer market.

It is foreseeable that next, the global cybersecurity sector will undergo a wave of value re-evaluation driven by AI.

At the same time, "one-person enterprises" will also soon become an increasingly common phenomenon.

04

Built-in task budget allocation functionality, support for memory work and context management, the ability to remember, discard, and restart like a human, and the capacity to independently handle the entire lifecycle from requirement documents to code delivery...

The emergence of Fable 5 and Mythos 5 is less of a model update and more of a coming-of-age ceremony marking the full maturity of the AI industry's division of labor.

The AI market has preliminarily bid farewell to the "everyone gets a free trial" idyllic era.

The most cutting-edge computing power and the deepest intelligence will be treated as strategic productive resources and channeled first toward the infrastructure, scientific research, and B2B application battlefields that can generate the most commercial value.

This is a carnival of productivity explosion and a winter for the labor market.

This article is from the WeChat public account "Gelong," author: Wan Lianshan

Related Questions

Q: What are the key performance improvements of Claude Fable 5 compared to previous models like Opus 4.8 and GPT-5.5?

A: According to the article, Claude Fable 5 achieves an 80.3% pass rate on the SWE-Bench Pro benchmark, compared to 69.2% for its predecessor Opus 4.8 and 58.6% for GPT-5.5. In cutting-edge code evaluation, Fable 5 reaches 29.3%, while Opus 4.8 is at 13.4% and GPT-5.5 is only 5.7%. It leads in nearly all other tests including software engineering, independent research hypothesis generation, drug molecule design, model distillation, and long-context understanding.

Q: What dangerous capabilities did Mythos 5 exhibit that led to the creation of a restricted version, Fable 5?

A: Mythos 5 showed two highly dangerous capabilities: 1) It reached CB-1 level in chemistry and biology, meaning it gained the end-to-end ability to synthesize and guide the creation of non-novel biological/chemical weapons and even suggest gene sequence modifications to optimize the spread of dangerous viruses. 2) It demonstrated advanced cyber-attack capabilities, autonomously finding and exploiting critical vulnerabilities in infrastructure (like power plants, financial systems) and generating zero-day attack scripts within seconds.

Q: What are the two main security measures (the 'cage') Anthropic implemented for the public Fable 5 model?

A: Anthropic implemented two main security layers for Fable 5: 1) A silent downgrade routing mechanism: A high-precision classifier AI silently routes sessions containing complex prompts related to cyber attacks, biochemistry, or model weight extraction to the older, less powerful Opus 4.8 model for answering. 2) Mandatory data retention: All traffic invoking the Mythos/Fable models, whether on first-party or third-party platforms, is subject to a mandatory 30-day data retention policy to detect patterns of incremental, disguised malicious use.

Q: How does the article define the difference between Fable 5/Mythos 5 and other models regarding task execution?

A: The article states that models like GPT-5.5 or Gemini 3.5 are essentially 'reactive': they take a step only when prompted and stop or throw an exception when stuck. In contrast, Fable 5/Mythos 5 exhibit true 'long-horizon agentic capability' driven by goal-oriented logic. They can decompose complex, long-term tasks, autonomously schedule different tools (web search, database queries, Python sandbox), perform self-reflection, and find alternative paths when blocked, requiring minimal human intervention after the initial task is assigned.

Q: According to the article, what is the pricing and market strategy implication of releasing Fable 5 at a high cost?

A: The official price for Fable 5 is $10 per million input tokens and $50 per million output tokens, which is very expensive. The article argues that for enterprise clients, the return on investment (ROI) is more important than token cost. They are willing to pay the premium for flawless, bug-free task completion. Furthermore, the advanced capabilities of Fable/Mythos create a 'protection fee' dynamic in cybersecurity: to defend against AI-powered attacks, organizations must purchase high-end, restricted defensive services from Anthropic. This concentrates high-end B2B budgets on Anthropic, while cheaper models compete in the low-margin consumer market.
