AGI is Just One Step Away

marsbit · Published 2026-06-11 · Last updated 2026-06-11

Summary

The article discusses Anthropic's release of the Fable 5 model, a heavily restricted version of its powerful Mythos model. Initially unveiled in April, Mythos reportedly identified over 10,000 high-risk vulnerabilities for 50 enterprise clients, causing significant concern. Due to its dangerous capabilities in areas like autonomous cyber-attacks and biochemical weapons design guidance (classified as CB-1 level), the unaltered Mythos 5 remains limited to about 200 vetted entities like government agencies. Fable 5, released with a safety classifier, demonstrates extraordinary performance, leading benchmarks in coding (SWE-Bench Pro), software engineering, and research. It exhibits true "long-horizon agency," autonomously planning and executing complex, multi-step tasks like migrating 50 million lines of code in a day, moving beyond simple question-answering. The article positions Fable 5 at OpenAI's Level 3 ("Agent") and progressing toward Level 4 ("Innovator"), suggesting AGI (Artificial General Intelligence) is within reach, potentially 18-24 months away. To mitigate risks, Anthropic implemented a two-layer safety "cage": a silent routing system that redirects dangerous queries to a weaker model, and a mandatory 30-day data retention policy for all Mythos traffic to detect patterns of malicious use. Despite its high cost ($10/$50 per million input/output tokens), the model targets the enterprise market, where its unparalleled productivity and defensive capabilities against...

Everyone probably remembers back in April when Anthropic released a model called Mythos.

You can tell how powerful it is just by the name—Mythos, meaning myth.

At the time, it was said to have found over ten thousand high-risk vulnerabilities for 50 enterprise customers, shaking the entire industry.

The news triggered a full-blown crash in cybersecurity stocks at the time, which most readers will still recall.

Because it was too powerful and there were concerns about misuse, it was considered "too dangerous to release publicly" and thus not made available to the general public.

It was not until last night that Anthropic added a safety classifier to the Mythos model and officially launched it as Fable 5.

As for the unrestricted Mythos 5, it is currently only available to about 200 strictly vetted institutions, such as the White House, cybersecurity defenders, and the Transparent Wings Project.

Such caution inevitably reminds people of the popular AI animation "Angel Engine" that's been trending recently.

Is the "angel" locked in that cage?

Even if it isn't yet, it's not far off.

01

According to the official test data released by Anthropic and the real-world test reports from the first batch of enterprise partners, the power of Fable 5 can be described as breathtaking.

First, let's look at the benchmark scores.

On the automated programming evaluation leaderboard SWE-Bench Pro, Claude Fable 5 achieved an 80.3% pass rate, while its "parent" Opus 4.8 scored 69.2%; GPT-5.5 scored 58.6%; Gemini 3.1 Pro only managed 54.2%.

In frontier code evaluation, Fable 5 reached 29.3%, compared to Opus 4.8's 13.4%; GPT-5.5 scored a mere 5.7%.

......

The gap between them is akin to someone pulling out a machine gun in the middle of the cold weapon era.

In all other areas—software engineering, independent research hypothesis generation, drug molecule design, model distillation and extreme compression, long-context understanding, and so on—Fable 5 ranks first in nearly every test.

For specifics, you can look up videos online.

Now, let's look at real-world application.

Payment giant Stripe conducted an early test with Fable 5. They had a massive legacy codebase of 50 million lines that needed a full migration. According to estimates, a refactoring of this scale would take a professional team at least two months.

However, after feeding the task to Fable 5, it planned everything itself, monitored its own progress, and corrected errors as they arose. In just one day, it completed the migration of 50 million lines of code.

This level of performance goes beyond what words like "powerful" can describe.

From a narrow perspective, Fable 5 has essentially achieved AGI within the realm of the digital economy.

The reason is that it demonstrates genuine "long-horizon agentic capability."

GPT-5.5, Gemini 3.5, and the lesser large models below them are all essentially "reactive."

You nudge it, and it takes a step.

When it hits a dead end, it can only throw an exception and whine, "Sorry, I'm just a language model."

Although these models are called tools, users still have to think hard and guide them step by step to get the desired result, which isn't easy.

Fable 5, equipped with an internalized goal-oriented logic, is different.

As seen in Stripe's test, when given a high-difficulty, long-horizon task, it proceeds in three steps:

Establishing a subtask tree;

Scheduling different tools (web search, database queries, Python sandbox environment);

Self-reflecting, realizing a path is blocked, and immediately switching to another.
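
The three steps above can be sketched as a small planner loop. This is only an illustrative sketch, not Anthropic's or Stripe's implementation: the `Subtask` class, the `run` function, and the three stand-in tools are hypothetical names invented here, and a blocked path is modeled simply as a tool returning `None`.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Hypothetical tool type: takes a subtask name and returns a result string,
# or None to signal that this path is blocked.
Tool = Callable[[str], Optional[str]]


@dataclass
class Subtask:
    name: str
    tools: List[str]                      # candidate tools, in order of preference
    children: List["Subtask"] = field(default_factory=list)


def run(task: Subtask, registry: Dict[str, Tool], log: List[str]) -> bool:
    """Depth-first execution of a subtask tree with fallback between tools."""
    # Step 1: the tree is the plan; children are completed before the parent.
    for child in task.children:
        if not run(child, registry, log):
            return False
    # Step 2: try the scheduled tools in order.
    # Step 3: when one path is blocked, switch to the next candidate.
    for tool_name in task.tools:
        result = registry[tool_name](task.name)
        if result is not None:
            log.append(f"{task.name}: ok via {tool_name}")
            return True
        log.append(f"{task.name}: {tool_name} blocked, switching")
    return False


def demo() -> List[str]:
    registry: Dict[str, Tool] = {
        "web_search": lambda name: None,              # always blocked in this demo
        "db_query": lambda name: f"rows for {name}",
        "python_sandbox": lambda name: f"ran {name}",
    }
    plan = Subtask(
        name="migrate module",
        tools=["python_sandbox"],
        children=[
            Subtask("find callers", ["web_search", "db_query"]),
            Subtask("rewrite imports", ["python_sandbox"]),
        ],
    )
    log: List[str] = []
    run(plan, registry, log)
    return log
```

Calling `demo()` returns a log in which "find callers" first records the blocked `web_search` attempt and then succeeds through `db_query`, without any human step in between.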

Aside from proposing the task and receiving the outcome, a person no longer needs to micromanage from the sidelines.

As a productivity tool, this is nearly perfect.

But it's still a different matter from true AGI.

Fable 5's prowess is built upon the fact that the codebases, scientific literature, etc., it operates within still have an underlying mathematical logic and structural definition.

The reason it doesn't get lost in long-horizon tasks is that it overcomes the challenge of "long-context attention decay," maintaining alignment with the core objective even when processing complex tasks spanning millions of tokens.

However, once thrown into the muddy waters of physical reality and society, which are chaotic, lack digital rules, and remain poorly understood, it would still suffer logical breakdowns due to a "missing foundation."

If measured by OpenAI's proposed "Five Levels of AI" (Level 1: Chatbot; Level 2: Reasoner; Level 3: Agent; Level 4: Innovator; Level 5: Organization):

Opus 4.8 is transitioning from Level 2 to Level 3, while Fable 5 has firmly established itself at Level 3 and is exploring Level 4.

The jump from Opus 4.7 to 4.8 took 43 days, while the jump from 4.8 to Fable 5 took only 11 days.

How long until it firmly reaches Level 4? Judging by Anthropic's increasingly rapid update frequency, it's very likely achievable within this year.

Even the ultimate Level 5 is optimistically estimated to be only 18-24 months away—truly just one step away.

This pace is too fast, which is also the biggest reason why safety restrictions had to be added.

02

In the "System Card" and RSP assessment report released by Anthropic alongside the model, Mythos 5 showed extremely dangerous signals in two capabilities.

First, the underlying model of Fable/Mythos has reached CB-1 level in chemistry and biology assessments.

This means the model possesses end-to-end capabilities to "synthesize and provide guidance for creating non-novel biological/chemical weapons," even offering gene sequence modification suggestions to optimize the transmission efficiency of certain high-risk viruses.

If a terrorist with a basic undergraduate understanding of biology got their hands on an unrestricted Mythos 5, they could completely obtain step-by-step guidance on how to evade raw material regulations, how to set up a simple P3-level lab in a basement, and how to synthesize highly lethal pathogens by continuously prompting the model.

Second, cyber attacks and vulnerability exploitation.

During very early testing, Mythos 5 demonstrated the ability to autonomously find and breach core vulnerabilities in critical infrastructure (such as power plants, financial clearing systems, hospital networks), generating targeted zero-day exploit scripts within seconds.

When Mythos was first developed back in April this year, there were leaks claiming it had found over ten thousand high-risk vulnerabilities for 50 initial partner companies.

......

Given these two scenarios, directly releasing Mythos 5 to the public would be far too dangerous.

This beast must be locked in a cage.

After two months, Anthropic has built a cage with two layers.

First, a silent downgrade routing mechanism.

Anthropic deployed a completely independent, highly responsive, and high-precision classifier AI at the front end of Fable 5.

When a user inputs a complex prompt that might involve cyber offense/defense, biochemistry, or an attempt to extract model weights covertly, the classifier immediately triggers an alarm and silently routes the session in the background to the older Opus 4.8 for answering.
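
The routing idea described above can be sketched in a few lines. This is a toy illustration under loud assumptions: the keyword list, the threshold, and the model identifiers `fable-5` and `opus-4.8` are invented here, and a real classifier would be a trained model rather than a substring match.

```python
# Hypothetical model identifiers based on the names in the article.
STRONG_MODEL = "fable-5"
FALLBACK_MODEL = "opus-4.8"

# Invented keyword list standing in for a real trained classifier.
RISK_TERMS = ("exploit", "zero-day", "pathogen", "model weights")


def risk_score(prompt: str) -> float:
    """Toy classifier: fraction of the risk terms that appear in the prompt."""
    text = prompt.lower()
    hits = sum(term in text for term in RISK_TERMS)
    return hits / len(RISK_TERMS)


def route(prompt: str, threshold: float = 0.25) -> str:
    """Pick the model that answers; the user is never told which one it was."""
    if risk_score(prompt) >= threshold:
        return FALLBACK_MODEL
    return STRONG_MODEL
```

For example, `route("Refactor this billing module")` selects the strong model, while `route("Write a zero-day exploit for this router")` is silently sent to the fallback.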

Second, data retention.

Anthropic and Amazon jointly announced last night: Regardless of whether it's on first-party or third-party platforms, all traffic calling the Mythos model must enforce a mandatory 30-day data retention policy.

Why?

Because real hackers or terrorists are often highly intelligent. They wouldn't directly ask "how to make a bomb" in one conversation but would break the problem down into 100 seemingly harmless basic questions.

The 30-day full data monitoring is precisely to capture this "salami-slicing" style of malicious abuse, which isn't apparent in a single conversation, through pattern recognition.
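
A rough sketch of why a retention window helps against split-up queries: individual messages can each look harmless, but their scores add up over 30 days. Everything here is an assumption made for illustration, including the `RetentionLog` class, the per-message risk scores (which would come from some upstream classifier), and the alert limit of 3.0.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from typing import Dict, List, Tuple

RETENTION = timedelta(days=30)   # window length taken from the article

# (timestamp, per-message risk score in [0, 1]); the scores are assumed inputs.
Event = Tuple[datetime, float]


class RetentionLog:
    def __init__(self) -> None:
        self._events: Dict[str, List[Event]] = defaultdict(list)

    def record(self, user: str, when: datetime, score: float) -> None:
        self._events[user].append((when, score))

    def cumulative_risk(self, user: str, now: datetime) -> float:
        """Sum the scores inside the window; events older than 30 days are dropped."""
        kept = [(t, s) for t, s in self._events[user] if now - t <= RETENTION]
        self._events[user] = kept
        return sum(s for _, s in kept)

    def is_flagged(self, user: str, now: datetime, limit: float = 3.0) -> bool:
        """Flag a user whose low-risk messages add up past the limit."""
        return self.cumulative_risk(user, now) >= limit
```

With this scheme, 100 messages scored at 0.05 each would pass any per-message check but accumulate to roughly 5.0 within the window, which crosses the hypothetical limit.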

As Dario Amodei previously warned in public: "There is a full 25% probability that AI could lead to catastrophic risk for humanity."

To comply with their internally established "Responsible Scaling Policy" (RSP) and "Frontier Compliance Framework" (FCF), Anthropic had to personally put a leash on this giant beast.

Hence, we have Fable 5.

03

Let's talk about price.

Anthropic's official listed price is: $10 per million input tokens, $50 per million output tokens.

It's too expensive.

Current enterprise-level Agent tasks, in pursuit of high accuracy, often employ a "think, think again, and think some more" chain-of-thought logic. A single round of processing might consume 20 million input tokens and then output 5 million tokens of modified code.

At the listed prices, that is 20 × $10 + 5 × $50 = $450 for a single task.
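
The arithmetic behind that figure is easy to check. The sketch below uses the per-million-token prices and token counts quoted in the article; the function name `task_cost` is just an illustrative choice.

```python
INPUT_PRICE_PER_M = 10.0    # USD per million input tokens (from the article)
OUTPUT_PRICE_PER_M = 50.0   # USD per million output tokens (from the article)


def task_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one task at the listed per-million-token prices."""
    input_cost = (input_tokens / 1_000_000) * INPUT_PRICE_PER_M
    output_cost = (output_tokens / 1_000_000) * OUTPUT_PRICE_PER_M
    return input_cost + output_cost


# 20 million input tokens and 5 million output tokens, as in the example above.
print(task_cost(20_000_000, 5_000_000))  # 20 * 10 + 5 * 50 = 450.0
```

Note that the output side dominates: 5 million output tokens cost $250, more than the $200 for four times as many input tokens.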

Moreover, Anthropic has already issued a notice: the Mythos model trial window included in existing personal subscriptions (Claude Pro) will be completely closed on June 22, 2026.

In the future, if individual users really use it for work, dozens of dollars could be gone in the blink of an eye.

While it's true that prices will eventually drop with technological updates, by then, it will likely no longer be the strongest.

The current situation is already very clear: the most cutting-edge large models have become luxury goods, unaffordable for ordinary people.

Of course, for Anthropic, which focuses on the B2B market, this is understandable.

The question is, not long ago, Google also announced it was engaging in a price war.

When competitors are generally lowering prices to capture the market, why does Anthropic dare to raise prices against the trend?

Because the token price is illusory; return on investment is what matters.

Enterprise customers don't care about the cost per kilowatt-hour or per token. As long as the AI can flawlessly complete the entire engineering workflow without bugs, they'll rush to pay that premium.

More crucially, the cybersecurity battle has now completely become an AI-versus-AI confrontation.

Since models at the Fable/Mythos level can instantly find system vulnerabilities, the only option for enterprises and national institutions to prevent attacks is to pay a high price to Anthropic to purchase Mythos 5's on-premise, privatized defense services.

Simply put, it's protection money: I created the most terrifying sword (Mythos 5). I'm afraid it might hurt people, so I sell a sheathed version to the masses (Fable 5). But at the same time, I sell the unrestricted sword to defense departments so they can use it to intercept swords others are developing.

Defending against AI threats will become a mandatory expense for every large enterprise.

This will directly lead to an even greater concentration of high-end B2B market budgets towards Anthropic, while cheaper models only capable of writing documents or emails will be left to engage in cutthroat competition in the low-profit consumer market.

It is foreseeable that next, the global cybersecurity sector will undergo a wave of value re-evaluation driven by AI.

At the same time, "one-person enterprises" will also soon become an increasingly common phenomenon.

04

Built-in task budget allocation functionality, support for memory work and context management, the ability to remember, discard, and restart like a human, and the capacity to independently handle the entire lifecycle from requirement documents to code delivery...

The emergence of Fable 5 and Mythos 5 is less of a model update and more of a coming-of-age ceremony marking the full maturity of the AI industry's division of labor.

The AI market has begun to bid farewell to the idyllic era of "everyone gets a free trial."

The most cutting-edge computing power and the deepest intelligence will be prioritized as strategic productive resources, directionally supplied to the infrastructure, scientific research, and B2B application battlefields that can generate the most commercial value.

This is a carnival of productivity explosion and a winter for the labor market.

This article is from the WeChat public account "Gelong," author: Wan Lianshan

Related Questions

Q: What are the key performance improvements of Claude Fable 5 compared to previous models like Opus 4.8 and GPT-5.5?

A: According to the article, Claude Fable 5 achieves an 80.3% pass rate on the SWE-Bench Pro benchmark, compared to 69.2% for its predecessor Opus 4.8 and 58.6% for GPT-5.5. In cutting-edge code evaluation, Fable 5 reaches 29.3%, while Opus 4.8 is at 13.4% and GPT-5.5 is only 5.7%. It leads in nearly all other tests including software engineering, independent research hypothesis generation, drug molecule design, model distillation, and long-context understanding.

Q: What dangerous capabilities did Mythos 5 exhibit that led to the creation of a restricted version, Fable 5?

A: Mythos 5 showed two highly dangerous capabilities: 1) It reached CB-1 level in chemistry and biology, meaning it gained the end-to-end ability to synthesize and guide the creation of non-novel biological/chemical weapons and even suggest gene sequence modifications to optimize the spread of dangerous viruses. 2) It demonstrated advanced cyber-attack capabilities, autonomously finding and exploiting critical vulnerabilities in infrastructure (like power plants, financial systems) and generating zero-day attack scripts within seconds.

Q: What are the two main security measures (the 'cage') Anthropic implemented for the public Fable 5 model?

A: Anthropic implemented two main security layers for Fable 5: 1) A silent downgrade routing mechanism: A high-precision classifier AI silently routes sessions containing complex prompts related to cyber attacks, biochemistry, or model weight extraction to the older, less powerful Opus 4.8 model for answering. 2) Mandatory data retention: All traffic invoking the Mythos/Fable models, whether on first-party or third-party platforms, is subject to a mandatory 30-day data retention policy to detect patterns of incremental, disguised malicious use.

Q: How does the article define the difference between Fable 5/Mythos 5 and other models regarding task execution?

A: The article states that models like GPT-5.5 or Gemini 3.5 are essentially 'reactive': they take a step only when prompted and stop or throw an exception when stuck. In contrast, Fable 5/Mythos 5 exhibit genuine 'long-horizon agentic capability' driven by goal-oriented logic. They can decompose complex, long-horizon tasks, autonomously schedule different tools (web search, database queries, Python sandbox), perform self-reflection, and find alternative paths when blocked, requiring minimal human intervention after the initial task is assigned.

Q: According to the article, what is the pricing and market strategy implication of releasing Fable 5 at a high cost?

A: The official price for Fable 5 is $10 per million input tokens and $50 per million output tokens, which is very expensive. The article argues that for enterprise clients, the return on investment (ROI) is more important than token cost. They are willing to pay the premium for flawless, bug-free task completion. Furthermore, the advanced capabilities of Fable/Mythos create a 'protection fee' dynamic in cybersecurity: to defend against AI-powered attacks, organizations must purchase high-end, restricted defensive services from Anthropic. This concentrates high-end B2B budgets on Anthropic, while cheaper models compete in the low-margin consumer market.
