Breaking News! Anthropic Calls for a Universal Pause in AI Research

marsbit · Published 2026-06-05 · Updated 2026-06-05

Introduction

Anthropic warns of AI self-evolution, reporting that over 80% of its internal code is now written by its AI, Claude. Productivity has surged, with engineers merging 8x more code than in 2024. Claude's performance on complex, open-ended tasks jumped from 26% to 76% success in six months, nearing human parity. The company introduces a new metric: "AI task duration." In 2024, AI handled 4-minute tasks; by 2026, it manages 16-hour tasks, with capability doubling every 4 months. Claude also reviews code, catching bugs that previously caused outages, and significantly outperforms humans in research tasks like optimizing code (52x speedup) and conducting AI safety experiments. Anthropic outlines three potential futures: 1) Progress plateaus, 2) AI accelerates but humans remain in control, or 3) AI achieves full recursive self-improvement (RSI), designing its own successors. This final path could revolutionize fields like medicine but also risks catastrophic alignment failure if control is lost. The call echoes similar concerns from OpenAI. Anthropic proposes a coordinated pause on AI development—if a verifiable mechanism to ensure all labs comply can be established.

By Jay, published by QbitAI

Major Discovery: The Self-Evolution of AI Has Begun.

This is the provocative thesis Anthropic just laid out in a lengthy blog post.

Our internal data suggests Claude is accelerating AI development, potentially on a path of Recursive Self-Improvement (RSI).

This is not mere “scaremongering.” A look at the article shows Anthropic is speaking with hard data—

As of May this year, over 80% of Anthropic's code was written by Claude.

Before Claude Code was released, that figure was only in the single digits.

Simultaneously, the average amount of code delivered by Anthropic engineers per quarter is now 8 times that of the 2021-2025 period.

Even more important is quality—

On the most open-ended, ambiguous programming tasks where even the form of the answer is uncertain, Claude's success rate is now 76%, up from just 26% six months ago.

A 50-percentage-point leap. In half a year.

Many engineers within Anthropic already feel the quality of Claude's code is on par with humans.

It is expected to surpass humans within the year.

Anthropic also emphasizes that if this trend continues, it is entirely possible for AI to design and build the next generation of AI.

This could utterly transform society, bringing immense benefits in healthcare, technology, and the economy. But it could also compound alignment issues, ultimately leading to a loss of control.

Therefore, Anthropic is leading the call:

If there exists a verifiable mechanism that ensures AI labs are indeed not covertly racing ahead, we are willing to slow down, even pause.

Beyond this, Anthropic's blog post contains many other interesting perspectives and facts.

Below is a version organized for easier reading.

Enjoy.

Anthropic's Long-Form Thesis

AI's Moore's Law Has Arrived

Anthropic created a new metric called “Duration of tasks an AI can complete autonomously.”

In March 2024, Claude Opus 3 could handle software tasks that would take a human roughly 4 minutes.

One year later, Claude Sonnet 3.7: 1.5 hours.

Another year, Claude Opus 4.6: 12 hours.

And the latest, Mythos, in internal testing shows:

It can work continuously for “at least” 16 hours, already hitting the upper limit measurable by the METR testing framework.

The doubling time has shortened from 7 months to 4 months.

If the trend holds, the task duration Claude can handle could reach several weeks by 2027.
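The doubling claim can be checked with a little arithmetic. The sketch below is a rough check, not Anthropic's or METR's method: it assumes the quoted data points sit exactly 12 months apart, and the helper names doubling_time_months and extrapolate_hours are made up for illustration.

```python
import math


def doubling_time_months(start_hours: float, end_hours: float, months: float) -> float:
    """Months per doubling implied by growth from start_hours to end_hours."""
    doublings = math.log2(end_hours / start_hours)
    return months / doublings


def extrapolate_hours(current_hours: float, months_ahead: float, doubling_months: float) -> float:
    """Task duration after months_ahead, assuming a constant doubling time."""
    return current_hours * 2 ** (months_ahead / doubling_months)


# Sonnet 3.7 (1.5 h) to Opus 4.6 (12 h); the 12-month spacing is an assumption.
recent = doubling_time_months(1.5, 12.0, 12.0)

# Mythos (16 h) projected 12 months ahead at a 4-month doubling time.
projected = extrapolate_hours(16.0, 12.0, 4.0)

print(f"implied doubling time: {recent:.1f} months")
print(f"projected duration: {projected:.0f} hours "
      f"(~{projected / 40:.1f} 40-hour work weeks)")
```

Under these assumptions the last two data points imply a 4-month doubling time, and one more year of doublings gives 128 hours, a little over three 40-hour work weeks, which is in line with "several weeks."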

Claude Writes Most of Anthropic's Code

As of May 2026, over 80% of the code in Anthropic's codebase is written by Claude.

Before Claude Code's release, this number had consistently been in the single digits.

This shift is also reflected in engineers' workflows.

In Anthropic's first four years, the lines of code merged per engineer per day remained largely constant.

In 2025, when Claude began writing its own code, the merge count suddenly skyrocketed.

Now, in Q2 2026, engineers are merging 8 times more code per day than in 2024.

But with more code, is the quality diluted?

Anthropic says that over the past year, engineers have needed to correct Claude less and less.

This is evident in benchmarks, as shown in the chart below.

Across all difficulty levels of tasks, Claude's success rate has been soaring without exception.

So, Anthropic now uses Claude to review code.

Yes, all changes submitted to the codebase first go through an automated Claude review, checking for bugs, security vulnerabilities, and other defects.

Their retrospective analysis found that if this automated review had been in place for every past change, about one-third of the bugs that caused incidents on claude.ai would have been caught before deployment.

Remember, the engineers writing that code are among the world's top experts in building AI systems.

Claude is catching their mistakes.

Creativity Amplifier

Next is Claude's involvement at the research level.

Anthropic has a routine: each time a new model is released, they give Claude a piece of code for training a small AI model and ask it to optimize the runtime speed to the maximum while ensuring correctness.

In May 2025, Claude Opus 4 delivered: a 3x speedup.

In April 2026, Claude Mythos Preview achieved 52x.

For reference, a skilled human researcher would need 4 to 8 hours to barely reach 4x.

In less than a year, Claude surpassed humans.

In April 2026, Anthropic gave Claude an AI safety research question, essentially “Can a weak model reliably supervise a strong model?”, and let Claude propose hypotheses, run experiments...

First, the human performance: two human researchers spent about a week narrowing the gap by 23%.

Claude, after about 800 hours and roughly $18,000 worth of compute—

Narrowed it by 97%.
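The article does not say how "narrowing the gap" was scored. One common way to express such a result is the fraction of the gap between a weak supervisor and a strong model's ceiling that a method closes. The sketch below assumes that definition; the function name gap_recovered and every accuracy value in it are hypothetical, chosen only so the outputs match the 23% and 97% quoted above.

```python
def gap_recovered(weak: float, strong_ceiling: float, achieved: float) -> float:
    """Fraction of the weak-to-strong performance gap that was closed."""
    gap = strong_ceiling - weak
    if gap <= 0:
        raise ValueError("strong_ceiling must exceed the weak baseline")
    return (achieved - weak) / gap


# Hypothetical accuracies for illustration only (not from the article).
weak_baseline = 0.60    # weak supervisor on its own
strong_ceiling = 0.80   # strong model trained with ground-truth labels

human_result = gap_recovered(weak_baseline, strong_ceiling, 0.646)
claude_result = gap_recovered(weak_baseline, strong_ceiling, 0.794)

print(f"human researchers: {human_result:.0%} of the gap closed")
print(f"Claude:            {claude_result:.0%} of the gap closed")
```

Read this way, 97% would mean the weakly supervised strong model ends up almost at its ceiling, while 23% leaves most of the gap open.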

Where Do We Go From Here?

By now, the conclusion is clear.

The human role in the AI development pipeline is narrowing at every stage.

Coding: Claude does it. Code review: Claude does it. Experiment execution: Claude is an order of magnitude faster than humans. Experiment design: Claude is starting to do it on its own...

The last comparative advantage humans have now is research taste and judgment.

But how long can this advantage hold?

Anthropic says in the blog they are unsure.

One possibility is that “research taste,” like other things AI couldn't do before, starts as impossible, then suddenly becomes possible.

Understanding humor, demonstrating theory of mind, and solving linguistic puzzles all followed similar curves.

Another possibility is that even if Claude never truly learns research taste, the current acceleration trend means each human researcher can now orchestrate several times more work simultaneously.

You don't need AI to think completely for you; it just needs to handle all the “execution” work, leaving you to make the 5% of directional choices.

Three Possible Futures for RSI

At the end of the blog, Anthropic outlines three possible evolutionary directions for this “self-evolution” trend.

1. Plateau.

Those exponential curves are actually S-curves.

Perhaps research judgment is something that simply cannot be solved by scaling and requires a completely new architectural breakthrough.

Or the bottleneck may lie in energy, chips, and the physical supply chain for compute.

Even if AI capabilities plateau at today's level, it will still bring significant changes to the world.

The recent Project Glasswing saw Mythos Preview discover over ten thousand high and critical severity software vulnerabilities in its first few weeks, spanning the world's most critical systems.

2. AI continues to accelerate, but humans keep their hands on the wheel.

Organizational efficiency will improve exponentially, with 100-person companies doing the work of 10,000 or even 100,000.

Anthropic believes we are most likely heading into this scenario.

But they also observed an interesting phenomenon: the embodiment of Amdahl's Law within organizations—

Claude writes code much faster, making code review the new bottleneck. New ideas, tools, and experiments explode far beyond the organization's capacity to absorb them.

Bottlenecks don't disappear; they just shift to the next stage.
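Amdahl's Law makes this bottleneck shift concrete: if only a fraction p of a workflow is sped up by a factor s, the overall speedup is 1 / ((1 - p) + p / s). The sketch below applies that formula. The 70/30 split between writing and reviewing code is an illustrative assumption, not a figure from Anthropic.

```python
def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
    """Overall speedup when only part of a workflow is accelerated."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    unaccelerated = 1.0 - accelerated_fraction
    return 1.0 / (unaccelerated + accelerated_fraction / factor)


# Assumption for illustration: 70% of the effort is writing code (accelerated),
# 30% is human review (not accelerated).
WRITING_SHARE = 0.7

for factor in (2, 8, 100):
    overall = amdahl_speedup(WRITING_SHARE, factor)
    print(f"writing {factor}x faster -> overall {overall:.2f}x")
```

However fast the writing becomes, the overall speedup in this example stays below 1 / 0.3, about 3.3x, until review is accelerated as well.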

3. AI achieves full recursive self-improvement, beginning to build the next generation of itself.

In this scenario, the speed of AI development depends entirely on compute. Humans retreat to supervisory, verification, and auditing roles.

If this happens, this capability will likely transfer to other scientific fields—medicine, materials, energy—all taking off.

Of course, another future is alignment failure.

In this case, misalignment could accumulate step by step during AI's self-iteration, ultimately leading to—complete loss of control.

One More Thing

The above covers the most critical points of Anthropic's thesis on self-evolution.

Honestly, at first, I didn't take it too seriously. After all, Anthropic is about to IPO. Isn't this a classic “Anthropic-style” PR move?

You know what? This time, it might genuinely be different.

Because just a few days ago, OpenAI published a similar blog post:

We too see early signs of self-evolution in today's systems: AI development itself is being accelerated by AI. We expect this to intensify competitive pressures among developers and nations, and create governance challenges existing institutions cannot handle. With the emergence of RSI, society needs ways to shape AI's developmental trajectory to ensure it serves human interests.

The singularity seems to be arriving faster than anyone anticipated.

Blog: https://www.anthropic.com/institute/recursive-self-improvement

References:
[1] https://x.com/kimmonismus/status/2062517474277675102
[2] https://x.com/anthropicai/status/2062568873321513443

This article is from the WeChat public account “QbitAI”, author: Focus on Frontier Technology

Related Questions

Q: According to the article, what percentage of Anthropic's code is now written by Claude, and what was the figure before Claude Code was released?

A: Over 80% of Anthropic's code is now written by Claude (as of May 2026). Before Claude Code was released, the figure was in the single digits.

Q: What is the new metric Anthropic created to measure AI progress, and how has their latest model, Mythos, performed on it?

A: Anthropic created a new metric called “Duration of tasks an AI can complete autonomously.” In internal testing, Mythos can work continuously for “at least” 16 hours, which is the upper limit of what the METR testing framework can measure.

Q: What was the key finding in the AI safety research experiment where Claude was tasked with exploring whether a weak model could reliably supervise a strong model?

A: Two human researchers spent about a week narrowing the performance gap by 23%. Claude, after about 800 hours and roughly $18,000 worth of compute, narrowed it by 97%.

Q: The article mentions three possible future scenarios for Recursive Self-Improvement (RSI). What are they?

A: The three scenarios are: 1) Plateau, where the exponential curves turn out to be S-curves and progress levels off. 2) AI continues to accelerate, but humans keep their hands on the wheel. 3) AI achieves full recursive self-improvement and begins to build the next generation of itself.

Q: Besides Anthropic, which other major AI company recently published a blog post with similar concerns about AI self-evolution, and what was the core of its message?

A: OpenAI recently published a similar blog post. Its core message was that early signs of self-evolution are visible in today's systems, with AI development being accelerated by AI itself. OpenAI expects this to intensify competitive pressures and create governance challenges, and says society needs ways to shape AI's trajectory so that it serves human interests.
