Breaking News! Anthropic Calls for a Universal Pause in AI Research

marsbit | Published on 2026-06-05 | Last updated on 2026-06-05

Abstract

Anthropic warns of AI self-evolution, reporting that over 80% of its internal code is now written by its AI, Claude. Productivity has surged, with engineers merging 8x more code than in 2024. Claude's performance on complex, open-ended tasks jumped from 26% to 76% success in six months, nearing human parity. The company introduces a new metric: "AI task duration." In 2024, AI handled 4-minute tasks; by 2026, it manages 16-hour tasks, with capability doubling every 4 months. Claude also reviews code, catching bugs that previously caused outages, and significantly outperforms humans in research tasks like optimizing code (52x speedup) and conducting AI safety experiments. Anthropic outlines three potential futures: 1) Progress plateaus, 2) AI accelerates but humans remain in control, or 3) AI achieves full recursive self-improvement (RSI), designing its own successors. This final path could revolutionize fields like medicine but also risks catastrophic alignment failure if control is lost. The call echoes similar concerns from OpenAI. Anthropic proposes a coordinated pause on AI development—if a verifiable mechanism to ensure all labs comply can be established.

By Jay, published by QbitAI

Major Discovery: The Self-Evolution of AI Has Begun.

This is the provocative thesis Anthropic just laid out in a lengthy blog post.

“Our internal data suggests Claude is accelerating AI development, potentially on a path of Recursive Self-Improvement (RSI).”

This is not mere scaremongering. The post backs its claims with hard data:

As of May this year, over 80% of Anthropic's code was written by Claude.

Before Claude Code was released, that figure was only in the single digits.

At the same time, Anthropic engineers are now merging 8 times as much code as they did in 2024.

Even more important is quality—

On the most open-ended, ambiguous programming tasks where even the form of the answer is uncertain, Claude's success rate is now 76%, up from just 26% six months ago.

A 50-percentage-point leap. In half a year.

Many engineers within Anthropic already feel the quality of Claude's code is on par with humans.

It is expected to surpass humans within the year.

Anthropic also emphasizes that if this trend continues, it is entirely possible for AI to design and build the next generation of AI.

This could utterly transform society, bringing immense benefits in healthcare, technology, and the economy. But it could also compound alignment issues, ultimately leading to a loss of control.

Therefore, Anthropic is leading the call:

“If there exists a verifiable mechanism that ensures AI labs are indeed not covertly racing ahead, we are willing to slow down, even pause.”

Beyond this, Anthropic's blog post contains many other interesting perspectives and facts.

Below is a version organized for easier reading.

Enjoy.

Anthropic's Long-Form Thesis

AI's Moore's Law Has Arrived

Anthropic created a new metric called “Duration of tasks an AI can complete autonomously.”

In March 2024, Claude Opus 3 could handle software tasks that would take a human roughly 4 minutes.

One year later, Claude Sonnet 3.7: 1.5 hours.

Another year, Claude Opus 4.6: 12 hours.

And the latest, Mythos, in internal testing shows:

It can work continuously for “at least” 16 hours, already hitting the upper limit measurable by the METR testing framework.

The doubling time has shortened from roughly 7 months to roughly 4 months.

If the trend holds, the task duration Claude can handle could reach several weeks by 2027.
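
As a rough illustration of that extrapolation, the sketch below projects the task duration forward under a constant doubling time. The 16-hour starting point and the 4-month doubling time are the figures quoted above; the function name project_task_hours, the 12-month window, and the 40-hour work week used for conversion are assumptions made for illustration, not part of Anthropic's methodology.

```python
def project_task_hours(start_hours: float, months_ahead: float,
                       doubling_months: float) -> float:
    """Project an autonomous-task duration forward, assuming a constant doubling time."""
    return start_hours * 2 ** (months_ahead / doubling_months)


# Figures quoted in the article: 16 hours today, doubling every 4 months.
# The 12-month window and the 40-hour work week are illustrative assumptions.
hours_in_a_year = project_task_hours(16, 12, 4)
work_weeks = hours_in_a_year / 40

print(f"{hours_in_a_year:.0f} hours, about {work_weeks:.1f} work weeks")
```

Under these assumptions, three doublings turn 16 hours into 128 hours, a little over three 40-hour work weeks, which is in line with the "several weeks" estimate.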

Claude Writes Most of Anthropic's Code

As of May 2026, over 80% of the code in Anthropic's codebase is written by Claude.

Before Claude Code's release, this number had consistently been in the single digits.

This shift is also reflected in engineers' workflows.

In Anthropic's first four years, the lines of code merged per engineer per day remained largely constant.

In 2025, when Claude began writing its own code, the merge count suddenly skyrocketed.

Now, in Q2 2026, engineers are merging 8 times more code per day than in 2024.

But with more code, is the quality diluted?

Anthropic says that over the past year, engineers have needed to correct Claude less and less.

This is evident in benchmarks, as shown in the chart below.

Across all difficulty levels of tasks, Claude's success rate has been soaring without exception.

So, Anthropic now uses Claude to review code.

Yes, all changes submitted to the codebase first go through an automated Claude review, checking for bugs, security vulnerabilities, and other defects.

Their retrospective analysis found that if this automated review had been in place for every past change, about one-third of the bugs that caused incidents on claude.ai would have been caught before deployment.

Remember, the engineers writing that code are among the world's top experts in building AI systems.

Claude is catching their mistakes.

Creativity Amplifier

Next is Claude's involvement at the research level.

Anthropic has a routine: each time a new model is released, they give Claude a piece of code for training a small AI model and ask it to optimize the runtime speed to the maximum while ensuring correctness.

In May 2025, Claude Opus 4 delivered: a 3x speedup.

In April 2026, Claude Mythos Preview achieved 52x.

For reference, a skilled human researcher would need 4 to 8 hours to barely reach 4x.

In less than a year, Claude surpassed humans.

In April 2026, Anthropic gave Claude an AI safety research question, essentially “Can a weak model reliably supervise a strong model?”, and let Claude propose hypotheses, run experiments...

First, the human baseline: two human researchers spent about a week and narrowed the performance gap by 23%.

Claude, after about 800 hours and roughly $18,000 worth of compute—

Narrowed it by 97%.
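
The article does not say how the gap is measured. One common convention in weak-to-strong supervision work is the fraction of the gap recovered between the weak supervisor's score and the strong model's ceiling. The sketch below assumes that convention; the function name gap_recovered and all accuracy values are hypothetical, chosen only so the results line up with the percentages quoted above.

```python
def gap_recovered(weak: float, strong_ceiling: float, achieved: float) -> float:
    """Fraction of the weak-to-strong performance gap that was closed."""
    gap = strong_ceiling - weak
    if gap <= 0:
        raise ValueError("strong_ceiling must exceed weak")
    return (achieved - weak) / gap


# Hypothetical accuracies, not taken from Anthropic's experiment.
weak_acc, ceiling_acc = 0.60, 0.90
print(f"humans: {gap_recovered(weak_acc, ceiling_acc, 0.669):.0%}")
print(f"Claude: {gap_recovered(weak_acc, ceiling_acc, 0.891):.0%}")
```

Read this way, 23% means the supervised strong model moved about a quarter of the way from the weak supervisor's level toward its own ceiling, and 97% means it nearly reached that ceiling.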

Where Do We Go From Here?

By now, the conclusion is clear.

The human role in the AI development pipeline is narrowing at every stage.

Coding: Claude does it. Code review: Claude does it. Experiment execution: Claude is an order of magnitude faster than humans. Experiment design: Claude is starting to do it on its own...

The last comparative advantage humans have now is research taste and judgment.

But how long can this advantage hold?

Anthropic says in the blog they are unsure.

One possibility is that “research taste,” like other things AI couldn't do before, starts as impossible, then suddenly becomes possible.

Just as understanding humor, demonstrating theory of mind, and solving linguistic puzzles all followed similar curves.

Another possibility is that even if Claude never truly learns research taste, the current acceleration trend means each human researcher can now orchestrate several times more work simultaneously.

You don't need AI to think completely for you; it just needs to handle all the “execution” work, leaving you to make the 5% of directional choices.

Three Possible Futures for RSI

At the end of the blog, Anthropic outlines three possible evolutionary directions for this “self-evolution” trend.

1. Plateau.

Those exponential curves are actually S-curves.

Perhaps research judgment is something that simply cannot be solved by scaling and requires a completely new architectural breakthrough.

Or, the bottleneck lies in energy, chips, the physical supply chain of compute.

Even if AI capabilities plateau at today's level, it will still bring significant changes to the world.

The recent Project Glasswing saw Mythos Preview discover over ten thousand high and critical severity software vulnerabilities in its first few weeks, spanning the world's most critical systems.

2. AI continues to accelerate, but humans keep their hands on the wheel.

Organizational efficiency will improve exponentially, with 100-person companies doing the work of 10,000 or even 100,000.

Anthropic believes we are most likely heading into this scenario.

But they also observed an interesting phenomenon: the embodiment of Amdahl's Law within organizations—

Claude writes code much faster, making code review the new bottleneck. New ideas, tools, and experiments explode far beyond the organization's capacity to absorb them.

Bottlenecks don't disappear; they just shift to the next stage.
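
Amdahl's Law makes this concrete: if only part of a workflow is accelerated, the part that is not accelerated caps the overall speedup. The sketch below is a minimal illustration; the 60/40 split between coding and everything else is a hypothetical assumption, not a figure from Anthropic.

```python
def amdahl_speedup(accelerated_fraction: float, factor: float) -> float:
    """Overall speedup when only part of a workflow gets faster (Amdahl's Law)."""
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - accelerated_fraction) + accelerated_fraction / factor)


# Hypothetical split: coding is 60% of delivery time; review and the rest are 40%.
for factor in (8, 100, 1_000_000):
    print(factor, round(amdahl_speedup(0.6, factor), 2))
```

However fast the coding step becomes in this example, the overall speedup stays below 1 / 0.4 = 2.5, because the unaccelerated 40% (review, in the article's telling) becomes the limiting stage.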

3. AI achieves full recursive self-improvement, beginning to build the next generation of itself.

In this scenario, the speed of AI development depends entirely on compute. Humans retreat to supervisory, verification, and auditing roles.

If this happens, this capability will likely transfer to other scientific fields—medicine, materials, energy—all taking off.

Of course, another future is alignment failure.

In this case, misalignment could accumulate step by step during AI's self-iteration, ultimately leading to—complete loss of control.

One More Thing

The above covers the most critical points of Anthropic's thesis on self-evolution.

Honestly, at first, I didn't take it too seriously. After all, Anthropic is about to IPO. Isn't this a classic “Anthropic-style” PR move?

You know what? This time, it might genuinely be different.

Because just a few days ago, OpenAI published a similar blog post:

“We too see early signs of self-evolution in today's systems: AI development itself is being accelerated by AI. We expect this to intensify competitive pressures among developers and nations, and create governance challenges existing institutions cannot handle. With the emergence of RSI, society needs ways to shape AI's developmental trajectory to ensure it serves human interests.”

The singularity seems to be arriving faster than anyone anticipated.

Blog: https://www.anthropic.com/institute/recursive-self-improvement

References:
[1] https://x.com/kimmonismus/status/2062517474277675102
[2] https://x.com/anthropicai/status/2062568873321513443

This article is from the WeChat public account “QbitAI”, author: Focus on Frontier Technology

Related Questions

Q: According to the article, what percentage of Anthropic's code is now written by Claude, and what was the figure before Claude Code was released?

A: According to the article, over 80% of Anthropic's code is now written by Claude (as of May 2026). Before Claude Code was released, the figure was in the single digits.

Q: What is the new metric Anthropic created to measure AI progress, and how has the capability of their latest model, 'Mythos', performed on it?

A: Anthropic created a new metric called 'the length of tasks an AI can complete independently.' Their latest model, 'Mythos', in internal testing, can work for 'at least' 16 hours continuously, which is the upper limit of what the METR testing framework can measure.

Q: What was the key finding in the AI safety research experiment where Claude was tasked to explore whether a weak model could reliably supervise a strong model?

A: In the AI safety research experiment, two human researchers spent about a week reducing the performance gap by 23%. Claude, using about 800 hours of compute (costing around $18,000), reduced the gap by 97%.

Q: The article mentions three possible future scenarios for Recursive Self-Improvement (RSI). What are they?

A: The three possible future scenarios for RSI mentioned are: 1) Stagnation, where exponential curves are actually S-curves and progress plateaus. 2) AI continues to accelerate, but humans remain in control (the steering wheel). 3) AI achieves full recursive self-improvement and begins to build the next generation of itself autonomously.

Q: Besides Anthropic, which other major AI company recently published a blog with similar concerns about AI self-evolution, and what was the core of their message?

A: OpenAI also recently published a similar blog. The core of their message was that they are seeing early signs of self-evolution in current systems, where AI development is being accelerated by AI itself. They warned this would increase competitive pressures and create governance challenges, emphasizing the need for societal methods to shape AI's trajectory to ensure it serves human interests.
