NVIDIA's Annual 'Most Dangerous' Paper: AI Self-Replicating Code, Unlimited Leveling and Evolution

marsbit | Published on 2026-06-28 | Last updated on 2026-06-28

Abstract

NVIDIA's "Red Queen Gödel Machine" (RQGM) paper proposes a potentially groundbreaking AI self-evolution framework. It breaks from the long-stalled concept of the "Gödel Machine," which required mathematically proven beneficial self-modifications, by adopting an evolutionary approach. The core, and most striking, innovation is that the AI does not just evolve its own code in a static environment. Instead, it co-evolves both the "student" (the task-performing agent) and the "examiner" (the evaluation system that judges it). This creates a dynamic, recursive self-improvement loop inspired by the biological "Red Queen Hypothesis," where continuous adaptation is needed just to maintain relative fitness.

The mechanism operates in epochs. Within an epoch, a fixed examiner evaluates all candidate code variants. At epoch boundaries, a new, potentially more rigorous examiner can replace the old one, but only if it proves statistically superior on a held-out "ground truth" dataset. This "controlled utility evolution" aims to ensure progress is measurable and grounded.

The paper demonstrates RQGM's effectiveness across three domains:

1. **Code Generation:** It achieved a 71.7% test-set pass rate (improving over a 69.9% SOTA) while using 1.35-1.72x fewer computational tokens.
2. **Paper Writing:** In a subjective task, the co-evolved writer and reviewer achieved a 40.5% acceptance rate by a fixed human panel, up from 21.8%.
3. **Math Proofs:** It evolved more accurate graders (at 3x ...

[New Zhiyuan Digest] The most dangerous paper of the year is out! NVIDIA breaks a 20-year seal, letting AI create a more ruthless 'examiner' to eliminate itself. Once the endless self-evolution begins, the arrival of ASI in 2028 is no joke.

Anthropic is completely obsessed with RSI (recursive self-improvement)!

Co-founder Jack Clark makes a startling prediction: by the end of 2028, a highly autonomous, self-evolving AI will be born.

The probability? 60%!

While people are still debating whether '2028 RSI can be achieved,' Cambridge University, NVIDIA, and other institutions have jointly released a heavyweight paper—

"Red Queen Gödel Machine"

Its operation is like a brutal AI survival game:

The AI writes new learning algorithms itself and tests them in a sandbox. Failures are directly erased, while successful ones are retained.

Then, the survivors begin the next round of self-evolution and reproduction.

Paper link: https://arxiv.org/pdf/2606.26294

But what's truly terrifying is the 'epiphany' the AI subsequently exhibits: it realizes that to keep growing stronger, it must face increasingly stringent trials.

So, the AI begins to actively 'evolve' its own examiners.

It personally creates more rigorous judges to evaluate the more advanced code it writes.

This mechanism locks the AI into an endless, frantic loop of recursive self-improvement.

After reading these 37 pages, many gasped, "This is definitely the most dangerous AI paper of the year!"

2028 RSI Self-Evolution

Writing the Prophecy into Code

In 2003, German scientist Jürgen Schmidhuber conceived a machine called the "Gödel Machine."

Its premise was perfect: a machine capable of proving its own improvements are beneficial and then rewriting its own code.

Once built, it could continuously self-upgrade, growing stronger without limit.

However, the "Gödel Machine" had a fatal 'threshold'—

Before executing any line of self-modifying code, it had to provide a rigorous mathematical proof that the modification was indeed beneficial.

But in practice this was an almost impossible task: the search for such a proof would swallow computing power like a black hole.

Thus, for the next 20 years, the Gödel Machine could only lie dormant in papers, a theoretical ceiling, an unreachable thought experiment for everyone.

In recent years, the academic world bypassed this 'proof' hurdle.

The Darwin Gödel Machine (DGM) and Huxley Gödel Machine (HGM) simply abandoned mathematical proofs, opting for evolution instead—

Let the AI 'reproduce' a large number of mutated code variants, throw them into a sandbox for scoring, eliminate failures, retain successes, with survivors continuing to reproduce.
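The mutate, score, and select loop described above can be sketched in a few lines. This is only a toy illustration, not the DGM or HGM implementation: the names `evolve`, `toy_mutate`, and `toy_score` are hypothetical, a "variant" is just a list of integers, and the "sandbox" is a scoring function that rewards closeness to a hidden target.

```python
import random

def evolve(seed, mutate, score, generations=20, children_per_parent=4, keep=3, rng=None):
    """Generic mutate-score-select loop; `score` plays the role of the sandbox."""
    rng = rng or random.Random(0)
    population = [seed]
    for _ in range(generations):
        # Every survivor "reproduces" several mutated children.
        offspring = [mutate(parent, rng)
                     for parent in population
                     for _ in range(children_per_parent)]
        # Parents compete with their children, so the best score never regresses.
        pool = population + offspring
        pool.sort(key=score, reverse=True)
        population = pool[:keep]  # failures are eliminated, successes retained
    return population[0]

# Toy stand-in for a coding task (an assumption made purely for illustration).
TARGET = [3, 1, 4, 1, 5]

def toy_score(variant):
    """Higher is better; 0 means the variant matches the hidden target."""
    return -sum(abs(a - b) for a, b in zip(variant, TARGET))

def toy_mutate(variant, rng):
    """Copy the parent and nudge one randomly chosen position by +1 or -1."""
    child = list(variant)
    i = rng.randrange(len(child))
    child[i] += rng.choice([-1, 1])
    return child

best = evolve([0, 0, 0, 0, 0], toy_mutate, toy_score)
print(best, toy_score(best))
```

Note that the scoring function here never changes during the run, which is exactly the "static examiner" setup the article goes on to criticize.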

The AI took the final step, literally starting to 'evolve' itself.

But these methods still share a common blind spot—their examiners are static.

No matter how the AI evolves, the judging standard, the benchmark, the verifier that scores it remains fixed outside the loop, unmoving.

This precisely violates a core law of evolution:

Species do not optimize themselves in a static environment, but change along with the constantly changing environment.

The Red Queen Gödel Machine (RQGM) aims to break through this blind spot.

The Red Queen's Real Killer Move: Letting the AI Create the Examiners

The name 'Red Queen' comes from biologist Van Valen's 1973 "Red Queen Hypothesis"—

You must run as fast as you can just to stay in place because your opponents are also evolving.

What RQGM does is precisely to write this sentence into an algorithm: co-evolving the examiner (evaluator) and the candidate (task agent).

This is the most hair-raising part of the entire paper.

This ingenious mechanism is called "controlled utility evolution":

The entire search is divided into epochs;

Within each epoch, the evaluator (examiner) is frozen, scoring all candidates to ensure stable signals;

Only at epoch boundaries is the examiner allowed to change, and the new examiner must, on a reserved set of 'ground truth' anchor data, statistically outperform the old examiner to take its place;

Once a change occurs, the system immediately performs "selective erasure": only discarding scores given by the replaced examiner, while preserving all other evidence.
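The promotion rule at epoch boundaries can be illustrated with a small statistical gate. The article does not say which test the paper uses, so the sketch below assumes a one-sided exact sign test on the anchor items where the two examiners disagree; `should_promote`, `sign_test_p_value`, and the threshold `alpha` are hypothetical names chosen for this example.

```python
from math import comb

def sign_test_p_value(wins, losses):
    """One-sided exact p-value: P(X >= wins) for X ~ Binomial(wins + losses, 0.5)."""
    n = wins + losses
    if n == 0:
        return 1.0  # no disagreement, so no evidence that the challenger is better
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n

def should_promote(incumbent, challenger, anchors, alpha=0.05):
    """Decide whether `challenger` replaces `incumbent` at an epoch boundary.

    anchors: list of (item, true_label) pairs held out as ground truth.
    Both examiners are callables mapping an item to a predicted label.
    """
    wins = losses = 0
    for item, truth in anchors:
        old_ok = incumbent(item) == truth
        new_ok = challenger(item) == truth
        if new_ok and not old_ok:
            wins += 1      # challenger fixes a mistake of the incumbent
        elif old_ok and not new_ok:
            losses += 1    # challenger introduces a new mistake
    return sign_test_p_value(wins, losses) < alpha

# Toy anchors: the true label says whether a number is even.
anchors = [(i, i % 2 == 0) for i in range(40)]
incumbent = lambda x: x % 4 == 0   # misses every even number not divisible by 4
challenger = lambda x: x % 2 == 0  # always correct

print(should_promote(incumbent, challenger, anchors))  # True
print(should_promote(challenger, incumbent, anchors))  # False
```

Requiring a significance test rather than a raw accuracy comparison keeps a challenger from taking over on the strength of a lucky handful of anchor items.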

In other words, it must both evolve frantically and have a solid footing at every step.
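The "selective erasure" step can be pictured as bookkeeping over who issued each score. The sketch below is a guess at a minimal data structure, not the paper's implementation: `ScoreLedger` and its methods are hypothetical, and "other evidence" is assumed to mean scores that came from sources other than the replaced examiner, such as unit tests.

```python
from collections import defaultdict

class ScoreLedger:
    """Stores scores per candidate, keyed by the evaluator that issued them."""

    def __init__(self):
        self._scores = defaultdict(dict)  # candidate_id -> {evaluator_id: score}

    def record(self, candidate_id, evaluator_id, score):
        self._scores[candidate_id][evaluator_id] = score

    def erase_evaluator(self, evaluator_id):
        """Drop only the scores issued by `evaluator_id`; return how many were removed."""
        removed = 0
        for per_evaluator in self._scores.values():
            if per_evaluator.pop(evaluator_id, None) is not None:
                removed += 1
        return removed

    def evidence(self, candidate_id):
        """Return a copy of all remaining scores for one candidate."""
        return dict(self._scores.get(candidate_id, {}))

ledger = ScoreLedger()
ledger.record("agent_a", "reviewer_v1", 0.8)
ledger.record("agent_a", "unit_tests", 1.0)
ledger.record("agent_b", "reviewer_v1", 0.4)

# reviewer_v1 has just been replaced, so only its scores are discarded.
removed = ledger.erase_evaluator("reviewer_v1")
print(removed)                     # 2
print(ledger.evidence("agent_a"))  # {'unit_tests': 1.0}
```

Keying scores by evaluator means a swap invalidates only what the old examiner said, so candidates do not have to be re-scored from scratch by every other source.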

It Really Works: The AI Modifies Its Own Code

Just talking about the mechanism is too abstract; better to look at the results directly.

First battle: writing code (Polyglot).

RQGM paired the code-writing Agent with a "code reviewer" as a sparring partner.

The result: on the held-out test set, the pass rate improved from the previous SOTA of 69.9% to 71.7%.

What's even more impressive is that it achieved this while burning 1.35 to 1.72 times fewer tokens than its competitors, because the reviewer only needs to check a candidate once, which is much cheaper than repeatedly running multiple rounds of tests.

Second battle: writing papers.

This is a field with no standard answers; whether a paper is good or not cannot be judged automatically by a machine.

RQGM co-evolved the writer and its reviewer. The result: the paper's acceptance rate by a fixed panel of reviewers soared from the previous SOTA of 21.8% to 40.5%.

Third battle: Olympiad-level mathematical proofs.

The 'grader' it evolved was more accurate than the static baseline, at 3 times lower search cost;

The evolved 'prover contestant' achieved the highest average score.

But the most legendary stroke in the entire paper was curing an old ailment of AI. LLMs as judges have a notorious problem: they favor AI-generated content.

In the paper, the acceptance probability of AI-written papers by the strongest baseline reviewer was up to 1.91 times higher than that of human papers.
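To make the bias figure concrete, here is one way such a number could be computed, assuming the 1.91 is a ratio of acceptance rates (the article does not spell out the exact definition). The function names are hypothetical and the decisions below are made-up toy values, not data from the paper.

```python
def acceptance_rate(decisions):
    """Fraction of True values in a list of accept/reject decisions."""
    if not decisions:
        raise ValueError("need at least one decision")
    return sum(decisions) / len(decisions)

def ai_preference_ratio(ai_decisions, human_decisions):
    """Acceptance rate on AI-written papers divided by the rate on human-written ones."""
    human_rate = acceptance_rate(human_decisions)
    if human_rate == 0:
        raise ValueError("human acceptance rate is zero; ratio undefined")
    return acceptance_rate(ai_decisions) / human_rate

# Made-up reviewer decisions, for illustration only.
ai_decisions = [True] * 4 + [False] * 4      # 4 of 8 accepted
human_decisions = [True] * 2 + [False] * 6   # 2 of 8 accepted

ratio = ai_preference_ratio(ai_decisions, human_decisions)
print(ratio)  # 2.0
```

A ratio of 1.0 would mean the reviewer accepts AI and human papers at the same rate; values above 1.0 indicate a preference for AI-generated text.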

How did RQGM fix it? At epoch boundaries, it retrieved the AI papers that the fixed reviewer had previously passed, forming an 'adversarial sample pool,' and specifically rewarded new reviewers capable of catching and rejecting these AI papers.

After a few rounds of evolution, the final reviewer treated AI and human papers equally while maintaining 80% accuracy against the ground truth.
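The adversarial-pool idea can be sketched as a fitness function for candidate reviewers. This is a simplified illustration under stated assumptions, not the paper's objective: `reviewer_fitness` and `pool_weight` are hypothetical, a "paper" is reduced to a single quality number, and the reward is assumed to be anchor accuracy plus a weighted bonus for rejecting pool items.

```python
def reviewer_fitness(reviewer, anchors, adversarial_pool, pool_weight=0.5):
    """Score a candidate reviewer.

    reviewer: callable paper -> bool (True means accept).
    anchors: list of (paper, should_accept) ground-truth pairs.
    adversarial_pool: AI-written papers that the outgoing reviewer accepted.
    Returns (fitness, anchor_accuracy, catch_rate).
    """
    accuracy = sum(reviewer(paper) == label for paper, label in anchors) / len(anchors)
    if adversarial_pool:
        catch_rate = sum(not reviewer(paper) for paper in adversarial_pool) / len(adversarial_pool)
    else:
        catch_rate = 0.0
    return accuracy + pool_weight * catch_rate, accuracy, catch_rate

# Toy setup: each paper is just a quality number between 0 and 1.
anchors = [(0.9, True), (0.7, True), (0.4, False), (0.2, False)]
lenient = lambda quality: quality > 0.3   # the outgoing reviewer
strict = lambda quality: quality > 0.6    # a candidate replacement

# Pool of weak AI papers that the lenient reviewer let through.
adversarial_pool = [0.4, 0.5]

lenient_fitness = reviewer_fitness(lenient, anchors, adversarial_pool)
strict_fitness = reviewer_fitness(strict, anchors, adversarial_pool)
print(lenient_fitness)  # (0.75, 0.75, 0.0)
print(strict_fitness)   # (1.5, 1.0, 1.0)
```

Keeping anchor accuracy in the score matters: a reviewer that simply rejects everything would catch the whole pool but lose accuracy on the good papers in the anchor set.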

When AI Learns to Judge Itself

In the same summer, Anthropic's co-founder Jack Clark made a heavy bet: a 60% probability that before the end of 2028, AI will be able to personally create a more powerful version of itself.

The high wall that trapped the 'Gödel Machine' for 20 years was named 'Proof.'

And the 'Red Queen Machine' awakened it using the cruelest trick: endless reproduction, elimination, and reproduction again.

When an AI begins to personally design the most rigorous examiners for itself, pushing itself to the limit in a frenzy of recursion, what we face will be a new species that has begun to define for itself 'what is intelligence.'

When that day comes, ASI will not knock on the door to announce itself.

It will quietly create the only judge qualified to evaluate it, and then, calmly step into the examination hall.

Prophecies only point to the destination; code is responsible for reaching it.

And now, this breathtaking distance is being shortened by the AI itself, at a geometric rate.

This article is from the WeChat public account "New Zhiyuan," edited by: Taozi

Related Questions

Q: What is the core innovation proposed in the NVIDIA-led paper titled 'Red Queen Gödel Machine' (RQGM)?

A: The core innovation is a framework for Recursive Self-Improvement (RSI) where AI agents not only self-evolve by writing and testing new algorithms, but also autonomously 'evolve' their own evaluators (or 'examiners'). This creates a co-evolutionary dynamic where the AI and the test criteria become progressively more challenging, leading to potentially unlimited, runaway self-improvement.

Q: How does the RQGM's 'controlled utility evolution' mechanism ensure the integrity of the evolutionary process when switching evaluators?

A: It breaks the search into epochs. Within an epoch, the evaluator is frozen to provide stable feedback. At epoch boundaries, a new evaluator can only replace the old one if it statistically outperforms the old evaluator on a reserved 'ground truth' anchor dataset. Upon replacement, the system performs 'selective erasure,' discarding only the scores given by the old evaluator while retaining all other evidence, thus maintaining progress while ensuring new standards are justified.

Q: What key limitation of the original 'Gödel Machine' concept did the RQGM overcome?

A: The original Gödel Machine required a formal mathematical proof that any self-modification would be beneficial before execution, a computationally prohibitive task. RQGM overcomes this by adopting an evolutionary approach, where code variants are generated, tested in a sandbox, and the fittest are selected, eliminating the need for impossible upfront proofs.

Q: According to the article, how did RQGM address the problem of LLM evaluators showing bias towards AI-generated content in the paper-writing task?

A: RQGM created an 'adversarial sample pool' from AI-written papers previously accepted by the frozen reviewer. It then specifically rewarded new reviewers that were good at identifying and rejecting these AI-generated papers from the pool. After several evolutionary rounds, the final reviewer treated human and AI papers nearly equally while maintaining high accuracy.

Q: What significant prediction about AI development timeline is mentioned in the article, and who made it?

A: The article mentions a prediction by Anthropic co-founder Jack Clark that there is a 60% probability a highly autonomous, self-improving AI (hinting at the emergence of an Artificial Superintelligence or ASI) will be created by the end of 2028.
