NVIDIA's Annual 'Most Dangerous' Paper: AI Self-Replicating Code, Unlimited Leveling and Evolution

marsbit | Published on 2026-06-28 | Last updated on 2026-06-28

Abstract

NVIDIA's "Red Queen Gödel Machine" (RQGM) paper proposes a potentially groundbreaking AI self-evolution framework. It breaks from the long-stalled concept of the "Gödel Machine," which required mathematically proven beneficial self-modifications, by adopting an evolutionary approach. The core, and most striking, innovation is that the AI does not just evolve its own code in a static environment. Instead, it co-evolves both the "student" (the task-performing agent) and the "examiner" (the evaluation system that judges it). This creates a dynamic, recursive self-improvement loop inspired by the biological "Red Queen Hypothesis," where continuous adaptation is needed just to maintain relative fitness.

The mechanism operates in epochs. Within an epoch, a fixed examiner evaluates all candidate code variants. At epoch boundaries, a new, potentially more rigorous examiner can replace the old one, but only if it proves statistically superior on a held-out "ground truth" dataset. This "controlled utility evolution" aims to ensure progress is measurable and grounded.

The paper demonstrates RQGM's effectiveness across three domains:

1. **Code Generation:** It achieved a 71.7% test-set pass rate (improving over a 69.9% SOTA) while using 1.35-1.72x fewer computational tokens.
2. **Paper Writing:** In a subjective task, the co-evolved writer and reviewer achieved a 40.5% acceptance rate by a fixed human panel, up from 21.8%.
3. **Math Proofs:** It evolved more accurate graders (at 3x ...

[New Zhiyuan Digest] The most dangerous paper of the year is out! NVIDIA breaks a 20-year seal, letting AI create a more ruthless 'examiner' to eliminate itself. Once the endless self-evolution begins, the arrival of ASI in 2028 is no joke.

Anthropic is completely obsessed with recursive self-improvement (RSI)!

Co-founder Jack Clark makes a startling prediction: by the end of 2028, a highly autonomous, self-evolving AI will be born.

The probability? 60%!

While people are still debating whether '2028 RSI can be achieved,' Cambridge University, NVIDIA, and other institutions have jointly released a heavyweight paper—

"Red Queen Gödel Machine"

Its operation is like a brutal AI survival game:

The AI writes new learning algorithms itself and tests them in a sandbox. Failures are directly erased, while successful ones are retained.

Then, the survivors begin the next round of self-evolution and reproduction.

Paper link: https://arxiv.org/pdf/2606.26294

But what's truly terrifying is the 'epiphany' the AI subsequently exhibits: it realizes that to keep growing stronger, it must face increasingly stringent trials.

So, the AI begins to actively 'evolve' its own examiners.

It personally creates more rigorous judges to evaluate the more advanced code it writes.

This mechanism locks the AI into an endless, frantic loop of recursive self-improvement.

After reading these 37 pages, many gasped, "This is definitely the most dangerous AI paper of the year!"

2028 RSI Self-Evolution

Writing the Prophecy into Code

In 2003, German scientist Jürgen Schmidhuber conceived a machine called the "Gödel Machine."

Its premise was perfect: a machine capable of proving its own improvements are beneficial and then rewriting its own code.

Once built, it could continuously self-upgrade, growing stronger without limit.

However, the "Gödel Machine" had a fatal 'threshold'—

Before executing any line of self-modifying code, it had to provide a rigorous mathematical proof that the modification was indeed beneficial.

In practice, this was a nearly impossible task: searching for such proofs would swallow computing power like a black hole.

Thus, for the next 20 years, the Gödel Machine could only lie dormant in papers, a theoretical ceiling, an unreachable thought experiment for everyone.

In recent years, the academic world bypassed this 'proof' hurdle.

The Darwin Gödel Machine (DGM) and Huxley Gödel Machine (HGM) simply abandoned mathematical proofs, opting for evolution instead—

Let the AI 'reproduce' a large number of mutated code variants, throw them into a sandbox for scoring, eliminate failures, retain successes, with survivors continuing to reproduce.
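The reproduce, score, eliminate cycle described above can be sketched as a small evolutionary loop. This is only an illustration under stated assumptions, not the DGM or HGM algorithm, which the article does not describe in detail: `evolve`, `toy_mutate`, and `toy_score` are hypothetical names, and the "code variants" are stood in for by short lists of integers. Note that the scorer is fixed for the whole run, which is the limitation discussed next.

```python
import random

def evolve(seed, mutate, score, generations=20, children=8, keep=4, rng=None):
    """Mutate survivors, score every child with one fixed scorer, keep the best."""
    rng = rng or random.Random(0)
    population = [seed]
    for _ in range(generations):
        # "Reproduce": each child is a mutated copy of a randomly chosen survivor.
        offspring = [mutate(rng.choice(population), rng) for _ in range(children)]
        # "Eliminate": parents and children compete; only the top `keep` survive.
        population = sorted(population + offspring, key=score, reverse=True)[:keep]
    return population[0]

# Toy stand-in for a code variant: a list of integers (hypothetical).
TARGET = [3, 1, 4, 1, 5]

def toy_score(candidate):
    # Fixed examiner: negative distance to the target, so higher is better.
    return -sum(abs(a - b) for a, b in zip(candidate, TARGET))

def toy_mutate(candidate, rng):
    # Change one position by +1 or -1.
    child = list(candidate)
    i = rng.randrange(len(child))
    child[i] += rng.choice([-1, 1])
    return child

best = evolve([0, 0, 0, 0, 0], toy_mutate, toy_score)
print(best, toy_score(best))
```

Because parents compete alongside their offspring, the best score never decreases from one generation to the next.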

The AI took the final step, literally starting to 'evolve' itself.

But these methods still share a common blind spot—their examiners are static.

No matter how the AI evolves, the judging standard, the benchmark, the verifier that scores it remains fixed outside the loop, unmoving.

This precisely violates a core law of evolution:

Species do not optimize themselves in a static environment, but change along with the constantly changing environment.

The Red Queen Gödel Machine (RQGM) aims to break through this blind spot.

The Red Queen's Real Killer Move: Letting the AI Create the Examiners

The name 'Red Queen' comes from biologist Van Valen's 1973 "Red Queen Hypothesis"—

You must run as fast as you can just to stay in place because your opponents are also evolving.

What RQGM does is precisely to write this sentence into an algorithm: co-evolving the examiner (evaluator) and the candidate (task agent).

This is the most hair-raising part of the entire paper.

This ingenious mechanism is called "controlled utility evolution":

The entire search is divided into epochs;

Within each epoch, the evaluator (examiner) is frozen, scoring all candidates to ensure stable signals;

Only at epoch boundaries is the examiner allowed to change, and the new examiner must, on a reserved set of 'ground truth' anchor data, statistically outperform the old examiner to take its place;

Once a change occurs, the system immediately performs "selective erasure": only discarding scores given by the replaced examiner, while preserving all other evidence.
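The epoch-boundary rules above can be sketched as follows. The article does not say which statistical test the paper uses, so this sketch assumes an exact one-sided sign test on the anchor items where the two examiners disagree. The function names, the `alpha` threshold, the shape of the score log, and the toy examiners are all hypothetical.

```python
from math import comb

def sign_test_p(wins, losses):
    """One-sided exact sign test: P(X >= wins), X ~ Binomial(wins + losses, 0.5)."""
    n = wins + losses
    if n == 0:
        return 1.0
    return sum(comb(n, k) for k in range(wins, n + 1)) / 2 ** n

def should_promote(old, new, anchors, alpha=0.05):
    """True if `new` beats `old` on the ground-truth anchors at level alpha."""
    wins = losses = 0
    for item, truth in anchors:
        old_ok = old(item) == truth
        new_ok = new(item) == truth
        if new_ok and not old_ok:
            wins += 1
        elif old_ok and not new_ok:
            losses += 1
    return sign_test_p(wins, losses) < alpha

def selective_erase(score_log, replaced_id):
    """Drop only the scores issued by the replaced examiner; keep the rest."""
    return [rec for rec in score_log if rec["examiner"] != replaced_id]

# Toy anchor set (assumed): an item "truly passes" when x >= 10.
anchors = [(x, x >= 10) for x in range(20)]
old_examiner = lambda x: x >= 16  # wrong on items 10..15
new_examiner = lambda x: x >= 10  # agrees with the ground truth

score_log = [
    {"candidate": "a", "examiner": "v1", "score": 0.6},
    {"candidate": "b", "examiner": "v1", "score": 0.7},
    {"candidate": "a", "examiner": "anchor", "score": 1.0},
]
if should_promote(old_examiner, new_examiner, anchors):
    score_log = selective_erase(score_log, "v1")
print(score_log)
```

In this toy run the new examiner wins on six anchor items and loses on none, giving p = 1/64, so it is promoted and only the two `v1` scores are discarded.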

In other words, it must both evolve frantically and have a solid footing at every step.

It Really Works: The AI Modifies Its Own Code

Just talking about the mechanism is too abstract; better to look at the results directly.

First battle: writing code (Polyglot).

RQGM paired the code-writing Agent with a "code reviewer" as a sparring partner.

The result: on the held-out test set, the pass rate improved from the previous SOTA of 69.9% to 71.7%.

Even more impressive, it did so while using 1.35 to 1.72 times fewer tokens than its competitors, because the reviewer only needs to check a solution once, which is much cheaper than repeatedly running multiple rounds of tests.
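To put these figures in perspective, here is a short worked calculation. It assumes that "N times fewer tokens" means a competitor spends N times as many tokens as RQGM; the article does not spell out the definition, and `token_saving` is a hypothetical helper.

```python
def token_saving(factor):
    """Fraction of tokens saved if a competitor uses `factor` times as many."""
    return 1 - 1 / factor

for factor in (1.35, 1.72):
    print(f"{factor}x fewer tokens -> {token_saving(factor):.1%} saved")
# 1.35x fewer tokens -> 25.9% saved
# 1.72x fewer tokens -> 41.9% saved

gain = 71.7 - 69.9  # pass-rate figures quoted in the article
print(f"pass rate gain: {gain:.1f} points ({gain / 69.9:.1%} relative)")
# pass rate gain: 1.8 points (2.6% relative)
```

Under that reading, the accuracy gain is modest (1.8 percentage points), while the token budget shrinks by roughly a quarter to two fifths.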

Second battle: writing papers.

This is a field with no standard answers; whether a paper is good or not cannot be judged automatically by a machine.

RQGM co-evolved the writer and its reviewer. The result: the paper's acceptance rate by a fixed panel of reviewers soared from the previous SOTA of 21.8% to 40.5%.

Third battle: Olympiad-level mathematical proofs.

The 'grader' it evolved was more accurate than the static baseline at one-third of the search cost, and the evolved 'prover' achieved the highest average score.

But the most legendary stroke in the entire paper was curing an old ailment of AI. LLMs as judges have a notorious problem: they favor AI-generated content.

In the paper, the acceptance probability of AI-written papers by the strongest baseline reviewer was up to 1.91 times higher than that of human papers.

How did RQGM fix it? At epoch boundaries, it retrieved the AI papers that the fixed reviewer had previously passed, forming an 'adversarial sample pool,' and specifically rewarded new reviewers capable of catching and rejecting these AI papers.

After a few rounds of evolution, the final reviewer treated AI and human papers equally while maintaining 80% ground-truth accuracy.
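The bias measurement and the adversarial-pool reward can be sketched like this. The article does not give the paper's actual reward formula, so the additive `reviewer_utility` and its `weight` are assumptions, as are the toy papers, the integer `quality` and `polish` fields, and both reviewers.

```python
def acceptance_rate(reviewer, papers):
    return sum(1 for p in papers if reviewer(p)) / len(papers)

def bias_ratio(reviewer, ai_papers, human_papers):
    """Acceptance rate on AI-written papers divided by that on human papers."""
    human_rate = acceptance_rate(reviewer, human_papers)
    if human_rate == 0:
        return float("inf")
    return acceptance_rate(reviewer, ai_papers) / human_rate

def reviewer_utility(reviewer, anchors, adversarial_pool, weight=0.5):
    """Anchor accuracy plus a bonus for rejecting papers in the adversarial pool."""
    accuracy = sum(reviewer(p) == truth for p, truth in anchors) / len(anchors)
    rejected = 1 - acceptance_rate(reviewer, adversarial_pool)
    return accuracy + weight * rejected

# Toy papers (assumed): integer scores out of 10.
ai_papers = [{"quality": 4, "polish": 7}, {"quality": 7, "polish": 7}]
human_papers = [{"quality": 4, "polish": 2}, {"quality": 7, "polish": 3}]

biased = lambda p: p["quality"] + p["polish"] >= 10  # swayed by surface polish
fair = lambda p: p["quality"] >= 6                   # looks at quality only

# Ground truth for the anchors: a paper deserves acceptance when quality >= 6.
anchors = [(p, p["quality"] >= 6) for p in ai_papers + human_papers]
# Adversarial pool: AI papers that the old (biased) reviewer had passed.
pool = [p for p in ai_papers if biased(p)]

print(bias_ratio(biased, ai_papers, human_papers))  # 2.0
print(bias_ratio(fair, ai_papers, human_papers))    # 1.0
print(reviewer_utility(biased, anchors, pool))      # 0.75
print(reviewer_utility(fair, anchors, pool))        # 1.25
```

Keeping anchor accuracy in the utility matters: a reviewer that simply rejects everything would collect the full pool bonus but lose accuracy on the good papers.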

When AI Learns to Judge Itself

In the same summer, Anthropic's co-founder Jack Clark made a heavy bet: a 60% probability that before the end of 2028, AI will be able to personally create a more powerful version of itself.

The high wall that trapped the 'Gödel Machine' for 20 years was named 'Proof.'

And the 'Red Queen Machine' awakened it using the cruelest trick: endless reproduction, elimination, and reproduction again.

When an AI begins to personally design the most rigorous examiners for itself, pushing itself to the limit in a frenzy of recursion, what we face will be a new species that has begun to define for itself 'what is intelligence.'

When that day comes, ASI will not knock on the door to announce itself.

It will quietly create the only judge qualified to evaluate it, and then, calmly step into the examination hall.

Prophecies only point to the destination; code is responsible for reaching it.

And now, this breathtaking distance is being shortened by the AI itself, at a geometric rate.

This article is from the WeChat public account "New Zhiyuan," edited by: Taozi
