RSI, the Idea of AI Building Itself, Is Taking Off: Google Pours Cold Water, While DeepSeek and Its Peers Have Touched the Edge

marsbit · Published on 2026-06-06 · Last updated on 2026-06-06

Abstract

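"Recursive self-improvement" (RSI) has recently sparked heated discussion in AI. It refers to AI systems training and building themselves in order to keep improving. Overseas, Recursive Superintelligence and Andrej Karpathy's Auto-Research project, among others, are actively exploring the idea, aiming for a fully automated AI research loop. Google CEO Sundar Pichai is more cautious and says the accelerated stage that RSI describes has not yet arrived. One framework divides progress toward RSI into three levels: AI can conduct research on its own (adequacy), AI research matches human research in quality (parity), and AI alone outperforms human-AI collaboration (supremacy). The projection is that once parity is reached, progress could accelerate sharply. Chinese vendors rarely mention RSI in public, but their actual R&D already touches on related ideas. DeepSeek, for example, raises efficiency through algorithmic optimization, and Baidu Wenxin uses reinforcement learning to drive model self-optimization. In top-tier talent density and frontier exploration, however, China is still a follower. RSI also faces practical obstacles: training on AI-generated data can degrade quality (model collapse), and the approach depends on unlimited compute and an open, globally collaborative ecosystem, while high compute costs and technology decoupling act as constraints. The broader trend is that humans are gradually stepping back from every stage of the AI R&D chain. This automation can raise efficiency, but it may also weaken human understanding of, and control over, the underlying technology.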

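The word "recursive" has suddenly caught fire in AI circles.

Two startups have put the word directly into their company names, and many labs have begun adding a three-letter acronym, RSI, to their roadmaps. It stands for recursive self-improvement. Like AGI, RSI is becoming an industry code word that excites and unsettles people in equal measure, even though its definition is not yet agreed.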

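What is RSI? Put simply, it means letting AI train itself. In technical circles, RSI has long been seen as one of the main markers of AI progress, alongside memory, reasoning, and multimodality. In this picture the only limit is compute: humans are no longer a necessary ingredient and barely even count as helpers.

It sounds like science fiction, or perhaps it sounds dangerous. But on reflection, this is not the AI industry's first frenzy. From AlphaGo in 2016 to ChatGPT in 2023 to today's parameter arms race among large models, the industry's nature is to chase the next thing that "changes everything." In the view of 雷科技AGI (ID: leikejiagi), RSI may well be the next party.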

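RSI Takes Off: When AI Can Build Itself Through "Recursion"

This May, the well-known AI researcher Richard Socher launched, with considerable fanfare, a new company called Recursive Superintelligence, whose name abbreviates to RSI.

He said: "Our core goal is to build truly recursive, self-improving superintelligence, in which the entire research process of ideation, implementation, and validation is fully automated."

Another case that insiders love to discuss is Auto-Research, a project driven by Andrej Karpathy. It uses a swarm of agents to train language models, letting the models carry out simple research tasks and improve themselves.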

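Karpathy is something of a legend: he left a solid mark on autonomous driving at Tesla and on GPT at OpenAI. He is now going all in on RSI as his next stop, and doing so openly, which suggests he genuinely believes it can be done.

Notably, he is unusually candid about the project, posting regular progress updates on Twitter and keeping the code in a public GitHub repository. Karpathy himself notes that the work so far iterates on small GPT-2-class models and is "not groundbreaking research (yet)," but that has been enough to draw in a large number of researchers.

More importantly, Karpathy recently joined Anthropic's pre-training team. Anthropic has Claude, and Karpathy has the Auto-Research methodology. Put the two together, a large model plus a self-training loop, and if it works it will no longer be small-scale GPT-2-class tinkering.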

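Another company, Adaption, has released a tool called AutoScientist that aims to automate the training of frontier models. The logic matches Karpathy's Auto-Research: train agents to make incremental improvements. Adaption is more ambitious, though, and wants to close the training loop for an entire full-size frontier model.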
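The "agents making incremental improvements" idea can be pictured as a propose, evaluate, keep-if-better loop. The sketch below is only an illustration under stated assumptions: it is not code from Auto-Research or AutoScientist, `evaluate` is a hypothetical stand-in for training a small model and measuring validation loss, and the `lr` and `width` settings are invented for the example.

```python
import random


def evaluate(config):
    """Hypothetical stand-in for 'train a small model, return validation loss'.

    A toy bowl-shaped function is used so the example runs instantly.
    """
    return (config["lr"] - 0.3) ** 2 + (config["width"] - 64) ** 2 / 1000.0


def propose(config, rng):
    """Return a copy of the config with one setting nudged (an assumed agent action)."""
    candidate = dict(config)
    key = rng.choice(sorted(candidate))
    if key == "lr":
        candidate["lr"] = max(1e-4, candidate["lr"] * rng.uniform(0.5, 1.5))
    else:
        candidate["width"] = max(1, candidate["width"] + rng.choice([-8, 8]))
    return candidate


def improve(config, steps=200, seed=0):
    """Keep a proposed change only when it lowers the measured loss."""
    rng = random.Random(seed)
    best, best_loss = config, evaluate(config)
    history = [best_loss]
    for _ in range(steps):
        candidate = propose(best, rng)
        loss = evaluate(candidate)
        if loss < best_loss:  # accept only strict improvements
            best, best_loss = candidate, loss
        history.append(best_loss)
    return best, history


if __name__ == "__main__":
    start = {"lr": 1.0, "width": 16}
    best, history = improve(start)
    print("start loss:", history[0])
    print("final loss:", history[-1], "with", best)
```

Because a change is accepted only when the loss drops, the recorded loss can never get worse from one step to the next. The hard part in practice is everything the toy hides: making `evaluate` cheap and trustworthy enough that an agent cannot improve the score without improving the model.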

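The two represent different routes. Karpathy validates piece by piece from the bottom up, open-sourcing as he goes and building momentum in the community. Adaption targets commercial large-model training directly and is more eager to deploy. Which route works first will shape the industry very differently.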

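Google's CEO Pours Cold Water: We Are Not There Yet

On RSI, the big names in AI do not agree.

On a podcast last month, Google CEO Sundar Pichai acknowledged the reality in fairly careful terms: "(RSI) is a continuum, and we are all definitely making progress. But the way people describe RSI, it represents the next order of magnitude of acceleration, with a lot of implications, and we are not there yet."

Even so, the "continuum" he describes already contains plenty that is unsettling on closer thought.

This January, a programmer at Anthropic who leads Claude Code development said that close to 100% of the team's code is written by Claude Code. That is AI writing itself in a literal sense: not AI helping engineers write code, but an AI tool that, to some degree, is already replacing engineers in writing its own code.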

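Anthropic also ran an internal survey on the Mythos preview: 5 of 18 engineers believed that, with some improvement to the supporting systems, this version of Mythos could replace an L4 engineer, meaning a mid-level programmer who can take on complex projects independently without real-time supervision.

But the shortcomings are spelled out too: "The main weaknesses reported for Claude include managing ambiguous tasks that run a week or longer, understanding organizational priorities, taste, verification, instruction following, and epistemics." In other words, it is weak precisely at self-direction, and self-direction is the foundation of RSI.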

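Interestingly, Georgetown's Center for Security and Emerging Technology (CSET) convened a group of experts last year specifically to study RSI. Their assessments split sharply: some expect an imminent "superintelligence explosion," while others expect slower progress that eventually hits a bottleneck.

But they agreed on one point: recursion makes the future exceptionally hard to predict.

An essay by METR researcher Ajeya Cotra breaks the path to RSI into several milestones, and I find it the most useful analytical framework available so far.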

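The first level is "adequacy": with humans removed entirely, the system can still do research. It may be worse than humans, but it runs.

The second level is "parity": research done by AI alone matches the quality of research done by humans alone.

The third level is "supremacy": the AI-only system outperforms a system in which humans and AI collaborate.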

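It resembles the L2 through L5 levels of autonomous driving. Cotra's judgment is that we are already close to the first level. She gives no timeline for the second, but her projection is explicit: once the second level arrives, the subsequent acceleration will far exceed anything before it, and "within a year we could reach the third level."

Why so fast? Because at the second level, AI becomes a research team that needs no sleep, no meetings, and no KPI alignment. It can try, revise, and try again around the clock. Even the most efficient human researcher gets only a few hours of deep work a day, broken up by countless interruptions and communication overhead. Once that bottleneck disappears, the pace of progress climbs steeply.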

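No One in China Is Shouting RSI, but DeepSeek and Its Peers Have Touched the Edge

After all this talk of overseas progress, you may be asking: what about China?

Frankly, Chinese vendors rarely talk about RSI in public. Overseas AI companies can write "recursive superintelligence" into their mission statements, which is almost unimaginable in China. But when it comes to letting AI improve itself, Chinese vendors have quietly reached the edge along several different paths.

The most typical example is DeepSeek. It spends an order of magnitude less than OpenAI, yet it can go head to head on many reasoning tasks. It gets there through extreme optimization of algorithmic efficiency: an MoE architecture, aggressive compression of activated parameters, and engineering refinement of its training strategy.

This has little to do with RSI directly, but it is a path that replaces brute-force compute with smarter methods. And that path happens to be part of RSI's core logic: letting the model find the smarter route through iteration.

At Baidu Wenxin, using reinforcement learning to drive model self-optimization is already routine. It does not carry the RSI name, but it is the same kind of work: letting a model keep improving on specific tasks through a self-feedback loop. Seen this way, it is not that Chinese vendors are not doing RSI. They have turned parts of RSI into everyday engineering practice without attaching the label.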

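The gap is real, of course. No Chinese company can currently match the talent density of OpenAI and Anthropic, which means that in exploring RSI, China is for now still a follower.

But history suggests that Chinese vendors catch up at striking speed once the path is clear. The RSI framework is being laid out ever more clearly by leading overseas researchers, and Karpathy's code is public on GitHub. Once a reproducible path is proven, the cost control and density of deployment scenarios that Chinese players bring will be a variable the market has badly underestimated.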

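At the same time, some cold water is in order. The logic of RSI is that AI generates good data, that data trains the next generation, and the next generation comes out stronger. In practice, when AI-generated data is used to train the next version, quality tends to fall.

AI-generated data often carries the model's own hallucinations, biases, and quality loss. This second-hand data is fed to the next version, which produces even worse third-hand output, and after a few generations the whole system breaks down. It is like a photocopier copying copies: by the tenth sheet the face is a blur.

Academics call this model collapse, and published papers have confirmed that the phenomenon is real.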
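The photocopier effect can be reproduced in a toy setting. The sketch below is an assumption-laden illustration, not a result from any paper: the "model" is just a Gaussian described by a mean and a standard deviation, each generation is fitted only to a small sample drawn from the previous generation, and the function name `simulate_collapse` and its default settings are invented for the example.

```python
import random
import statistics


def simulate_collapse(generations=200, sample_size=10, seed=0):
    """Refit a Gaussian to its own samples, generation after generation.

    Returns the fitted standard deviation at every generation, starting
    with the original "real data" distribution at index 0.
    """
    rng = random.Random(seed)
    mean, std = 0.0, 1.0  # generation 0: the real data distribution
    history = [std]
    for _ in range(generations):
        # The next model only ever sees data produced by the current one.
        samples = [rng.gauss(mean, std) for _ in range(sample_size)]
        mean = statistics.fmean(samples)
        std = statistics.pstdev(samples)  # fitted spread of this generation
        history.append(std)
    return history


if __name__ == "__main__":
    history = simulate_collapse()
    print("spread at generation 0:", history[0])
    print("spread at final generation:", history[-1])
```

Each refit loses a little of the tails, and the losses compound, so the fitted spread drifts toward zero: later generations describe an ever narrower slice of the original data. Real language models are far more complex than a single Gaussian, so this shows the mechanism rather than its magnitude.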

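Furthermore, the ideal environment RSI needs does not exist in the real world. For such a system to run, two preconditions are both required: unlimited compute and a globally open, collaborative research ecosystem.

In reality, the cost of training a frontier model has reached the billion scale. Chip capacity is limited, energy is limited, and high-quality data is running short. Export controls and technology decoupling are carving AI research into circles that do not exchange with one another, so neither people nor goods can move freely. If even these basic conditions cannot be met, there is little point in talking about RSI.

RSI is no longer just a technical problem. It also needs a sufficiently open world, and whether that holds is not something the tech community gets to decide.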

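Closing Thoughts

One last observation I find interesting: over the past five years, the industry first had large-scale pre-training pull people into "parameter worship," then had RLHF (reinforcement learning from human feedback) convince people that "values can be fine-tuned," and now has RSI telling a story in which "machines run the entire R&D chain themselves." Each step moves humans one step back: not out of the industry, but out of the decision chain.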

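This retreat is not necessarily bad, but it is irreversible. Once automation takes over a stage, human intuition, experience, and judgment at that stage slowly atrophy, much as your sense of direction fades once you come to rely on GPS.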

By then, we may not truly understand even how our tools are made.

Related Questions

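Q: What is RSI (recursive self-improvement), and why does it matter for AI development?

A: RSI refers to AI systems raising their own capabilities through self-training and self-improvement, with human involvement reduced to a minimum. The article describes it as one of the main markers of AI progress, alongside memory, reasoning, and multimodality, and as a key path toward more advanced AI and even superintelligence. At its core, the AI autonomously completes the loop of research, optimization, and iteration.

Q: What is the goal of Andrej Karpathy's Auto-Research project, and how far along is it?

A: Auto-Research aims to use a swarm of agents to train language models, letting the models carry out research tasks and improve themselves. So far Karpathy is mainly iterating on small GPT-2-class models and developing the project in the open. He recently joined Anthropic's pre-training team, which could mean his methodology will be combined with large models such as Claude to explore RSI at greater scale.

Q: What is Google CEO Sundar Pichai's stance on RSI, and what did he say?

A: Pichai is cautious. He describes AI self-improvement as a "continuum" on which progress is being made, but says it has not reached the "next order of magnitude of acceleration" that people usually mean by RSI. He acknowledges the progress and that such acceleration would have many implications, while stressing that "we are not there yet."

Q: What key milestones for RSI does the article mention, and what are the potential risks?

A: Drawing on METR researcher Ajeya Cotra's framework, the article divides the path to RSI into three milestones: 1) "adequacy," where an AI system can conduct research without humans; 2) "parity," where AI research matches human research in quality; 3) "supremacy," where AI alone outperforms human-AI collaboration. The risks include "model collapse" (quality degrading over generations when AI trains on its own generated data) and real-world constraints such as shortfalls in compute, data, and an open collaborative ecosystem.

Q: How are Chinese players such as DeepSeek progressing in RSI-related areas, and how do they compare with overseas peers?

A: Chinese vendors such as DeepSeek and Baidu Wenxin do not promote the RSI concept loudly, but they have made progress along related paths. DeepSeek achieves strong reasoning performance at lower cost through algorithmic optimization (such as an MoE architecture and training-strategy refinement), and Baidu Wenxin uses reinforcement learning to drive model self-optimization. Compared with overseas peers, China remains a follower in talent density and frontier exploration, but it has potential in fast catch-up and cost control, and it tends to treat parts of RSI as engineering practice rather than as a standalone concept.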
