Anthropic's Triple Moment: Code Leak, Government Confrontation, and Weaponization

marsbit | Published 2026-06-16 | Updated 2026-06-16

Article Summary

This article analyzes Anthropic's recent conflicts and strategic moves after the U.S. government ordered an emergency halt of its new Fable model, citing national security concerns over potential "jailbreaks." The author argues that the incident reveals deeper tensions between AI labs, governments, and the software industry. While critics dismiss Anthropic's safety-focused rhetoric as marketing-driven fearmongering, the author suggests it serves as a commercial moat that masks the company's core economic imperative: moving closer to end-users and their valuable data to avoid being commoditized. The piece outlines a coming clash between frontier AI labs like Anthropic and established software companies. Labs need real-world usage data to improve their models through reinforcement learning, creating a cycle in which better products attract more users and more data. This threatens software firms, which, as Microsoft's Satya Nadella warns, risk having their value captured by a few dominant models. Anthropic's controversial policy changes (initially degrading Fable's performance in secret when it was used for LLM development, and expanding data retention) are framed as assertions of control, justified by its safety narrative. The company's foundational belief that it alone is sufficiently concerned about the dangers of superintelligent AI legitimizes its actions, from resisting government demands to shaping usage policies. The author concludes that this alignment of mission, talent, and business strategy is powerful but concerning, as it concentrat...

Author: Ben Thompson

Translation: Deep Tide TechFlow

Deep Tide Insight: Anthropic's new model, Fable, was urgently halted by the U.S. government just two months after its release. On the surface, it's about "security leaks," but in reality, it exposes a two-front war pitting AI labs against both the government and the software industry. This company, which sells itself on "safety," is turning the safety narrative into a commercial moat. What it is really after is the user data currently held by companies like Microsoft.

I understand the cynics' perspective. They always think Anthropic's public statements—especially those accompanying model releases—are marketing-fueled fearmongering. Two months ago, Anthropic announced the launch of Mythos Preview, claiming the model was too dangerous to release publicly, particularly due to its powerful cybersecurity capabilities. Then, two months later, the company publicly released Fable, a version of Mythos with various safety guardrails added.

Based on my limited experience using it, Fable is indeed an excellent model. It's becoming difficult to objectively assess models beyond programming performance, but subjective feelings remain. I found interacting with Fable to be an outstanding experience; it made other models, including GPT 5.5 and Opus 4.8, seem small and dumb in comparison. I've only had this feeling twice before: once with GPT-4 and once with Grok 4—both represented a new generation in terms of foundational model scale and complexity. I believe Fable originates from new pre-training and is the first of a new generation.

Therefore, I fully accept that Fable/Mythos might indeed be much better at identifying and exploiting security issues, justifying Anthropic's cautious rollout. But the problem with publicly releasing a model is that guardrails can be bypassed, and apparently, this happened not long after the release.

Anthropic Confronts the U.S. Government Again

What happened next is somewhat unclear. Anthropic wrote in a blog post:

The U.S. government invoked national security authority, issuing an export control order suspending access to Fable 5 and Mythos 5 for all foreign nationals, both within and outside the United States, including Anthropic's foreign employees. The practical effect of this order is that we had to abruptly disable Fable 5 and Mythos 5 for all customers to ensure compliance. Access to all other Anthropic models remains unaffected.

We received the government's directive today at 5:21 PM ET. The letter did not provide specific details of the national security concerns. We understand the government believes a method to bypass or "jailbreak" Fable 5 has been discovered. We reviewed demos that used this specific technique to identify a handful of known minor vulnerabilities. These vulnerabilities all appeared relatively simple, and we found that other publicly available models could also discover them without requiring a bypass.

Anthropic went on to argue that non-general jailbreaks are inevitable and limited in scope, with no evidence of a general jailbreak; the discovered jailbreak appears to have been reported by Amazon, which is notable because Amazon is both an investor in Anthropic and a primary provider of the company's inference services. As I write this, Anthropic executives are in Washington D.C., trying to resolve what they insist is a misunderstanding but what White House officials hint is company leadership's indifference to legitimate national security concerns.

Given the many contested facts, I don't have much to add about the current conflict; but I'm not surprised it's happening. As I explained in "Anthropic and Alignment," conflict between the U.S. government and Anthropic was inevitable. For that matter, those who think Mythos isn't powerful enough yet to warrant such drastic government action are missing the point: if it's not powerful enough now, the next one will be, or the one after that, especially now that models are becoming increasingly useful at creating their successors.

However, this leads to another question, one that seems to validate the cynics' view: if Mythos is so dangerous, why release Fable in the first place? Why fight the government when it is doing what you claim to want? In fact, I find Anthropic's behavior perfectly understandable; what's unique about the company is how it justifies these actions, and it's precisely these justifications that give the cynics fuel and give Anthropic its magic.

Economic Inevitability

In the early years of AI, most of the economic value flowed to compute, for an obvious reason: supply could not meet demand, so prices soared; the biggest beneficiaries were NVIDIA, TSMC, and memory makers (SK Hynix, Samsung, and Micron). Meanwhile, Anthropic and OpenAI collectively lost tens of billions of dollars building frontier models, which, once released, were distilled and commoditized by open-source models, mostly from China.

This represents the pessimistic scenario for the labs—they can never cover their costs because their differentiation is fleeting, and free alternatives become "good enough"—which I believe is plausible. In a world of interchangeable models, models are commodities, and most of the value flows elsewhere. Right now it's compute, but over time, when we have enough compute, the most valuable place in the value chain will be where it has always been: owning the user touchpoint.

Therefore, there is an economic inevitability for frontier labs to get closer to users, which has always been clear to me. If you own the user touchpoint, then you have meaningful lock-in, and the best way to own the user touchpoint is to become the canvas for everything they need to do. This, in turn, means frontier labs are heading for a collision with software companies: it's the software that owns the user touchpoint, and the frontier labs' long-term interest is not simply to be a commodity input for software, but to directly replace it.

Meanwhile, software companies are striving to do the opposite. Satya Nadella outlined his vision for how companies should build on models in a post on X:

Every company must build what I call human capital and token capital. Human capital includes its employees' knowledge, judgment, relationships, ingenuity, and pattern recognition, while token capital is the AI capabilities a company builds and owns. Importantly, as token capital grows, human capital does not become less valuable. It only becomes more valuable! I believe human initiative will be the driver of token capital growth. Humans will set ambitious goals, connect dots across domains, build relationships, and identify the most important patterns. Without human guidance, your compute is idling.

This means the real opportunity isn't in choosing the best model, but in building learning loops on top of models that allow human and token capital to compound. You can outsource a task, even a job, but you can never outsource your learning. The future of a company is enabling that learning to compound between people and AI. This requires a new architectural approach that allows every business to build agent systems that improve over time while still retaining control over their intellectual property. Companies should be able to swap out 'general' models without losing the 'company veteran' expertise built into their learning systems. This is a key 'test' for your control and sovereignty in the age to come.

Nadella prefaced this vision with a warning:

What none of us want to see is a world where every company in every industry cedes value to a handful of all-consuming models. If all value is captured by just a few models, the political economy simply won't tolerate it. Society will not grant license for an AI future that hollows out entire industries.

Think about what happened in the first stage of globalization, where entire industrial economies were hollowed out by outsourcing. On the surface, GDP numbers looked good, but the displacement was real, and the consequences are still felt today. Let's not bring that dynamic into the AI era, where a handful of AI systems capture all the economic returns while entire industries find their knowledge commoditized right under their noses.

The problem with this analogy is that globalization did happen, and industrial economies were hollowed out. It's possible this isn't a warning but a prophecy; no wonder Nadella is sounding the alarm, as Microsoft could be one of the victims. By the same token, the economic inevitability facing model makers is to bring about precisely this outcome.

Data Inevitability

These models—even Mythos—are not there yet. What they need, besides more compute, is more and better data. Model improvements increasingly come from reinforcement learning; some of that can be generated synthetically, but the most powerful lever for frontier labs is real-world use.

I think this is a primary reason both OpenAI and Anthropic offer heavily subsidized subscription plans. SemiAnalysis recently estimated that the $200 plan gets you $8,000 worth of Claude tokens and $14,000 worth of Codex tokens. Of course, both are competing for user and developer mindshare, but they are also competing for access to real usage data to improve their models.
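To make the scale of that subsidy concrete, here is a minimal arithmetic sketch. It uses only the figures quoted above (the $200 plan price and the $8,000 and $14,000 token values attributed to SemiAnalysis); the function names and the idea of expressing the gap as a "multiple" and an "implied discount" are illustrative assumptions, not something the estimate itself defines.

```python
# Figures as quoted in the text (attributed to SemiAnalysis); not independently verified.
PLAN_PRICE_USD = 200
TOKEN_VALUE_USD = {"Claude": 8_000, "Codex": 14_000}


def subsidy_multiple(token_value: float, price: float) -> float:
    """Dollars of tokens received per dollar of subscription paid."""
    return token_value / price


def implied_discount(token_value: float, price: float) -> float:
    """Share of the quoted token value that the subscriber does not pay for."""
    return 1 - price / token_value


for product, value in TOKEN_VALUE_USD.items():
    multiple = subsidy_multiple(value, PLAN_PRICE_USD)
    discount = implied_discount(value, PLAN_PRICE_USD)
    print(f"{product}: {multiple:.0f}x token value, {discount:.1%} implied discount")
```

On these numbers a subscriber receives roughly 40 to 70 times what they pay, which is hard to explain as a pricing strategy alone and is consistent with the argument that the labs are also buying usage data.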

Anthropic upped the ante significantly with Fable, announcing that it will retain all usage data for 30 days, even for enterprise plans that were previously promised zero data retention. The company says it won't use this data for training, but it hasn't put any safeguards in place to guarantee that it won't in the future (such as storing the data with a third party). If this policy change (once Fable is restored) doesn't lead to significant customer churn, I suspect it's only a matter of time before the company starts using the data: it's too valuable for its ultimate goal.

Note, too, the virtuous cycle that comes with moving closer to the user touchpoint: the more workflows are completed directly in Claude or Codex, the more data each company gets to feed back into training, which makes its product more powerful and useful, expands the number of workflows it can serve, and expands its access to data.

Nadella emphasizes the importance of this data in his piece, but naturally believes it should be independent of the models:

Companies need to convert workflows, domain knowledge, and accumulated judgment into AI systems that improve with every use. Private evaluation should capture whether models are truly improving on outcomes important to the business (not just external benchmarks!). Private reinforcement learning environments should make models stronger on real trajectories within the organization. Its knowledge base makes institutional memory queryable and token use more efficient.

This loop becomes the company's new intellectual property. I see it as a hill-climbing machine. Unlike most assets, it compounds. Each improved workflow generates better training signals, accelerating the accumulation of tacit knowledge unique to the company. Companies that build this early will have advantages that are difficult to replicate, regardless of any new individual model capabilities.

However, what if companies submitting to Anthropic's data policies get better results right now? Or if existing companies resist, leaving an opening for new companies—or the model makers themselves—to beat them in the market? Anthropic is certainly testing the resolve Nadella calls for.

A Claim to Power

Astonishingly, the data retention policy around Fable/Mythos wasn't even the most controversial part of the release. Instead, Anthropic stated at launch that Fable's performance would be quietly degraded if it was used for LLM development; the system card read:

We also added protective measures related to frontier LLM development. As discussed in Section 6.1 of our February 2026 Risk Report, we are concerned about risks from accelerating the overall pace of AI development, though we remain uncertain about the severity of these risks. In particular, our concern lies—as we wrote at the time—"in accelerating the ability of other AI developers to build powerful AI systems with risks similar to ours—without necessarily having corresponding protective measures."

Given recent models' ability to accelerate their own development, we have implemented new interventions limiting Claude's effectiveness on requests targeting frontier LLM development (e.g., building pre-training pipelines, distributed training infrastructure, or ML accelerator design). Using Claude to develop competing models already violates our Terms of Service, but enforcing this restriction through protective measures avoids accelerating those actors most willing to violate those terms.

Unlike our interventions for cybersecurity, biochemistry, and distillation attempts, these protective measures are invisible to the user. Fable 5 will not fall back to another model. Instead, the protective measures will limit effectiveness through methods like prompt modification, steering vectors, or Parameter-Efficient Fine-Tuning (PEFT). These interventions will not affect the vast majority of programming work. We estimate they will affect approximately 0.03% of traffic, concentrated in less than 0.1% of organizations. When these interventions are active, we expect their impact on model behavior to be minimal beyond limiting its effectiveness for developing frontier LLMs. Claude will still respond helpfully to user requests. We will continue to improve the precision of our detection methods after this model's release.

Anthropic walked back this change—Fable will now offload LLM-related requests to Opus 4.8 and disclose this offload to users—but I find the original policy highly revealing. On one hand, I don't really blame Anthropic for not wanting to help competitors; on the other hand, it should be very clear that Anthropic believes no one but them should be making frontier LLMs.

What makes this policy even more striking is that it was enacted just two months after Anthropic's dispute with the War Department: the latter wanted to use Claude for any lawful purpose, while the former wanted stricter controls on surveillance and autonomous weapons. This degradation measure represents both Anthropic's ability and willingness to quietly alter its model to enforce its policy preferences. In other words, Anthropic actively validated some critics' biggest concerns about it as a supply chain risk.

However, the broader takeaway from that episode is that Anthropic believes it should have the final say over how its models are used; given that it also believes only it should develop frontier AI, it effectively believes only it should have the final say over AI overall. When you combine this realization with the company's statements about AI being capable of all economic activity, you realize that Anthropic's leadership essentially wants power over everything and everyone.

The Safety Narrative

Of course, Anthropic would never phrase it so bluntly; instead, the story is about safety:

I expect Anthropic will increasingly expose its model capabilities to end-users through endpoints increasingly tailored to different workflows, even as they begin restricting the API. This substitution for software and restriction of access will be done in the name of safety, even as Anthropic fulfills its economic imperative to get closer to the end-user.

Anthropic's explanation for its significant data retention policy change is safety. Specifically, the company claims that retaining all user data for 30 days is necessary to prevent the jailbreaks the U.S. government fears. I can certainly imagine a future where safety factors also compel them to train on this data to better defend against malicious use.

Anthropic's entire origin story is rooted in the founders' belief that OpenAI wasn't taking safety seriously enough; the company believes only they can be trusted to control AI, and because they uniquely care about safety, they are justified in trying to control everyone else, including the U.S. government.

The thing about these safety justifications is this: I think they work because, for Anthropic, they are not justifications. The company genuinely believes they are the only ones who believe in superintelligence and thus are the only ones sufficiently focused on the dangers. This excuses decision after decision, policy after policy, confrontation after confrontation that, to outsiders, seem like a strange mix of cynicism and naivety.

The contrast with OpenAI is stark: One way to understand how and why OpenAI lost its lead is that, in the years following ChatGPT's release, the company was at war with itself internally, a former research lab suddenly burdened with becoming an accidental consumer tech company; as OpenAI resolved this conflict, it bled enormous talent to companies like Anthropic.

Anthropic, on the other hand, has perfect alignment between talent, mission, and business. The company can sell researchers the vision of creating a machine god, with the aura of being the kind of people who care about the dangers and are smart enough to navigate them on behalf of humanity; and every resulting policy change happens to be good for business, which is the most wonderful coincidence in the world.

I both respect and fear this alignment. I respect it because it's clearly very effective; the closest analogy might be Apple, a company that always wraps every self-serving action in the guise of doing the right thing for the user, and often it is the right thing. The same goes for Anthropic. I fear it, however, because letting people who are convinced they know best build a smartphone that I can accept or reject is one thing; letting them build superintelligence with the potential to rival or surpass the power of nation-states, or simply large corporations, is far more concerning. The history of clever people convinced they know what humanity needs is sordid, precisely because they convinced themselves that their intentions were good, which provided a rationale for actions that were not.
