AI Giants Enter the Dark Forest

marsbit · Published 2026-04-25 · Last updated 2026-04-25

Introduction

In the AI industry's "dark forest," major players like Anthropic, OpenAI, and DeepSeek are strategically withholding their most advanced models to avoid becoming targets in a high-stakes competitive landscape. Anthropic released Claude Opus 4.7 but admitted it underperforms their unreleased model Mythos, which they are withholding on safety grounds. They delayed addressing user complaints about performance regression until OpenAI’s GPT-5.5 launch, highlighting a tactic of controlled disclosure aligned with competitors’ moves. OpenAI’s GPT-5.5, though the first full retrain since GPT-4.5, was seen as incremental rather than revolutionary. Leaks revealed internal models like Glacier and Heisenberg, indicating significant unreleased capabilities. OpenAI acknowledges a "capability overhang," where real model power exceeds what users experience, often due to infrastructure-driven throttling. DeepSeek launched V4 Preview, a cost-efficient model, but its full potential (V4 Pro Max) awaits the mass production of Huawei’s Ascend 950 super nodes in late 2026. Their strategy focuses on affordability and scalability, aiming to democratize AI access globally, a move noted even by NVIDIA’s CEO as a disruptive threat. Together, these actions reflect a broader trend: leading AI labs are deliberately pacing releases, hiding strengths, and aligning disclosures with competitive dynamics, each avoiding the risk of exposure in a forest where first movers become targets.

By Xiang Xianzhi

In "The Three-Body Problem," Liu Cixin wrote an image that has been cited countless times—the dark forest. Every civilization is a hunter with a gun; whoever exposes themselves first dies first. The forest is not empty—it's that everyone knows turning on a light will attract bullets, so everyone keeps their lights off.

In the spring of 2026, top AI labs entered such a dark forest.

On April 16, Anthropic was the first to release Claude Opus 4.7. On the same day, they made an unusual move—publicly admitting that Opus 4.7's performance was not as good as an unreleased model called Mythos, citing safety concerns.

On April 23, OpenAI posted GPT-5.5 on its official website. On the same day, Anthropic published an incident review report titled "An update on recent Claude Code quality reports" on its official blog, admitting that Claude Code had indeed become dumber over the past month—one releasing a new card, the other admitting a mistake. But this "new king" was almost showing off: we admit Claude has temporarily become dumber—but don’t forget, we still have a Mythos card up our sleeve.

On April 24, the "mysterious Eastern force" DeepSeek V4 Preview was launched, with Liang Wenfeng's team officially announcing the model's deep integration with Huawei's Ascend 950PR for the first time; but everyone understood—the truly "full-blooded" V4 Pro Max would only be released after the mass production of Ascend 950 super nodes in the second half of the year.

Three companies, three moves. On the surface, each is following its own product rhythm, but when pieced together, one thing becomes clear:

Each one holds at least one "gun"—a model stronger than the public version, a next-generation architecture not yet released to the public, or a super node of chips not yet widely deployed. But none dare to raise this gun first.

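Because in this industry, the cost of "showing first" is far more than just leaking secrets. Showing first means handing your capability ceiling to competitors as a reference; it means being the first to take the full fire of safety scrutiny, tightening regulation, and public pressure; it means turning yourself into the moving target that every competitor will aim at in the next round. There is no heroism in the forest: everyone who fires first becomes the next target.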

So the most rational choice for hunters is to turn off the lights, hold their breath, and keep their weapons hidden behind their backs.

This is the optimal solution in game theory.

Anthropic's Fearlessness

On Claude's side, the past month has almost been the worst version release ever.

Having moved to Opus 4.7 early, Anthropic still dominated various charts, and it still had Mythos, which is provided only to enterprise customers. Its attitude seemed unhurried.

But the Opus 4.7 cycle was almost the worst user experience for Claude, with a flood of negative reviews.

In early March, Anthropic changed Claude Code's default reasoning depth from high to medium. The intention behind this decision was understandable: in high mode, the UI often appeared frozen, with responses so slow that paying users were frustrated. The problem was, they didn’t announce it at the time.

At the end of March, they launched an "efficiency optimization"—if a Claude Code session was idle for more than an hour, the system would clear old reasoning blocks. In design, this was to save computing power. In practice, after each round of conversation, Claude seemed to have amnesia, forgetting the context completely. The developer community was flooded with complaints during those weeks: "Claude no longer remembers what I asked it to do in the previous round."

Recently, a third thing happened—adding an instruction to compress verbosity in the system prompt. By Anthropic's own later admission, this instruction directly reduced Claude Code's coding quality by 3%.

Stacked together, these three things led an AMD senior director to write on GitHub: "Claude has regressed to the point it cannot be trusted to perform complex engineering." Axios' April 16 article, "Anthropic's AI downgrade stings power users," brought the issue into the mainstream spotlight.

Then Anthropic admitted that there were indeed some issues.

On April 7, they quietly rolled back the reasoning effort adjustment; on April 10, they fixed the cache bug; on April 20, they removed the system prompt compressing verbosity. But the real incident review report wasn’t released until April 23—coinciding with the day GPT-5.5 was publicly released.

This sense of slight contempt—"oh, there was a bug in our engineering strategy, it’s fixed now"—came just before and after OpenAI's heavyweight release. It’s hard to believe this was a coincidence.

What’s more intriguing is that when Opus 4.7 was released, Anthropic made an unusual move: publicly admitting that Opus 4.7's performance fell short of an unreleased model, Mythos. This was clearly a "strategic retreat": Anthropic kept its strongest capabilities on the enterprise side and was in no hurry to release them to the public, explaining that the team wasn’t ready to release Mythos.

This explanation is believable. But from a business narrative perspective, the other half is equally true: Anthropic waited six weeks to publicly admit that Claude Code was regressing, and only brought up the issue on the day OpenAI was about to play a new card. If not for sufficient competitive pressure, if Opus 4.7 hadn’t proven "we still have a backup," this statement might never have come.

On Claude's side, squeezing toothpaste doesn’t mean deliberately crippling capabilities; it means: the pace of capability release and the timing of issue disclosure are both aligned with competitors' rhythms.

Releasing your most advanced capabilities will inevitably make you a target. Or, from Anthropic's perspective, the pressure from 4.6 on competitors hasn’t faded—so there’s no need to play the stronger card now.

OpenAI's Old Tricks

If Anthropic is "hiding a Mythos and not releasing it," then OpenAI's toothpaste-squeezing is more subtle—it leaves the power of capability release in the load curve of its own servers and a tiering mechanism called auto-router.

On April 23, the same day GPT-5.5 was released, Simon Willison (co-creator of the Django framework, well-known independent AI evaluator) wrote a cautious sentence in his blog: "It's not a dramatic departure from what we've had before."

He added a key piece of information: GPT-5.5 is the first completely retrained base model since GPT-4.5; that is, the past half-year's releases of 5.1, 5.2, 5.3, and 5.4 were all just incremental updates. In other words, OpenAI released the past four minor updates with restrained effort—because they didn’t know what competitors would release.

"Releasing with restrained effort" has a more understandable name: squeezing toothpaste.

But a more memorable scene occurred hours after GPT-5.5 went live. Codex users filed Issue #19241 on GitHub, complaining that Fast mode was initially very fast but became visibly slower as more users were let in, while billing was still at the Fast tier. The wording was familiar: "Please investigate whether GPT-5.5 Fast mode is downgraded under high load."

This was almost an exact replay of the scene on August 7, 2025, the day GPT-5 was first released—back then, Reddit r/ChatGPT pushed "GPT-5 is horrible" to 4600+ upvotes, and Sam Altman personally admitted in an AMA the next day that "the autoswitcher broke... GPT-5 seemed way dumber"—admitting that the router had downgraded users behind the scenes.

The same script played out again eight months later.

More ironically, the day before GPT-5.5's official release, OpenAI's Codex mistakenly pushed the internal staging environment to production, captured by several Pro users, fixed within minutes, but the content had already spread. What appeared in the selector at that time, besides GPT-5.5 itself, was a series called Glacier (tooltip reading "Intelligence that moves continents"), a life sciences model called Heisenberg, an unknown-purpose model called Arcanine, and multiple versions with codenames like oai-2.1.

That is, at the same time OpenAI released GPT-5.5 as the "next generation," internally there were at least 5 to 6 parallel product lines running, none of which had reached the public yet.

OpenAI itself admitted it. In its official 2026 roadmap, they used a term long discussed in academic circles—capability overhang—admitting that there is a huge gap between the true capabilities of current large models and the effects users can actually achieve.

Sound familiar? It’s almost the same wording Anthropic used for Mythos. Even if the Codex leak on April 22 was truly a mistake, OpenAI actively putting the term "capability overhang" into its roadmap sends a clear message—we have much more in hand, you deal with it.

You can only squeeze if you have far more than what you sell to users. The 24 hours of GPT-5.5 turned this premise into a live broadcast once again.

DeepSeek's Patient Wait

On DeepSeek's side, the way of "squeezing" has completely changed—it is not hiding capabilities but waiting for a more suitable delivery time.

1.6T MoE, 1M context, Pro/Flash dual specifications, priced at 3.48 per 1M tokens—dozens of times cheaper than GPT-5.5, an order of magnitude difference from Opus 4.7. Overseas independent evaluators concluded with two sentences: performance is close to but slightly lower than GPT-5.4 / Gemini 3.1-Pro, price "shatters the economics of frontier labs."

But in DeepSeek's own coordinate system, V4 Preview is already significantly more expensive than V3's "bizarrely cheap" price. Everyone knows—this is not the full-blooded version.

The complete story of DeepSeek V4 does not end with its release, nor does it start with it.

It starts with R2, the release that never arrived in 2025. R2 was originally scheduled for May 2025 but was eventually postponed to autumn/winter. DeepSeek's entire infrastructure in China migrated to Huawei's CANN ecosystem. For any lab, this is not an engineering feat that can be completed in a quarter: compiler, operators, communication libraries, inference framework, and MoE routing all had to be rewritten.

And with V4, DeepSeek has for the first time officially written Ascend into its training hardware list. V4 is the first version produced by mixed training, and it marks Ascend's debut.

But the next-generation chip Ascend 950DT, optimized for large-scale training, is scheduled for mass production in Q4 2026 according to Huawei's roadmap. That is, V4 training was made to run by cobbling together the previous-generation 950PR; for the full-blooded version, V4 Pro Max, a 1.6T MoE model, to be both fully trainable and deployable at scale, we must wait for the next generation to arrive.

The real engineering challenge is not "whether V4 can be trained"—it has been trained—but "how to make V4 run fully, stably, and cheaply on Ascend."

Ascend 950PR entered mass production in Q1 2026, with FP4 computing power of 1.56 PFLOPS and 112GB of on-chip memory; on paper, its specifications are benchmarked against NVIDIA's H20 and exceed it. But running on a single chip and having a whole super node stably serve inference requests at millions of tokens per second are two different things. The full-blooded version of V4 Pro Max is locked to this "super node," the large-scale cluster version of the Ascend 950 series, which will be available in the second half of 2026.

This constitutes a strategy completely different from the previous two. Anthropic and OpenAI's logic for squeezing toothpaste is: I have something stronger, I won’t give it to you yet; DeepSeek's logic for squeezing toothpaste is: my full-blooded version must wait for a moment when the price can drop another notch.

This difference is important.

DeepSeek's real killer feature has never been "the most cutting-edge performance," but "with adequate performance, cutting the token price to a level others dare not match." V4 Preview has been adapted to run on NVIDIA cards and Ascend 950PR, but full-blooded inference at production scale must wait for the super nodes to arrive. Once that moment comes, two things will happen simultaneously: first, V4 Pro Max's capabilities can be released to the maximum; second, inference costs and API pricing will drop another level. For a company that relies on price to break into the market, the latter is more lethal than the former.

What people truly expected, the "DeepSeek moment" that happened in early 2025, did not happen again in this release. And the release of V4 Preview is actually a trailer; the real highlight is the "DeepSeek + Huawei Ascend" moment in the second half of the year.

From this perspective, what Liang Wenfeng's team is doing now is not forced "hiding" but a commercially restrained "choice": handing the premiere of the strongest version to the scenario where it has the most say, the first day after the large-scale deployment of domestic super nodes. Before that, V4 Preview is used to consolidate the cost-effectiveness narrative for another round.

What DeepSeek carries has never been the "longboard narrative" of making domestic large models rank first on some chart, but the "systemic narrative" of simultaneously making chips, training, inference, and pricing work together—the latter is far more important than the former.

Just a few days ago, Jensen Huang said on Dwarkesh Patel's podcast that if DeepSeek premieres on Huawei chips, "that's a horrible outcome for our nation."

NVIDIA still controls the top computing power for now. But according to the "AI five-layer cake" that Jensen Huang himself proposed—energy, chips, infrastructure, models, applications—the domestic large model industry already has workable domestic options at every layer, and the gap is narrowing at a visible speed. With the final piece of the chip puzzle in place, DeepSeek's open-source large model story becomes a bigger story than American large models: this is an important step towards global intelligence parity without excessive cost consumption.

It would allow the world to bypass certain advanced computing powers controlled by hegemony and enter an efficient intelligent society.

Epilogue

Anthropic's "hiding"—is active. They have Mythos, didn’t release it, citing safety.

OpenAI's "hiding"—is structural. They have Pro tiers, don’t always give them to you, citing infrastructure and price tiers.

DeepSeek's "hiding"—is necessary. It concerns a whole narrative template for a society-wide leap in intelligence.

But from another perspective, this is exactly like the dark forest depicted by Liu Cixin: in this dark forest of intelligence, no top hunter will fire the first shot.

Exposure means holding nothing in reserve, having no hidden cards, and becoming a live target for another hunter.

No one knows who will fire the most lethal shot first. But one thing is certain: every model you use today is not its true form.

Related Questions

Q: What is the 'Dark Forest' metaphor used to describe in the context of top AI labs in 2026?

A: The 'Dark Forest' metaphor, from Liu Cixin's 'The Three-Body Problem', describes a state where each top AI lab is like an armed hunter. Exposing one's full capabilities first makes a lab a target for competitors, leading to a strategic equilibrium where everyone hides their strongest models and advancements to avoid becoming a moving target for scrutiny, regulation, and competitive pressure.

Q: What was the strategic reason Anthropic gave for not releasing its most powerful model, Mythos, to the public?

A: Anthropic cited 'safety concerns' as the official reason for not releasing its most powerful model, Mythos, to the public. Strategically, this also allowed them to retain their strongest capability as a competitive advantage, avoiding the pressure of being the first to set a new benchmark that others would aim to surpass.

Q: How did OpenAI demonstrate the concept of 'capability overhang' with the release of GPT-5.5?

A: OpenAI demonstrated 'capability overhang' by admitting that the GPT-5.5 release was their first fully retrained base model since GPT-4.5, revealing that the previous minor versions (5.1 to 5.4) were only incremental updates. Furthermore, a leak showed they had multiple other advanced, unreleased models in development (e.g., Glacier, Heisenberg), suggesting they possess far more advanced capabilities than what is currently available to users.

Q: Why is DeepSeek's V4 Preview not considered its 'full-blooded' version, and what are they waiting for?

A: DeepSeek's V4 Preview is not the 'full-blooded' version because its training and current operation rely on the previous-generation Ascend 950PR chips. The company is waiting for the mass production and deployment of the next-generation Ascend 950DT super nodes in the second half of the year. This will allow the 'V4 Pro Max' version to run at full capacity and enable a further drastic reduction in inference costs, which is core to DeepSeek's market strategy.

Q: What common strategic behavior did all three AI labs (Anthropic, OpenAI, and DeepSeek) exhibit, according to the article?

A: All three labs exhibited the strategic behavior of 'withholding' or 'squeezing' their full capabilities. They each possess more advanced technology, be it a stronger model, a next-gen architecture, or more efficient hardware, than what they have released to the public. None are willing to be the first to fully reveal their hand, as it would make them a target for competitors and regulators in the 'Dark Forest' of AI competition.
