Stratechery Overturns the AI Bubble Theory: What Should We Use AI For?

marsbit | Published on 2026-03-17 | Last updated on 2026-03-17

Abstract

Stratechery's Ben Thompson revises his stance on the AI bubble debate, arguing that current AI investment reflects structural growth driven by fundamental technological shifts, not speculation. He identifies three key transitions in LLM development: ChatGPT’s debut (making AI broadly usable but unreliable), OpenAI’s o1 (introducing reasoning and reliability), and the recent emergence of true Agent systems (e.g., Anthropic’s Opus 4.5 and OpenAI’s GPT-5.2-Codex). The critical innovation is the "agent harness"—a control layer that autonomously schedules models, uses tools, and verifies outcomes, reducing human intervention. This transforms AI from an assistive tool into an actionable infrastructure, enabling complex task execution. Agentic systems drive surging demand for compute, as they require repeated model calls and higher usage frequency. Thompson emphasizes that AI demand now depends not on user numbers, but on per-user agent utilization. Enterprises are adopting AI not only for efficiency gains but for structural workforce reduction, as agents amplify the impact of key employees while lowering coordination costs. He concludes that massive capital expenditures are justified by real demand, and profits will flow to integrated model-harness providers rather than commoditized standalone models.

Editor's Note: Against the backdrop of the ongoing surge in AI investment and industry narratives, the question of "whether there is a bubble" has become a core issue repeatedly discussed in the market. On one hand, extreme risk narratives continue to amplify concerns about technological loss of control; on the other hand, rapidly expanding capital expenditures and valuation levels also keep the "bubble theory" persistently alive. Amid this divergence, market judgment shows significant uncertainty.

The author of this article, Ben Thompson, is the founder of the technology analysis platform Stratechery and has long focused on the evolution of technology industry structures and business models. On the occasion of Nvidia's GTC 2026, he revised his previous judgment on "whether AI is in a bubble": no longer viewing the current situation as a bubble, but rather understanding it as a phase of structural growth driven by changes in the technological paradigm.

This judgment is based on observations of three key leaps in LLMs. Since ChatGPT first demonstrated the capabilities of large language models to the market in 2022, LLMs have evolved from "usable but unreliable" to "possessing reasoning capabilities," and then to "being able to independently execute tasks." Especially by the end of 2025, with the release of Anthropic's Opus 4.5 and OpenAI's GPT-5.2-Codex, agentic workloads began to move from concept to reality.

The key lies not in the models themselves, but in the emergence of the "agent harness." The harness decouples users from models: it schedules models, calls tools, and verifies results, transforming AI from a tool requiring continuous human intervention into an execution system that can be entrusted with tasks. This change not only improves reliability but also expands the application boundaries of AI.

Based on this paradigm shift, the author further points out that the expansion of AI demand no longer depends on the number of users, but more on the scheduling capacity per user; meanwhile, agentic workloads exhibit a "winner-takes-all" characteristic, which will continue to drive up demand for high-performance computing power and bring structural opportunities to chip manufacturers and cloud service providers.

Under this framework, the current large-scale capital expenditures are no longer just speculative bets on the future, but more likely an early reflection of real demand. As AI moves from being an "assistive tool" to "execution infrastructure," its economic impact may only just be beginning to show.

The following is the original text:

In the past, I leaned more towards the latter view, that AI is a bubble, even believing that a bubble might not be a bad thing at certain stages.

But now, standing in March 2026, at the opening of Nvidia's GTC, my judgment has changed: this may not be a bubble. (And ironically, this judgment itself might precisely be a signal of a bubble.)

Three Paradigm Shifts in LLMs

Over the past few weeks, while discussing Nvidia and Oracle's earnings reports, I have repeatedly mentioned that LLMs have undergone three key shifts.

Phase One: ChatGPT

The first inflection point was the release of ChatGPT in November 2022, which hardly needs elaboration. Although large language models based on the Transformer architecture have existed since 2017, with capabilities continuously improving, they were long underestimated. Even in October 2022, during an interview on Stratechery, I believed that while the technology was astonishing, it lacked productization and entrepreneurial momentum.

But a few weeks later, everything completely reversed. ChatGPT made the world truly aware of LLM capabilities for the first time.

However, the early versions also left two strong impressions, often cited by "bubble theorists":

First, the models often made mistakes, even "hallucinating" and fabricating answers when they didn't know. This made it more of a "showy tool"—impressive but unreliable.

Second, even so, it was still very useful, but only if you knew how to use it and constantly verified outputs and corrected errors.

Phase Two: o1

The second inflection point was the release of the o1 model by OpenAI in September 2024. By then, LLMs had significantly improved due to stronger base models and post-training techniques, with more accurate outputs and fewer hallucinations.

But the key breakthrough of o1 was: it would "think" before answering.

Traditional LLMs are path-dependent; once they go wrong in the reasoning process, they continue down the wrong path. This is a fundamental weakness of "autoregressive models." Reasoning models, however, self-evaluate answers; they generate an answer first, then judge if it's correct, and try other paths if necessary.

This means the model begins to actively manage errors, reducing the burden of user intervention. The results were also significant. If ChatGPT's breakthrough was about "making LLMs usable," then o1's breakthrough was about "making LLMs reliable."

Phase Three: Agent (Opus 4.5 / Codex)

At the end of 2025, the third shift occurred.

In November 2025, Anthropic released Opus 4.5, which initially received little attention. But by December, Claude Code, equipped with this model, suddenly demonstrated unprecedented capabilities; almost simultaneously, OpenAI released GPT-5.2-Codex, showing similar performance.

People had been talking about "Agents" before, but at this moment, they finally began to truly complete tasks, even complex ones requiring hours, and they did so correctly.

The key is not the model itself, but the control layer (harness)—the software layer that schedules models, calls tools, and executes processes. In other words, users no longer directly operate the model; instead, they set goals, and the Agent schedules the model, calls tools, executes processes, and verifies results.

Take programming as an example:

· Phase One: The model generates code

· Phase Two: The model reasons during generation

· Phase Three: The Agent generates code → automatically runs tests → retries if wrong, with no need for continuous user intervention.
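The Phase Three loop can be sketched in a few lines. This is a minimal illustration under stated assumptions, not how Claude Code or Codex is actually built: `run_harness`, `toy_model`, and `toy_tests` are hypothetical names, and the "model" is a stub that returns a buggy function first and a fixed one once it receives test feedback.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class HarnessResult:
    code: Optional[str]  # the candidate that passed verification, if any
    attempts: int        # number of model calls made
    succeeded: bool


def run_harness(
    goal: str,
    generate: Callable[[str, str], str],  # hypothetical model call: (goal, feedback) -> code
    verify: Callable[[str], str],         # hypothetical test runner: "" on success, else an error message
    max_attempts: int = 5,
) -> HarnessResult:
    """Generate code, run the tests, and retry with feedback until they pass."""
    feedback = ""
    for attempt in range(1, max_attempts + 1):
        candidate = generate(goal, feedback)
        feedback = verify(candidate)
        if feedback == "":
            return HarnessResult(candidate, attempt, True)
    return HarnessResult(None, max_attempts, False)


def toy_model(goal: str, feedback: str) -> str:
    """Stand-in for an LLM: wrong on the first try, corrected after feedback."""
    if not feedback:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


def toy_tests(code: str) -> str:
    """Stand-in for a test suite: executes the candidate and checks one case."""
    namespace: dict = {}
    exec(code, namespace)
    if namespace["add"](2, 3) != 5:
        return "add(2, 3) should equal 5"
    return ""


result = run_harness("write add(a, b)", toy_model, toy_tests)
print(result.attempts, result.succeeded)
```

The user supplies only the goal; the failed first attempt and the retry happen inside the loop, which is the sense in which the harness removes the need for continuous intervention.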

This means the core flaws of the ChatGPT era are being systematically resolved: higher accuracy, stronger reasoning capabilities, and automatic verification mechanisms.

The only remaining question is: What should we actually use it for?

The reason I emphasize these three inflection points repeatedly is to explain why the entire industry is severely short of computing power and why massive capital expenditures are justified.

The three paradigms have completely different demands on computing power:

· Phase One: Training consumes computing power, but inference costs are low

· Phase Two: Inference costs surge (more tokens + higher usage frequency)

· Phase Three (Agent): Multiple calls to inference models, the Agent itself also consumes computing power (even leaning towards CPU), and usage frequency explodes further
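A rough back-of-the-envelope sketch shows how the three phases compound. Every number below is a made-up assumption chosen only to show the shape of the arithmetic (more calls per task, more tokens per call, more tasks per day); none comes from the article, and the sketch ignores the harness's own CPU consumption.

```python
# All figures are illustrative assumptions, not measurements.
PHASES = {
    "chat":      {"calls_per_task": 1,  "tokens_per_call": 500,   "tasks_per_day": 10},
    "reasoning": {"calls_per_task": 1,  "tokens_per_call": 5_000, "tasks_per_day": 20},
    "agent":     {"calls_per_task": 50, "tokens_per_call": 5_000, "tasks_per_day": 40},
}


def daily_tokens(phase: str) -> int:
    """Inference tokens one user consumes per day in the given phase."""
    p = PHASES[phase]
    return p["calls_per_task"] * p["tokens_per_call"] * p["tasks_per_day"]


baseline = daily_tokens("chat")
for name in PHASES:
    tokens = daily_tokens(name)
    print(f"{name:>9}: {tokens:>12,} tokens/day ({tokens // baseline}x chat)")
```

Under these assumed inputs the three factors multiply, so the agent phase lands orders of magnitude above the chat phase per user even though each individual factor grows only modestly.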

More important, though, is the third point: the change in the structure of demand is severely underestimated.

Currently, far more people use chatbots than use Agents, and many people aren't using AI to its full potential. The reason is that using AI requires "proactivity." LLMs are tools; they have no goals, no will, and can only be actively called upon.

But Agents change this; they reduce the requirement for human proactivity. In the future, one person could command multiple Agents simultaneously.

This means that even if only a few people possess "proactivity," it is enough to drive huge computing power demand and economic output.

AI still needs "people to drive it," but it no longer needs "many people."
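The claim that demand depends on per-user utilization rather than user count can be illustrated with a small calculation. The user counts and token figures are hypothetical assumptions, as is the function name `total_daily_tokens`; the point is only that a small group running many agents can outweigh a much larger group of chat users.

```python
def total_daily_tokens(users: int, workloads_per_user: int, tokens_per_workload: int) -> int:
    """Aggregate demand = users x concurrent workloads per user x tokens per workload."""
    return users * workloads_per_user * tokens_per_workload


# Hypothetical: one million chat users, each with a single light workload.
chat_demand = total_daily_tokens(1_000_000, 1, 5_000)

# Hypothetical: ten thousand "proactive" users, each commanding ten agents.
agent_demand = total_daily_tokens(10_000, 10, 2_000_000)

print(f"chat:  {chat_demand:,} tokens/day")
print(f"agent: {agent_demand:,} tokens/day")
print(f"ratio: {agent_demand / chat_demand:.0f}x")
```

With these assumed inputs, one percent as many users generate forty times the demand, which is the structural shift the passage describes.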

Consumer willingness to pay for AI is limited, which has become increasingly clear. Those truly willing to pay for productivity are enterprises.

What excites enterprises most is not just that AI improves efficiency, but that AI can replace human labor, and do so more efficiently.

The current reality is that in large companies, the people who truly drive the business forward are often a minority; yet the organizations are huge, bringing significant coordination costs. The role of Agents is to amplify the influence of "value-driving people" while reducing organizational friction.

The result is "fewer people → higher output → lower costs." This is also why future layoffs might not just be "cyclical adjustments" but structural changes.

Companies will rethink not only whether they "over-hired during the pandemic," but also whether, in the AI era, they simply don't need this many people to begin with.

Why This Isn't a Bubble

From this perspective, the logic of "not a bubble" becomes clearer:

1. The core flaws of LLMs are being continuously resolved by computing power and architecture

2. The threshold number of people driving demand is decreasing

3. The benefits brought by Agents are not just cost reduction, but also revenue increase

Therefore, it's not hard to understand why all cloud providers are reporting that computing power is in short supply and continue to raise capital expenditures significantly.

Agents and Value Chain Restructuring

Another key question is, if models eventually become commoditized, can OpenAI and Anthropic still make money?

Conventional wisdom says no, but Agents change this. The key is that the real value lies not in the model itself, but in the integration of "model + control system."

Profits tend to flow to the "integration layer," not to replaceable modules. Just like Apple, its hardware avoids commoditization because of deep integration with software. Similarly, Agents require deep synergy between the model and the harness, making OpenAI and Anthropic key integrators in the value chain, not replaceable links.

Microsoft's shift is a signal; it originally emphasized "model replaceability," but after launching a true Agent product, it had to abandon this stance.

This means models might not become completely commoditized, because Agents require integrated capabilities.

The Final Paradox

I must return to the paradox at the beginning.

I have always believed that as long as people are still worried about a bubble, it isn't one yet; a true bubble is when no one questions it anymore.

And now, my conclusion is: This is not a bubble.

But if "me saying this is not a bubble" itself proves it is a bubble, then so be it.



