Editor's Note: Amid the ongoing surge in AI investment and industry narratives, the question of whether this is a bubble has become a core issue the market debates repeatedly. On one hand, extreme risk narratives keep amplifying concerns about technology spinning out of control; on the other, rapidly expanding capital expenditures and valuations keep the "bubble theory" persistently alive. Amid this divergence, market judgments remain highly uncertain.
The author of this article, Ben Thompson, is the founder of the technology analysis platform Stratechery and has long focused on the evolution of technology industry structures and business models. On the occasion of Nvidia's GTC 2026, he revised his previous judgment on "whether AI is in a bubble": no longer viewing the current situation as a bubble, but rather understanding it as a phase of structural growth driven by changes in the technological paradigm.
This judgment is based on observations of three key leaps in LLMs. Since ChatGPT first demonstrated the capabilities of large language models to the market in 2022, LLMs have evolved from "usable but unreliable" to "possessing reasoning capabilities," and then to "being able to independently execute tasks." Especially by the end of 2025, with the release of Anthropic's Opus 4.5 and OpenAI's GPT-5.2-Codex, agentic workloads began to move from concept to reality.
The key lies not in the models themselves but in the emergence of the agent harness. The harness decouples users from models: it schedules models, calls tools, and verifies results, transforming AI from a tool requiring continuous human intervention into an execution system that can be entrusted with tasks. This change not only improves reliability but also expands the application boundaries of AI.
Based on this paradigm shift, the author further points out that the expansion of AI demand no longer depends on the number of users so much as on how much compute each user can orchestrate; meanwhile, agentic workloads exhibit a winner-takes-all character, which will continue to drive up demand for high-performance computing and bring structural opportunities to chip manufacturers and cloud service providers.
Under this framework, the current large-scale capital expenditures are no longer just speculative bets on the future, but more likely a reflection of real demand in advance. As AI moves from being an "assistive tool" to "execution infrastructure," its economic impact may only just be beginning to show.
The following is the original text:
In the past, I leaned towards the view that this is a bubble, and even believed that at certain stages a bubble might not be a bad thing.
But now, standing in March 2026 at the opening of Nvidia's GTC, my judgment has changed: this may not be a bubble. (And ironically, that judgment might itself be a signal of a bubble.)
Three Paradigm Shifts in LLMs
Over the past few weeks, while discussing Nvidia and Oracle's earnings reports, I have repeatedly mentioned that LLMs have undergone three key shifts.
Phase One: ChatGPT
The first inflection point was the release of ChatGPT in November 2022, which hardly needs elaboration. Although Transformer-based language models had existed since 2017, with capabilities improving continuously, they were long underestimated. Even in October 2022, in an interview on Stratechery, I said that while the technology was astonishing, it lacked productization and entrepreneurial momentum.
But a few weeks later, everything completely reversed. ChatGPT made the world truly aware of LLM capabilities for the first time.
However, the early versions also left two strong impressions, often cited by "bubble theorists":
First, the models often made mistakes, even "hallucinating" and fabricating answers when they didn't know. This made ChatGPT more of a showpiece: impressive but unreliable.
Second, it was nonetheless very useful, provided you knew how to use it and constantly verified its outputs and corrected its errors.
Phase Two: o1
The second inflection point was the release of the o1 model by OpenAI in September 2024. By then, LLMs had significantly improved due to stronger base models and post-training techniques, with more accurate outputs and fewer hallucinations.
But the key breakthrough of o1 was: it would "think" before answering.
Traditional LLMs are path-dependent: each token is generated conditional on everything before it, so once the reasoning goes wrong, the model continues down the wrong path. This is a fundamental weakness of autoregressive models. Reasoning models, by contrast, self-evaluate their answers; they generate an answer first, judge whether it is correct, and try other paths if necessary.
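To make the mechanism concrete, here is a minimal sketch of that generate-and-verify loop. The two helper functions are hypothetical stand-ins for the model's sampling and self-critique steps, which in a real reasoning model happen inside the model itself:

```python
import random

# Hypothetical stand-ins for the model's sampling and self-critique steps;
# in a real reasoning model, both happen inside a single model.
def generate_answer(question: str) -> str:
    return f"candidate answer #{random.randint(1, 100)}"

def evaluate_answer(question: str, answer: str) -> bool:
    return random.random() > 0.5  # stand-in for the model's self-check

def solve_with_reasoning(question: str, max_attempts: int = 5) -> str:
    # Generate a candidate, self-evaluate, retry: the loop that lets a
    # reasoning model escape a bad path instead of committing to it.
    answer = ""
    for _ in range(max_attempts):
        answer = generate_answer(question)
        if evaluate_answer(question, answer):
            break  # the model judges this answer correct; stop searching
    return answer

print(solve_with_reasoning("What is 17 * 24?"))
```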
This means the model begins to actively manage its own errors, reducing the burden of user intervention. The results were significant: if ChatGPT's breakthrough was making LLMs usable, then o1's breakthrough was making LLMs reliable.
Phase Three: Agent (Opus 4.5 / Codex)
At the end of 2025, the third shift occurred.
In November 2025, Anthropic released Opus 4.5, which initially received little attention. But by December, Claude Code, equipped with this model, suddenly demonstrated unprecedented capabilities; almost simultaneously, OpenAI released GPT-5.2-Codex, showing similar performance.
People had been talking about "Agents" for a while, but only at this moment did they begin to actually complete tasks, even complex ones requiring hours of work, and complete them correctly.
The key is not the model itself but the harness: the control layer that sits between user and model. In other words, users no longer operate the model directly; they set goals, and the Agent schedules the model, calls tools, executes processes, and verifies results.
Take programming as an example:
· Phase One: The model generates code
· Phase Two: The model reasons during generation
· Phase Three: The Agent generates code → runs tests → retries on failure, with no need for continuous user intervention (a minimal sketch of this loop follows below)
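Here is what that phase-three loop looks like from the harness's side, as a minimal runnable sketch. The model call is a hard-coded placeholder and the test is a single assertion; both are illustrative assumptions, not any vendor's actual API:

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Illustrative stand-in for an LLM API call; a real harness would send
# the task plus any test feedback to a model and get source code back.
def call_model(task: str, feedback: str) -> str:
    return "def add(a, b):\n    return a + b\n"

def run_tests(source: str) -> tuple[bool, str]:
    # The harness's verification step: execute the generated code
    # against a known test and capture any failure output.
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "solution.py"
        path.write_text(source + "\nassert add(2, 3) == 5\n")
        proc = subprocess.run([sys.executable, str(path)],
                              capture_output=True, text=True)
        return proc.returncode == 0, proc.stderr

def agent_loop(task: str, max_iterations: int = 10) -> str:
    # The phase-three pattern: the harness, not the user, closes the
    # generate -> test -> retry loop.
    feedback = ""
    for _ in range(max_iterations):
        source = call_model(task, feedback)  # model generates code
        ok, feedback = run_tests(source)     # harness verifies it
        if ok:
            return source                    # done, with no human in the loop
    raise RuntimeError("agent could not produce passing code")

print(agent_loop("write an add(a, b) function"))
```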
This means the core flaws of the ChatGPT era are being systematically resolved: higher accuracy, stronger reasoning capabilities, and automatic verification mechanisms.
The only remaining question is: What should we actually use it for?
The reason I emphasize these three inflection points repeatedly is to explain why the entire industry is severely short of computing power and why massive capital expenditures are justified.
The three paradigms have completely different demands on computing power:
· Phase One: Training consumes computing power, but inference costs are low
· Phase Two: Inference costs surge (more tokens + higher usage frequency)
· Phase Three (Agent): Inference models are called many times per task, the harness itself also consumes compute (much of it CPU-bound), and usage frequency explodes further (see the back-of-envelope comparison below)
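The multiplier is easier to feel with numbers. Everything below is an illustrative assumption chosen to show the shape of the shift, not a measurement of any real workload:

```python
# Illustrative assumptions only; none of these are measured figures.
CHATBOT_TOKENS_PER_TASK   = 1_000    # one prompt, one reply
REASONING_TOKENS_PER_TASK = 10_000   # hidden chain-of-thought inflates output
AGENT_CALLS_PER_TASK      = 50       # generate/test/retry iterations per task
AGENT_TOKENS_PER_CALL     = 10_000   # each call is itself a reasoning call

agent_tokens = AGENT_CALLS_PER_TASK * AGENT_TOKENS_PER_CALL
print(f"reasoning vs. chatbot: {REASONING_TOKENS_PER_TASK // CHATBOT_TOKENS_PER_TASK}x")
print(f"agent vs. chatbot:     {agent_tokens // CHATBOT_TOKENS_PER_TASK}x")
# Under these assumptions a single agentic task consumes 500x the
# inference compute of a chatbot exchange, before counting that one
# user can run several Agents in parallel.
```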
More important still is the third point: the change in the structure of demand is severely underestimated.
Currently, far more people use chatbots than use Agents, and many people aren't using AI to its full potential. The reason is that using AI requires "proactivity." LLMs are tools; they have no goals, no will, and can only be actively called upon.
But Agents change this; they reduce the requirement for human proactivity. In the future, one person could command multiple Agents simultaneously.
This means that even if only a few people possess "proactivity," it is enough to drive huge computing power demand and economic output.
AI still needs "people to drive it," but it no longer needs "many people."
It has become increasingly clear that consumers' willingness to pay for AI is limited. The ones truly willing to pay for productivity are enterprises.
What excites enterprises most is not just that AI improves efficiency, but that AI can replace human labor, and do so more efficiently.
The current reality is that in large companies, the people who truly drive the business forward are often a minority, yet the organizations themselves are bloated, bringing significant coordination costs. The role of Agents is to amplify the influence of the people who drive value while reducing organizational friction.
The result is "fewer people → higher output → lower costs." This is also why future layoffs might not just be "cyclical adjustments" but structural changes.
Companies will rethink not only whether they "over-hired during the pandemic," but also whether, in the AI era, they simply don't need this many people to begin with.
Why This Isn't a Bubble
From this perspective, the logic of "not a bubble" becomes clearer:
1. The core flaws of LLMs are being continuously resolved by computing power and architecture
2. The number of people required to drive demand is falling
3. The benefits brought by Agents are not just cost reduction, but also revenue increase
Therefore, it's not hard to understand why every cloud provider is reporting computing power shortages and continuing to raise capital expenditures sharply.
Agents and Value Chain Restructuring
Another key question: if models eventually become commoditized, can OpenAI and Anthropic still make money?
Conventional wisdom says no, but Agents change this. The key is that the real value lies not in the model itself, but in the integration of "model + control system."
Profits tend to flow to the integration layer, not to replaceable modules. Apple's hardware avoids commoditization because of its deep integration with software; similarly, Agents require deep synergy between the model and the harness, making OpenAI and Anthropic key integrators in the value chain rather than replaceable links.
Microsoft's shift is a signal; it originally emphasized "model replaceability," but after launching a true Agent product, it had to abandon this stance.
This means models might not become completely commoditized, because Agents require integrated capabilities.
The Final Paradox
I must return to the paradox at the beginning.
I have always believed that as long as people are still worried about a bubble, it isn't one yet; a true bubble is when no one questions it anymore.
And now, my conclusion is: This is not a bubble.
But if "me saying this is not a bubble" itself proves it is a bubble, then so be it.