AI inference costs have dropped by over 80% in 18 months, yet China's three major cloud providers announced price hikes in the same week. This will be a structural price game lasting at least two to three years. This article attempts to answer a more important question: when will it end?
Tomorrow (April 18), Alibaba Cloud and Baidu AI Cloud will officially begin their price adjustments. Three weeks later, Tencent Cloud will follow with a new round of increases of its own. Globally, OpenAI and Anthropic have cut API prices by over 80% in the past 18 months, and the emergence of DeepSeek-R1 has further convinced outsiders that inference costs are about to hit zero.
As a result, China's three major cloud providers announced price increases of 20% to 30% in the same week.
Figure | Timeline of global cloud computing price increase events in 2026
The media's initial reaction was "the price war is over, and the big players are starting to reap profits." This assessment is not wrong, but it stops at the most superficial interpretation. It explains why cloud providers are raising prices but does not answer the more critical question: Is this price hike a temporary correction or the starting point of a sustained trend? The answer lies in an economic paradox from 150 years ago.
01.
Jevons Paradox: The Cheaper It Gets, the More It Burns
In 1865, British economist William Jevons observed a counterintuitive phenomenon: after the efficiency of steam engines improved, the UK's total coal consumption rose dramatically. The drop in usage costs triggered an explosion in demand. This is the Jevons Paradox, and it has reappeared, almost to the letter, in the computing power market of 2026.
DeepSeek-R1 has indeed significantly reduced the cost per token for inference. But it has also opened a floodgate of demand: many enterprises that previously found "AI too expensive" have begun integrating AI into their business processes. Once integrated, token consumption expands at a nonlinear rate.
A more critical change is that AI applications have moved from "dialogue" to "doing things": Agents and Reasoning Models have entered the scene. A task that previously burned 1,000 tokens now burns 5,000 once a reasoning chain is attached, and Reasoning Models that "think" before answering consume 10 to 50 times more than standard models.
Figure | Before and after DeepSeek's release: Token unit price vs. total call volume trend
Note: 2025Q2 = Baseline 100 | Comprehensive estimate of inference APIs from major Chinese cloud providers
DeepSeek lowered the entry barrier, but in doing so it blew the lid off computing power demand. Each individual token is getting cheaper, yet each business task is getting more expensive. This is the real foundation on which this round of price increases rests.
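To make this concrete, here is a back-of-the-envelope sketch in Python. The unit prices are illustrative assumptions, not published figures from any provider; only the 1,000-vs-5,000-token ratio comes from the discussion above:

```python
# Per-token price falls, tokens per task rise: what happens to the bill?
# All prices below are illustrative assumptions, not real provider prices.

price_before = 0.008  # yuan per 1K tokens, pre-DeepSeek (assumed)
price_after = 0.002   # yuan per 1K tokens, post-DeepSeek: 75% cheaper (assumed)

tokens_chat = 1_000       # a plain chat-style task (per the article)
tokens_reasoning = 5_000  # the same task with a reasoning chain (per the article)

cost_before = tokens_chat / 1_000 * price_before
cost_after = tokens_reasoning / 1_000 * price_after

print(f"cost per task before: {cost_before:.4f} yuan")  # 0.0080
print(f"cost per task after:  {cost_after:.4f} yuan")   # 0.0100
```

Under these assumptions the per-token price drops 75%, yet the bill for a single business task rises 25%: the Jevons effect replayed at the level of one API call.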
02.
Weights Open-Sourced, Inference Stack Not Open-Sourced
Another detail largely overlooked in reports: DeepSeek open-sourced the model weights but not its inference optimization stack. The difference between the two is like being given the design blueprints for an engine but not being told how to tune it for F1 performance.
What truly determines inference cost is not just the model architecture but the engineering capabilities hidden beneath it: the hit rate of speculative decoding, memory scheduling strategies for the KV Cache, the separation and optimization of the Prefill and Decode phases, and the network topology of ten-thousand-GPU clusters. These hard skills remain the moat of a handful of leading cloud providers.
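Why the "hit rate of speculative decoding" matters so much can be seen from the standard expected-throughput result in the speculative decoding literature: with gamma draft tokens per step and a per-token acceptance rate alpha, each expensive target-model pass yields (1 - alpha^(gamma+1)) / (1 - alpha) tokens in expectation. A minimal sketch; the acceptance rates are assumed values for illustration:

```python
# Expected tokens produced per (expensive) target-model forward pass when a
# cheap draft model proposes `gamma` tokens, each accepted with probability
# `alpha`: the standard result from the speculative decoding literature.
# The alpha values below are assumptions for illustration.

def expected_tokens_per_pass(alpha: float, gamma: int) -> float:
    if alpha >= 1.0:
        return gamma + 1  # every draft token accepted
    return (1 - alpha ** (gamma + 1)) / (1 - alpha)

for alpha in (0.5, 0.7, 0.9):  # draft-model hit rates (assumed)
    tokens = expected_tokens_per_pass(alpha, gamma=4)
    print(f"acceptance {alpha:.0%}: ~{tokens:.2f} tokens per target pass")
# acceptance 50%: ~1.94 | 70%: ~2.77 | 90%: ~4.10
```

Moving the hit rate from 50% to 90% roughly doubles effective decode throughput, which is exactly the kind of tuning knowledge that open weights do not carry.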
Figure | Actual efficiency gap in inference under equivalent model scale
Using DeepSeek-R1 (671B) as the benchmark, comparison of tokens processed per second (TPS) under different deployment conditions | Values are comprehensive industry estimates
Running the same DeepSeek-R1, leading cloud providers achieve inference efficiency 3 to 5 times that of self-built enterprise deployments. With the same computing power investment, a cloud provider can therefore serve more concurrent requests, which means lower unit costs.
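The translation from throughput to unit cost is direct: cost per token is hardware cost per hour divided by tokens served per hour. A minimal sketch; the GPU hourly rate and TPS figures are assumptions chosen to match the 3-5x gap above:

```python
# Unit inference cost = hardware cost per hour / tokens served per hour.
# The hourly rate and throughput figures are illustrative assumptions.

GPU_HOUR_COST = 60.0  # yuan per GPU-hour (assumed)

def cost_per_million_tokens(tps: float) -> float:
    tokens_per_hour = tps * 3600
    return GPU_HOUR_COST / tokens_per_hour * 1_000_000

tps_cloud = 2_000      # tuned cloud inference stack (assumed)
tps_self_built = 500   # untuned self-built deployment, a 4x gap (assumed)

print(f"cloud:      {cost_per_million_tokens(tps_cloud):.2f} yuan / M tokens")       # ~8.33
print(f"self-built: {cost_per_million_tokens(tps_self_built):.2f} yuan / M tokens")  # ~33.33
```

The throughput gap shows up one-for-one as a unit-cost gap, which is what lets a cloud provider undercut a customer's self-built cost and still charge a premium.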
This efficiency gap is one source of the "premium" cloud providers charge. It is a tangible engineering barrier, so this price increase is, to some extent, also a way of pricing in that technical advantage.
03.
Battle of the Giants: The Ledgers and Ambitions of Four Major Players
In this wave of collective price adjustments, the stances of the four core giants vary, reflecting different commercial calculations.
Alibaba Cloud: Wu Yongming-style "Profit Quality" Defense. Alibaba's adjustment is the most resolute, with increases concentrated on high-end GPU instances and storage (CPFS). Against the backdrop of Alibaba's wholesale return to "efficiency first," Alibaba Cloud is no longer chasing the so-called "cloud market share first" but aiming for "AI computing power profit margin first." The subtext: Alibaba Cloud is building a "VIP computing power zone," and if you cannot absorb this 30% premium, you may simply not be on Alibaba's list of core target customers.
Baidu AI Cloud: "User Filter." As the earliest player to bet big on large models, Baidu is under pressure from an order-of-magnitude jump in ERNIE's inference costs as call volumes scale. Its price hike therefore reads more like a "user reshuffle": actively weeding out the small individual developers who chase free quota without ever closing a commercial loop, and turning instead to serve large, price-insensitive B-end customers. Baidu needs to prove through pricing that its AI growth no longer relies on subsidies but on "selling intelligence at a premium."
Tencent Cloud: "ROI Correction" After Ecosystem Lock-in. Tencent Cloud moved three weeks later than Alibaba, a typical follow-the-leader play. Its confidence lies in deep integration with WeChat and Enterprise WeChat: once enterprise workflows are embedded in Tencent's social and collaboration ecosystem, migration costs become extremely high. Tencent Cloud's increase is thus more of a "catch-up increase," correcting the ROI it sacrificed over the past two years to capture the ecosystem and making the AI business look more "respectable" in its financial reports.
Volcano Engine: Strategic "Asymmetric Following" and a Customer-Grab Plan. Volcano Engine (ByteDance) is the wild card in this wave of hikes. It too has adjusted some prices, but the increases on many core APIs are markedly lower than Alibaba's and Baidu's. ByteDance is using this window to intercept the existing customer base: backed by the enormous internal computing power absorption of Douyin and TikTok, Volcano holds an exceptionally strong cost-amortization card. While competitors "drive away customers" to protect margins, Volcano waits to catch whoever they shed, hoping to use the price gap for one last overtake on installed base.
04.
The Biggest Surprise: Large Enterprises Start "Leaving"
This price increase has triggered an unintended counterforce: it has substantially hardened large enterprises' resolve to "build their own computing power."
The cloud computing industry has a hidden rule: when the monthly bill exceeds a certain threshold, the financial model of "self-build vs. rent" flips. For banks, central state-owned enterprises, and large manufacturers, this threshold is roughly at a monthly cloud computing expenditure of 3 million to 5 million yuan.
In 2024, most large enterprises were below this threshold, making self-building uneconomical. In 2025, as AI projects rolled out, some enterprises began to touch the line. And this round of 20% to 30% price increases in 2026 has directly pushed a group of customers who were originally just on the line into the zone where they "must seriously consider self-building."
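A minimal sketch of that flip. The capex and opex figures are assumptions picked so that the break-even sits near the 3 million yuan/month threshold cited above; they are not the article's underlying data:

```python
# 36-month total cost of ownership (TCO): renting cloud vs. self-building.
# All cost parameters are illustrative assumptions.

MONTHS = 36
CAPEX = 80_000_000      # yuan, upfront cluster purchase (assumed)
OPEX_MONTHLY = 800_000  # yuan/month for power, ops staff, colocation (assumed)

def tco_rent(monthly_bill: float, hike: float = 0.0) -> float:
    return monthly_bill * (1 + hike) * MONTHS

def tco_build() -> float:
    return CAPEX + OPEX_MONTHLY * MONTHS

for bill in (2_000_000, 3_000_000, 5_000_000):  # monthly cloud spend in yuan
    rent_before = tco_rent(bill)
    rent_after = tco_rent(bill, hike=0.25)  # mid-range of the 20-30% hike
    print(f"{bill/1e6:.0f}M/month: rent {rent_before/1e6:.0f}M -> "
          f"{rent_after/1e6:.0f}M vs build {tco_build()/1e6:.0f}M")
# 2M/month: rent  72M ->  90M vs build 109M  (renting still wins)
# 3M/month: rent 108M -> 135M vs build 109M  (the hike flips the sign)
# 5M/month: rent 180M -> 225M vs build 109M  (self-build clearly wins)
```

Under these assumptions, a customer spending 3 million yuan a month was at break-even before the hike and is pushed decisively into self-build territory after it.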
Figure | Cloud rental vs. self-build: Total Cost of Ownership (TCO) break-even point calculation
Horizontal axis: average monthly computing power expenditure (10k yuan/month), Vertical axis: 36-month cumulative cost (million yuan) | Comparison before and after price increase
The beneficiaries of this self-building wave are not the cloud providers' competitors but more peripheral players: GPU rental platforms saw inquiry volumes triple year-on-year in March; Huawei Ascend's delivery schedule for large customers has been extended to 6 months; integrators specializing in helping enterprises build "private inference clusters" have suddenly become highly sought after.
Cloud providers set out to raise prices and harvest high-end customers, but inadvertently pushed away a group of large customers capable of building their own infrastructure. The risk in that decision may only be re-priced when earnings season arrives.
05.
Who Wins? The Truth About Benefit Distribution
The price increases by the three cloud providers are seen by the media as "big players harvesting." But from the perspective of the entire industry chain, the distribution of real winners is much more complex.
There is an ironic reality: the most hurt are the small and medium-sized AI startups full of innovative vitality. If they fall on a large scale due to costs, the cloud providers' own ecosystems will wither accordingly.
This is not without precedent. In the early 2010s, Amazon AWS's aggressive price increases on some services accelerated the shift of some developers to Google Cloud, indirectly helping GCP complete its early ecosystem accumulation. History does not repeat itself simply, but it rhymes.
06.
How Long Will the Era of Price Hikes Last?
Simply put, the essence of this round of price increases is a pressure release in China's AI computing power market amidst exploding demand and supply constraints. Squeezed from both sides, prices can only move upward. This is not entirely an active choice by cloud providers; in a sense, it is also a forced pricing correction.
Figure | China's high-performance AI computing power: demand growth rate vs. domestic supply capacity expansion rate. Index: 2023 = 100 | The continuously widening supply-demand gap is the underlying logic of this price increase.
None of the three structural factors supporting this round of price increases will substantially disappear within 12 months: the step-change in token consumption driven by the adoption of Reasoning Models, the accelerating large-scale deployment of AI Agents, and the supply constraints created by Nvidia export controls.
The B2B software market has a repeatedly verified rule: the Price Ratchet Effect. None of AWS's several price increases in the early 2010s were fully rolled back after supply improved, and Google Cloud storage pricing has stepped down only once since 2021, and even that came with tightened storage limits. Cloud providers understand this rule: this price increase is not just "harvesting during a window" but also locking in a new price baseline.
Figure | Computing power price index trend: three scenario predictions (2025Q2–2028Q2)
2025Q2 = Baseline 100 | Comprehensive inference API average price index estimate, including price increase effects
Therefore, "computing power going to zero" will not become reality before 2027. The real price inflection point depends on when the scheduling efficiency of domestic computing power can substantially catch up with Nvidia's H100. Judging from current engineering progress, that point most likely falls between 2027 and 2028.
And during this window, cloud providers have every reason to "raise prices first out of respect," because they know the window will not stay open forever.
07.
Conclusion: A Structural Game on the Supply Side
What this round of price increases reveals is not the grand narrative of "AI commercialization coming of age" but a more specific industrial reality: when an efficiency revolution and a demand explosion happen at the same time, prices may not fall; they may rise. The Jevons Paradox held in the coal era, and it holds equally in the computing power era.
For small and medium-sized AI application companies, rather than arguing over who is harvesting whom, it is better to sit down and calculate: in your own business scenarios, how many tokens are being burned to no effect?
Saving tokens is the hardest moat in this era.
This article is from the WeChat public account "EmphasizeNext" (ID: leo89203898), author: Wen Xin, editor: Xiao Bai