The First Year of Computing Power Inflation: The Cheaper DeepSeek Gets, the Harder It Is to Stop This Round of Price Hikes

marsbit | Published on 2026-04-17 | Last updated on 2026-04-17

Abstract

The year 2026 marks the beginning of "computing power inflation." While AI inference costs have dropped by over 80% in 18 months globally, China's three major cloud providers (Alibaba Cloud, Baidu AI Cloud, and Tencent Cloud) simultaneously announced price hikes of 20–30%. This reflects a deeper structural shift driven by the Jevons Paradox: as unit costs fall (e.g., via models like DeepSeek-R1), demand explodes, especially with the rise of reasoning models and AI agents that consume 10–50x more tokens per task. Although DeepSeek open-sourced its model weights, it did not release its inference optimization stack, leaving a significant engineering efficiency gap between cloud providers and smaller players. The big three are leveraging this advantage to reposition: Alibaba focuses on high-margin premium clients, Baidu filters out low-value users, and Tencent capitalizes on ecosystem lock-in. Meanwhile, ByteDance's Volcano Engine adopts a more moderate pricing strategy to capture displaced customers. Unexpectedly, the price surge is pushing large enterprises toward self-built computing solutions once their cloud bills exceed a certain threshold. While cloud providers aim to boost profitability, they risk driving away innovative startups and accelerating competition from GPU leasing and domestic hardware providers like Huawei. The price-hike trend is expected to persist for 2–3 years, fueled by rising token consumption from reasoning models, AI agent adoption, and NVIDIA export restrictions.

AI inference costs have dropped by over 80% in 18 months, yet China's three major cloud providers announced price hikes in the same week. This will be a structural price game lasting at least two to three years. This article attempts to answer a more important question: when will it end?

Tomorrow (April 18), Alibaba Cloud and Baidu AI Cloud will officially begin price adjustments. Three weeks later, Tencent Cloud will also usher in a new round of price increases. Globally, OpenAI and Anthropic have reduced API prices by over 80% in the past 18 months, and the emergence of DeepSeek-R1 has further led the outside world to believe that inference costs are about to hit zero.

Yet China's three major cloud providers announced price increases of 20% to 30% in the same week.

Figure | Timeline of global cloud computing price increase events in 2026

The media's initial reaction was "the price war is over, and the big players are starting to reap profits." This assessment is not wrong, but it stops at the most superficial interpretation. It explains why cloud providers are raising prices but does not answer the more critical question: Is this price hike a temporary correction or the starting point of a sustained trend? The answer lies in an economic paradox from more than 150 years ago.

01.

Jevons Paradox: The Cheaper It Gets, the More It Burns

In 1865, British economist William Stanley Jevons observed a counterintuitive phenomenon: after the efficiency of steam engines improved, total coal consumption in the UK increased dramatically. Lower usage costs triggered an explosion in demand. This is the Jevons Paradox, and it has reappeared almost exactly in the computing power market of 2026.

DeepSeek-R1 has indeed significantly reduced the cost per token for inference. But it has also opened a floodgate of demand: many enterprises that previously found "AI too expensive" have begun integrating AI into their business processes. Once integrated, token consumption expands at a nonlinear rate.

A more critical change is that AI applications have moved from "dialogue" to "doing things": Agents and Reasoning Models have entered the scene. A task that previously burned 1,000 tokens now burns 5,000 once a reasoning chain is added, and because Reasoning Models "think" on their own, they can consume 10 to 50 times more tokens than standard models.

Figure | Before and after DeepSeek's release: Token unit price vs. total call volume trend

Note: 2025Q2 = Baseline 100 | Comprehensive estimate of inference APIs from major Chinese cloud providers

DeepSeek lowered the entry barrier but broke through the computing power ceiling. Each unit token is becoming cheaper, but each business task is becoming more expensive. This is the real foundation on which this round of price increases is built.
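The arithmetic behind "cheaper tokens, more expensive tasks" can be sketched in a few lines. The 80% price drop and the 10x to 50x token multiplier are the article's figures; the baseline price and token count are hypothetical placeholders chosen only to make the ratio visible.

```python
def task_cost(tokens_per_task: float, price_per_million: float) -> float:
    """Cost of one task, in the same currency unit as the price."""
    return tokens_per_task / 1_000_000 * price_per_million


# Hypothetical baseline: 1,000 tokens per task at a price of 10 per million tokens.
BASE_TOKENS = 1_000
BASE_PRICE = 10.0

# Figures from the article: unit price down 80%, reasoning tasks use 10x to 50x more tokens.
new_price = BASE_PRICE * (1 - 0.80)

baseline = task_cost(BASE_TOKENS, BASE_PRICE)
for multiplier in (10, 50):
    cost = task_cost(BASE_TOKENS * multiplier, new_price)
    print(f"{multiplier}x tokens: task cost is {cost / baseline:.1f}x the baseline")
```

Under these assumptions, a token that is 80% cheaper still leaves each task 2x to 10x more expensive than before.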

02.

Weights Open-Sourced, Inference Stack Not Open-Sourced

Another detail largely overlooked in reports: DeepSeek open-sourced the model weights but not its inference optimization stack. The difference between the two is like being given the design blueprints for an engine but not being told how to tune it for F1 performance.

What truly determines inference cost is not just the model architecture but the engineering capabilities hidden beneath the surface: the hit rate of speculative decoding, memory scheduling strategies for the KV Cache, separate optimization of the Prefill and Decode phases, and the network topology of 10,000-GPU clusters. These hard skills remain the moat of a few leading cloud providers.
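To see why the speculative decoding hit rate matters so much, here is a minimal sketch based on the simplified model commonly used in the speculative decoding literature. It does not describe DeepSeek's or any cloud provider's actual stack. If a draft model proposes k tokens and each one is accepted independently with probability alpha, the expected number of tokens produced per pass of the large model is (1 - alpha^(k+1)) / (1 - alpha). The function name and the sample values are illustrative assumptions.

```python
def expected_tokens_per_pass(alpha: float, k: int) -> float:
    """Expected tokens emitted per large-model pass under an i.i.d. acceptance model.

    alpha: probability that a drafted token is accepted (the "hit rate").
    k: number of tokens the draft model proposes per pass.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be in [0, 1]")
    if alpha == 1.0:
        return float(k + 1)  # every draft accepted, plus one token from the large model
    return (1 - alpha ** (k + 1)) / (1 - alpha)


for alpha in (0.5, 0.7, 0.9):
    tokens = expected_tokens_per_pass(alpha, k=4)
    print(f"hit rate {alpha:.0%}: {tokens:.2f} tokens per pass")
```

In this toy model with k = 4, raising the hit rate from 50% to 90% roughly doubles the tokens produced per large-model pass (about 1.9 versus about 4.1), which is the kind of gain that never shows up in the open-sourced weights.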

Figure | Actual efficiency gap in inference under equivalent model scale

Using DeepSeek-R1-67B as a benchmark, comparison of tokens processed per second (TPS) under different deployment conditions | Values are comprehensive industry estimates

Running the same DeepSeek-R1, the inference efficiency of leading cloud providers can be 3 to 5 times higher than that of self-built enterprise deployments. This means that with the same computing power investment, cloud providers can serve more concurrency, resulting in lower unit costs.

This efficiency gap is one source of the "premium" charged by cloud providers. It is a tangible engineering barrier. Therefore, this price increase is, to some extent, also about pricing their technical advantage.
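The link between throughput and unit cost can be made concrete with a short sketch. The 3x to 5x range is the article's estimate; the hourly cost and baseline tokens-per-second figures are hypothetical and serve only to show the ratio.

```python
def cost_per_million_tokens(hourly_cost: float, tokens_per_second: float) -> float:
    """Serving cost per million tokens for hardware billed by the hour."""
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost / tokens_per_hour * 1_000_000


# Hypothetical figures for illustration only.
HOURLY_COST = 100.0       # cost of one inference node per hour
SELF_BUILT_TPS = 1_000.0  # tokens per second on an untuned self-built deployment

self_built = cost_per_million_tokens(HOURLY_COST, SELF_BUILT_TPS)
for speedup in (3, 5):  # the article's 3x to 5x efficiency range
    tuned = cost_per_million_tokens(HOURLY_COST, SELF_BUILT_TPS * speedup)
    print(f"{speedup}x throughput: unit cost is {tuned / self_built:.0%} of self-built")
```

At equal hardware spend, the better-tuned stack serves tokens at roughly one third to one fifth of the unit cost, which is the headroom a provider can convert into either lower prices or higher margin.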

03.

Battle of the Giants: The Ledgers and Ambitions of Four Major Players

In this wave of collective price adjustments, the stances of the four core giants vary, reflecting different commercial calculations.

Alibaba Cloud: Wu Yongming-style "Profit Quality" Defense. Alibaba's adjustment is the most resolute, with increases focused mainly on high-end GPU instances and storage (CPFS). Against the backdrop of Alibaba's full return to "efficiency first," Alibaba Cloud is no longer pursuing the so-called "cloud market share first" but is instead aiming for "AI computing power profit margin first." The subtext is that Alibaba Cloud is establishing a "VIP computing power zone." If you cannot cover this 30% premium, you might not be on Alibaba's core target customer list.

Baidu AI Cloud: "User Filter." As the earliest player to bet big on large models, Baidu is facing pressure from the order-of-magnitude jump in inference costs for its ERNIE model as call volumes scale. Therefore, Baidu's price hike is more like a "user reshuffle." It is actively weeding out small individual developers who only seek free benefits without building a self-sustaining business, and turning instead to fully serve large, price-insensitive enterprise customers. Baidu needs to prove through price adjustments that its AI growth no longer relies on subsidies but on "selling intelligence at a premium."

Tencent Cloud: "ROI Correction" After Ecosystem Lock-in. Tencent Cloud's move came three weeks later than Alibaba's, a typical "follow-the-leader strategy." Tencent's confidence lies in the deep integration with the WeChat ecosystem and Enterprise WeChat. When enterprise workflows are deeply embedded in Tencent's social/collaboration ecosystem, the migration cost is extremely high. Tencent Cloud's price increase is more like a "catch-up increase," used to correct the ROI sacrificed over the past two years to seize the ecosystem, making the AI business look more "respectable" in financial reports.

Volcano Engine: Strategic "Unbalanced Followership" and Customer Grab Plan. Volcano Engine (ByteDance) is the variable in this wave of price hikes. Although it has also adjusted some prices, the increases on many core APIs are significantly lower than those of Alibaba and Baidu. ByteDance is using this window to intercept competitors' existing customers. Relying on the enormous capacity of Douyin and TikTok to absorb computing power internally, Volcano has an exceptionally strong ability to amortize costs. While competitors are "driving away customers" to protect profits, Volcano is waiting for those who fall behind, attempting to use the price difference to pull off one last overtaking move on installed base.

04.

The Biggest Surprise: Large Enterprises Start "Leaving"

This price increase has triggered an unintended counterforce: it has substantially strengthened large enterprises' determination to "self-build computing power."

The cloud computing industry has a hidden rule: when the monthly bill exceeds a certain threshold, the financial model of "self-build vs. rent" flips. For banks, central state-owned enterprises, and large manufacturers, this threshold is roughly at a monthly cloud computing expenditure of 3 million to 5 million yuan.

In 2024, most large enterprises were below this threshold, making self-building uneconomical. In 2025, as AI projects rolled out, some enterprises began to touch the line. And this round of 20% to 30% price increases in 2026 has directly pushed a group of customers who were originally just on the line into the zone where they "must seriously consider self-building."

Figure | Cloud rental vs. self-build: Total Cost of Ownership (TCO) break-even point calculation

Horizontal axis: average monthly computing power expenditure (10k yuan/month), Vertical axis: 36-month cumulative cost (million yuan) | Comparison before and after price increase
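The break-even logic behind this comparison can be sketched as follows. Only the 3 to 5 million yuan monthly threshold, the 20% to 30% price increase, and the 36-month horizon come from the article; the capital and operating costs of a self-built cluster are hypothetical inputs, so the output illustrates the mechanism rather than real market numbers.

```python
def cloud_cost(monthly_spend: float, months: int, price_increase: float = 0.0) -> float:
    """Cumulative cloud bill, with an optional across-the-board price increase."""
    return monthly_spend * (1 + price_increase) * months


def self_build_cost(capex: float, monthly_opex: float, months: int) -> float:
    """Cumulative self-build cost: up-front hardware plus running costs."""
    return capex + monthly_opex * months


def break_even_month(monthly_cloud: float, capex: float, monthly_opex: float) -> float:
    """Month at which self-building becomes cheaper; inf if it never does."""
    saving = monthly_cloud - monthly_opex
    return capex / saving if saving > 0 else float("inf")


# Hypothetical inputs, in millions of yuan.
MONTHLY_CLOUD = 4.0  # inside the article's 3 to 5 million yuan threshold band
CAPEX = 100.0        # assumed cluster purchase and build-out
MONTHLY_OPEX = 1.5   # assumed power, hosting, and staff
HORIZON = 36         # months, matching the figure's cumulative-cost horizon

own = self_build_cost(CAPEX, MONTHLY_OPEX, HORIZON)
for hike in (0.0, 0.2, 0.3):  # before the increase, then the article's 20% and 30%
    cloud = cloud_cost(MONTHLY_CLOUD, HORIZON, hike)
    month = break_even_month(MONTHLY_CLOUD * (1 + hike), CAPEX, MONTHLY_OPEX)
    print(f"+{hike:.0%}: cloud {cloud:.1f} vs self-build {own:.1f}, break-even at month {month:.1f}")
```

With these made-up inputs, renting is cheaper over 36 months before the increase (break-even at month 40), while a 20% or 30% increase pulls the break-even point inside the horizon (about month 30 and month 27), which is how a customer "just on the line" gets pushed over it.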

The beneficiaries of this self-building wave are not the cloud providers' competitors but more peripheral players: GPU rental platforms saw inquiry volumes triple year-on-year in March; Huawei Ascend's delivery schedule for large customers has been extended to 6 months; integrators specializing in helping enterprises build "private inference clusters" have suddenly become highly sought after.

Cloud providers intended to raise prices to harvest high-end customers but inadvertently pushed away a group of large customers with self-building capabilities. The risk in this decision may be reassessed when earnings season arrives.

05.

Who Wins? The Truth About Benefit Distribution

The price increases by the three cloud providers are seen by the media as "big players harvesting." But from the perspective of the entire industry chain, the distribution of real winners is much more complex.

There is an ironic reality: the most hurt are the small and medium-sized AI startups full of innovative vitality. If they fall on a large scale due to costs, the cloud providers' own ecosystems will wither accordingly.

This is not without precedent. In the early 2010s, Amazon AWS's aggressive price increases on some services accelerated the shift of some developers to Google Cloud, indirectly helping GCP complete its early ecosystem accumulation. History does not repeat itself simply, but it rhymes.

06.

How Long Will the Era of Price Hikes Last?

Simply put, the essence of this round of price increases is a pressure release in China's AI computing power market amidst exploding demand and supply constraints. Squeezed from both sides, prices can only move upward. This is not entirely an active choice by cloud providers; in a sense, it is also a forced pricing correction.

Figure | China's high-performance AI computing power: Demand growth rate vs. domestic supply capacity expansion rate

Index: 2023 = 100 | The continuously expanding supply-demand gap is the underlying logic of this price increase

None of the three structural factors supporting this round of price increases will substantially disappear within 12 months: the quantum leap in token consumption brought by the adoption of Reasoning models, the accelerated large-scale deployment of AI Agents, and the supply constraints caused by Nvidia's export controls.

The B2B software market has a repeatedly verified rule: the Price Ratchet Effect. None of AWS's several price increases in the early 2010s were fully rolled back after supply improved. Google Cloud storage pricing has only seen one downward step since 2021, accompanied by tightened storage limits. Cloud providers understand this rule: this price increase is not just "harvesting during a window" but also about locking in a new price baseline.

Figure | Computing power price index trend: Three scenario predictions (2025Q2–2028Q2)

2025Q2 = Baseline 100 | Comprehensive inference API average price index estimate, including price increase effects

Therefore, before 2027, "computing power to zero" will not become a reality. The price inflection point depends on when the scheduling efficiency of domestic computing power can substantially catch up with Nvidia's H100. Judging from the current engineering progress, this point in time is most likely between 2027 and 2028.

And during this window, cloud providers have every reason to raise prices while they can, because they know the window will not stay open forever.

07.

Conclusion: A Structural Game on the Supply Side

What this round of price increases reveals is not the grand narrative of "AI commercialization's coming of age" but a more specific industrial reality: when an efficiency revolution and demand explosion occur simultaneously, prices may not fall but instead rise. The Jevons Paradox held true in the coal era, and it holds equally true in the computing power era.

For small and medium-sized AI application companies, rather than arguing about who is harvesting, it is better to seriously calculate: in their own business scenarios, how many tokens are being wasted?

Saving tokens is the hardest moat in this era.
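One way to start the token audit suggested above is a rough pass over call logs. The sketch below is an assumption-heavy illustration: the log fields, and the definition of "wasted" as re-sent prompt prefixes plus outputs the application never used, are hypothetical choices, not a standard metric.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """One model call; all fields are hypothetical log columns."""
    prompt_tokens: int
    repeated_prefix_tokens: int  # prompt tokens identical to the previous call (e.g. a system prompt)
    output_tokens: int
    output_used: bool  # whether the application actually used the response


def wasted_share(calls: list[Call]) -> float:
    """Fraction of all tokens that were re-sent prefixes or discarded outputs."""
    total = sum(c.prompt_tokens + c.output_tokens for c in calls)
    wasted = sum(
        c.repeated_prefix_tokens + (0 if c.output_used else c.output_tokens)
        for c in calls
    )
    return wasted / total if total else 0.0


# Made-up sample data for illustration.
calls = [
    Call(prompt_tokens=1200, repeated_prefix_tokens=800, output_tokens=300, output_used=True),
    Call(prompt_tokens=1200, repeated_prefix_tokens=800, output_tokens=500, output_used=False),
]
print(f"wasted share: {wasted_share(calls):.0%}")
```

A real audit would need definitions that fit the business scenario, but even a crude ratio like this shows where prompt reuse or early termination might cut the bill.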

This article is from the WeChat public account "EmphasizeNext" (ID: leo89203898), author: Wen Xin, editor: Xiao Bai

Related Questions

Q: What is the Jevons Paradox and how does it apply to the current AI computing market in China?

A: The Jevons Paradox, observed by British economist William Stanley Jevons in 1865, refers to the phenomenon where increased efficiency in resource use (like coal in steam engines) leads to higher overall consumption because reduced costs trigger explosive demand. In the context of China's AI computing market in 2026, it explains why lower token inference costs (e.g., from models like DeepSeek-R1) have not reduced total spending but instead fueled a surge, as cheaper access encourages more extensive AI integration into business processes, especially with reasoning models and agents that consume 10-50 times more tokens per task.

Q: Why did Chinese cloud providers like Alibaba Cloud and Baidu AI Cloud raise prices despite falling AI inference costs?

A: Chinese cloud providers raised prices by 20-30% due to a combination of factors: the Jevons Paradox driving demand explosion beyond cost savings, their proprietary engineering advantages in inference optimization (e.g., speculative decoding, KV cache management) that justify premium pricing, and strategic shifts to focus on high-value enterprise clients while improving profitability. Additionally, supply constraints from NVIDIA export controls and the transition to AI agents requiring more tokens per task contributed to the pricing adjustment.

Q: How does DeepSeek's approach to open-sourcing model weights but not inference optimization stacks impact the market?

A: DeepSeek open-sourced model weights (e.g., for DeepSeek-R1) but kept its inference optimization stack proprietary, creating a disparity where others have the model architecture but lack the engineering expertise to achieve high efficiency. This allows leading cloud providers to maintain a 3-5x efficiency advantage over self-deployed solutions, enabling them to serve more tokens per unit of compute and justifying price hikes as a way to monetize their technical barriers.

Q: What unintended consequence did the price hike by cloud providers trigger among large enterprises?

A: The price hike accelerated large enterprises' decisions to build their own computing infrastructure. For organizations with monthly cloud AI spending exceeding 3-5 million RMB, the increased costs pushed them past the threshold where self-building becomes financially viable. This led to a surge in demand for GPU leasing platforms, Huawei Ascend hardware, and integration services for private inference clusters, potentially reducing cloud providers' revenue from these high-value clients.

Q: What are the key structural factors sustaining the AI compute price increase, and when might a downturn occur?

A: The price increase is sustained by three structural factors: the adoption of reasoning models and AI agents drastically increasing token consumption per task, ongoing supply constraints from NVIDIA export controls limiting domestic GPU availability, and the price ratchet effect in B2B markets where hikes are rarely fully reversed. A downturn is unlikely before 2027-2028, contingent on domestic compute alternatives (e.g., Huawei Ascend) achieving parity with NVIDIA H100 in scheduling efficiency, which would alleviate supply pressures and introduce competitive pricing.
