Source: International Business Times UK
Original Author: Anastasia Matveeva
Compiled and Edited by: Gonka.ai
AI is expanding at an astonishing rate, but its underlying economic logic is far more fragile than it appears on the surface. When three cloud giants control two-thirds of the world's computing power, when training costs approach $1 billion, and when inference bills catch startups off guard—the true cost of this computing arms race is quietly reshaping the value distribution of the entire AI industry.
This article does not discuss who will build the most advanced models. It addresses a more fundamental question: Is the current economic model of AI infrastructure truly sustainable after scaling? How will changes in the allocation mechanism of computing power reshape the value distribution of the entire market?
I. The Hidden Cost of Intelligence
Training a cutting-edge large model can cost tens or even hundreds of millions of dollars. Anthropic has publicly stated that training Claude 3.5 Sonnet cost "tens of millions of dollars," and its CEO, Dario Amodei, previously estimated that the training cost for the next-generation model could approach $1 billion. According to industry reports, the training cost of GPT-4 may have exceeded $100 million.
However, training costs are just the tip of the iceberg. The structural and ongoing pressure comes from inference costs—the expenses incurred every time a model is called. According to OpenAI's publicly available API pricing, inference is billed per million tokens. For applications with high usage, this means daily inference costs could already reach thousands of dollars even before scaling.
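The scale of per-million-token billing is easy to underestimate. The back-of-envelope calculation can be sketched as follows; the request volume and the $10-per-million-token rate are illustrative placeholders, not any provider's actual prices:

```python
# Hedged sketch: back-of-envelope daily spend for a token-billed inference API.
# The volumes and rate below are illustrative assumptions, not real pricing.

def daily_inference_cost(requests_per_day: int,
                         tokens_per_request: int,
                         usd_per_million_tokens: float) -> float:
    """Estimate daily spend under per-million-token billing."""
    total_tokens = requests_per_day * tokens_per_request
    return total_tokens / 1_000_000 * usd_per_million_tokens

# A mid-sized app: 200k requests/day, ~2k tokens each, at a hypothetical $10/M tokens.
cost = daily_inference_cost(200_000, 2_000, 10.0)
print(f"${cost:,.0f} per day")  # 400M tokens/day -> $4,000 per day
```

Even at this modest hypothetical scale, the bill lands in the thousands of dollars per day, which is the "before scaling" figure the paragraph above refers to.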
AI is often described as software. But its economic essence increasingly resembles capital-intensive infrastructure, requiring both substantial upfront investment and a continuous stream of operational expenses.
This shift in economic structure is quietly altering the competitive landscape of the entire AI industry. Those who can afford computing power are the giants who have already built large-scale infrastructure; startups trying to survive in the cracks are being steadily squeezed by inference bills.
II. Capital Intensity and Market Concentration
According to Holori's 2026 cloud market analysis, AWS currently holds about 33% of the global cloud market share, Microsoft Azure about 22%, and Google Cloud about 11%. Together, these three control approximately two-thirds of the global cloud infrastructure, and the vast majority of global AI workloads run on their infrastructure.
The practical implication of this concentration is: when OpenAI's API goes down, thousands of products are affected simultaneously; when a major cloud service provider experiences an outage, services across industries and regions are disrupted.
Concentration shows no sign of easing; if anything, infrastructure spending continues to expand. Taking NVIDIA as an example, its data center business has reached annualized revenue of over $80 billion, indicating sustained strong demand for high-performance GPUs.
More noteworthy is a hidden structural inequality. According to SEC filings and market reports, top labs like OpenAI and Anthropic secure GPU resources at near-cost prices as low as $1.30–$1.90 per hour through multi-billion dollar "equity-for-compute" agreements. In contrast, small and medium-sized companies lacking strategic partnerships with NVIDIA, Microsoft, or Amazon are forced to purchase at retail prices exceeding $14 per hour—a premium of up to 600%.
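Using the article's reported figures, the retail markup can be expressed as a simple percentage premium over the strategic rate; this is arithmetic on the numbers cited above, not independently verified pricing:

```python
# Hedged sketch: the retail-vs-strategic GPU pricing gap described above,
# expressed as a markup. Rates are the article's reported figures.

strategic_rate = 1.90   # $/GPU-hour, upper end of the reported near-cost range
retail_rate = 14.00     # $/GPU-hour, reported retail price

premium = (retail_rate - strategic_rate) / strategic_rate
print(f"{premium:.0%} premium over the strategic rate")  # 637%
```

Against the lower end of the reported range ($1.30/hour), the same calculation yields an even larger gap, which is why the premium is best stated as a range rather than a single number.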
This pricing gap is driven by NVIDIA's recent strategic investments totaling $40 billion in leading labs. Access to AI infrastructure is increasingly determined by capital-intensive procurement agreements rather than open market competition.
In the early adoption phase, this concentration can appear "efficient." But after scaling, it brings pricing risk, supply bottlenecks, and infrastructure dependency—a triple vulnerability.
III. The Overlooked Energy Dimension
The cost issue of AI infrastructure has another often-overlooked dimension: energy.
According to data from the International Energy Agency (IEA), data centers currently account for about 1–1.5% of global electricity consumption, and AI-driven demand growth could significantly increase this proportion in the coming years.
This means that the economics of computing power is not just a financial issue but also an infrastructure and energy challenge. As AI workloads continue to expand, the geopolitical significance of power supply will become increasingly prominent—the country that can provide the most stable computing power at the lowest energy cost will hold a structural advantage in the industrial competition of the AI era.
When Jensen Huang announced at GTC26 that NVIDIA's order visibility had surpassed $1 trillion, he was describing not just the commercial success of one company but the grand process of civilization converting electricity, land, and scarce minerals into intelligent computing power.
IV. Rethinking Infrastructure Mechanisms
While centralized data centers continue to expand, another type of exploration is quietly emerging—attempting to fundamentally redefine how computing resources are coordinated.
Decentralized Inference: A Structural Alternative
The Gonka protocol is a representative practice in this direction. It is a decentralized network designed specifically for AI inference, with the core design objective of minimizing network synchronization and consensus overhead, directing as much of the network's computing capacity as possible to real AI workloads.
At the governance level, Gonka adopts a "one compute unit, one vote" principle—governance weight is determined by verifiable computing power contribution, not capital shareholding. At the technical level, the protocol uses short-cycle performance measurement intervals (called Sprints), requiring participants to demonstrate real GPU computing power in real-time through a Transformer-based Proof-of-Work (PoW) mechanism.
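The "one compute unit, one vote" idea can be sketched as compute-proportional voting weight. The following is a minimal illustrative model, assuming each node's contribution has already been verified by the sprint's PoW check; the names and structure here are hypothetical and are not Gonka's actual protocol API:

```python
# Hedged sketch of compute-weighted governance: voting weight is proportional
# to verified compute in the latest sprint. Illustrative only; not Gonka's API.

from dataclasses import dataclass

@dataclass
class Node:
    node_id: str
    verified_compute: float  # compute demonstrated via the sprint's PoW check

def governance_weights(nodes: list[Node]) -> dict[str, float]:
    """Normalize verified compute into voting weights that sum to 1."""
    total = sum(n.verified_compute for n in nodes)
    return {n.node_id: n.verified_compute / total for n in nodes}

nodes = [Node("a", 300.0), Node("b", 100.0)]
print(governance_weights(nodes))  # {'a': 0.75, 'b': 0.25}
```

The contrast with capital-based governance is that the weight input is a freshly measured, perishable quantity (last sprint's verified compute) rather than an accumulated balance, so influence decays unless a participant keeps contributing.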
The significance of this design is that nearly 100% of the network's computing power is directed to the AI inference workload itself, rather than consumed on maintaining consensus, coordinating communication, and other infrastructure overhead.
The Economic Logic of Distributed Computing Power
From an economic perspective, the value proposition of decentralized computing networks has three layers.
The first is the cost layer. The pricing structure of centralized cloud service providers inherently includes massive fixed asset depreciation, data center operating costs, and shareholder profit expectations. Decentralized networks can significantly compress these costs by monetizing idle GPU resources. Taking Gonka as an example, the current pricing for inference services provided through its USD billing gateway, GonkaGate, is approximately $0.0009 per million tokens—while centralized providers like Together AI charge about $1.50 for similar models (e.g., DeepSeek-R1), a difference of over a thousand times.
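The "over a thousand times" claim follows directly from the two prices cited above; the sketch below is arithmetic on the article's figures, not an independent price comparison:

```python
# Hedged sketch: the per-million-token price gap cited above, as a ratio.
# Both figures are the article's reported prices, not independently verified.

decentralized = 0.0009   # $/M tokens via GonkaGate, per the article
centralized = 1.50       # $/M tokens at a centralized provider, per the article

ratio = centralized / decentralized
print(f"~{ratio:,.0f}x price difference")  # ~1,667x
```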
The second is the supply elasticity layer. The computing power supply of centralized providers is rigid, with expansion cycles measured in months or even quarters. Participants in decentralized networks can join or exit elastically with demand fluctuations, theoretically enabling a faster response to demand peaks—just as Amazon Web Services was born from holiday traffic peak demands, the peaks and valleys of AI inference similarly require elastic infrastructure to handle.
The third is the sovereignty layer. This dimension is particularly prominent from the perspective of sovereign nations. When a government's public services deeply rely on an external cloud service provider, computing dependency becomes a strategic vulnerability. Decentralized networks offer a possibility: local data centers can join the global distributed network as nodes, ensuring data sovereignty while obtaining sustainable commercial returns by providing computing power to the global market.
V. The Moment of Value Redistribution
Returning to the core question at the beginning of the article: Is the current economic model of AI infrastructure sustainable after scaling?
The answer is: For the top players, yes; for everyone else, increasingly no.
AWS, Azure, and Google Cloud have built moats through decades of capital accumulation, and their scale advantages are almost unshakable in the short term. But this structural advantage also means that pricing power, data access, and infrastructure dependency are highly concentrated in the hands of a few private entities.
Historically, every major monopoly in technological infrastructure has eventually given rise to alternative distributed architectures: the internet itself was a rebellion against telecom monopolies, BitTorrent disrupted centralized distribution, and Bitcoin challenged the centralization of currency issuance.
The decentralization of AI infrastructure may not be an ideological choice but an economic inevitability—when the cost of centralization becomes high enough to drive large-scale user migration, the demand for alternatives will truly erupt. Jensen Huang used the analogy that "every financial crisis pushes more people towards Bitcoin"—a logic equally applicable to the computing power market.
The emergence of DeepSeek has already demonstrated one thing: in a world where the capabilities of open-source models are approaching the closed-source frontier, inference cost will become the core variable determining the scaling speed of AI applications. Whoever can provide the lowest-cost, highest-availability inference computing power holds the entry ticket to this competition.
Conclusion: The Infrastructure War Has Just Begun
The next phase of AI competition will not be decided on the leaderboards of model capabilities but in the economic game of infrastructure.
Centralized computing giants hold capital and scale advantages but also bear the burden of fixed cost structures and pricing pressures. Decentralized networks are entering the market with extremely low marginal costs but need to prove they can meet real commercial thresholds in stability, usability, and ecosystem scale.
The two paths will coexist long-term and pressure each other. The tension between centralization and decentralization will be one of the most significant structural themes to track in the AI industry over the next five years.
This infrastructure war has just begun.