AI's Cost Dilemma: How Infrastructure Economics Will Reshape the Next Phase of the Market

marsbit | Published 2026-03-26 | Updated 2026-03-26

Introduction

AI is expanding, but its underlying economic model is fragile. While training cutting-edge models like Claude 3.5 Sonnet costs tens of millions—with future models potentially reaching $1 billion—the real burden is inference costs, which accumulate with each API call and strain startups. Three cloud giants—AWS, Azure, and Google Cloud—control two-thirds of global cloud infrastructure, creating market concentration and supply risks. Top AI labs secure GPU access at near-cost rates (as low as $1.30–$1.90/hour) via strategic partnerships, while smaller players pay retail prices exceeding $14/hour—a 600% premium. Energy consumption is another challenge: data centers already use 1–1.5% of global electricity, and AI’s growth will intensify this demand. Decentralized inference networks like Gonka offer an alternative, aiming to reduce costs (e.g., $0.0009 per million tokens vs. $1.50 for centralized services), increase supply elasticity, and enhance sovereignty by leveraging idle GPUs globally. The AI infrastructure war is just beginning. Centralized providers hold scale advantages, but economic pressures may drive adoption of decentralized models, reshaping value distribution in the AI industry.

Source: International Business Times UK

Original Author: Anastasia Matveeva

Compiled and Edited by: Gonka.ai

AI is expanding at an astonishing rate, but its underlying economic logic is far more fragile than it appears on the surface. When three cloud giants control two-thirds of the world's computing power, when training costs approach $1 billion, and when inference bills catch startups off guard—the true cost of this computing arms race is quietly reshaping the value distribution of the entire AI industry.

This article does not discuss who will build the most advanced models. It addresses a more fundamental question: Is the current economic model of AI infrastructure truly sustainable after scaling? How will changes in the allocation mechanism of computing power reshape the value distribution of the entire market?

I. The Hidden Cost of Intelligence

Training a cutting-edge large model can cost tens or even hundreds of millions of dollars. Anthropic has publicly stated that training Claude 3.5 Sonnet cost "tens of millions of dollars," and its CEO, Dario Amodei, previously estimated that the training cost for the next-generation model could approach $1 billion. According to industry reports, the training cost of GPT-4 may have exceeded $100 million.

However, training costs are just the tip of the iceberg. The structural and ongoing pressure comes from inference costs—the expenses incurred every time a model is called. According to OpenAI's publicly available API pricing, inference is billed per million tokens. For applications with high usage, this means daily inference costs could already reach thousands of dollars even before scaling.
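To make the order of magnitude concrete, here is a back-of-the-envelope model in Python; the request volume, tokens per call, and blended rate are illustrative assumptions, not any provider's actual price card:

```python
# Back-of-the-envelope daily inference cost when billing is per million tokens.
# All input figures below are illustrative assumptions, not real pricing.

def daily_inference_cost(requests_per_day: int,
                         tokens_per_request: int,
                         usd_per_million_tokens: float) -> float:
    """Estimate daily spend for a token-metered inference API."""
    tokens = requests_per_day * tokens_per_request
    return tokens / 1_000_000 * usd_per_million_tokens

# Example: 500k calls/day, ~2,000 tokens per call (prompt + completion),
# at an assumed blended rate of $5 per million tokens.
print(f"${daily_inference_cost(500_000, 2_000, 5.0):,.0f}/day")  # $5,000/day
```

Even at modest per-token prices, volume alone pushes a mid-sized application into thousands of dollars of daily spend, which is exactly the structural pressure described above.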

AI is often described as software. But its economic essence increasingly resembles capital-intensive infrastructure—requiring both substantial upfront investment and a continuous stream of operational expenses.

This shift in economic structure is quietly altering the competitive landscape of the entire AI industry. Those who can afford computing power are the giants who have already built large-scale infrastructure; startups trying to survive in the cracks are being gradually eroded by inference bills.

II. Capital Intensity and Market Concentration

According to Holori's 2026 cloud market analysis, AWS currently holds about 33% of the global cloud market share, Microsoft Azure about 22%, and Google Cloud about 11%. Together, these three control approximately two-thirds of the global cloud infrastructure, and the vast majority of global AI workloads run on their infrastructure.

The practical implication of this concentration: when OpenAI's API goes down, thousands of products are affected simultaneously; when a major cloud service provider suffers an outage, services across industries and regions are disrupted.

This concentration shows no sign of narrowing; if anything, infrastructure spending continues to expand. Take NVIDIA as an example: its data center business has reached annualized revenue of over $80 billion, indicating sustained strong demand for high-performance GPUs.

More noteworthy is a hidden structural inequality. According to SEC filings and market reports, top labs like OpenAI and Anthropic secure GPU resources at near-cost prices as low as $1.30–$1.90 per hour through multi-billion dollar "equity-for-compute" agreements. In contrast, small and medium-sized companies lacking strategic partnerships with NVIDIA, Microsoft, or Amazon are forced to purchase at retail prices exceeding $14 per hour—a premium of up to 600%.
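A quick annualization shows what this spread means in practice. A minimal sketch using the per-hour figures above, where the fleet size and continuous utilization are assumptions:

```python
# Annualized cost gap between "near-cost" and retail GPU pricing, using the
# per-hour figures cited above. Fleet size and 24/7 utilization are assumed.

HOURS_PER_YEAR = 24 * 365  # 8,760

def annual_gpu_cost(num_gpus: int, usd_per_gpu_hour: float) -> float:
    """Yearly spend for a continuously rented GPU fleet."""
    return num_gpus * usd_per_gpu_hour * HOURS_PER_YEAR

FLEET = 1_000  # a modest cluster by frontier-lab standards (assumption)
insider = annual_gpu_cost(FLEET, 1.90)   # top of the $1.30-$1.90/hr range
retail = annual_gpu_cost(FLEET, 14.00)   # retail rate cited above

print(f"insider: ${insider / 1e6:.1f}M/yr")     # $16.6M/yr
print(f"retail:  ${retail / 1e6:.1f}M/yr")      # $122.6M/yr
print(f"premium: {(retail / insider - 1):.0%}")  # ~637%
```

For the same thousand GPUs, the difference is roughly $100 million per year: an expense line a strategic partner absorbs easily and a startup cannot.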

This pricing gap is driven by NVIDIA's recent strategic investments in leading labs, which total $40 billion. Access to AI infrastructure is increasingly determined by capital-intensive procurement agreements rather than open-market competition.

In the early adoption phase, this concentration can appear "efficient." But after scaling, it brings pricing risk, supply bottlenecks, and infrastructure dependency—a triple vulnerability.

III. The Overlooked Energy Dimension

The cost issue of AI infrastructure has another often-overlooked dimension: energy.

According to data from the International Energy Agency (IEA), data centers currently account for about 1–1.5% of global electricity consumption, and AI-driven demand growth could significantly increase this proportion in the coming years.

This means that the economics of computing power is not just a financial issue but also an infrastructure and energy challenge. As AI workloads continue to expand, the geopolitical significance of power supply will become increasingly prominent—the country that can provide the most stable computing power at the lowest energy cost will hold a structural advantage in the industrial competition of the AI era.

When Jensen Huang announced at GTC26 that NVIDIA's order visibility had surpassed $1 trillion, he was describing not just the commercial success of one company but the grand process of civilization converting electricity, land, and scarce minerals into intelligent computing power.

IV. Rethinking Infrastructure Mechanisms

While centralized data centers continue to expand, another type of exploration is quietly emerging—attempting to fundamentally redefine how computing resources are coordinated.

Decentralized Inference: A Structural Alternative

The Gonka protocol is a representative effort in this direction. It is a decentralized network designed specifically for AI inference, whose core design objective is to minimize network synchronization and consensus overhead, directing as much of the network's compute as possible to real AI workloads.

At the governance level, Gonka adopts a "one compute unit, one vote" principle—governance weight is determined by verifiable computing power contribution, not capital shareholding. At the technical level, the protocol uses short-cycle performance measurement intervals (called Sprints), requiring participants to demonstrate real GPU computing power in real-time through a Transformer-based Proof-of-Work (PoW) mechanism.

The significance of this design is that nearly 100% of the network's computing power is directed to the AI inference workload itself, rather than consumed on maintaining consensus, coordinating communication, and other infrastructure overhead.
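As a purely illustrative sketch, the "one compute unit, one vote" weighting over a Sprint could be expressed as follows; the data structures and scoring rule here are hypothetical assumptions, not Gonka's published protocol logic:

```python
# Hypothetical sketch of compute-weighted governance over one Sprint.
# SprintResult and the weighting rule are illustrative assumptions,
# not Gonka's actual on-chain data structures.

from dataclasses import dataclass

@dataclass
class SprintResult:
    node_id: str
    verified_compute: float  # compute proven via the PoW mechanism this Sprint

def governance_weights(results: list[SprintResult]) -> dict[str, float]:
    """Weight each node by its share of verified compute, not capital stake."""
    total = sum(r.verified_compute for r in results)
    return {r.node_id: r.verified_compute / total for r in results}

sprint = [SprintResult("node-a", 8e15), SprintResult("node-b", 2e15)]
print(governance_weights(sprint))  # {'node-a': 0.8, 'node-b': 0.2}
```

The design choice this illustrates: because weight is recomputed from freshly proven work each Sprint, influence cannot be bought and parked; it must be continuously earned with real GPUs.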

The Economic Logic of Distributed Computing Power

From an economic perspective, the value proposition of decentralized computing networks has three layers.

The first is the cost layer. The pricing structure of centralized cloud service providers inherently includes massive fixed asset depreciation, data center operating costs, and shareholder profit expectations. Decentralized networks can significantly compress these costs by monetizing idle GPU resources. Taking Gonka as an example, the current pricing for inference services provided through its USD billing gateway, GonkaGate, is approximately $0.0009 per million tokens—while centralized providers like Together AI charge about $1.50 for similar models (e.g., DeepSeek-R1), a difference of over a thousand times.
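At these quoted rates, the absolute gap compounds quickly with volume. A minimal sketch, assuming an example workload of 10 billion tokens per month:

```python
# Monthly bill at the two per-million-token rates quoted above.
# The workload volume is an assumed example, not a measured figure.

MONTHLY_TOKENS = 10_000_000_000  # 10B tokens/month (assumption)

def monthly_bill(usd_per_million_tokens: float) -> float:
    return MONTHLY_TOKENS / 1_000_000 * usd_per_million_tokens

decentralized = monthly_bill(0.0009)  # GonkaGate rate cited above
centralized = monthly_bill(1.50)      # Together AI rate cited above

print(f"decentralized: ${decentralized:,.2f}/mo")     # $9.00/mo
print(f"centralized:   ${centralized:,.2f}/mo")       # $15,000.00/mo
print(f"ratio: {centralized / decentralized:,.0f}x")  # 1,667x
```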

The second is the supply elasticity layer. The computing power supply of centralized providers is rigid, with expansion cycles measured in months or even quarters. Participants in decentralized networks can join or exit elastically with demand fluctuations, theoretically enabling a faster response to demand peaks—just as Amazon Web Services was born from holiday traffic peak demands, the peaks and valleys of AI inference similarly require elastic infrastructure to handle.

The third is the sovereignty layer. This dimension is particularly prominent from the perspective of sovereign nations. When a government's public services deeply rely on an external cloud service provider, computing dependency becomes a strategic vulnerability. Decentralized networks offer a possibility: local data centers can join the global distributed network as nodes, ensuring data sovereignty while obtaining sustainable commercial returns by providing computing power to the global market.

V. The Moment of Value Redistribution

Returning to the core question at the beginning of the article: Is the current economic model of AI infrastructure sustainable after scaling?

The answer is: For the top players, yes; for everyone else, increasingly no.

AWS, Azure, and Google Cloud have built moats through decades of capital accumulation, and their scale advantages are almost unshakable in the short term. But this structural advantage also means that pricing power, data access, and infrastructure dependency are highly concentrated in the hands of a few private entities.

Historically, every major monopoly in technological infrastructure eventually gave rise to alternative distributed architectures—the internet itself was a rebellion against telecom monopolies, BitTorrent disrupted centralized content distribution, and Bitcoin challenged the centralization of currency issuance.

The decentralization of AI infrastructure may not be an ideological choice but an economic inevitability—when the cost of centralization becomes high enough to drive large-scale user migration, the demand for alternatives will truly erupt. Jensen Huang used the analogy that "every financial crisis pushes more people towards Bitcoin"—a logic equally applicable to the computing power market.

The emergence of DeepSeek has already demonstrated one thing: in a world where the capabilities of open-source models are approaching the closed-source frontier, inference cost will become the core variable determining the scaling speed of AI applications. Whoever can provide the lowest-cost, highest-availability inference computing power holds the entry ticket to this competition.

Conclusion: The Infrastructure War Has Just Begun

The next phase of AI competition will not be decided on the leaderboards of model capabilities but in the economic game of infrastructure.

Centralized computing giants hold capital and scale advantages but also bear the burden of fixed cost structures and pricing pressures. Decentralized networks are entering the market with extremely low marginal costs but need to prove they can meet real commercial thresholds in stability, usability, and ecosystem scale.

The two paths will coexist long-term and pressure each other. The tension between centralization and decentralization will be one of the most significant structural themes to track in the AI industry over the next five years.

This infrastructure war has just begun.

Related Questions

Q: What are the main cost components in AI infrastructure, and why is inference cost considered more structurally significant than training cost?

A: The main cost components in AI infrastructure are training costs and inference costs. Training a state-of-the-art large model can cost tens to hundreds of millions of dollars (e.g., Claude 3.5 Sonnet cost "tens of millions," and next-gen models may approach $1 billion). However, inference cost—the expense generated each time a model is called—is more structurally significant because it is a continuous operational expenditure. For high-usage applications, daily inference costs can reach thousands of dollars even before scaling, making it a persistent financial burden that shapes the competitive landscape and sustainability of AI businesses.

Q: How does the concentration of cloud infrastructure among AWS, Azure, and Google Cloud impact the AI industry's market dynamics and vulnerability?

A: AWS, Azure, and Google Cloud collectively control about two-thirds of the global cloud infrastructure market. This concentration means that most AI workloads run on these three providers, creating market dynamics where pricing power, supply access, and infrastructure dependency are highly concentrated. It leads to systemic vulnerabilities: outages at one provider (e.g., OpenAI API downtime) can disrupt thousands of products and services globally. Additionally, it exacerbates structural inequality, as large players secure GPU resources at near-cost rates (e.g., $1.30–$1.90/hour) via strategic partnerships, while smaller companies pay retail prices (e.g., over $14/hour)—a 600% premium—due to lack of bargaining power.

Q: What is the role of energy consumption in AI infrastructure economics, and why is it a growing concern?

A: Energy consumption is a critical but often overlooked dimension of AI infrastructure economics. Data centers currently account for 1–1.5% of global electricity consumption, and AI-driven demand is expected to significantly increase this share. This makes energy a fundamental cost factor and a geopolitical challenge: countries with lower energy costs and stable power supplies will have a structural advantage in the AI industry. The conversion of electricity, land, and scarce minerals into compute power (as highlighted by NVIDIA's $1 trillion order visibility) underscores that AI's expansion is not just a financial issue but a resource-intensive process with broad infrastructure implications.

Q: How do decentralized compute networks like Gonka propose to address the economic and structural challenges of centralized AI infrastructure?

A: Decentralized compute networks like Gonka aim to address centralized AI infrastructure challenges through three key value propositions: 1) Cost reduction: by monetizing idle GPU resources, they avoid the fixed costs, depreciation, and profit margins of centralized providers, offering dramatically lower prices (e.g., Gonka charges $0.0009 per million tokens vs. $1.50 for centralized services). 2) Supply elasticity: decentralized networks allow participants to join or exit dynamically, providing flexible scaling to handle demand peaks without rigid expansion cycles. 3) Sovereignty: they enable local data centers to participate in a global network while retaining data sovereignty, reducing dependency on foreign cloud providers and offering commercial returns through global compute supply.

Q: Why might the decentralization of AI infrastructure become an economic necessity rather than an ideological choice?

A: Decentralization of AI infrastructure may become an economic necessity because the high costs and concentrated control of centralized models are unsustainable for most players. While giants like AWS and Azure can sustain their scale, the pricing pressure, supply bottlenecks, and infrastructure dependency create barriers for smaller companies and nations. Historically, monopolies in critical infrastructure (e.g., telecom, content distribution, currency) have spurred distributed alternatives (e.g., the internet, BitTorrent, Bitcoin). Similarly, when centralized AI costs drive large-scale user migration, decentralized networks—with their marginal cost advantages and elastic supply—could emerge as viable alternatives. As open-source models close the capability gap with closed-source ones, inference cost becomes the key variable for scalability, making low-cost, decentralized compute increasingly attractive.
