Source: International Business Times UK
Original Author: Anastasia Matveeva
Compiled and Edited by: Gonka.ai
AI is expanding at an astonishing rate, but its underlying economic logic is far more fragile than it appears on the surface. When three cloud giants control two-thirds of the world's computing power, when training costs approach $1 billion, and when inference bills catch startups off guard—the true cost of this computing arms race is quietly reshaping the value distribution of the entire AI industry.
This article does not discuss who will build the most advanced models. It addresses a more fundamental question: Is the current economic model of AI infrastructure truly sustainable after scaling? How will changes in the allocation mechanism of computing power reshape the value distribution of the entire market?
I. The Hidden Cost of Intelligence
Training a cutting-edge large model can cost tens or even hundreds of millions of dollars. Anthropic has publicly stated that training Claude 3.5 Sonnet cost "tens of millions of dollars," and its CEO, Dario Amodei, previously estimated that the training cost for the next-generation model could approach $1 billion. According to industry reports, the training cost of GPT-4 may have exceeded $100 million.
However, training costs are just the tip of the iceberg. The structural and ongoing pressure comes from inference costs—the expenses incurred every time a model is called. According to OpenAI's publicly available API pricing, inference is billed per million tokens. For applications with high usage, this means daily inference costs could already reach thousands of dollars even before scaling.
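The scale of per-million-token billing is easy to underestimate. The back-of-envelope calculation can be sketched as follows; the request volume and the $10-per-million-token rate are illustrative placeholders, not any provider's actual prices:

```python
# Hedged sketch: back-of-envelope daily spend for a token-billed inference API.
# The volumes and rate below are illustrative assumptions, not real pricing.

def daily_inference_cost(requests_per_day: int,
                         tokens_per_request: int,
                         usd_per_million_tokens: float) -> float:
    """Estimate daily spend under per-million-token billing."""
    total_tokens = requests_per_day * tokens_per_request
    return total_tokens / 1_000_000 * usd_per_million_tokens

# A mid-sized app: 200k requests/day, ~2k tokens each, at a hypothetical $10/M tokens.
cost = daily_inference_cost(200_000, 2_000, 10.0)
print(f"${cost:,.0f} per day")  # 400M tokens/day -> $4,000 per day
```

Even at this modest hypothetical scale, the bill lands in the thousands of dollars per day, which is the "before scaling" figure the paragraph above refers to.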
AI is often described as software. But its economic essence increasingly resembles capital-intensive infrastructure, requiring both substantial upfront investment and a continuous stream of operational expenses.
This shift in economic structure is quietly altering the competitive landscape of the entire AI industry. Those who can afford computing power are the giants who have already built large-scale infrastructure; startups trying to survive in the cracks are being steadily squeezed by inference bills.
II. Capital Intensity and Market Concentration
According to Holori's 2026 cloud market analysis, AWS currently holds about 33% of the global cloud market share, Microsoft Azure about 22%, and Google Cloud about 11%. Together, these three control approximately two-thirds of the global cloud infrastructure, and the vast majority of global AI workloads run on their infrastructure.
The practical implication of this concentration is: when OpenAI's API goes down, thousands of products are affected simultaneously; when a major cloud service provider experiences an outage, services across industries and regions are disrupted.
Concentration shows no sign of easing; if anything, infrastructure spending continues to expand. Taking NVIDIA as an example, its data center business has reached annualized revenue of over $80 billion, indicating sustained strong demand for high-performance GPUs.
More noteworthy is a hidden structural inequality. According to SEC filings and market reports, top labs like OpenAI and Anthropic secure GPU resources at near-cost prices as low as $1.30–$1.90 per hour through multi-billion dollar "equity-for-compute" agreements. In contrast, small and medium-sized companies lacking strategic partnerships with NVIDIA, Microsoft, or Amazon are forced to purchase at retail prices exceeding $14 per hour—a premium of up to 600%.
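Using the article's reported figures, the retail markup can be expressed as a simple percentage premium over the strategic rate; this is arithmetic on the numbers cited above, not independently verified pricing:

```python
# Hedged sketch: the retail-vs-strategic GPU pricing gap described above,
# expressed as a markup. Rates are the article's reported figures.

strategic_rate = 1.90   # $/GPU-hour, upper end of the reported near-cost range
retail_rate = 14.00     # $/GPU-hour, reported retail price

premium = (retail_rate - strategic_rate) / strategic_rate
print(f"{premium:.0%} premium over the strategic rate")  # 637%
```

Against the lower end of the reported range ($1.30/hour), the same calculation yields an even larger gap, which is why the premium is best stated as a range rather than a single number.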
This pricing gap is driven by NVIDIA's recent strategic investments totaling $40 billion in leading labs. Access to AI infrastructure is increasingly determined by capital-intensive procurement agreements rather than open market competition.
In the early adoption phase, this concentration can appear "efficient." But after scaling, it brings pricing risk, supply bottlenecks, and infrastructure dependency—a triple vulnerability.
III. The Overlooked Energy Dimension
The cost issue of AI infrastructure has another often-overlooked dimension: energy.
According to data from the International Energy Agency (IEA), data centers currently account for about 1–1.5% of global electricity consumption, and AI-driven demand growth could significantly increase this proportion in the coming years.
This means that the economics of computing power is not just a financial issue but also an infrastructure and energy challenge. As AI workloads continue to expand, the geopolitical significance of power supply will become increasingly prominent—the country that can provide the most stable computing power at the lowest energy cost will hold a structural advantage in the industrial competition of the AI era.
When Jensen Huang announced at GTC26 that NVIDIA's order visibility had surpassed $1 trillion, he was describing not just the commercial success of one company but the grand process of civilization converting electricity, land, and scarce minerals into intelligent computing power.
IV. Rethinking Infrastructure Mechanisms
While centralized data centers continue to expand, another type of exploration is quietly emerging—attempting to fundamentally redefine how computing resources are coordinated.
Decentralized Inference: A Structural Alternative
The Gonka protocol is a representative practice in this direction. It is a decentralized network designed specifically for AI inference, with the core design objective of minimizing network synchronization and consensus overhead, directing as much of the network's computing capacity as possible to real AI workloads.
At the governance level, Gonka adopts a "one compute unit, one vote" principle—governance weight is determined by verifiable computing power contribution, not capital shareholding. At the technical level, the protocol uses short-cycle performance measurement intervals (called Sprints), requiring participants to demonstrate real GPU computing power in real-time through a Transformer-based Proof-of-Work (PoW) mechanism.
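The "one compute unit, one vote" idea can be sketched as compute-proportional voting weight. The following is a minimal illustrative model, assuming each node's contribution has already been verified by the sprint's PoW check; the names and structure here are hypothetical and are not Gonka's actual protocol API:

```python
# Hedged sketch of compute-weighted governance: voting weight is proportional
# to verified compute in the latest sprint. Illustrative only; not Gonka's API.

from dataclasses import dataclass

@dataclass
class Node:
    node_id: str
    verified_compute: float  # compute demonstrated via the sprint's PoW check

def governance_weights(nodes: list[Node]) -> dict[str, float]:
    """Normalize verified compute into voting weights that sum to 1."""
    total = sum(n.verified_compute for n in nodes)
    return {n.node_id: n.verified_compute / total for n in nodes}

nodes = [Node("a", 300.0), Node("b", 100.0)]
print(governance_weights(nodes))  # {'a': 0.75, 'b': 0.25}
```

The contrast with capital-based governance is that the weight input is a freshly measured, perishable quantity (last sprint's verified compute) rather than an accumulated balance, so influence decays unless a participant keeps contributing.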
The significance of this design is that nearly 100% of the network's computing power is directed to the AI inference workload itself, rather than consumed on maintaining consensus, coordinating communication, and other infrastructure overhead.
The Economic Logic of Distributed Computing Power
From an economic perspective, the value proposition of decentralized computing networks has three layers.
The first is the cost layer. The pricing structure of centralized cloud service providers inherently includes massive fixed asset depreciation, data center operating costs, and shareholder profit expectations. Decentralized networks can significantly compress these costs by monetizing idle GPU resources. Taking Gonka as an example, the current pricing for inference services provided through its USD billing gateway, GonkaGate, is approximately $0.0009 per million tokens—while centralized providers like Together AI charge about $1.50 for similar models (e.g., DeepSeek-R1), a difference of over a thousand times.
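The "over a thousand times" claim follows directly from the two prices cited above; the sketch below is arithmetic on the article's figures, not an independent price comparison:

```python
# Hedged sketch: the per-million-token price gap cited above, as a ratio.
# Both figures are the article's reported prices, not independently verified.

decentralized = 0.0009   # $/M tokens via GonkaGate, per the article
centralized = 1.50       # $/M tokens at a centralized provider, per the article

ratio = centralized / decentralized
print(f"~{ratio:,.0f}x price difference")  # ~1,667x
```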
The second is the supply elasticity layer. The computing power supply of centralized providers is rigid, with expansion cycles measured in months or even quarters. Participants in decentralized networks can join or exit elastically with demand fluctuations, theoretically enabling a faster response to demand peaks—just as Amazon Web Services was born from holiday traffic peak demands, the peaks and valleys of AI inference similarly require elastic infrastructure to handle.
The third is the sovereignty layer. This dimension is particularly prominent from the perspective of sovereign nations. When a government's public services deeply rely on an external cloud service provider, computing dependency becomes a strategic vulnerability. Decentralized networks offer a possibility: local data centers can join the global distributed network as nodes, ensuring data sovereignty while obtaining sustainable commercial returns by providing computing power to the global market.
V. The Moment of Value Redistribution
Returning to the core question at the beginning of the article: Is the current economic model of AI infrastructure sustainable after scaling?
The answer is: For the top players, yes; for everyone else, increasingly no.
AWS, Azure, and Google Cloud have built moats through decades of capital accumulation, and their scale advantages are almost unshakable in the short term. But this structural advantage also means that pricing power, data access, and infrastructure dependency are highly concentrated in the hands of a few private entities.
Historically, every major monopoly in technological infrastructure has eventually given rise to alternative distributed architectures: the internet itself was a rebellion against telecom monopolies, BitTorrent disrupted centralized distribution, and Bitcoin challenged the centralization of currency issuance.
The decentralization of AI infrastructure may not be an ideological choice but an economic inevitability—when the cost of centralization becomes high enough to drive large-scale user migration, the demand for alternatives will truly erupt. Jensen Huang used the analogy that "every financial crisis pushes more people towards Bitcoin"—a logic equally applicable to the computing power market.
The emergence of DeepSeek has already demonstrated one thing: in a world where the capabilities of open-source models are approaching the closed-source frontier, inference cost will become the core variable determining the scaling speed of AI applications. Whoever can provide the lowest-cost, highest-availability inference computing power holds the entry ticket to this competition.
Conclusion: The Infrastructure War Has Just Begun
The next phase of AI competition will not be decided on the leaderboards of model capabilities but in the economic game of infrastructure.
Centralized computing giants hold capital and scale advantages but also bear the burden of fixed cost structures and pricing pressures. Decentralized networks are entering the market with extremely low marginal costs but need to prove they can meet real commercial thresholds in stability, usability, and ecosystem scale.
The two paths will coexist long-term and pressure each other. The tension between centralization and decentralization will be one of the most significant structural themes to track in the AI industry over the next five years.
This infrastructure war has just begun.