AI's Cost Dilemma: How Infrastructure Economics Will Reshape the Next Phase of the Market

marsbit | Published 2026-03-26 | Updated 2026-03-26

Introduction

AI is expanding, but its underlying economic model is fragile. While training cutting-edge models like Claude 3.5 Sonnet costs tens of millions—with future models potentially reaching $1 billion—the real burden is inference costs, which accumulate with each API call and strain startups. Three cloud giants—AWS, Azure, and Google Cloud—control two-thirds of global cloud infrastructure, creating market concentration and supply risks. Top AI labs secure GPU access at near-cost rates (as low as $1.30–$1.90/hour) via strategic partnerships, while smaller players pay retail prices exceeding $14/hour—a 600% premium. Energy consumption is another challenge: data centers already use 1–1.5% of global electricity, and AI’s growth will intensify this demand. Decentralized inference networks like Gonka offer an alternative, aiming to reduce costs (e.g., $0.0009 per million tokens vs. $1.50 for centralized services), increase supply elasticity, and enhance sovereignty by leveraging idle GPUs globally. The AI infrastructure war is just beginning. Centralized providers hold scale advantages, but economic pressures may drive adoption of decentralized models, reshaping value distribution in the AI industry.

Source: International Business Times UK

Original Author: Anastasia Matveeva

Compiled and Edited by: Gonka.ai

AI is expanding at an astonishing rate, but its underlying economic logic is far more fragile than it appears on the surface. When three cloud giants control two-thirds of the world's computing power, when training costs approach $1 billion, and when inference bills catch startups off guard—the true cost of this computing arms race is quietly reshaping the value distribution of the entire AI industry.

This article does not discuss who will build the most advanced models. It addresses a more fundamental question: Is the current economic model of AI infrastructure truly sustainable after scaling? How will changes in the allocation mechanism of computing power reshape the value distribution of the entire market?

I. The Hidden Cost of Intelligence

Training a cutting-edge large model can cost tens or even hundreds of millions of dollars. Anthropic has publicly stated that training Claude 3.5 Sonnet cost "tens of millions of dollars," and its CEO, Dario Amodei, previously estimated that the training cost for the next-generation model could approach $1 billion. According to industry reports, the training cost of GPT-4 may have exceeded $100 million.

However, training costs are just the tip of the iceberg. The structural and ongoing pressure comes from inference costs—the expenses incurred every time a model is called. According to OpenAI's publicly available API pricing, inference is billed per million tokens. For applications with high usage, this means daily inference costs could already reach thousands of dollars even before scaling.
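To make the order of magnitude concrete, here is a back-of-the-envelope model in Python; the request volume, tokens per call, and blended rate are illustrative assumptions, not any provider's actual price card:

```python
# Back-of-the-envelope daily inference cost when billing is per million tokens.
# All input figures below are illustrative assumptions, not real pricing.

def daily_inference_cost(requests_per_day: int,
                         tokens_per_request: int,
                         usd_per_million_tokens: float) -> float:
    """Estimate daily spend for a token-metered inference API."""
    tokens = requests_per_day * tokens_per_request
    return tokens / 1_000_000 * usd_per_million_tokens

# Example: 500k calls/day, ~2,000 tokens per call (prompt + completion),
# at an assumed blended rate of $5 per million tokens.
print(f"${daily_inference_cost(500_000, 2_000, 5.0):,.0f}/day")  # $5,000/day
```

Even at modest per-token prices, volume alone pushes a mid-sized application into thousands of dollars of daily spend, which is exactly the structural pressure described above.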

AI is often described as software. But its economic essence increasingly resembles capital-intensive infrastructure—requiring both substantial upfront investment and a continuous stream of operational expenses.

This shift in economic structure is quietly altering the competitive landscape of the entire AI industry. Those who can afford computing power are the giants who have already built large-scale infrastructure; startups trying to survive in the cracks are being gradually eroded by inference bills.

II. Capital Intensity and Market Concentration

According to Holori's 2026 cloud market analysis, AWS currently holds about 33% of the global cloud market share, Microsoft Azure about 22%, and Google Cloud about 11%. Together, these three control approximately two-thirds of the global cloud infrastructure, and the vast majority of global AI workloads run on their infrastructure.

The practical implication of this concentration: when OpenAI's API goes down, thousands of products are affected simultaneously; when a major cloud service provider suffers an outage, services across industries and regions are disrupted.

This concentration shows no sign of narrowing; if anything, infrastructure spending continues to expand. Take NVIDIA as an example: its data center business has reached annualized revenue of over $80 billion, indicating sustained strong demand for high-performance GPUs.

More noteworthy is a hidden structural inequality. According to SEC filings and market reports, top labs like OpenAI and Anthropic secure GPU resources at near-cost prices as low as $1.30–$1.90 per hour through multi-billion dollar "equity-for-compute" agreements. In contrast, small and medium-sized companies lacking strategic partnerships with NVIDIA, Microsoft, or Amazon are forced to purchase at retail prices exceeding $14 per hour—a premium of up to 600%.
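A quick annualization shows what this spread means in practice. A minimal sketch using the per-hour figures above, where the fleet size and continuous utilization are assumptions:

```python
# Annualized cost gap between "near-cost" and retail GPU pricing, using the
# per-hour figures cited above. Fleet size and 24/7 utilization are assumed.

HOURS_PER_YEAR = 24 * 365  # 8,760

def annual_gpu_cost(num_gpus: int, usd_per_gpu_hour: float) -> float:
    """Yearly spend for a continuously rented GPU fleet."""
    return num_gpus * usd_per_gpu_hour * HOURS_PER_YEAR

FLEET = 1_000  # a modest cluster by frontier-lab standards (assumption)
insider = annual_gpu_cost(FLEET, 1.90)   # top of the $1.30-$1.90/hr range
retail = annual_gpu_cost(FLEET, 14.00)   # retail rate cited above

print(f"insider: ${insider / 1e6:.1f}M/yr")     # $16.6M/yr
print(f"retail:  ${retail / 1e6:.1f}M/yr")      # $122.6M/yr
print(f"premium: {(retail / insider - 1):.0%}")  # ~637%
```

For the same thousand GPUs, the difference is roughly $100 million per year: an expense line a strategic partner absorbs easily and a startup cannot.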

This pricing gap is driven by NVIDIA's recent strategic investments in leading labs, which total $40 billion. Access to AI infrastructure is increasingly determined by capital-intensive procurement agreements rather than open-market competition.

In the early adoption phase, this concentration can appear "efficient." But after scaling, it brings pricing risk, supply bottlenecks, and infrastructure dependency—a triple vulnerability.

III. The Overlooked Energy Dimension

The cost issue of AI infrastructure has another often-overlooked dimension: energy.

According to data from the International Energy Agency (IEA), data centers currently account for about 1–1.5% of global electricity consumption, and AI-driven demand growth could significantly increase this proportion in the coming years.

This means that the economics of computing power is not just a financial issue but also an infrastructure and energy challenge. As AI workloads continue to expand, the geopolitical significance of power supply will become increasingly prominent—the country that can provide the most stable computing power at the lowest energy cost will hold a structural advantage in the industrial competition of the AI era.

When Jensen Huang announced at GTC26 that NVIDIA's order visibility had surpassed $1 trillion, he was describing not just the commercial success of one company but the grand process of civilization converting electricity, land, and scarce minerals into intelligent computing power.

IV. Rethinking Infrastructure Mechanisms

While centralized data centers continue to expand, another type of exploration is quietly emerging—attempting to fundamentally redefine how computing resources are coordinated.

Decentralized Inference: A Structural Alternative

The Gonka protocol is a representative effort in this direction. It is a decentralized network designed specifically for AI inference, whose core design objective is to minimize network synchronization and consensus overhead, directing as much of the network's compute as possible to real AI workloads.

At the governance level, Gonka adopts a "one compute unit, one vote" principle—governance weight is determined by verifiable computing power contribution, not capital shareholding. At the technical level, the protocol uses short-cycle performance measurement intervals (called Sprints), requiring participants to demonstrate real GPU computing power in real-time through a Transformer-based Proof-of-Work (PoW) mechanism.

The significance of this design is that nearly 100% of the network's computing power is directed to the AI inference workload itself, rather than consumed on maintaining consensus, coordinating communication, and other infrastructure overhead.
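As a purely illustrative sketch, the "one compute unit, one vote" weighting over a Sprint could be expressed as follows; the data structures and scoring rule here are hypothetical assumptions, not Gonka's published protocol logic:

```python
# Hypothetical sketch of compute-weighted governance over one Sprint.
# SprintResult and the weighting rule are illustrative assumptions,
# not Gonka's actual on-chain data structures.

from dataclasses import dataclass

@dataclass
class SprintResult:
    node_id: str
    verified_compute: float  # compute proven via the PoW mechanism this Sprint

def governance_weights(results: list[SprintResult]) -> dict[str, float]:
    """Weight each node by its share of verified compute, not capital stake."""
    total = sum(r.verified_compute for r in results)
    return {r.node_id: r.verified_compute / total for r in results}

sprint = [SprintResult("node-a", 8e15), SprintResult("node-b", 2e15)]
print(governance_weights(sprint))  # {'node-a': 0.8, 'node-b': 0.2}
```

The design choice this illustrates: because weight is recomputed from freshly proven work each Sprint, influence cannot be bought and parked; it must be continuously earned with real GPUs.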

The Economic Logic of Distributed Computing Power

From an economic perspective, the value proposition of decentralized computing networks has three layers.

The first is the cost layer. The pricing structure of centralized cloud service providers inherently includes massive fixed asset depreciation, data center operating costs, and shareholder profit expectations. Decentralized networks can significantly compress these costs by monetizing idle GPU resources. Taking Gonka as an example, the current pricing for inference services provided through its USD billing gateway, GonkaGate, is approximately $0.0009 per million tokens—while centralized providers like Together AI charge about $1.50 for similar models (e.g., DeepSeek-R1), a difference of over a thousand times.
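At these quoted rates, the absolute gap compounds quickly with volume. A minimal sketch, assuming an example workload of 10 billion tokens per month:

```python
# Monthly bill at the two per-million-token rates quoted above.
# The workload volume is an assumed example, not a measured figure.

MONTHLY_TOKENS = 10_000_000_000  # 10B tokens/month (assumption)

def monthly_bill(usd_per_million_tokens: float) -> float:
    return MONTHLY_TOKENS / 1_000_000 * usd_per_million_tokens

decentralized = monthly_bill(0.0009)  # GonkaGate rate cited above
centralized = monthly_bill(1.50)      # Together AI rate cited above

print(f"decentralized: ${decentralized:,.2f}/mo")     # $9.00/mo
print(f"centralized:   ${centralized:,.2f}/mo")       # $15,000.00/mo
print(f"ratio: {centralized / decentralized:,.0f}x")  # 1,667x
```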

The second is the supply elasticity layer. The computing power supply of centralized providers is rigid, with expansion cycles measured in months or even quarters. Participants in decentralized networks can join or exit elastically with demand fluctuations, theoretically enabling a faster response to demand peaks—just as Amazon Web Services was born from holiday traffic peak demands, the peaks and valleys of AI inference similarly require elastic infrastructure to handle.

The third is the sovereignty layer. This dimension is particularly prominent from the perspective of sovereign nations. When a government's public services deeply rely on an external cloud service provider, computing dependency becomes a strategic vulnerability. Decentralized networks offer a possibility: local data centers can join the global distributed network as nodes, ensuring data sovereignty while obtaining sustainable commercial returns by providing computing power to the global market.

V. The Moment of Value Redistribution

Returning to the core question at the beginning of the article: Is the current economic model of AI infrastructure sustainable after scaling?

The answer is: For the top players, yes; for everyone else, increasingly no.

AWS, Azure, and Google Cloud have built moats through decades of capital accumulation, and their scale advantages are almost unshakable in the short term. But this structural advantage also means that pricing power, data access, and infrastructure dependency are highly concentrated in the hands of a few private entities.

Historically, every major monopoly in technological infrastructure eventually gave rise to alternative distributed architectures—the internet itself was a rebellion against telecom monopolies, BitTorrent disrupted centralized content distribution, and Bitcoin challenged the centralization of currency issuance.

The decentralization of AI infrastructure may not be an ideological choice but an economic inevitability—when the cost of centralization becomes high enough to drive large-scale user migration, the demand for alternatives will truly erupt. Jensen Huang used the analogy that "every financial crisis pushes more people towards Bitcoin"—a logic equally applicable to the computing power market.

The emergence of DeepSeek has already demonstrated one thing: in a world where the capabilities of open-source models are approaching the closed-source frontier, inference cost will become the core variable determining the scaling speed of AI applications. Whoever can provide the lowest-cost, highest-availability inference computing power holds the entry ticket to this competition.

Conclusion: The Infrastructure War Has Just Begun

The next phase of AI competition will not be decided on the leaderboards of model capabilities but in the economic game of infrastructure.

Centralized computing giants hold capital and scale advantages but also bear the burden of fixed cost structures and pricing pressures. Decentralized networks are entering the market with extremely low marginal costs but need to prove they can meet real commercial thresholds in stability, usability, and ecosystem scale.

The two paths will coexist long-term and pressure each other. The tension between centralization and decentralization will be one of the most significant structural themes to track in the AI industry over the next five years.

This infrastructure war has just begun.

Related Questions

Q: What are the main cost components in AI infrastructure, and why is inference cost considered more structurally significant than training cost?

A: The main cost components in AI infrastructure are training costs and inference costs. Training a state-of-the-art large model can cost tens to hundreds of millions of dollars (e.g., Claude 3.5 Sonnet cost "tens of millions," and next-gen models may approach $1 billion). However, inference cost—the expense generated each time a model is called—is more structurally significant because it is a continuous operational expenditure. For high-usage applications, daily inference costs can reach thousands of dollars even before scaling, making it a persistent financial burden that shapes the competitive landscape and sustainability of AI businesses.

Q: How does the concentration of cloud infrastructure among AWS, Azure, and Google Cloud impact the AI industry's market dynamics and vulnerability?

A: AWS, Azure, and Google Cloud collectively control about two-thirds of the global cloud infrastructure market. This concentration means that most AI workloads run on these three providers, creating market dynamics where pricing power, supply access, and infrastructure dependency are highly concentrated. It leads to systemic vulnerabilities: outages at one provider (e.g., OpenAI API downtime) can disrupt thousands of products and services globally. Additionally, it exacerbates structural inequality, as large players secure GPU resources at near-cost rates (e.g., $1.30–$1.90/hour) via strategic partnerships, while smaller companies pay retail prices (e.g., over $14/hour)—a 600% premium—due to lack of bargaining power.

Q: What is the role of energy consumption in AI infrastructure economics, and why is it a growing concern?

A: Energy consumption is a critical but often overlooked dimension of AI infrastructure economics. Data centers currently account for 1–1.5% of global electricity consumption, and AI-driven demand is expected to significantly increase this share. This makes energy a fundamental cost factor and a geopolitical challenge: countries with lower energy costs and stable power supplies will have a structural advantage in the AI industry. The conversion of electricity, land, and scarce minerals into compute power (as highlighted by NVIDIA's $1 trillion order visibility) underscores that AI's expansion is not just a financial issue but a resource-intensive process with broad infrastructure implications.

Q: How do decentralized compute networks like Gonka propose to address the economic and structural challenges of centralized AI infrastructure?

A: Decentralized compute networks like Gonka aim to address centralized AI infrastructure challenges through three key value propositions: 1) Cost reduction: by monetizing idle GPU resources, they avoid the fixed costs, depreciation, and profit margins of centralized providers, offering dramatically lower prices (e.g., Gonka charges $0.0009 per million tokens vs. $1.50 for centralized services). 2) Supply elasticity: decentralized networks allow participants to join or exit dynamically, providing flexible scaling to handle demand peaks without rigid expansion cycles. 3) Sovereignty: they enable local data centers to participate in a global network while retaining data sovereignty, reducing dependency on foreign cloud providers and offering commercial returns through global compute supply.

Q: Why might the decentralization of AI infrastructure become an economic necessity rather than an ideological choice?

A: Decentralization of AI infrastructure may become an economic necessity because the high costs and concentrated control of centralized models are unsustainable for most players. While giants like AWS and Azure can sustain their scale, the pricing pressure, supply bottlenecks, and infrastructure dependency create barriers for smaller companies and nations. Historically, monopolies in critical infrastructure (e.g., telecom, content distribution, currency) have spurred distributed alternatives (e.g., the internet, BitTorrent, Bitcoin). Similarly, when centralized AI costs drive large-scale user migration, decentralized networks—with their marginal cost advantages and elastic supply—could emerge as viable alternatives. As open-source models close the capability gap with closed-source ones, inference cost becomes the key variable for scalability, making low-cost, decentralized compute increasingly attractive.
