When Tokens Cost More Than People, 'AI Narrative' Runs Into Trouble

marsbit | Published on 2026-05-29 | Last updated on 2026-05-29

Abstract

The economic sustainability of corporate AI adoption is under scrutiny as token consumption soars while measurable business value remains elusive. Major companies like Uber and Microsoft report struggling to justify rising AI costs, with executives coining terms like "tokenmaxxing" to describe wasteful usage. Data reveals a stark picture: for every dollar spent on AI tokens, only 18 cents translates to user-facing value, with the rest consumed by bug fixes, rework, and friction. The debate splits into bullish and bearish camps. Bulls, like Goldman Sachs analysts, see current inefficiencies as growing pains, predicting a 24-fold increase in token demand by 2030 and a shift towards healthier metrics like "cost per effective action." They point to indicators of real productivity gains and argue current tech valuations are not in bubble territory. Bears, however, highlight an unsustainable model where value is heavily concentrated in semiconductor companies like Nvidia, funded by cloud giants taking on massive debt. Studies show 95% of firms investing in generative AI see zero return. A deeper concern is the circular financial structure between cloud providers (hyperscalers) and AI labs like OpenAI and Anthropic. Billions in cloud service commitments are tied to these labs, which are partly funded by the hyperscalers' own investment. This creates a loop where cloud revenue depends on labs securing contin...

Author: Bao Yilong

Source: Wall Street News

The justification for corporate AI spending is facing a severe test: token consumption continues to climb, yet quantifiable commercial value remains elusive.

On May 22, Andrew Macdonald, chief operating officer of Uber (a company valued at over $200 billion), said on a podcast that the link between growth in token consumption and substantial product improvement "doesn't exist yet."

Macdonald pointed out that companies are finding it increasingly difficult to rationalize the continuously rising AI expenditures. He even coined a term for the wasteful phenomenon within engineering teams: "tokenmaxxing."

Earlier, in mid-May, Microsoft began cutting internal Claude Code licenses, calling the token bills "unsustainable."

The combination of these two events forces the market to confront a previously overlooked variable. Token economics, specifically the unit economics of token consumption at enterprise scale, has evolved from a peripheral issue to the central load-bearing pillar of the entire AI investment thesis.

Five Data Points, Painting a New Picture

Since April, several data points have emerged in succession, and together they sketch an alarming picture.

In April this year, Uber's Chief Technology Officer publicly stated that the company had burned through its annual Claude Code budget in just four months.

Among 5,000 engineers, monthly usage rates ranged from 84% to 95%, with individual monthly bills varying from $150 to $2,000. The CTO himself reportedly consumed $1,200 worth of tokens during a two-hour internal demonstration.
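As a rough illustration of what these figures imply, the sketch below bounds total monthly spend using only the numbers quoted above. It assumes, purely for illustration, that every active engineer sits at either the lowest or the highest individual bill; the real distribution of bills is not given, so the true total lies somewhere between the two bounds.

```python
# Rough bounds on monthly Claude Code spend, using only the figures quoted above.
# Assumption (not from the article): all active engineers sit at one extreme.
ENGINEERS = 5_000
USAGE_RATE = (0.84, 0.95)             # lowest and highest monthly usage rate
BILL_PER_ENGINEER = (150.0, 2_000.0)  # USD per engineer per month, low and high


def monthly_spend_bounds(engineers, usage, bill):
    """Return (low, high) monthly spend in USD under the all-at-one-extreme assumption."""
    low = engineers * usage[0] * bill[0]
    high = engineers * usage[1] * bill[1]
    return low, high


low, high = monthly_spend_bounds(ENGINEERS, USAGE_RATE, BILL_PER_ENGINEER)
demo_rate_per_hour = 1_200 / 2  # the CTO's reported two-hour demonstration

print(f"Monthly spend bounds: ${low:,.0f} to ${high:,.0f}")
print(f"Demo burn rate: ${demo_rate_per_hour:,.0f} per hour")
```

Under these assumptions the bounds are wide, from about $0.63 million to $9.5 million a month, which is one reason an annual budget set in advance could be exhausted early.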

Macdonald described being "speechless" upon hearing this number.

As for Microsoft, according to Tom Warren's Notepad newsletter at The Verge, Claude Code quickly became popular among the company's internal engineering teams. However, the token-based billing model made spending at scale unsustainable, prompting Microsoft to start cutting the related licenses.

GitHub announced that starting June 1, all Copilot plans would shift from a fixed subscription model to usage-based billing.

The official discussion thread garnered nearly 900 downvotes, as users calculated that a single AI programming session typically consumes $30 to $40, meaning a $10 monthly subscription could be exhausted in a single use.
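The users' calculation can be reproduced in a few lines. This is only a sketch of the arithmetic in the thread, using the $10 fee and the $30 to $40 session cost quoted above; the function names are illustrative.

```python
# Reproduces the users' arithmetic: how far a $10 monthly fee goes
# when one AI programming session costs $30 to $40.
MONTHLY_FEE = 10.0
SESSION_COST_RANGE = (30.0, 40.0)


def sessions_covered(fee, session_cost):
    """Fraction of one session that the monthly fee pays for."""
    return fee / session_cost


def months_of_fees_per_session(fee, session_cost):
    """How many months of subscription fees one session consumes."""
    return session_cost / fee


for cost in SESSION_COST_RANGE:
    print(
        f"${cost:.0f} session: fee covers {sessions_covered(MONTHLY_FEE, cost):.2f} sessions, "
        f"one session equals {months_of_fees_per_session(MONTHLY_FEE, cost):.0f} months of fees"
    )
```

In other words, the old monthly fee would pay for a quarter to a third of one session at the quoted costs.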

Developer productivity platform Entelligence.AI aggregated data from 2,444 companies and found:

  • For every $1 spent on AI token costs, only 18 cents generated actual value reaching users.
  • 44 cents were used to fix bugs introduced by the AI itself; 27 cents went to rework; 11 cents were consumed by review friction.
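A quick check of the breakdown above, written as a sketch: the four categories sum to a full dollar, and dividing by the value share gives the token spend needed to deliver one dollar of user-facing value. The dictionary keys are illustrative labels, not Entelligence.AI's own terminology.

```python
# Entelligence.AI breakdown of each $1 of token spend, in cents (figures from the article).
BREAKDOWN_CENTS = {
    "user_value": 18,
    "ai_bug_fixes": 44,
    "rework": 27,
    "review_friction": 11,
}

total = sum(BREAKDOWN_CENTS.values())
overhead = total - BREAKDOWN_CENTS["user_value"]
value_share = BREAKDOWN_CENTS["user_value"] / total

# Token spend required to deliver $1 of value that actually reaches users.
cost_per_dollar_of_value = 1 / value_share

print(f"Categories sum to {total} cents")
print(f"Overhead: {overhead} cents per dollar")
print(f"Spend per $1 of user value: ${cost_per_dollar_of_value:.2f}")
```

On these figures, delivering one dollar of user-facing value takes roughly $5.56 of token spend.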

According to Bloomberg's Silicon Data LLM Token Expenditure Index, token prices have risen about 65% since the end of February this year, and US AI software prices have increased by 20% to 37% cumulatively over the past year.

Bull vs. Bear Debate: One Fact, Two Interpretations

The same data points to starkly different conclusions under different analytical frameworks.

The bullish view argues that the current chaos is merely the growing pains of a successful transformation.

According to Goldman Sachs' Jim Schneider in early May, by 2030, agentic AI will drive a 24-fold increase in token consumption, reaching approximately 120 sextillion tokens per month. The gross margins of hyperscale cloud providers and model vendors will turn positive within the next 3 to 12 months.

Goldman's Rich Privorotsky believes that Q1 2026 might have been the peak for "token maximization" as a KPI. The industry is shifting from pursuing consumption volume to the healthier metric of "cost per effective action."

JPMorgan's economic research also found a jump in new and updated Python packages on PyPI in early 2026, a trend not seen when ChatGPT launched in 2022, which suggests that real productivity gains are occurring.

Furthermore, the Magnificent 7 currently trades at about 20 times forward earnings, far below the 52 times at the peak of the 2000 tech bubble, 67 times for Japan in 1989, and 34 times during the "Nifty Fifty" era. By historical bubble standards, this does not constitute a bubble.

The bearish view was most systematically articulated by Goldman Sachs semiconductor analyst Jim Covello in an April report.

He pointed out that almost all value in the AI supply chain flows to semiconductor companies, a pattern that is historically unprecedented and unsustainable. Chip companies should benefit when their customers benefit, but in this cycle, their prosperity comes at the expense of the rest of the industry chain.

Nvidia's net profit has grown about 20-fold since ChatGPT's launch. Major hyperscale cloud providers, meanwhile, have burned through their operating cash flow and are turning to debt: data center-related debt issuance in 2025 was approximately $182 billion, double the 2024 level.

MIT NANDA research shows 95% of enterprises investing in generative AI see zero return. This decoupling may persist for a while, but cannot last forever.

Concerns About the Circular Financing Structure

This discussion touches on a more complex level: the financial loop between hyperscale cloud providers and AI labs.

According to corporate disclosure documents compiled by The Information, OpenAI and Anthropic account for more than half of the approximately $2 trillion in future cloud service commitments from Microsoft, Oracle, Google, and Amazon. Specifically:

  • Of Microsoft's $627 billion cloud service backlog, $280 billion is tied to OpenAI;
  • Of Oracle's $553 billion pipeline business, 54% (approx. $300 billion) is committed by OpenAI;
  • Of Google's $467.6 billion, Anthropic accounts for 43% (approx. $200 billion);
  • Amazon's corresponding exposure also reaches 51% of its $464 billion backlog.
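The figures in the list can be cross-checked with a short calculation. The sketch below takes Microsoft's dollar amount as given and derives the other three lab-tied amounts from the stated percentages. It assumes that Amazon's 51% exposure is to the same AI labs, which the list implies but does not spell out.

```python
# Cloud backlog (USD billions) and the portion tied to AI labs, per the list above.
# Microsoft's lab-tied amount is given directly; the others are derived from percentages.
BACKLOGS = {
    "Microsoft": (627.0, 280.0),
    "Oracle": (553.0, 0.54 * 553.0),
    "Google": (467.6, 0.43 * 467.6),
    "Amazon": (464.0, 0.51 * 464.0),  # assumption: exposure is to the AI labs
}

total_backlog = sum(backlog for backlog, _ in BACKLOGS.values())
lab_tied = sum(tied for _, tied in BACKLOGS.values())

share_of_sum = lab_tied / total_backlog
share_of_rounded = lab_tied / 2_000.0  # against the "approximately $2 trillion" figure

for name, (backlog, tied) in BACKLOGS.items():
    print(f"{name}: ${tied:,.1f}B of ${backlog:,.1f}B ({tied / backlog:.0%})")
print(f"Total backlog: ${total_backlog:,.1f}B, lab-tied: ${lab_tied:,.1f}B")
print(f"Share of summed backlog: {share_of_sum:.1%}")
print(f"Share of rounded $2T: {share_of_rounded:.1%}")
```

On these inputs the lab-tied share comes to about 48% of the summed backlogs and about 51% of a rounded $2 trillion, so the "more than half" description depends on which total is used.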

This financing structure is inherently circular. Microsoft's $13 billion investment in OpenAI was largely delivered in the form of Azure credits, which OpenAI used to purchase Azure compute. Microsoft then booked this as cloud revenue.

The same hyperscale cloud providers are both equity investors in the AI labs and service providers collecting compute bills.

This structure is also reflected in profit data. Alphabet reported a record Q1 profit of $62.6 billion, of which about $28.7 billion, nearly half, came from the paper appreciation of its Anthropic stake.

Amazon's Q1 profit of $30.3 billion included $16.8 billion in pre-tax unrealized gains from Anthropic, while its free cash flow plummeted 95% to $1.2 billion due to data center capital expenditures of $44.2 billion in the same period.
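The proportions behind these two earnings reports can be worked out directly. This is a sketch under two caveats: Amazon's gain is stated pre-tax while its profit is a net figure, so that ratio is indicative rather than like-for-like, and the comparison period for the 95% free cash flow decline is not given in the article.

```python
# Figures in USD billions, as quoted in the article.
alphabet_profit, alphabet_anthropic_gain = 62.6, 28.7
amazon_profit, amazon_anthropic_gain = 30.3, 16.8  # gain is pre-tax, profit is net
amazon_fcf_now, amazon_fcf_drop = 1.2, 0.95

alphabet_gain_share = alphabet_anthropic_gain / alphabet_profit
amazon_gain_share = amazon_anthropic_gain / amazon_profit

# Free cash flow implied for the comparison period: now = prior * (1 - drop).
amazon_fcf_prior = amazon_fcf_now / (1 - amazon_fcf_drop)

print(f"Alphabet: Anthropic gain is {alphabet_gain_share:.1%} of Q1 profit")
print(f"Amazon: Anthropic gain is {amazon_gain_share:.1%} of Q1 profit (not like-for-like)")
print(f"Amazon: implied prior free cash flow of about ${amazon_fcf_prior:.0f}B")
```

The Alphabet ratio works out to roughly 46%, consistent with "nearly half," and a 95% drop to $1.2 billion implies a prior free cash flow of about $24 billion.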

The sustainability of this system depends on AI labs' continued ability to secure external financing to fulfill cloud computing commitments, which in turn relies on enterprise customers' continued willingness to pay rising token bills.

It is reported that Anthropic currently incurs costs of $3 for every $1 of revenue. Once the pace of financing slows, the credibility of cloud revenue projections will decline, and the valuation multiples of hyperscale cloud vendors will also face re-evaluation pressure.
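To see why the pace of financing matters, the cost ratio can be turned into a funding gap. The sketch below is illustrative only: it applies the reported $3 of cost per $1 of revenue to the annualized revenue of about $4.3 billion that the article quotes for Anthropic later on. Combining the two figures is an assumption made here, since the article does not say they refer to the same period.

```python
def funding_gap(revenue, cost_per_revenue_dollar=3.0):
    """Costs not covered by revenue, which must come from external financing."""
    return revenue * (cost_per_revenue_dollar - 1.0)


# Illustrative input: annualized revenue in USD billions, as quoted elsewhere in the article.
anthropic_revenue = 4.3
anthropic_costs = anthropic_revenue * 3.0
anthropic_gap = funding_gap(anthropic_revenue)

print(f"Implied annual costs: ${anthropic_costs:.1f}B")
print(f"Implied external funding need: ${anthropic_gap:.1f}B per year")
```

At that ratio, every dollar of revenue requires two dollars of outside money, so the funding need grows with revenue unless the cost ratio falls.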

This chain transmits in both directions and will break in both directions.

This Isn't 1999, But the Problem is Real

The current situation does not constitute a typical bubble setup.

In valuation terms, the Magnificent 7 currently trades at about 20 times forward earnings, far below the 52 times at the peak of the 2000 tech bubble, the 67 times for the Japanese market in 1989, or the 34 times of the "Nifty Fifty" era.

AI technology itself is real. For heavy user groups, data on productivity gains is verifiable. OpenAI has an annualized revenue of about $20 billion, Anthropic about $4.3 billion; these two labs are not going to disappear.

Today, token cost (compute expense) has become the key determinant of AI success or failure. Six months ago, people weren't even discussing this topic.

Back then, people only cared about "whether the technology works." Now the answer is clear: for specific jobs and specific people, the technology does work.

But a new question arises: Can the money saved by downstream companies using AI be transmitted upward in time to outrun the valuation window the capital market has left for AI labs and cloud giants?

Those bullish on AI believe that as long as the technology continues to mature, corporate ROI (Return on Investment) will turn positive within 1 to 1.5 years.

Bears believe more executives will follow Macdonald's lead, publicly complaining about low AI ROI and starting to cut budgets.

Both scenarios are playing out; the outcome is undecided. The only certainty is that the old lie—"as long as token consumption is rising, it means the AI transformation is successful"—has been shattered.

High token consumption does not equal commercial value, and this froth must eventually be squeezed out. The bill for AI has come due, but who will ultimately pay it? That remains unknown for now.

Related Questions

Q: According to the article, what is the major problem that enterprise AI spending is currently facing?

A: The major problem is that token consumption is rapidly increasing, but quantifiable business value is hard to find. The article states that 'the line between the growth of token consumption and substantive product improvement... does not yet exist.' Executives are finding it difficult to justify the escalating costs.

Q: What key finding did the developer platform Entelligence.AI discover regarding the value generated from AI token spending?

A: Entelligence.AI found that for every dollar spent on AI token fees, only 18 cents generated tangible value that reached end-users. The rest was consumed by other costs: 44 cents for fixing AI-introduced bugs, 27 cents for rework, and 11 cents for review friction.

Q: What is the critical concern regarding the financial structure between hyperscale cloud providers and AI labs, as described in the article?

A: The concern is a potentially unsustainable, cyclical financing structure. Hyperscale cloud providers (like Microsoft, Amazon) are both equity investors in and service providers for AI labs (like OpenAI, Anthropic). The labs use cloud credits from the investments to purchase cloud compute, which the providers book as revenue. This structure depends on continuous external funding for the labs, which itself relies on enterprise clients' willingness to pay rising token bills.

Q: Based on the bull argument presented, what metric is the AI industry supposedly shifting towards from 'tokenmaxxing'?

A: According to the bull argument, the industry is shifting from focusing on 'tokenmaxxing' (maximizing token consumption as a KPI) towards a healthier metric: the 'cost per effective action' or the return on investment (ROI) of AI deployments.

Q: What does the article conclude is the 'new question' now that the technical capability of AI is proven for specific tasks?

A: The new question is: 'Can the money saved by downstream companies using AI be transmitted upwards quickly enough to outpace the valuation window that capital markets have left for AI labs and cloud giants?' In other words, can the business value and cost savings materialize fast enough to justify the high costs and valuations before investor patience runs out?
