Claude Bill Hits $500 Million, Costs Surge 60-Fold Overnight: Can Your Token Budget Keep Up?

marsbit | Published 2026-06-01 | Updated 2026-06-01

Summary

An enterprise reportedly ran up a staggering $500 million bill on Anthropic's Claude AI in just one month due to a simple oversight: failing to set usage limits for employee accounts. This incident highlights a growing trend of runaway AI costs. Other examples include a Google Cloud user hit with an unexpected $18,000 bill from API key abuse, and an OpenAI internal experiment that consumed 603 billion tokens, costing $1.3 million in 30 days. Major AI providers like OpenAI and GitHub are shifting from flat monthly fees to granular, usage-based pricing (per input/output/cached token), causing shock for some users whose costs skyrocketed by orders of magnitude. The root causes extend beyond pricing. The rise of autonomous AI agents executing long, complex tasks has drastically increased token consumption. Furthermore, misaligned incentives, like internal "leaderboards" ranking employees by AI usage, can encourage wasteful "tokenmaxxing"—using powerful models for trivial tasks just to inflate metrics. This has sparked a new industry focused on cost optimization. Solutions include providing AI with better context (reducing redundant searches) and intelligent model routing (matching tasks to the most cost-effective model). Research indicates token consumption for agentic tasks can vary wildly (up to 30x for the same job) without guaranteeing better results, and models often underestimate their own costs. As AI expenses begin to rival or even surpass human labor costs for some t...

A $500 Million Bill in Just 1 Month!

A costly blunder recently made the rounds in the tech world. According to Axios, one company ran up a $500 million bill on Claude in a single month.

The reason is almost comical: management forgot to set usage limits when it gave employees access to Claude accounts.

In fact, this isn't the only case of AI bills exploding.

In April, a Google Cloud user whose publicly exposed API key was abused received an $18,000 bill overnight, despite having set a budget alert of only about $7.

The unlucky user, Jesse Davies, is an Australian AI consultant and founder of Agentic Labs. He had set up two safeguards for his Google Cloud account: a A$10 (about $7) budget alert and a hard spending cap of $1,400.

As reported by Tom's Hardware, attackers found a Cloud Run service he had deployed from AI Studio months earlier and sent it more than 60,000 requests. Both safeguards failed: billing calculations lagged behind actual usage, and by the time the system reacted the total had reached $18,000.

In mid-May, Peter Steinberger, founder of the open-source project OpenClaw, posted a screenshot on X: a $1.3 million OpenAI API bill for 30 days.

His team has only three people, but they orchestrated 100 Codex agents running in parallel, burning through 603 billion tokens across 7.6 million requests in 30 days. Fortunately, he did not have to foot the $1.3 million bill himself.

Steinberger joined OpenAI this past February, and the $1.3 million was treated as an internal experiment to test the absolute limits of AI programming when token cost is not a consideration. He added that the figure was the result of Codex's "Fast Mode" (higher-tier billing); turning it off brings the cost down to about $300,000.
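Put in per-unit terms, the figures above work out roughly as follows. This is a back-of-the-envelope sketch that uses only the numbers quoted in the post; the results are blended averages, since the post does not break the bill down into input, cached, and output tokens.

```python
# Figures quoted above for the 30-day experiment.
total_cost_usd = 1_300_000
total_tokens = 603_000_000_000
total_requests = 7_600_000

# Blended averages only: the real bill mixes input, cached, and output rates.
cost_per_million_tokens = total_cost_usd / total_tokens * 1_000_000
tokens_per_request = total_tokens / total_requests
cost_per_request = total_cost_usd / total_requests

print(f"~${cost_per_million_tokens:.2f} per million tokens")
print(f"~{tokens_per_request:,.0f} tokens per request")
print(f"~${cost_per_request:.2f} per request")
```

On these numbers, the average request carried about 79,000 tokens and cost about 17 cents, at a blended rate of about $2.16 per million tokens. No single call looks expensive; the bill comes from volume.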

Even earlier, Uber CTO Praveen Neppalli Naga had admitted to The Information that the company had exhausted its annual Claude Code budget by April, and its COO publicly stated that AI costs were becoming increasingly "hard to justify."

$500 million, $1.3 million, $18,000—though these figures differ by orders of magnitude, they point to the same reality:

In the age of agents, any one of these—a compromised key, an army of agents running 24/7, or an account with forgotten limits—can blow up your token bill overnight.
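One way to picture the missing safeguard is a spending check that runs on the caller's side before every request, so it does not depend on the provider's billing data catching up. The sketch below is a minimal illustration, not any provider's API: the `BudgetGuard` class, its method names, and the flat $0.30 per-request cost (roughly $18,000 spread over 60,000 requests) are all assumptions made for the example.

```python
class BudgetExceeded(RuntimeError):
    """Raised when a request would push spend past the hard cap."""


class BudgetGuard:
    """Hypothetical client-side spend tracker; not part of any provider SDK."""

    def __init__(self, hard_cap_usd: float, alert_at_usd: float):
        self.hard_cap_usd = hard_cap_usd
        self.alert_at_usd = alert_at_usd
        self.spent_usd = 0.0
        self.alerted = False

    def authorize(self, estimated_cost_usd: float) -> None:
        """Refuse the request up front if it would exceed the cap."""
        if self.spent_usd + estimated_cost_usd > self.hard_cap_usd:
            raise BudgetExceeded(
                f"cap ${self.hard_cap_usd:.2f} would be exceeded "
                f"(spent ${self.spent_usd:.2f})"
            )

    def record(self, actual_cost_usd: float) -> None:
        """Record real spend after the call and fire the alert once."""
        self.spent_usd += actual_cost_usd
        if not self.alerted and self.spent_usd >= self.alert_at_usd:
            self.alerted = True
            print(f"ALERT: spend reached ${self.spent_usd:.2f}")


# Assumed values: a $7 alert, a $1,400 cap, and $0.30 per request.
guard = BudgetGuard(hard_cap_usd=1400.0, alert_at_usd=7.0)
blocked_at = None
for i in range(60_000):  # simulate a burst of 60,000 requests
    try:
        guard.authorize(estimated_cost_usd=0.30)
    except BudgetExceeded:
        blocked_at = i
        break
    guard.record(actual_cost_usd=0.30)

print(f"blocked at request {blocked_at}, spent ${guard.spent_usd:.2f}")
```

Because the check happens before each call, the simulated burst stops once the next request would cross the cap, rather than after a delayed billing job notices the overrun. A guard like this only covers traffic that passes through it; it would not stop an attacker who calls the service directly with a leaked key.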

Why Do AI Bills Explode?

The answer lies mainly in the shift in billing methods.

Starting in April this year, OpenAI began moving from flat monthly fees to usage-based billing by token.

On April 2, Codex billing moved from per-message estimates to actual token usage, with input, cached input, and output tokens billed separately. On April 23, the rule was extended to all Enterprise, Edu, Health, and Gov plans, removing the implicit discount built into the monthly fee.

GitHub followed closely, announcing that all Copilot plans will switch to usage-based billing effective June 1, 2026. The old premium-request system is scrapped and replaced with AI credits, which are settled against each model's API rate based on actual consumption of input, output, and cached tokens.
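To make the billing change concrete, the sketch below prices a request by billing input, cached input, and output tokens separately. The rates and token counts are placeholders chosen for illustration, not any provider's published prices, and the `Rate`, `Usage`, and `cost_usd` names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    """USD per million tokens. All values used below are placeholders."""
    input_usd: float
    cached_input_usd: float
    output_usd: float


@dataclass(frozen=True)
class Usage:
    input_tokens: int         # uncached input tokens
    cached_input_tokens: int  # input tokens served from cache
    output_tokens: int


def cost_usd(usage: Usage, rate: Rate) -> float:
    """Bill each token class separately at its own rate."""
    return (
        usage.input_tokens * rate.input_usd
        + usage.cached_input_tokens * rate.cached_input_usd
        + usage.output_tokens * rate.output_usd
    ) / 1_000_000


# Placeholder rates, not real prices.
PLACEHOLDER_RATE = Rate(input_usd=2.00, cached_input_usd=0.20, output_usd=10.00)

# Made-up token counts for a short chat and a long agent run.
quick_chat = Usage(input_tokens=2_000, cached_input_tokens=0, output_tokens=500)
agent_run = Usage(
    input_tokens=4_000_000,
    cached_input_tokens=16_000_000,
    output_tokens=300_000,
)

chat_cost = cost_usd(quick_chat, PLACEHOLDER_RATE)
agent_cost = cost_usd(agent_run, PLACEHOLDER_RATE)
print(f"chat:  ${chat_cost:.4f}")
print(f"agent: ${agent_cost:.2f}")
```

With these made-up numbers the chat costs under a cent and the agent run costs $14.20, a gap of more than a thousand times that a flat fee hides and per-token billing exposes.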

GitHub officially explained the reason for this change:

Currently, a quick chat question and a multi-hour autonomous coding task cost the user the same amount. GitHub has been subsidizing the heavy users, but this model is no longer sustainable.

Before the rise of AI agents, a chat and a code completion cost roughly the same, and a monthly fee could cover both.

Once agents took off, a single task could run for hours and modify an entire codebase, creating a cost gap of orders of magnitude between heavy and light users. The flat monthly fee collapses in the face of that disparity.

The news sparked an uproar on Reddit and X.

A developer with the ID JBusu shared a screenshot of their bill, bluntly calling the new pricing "a joke." Their previous monthly cost of $28.12 would become $746.01 under the new system, and they have decided to cancel: "At this price, I could rent a cloud server myself and it would be cheaper."

Another user shared an even more extreme screenshot, showing costs soaring from $50 to $3,000, a 60-fold jump. They said they never expected the pricing to be this outrageous: "Is anyone still subscribing?"

However, some veteran Copilot users countered that these extreme bills are likely run up by "vibe-coders" who are not mindful of token usage, and may not represent normal use.

One veteran user commented: "I use it all day long and rarely exceed limits by month-end. It's hard to believe this is due to differences in task complexity." Another was more direct: "It's people wanting fully automated YOLO-mode development, letting AI run wild. Culling this waste is actually good for everyone else."

One thing is clear: GitHub hasn't abolished monthly fees; the base subscription price remains unchanged. What has changed is that extra usage, agent tasks, and calls to more expensive models now fall under usage-based billing.

The hardest hit are those heavy agent users who rely on Copilot for long-chain tasks.

The Leaderboard Gamed by Its Own Users

The collapse of flat fees is partly due to platforms changing their billing rules, and partly because AI users themselves are burning through tokens.

In May, Business Insider reported that Amazon took down an internal AI usage leaderboard called KiroRank.

The report cited insiders saying the leaderboard quietly encouraged a strange work style: some employees, to climb the ranks, would burn tokens on tasks that didn't solve actual problems, purely for ranking.

After the story broke, Amazon SVP Dave Treadwell directly addressed all employees: "Don't use AI for the sake of using AI. Use it to solve customer problems, business problems, to innovate."

Though absurd, this is hardly surprising. When "burning tokens" gets you on a leaderboard, employees will naturally burn tokens.

Silicon Valley has coined a term for this phenomenon: Tokenmaxxing—treating consumption volume as productivity.

Axios's report also mentioned CTOs discovering employees using cutting-edge AI models to check the weather or write routine emails—trivial tasks that, when run on the most expensive frontier models, can silently send bills soaring.

KiroRank wasn't part of Amazon's official evaluation system but an informal tool built by employees. Yet it clearly exposes a classic management principle: when KPIs are set wrong, people will use the cleverest ways to game the system.

Equating "how much was used" with "how well it was done"—this is the systemic root of this wave of AI waste.

Those Who Count Tokens Are Already Making Money

On the flip side of token bill anxiety, some are quietly turning it into a business.

First approach: Feed the AI with context.

Glean, led by CEO Arvind Jain, builds an enterprise AI work assistant that unifies knowledge scattered across a company and gives employees' AI tools direct context, so they do not have to dig around. The AI takes fewer detours and therefore burns fewer tokens.

This mechanism helped Glean's annual revenue triple in 15 months, crossing $300 million, with clients including Databricks, Reddit, and Samsung.

Second approach: Delegate tasks to the right model.

This is what model-routing startup Factory AI does: it automatically routes each task to the most suitable model, cheap ones for simple tasks and top-tier ones for complex tasks. Jain also noted that routing done right can cut costs by as much as 10x.

Both paths lead to the same destination: Let AI work, but don't let it burn money indiscriminately.
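The routing idea can be sketched in a few lines: send each task to the cheapest model that can handle it. This is an illustration of the general technique only, not a description of Factory AI's product. The model tiers, prices, difficulty scores, and token counts are all hypothetical, and the sketch assumes the difficulty score is already known, which in practice is the hard part.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str               # hypothetical tier names, not real products
    capability: int         # 1 = simple tasks only, 3 = hardest tasks
    usd_per_million: float  # placeholder blended price per million tokens


MODELS = [
    Model("small", capability=1, usd_per_million=0.50),
    Model("medium", capability=2, usd_per_million=3.00),
    Model("frontier", capability=3, usd_per_million=15.00),
]


def route(difficulty: int) -> Model:
    """Return the cheapest model whose capability covers the task."""
    eligible = [m for m in MODELS if m.capability >= difficulty]
    if not eligible:
        raise ValueError(f"no model can handle difficulty {difficulty}")
    return min(eligible, key=lambda m: m.usd_per_million)


# (description, assumed difficulty, assumed total tokens) for a toy workload
tasks = [
    ("check the weather", 1, 100_000),
    ("write routine emails", 1, 200_000),
    ("refactor a module", 3, 100_000),
]

routed = sum(route(d).usd_per_million * t for _, d, t in tasks) / 1_000_000
all_frontier = sum(15.00 * t for _, _, t in tasks) / 1_000_000

print(f"routed:       ${routed:.2f}")
print(f"all frontier: ${all_frontier:.2f}")
```

In this toy workload, routing brings the bill from $6.00 to $1.65. The saving depends entirely on how much of the traffic is simple enough for the cheaper tiers, so the ratio here should not be read as a prediction.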

Academic research is also laying the groundwork for this shift.

https://arxiv.org/pdf/2604.22750

An arXiv paper from April 2026 offered what the article calls the first systematic breakdown of how agentic coding tasks actually burn money.

Conclusion One: Token consumption for agent tasks can be thousands of times higher than for ordinary code reasoning or code chat, with input tokens being the main cost driver.

Conclusion Two: Running the same task multiple times can result in a 30x difference in token consumption.

Conclusion Three: Higher token consumption does not necessarily lead to higher accuracy. Accuracy often peaks at medium cost; burning more beyond that point spends money without yielding better results.

The paper also found that even frontier models can't reliably predict their own token consumption, generally underestimating the real cost.

You think spending more gets more done. In reality, money is spent, the work isn't necessarily better, and the budget is still unpredictable.

When AI Bills Start Rivaling Labor Costs

"This is the first time in my memory that technology costs are starting to be on par with human costs."

On May 29, Glean CEO Arvind Jain said this in an interview with CNBC's Deirdre Bosa.

Observations from Nvidia's Vice President of Applied Deep Learning, Bryan Catanzaro, corroborate this.

He mentioned in an Axios interview that for his team, compute costs far exceed employee salaries.

Similar trends are emerging across multiple companies: from enterprise AI player Glean, to AI compute seller Nvidia, to AI user Uber—all are re-evaluating this equation.

In Jain's view, technology has historically been a small slice of overall corporate costs. Now AI costs are catching up to payroll, and many companies burn through their annual AI budgets in just one or two months.

Over the past year, AI usage rate was a worshipped metric: more usage meant being advanced, burning tokens meant embracing the future. Now, many companies are reflecting on that simple question: What exactly did all those burned tokens buy?

The window for free or flat-rate unlimited usage is closing right now.

Going forward, the question facing every developer is how to budget carefully and get the most value out of every token.

Undoubtedly, the true winners of the future will be those who learn to count tokens first.

References:

https://x.com/dee_bosa/status/2060791500049613306

https://www.cnbc.com/2026/05/29/-tokens-or-humans-the-new-corporate-trade-off.html

https://www.axios.com/2026/05/28/ai-spending-roi-enterprise-costs

https://www.businessinsider.com/amazon-ai-leaderboard-tokenmaxxing-2026-5

This article is from the WeChat public account "AI Era Insights", author: ASI启示录


Related Questions

Q: What is the main reason behind the dramatic increase in AI usage costs as discussed in the article?

A: The primary reason is the shift from flat-rate monthly subscriptions to consumption-based pricing (charging per token used). This change, implemented by companies like OpenAI and GitHub, means that intensive AI agent tasks, which can consume orders of magnitude more tokens than simple chats or completions, now incur significantly higher costs.

Q: What incident involving a leaked API key led to a massive unexpected bill, and how much was it?

A: An Australian AI consultant named Jesse Davies had a Google Cloud API key exposed through a public service. Attackers used it to make over 60,000 requests, resulting in a bill of $18,000, despite his having set a budget alert and a hard spending cap.

Q: What does the term "tokenmaxxing" refer to in the context of corporate AI use?

A: "Tokenmaxxing" refers to employees consuming AI tokens excessively, not to solve real problems, but to climb internal usage leaderboards (like Amazon's KiroRank) or to meet misguided productivity KPIs that equate high token usage with good performance.

Q: What was the key finding of the April 2026 arXiv paper regarding AI agent coding tasks and cost?

A: The key finding was that AI agent tasks can consume thousands of times more tokens than ordinary code reasoning or code chat, with input tokens as the main cost driver. Crucially, higher token consumption does not necessarily lead to higher accuracy; accuracy often peaks at a medium cost level.

Q: According to the article, what are the two main business approaches emerging to help manage and reduce AI token costs?

A: (1) Providing context to AI: companies like Glean build systems that give AI assistants direct access to relevant company knowledge, reducing the need for lengthy searches and context-building, and so saving tokens. (2) Model routing: startups like Factory AI automatically route tasks to the most cost-appropriate AI model (simple tasks to cheaper models, complex ones to top-tier models), potentially cutting costs by up to 10x.
