3 'Hello's and You're Out of Quota: Where Did Your Claude Code Allowance Go? A 28-Day Cache Bug and an Official Response Telling You to 'Use It Sparingly'

marsbit | Published 2026-04-03 | Updated 2026-04-03

Introduction

Over the past month, Claude Code experienced a critical caching bug that caused prompt cache read rates to drop to 4–17%, far below the typical 97–99%. This meant users consumed 10–20 times more allowance than normal when resuming conversations, as the system reprocessed entire contexts instead of reusing cached content. The bug persisted across 20 versions from March 4 to April 1. User complaints surged after a promotional period ended, revealing the severity of the issue. Anthropic responded by tightening usage limits and offering user advice (such as downgrading models or reducing context windows), but did not issue refunds or reset quotas. Despite confirming and fixing a caching regression bug, the company maintained that no overcharging occurred. The response contrasts with OpenAI’s approach of compensating users during similar incidents. Subscribers reported extreme consumption rates, with some exhausting their 5-hour usage windows in minutes.

4-17%. This was the prompt cache read rate for Claude Code over the past month. The normal level is 97-99%.

This means that when you resumed a previous session, Claude Code did not reuse the context that had already been processed, but instead processed the entire content from scratch each time, consuming 10 to 20 times the normal amount of allowance. You thought you were continuing a conversation, but in reality, you were starting a brand new, full-price conversation every time.
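To see why a collapsed cache read rate translates into roughly an order of magnitude more consumption, the arithmetic can be sketched as follows. The price multipliers are assumptions not stated in this article: cached reads billed at 0.1x the base input rate, and tokens that miss the cache written at 1.25x. Output tokens are ignored, and the function names are illustrative.

```python
# Rough sketch: relative input cost per prompt token as a function of cache read rate.
# ASSUMPTION (not from the article): cached reads cost 0.1x the base input price,
# and tokens that miss the cache are written to it at 1.25x the base input price.
CACHE_READ_MULTIPLIER = 0.1
CACHE_WRITE_MULTIPLIER = 1.25


def relative_input_cost(cache_read_rate: float) -> float:
    """Average cost per prompt token, in units of the base input price."""
    if not 0.0 <= cache_read_rate <= 1.0:
        raise ValueError("cache_read_rate must be between 0 and 1")
    hit_cost = cache_read_rate * CACHE_READ_MULTIPLIER
    miss_cost = (1.0 - cache_read_rate) * CACHE_WRITE_MULTIPLIER
    return hit_cost + miss_cost


def cost_ratio(buggy_rate: float, healthy_rate: float) -> float:
    """How many times more a turn costs at buggy_rate than at healthy_rate."""
    return relative_input_cost(buggy_rate) / relative_input_cost(healthy_rate)


if __name__ == "__main__":
    # Compare the article's bug-period range with its normal range.
    for buggy in (0.04, 0.17):
        for healthy in (0.97, 0.99):
            print(f"{buggy:.0%} vs {healthy:.0%}: {cost_ratio(buggy, healthy):.1f}x")
```

Under these assumptions the multiple comes out between roughly 8x and 11x, the same order of magnitude as the 10 to 20 times quoted above; the exact figure depends on the pricing assumptions and on how output tokens are counted.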

This number comes from independent developer ArkNill's proxy monitoring tests. By setting up a transparent proxy, he recorded every request between Claude Code and the Anthropic API, discovering at least two client-side cache bugs that prevented the API server from matching cached conversation prefixes, forcing a full token rebuild every round.
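The article does not describe how ArkNill's proxy computed the read rate. One plausible way, sketched below, is to read the `usage` object returned with each API response and divide cached-read tokens by all prompt tokens. The field names `input_tokens`, `cache_creation_input_tokens`, and `cache_read_input_tokens` are the usage fields of the Anthropic API; the function name and the sample payloads are hypothetical.

```python
from typing import Mapping


def cache_read_rate(usage: Mapping[str, int]) -> float:
    """Fraction of prompt tokens served from the prompt cache for one response."""
    read = usage.get("cache_read_input_tokens", 0) or 0
    written = usage.get("cache_creation_input_tokens", 0) or 0
    uncached = usage.get("input_tokens", 0) or 0
    total = read + written + uncached
    return read / total if total else 0.0


# Hypothetical usage payloads, invented for illustration only.
healthy_turn = {
    "input_tokens": 50,
    "cache_creation_input_tokens": 1_950,
    "cache_read_input_tokens": 98_000,
}
broken_turn = {
    "input_tokens": 50,
    "cache_creation_input_tokens": 85_450,
    "cache_read_input_tokens": 14_500,
}

if __name__ == "__main__":
    print(f"healthy turn: {cache_read_rate(healthy_turn):.1%}")
    print(f"broken turn:  {cache_read_rate(broken_turn):.1%}")
```

A proxy that logs this ratio for every response would show whether the cached prefix grows with the conversation or stays flat.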

The chart above shows a comparison of cache read rates across three phases. During the period from v2.1.69 to v2.1.89 (i.e., the bug period), the cache read rate for the standalone version was only 4-17%. After v2.1.90 fixed one of the critical bugs, the cold-start cache read rate returned to 47-99.7%. By v2.1.91, the stable running cache read rate recovered to 97-99%.

One detail in the chart stands out: the range for v2.1.90 is very wide (47% to 99.7%). This is because a freshly resumed session still needs to "warm up" the cache; the hit rate is relatively low for the first few rounds but quickly returns to normal levels. In the buggy versions, this warm-up never happened: cache reads stalled permanently at the 14,500 tokens of the system prompt, and the entire conversation history was billed at full price every time.
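If cache reads really stall at the 14,500-token system prompt, the read rate on each turn is that fixed amount divided by the total prompt size, so it shrinks as the conversation grows. The sketch below assumes the rate is measured as cached tokens over total prompt tokens (the article does not define it) and uses illustrative context sizes.

```python
SYSTEM_PROMPT_TOKENS = 14_500  # figure quoted in the article


def stalled_read_rate(total_prompt_tokens: int) -> float:
    """Read rate when only the system prompt is ever served from cache."""
    if total_prompt_tokens < SYSTEM_PROMPT_TOKENS:
        raise ValueError("prompt cannot be smaller than the system prompt")
    return SYSTEM_PROMPT_TOKENS / total_prompt_tokens


def context_size_for_rate(read_rate: float) -> int:
    """Total prompt size at which the stalled cache yields the given read rate."""
    if not 0.0 < read_rate <= 1.0:
        raise ValueError("read_rate must be in (0, 1]")
    return round(SYSTEM_PROMPT_TOKENS / read_rate)


if __name__ == "__main__":
    for tokens in (100_000, 200_000, 350_000):  # illustrative context sizes
        print(f"{tokens:>7} tokens -> {stalled_read_rate(tokens):.1%}")
```

Under this assumption, the observed 4-17% range would correspond to prompts of roughly 85,000 to 362,500 tokens.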

28 Days, 20 Versions

This bug wasn't the kind introduced in one update and fixed in the next. According to npm registry release records, the bug-introducing v2.1.69 was released on March 4th, and the bug-fixing v2.1.90 was released on April 1st. This spans 28 days and 20 versions.

The timeline reveals an intriguing detail. After the bug was introduced on March 4th, users did not immediately complain on a large scale. Complaints only exploded around March 23rd, nearly three weeks later. The reason, as laid out in GitHub issue #41930, is that Anthropic ran a 2x allowance promotion (doubling during off-peak hours) from March 13th to 28th, which in effect masked the bug's impact. After the promotion ended, the extra consumption caused by the cache bug was once again counted against the normal billing baseline, and users' allowances instantly "evaporated".

Anthropic's response was not swift. On March 26th, three days after user complaints exploded, engineer Thariq Shihipar announced on his personal X account that peak hour (weekdays 5am-11am PT) limits had been tightened. On March 30th, Anthropic acknowledged on Reddit that "users are hitting their limits much faster than expected," calling it the team's highest priority. It wasn't until April 1st that team member Lydia Hallie published the formal investigation conclusion.

Throughout this process, Anthropic did not publish any blog posts, send email notifications, or update their status page. All official communication was done solely through engineers' personal social media posts and a few Reddit comments.

How Much Did You Pay, How Long Could You Use It?

GitHub issue #41930 gathered hundreds of user reports. The most extreme case was a Max 20x subscriber ($200/month) whose 5-hour rolling window was completely exhausted in 19 minutes. Max 5x users ($100/month) reported their 5-hour window being used up within 90 minutes. According to The Letter Two, some users claimed a simple "hello" consumed 13% of their session quota. A Pro user ($20/month) said on Discord their allowance was "used up by Monday, reset on Saturday," meaning they could only use it normally for 12 out of 30 days.

Based on ArkNill's benchmark tests, on the buggy version v2.1.89, the 100% quota of the Max 20x plan would be exhausted in about 70 minutes. He also calculated the allowance cost of a single --resume operation on a 500K token context session to be approximately $0.15, because the system would fully replay the entire context.

"You're Holding It Wrong"

Lydia Hallie's investigation conclusion confirmed two things: first, peak hour limits had indeed been tightened, and second, sessions with 1 million token contexts consumed more. She stated the team had fixed some bugs but emphasized that "none of these bugs resulted in overcharging."

She then offered four suggestions for saving usage:

1. Use Sonnet 4.6 instead of Opus (Opus consumes about twice as much);
2. Reduce reasoning strength or turn off extended thinking when deep reasoning isn't needed;
3. Don't resume long sessions idle for over an hour, start a new one instead;
4. Set the environment variable CLAUDE_CODE_AUTO_COMPACT_WINDOW=200000 to limit the context window size.
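The fourth suggestion is a configuration change. As a minimal sketch, the variable can be set for a single launch from a Python wrapper. The variable name and value come from the advice above; the helper names and the assumption that the CLI is installed as `claude` on the PATH are illustrative.

```python
import os
import subprocess
from typing import Mapping, Optional


def env_with_compact_window(
    window_tokens: int = 200_000,
    base_env: Optional[Mapping[str, str]] = None,
) -> dict:
    """Return a copy of the environment with the auto-compact window set."""
    if window_tokens <= 0:
        raise ValueError("window_tokens must be positive")
    env = dict(os.environ if base_env is None else base_env)
    env["CLAUDE_CODE_AUTO_COMPACT_WINDOW"] = str(window_tokens)
    return env


def launch_claude_code() -> int:
    """Start the CLI with the limited window.

    ASSUMPTION: the binary is named `claude` and is on the PATH.
    """
    return subprocess.run(["claude"], env=env_with_compact_window()).returncode
```

Calling `launch_claude_code()` affects only that one process; exporting the variable in a shell profile would apply it to every session.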

There was no mention of any form of quota reset or compensation.

AI podcast host Alex Volkov summarized this response as "You're holding it wrong," pointing out that Anthropic itself set the 1 million token context as the default, promoted Opus as the flagship model, and marketed extended thinking as a selling point, but is now advising paying users not to use these features.

The claim of "no overcharging" also creates tension with Claude Code's own update records. Just the day before Lydia's response, v2.1.90 fixed a cache regression bug that had existed since v2.1.69: when using --resume to restore a session, requests that should have hit the cache triggered a full prompt cache miss, billed at full price. Lydia's response did not mention this confirmed billing anomaly.

By comparison, OpenAI's Codex previously had a similar issue of abnormal quota consumption. OpenAI's approach was to reset user quotas, issue compensatory credits, and in March announce the removal of Codex usage caps. Anthropic's approach was to advise users to downgrade models, turn off features, and limit context, attributing responsibility to users' usage patterns.

Anthropic sells subscriptions for the "strongest model + largest context + highest reasoning ability" and charges $20 to $200 per month. A 28-day cache bug caused paying users' allowances to evaporate at 10-20 times the normal rate, and the official response is to tell you to use it sparingly.

Related Questions

Q: What was the prompt cache read rate for Claude Code during the bug period, and what is the normal range?

A: The prompt cache read rate was 4-17% during the bug period, while the normal range is 97-99%.

Q: How long did the caching bug persist across versions, and which versions were affected?

A: The bug persisted for 28 days, from version v2.1.69 (released March 4) to v2.1.90 (released April 1), spanning 20 versions.

Q: What was the response from Anthropic regarding the bug and its impact on user quotas?

A: Anthropic acknowledged the issue on Reddit, stated it was their highest priority, and later provided user recommendations to conserve usage. They claimed 'no overcharging occurred' and did not offer quota resets or compensation.

Q: What were some of the extreme user reports regarding quota consumption during the bug?

A: Extreme reports included a Max 20x user exhausting their 5-hour rolling window in 19 minutes, a Pro user depleting their weekly quota by Monday with a reset on Saturday, and a user claiming a simple 'hello' message consumed 13% of their session quota.

Q: What recommendations did Anthropic engineer Lydia Hallie give to users to reduce their token consumption?

A: The recommendations were: 1) Use Sonnet 4.6 instead of Opus, 2) Reduce reasoning strength or turn off extended thinking, 3) Start a new session instead of resuming one idle for over an hour, and 4) Set an environment variable to limit context window size.
