GitHub, Transfixed by AI

marsbit | Published 2026-06-04 | Updated 2026-06-04

Introduction

On the night of February 9th, GitHub suffered a major outage caused by a simple configuration change—reducing a cache refresh interval from 12 to 2 hours—that triggered a cascade of failures. This was not an isolated event, but part of a broader pattern. In early 2026, GitHub experienced at least 8 major incidents, failing to meet its promised 99.9% availability. These outages stemmed from structural issues: explosive growth in load, tight service coupling, and insufficient protection against abnormal traffic. This unprecedented load is driven by AI Agents. In 2025, GitHub handled ~1 billion commits. By 2026, weekly commits reached 275 million, projecting to ~14 billion for the year—a 14x increase. AI tools like Claude Code now contribute 4.5% of all public repository commits, with weekly submissions surging 25x in just three months. AI-generated pull requests jumped from 4 million to 17 million per month in half a year. Unlike human developers, AI Agents work continuously, generating commits at a scale that overwhelms infrastructure designed for human rhythms. The surge also shattered GitHub's business model. Copilot's flat-rate pricing, based on assisting human developers, became unsustainable as Agentic AI sessions consumed resources worth hundreds of dollars for a few dollars in fees. In response, GitHub imposed usage limits and, by June 1st, shifted to a pay-per-use "AI Credits" system. Facing this new reality, GitHub realized a 10x scaling plan was insufficient. It a...

Late at night on February 9th, Beijing time, tens of millions of developers worldwide opened GitHub and saw the same page.

It wasn't a 404, but something more anxiety-inducing than a 404—that chilling yellow warning bar that sends shivers down every engineer's spine, alongside a status page full of indicators turning from green to red.

github.com was down.

The API was down.

GitHub Actions was down.

Git operations were down—even Copilot wasn't spared.

That night, some people's CI/CD pipelines ground to a halt at the most critical juncture, some saw their automated deployments stuck mid-air, and others waited for a pull request that just wouldn't merge—behind it, a feature waiting to go live, waiting for real users.

GitHub later published an incident report. The root cause, in technical terms, was "an overload of a core database cluster responsible for authentication and user management." But behind those words lay a startling chain of events—

Two days prior, the engineering team, in a hurry to push a new model to users, changed the refresh time of a "user settings cache" from 12 hours to 2 hours. Just one configuration number.

The result: cache rewrites that were supposed to be spread over 12 hours were compressed into 2, creating a dense "cache rewrite storm." Asynchronous task queues were instantly overwhelmed, shared infrastructure components crashed, and the cascading effects spread to services responsible for proxying HTTPS Git operations, eventually exhausting all platform connections.
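A back-of-the-envelope sketch of why shortening the refresh interval multiplies write load. It assumes every cached entry is rewritten once per interval and that rewrites are spread evenly; the entry count is a made-up placeholder, not a figure from GitHub's incident report.

```python
def rewrites_per_second(cached_entries: int, refresh_hours: float) -> float:
    """Average rewrite rate if each entry is rewritten once per refresh interval."""
    return cached_entries / (refresh_hours * 3600)


# Hypothetical number of cached user-settings entries (an assumption).
CACHED_ENTRIES = 100_000_000

rate_before = rewrites_per_second(CACHED_ENTRIES, refresh_hours=12)
rate_after = rewrites_per_second(CACHED_ENTRIES, refresh_hours=2)
amplification = rate_after / rate_before

print(f"before: {rate_before:,.0f} rewrites/s")
print(f"after:  {rate_after:,.0f} rewrites/s")
print(f"amplification: {amplification:.1f}x")
```

Under these assumptions the average rate rises sixfold whatever the entry count, because it depends only on the ratio 12 / 2. The sketch models the average only; it says nothing about bursts at the moment the change took effect.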

One number, changed from 12 to 2.

GitHub was brought down by a configuration change it made itself.

But if you only see that one config change, you've probably missed the most important part of this story.

01 Not One Accident, But Ten

The February 9th incident was not an isolated event.

In fact, in the first three months of 2026, GitHub experienced at least 8 major incidents. February alone saw 37 recorded failures, big and small. GitHub's CTO Vlad Fedorov later admitted in a blog post that during those two months GitHub had failed to maintain the "three nines" (99.9% availability) it promises its enterprise customers.
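To put "three nines" in concrete terms, the arithmetic can be sketched as below. The 30-day and 365-day measurement windows are illustrative assumptions; the article does not say over which window GitHub measures its commitment.

```python
def downtime_budget_minutes(availability: float, window_days: int) -> float:
    """Minutes of downtime allowed in the window at the given availability."""
    return (1.0 - availability) * window_days * 24 * 60


# Window lengths are assumptions chosen for illustration.
monthly_budget = downtime_budget_minutes(0.999, window_days=30)
yearly_budget = downtime_budget_minutes(0.999, window_days=365)

print(f"30-day budget:  {monthly_budget:.1f} minutes")
print(f"365-day budget: {yearly_budget / 60:.2f} hours")
```

At 99.9%, a 30-day window allows roughly 43 minutes of downtime, so a single multi-hour outage of a service is enough to miss the target for that window.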

Looking through the failure records of those two months, you'll find a peculiar pattern: each incident appears to have a different cause.

February 2nd: Issues with the Azure compute provider, causing GitHub Actions to be down for nearly 4 hours, affecting Copilot Chat, CodeQL, Dependabot, and more.

February 9th: Cache rewrite storm, authentication database overload.

March 5th: Redis cluster failure, 95% of GitHub Actions workflows unable to start within 5 minutes, average delay of 30 minutes.

March 18th: Webhook latency spiked to 32 times the normal level.

Each one looked like an "accident," each with a different immediate cause. But Fedorov's explanation strings them together into the same story. He said these incidents share three common structural causes: "rapid load growth, tight coupling between services leading to localized failure propagation, and systems lacking protection capabilities against abnormal client traffic."

In engineer speak, GitHub's foundation is starting to crack under the pressure of new loads.

And this "new load" has a specific name.

02 275 Million Commits Per Week

Key Data

Total commits for all of 2025: Approximately 1 billion

Weekly commit volume in 2026: 275 million

Projected annual total for 2026 at this rate: 14 billion (a 14-fold year-over-year increase)

GitHub Actions compute minutes: 5 billion minutes per week in 2023 → 10 billion in 2025 → 21 billion minutes in a week in early 2026

If you're a GitHub infrastructure engineer, the comparison between your monitoring dashboard in 2025 and 2026 would probably leave you speechless.

Throughout all of 2025, GitHub processed around 1 billion code commits. That number itself is massive, the result of years of platform growth. But by 2026, the *weekly* commit volume reached 275 million. Doing the math—if this pace continues for the whole year, the total commits for 2026 would be close to 14 billion, a full 14 times the total for all of 2025.
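The projection above is simple arithmetic and can be checked directly. The sketch uses only the two figures quoted in the article and assumes the weekly rate stays flat for 52 weeks.

```python
TOTAL_COMMITS_2025 = 1_000_000_000   # approximate, from the article
WEEKLY_COMMITS_2026 = 275_000_000    # from the article
WEEKS_PER_YEAR = 52

# Assumes the weekly rate holds constant for the whole year.
projected_2026 = WEEKLY_COMMITS_2026 * WEEKS_PER_YEAR
growth_factor = projected_2026 / TOTAL_COMMITS_2025

print(f"projected 2026 commits: {projected_2026:,}")
print(f"growth over 2025: {growth_factor:.1f}x")
```

The result, about 14.3 billion commits, matches the article's rounded figure of 14 billion.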

This isn't a smooth growth curve; it's a cliff. The change in GitHub Actions compute minutes is even more telling: 5 billion minutes per week in 2023, doubling to 10 billion in 2025, and then in one week in early 2026, it skyrocketed to 21 billion minutes.

What's submitting code so frantically?

Not human developers.

GitHub's data shows that AI Agents are becoming the most active 'users' on the platform. Claude Code alone now accounts for 4.5% of all commits to public repositories on GitHub: 2.6 million commits per week, up from only 100,000 in late September 2025, roughly a 25-fold increase in three months.

The number of PRs opened by AI Agents is also exploding. In September 2025, AI-generated PRs numbered about 4 million per month. By March 2026, that number jumped to 17 million—more than four times higher in half a year.

A picture might help you understand what this means.

Before, GitHub's "users" were mainly human programmers. They work during the day, sleep at night, rest on weekends. Each commit involves thought, hesitation; their typing speed has limits. System load followed human schedules, with peaks and troughs that could be predicted.

Now, more and more "users" are AI Agents. They don't sleep, don't rest, don't hesitate. One task can spawn multiple parallel Agents. A single Agent can easily commit more code per hour than a real engineer does in a week. More importantly, they're not just committing code; they're constantly creating new repositories—treating repositories as "output artifacts" of a workflow, not a human's "workspace."

GitHub's infrastructure engineers are no longer facing a larger version of the same problem, but a fundamentally different kind of problem.

03 Copilot's Money Isn't Enough to Burn Anymore

Frequent failures are just one side of the problem. GitHub has another, even more troublesome headache—when doing the math, they found they were losing money.

Copilot's original pricing logic was based on a reasonable assumption: users primarily engaged in "assistive completion," each interaction brief, with predictable compute demands. The personal plan at $10/month and the business plan at $19/month, charged per seat, worked well for several years.

Then, Agentic AI arrived.

Agentic workflows and traditional completion are different species. Standard code completion involves linear, predictable requests with short compute cycles. An Agentic coding session might run for hours, spawning multiple parallel threads, performing multi-step reasoning, self-correction, cross-repository refactoring—the token consumption of one session can easily exceed the entire monthly subscription fee of an average user.

GitHub faces a situation where a minority of heavy Agentic users are consuming compute resources worth hundreds of dollars for a monthly fee of a few dollars.

Faced with this, GitHub's reaction was direct—control the flow first, then change the price.

Starting early this year, GitHub implemented two parallel rate-limiting mechanisms for Copilot: session duration caps and weekly usage caps, both calculated based on token consumption multiplied by model compute weight. At the same time, new user registration for some individual Copilot plans was paused.

On June 1st, GitHub completed a more fundamental pricing overhaul: Copilot fully switched to usage-based billing, replacing old plan fees with "AI Credits." 1 AI Credit equals 1 US cent, with usage calculated in real-time based on token consumption.
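A minimal sketch of how weighted token metering and credit conversion might fit together. Only the rule "1 AI Credit equals 1 US cent" comes from the article; the model names, compute weights, and price per token are hypothetical placeholders, and applying the same weighting to billing as to the usage caps is an assumption.

```python
# All weights, prices, and model names below are hypothetical placeholders.
MODEL_WEIGHT = {"light-model": 1.0, "heavy-model": 4.0}
CENTS_PER_MILLION_WEIGHTED_TOKENS = 200.0

# From the article: 1 AI Credit equals 1 US cent.
CENTS_PER_CREDIT = 1.0


def weighted_tokens(tokens: int, model: str) -> float:
    """Token consumption multiplied by the model's compute weight."""
    return tokens * MODEL_WEIGHT[model]


def credits_used(tokens: int, model: str) -> float:
    """Convert weighted token usage into AI Credits at the assumed price."""
    cents = weighted_tokens(tokens, model) / 1_000_000 * CENTS_PER_MILLION_WEIGHTED_TOKENS
    return cents / CENTS_PER_CREDIT


# One hypothetical agentic session: 3 million tokens on the heavier model.
session_credits = credits_used(3_000_000, "heavy-model")
print(f"{session_credits:.0f} credits = ${session_credits / 100:.2f}")
```

With these invented numbers, a single session costs more than twice the old $10 monthly personal plan, which is the mismatch that usage-based billing is meant to close.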

The era of per-seat pricing has reached its end in the face of Agentic AI.

This shift isn't just GitHub's headache. It's a collective pricing crisis the entire AI tool industry is experiencing in 2026—when AI starts replacing humans in executing entire workflows, not just "assisting" human work, all subscription logic based on "per user per month" becomes unsustainable.

04 30 Times, Not 10 Times

Back to the infrastructure problem. How does GitHub actually plan to handle this "14-fold growth"?

A detail here illustrates the severity of the situation:

In late December 2025, Agentic workflows suddenly began accelerating. GitHub's engineers realized that planning for 10x growth wasn't enough. By February 2026, after that major outage, GitHub announced it needed to redesign its architecture for 30 times today's scale.

Not scaling, but redesigning.

The difference between these two words is significant. Scaling is adding more machines, more memory to existing databases—same direction, just bigger. Redesigning means the underlying architectural assumptions will fail systematically at 30x scale, forcing a fundamental rethinking of service decomposition, data flow, and failure isolation from the ground up.

GitHub's disclosed specific directions include decoupling critical services to prevent cascading failures, introducing backpressure and traffic degradation capabilities, deploying independent hosts for hotspot services, eliminating single points of failure, and implementing more robust change management—to avoid operations like "changing cache TTL from 12 hours to 2 hours" going live without sufficient load testing.
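One of the directions listed above, backpressure, can be sketched as a bounded queue that refuses new work once it is full instead of letting a backlog grow without limit (a simple form of backpressure combined with load shedding). This is a generic illustration of the technique, not GitHub's implementation; the class name `BoundedJobQueue` and the capacity are hypothetical.

```python
import queue


class BoundedJobQueue:
    """Accept jobs up to a fixed capacity; reject the rest so callers can back off."""

    def __init__(self, max_pending: int):
        self._jobs = queue.Queue(maxsize=max_pending)
        self.rejected = 0

    def submit(self, job) -> bool:
        """Return True if the job was queued, False if it was shed."""
        try:
            self._jobs.put_nowait(job)
            return True
        except queue.Full:
            self.rejected += 1
            return False

    def pending(self) -> int:
        return self._jobs.qsize()


# Hypothetical burst: five jobs arrive at a queue with room for three.
q = BoundedJobQueue(max_pending=3)
results = [q.submit(f"cache-rewrite-{i}") for i in range(5)]
print(results, "pending:", q.pending(), "rejected:", q.rejected)
```

The design choice is that a rejected request fails fast and visibly at the edge, rather than piling up behind shared infrastructure and spreading the overload to other services.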

It's worth noting GitHub isn't alone.

Stripe has already encountered issues with AI Agents creating accounts in bulk; AWS is building Agent-specific identity systems, logging systems, and production control mechanisms. These moves aren't precautionary; signals have already appeared on their monitoring dashboards that they had to address.

GitHub was just the first to be transfixed—because it's at the very core of the AI toolchain.

05 Code Repositories, Becoming AI's Exhaust Pipe

Stop and think about the nature of this whole thing.

What is GitHub? The most intuitive answer: it's where programmers store code. But on a deeper level, it's the infrastructure for human software collaboration—commits are the tracks of collaboration, PRs are containers for discussion, Issues are records of intent, Actions are pipelines for execution. The entire system was designed for human work rhythms, thought processes, and collaborative patterns.

AI Agents have changed all that.

When an AI Agent can commit code hundreds of times a day, each "commit" lacking human thought and trade-off, just being a step in a task loop—is a code repository still a "container for collaboration"?

When AI tools automatically generate repos, automatically open PRs, automatically run CI, automatically merge—are developers still the primary actors in this process, or have they devolved into "reviewers" or even "bystanders"?

GitHub's CTO described this crisis as "rapid load growth." But this term likely understates the essence—this isn't just quantitative growth; it's a qualitative change in usage. In the old model, GitHub was a "developer's tool." In the new model, GitHub is becoming "AI's exhaust pipe," an output channel for automated workflows.

What this means for GitHub is still an open question. Scaling 30x can solve the traffic problem, but it can't solve the redefinition of the business model, nor can it answer the identity question of "who is my real user."

One recent development is telling. After the outages, GitHub published a flurry of engineering blog posts describing the root cause of each incident in great detail, with a surprisingly high level of transparency. Some see this as GitHub actively building trust; others see it as trading transparency for the developer community's patience, because the upcoming refactoring period will bring more instability.

A platform, after being transfixed by its own success, needs to tear itself apart and rebuild—and that process itself is a test of whether it can hold on.

On the night of February 9th, that engineer waiting for a PR to merge probably eventually saw the green light. But they might not have realized that the outage that made them wait wasn't just a GitHub accident; it was a signal—a sound announcing the entire software development industry's entry into a new era.

This article is from WeChat Official Account "GeekPark" (ID: geekpark), author: Yu Hang Yuan

Related Questions

Q: What was the immediate cause of GitHub's major outage on February 9th, as detailed in the article?

A: The immediate cause was a configuration change that reduced the refresh time for a user settings cache from 12 hours to 2 hours. This compressed cache rewriting, creating a 'cache rewrite storm' that overloaded asynchronous job queues, crashed shared infrastructure, and ultimately exhausted platform connections.

Q: According to the article, what is the fundamental structural reason behind GitHub's series of outages in early 2026?

A: The fundamental structural reason is that GitHub's infrastructure was not designed for the nature and scale of traffic from AI Agents. This new workload leads to rapid load growth, causes cascading failures due to tight service coupling, and overwhelms systems that lack protection against abnormal client traffic patterns.

Q: How did the nature of GitHub's primary 'users' change, contributing to its infrastructure crisis?

A: GitHub's most active 'users' are increasingly AI Agents, not human developers. Unlike humans, AI Agents operate continuously, submit code at an extremely high rate (exceeding a human's weekly output per hour), and treat repositories as disposable output channels rather than collaborative workspaces, creating an unpredictable and massive load.

Q: Why did GitHub change its Copilot pricing model from a per-seat subscription to a usage-based 'AI Credits' system?

A: The old per-seat pricing became unsustainable with the rise of Agentic AI workflows. These sessions consume vastly more computational resources (tokens) than traditional code completion. A few heavy Agentic users could consume hundreds of dollars worth of resources for a few dollars in subscription fees, forcing GitHub to adopt a pay-per-use model.

Q: What does the article suggest is the deeper, symbolic shift occurring as AI Agents become dominant on platforms like GitHub?

A: The article suggests a profound shift in purpose: GitHub is transitioning from being a 'container for human collaboration' (tracking intent, discussion, and execution) to becoming an 'AI exhaust pipe', a mere output channel for automated workflows. This challenges its core identity and the design of its systems, which were built for human rhythms and collaboration.
