Computing Power Crisis: Google Quietly Imposes Usage Caps on Meta for Gemini

marsbit · Published 2026-06-29 · Last updated 2026-06-29

Summary

Google has quietly capped Meta's access to its Gemini AI models since around March because surging demand is overwhelming its computing infrastructure, according to a Financial Times report. The limits, which remain in place, have disrupted and delayed several of Meta's internal AI projects, forcing the social media giant to ration AI usage and improve efficiency. This reflects a broader industry-wide shortage of AI inference capacity as companies deploy more chatbots and AI agents. Google CEO Sundar Pichai acknowledged that compute constraints are limiting cloud revenue growth. In response, Google recently signed a compute leasing deal with SpaceX worth $920 million per month to expand capacity. The restrictions have accelerated Meta's shift toward its own AI models, such as Muse Spark, to reduce dependence on external providers like Google. Other Google clients also face limits, but Meta was hit hardest because of the scale of its demand. The situation highlights how the AI infrastructure bottleneck has shifted from model training to inference, requiring massive new capital investment to resolve.

Written by: Xu Chao

The supply-demand imbalance in artificial intelligence infrastructure is intensifying among the world's leading technology companies. According to people familiar with the matter, Google told Meta around March this year that it could not meet Meta's full demand for Gemini computing power and imposed usage caps on the social media giant. Even the world's largest AI service provider is struggling to cope with surging demand for computing power.

According to a report by the Financial Times, the restrictions remain in effect and have already disrupted and delayed several internal AI projects at Meta. As a result, Meta has instructed employees to use AI computing power more efficiently and is pushing for more careful internal accounting of AI tokens. Both Google and Meta declined to comment on the matter.

This situation has forced Google to accelerate its expansion efforts. Earlier this month, Google signed a computing power leasing agreement with Elon Musk's SpaceX worth $920 million per month. Google CEO Sundar Pichai admitted during the Q1 earnings call: "We are indeed facing constraints in computing power recently; if we could meet the demand, cloud business revenue would be higher."

Meta is not alone. Multiple sources pointed out that other Google enterprise customers are also subject to varying degrees of restrictions, with Meta being the most affected due to its exceptionally large demand. This incident highlights the explosive growth of AI inference workloads, which has become one of the industry's biggest challenges.

Computing Power Bottleneck Under Persistent Pressure, Major Clients Bear the Brunt

Despite hundreds of billions of dollars invested by major tech companies in chips, data centers, and power supply, AI computing power supply still struggles to keep pace with demand growth.

Google's Q1 cloud business revenue surpassed $20 billion for the first time, and its backlog of signed but undelivered cloud contracts nearly doubled from the previous quarter, exceeding $460 billion. Pichai said plainly that computing power constraints will persist in the near term.

Against this backdrop, the impact on Meta is particularly pronounced. Sources indicate that heavy demand from major enterprise clients like Meta directly pushed Google to accelerate its search for external sources of computing power. As enterprises deploy chatbots, coding assistants, and AI agents on a large scale, inference workloads (the computing power consumed when trained models perform tasks in real-world applications) are becoming a core bottleneck for the industry.

Meta's Internal Projects Hindered, Accelerates Shift to In-House Models

Meta uses Gemini extensively in-house for platform security review (including identifying fraudulent content and removing harmful information), customer service and advertising-assistance chatbots, and some internal workflows and code development. It also uses other models, such as Anthropic's Claude.

According to sources, Meta initially chose Gemini because its performance surpassed that of the company's own open-source Llama model. However, as the computing power restrictions have tightened, Meta has been accelerating its migration to in-house models. Multiple sources stated that Meta has recently begun prioritizing its newly launched Muse Spark model, which is believed to be competitive with Gemini in performance and could help reduce reliance on external models.

Meta CEO Mark Zuckerberg has continued to increase investment in AI talent and infrastructure, aiming to build what he calls "personal super intelligence." Unlike Google, Meta does not have a cloud business. It is accelerating construction of its own data centers and has pledged a cumulative investment of $600 billion in the United States by 2028.

Google Expands Capacity via SpaceX, Industry Seeks Solutions

Faced with computing power pressure, Google signed a $920 million per month computing power leasing agreement with SpaceX this month to bridge the infrastructure gap. AI lab Anthropic also reached a similar agreement with SpaceX last month.

Google's move to impose restrictions on Meta offers a rare glimpse of the real pressure the world's top AI service providers face in allocating computing power. The infrastructure bottleneck across the AI industry is now spreading from the training side to the inference side. Resolving the supply-demand imbalance still depends on a new round of large-scale capital investment.

Related Questions

Q: According to the article, what limitation did Google impose on Meta regarding Gemini usage?

A: Google reportedly informed Meta around March that it could not meet all of Meta's Gemini computing power demands and imposed usage limits on the AI model for the social media giant.

Q: What consequence did the reported Gemini usage limit have on Meta's internal operations?

A: The limitation has interfered with and delayed several of Meta's internal AI projects, leading the company to instruct employees to be more efficient with AI computing power and to carefully manage their use of AI tokens.

Q: Which company did Google recently sign a major compute rental agreement with to address its capacity constraints?

A: Google recently signed a compute rental agreement with Elon Musk's SpaceX, worth $920 million per month, to help expand its infrastructure capacity.

Q: How is Meta responding to the challenges posed by the reliance on external AI models like Gemini?

A: Meta is accelerating its shift towards its own self-developed models. It is prioritizing the promotion of its new Muse Spark model, which is considered competitive with Gemini, to reduce dependency on external providers.

Q: What does the article highlight as a major industry-wide bottleneck that is shifting from one area to another?

A: The article states that the infrastructure bottleneck in the AI industry is shifting from the training side to the inference side, as explosive growth in AI inference workloads becomes a core challenge for the sector.

