Meta: Can Afford Trillion-Dollar Computing Power, But Can't Keep Key People

marsbit · Published on 2026-02-28 · Last updated on 2026-02-28

Abstract

Meta's AI Ambition: A $135 Billion Bet on Chips, But Losing Key Talent. In July 2025, Meta recruited top AI infrastructure engineer Ruoming Pang from Apple with a compensation package worth over $200 million. However, just seven months later, he left for OpenAI, forfeiting much of his unvested equity. This high-profile departure is part of a broader trend of key talent leaving Meta's AI division, including Chief AI Scientist Yann LeCun and other senior figures. The exodus is largely attributed to the fallout from the Llama 4 model's release in April 2025. The model was later revealed to have been benchmarked misleadingly, using different model versions to optimize scores on different tests, which severely damaged trust within the developer community. This scandal led CEO Mark Zuckerberg to lose confidence in the team, resulting in a major reorganization. He appointed 28-year-old Scale AI CEO Alexandr Wang as Chief AI Officer, who now oversees the new Meta Superintelligence Lab (MSL). The planned flagship model, Llama 4 Behemoth, was indefinitely delayed. Compounding these software issues, Meta also canceled its most advanced in-house AI training chip project, a critical part of its plan to reduce reliance on Nvidia. This failure has triggered a panic-buying spree. In January 2026, Meta announced a capital expenditure budget of $115-$135 billion, nearly double the previous year's. In February, within ten days, it signed massive, multi-year chip deals: a multi-billion dollar agreement with Nvid...

Written by: Ada, Shenchao TechFlow

Ruoming Pang hadn't even warmed his seat at Meta before he left.

In July 2025, Mark Zuckerberg lured Pang, one of the most sought-after Chinese engineers in AI infrastructure, away from Apple with a multi-year compensation package totaling over $200 million. Pang was placed in Meta's Superintelligence Lab, responsible for building the infrastructure for the next generation of AI models.

Seven months later, OpenAI poached him.

According to The Information, OpenAI waged a recruitment campaign for Pang that lasted several months. Although Pang had told colleagues he was "very happy working at Meta," he ultimately chose to leave. Bloomberg reported that his compensation package at Meta was tied to milestones, and leaving early meant forfeiting the majority of unvested equity.

$200 million couldn't buy 7 months of loyalty.

This isn't a simple story of a job change.

One Person's Departure, A Signal for Many

Pang wasn't the first to leave.

Last week, Mat Velloso, Product Lead for the Developer Platform at Meta's Superintelligence Lab, also announced his departure. He had left Google DeepMind to join Meta in July 2025 and stayed for less than 8 months. Going further back, in November 2025, Turing Award winner and Chief AI Scientist Yann LeCun, who had been at Meta for 12 years, announced that he was leaving to start a company working on the "world model" he had long championed. Russ Salakhutdinov, a leading protégé of Geoffrey Hinton and Meta's VP of Generative AI Research, also recently announced his exit.

To understand the AI brain drain at Meta, one must first understand how damaging the Llama 4 incident was.

In April 2025, Meta proudly launched the Scout and Maverick models of the Llama 4 series. The officially published figures were spectacular, claiming that the models comprehensively outperformed GPT-4.5 and Claude Sonnet 3.7 on core benchmarks such as MATH-500 and GPQA Diamond.

However, this flagship model, which carried Meta's ambitions, was quickly exposed in independent third-party blind tests run by the open-source community, which showed a steep gap between its actual generalization and reasoning capabilities and its advertised performance. Faced with strong community skepticism, Chief AI Scientist Yann LeCun eventually admitted that the team had "used different model versions to run different test sets to optimize the final score" during the testing phase.

In the rigorous world of AI academia and engineering, this crosses an unforgivable red line. In other words, the team trained Llama 4 to be a "small-town exam crammer" only good at past exam papers, not a true "top student" with cutting-edge intelligence. For math tests, they showed the math whiz version; for programming tests, they showed the programming whiz version. Each individual test looked strong, but it wasn't the same model.

In AI academia, this is called "cherry-picking"; in the world of exam-oriented education, it's called "having a ringer take the test."
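To see why reporting each benchmark from a different model version inflates the headline numbers, consider a minimal sketch. The variant names and scores below are invented purely for illustration and are not real Llama 4 results; the point is only that the per-benchmark maximum across variants can never be lower than what any single variant achieves.

```python
# Hypothetical scores for illustration only; NOT real Llama 4 results.
SCORES = {
    "variant_math": {"math": 90, "coding": 60, "reasoning": 65},
    "variant_code": {"math": 62, "coding": 88, "reasoning": 64},
    "variant_general": {"math": 70, "coding": 70, "reasoning": 72},
}
BENCHMARKS = ["math", "coding", "reasoning"]


def cherry_picked(scores, benchmarks):
    """Best score per benchmark, each possibly taken from a different variant."""
    return {b: max(v[b] for v in scores.values()) for b in benchmarks}


def best_single_model(scores, benchmarks):
    """The one variant with the highest total across all benchmarks."""
    name = max(scores, key=lambda n: sum(scores[n][b] for b in benchmarks))
    return name, scores[name]


def mean_score(result, benchmarks):
    """Average score over the listed benchmarks."""
    return sum(result[b] for b in benchmarks) / len(benchmarks)


reported = cherry_picked(SCORES, BENCHMARKS)
honest_name, honest = best_single_model(SCORES, BENCHMARKS)
gap = mean_score(reported, BENCHMARKS) - mean_score(honest, BENCHMARKS)

print("cherry-picked:", reported)
print("best single model:", honest_name, honest)
print("inflation in mean score:", round(gap, 2))
```

In this toy data the cherry-picked row looks strong on every test, yet no single variant comes close to it on all three at once, which is the gap independent testers run into when they evaluate the one model that was actually released.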

For Meta, which had always styled itself the "beacon of open source," the scandal destroyed trust, the most valuable asset in its developer ecosystem. The direct cost was that Zuckerberg "completely lost confidence" in the engineering integrity of the original GenAI team, which triggered a series of outside executives being brought in and the sidelining of the core infrastructure department.

He spent $14.3 to $15 billion to acquire a 49% stake in the data annotation company Scale AI, parachuting the 28-year-old Scale AI CEO Alexandr Wang into the role of Meta's Chief AI Officer and establishing the Meta Superintelligence Lab (MSL). Turing Award winner LeCun had to report to this 28-year-old in the new structure. In October, Meta cut about 600 positions in MSL, including members of the FAIR research department created by LeCun.

And the flagship model Llama 4 Behemoth, originally planned for release in the summer of 2025, was repeatedly delayed, pushed from summer to fall, and eventually shelved indefinitely.

Meta instead began developing a next-generation text model codenamed "Avocado" and an image/video model codenamed "Mango." According to reports, Avocado's goal is to compete with GPT-5 and Gemini 3 Ultra. Originally scheduled for delivery by the end of 2025, it was postponed to Q1 2026 after falling short in performance testing and training optimization. Meta is considering releasing it as a closed-source model, abandoning the open-source tradition of the Llama series.

Meta made two fatal mistakes with its AI models. The first was benchmark fraud, which directly destroyed the community's trust. The second was forcing FAIR, a basic research department whose work takes a decade of patient effort, into a product organization chasing quarterly KPIs. Together, these two mistakes are the root cause of the current brain drain.

In-House Chips: Another Broken Leg

People are fleeing, and chips are also a problem.

According to The Information, Meta last week canceled its project to develop its most advanced in-house AI training chip.

Meta's in-house chip plan is called MTIA (Meta Training and Inference Accelerator). The company's initial roadmap was ambitious: MTIA v4 codenamed "Santa Barbara," v5 "Olympus," and v6 "Universal Core" were planned for delivery between 2026 and 2028. Olympus was designed as Meta's first chip based on a 2nm chiplet architecture, aiming to cover both high-end model training and real-time inference, ultimately replacing Nvidia's role in Meta's training clusters.

Now, this most advanced training chip has been canceled.

Meta has not been without progress: MTIA has had some success on the inference side. The inference chip MTIA v3, codenamed "Iris," has been deployed on a large scale in Meta's data centers, mainly for the recommendation systems behind Facebook Reels and Instagram, reportedly reducing the total cost of ownership by 40% to 44%. But inference and training are two different things. Inference is running the model; training is teaching the model. Meta can make its own inference chips, but it can't build training chips that can compete head-on with Nvidia.

Nor is this the first such setback. In 2022, Meta attempted to develop its own inference chip. After it failed in a small-scale deployment, Meta abandoned the project outright and placed a large order with Nvidia.

The setback in developing its own chips directly accelerated Meta's external buying spree.

$135 Billion in Panic Buying

In January 2026, Meta announced its capital expenditure budget for the year was $115 to $135 billion, almost double last year's $72.2 billion. The bulk of this money will be spent on chips.
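The "almost double" claim can be checked directly from the figures quoted above. The short calculation below is only a sanity check on the article's own numbers (in billions of dollars); the variable names are illustrative.

```python
# Figures quoted in the article, in billions of US dollars.
LAST_YEAR_CAPEX = 72.2
BUDGET_LOW, BUDGET_HIGH = 115.0, 135.0


def growth_multiple(new, old):
    """How many times larger the new budget is than the old one."""
    return new / old


low_multiple = growth_multiple(BUDGET_LOW, LAST_YEAR_CAPEX)
high_multiple = growth_multiple(BUDGET_HIGH, LAST_YEAR_CAPEX)
mid_multiple = growth_multiple((BUDGET_LOW + BUDGET_HIGH) / 2, LAST_YEAR_CAPEX)

print(f"low end:  {low_multiple:.2f}x")
print(f"midpoint: {mid_multiple:.2f}x")
print(f"high end: {high_multiple:.2f}x")
```

The low end of the range works out to roughly 1.6 times last year's spending and the high end to roughly 1.9 times, so "almost double" describes the top of the range rather than the whole of it.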

Within 10 days, three major deals landed:

February 17: Meta signed a multi-year, cross-generational strategic cooperation agreement with Nvidia. Meta will deploy "millions" of Nvidia Blackwell and next-generation Vera Rubin GPUs, plus standalone Grace CPUs. Analysts estimate the deal is worth tens of billions of dollars. Meta became the first supercomputing customer globally to deploy Nvidia's standalone Grace CPUs on a large scale.

February 24: Meta signed a multi-year chip deal with AMD worth $60 to $100 billion. Meta will purchase AMD's latest MI450 series GPUs and sixth-generation EPYC CPUs. As part of the deal, AMD issued warrants to Meta for up to 160 million shares of common stock, representing about 10% of AMD, at a price of $0.01 per share, vesting in batches based on delivery milestones.

February 26: According to The Information, Meta signed a multi-year deal with Google worth billions of dollars to rent Google Cloud's TPU chips to train and run its next-generation large language model. Simultaneously, the two sides are also discussing Meta directly purchasing TPUs starting in 2027 for deployment in its own data centers.

A social media company placed orders potentially totaling over a hundred billion dollars with three chip suppliers in 10 days.

This isn't diversification. This is panic buying.

The Three Layers of Logic Behind the Computing Power Panic

Why is Meta in such a hurry?

First, in-house chips are no longer an option. Canceling the most advanced training chip project means Meta must rely on external purchases to meet its AI training needs for the foreseeable future. MTIA chips on the inference end can handle mature businesses like recommendation systems, but training cutting-edge models like Avocado, which aims to rival GPT-5, requires Nvidia or hardware of equivalent grade.

Second, competitors won't wait. OpenAI has secured massive resources from backers ranging from Microsoft and SoftBank to UAE sovereign wealth funds. Anthropic has locked in supplies of 1 million TPU and Trainium chips each from Google and Amazon. Google's Gemini 3 was trained entirely on TPUs. If Meta can't secure enough computing power, it won't even keep its ticket to the race.

Third, and perhaps most fundamentally, Zuckerberg needs to use "purchasing power" to compensate for the lack of "R&D power." The Llama 4 debacle, the loss of core talent, and the setback in developing its own chips—these three things combined have made Meta's AI narrative fragile in front of Wall Street. Signing major deals with Nvidia, AMD, and Google at least sends a signal: We have money, we are buying, we haven't given up.

Meta's current strategy is: if you can't figure out the software, throw money at the hardware; if you can't keep the people, buy the chips. But the AI race isn't a game you can win by writing checks. Computing power is a necessary condition, not a sufficient one. Without a top-tier model team and a clear technical roadmap, even the largest stockpile of chips is just expensive inventory sitting in warehouses.

The Buyer's Dilemma

Looking back at Meta's three deals in February, an interesting detail was missed by most.

Meta is buying current Blackwell and future Vera Rubin from Nvidia; the deal with AMD is for MI450 and future MI455X; it's renting current Ironwood TPUs from Google, planning to buy them directly next year.

Three suppliers, three completely different hardware architectures and software ecosystems.

This means Meta will have to juggle three completely different underlying ecosystems: Nvidia's CUDA, AMD's ROCm, and Google's XLA/JAX. A multi-vendor strategy can spread supply-chain risk and push down hardware procurement premiums, but it brings sharply rising engineering complexity.
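To make that engineering burden concrete, here is a minimal sketch of the kind of abstraction layer a cross-platform training framework needs. Everything in it is hypothetical: the `Backend` interface, the three backend classes, and `training_step` are illustrative names, and the backends are pure-Python stand-ins rather than calls into CUDA, ROCm, or XLA. It does not describe Meta's actual software.

```python
from abc import ABC, abstractmethod


class Backend(ABC):
    """Hypothetical interface that training code is written against."""

    name: str

    @abstractmethod
    def all_reduce(self, shards):
        """Sum gradient shards element-wise across devices."""


class _ReferenceBackend(Backend):
    # Pure-Python stand-in. A real backend would invoke the vendor's own
    # collective-communication library here, which is where the work lies.
    def all_reduce(self, shards):
        return [sum(values) for values in zip(*shards)]


class CudaBackend(_ReferenceBackend):
    name = "cuda"


class RocmBackend(_ReferenceBackend):
    name = "rocm"


class XlaBackend(_ReferenceBackend):
    name = "xla"


REGISTRY = {cls.name: cls for cls in (CudaBackend, RocmBackend, XlaBackend)}


def get_backend(name):
    """Look up a backend by name, failing loudly on unsupported hardware."""
    try:
        return REGISTRY[name]()
    except KeyError:
        raise ValueError(f"unsupported backend: {name!r}") from None


def training_step(backend, weights, grad_shards, lr=0.1):
    """One SGD update; the model code never mentions a specific vendor."""
    grads = backend.all_reduce(grad_shards)
    return [w - lr * g for w, g in zip(weights, grads)]


for backend_name in ("cuda", "rocm", "xla"):
    updated = training_step(
        get_backend(backend_name), [1.0, 2.0], [[1.0, 2.0], [3.0, 4.0]], lr=0.5
    )
    print(backend_name, updated)
```

The sketch is trivial precisely because the three backends share one implementation. In practice each method would have to be written, tuned, and debugged separately for every vendor's kernels, memory model, and interconnect, and kept numerically consistent across all three.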

This is Meta's most fatal soft spot right now. Making a trillion-parameter model train efficiently on these three sets of hardware with completely different underlying programming models requires not just engineers who understand CUDA, but architects who can build cross-platform training frameworks from scratch.

There are probably no more than 100 such people in the world. Ruoming Pang was one of them.

Spending $100 billion to buy the world's most complex hardware combination while losing the very minds who can harness that hardware: this is the most surreal picture in Zuckerberg's grand gamble.

Zuckerberg's Gamble

Zooming out a bit, Zuckerberg's path in AI over the past 18 months is strikingly similar to his pattern when he went all in on the metaverse:

See the trend, invest heavily, recruit aggressively, encounter setbacks, pivot strategy abruptly, invest heavily again.

2021 to 2023 was the metaverse, resulting in annual losses of tens of billions of dollars and a stock price that eventually fell from $380 to $88. 2024 to 2026 is AI: the same spending regardless of cost, the same frequent organizational reshuffles, and the same narrative of "trust me, I have a vision."

The difference is that this time, the AI trend is indeed much more substantial than the metaverse. And Meta has money to burn; its advertising business generates ample cash flow. In Q4 2025, Meta's revenue was $59.9 billion, a year-on-year increase of 24%.

The problem is that money can buy chips, buy computing power, and even buy people to fill seats, but it can't buy people who stay.

Pang chose OpenAI, Russ Salakhutdinov chose to leave, LeCun chose to start his own company.

Zuckerberg's current bet is that by buying enough chips, building enough large data centers, and spending enough money, he will eventually find or cultivate the people who can use these resources.

This bet might work. Meta is, after all, one of the richest tech companies in the world, with over $100 billion in operating cash flow as its sturdiest moat. From OpenAI to Anthropic, from Google to other competitors, Meta is continuously poaching people. According to a Qbitai report, nearly 40% of the 44 people in Meta's Superintelligence team are from OpenAI.

But the cruel part of the AI race is that computing power reserves, talent lists, and model performance are all public. The Llama 4 benchmark fraud incident proved that in this industry, you can't maintain a lead with slide decks and PR.

The market ultimately only recognizes one thing: Is your model good enough?

Position on the Food Chain

As the AI arms race enters 2026, the ranking of the food chain has started to become clear:

At the top are OpenAI and Google. OpenAI has the strongest models, the largest user base, and the most aggressive financing. Google has complete vertical integration with its own chips, its own models, and its own cloud infrastructure. Anthropic follows closely behind, securing a spot in the first tier with the product strength of its Claude model and dual-line computing power supply from Google and Amazon.

Meta? It has spent the most money, signed the most chip contracts, done the most frequent organizational reshuffles, but so far, it has not produced a cutting-edge model that can convince the market.

Meta's AI story is somewhat like Yahoo in 2005. At that time, Yahoo was also one of the richest internet companies and was also acquiring and spending frantically, but it just couldn't make a search engine as good as Google's. Money isn't omnipotent. Zuckerberg needs to figure out what Meta really wants to do in AI, not just buy whatever is hot.

Of course, it's still too early to write Meta's obituary. 3.58 billion monthly active users, $59.9 billion in quarterly revenue, the world's largest social dataset—these are assets that any competitor would find difficult to replicate.

If the next-generation model codenamed Avocado can be delivered as scheduled in 2026 and return to the first tier, all of Zuckerberg's spending and restructuring will be packaged as "the strategic courage to turn the tide." But if it underperforms again, then this $135 billion will only have bought warehouses of silicon wafers that heat up when powered on.

After all, the AI arms race in Silicon Valley has no shortage of super buyers waving checks. What's lacking are the people who know how to use this computing power to forge the future.

Related Questions

Q: Why did Ruoming Pang leave Meta after only 7 months, and what does his departure signify about Meta's AI talent retention?

A: Ruoming Pang left Meta to join OpenAI after a months-long recruitment effort, despite initially expressing satisfaction with his role. His departure, along with those of other key figures like Yann LeCun and Russ Salakhutdinov, signals deeper issues within Meta's AI division, including the loss of trust from the Llama 4 benchmark scandal, organizational restructuring, and a shift away from foundational research towards product-driven goals, all of which have left talent disillusioned.

Q: What were the consequences of Meta's Llama 4 benchmark manipulation scandal?

A: The Llama 4 benchmark manipulation, where Meta used different model versions to optimize test scores, severely damaged its credibility in the AI community. This led to a loss of trust, internal restructuring including the appointment of a new Chief AI Officer, layoffs, the indefinite postponement of Llama 4 Behemoth, and a strategic pivot towards developing new models like 'Avocado' and 'Mango', potentially as closed-source projects.

Q: Why did Meta cancel its most advanced in-house AI training chip project, and how does this impact its strategy?

A: Meta canceled its advanced MTIA training chip project (codenamed 'Olympus') due to development challenges, as it could not create a chip competitive with NVIDIA's for training cutting-edge models. This failure forced Meta into a 'panic buying' spree, signing multi-year, multi-billion dollar deals with NVIDIA, AMD, and Google Cloud for GPUs, CPUs, and TPUs to secure the necessary compute power for AI training, revealing a reliance on external suppliers.

Q: What is the fundamental dilemma behind Meta's strategy of spending over $100 billion on compute hardware while losing key AI talent?

A: Meta's dilemma is that while it can use its massive cash flow to purchase immense compute power (e.g., deals with NVIDIA, AMD, Google), it is simultaneously losing the rare, top-tier AI engineers and architects needed to effectively utilize this complex, multi-architecture hardware. This creates a situation where expensive hardware may become underutilized 'inventory' without the human expertise to build and train state-of-the-art models on it.

Q: How does Meta's current position in the AI 'food chain' compare to its competitors like OpenAI and Google?

A: Meta is not in the top tier of the AI food chain. OpenAI and Google lead with the strongest models, vast user bases, and vertical integration (e.g., Google's TPUs). Anthropic is also a strong contender. Despite spending the most money on compute and undergoing frequent reorganizations, Meta has yet to deliver a frontier model that convincingly competes with these leaders, placing it in a position where it must prove itself with its upcoming 'Avocado' model to be considered a top player.
