Written by: Ada, Shenchao TechFlow
Ruoming Pang hadn't even warmed his seat at Meta before he left.
In July 2025, Mark Zuckerberg snatched one of the most sought-after Chinese engineers in AI infrastructure from Apple's grasp with a multi-year compensation package worth over $200 million. Pang was placed in Meta's Superintelligence Lab, responsible for building the infrastructure for the next generation of AI models.
Seven months later, OpenAI poached him.
According to The Information, OpenAI waged a recruitment campaign for Pang that lasted several months. Although Pang had told colleagues he was "very happy working at Meta," he ultimately chose to leave. Bloomberg reported that his compensation package at Meta was tied to milestones, and leaving early meant forfeiting the majority of unvested equity.
$200 million couldn't buy 7 months of loyalty.
This isn't a simple story of a job change.
One Person's Departure, A Signal for Many
Pang wasn't the first to leave.
Last week, Mat Velloso, Product Lead for the Developer Platform at Meta's Superintelligence Lab, also announced his departure. He had left Google DeepMind to join Meta in July last year and stayed for less than 8 months. Going further back, in November 2025, Turing Award winner and Chief AI Scientist Yann LeCun, who had been at Meta for 12 years, announced his departure to start a company, working on the "world model" he had long championed. Geoffrey Hinton's key disciple and Meta's VP of Generative AI Research, Russ Salakhutdinov, also recently announced his exit.
To understand the AI brain drain at Meta, one must first understand how damaging the Llama 4 incident was.
In April 2025, Meta proudly launched the Scout and Maverick models of the Llama 4 series. The numbers in the official report were spectacular, claiming to comprehensively outperform GPT-4.5 and Claude Sonnet 3.7 on core benchmarks such as MATH-500 and GPQA Diamond.
However, this flagship model, which carried Meta's ambitions, was quickly exposed in independent third-party blind tests run by the open-source community, which revealed a cliff-like gap between its actual generalization and reasoning abilities and its advertised performance. Faced with strong skepticism from the community, Chief AI Scientist Yann LeCun eventually admitted that during testing the team had "used different model versions to run different test sets to optimize the final score."
In the rigorous world of AI academia and engineering, this crosses an unforgivable red line. In plain terms, the team trained Llama 4 into a "test-prep grinder" that is only good at past exam papers, not a genuine "top student" with frontier intelligence. For the math test, they showed the math-whiz version; for the programming test, the programming-whiz version. Each individual result looked strong, but it wasn't the same model.
In AI academia, this is called "cherry-picking"; in the world of exam-oriented education, it's called "having a ringer take the test."
For Meta, which had always prided itself on being the "beacon of open source," this scandal directly destroyed the most valuable asset in its developer ecosystem: trust. The direct cost was that Zuckerberg "completely lost confidence" in the engineering integrity of the original GenAI team, triggering the moves that followed: parachuting in outside executives and sidelining the core infrastructure department.
He spent $14.3 to $15 billion to acquire a 49% stake in the data annotation company Scale AI, parachuting the 28-year-old Scale AI CEO Alexandr Wang into the role of Meta's Chief AI Officer and establishing the Meta Superintelligence Lab (MSL). Turing Award winner LeCun had to report to this 28-year-old in the new structure. In October, Meta cut about 600 positions in MSL, including members of the FAIR research department created by LeCun.
And the flagship model Llama 4 Behemoth, originally planned for release in the summer of 2025, was repeatedly delayed, pushed from summer to fall, and eventually shelved indefinitely.
Meta instead began developing a next-generation text model codenamed "Avocado" and an image/video model codenamed "Mango." According to reports, Avocado's goal is to compete with GPT-5 and Gemini 3 Ultra. Originally scheduled for delivery by the end of 2025, it was pushed to Q1 2026 after it failed internal performance tests and needed further training optimization. Meta is considering releasing it as a closed-source model, abandoning the open-source tradition of the Llama series.
Meta made two fatal mistakes with its AI models. First, benchmark fraud, which directly destroyed the trust of the community. Second, forcing a basic research department like FAIR, which requires a decade of grinding, into a product organization chasing quarterly KPIs. These two things combined are the root cause of the current brain drain.
In-House Chips: Another Broken Leg
People are fleeing, and chips are also a problem.
According to The Information, Meta last week canceled its project to develop its most advanced in-house AI training chip.
Meta's in-house chip plan is called MTIA (Meta Training and Inference Accelerator). The company's initial roadmap was ambitious: MTIA v4 codenamed "Santa Barbara," v5 "Olympus," and v6 "Universal Core" were planned for delivery between 2026 and 2028. Olympus was designed as Meta's first chip based on a 2nm chiplet architecture, aiming to cover both high-end model training and real-time inference, ultimately replacing Nvidia's role in Meta's training clusters.
Now, this most advanced training chip has been canceled.
That's not to say Meta has made no progress; MTIA has had some success on the inference side. The inference chip MTIA v3, codenamed "Iris," has been deployed at scale in Meta's data centers, mainly for the recommendation systems behind Facebook Reels and Instagram, reportedly reducing total cost of ownership by 40% to 44%. But inference and training are two different things. Inference is running the model; training is teaching the model. Meta can make its own inference chips, but it can't build training chips that compete head-on with Nvidia.
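To make the inference-versus-training distinction concrete, here is a minimal PyTorch sketch (an illustrative toy example, not Meta's actual stack). Inference is a single forward pass with no gradients; a training step adds the loss, backward pass, and optimizer update, and at frontier scale those extra steps multiply memory needs and must be synchronized across thousands of accelerators, which is exactly where Nvidia-class hardware still matters.

```python
import torch
import torch.nn as nn

# A toy model standing in for a recommendation or language model
# (illustrative only, not Meta's workload).
model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 10))

# Inference: run the model. A forward pass with no gradients and no
# optimizer state, the kind of workload MTIA's inference chips serve.
with torch.no_grad():
    predictions = model(torch.randn(32, 128))

# Training: teach the model. Forward pass, loss, backward pass, and a
# weight update. Gradients plus optimizer state multiply the memory
# footprint several-fold, and every step must be synchronized across
# the whole cluster when training a frontier model.
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
inputs, targets = torch.randn(32, 128), torch.randint(0, 10, (32,))
loss = nn.functional.cross_entropy(model(inputs), targets)
loss.backward()        # compute gradients
optimizer.step()       # update weights
optimizer.zero_grad()  # clear gradients for the next step
```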
This isn't the first such retreat, either. In 2022, Meta attempted to develop its own inference chip, and after a failed small-scale deployment it abandoned the project outright and placed a large order with Nvidia.
The setback in developing its own chips directly accelerated Meta's external buying spree.
$135 Billion in Panic Buying
In January 2026, Meta announced its capital expenditure budget for the year was $115 to $135 billion, almost double last year's $72.2 billion. The bulk of this money will be spent on chips.
Within 10 days, three major deals landed:
February 17: Meta signed a multi-year, cross-generational strategic cooperation agreement with Nvidia. Meta will deploy "millions" of Nvidia Blackwell and next-generation Vera Rubin GPUs, plus standalone Grace CPUs. Analysts estimate the deal is worth tens of billions of dollars. Meta became the first supercomputing customer globally to deploy Nvidia's standalone Grace CPUs on a large scale.
February 24: Meta signed a multi-year chip deal with AMD worth $60 to $100 billion. Meta will purchase AMD's latest MI450 series GPUs and sixth-generation EPYC CPUs. As part of the deal, AMD issued warrants to Meta for up to 160 million shares of common stock, representing about 10% of AMD, at a price of $0.01 per share, vesting in batches based on delivery milestones.
February 26: According to The Information, Meta signed a multi-year deal with Google worth billions of dollars to rent Google Cloud's TPU chips to train and run its next-generation large language model. Simultaneously, the two sides are also discussing Meta directly purchasing TPUs starting in 2027 for deployment in its own data centers.
A social media company placed orders potentially totaling over a hundred billion dollars with three chip suppliers in 10 days.
This isn't diversification. This is panic buying.
The Three Layers of Logic Behind the Computing Power Panic
Why is Meta in such a hurry?
First, in-house chips are no longer an option. Canceling the most advanced training chip project means Meta must rely on external purchases to meet its AI training needs for the foreseeable future. MTIA chips on the inference end can handle mature businesses like recommendation systems, but training cutting-edge models like Avocado, which aims to rival GPT-5, requires Nvidia or hardware of equivalent grade.
Second, competitors won't wait. OpenAI has secured massive resources from backers ranging from Microsoft and SoftBank to UAE sovereign wealth funds. Anthropic has locked in supplies of 1 million chips each from Google (TPUs) and Amazon (Trainium). Google's Gemini 3 was trained entirely on TPUs. If Meta can't secure enough computing power, it won't even hold onto its ticket to the race.
Third, and perhaps most fundamentally, Zuckerberg needs to use "purchasing power" to compensate for the lack of "R&D power." The Llama 4 debacle, the loss of core talent, and the setback in developing its own chips—these three things combined have made Meta's AI narrative fragile in front of Wall Street. Signing major deals with Nvidia, AMD, and Google at least sends a signal: We have money, we are buying, we haven't given up.
Meta's current strategy is: if you can't figure out the software, throw hardware at it; if you can't keep the people, buy the chips. But the AI race isn't a game you can win by writing checks. Computing power is a necessary condition, not a sufficient one. Without a top-tier model team and a clear technical roadmap, even mountains of chips are just expensive inventory sitting in warehouses.
The Buyer's Dilemma
Looking back at Meta's three deals in February, most people missed an interesting detail.
Meta is buying current Blackwell and future Vera Rubin GPUs from Nvidia; the deal with AMD covers the MI450 and the future MI455X; and it's renting current Ironwood TPUs from Google, with plans to buy them outright next year.
Three suppliers, three completely different hardware architectures and software ecosystems.
This means Meta will have to juggle three completely different underlying ecosystems: Nvidia's CUDA, AMD's ROCm, and Google's XLA/JAX. A multi-vendor strategy can spread supply-chain risk and push down hardware procurement premiums, but it brings exponentially increasing engineering complexity.
This is Meta's most fatal soft spot right now. Making a trillion-parameter model train efficiently on these three sets of hardware with completely different underlying programming models requires not just engineers who understand CUDA, but architects who can build cross-platform training frameworks from scratch.
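To see why, consider a minimal, hypothetical sketch of the "use whichever accelerator this build exposes" idea in PyTorch. It only scratches the surface: in ROCm builds of PyTorch, AMD GPUs reuse the same torch.cuda API, while Google TPUs need the separate torch_xla package and its own runtime, and real cross-vendor training also has to reconcile different collective-communication libraries, custom kernels, and numerics.

```python
import torch

def pick_device():
    """Pick whichever accelerator backend this PyTorch build exposes.

    Illustrative sketch only. NVIDIA GPUs (CUDA) and AMD GPUs (ROCm
    builds) both surface through torch.cuda; Google TPUs require the
    optional torch_xla package and the XLA runtime.
    """
    if torch.cuda.is_available():  # NVIDIA CUDA, or AMD ROCm builds of PyTorch
        return torch.device("cuda")
    try:
        import torch_xla.core.xla_model as xm  # TPU path, optional dependency
        return xm.xla_device()
    except ImportError:
        return torch.device("cpu")

device = pick_device()
model = torch.nn.Linear(1024, 1024).to(device)
x = torch.randn(8, 1024, device=device)
print(model(x).shape, "on", device)
```

Even this trivial device-selection shim hints at the problem: the moment a team drops below the framework into custom kernels, parallelism strategies, and cluster-scale fault tolerance, the three ecosystems diverge completely, and that is the layer where people like Pang operate.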
There are probably no more than 100 such people in the world. Ruoming Pang was one of them.
Spending $100 billion to buy the world's most complex hardware combination while simultaneously losing the brains that can actually command that hardware: this is the most surreal picture in Zuckerberg's grand gamble.
Zuckerberg's Gamble
Zooming out a bit, Zuckerberg's operational path in AI over the past 18 months is strikingly similar to his rhythm when he went all in on the metaverse:
See the trend, invest heavily, recruit aggressively, encounter setbacks, pivot strategy abruptly, invest heavily again.
2021 to 2023 was the metaverse, which produced annual losses in the tens of billions and a stock price that eventually fell from $380 to $88. 2024 to 2026 is AI: the same no-expense-spared spending, the same frequent organizational reshuffles, and the same narrative of "trust me, I have a vision."
The difference is that this time, the AI trend is indeed much more substantial than the metaverse. And Meta has money to burn; its advertising business generates ample cash flow. In Q4 2025, Meta's revenue was $59.9 billion, a year-on-year increase of 24%.
The problem is: money can buy chips, buy computing power, even buy bodies in seats, but it can't buy the people who stay.
Pang chose OpenAI, Russ Salakhutdinov chose to leave, LeCun chose to start his own company.
Zuckerberg's current bet is that by buying enough chips, building enough large data centers, and spending enough money, he will eventually find or cultivate the people who can use these resources.
This bet might work. Meta is, after all, one of the richest tech companies in the world, with over $100 billion in operating cash flow as its sturdiest moat. From OpenAI to Anthropic, from Google to other competitors, Meta is continuously poaching people. According to a Qbitai report, nearly 40% of the 44 people in Meta's Superintelligence team are from OpenAI.
But the cruel part of the AI race is that computing power reserves, talent rosters, and model performance are all public. The Llama 4 benchmark fraud proved that in this industry, you can't hold a lead with slide decks and PR.
The market ultimately only recognizes one thing: Is your model good enough?
Position on the Food Chain
As the AI arms race enters 2026, the ranking on the food chain has started to become clear:
At the top are OpenAI and Google. OpenAI has the strongest models, the largest user base, and the most aggressive financing. Google has complete vertical integration with its own chips, its own models, and its own cloud infrastructure. Anthropic follows closely behind, securing a spot in the first tier with the product strength of its Claude model and dual-line computing power supply from Google and Amazon.
Meta? It has spent the most money, signed the most chip contracts, and reshuffled its organization the most frequently, but so far it has not produced a cutting-edge model that can convince the market.
Meta's AI story is a bit like Yahoo in 2005. Back then, Yahoo was also one of the richest internet companies, also acquiring and spending money frantically, but it just couldn't build a search engine as good as Google's. Money isn't omnipotent. Zuckerberg needs to figure out what Meta actually wants to do in AI, not just buy whatever is hot.
Of course, it's still too early to write Meta's obituary. 3.58 billion monthly active users, $59.9 billion in quarterly revenue, the world's largest social dataset—these are assets that any competitor would find difficult to replicate.
If the next-generation model codenamed Avocado can be delivered as scheduled in 2026 and return to the first tier, all of Zuckerberg's spending and restructuring will be packaged as "the strategic courage to turn the tide." But if it underperforms again, then this $135 billion will only have bought warehouses of silicon wafers that heat up when powered on.
After all, the AI arms race in Silicon Valley has no shortage of super buyers waving checks. What's lacking are the people who know how to use this computing power to forge the future.