The whole world covets NVIDIA's business.
According to NVIDIA's Q4 FY2026 (ending January 2026) earnings report, its GAAP gross margin was as high as 75.2%, making it practically a money-printing machine. This immense profitability stems primarily from its dominant position in the AI chip market, which grants it powerful pricing power.
Almost all large language models run on NVIDIA's computing chips, supporting its nearly $5 trillion market capitalization.
But precisely because of this, almost all major AI companies are openly or covertly trying to break free of NVIDIA's cage, unwilling to leave their fate in its hands. Judging from its technical report, the recently released DeepSeek V4 was most likely trained on NVIDIA chips, but it is being adapted for inference on Huawei's Ascend chips, and the report states that token costs for the Pro version will drop significantly once Huawei's Ascend 950 launches in the second half of the year. Beyond Huawei's Ascend, domestic chip makers such as Tianshu Zhixin and Cambricon have also announced support for the new DeepSeek V4 model.
On NVIDIA's home turf in the US, Google developed its own TPU (Tensor Processing Unit). As of April 2026, the TPU has reached its eighth generation, forming a complete product line of training and inference chips. In March, Meta also disclosed its roadmap for self-developed AI chips, planning to deploy four new products in the MTIA series by the end of 2027 to meet its internal AI computing needs, while maintaining large-scale procurement from NVIDIA and AMD, a dual-track compute system of in-house development plus external purchasing.
Yes, for the time being, no AI company can bypass NVIDIA, but Jensen Huang still senses the crisis. In a recent podcast interview, Huang stated that Moore's Law is coming to an end, meaning the era of chip performance doubling every year is over. The performance advantage of today's most advanced chips is not a permanent moat, but a relative advantage with a time window. Once the manufacturing process approaches physical limits, the difficulty for latecomers to catch up will actually decrease.
Huang said that restricting the export of computing chips to China would indeed slow down the development speed of Chinese AI in the short term, but in the long run, it will only force China to form its own ecosystem. What he didn't delve into further is that currently, only Chinese AI companies are committed to open source, and are being adopted by numerous companies and startups. If more and more open-source models run on Chinese-made computing chips, then even if NVIDIA still holds the number one market position, it will no longer be the only one.
In fact, even without the threat of Chinese open-source large models and computing chips, market competition is likely to push the computing chip industry towards a duopoly structure, rather than letting NVIDIA dominate alone.
Interestingly, among them, OpenAI, which is extremely dependent on NVIDIA, is ironically the most active in "backstabbing" it.
01
On April 17 local time, US AI chip manufacturer Cerebras officially submitted an IPO application to the US SEC, aiming to raise $3 billion with a valuation of $35 billion.
After withdrawing its previous IPO application in October 2025, this challenger to NVIDIA, whose core selling point is "wafer-scale chips," launched another IPO sprint within six months, pushing its valuation from $8.1 billion to $35 billion.
The core pillar of this valuation surge is a cooperation agreement with OpenAI worth over $20 billion.
According to the agreement, OpenAI commits to using server clusters powered by Cerebras chips over the next three years, with Cerebras deploying 750 megawatts of computing capacity for OpenAI, expected to be fully in place by 2028. In addition, OpenAI will provide Cerebras with roughly $1 billion in funding to help build out its data centers, receiving warrants for about 10% of the company in return.
Clearly, OpenAI is no longer just a customer; it is a creditor and potentially a major future stakeholder. The decision to relaunch the IPO at this moment was most likely made jointly by the two companies.
On the same day Cerebras submitted its IPO documents, three core OpenAI executives, including Sora lead Bill Peebles, announced their departure. Meanwhile, the $500 billion "Stargate" plan, once seen as a milestone of US AI infrastructure, is also in disarray, with internal coordination and financing progressing slowly.
According to media reports, OpenAI's 2025 revenue was $13.1 billion, with losses as high as $8 billion, and losses are expected to soar to $25 billion this year. Under that pressure, OpenAI even made the painful cut of shutting down its popular video generation product, Sora.
Some analysis suggests that Sora's daily computing power cost was approximately $15 million, with the cost of a 10-second high-precision video around $33. During Sora's operation, total user payment revenue was only $2.1 million.
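A back-of-envelope calculation, using only the figures quoted above and treating them as rough estimates, shows just how lopsided Sora's economics were:

```python
# Back-of-envelope check on Sora's reported unit economics.
# All figures are the estimates cited in the analysis above.
daily_compute_cost = 15_000_000   # ~$15M of compute per day
cost_per_10s_video = 33           # ~$33 per 10-second high-precision clip
total_user_revenue = 2_100_000    # cumulative user payments over Sora's run

# Implied daily output if the whole compute budget went to 10-second clips:
videos_per_day = daily_compute_cost / cost_per_10s_video

# How many days of compute the entire revenue would have covered:
days_covered = total_user_revenue / daily_compute_cost

print(f"~{videos_per_day:,.0f} ten-second videos per day of compute")
print(f"total revenue covered ~{days_covered:.2f} days of compute")
```

On these numbers, all the money Sora ever earned would have paid for barely a seventh of a single day's compute bill.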
In such turbulent times, Altman naturally understands that over-reliance on NVIDIA would become OpenAI's biggest weakness.
Previously, OpenAI announced collaborations with Broadcom to develop custom chips and adopted AMD's new MI450 chips, frequently sending clear signals to the outside world—it no longer wants to work for NVIDIA. It is against this backdrop that Cerebras became a key bet in OpenAI's "de-NVIDIAization" strategy.
Although Cerebras is not widely known, it has uniqueness among chip manufacturing companies.
Almost all chip design giants follow the route of cutting a wafer into many small chips. Cerebras instead zeroed in on the "memory wall" that arises when data moves between chips, and adopted a far more aggressive single-chip approach.
Cerebras's core product is the Wafer-Scale Engine (WSE-3), a single chip built from an entire 300mm wafer. Because computation, storage, and interconnect all sit on one chip, data-transfer latency is reduced by 90% compared with GPU clusters, making it particularly well suited to low-latency inference for large models.
In inference scenarios, the wafer-scale architecture is expected to reduce the cost per token by 80%.
OpenAI's head of computing infrastructure stated that Cerebras has added a dedicated low-latency inference solution to the platform, which will not only allow users to get faster response times but also lay the foundation for expanding real-time AI technology to a broader user base.
More importantly, Cerebras's non-HBM dependent route might break NVIDIA's near-monopoly in the chip industry, making computing power supply more diverse.
All of this hits OpenAI's pain points squarely, making the partnership a natural fit.
Besides OpenAI, Cerebras also reached a cooperation agreement with AWS in March. The CS-3 will be deployed in Amazon's data centers, entering the infrastructure system of mainstream hyperscale cloud platforms.
02
"The most exciting thing about this rapidly iterating industry is that algorithms keep getting faster, more accurate, and more efficient, which is exactly why I'm unwilling to throw myself into traditional industries that stay the same for nine years."
Cerebras's ability to reach its current position is closely tied to its founder, Andrew Feldman.
Unlike typical chip company founders who are engineers, Feldman graduated from Stanford University with bachelor's degrees in Economics and Political Science and an MBA. From the beginning of his career, he consistently accumulated experience in product and marketing fields. This career path gave him a natural instinct for what kind of business model could succeed.
As his experience grew, Feldman gradually transitioned from an employee to a serial entrepreneur.
And all serial entrepreneurs share one glaringly obvious trait: they want to win, desperately. These people aren't merely "competitive"; to them, winning is as indispensable as breathing. They tend to place their bets in the "no man's land" outside industry consensus, going all-in on directions most people consider unnecessary or impossible. In other words, they have a pronounced gambler's streak.
In 2007, Feldman founded the server company SeaMicro.
"Today's large processors are like us driving a space shuttle to the grocery store. Actually, I just need to drive a Prius."
SeaMicro abandoned the traditional server approach of "piling on components." It removed all components except the CPU, memory, and a self-developed ASIC, providing "more cores" for specialized internet companies needing "scale-out" workloads. The company was acquired by AMD for $355 million in 2012.
Although the microserver business faded into obscurity after being folded into AMD, the experience gave Feldman both wealth and a firmer entrepreneurial methodology: at moments of generational change, use counter-mainstream hardware design to cut into niche markets the giants have not yet covered.
Industry convention holds that chip yield falls as die area grows. While chip companies were all following in NVIDIA's footsteps, Feldman decided, in a very "layman" way of thinking, to simply build a single chip the size of a dinner plate.
In 2015, Feldman co-founded Cerebras with his technical partner Gary Lauterbach, bringing along several former colleagues from SeaMicro. Cerebras then stayed silent for four full years, until it released the first-generation WSE-1 in August 2019.
During this obscure R&D period, Feldman was betting on two things: one was that TSMC's wafer-level packaging technology would gradually mature, and the other was that AI models would become so large that the memory wall of GPUs would become a fatal bottleneck.
Judging from current developments, he bet correctly.
From 2019 to 2024, Cerebras launched a new generation every two years, with the process node jumping from 16nm to 7nm to 5nm and the transistor count climbing from 1.2 trillion to 4 trillion. Meanwhile, Feldman began actively courting major clients; in 2023 he flew to Abu Dhabi and landed G42.
Cerebras and G42 collaborated to train the leading Arabic-language model and jointly built Condor Galaxy, a network of nine interconnected supercomputers. The close ties to this Middle Eastern company also triggered a national-security review of Cerebras by the Committee on Foreign Investment in the United States, but Feldman didn't care; to him, the review was proof of his company's strength.
"Work only 38 hours a week and still want to take on an 800-pound gorilla like NVIDIA? No way. You need every waking minute."
Feldman was once asked in an interview for his views on work-life balance, and he gave a rather blunt rejection of the idea. He makes no effort to hide his ambition to challenge NVIDIA.
Referencing NVIDIA's hundredfold growth over ten years, Feldman is optimistic about Cerebras's prospects: developing treatment plans for millions of patients within three to five years; providing inference compute for applications not yet born; letting the public use the company's technology without even noticing it.
03
Cerebras's IPO sprint has been dogged by controversy. Optimists look forward to witnessing the birth of a second NVIDIA, while skeptics question the stability of its results.
According to officially disclosed financials, Cerebras's revenue grew from $24.6 million in 2022 to $510 million in 2025, a three-year compound annual growth rate of 175%. Most striking, GAAP net profit in 2025 was $238 million, reversing the net loss of $482 million in 2024.
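The headline growth rate can be sanity-checked from the two endpoints (a quick sketch; 2022 to 2025 spans three compounding periods):

```python
# CAGR check: revenue grew from $24.6M (2022) to $510M (2025),
# i.e. three annual compounding periods.
rev_2022 = 24.6e6
rev_2025 = 510e6
periods = 2025 - 2022  # 3

cagr = (rev_2025 / rev_2022) ** (1 / periods) - 1
print(f"CAGR: {cagr:.1%}")  # ~174.7%, consistent with the reported 175%
```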
However, a closer analysis reveals that the GAAP profit benefited from a non-cash book gain of $363 million. This gain was actually an accounting operation resulting from the removal of G42-related liabilities from the balance sheet due to the US security review. Excluding this non-recurring item, the company's non-GAAP net loss was actually $75.7 million.
In other words, Cerebras's "return to profitability" is an accounting game.
In 2023 and 2024, G42 contributed 83% and 87% of Cerebras's total revenue, respectively. With geopolitical tensions intensifying, the risk of depending on a single Middle Eastern customer is self-evident; after all, Cerebras's first IPO withdrawal was partly due to national-security reviews.
According to the prospectus, the company's remaining performance obligations of as much as $24.6 billion rely heavily on the $20 billion agreement signed with OpenAI. In other words, Cerebras's expected revenue rests almost entirely on OpenAI's forward commitments rather than on a diversified base of large customers.
Whether this "shot in the arm" of an order gets fulfilled depends on the fate of OpenAI itself. When the market keeps scrutinizing the stability of its largest customer, how much of this "blank check" can actually be cashed is something even Feldman probably cannot guarantee.
A comparison with NVIDIA makes Cerebras's disadvantages even clearer.
Even before the AI industry's big explosion, NVIDIA had already built a diversified customer base across gaming, professional visualization, data centers, and other fields, with no single customer accounting for more than 10% of its revenue. Over more than twenty years, NVIDIA has bound itself deeply to millions of developers; each product iteration grows out of the needs of its expanding ecosystem, and its product roadmap is very clear. Cerebras's ecosystem is still at a very early stage, with only a single-point breakthrough in inference scenarios, and it has a long way to go before becoming a true platform company.
Even without the sudden emergence of ChatGPT, NVIDIA was a high-quality company with stable revenue and considerable profits. But if the $20 billion order from OpenAI disappeared, Cerebras probably wouldn't even have a shot at an IPO.
In December 2025, NVIDIA reached a special cooperation agreement worth approximately $20 billion in cash with Cerebras's competitor Groq. NVIDIA obtained a permanent non-exclusive license for Groq's LPU inference architecture and full-stack chip design technology.
Jensen Huang's entry signals that the value of low-latency dedicated inference architectures has been recognized by the industry's giants, but it also sharply increases the competitive pressure on Cerebras.
Viewed pragmatically, OpenAI brought in Cerebras not as a replacement but as a "catfish" to stir the pond, adding bargaining chips and spreading supply-chain risk.
There are reports that NVIDIA's system based on Groq chips will be launched in the second half of 2026. If Altman turns around and reaches an agreement with Huang again, Cerebras could easily become the sacrifice.
In the trillion-dollar AI chip race, diversified competition is undoubtedly good for the industry ecosystem's long-term development. But the capital market never lacks for wealth-creation myths and hype. Whether Cerebras can truly deliver on its technological and commercial value still has to survive multiple tests.
The appealing title of "NVIDIA challenger" might also turn out to be a short-lived bubble.
But as the "Jevons Paradox" shows, technological progress raises resource efficiency and lowers the cost per unit of output, yet because people can then afford to use more of the resource, and more widely, total consumption actually rises. As AI seeps ever deeper into everyday life, compute demand will keep growing rapidly for the foreseeable future.
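The paradox can be illustrated with a toy constant-elasticity demand model. The numbers and the elasticity value here are entirely hypothetical, chosen only to show the mechanism:

```python
# Toy illustration of the Jevons Paradox (hypothetical numbers).
# If efficiency gains cut the unit price of compute and demand is
# sufficiently price-elastic (elasticity < -1), total spend still rises.

def demand(price: float, elasticity: float = -1.5, scale: float = 100.0) -> float:
    """Constant-elasticity demand curve (assumed form, for illustration)."""
    return scale * price ** elasticity

old_price, new_price = 1.0, 0.5   # efficiency gains halve the unit cost
old_qty, new_qty = demand(old_price), demand(new_price)
old_spend, new_spend = old_price * old_qty, new_price * new_qty

print(new_qty > old_qty)      # True: quantity demanded rises
print(new_spend > old_spend)  # True: total spend on compute also rises
```

With an elasticity of -1.5, halving the price nearly triples the quantity demanded, so total spending grows even though each unit is cheaper, which is the Jevons dynamic the paragraph describes.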
This super-sized market, worth hundreds of billions or even trillions of dollars, is not only a matter of economics but also of geopolitical security. No one wants the keys to its fate held by NVIDIA alone.
But clearly, even out of pride alone, Jensen Huang will not hand over those keys easily.
This article is from WeChat public account "最话FunTalk" (ID: iFuntalker), author: He Yiran, editor: Liu Yuxiang