Google Starts Selling TPUs, Big Tech Aims to Produce "Low-Cost Tokens" with AI Chips

marsbit | Published 2026-06-24 | Updated 2026-06-24

Summary

Google has begun selling its proprietary TPU chips and AI computing hardware directly to third-party data centers and clients, marking a strategic shift. Previously only accessible via cloud rentals, TPUs are specialized processors designed for the matrix and tensor operations central to AI models. By combining thousands into supercomputing clusters managed by CPUs, Google achieves high-efficiency AI processing. This move enables Google’s Gemini AI to offer competitive token pricing, challenging rivals like OpenAI. It also signals a broader industry trend where AI compute is becoming a commoditized resource like electricity. While NVIDIA remains dominant with its CUDA ecosystem and high-performance GPUs, the focus is shifting from raw power to cost efficiency and system integration. Google’s approach mirrors NVIDIA’s by selling an entire ecosystem—hardware, software, and data center expertise—rather than just chips. This threatens NVIDIA’s grip on the mid-range inference market, where lower-cost, efficient solutions are increasingly demanded. Similarly, cloud providers like Huawei Cloud and Alibaba Cloud in China are developing their own AI chip ecosystems (e.g., Ascend, Zhenwu), packaging chips, clusters, and tools into full-stack solutions. They aim to reduce token costs and capture market share through integrated systems. In summary, the AI infrastructure race is evolving from a competition for the strongest chips to a contest for the most efficient and cost-effective ...

Google recently announced that it has officially begun selling its self-developed TPU chips and supporting AI computing hardware directly to third-party data centers and customers. TPUs have long been Google's 'secret weapon' in AI, and until now third parties could only rent them through cloud data centers. The industry had assumed Google would never sell these chips externally, so the announcement in June this year came as a surprise.

So the question is, what is a 'TPU'? Its full name is 'Tensor Processing Unit.' Unlike CPUs and GPUs, it is a chip specifically designed for 'matrix and tensor mathematical operations' in AI computing, capable of handling related calculations with extremely high efficiency.

That may sound like an auxiliary chip, but it is not: today's large AI models essentially consist of complex mathematical operations (primarily matrix multiplication) on massive amounts of data. So Google combined thousands of TPUs into supercomputing clusters and used CPU hosts for coordination (decomposing tasks, converting data), creating highly efficient AI computing centers.
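To make the division of labor concrete, here is a minimal sketch in plain Python, not Google's actual software. It assumes a simple row-wise split: a hypothetical host function cuts one large matrix multiplication into blocks, hands each block to a simulated "device", and stitches the partial results back together. The names `matmul`, `split_rows`, and `host_coordinated_matmul` are illustrative only.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix multiplication, the core operation an accelerator speeds up."""
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions must match")
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def split_rows(a: Matrix, num_devices: int) -> List[Matrix]:
    """Host-side step: decompose the task into contiguous row blocks."""
    if num_devices < 1:
        raise ValueError("need at least one device")
    base, extra = divmod(len(a), num_devices)
    blocks, start = [], 0
    for device in range(num_devices):
        # The first `extra` devices take one additional row each.
        size = base + (1 if device < extra else 0)
        blocks.append(a[start:start + size])
        start += size
    return blocks


def host_coordinated_matmul(a: Matrix, b: Matrix, num_devices: int) -> Matrix:
    """Simulate a host that splits work, lets each device multiply, then merges."""
    blocks = split_rows(a, num_devices)
    # Each matmul call stands in for one accelerator working on its own block.
    partials = [matmul(block, b) for block in blocks if block]
    # Host-side step: concatenate partial results in the original row order.
    return [row for part in partials for row in part]


if __name__ == "__main__":
    a = [[1, 2], [3, 4], [5, 6]]
    b = [[7, 8], [9, 10]]
    print(host_coordinated_matmul(a, b, num_devices=2))
```

In this toy version the devices run one after another on a single machine; the point is only that the expensive arithmetic is independent per block, which is what lets a cluster spread it across many chips.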

Image Source: Google

This is also why Gemini can aggressively capture users from OpenAI and other companies with lower subscription fees and higher usage limits. Even on token prices alone, Gemini is among the cheaper flagship models offered outside China, and pricing for its mainstream models is close to that of Chinese model makers such as DeepSeek.

Moreover, TPUs are also well suited to handling the massive volume of daily computing requests from users, making them a 'professional fit' for the future AI ecosystem, which is why outsiders have long coveted these chips. After announcing the sales plan, Google also revealed a $5 billion agreement to jointly build a large computing center with the famous private equity firm Blackstone, with a tentative capacity of 500 megawatts.

Lei Technology (ID: leitech) speculates that after this news is released, many enterprises, especially those wanting to build their own computing centers, will likely inquire with Google or seek cooperation. At this point, some might think Amazon should be worried, as this is clearly competing for cloud service business. Actually, not quite. The one most likely troubled now is probably Nvidia.

Has Google Dealt Nvidia a 'Fatal Blow'?

First, let me ask a question: Why has Nvidia become one of the most important companies in the AI era? If your answer is just 'GPU computing power is strong,' you're only half right.

Where Nvidia is truly formidable is that it has long ceased to be just a GPU seller. CUDA, NVLink, DGX, InfiniBand networks, AI software libraries, developer ecosystems, server partners, cloud vendor adaptation—all these things together form Nvidia's moat.

So, when you buy an Nvidia computing card and start it up, you're not just buying a card; you're buying a complete, industry-validated AI ecosystem. For most companies, Nvidia's CUDA ecosystem saves them from 'reinventing the wheel,' thereby saving considerable effort and cost.

This is also why many AI companies, despite knowing Nvidia GPUs are expensive, still have to use them. During the AI explosion period, 'cost' could be ignored; the only metric was whether a company could stay ahead of competitors or catch up faster. However, as large AI models enter the popularization phase, what everyone wants is no longer just 'speed.' Faced with a massive user base, efficiency and cost-effectiveness have become the new priorities.

Google has clearly seen this too, so they bet on TPUs and packaged them into a complete solution. The chips in this solution aren't meant to defeat Nvidia in raw computing performance, but rather to package Google's years of experience in chips, data centers, networks, storage, scheduling, and model training into a cloud service capability that enterprises can purchase directly.

This is where Google truly 'learned from Nvidia.' It's not learning to sell chips, but to sell systems and ecosystems, turning a series of hardware into 'productivity' that customers can use. This holds considerable appeal for enterprises wishing to keep their computing centers in their own hands.

So, is Nvidia panicking? Not necessarily, but it's definitely a headache. After all, while flagship computing cards are profitable, companies won't remain in a 'buy as much as available' state forever. They will gradually turn their attention to other more cost-effective chips. At that time, Google's TPU solution will inevitably impact Nvidia's share of this market.

However, for the entire AI industry, Nvidia remains the most recognized universal standard in the AI computing market, and the status of the CUDA ecosystem is not something that can be easily shaken by one or two generations of chips. Especially in large model training, many teams have accumulated extensive experience around Nvidia's system, making a rash platform switch risky.

For example, DeepSeek recently announced that its new model could be trained using Huawei's Ascend chips, and this was only possible after several version iterations following deep collaboration between Huawei and DeepSeek.

Image Source: Ascend

But from Google's perspective, it doesn't need to replace Nvidia in all scenarios. As long as it can capture a portion of enterprise clients and prove its efficiency is higher than other computing ecosystems, it can already carve out a slice of the AI infrastructure market.

Google's TPU servers have a clear advantage in the inference phase in particular. Once tokens are actually in use, they drain about as fast as water through an open floodgate. Examples include Uber burning through a year's budget in four months, an unnamed company spending $500 million on token fees in a single month, and even deep-pocketed Microsoft restricting employee permissions and ordering staff to use its own computing power.

It can be said that as AI usage frequency increases across various fields, more cases will prove that token cost is the key to future AI competition. Because whoever has lower token costs can popularize AI across more business lines to seize users and markets.
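The arithmetic behind this argument can be sketched in a few lines of Python. Every number below is made up for illustration: the daily token volume, the two prices per million tokens, and the annual budget are assumptions, not figures from any vendor or from the companies mentioned above. The helper names `monthly_token_cost` and `months_of_budget` are likewise hypothetical.

```python
def monthly_token_cost(tokens_per_day: int, price_per_million: float, days: int = 30) -> float:
    """Cost of one month of usage at a flat price per million tokens."""
    return tokens_per_day * days / 1_000_000 * price_per_million


def months_of_budget(annual_budget: float, tokens_per_day: int, price_per_million: float) -> float:
    """How many months a fixed budget lasts at a given usage level and token price."""
    return annual_budget / monthly_token_cost(tokens_per_day, price_per_million)


if __name__ == "__main__":
    # All values are hypothetical and chosen only to make the arithmetic easy to follow.
    annual_budget = 2_400_000.0      # currency units per year
    tokens_per_day = 2_000_000_000   # tokens consumed per day across the business
    for price in (10.0, 2.5):        # price per million tokens
        months = months_of_budget(annual_budget, tokens_per_day, price)
        print(f"price {price}/M tokens: budget lasts {months:.1f} months")
```

With these invented inputs, the same annual budget lasts 4 months at the higher price and 16 months at the lower one, which is the sense in which a lower token cost lets a company extend AI to more business lines.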

Computing Power Becomes a Basic Resource, Cloud Vendors' Opportunity Arrives

I think a netizen's analogy is very apt: training a model is like buying a car, while inference is like the gasoline burned every day. Even wealthy households can't fill every car with premium 98-octane gas every day; the computing power Google provides is like 92-octane. It delivers slightly less power, but the car still runs, the work gets done, and it's cheaper.

Recently, I wrote an article mentioning that there's a consensus in the industry: AI computing power is becoming more and more like basic resources such as electricity, water, and broadband.

Moreover, users don't need to know how 'computing power' is produced, but they will care about its price the way they care about utility bills. This 'user' can be an individual, an enterprise, a city, or even a country.

Therefore, in the future AI market, Nvidia will still be important because without high-performance chips, nothing can start. But when computing power demand becomes a long-term, stable, and scalable basic resource, the bargaining power will gradually shift to cloud service vendors.

This is also why cloud vendors like Google, Microsoft, Amazon, Alibaba Cloud, and Huawei Cloud are no longer content with being just 'resellers' of Nvidia GPU computing power and are all building their own computing ecosystems. This doesn't mean they will stop procuring Nvidia GPUs, because the market needs them, clients need them, and they still sell for a good price.

Image Source: Lei Technology

But at the same time, their true development focus will inevitably shift to their own ecosystems. This is also what Nvidia needs to be most vigilant about. After all, Nvidia's current market value is largely calculated on the premise of it being the 'AI foundational base.' Once Nvidia loses control over the non-top-tier computing card market, it may gradually return to its position in the gaming graphics card market five years ago: top-tier, but not indispensable.

In fact, if we shift our perspective back to China, similar changes have already appeared. When we discussed domestic AI computing cards before, the focus was often on computing performance comparisons, discussing how far single-card performance is from top-tier computing cards.

This issue is certainly important, but if we only focus on chip performance itself, we overlook another key point: Domestic cloud vendors are also turning chips, clusters, cloud platforms, model services, and industry solutions into a complete AI production system, and this is the core competitiveness of domestic AI.

This isn't just my opinion; it's what core cloud service providers like Huawei Cloud and Alibaba Cloud are doing. Take Huawei's Ascend Cloud Service. The Ascend chip itself is what repeatedly makes headlines, but Huawei today already provides cloud-based toolchains, super node clusters, model migration, training/inference optimization, and industry deployment capabilities built around Ascend computing power.

Image Source: Weibo

Moreover, Huawei is also promoting this computing ecosystem to more domestic AI companies. Besides DeepSeek mentioned earlier, there are also leading AI firms like Baidu, iFLYTEK, Zhipu AI, and MiniMax. It can be said that Huawei has gradually built its computing ecosystem. The next step is to bring more partners on board and then use lower token prices to capture the market.

Alibaba Cloud is doing the same. They released the Zhenwu M890 training-inference integrated AI chip in May this year, and before that, the Zhenwu 810E had already been deployed on a large scale in Alibaba Cloud's Lingjun Intelligent Computing Platform. At this year's Alibaba Cloud Summit, Alibaba Cloud also directly announced that the cumulative shipment of Pingtouge's Zhenwu series AI chips has reached 560,000 units, with annualized revenue crossing the 10-billion-yuan level.

It can be said that in learning from Nvidia, domestic cloud service vendors are not only moving faster but also started earlier.

The Strongest Computing Power? No, The World Needs 'Optimal Computing Power'

Of course, Nvidia won't suddenly lose its core position in the AI era just because Google started selling TPUs.

At least for a long time, GPUs, CUDA, and the developer ecosystem will still be the standard the entire AI industry cannot bypass. Especially in large model training, high-performance computing, and general AI development scenarios, Nvidia remains the most mature and industry-recognized choice currently.

But the problem is that the AI computing market is entering its next phase.

In the past, everyone competed over 'whose chip is stronger.' Now, what enterprises truly care about is increasingly 'who can make computing power cheaper.' At this point, the advantages of cloud service vendors like Google, Huawei Cloud, and Alibaba Cloud begin to show: they possess massive numbers of individual and enterprise clients, data, applications, and scenarios, and are also more adept at packaging various hardware into a productivity system that can be used directly.

In other words, what is truly scarce in the AI era is no longer just the chips themselves, but the systemic ability to turn chips into productivity.

When computing power becomes more and more like basic resources such as water, electricity, and broadband, the ultimate winning company may not necessarily be the one with the strongest single-card performance, but rather the one that can deliver AI computing power to clients continuously at lower cost and higher efficiency.

Therefore, in the view of Lei Technology, Google selling TPUs is actually a signal reminding the entire industry: The competition for AI infrastructure is no longer just a chip war, but a system war.

This article is from the WeChat public account 'Lei Technology,' author: Lei Technology

Related Questions

Q: What is TPU and how does it differ from CPU and GPU?

A: TPU stands for Tensor Processing Unit. It is a chip specifically designed for matrix and tensor mathematical operations in AI computing, offering high efficiency in handling such calculations. Unlike CPUs and GPUs, which are more general-purpose processors, TPUs are specialized for the core computations that underpin AI large model technology.

Q: According to the article, why is Nvidia considered a crucial company in the AI era?

A: Nvidia's crucial role isn't solely due to powerful GPU performance. Its real strength lies in the comprehensive ecosystem it has built, including CUDA, NVLink, DGX systems, InfiniBand networking, AI software libraries, developer tools, and extensive partnerships. This ecosystem provides a proven, turnkey solution, saving companies time and resources, which is why many continue to use Nvidia's hardware despite the cost.

Q: What is the primary goal of Google's strategy in selling its TPU hardware and systems?

A: Google's primary goal is not just to sell individual TPU chips but to package its expertise in chips, data centers, networking, and model training into a complete, competitive system and ecosystem. This approach aims to offer enterprises an efficient, alternative AI infrastructure solution, particularly for inference workloads, with the potential for lower token costs and higher efficiency, thereby challenging Nvidia's dominance in non-top-tier markets.

Q: How does the article describe the evolving focus of the AI compute market?

A: The article describes a shift in focus from 'whose chip is the most powerful' to 'who can make compute power cheaper.' AI compute is increasingly becoming a fundamental resource, similar to electricity or water. The future competition is shifting from a pure chip battle to a systems battle, where the winner will be the company that can deliver AI compute to customers with lower cost and higher efficiency.

Q: What parallel domestic development in China does the article mention in relation to Google's TPU strategy?

A: The article mentions that Chinese cloud service providers like Huawei Cloud and Alibaba Cloud are pursuing a similar strategy. They are not just focusing on chip performance but are building comprehensive AI production systems by integrating their own chips (e.g., Huawei's Ascend, Alibaba's Zhenwu), clusters, cloud platforms, model services, and industry solutions. This focus on creating a complete ecosystem, often offering competitive token pricing, represents the core competitiveness of domestic AI infrastructure.
