Google Starts Selling TPUs, Big Tech Aims to Produce "Low-Cost Tokens" with AI Chips

marsbit · Published 2026-06-24 · Updated 2026-06-24

Summary

Google has begun selling its proprietary TPU chips and AI computing hardware directly to third-party data centers and clients, marking a strategic shift. Previously only accessible via cloud rentals, TPUs are specialized processors designed for the matrix and tensor operations central to AI models. By combining thousands into supercomputing clusters managed by CPUs, Google achieves high-efficiency AI processing. This move enables Google’s Gemini AI to offer competitive token pricing, challenging rivals like OpenAI. It also signals a broader industry trend where AI compute is becoming a commoditized resource like electricity. While NVIDIA remains dominant with its CUDA ecosystem and high-performance GPUs, the focus is shifting from raw power to cost efficiency and system integration. Google’s approach mirrors NVIDIA’s by selling an entire ecosystem—hardware, software, and data center expertise—rather than just chips. This threatens NVIDIA’s grip on the mid-range inference market, where lower-cost, efficient solutions are increasingly demanded. Similarly, cloud providers like Huawei Cloud and Alibaba Cloud in China are developing their own AI chip ecosystems (e.g., Ascend, Zhenwu), packaging chips, clusters, and tools into full-stack solutions. They aim to reduce token costs and capture market share through integrated systems. In summary, the AI infrastructure race is evolving from a competition for the strongest chips to a contest for the most efficient and cost-effective ...

Recently, Google announced it has officially begun selling its self-developed TPU chips and supporting AI computing hardware directly to third-party data centers and customers. TPUs have long been Google's 'secret weapon' in AI, and third parties could previously only rent them through cloud data centers. The industry had assumed Google would never sell these chips externally, so the announcement in June this year came as a welcome surprise.

So the question is, what is a 'TPU'? Its full name is 'Tensor Processing Unit.' Unlike CPUs and GPUs, it is a chip specifically designed for 'matrix and tensor mathematical operations' in AI computing, capable of handling related calculations with extremely high efficiency.

That may sound like an auxiliary chip, but it is not: today's large AI models essentially perform complex mathematical operations (primarily matrix multiplication) on massive amounts of data. So Google combined thousands of TPUs into supercomputing clusters and used CPU hosts for coordination (decomposing tasks, converting data), creating highly efficient AI computing centers.
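To make the "host decomposes, accelerators multiply" idea concrete, here is a minimal sketch in plain Python. It is only an illustration, not Google's actual scheduling software: the function names (`matmul`, `split_rows`, `host_coordinated_matmul`) are hypothetical, and each call to `matmul` merely stands in for one accelerator working on its share of the rows.

```python
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix multiplication, the core operation behind large models."""
    inner = len(b)
    cols = len(b[0])
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def split_rows(a: Matrix, parts: int) -> List[Matrix]:
    """Host-side step: cut the left matrix into contiguous row blocks."""
    size = max(1, -(-len(a) // parts))  # ceiling division, at least one row
    return [a[i:i + size] for i in range(0, len(a), size)]


def host_coordinated_matmul(a: Matrix, b: Matrix, workers: int) -> Matrix:
    """Decompose the job, hand each block to a simulated chip, merge results."""
    result: Matrix = []
    for block in split_rows(a, workers):
        # In a real cluster the blocks would run in parallel on separate chips;
        # here they run one after another for simplicity.
        result.extend(matmul(block, b))
    return result


if __name__ == "__main__":
    a = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
    b = [[1.0, 0.0], [0.0, 1.0]]
    # Splitting the work must not change the answer.
    assert host_coordinated_matmul(a, b, workers=2) == matmul(a, b)
```

Because each row block of the result depends only on its own rows of the left matrix, the blocks can be computed independently, which is what makes this kind of workload easy to spread across many chips.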

Image Source: Google

This is also why Gemini can aggressively win users from OpenAI and other companies with lower subscription fees and higher usage limits. Even on token prices alone, Gemini is among the cheaper overseas AI products for flagship models, and its mainstream model pricing is close to that of domestic Chinese model makers such as DeepSeek.

Moreover, TPUs are also better at handling the massive daily computing requests from users, making them 'professionally suited' for the future AI ecosystem, which is why the outside world has coveted this chipset for a long time. After announcing the sales plan, Google also revealed a $5 billion agreement to jointly build a large computing center with the famous private equity firm Blackstone, with a tentative capacity of 500 megawatts.

Lei Technology (ID: leitech) speculates that after this news is released, many enterprises, especially those wanting to build their own computing centers, will likely inquire with Google or seek cooperation. At this point, some might think Amazon should be worried, as this is clearly competing for cloud service business. Actually, not quite. The one most likely troubled now is probably Nvidia.

Has Google Dealt Nvidia a 'Fatal Blow'?

First, let me ask a question: Why has Nvidia become one of the most important companies in the AI era? If your answer is just 'GPU computing power is strong,' you're only half right.

Where Nvidia is truly formidable is that it has long ceased to be just a GPU seller. CUDA, NVLink, DGX, InfiniBand networks, AI software libraries, developer ecosystems, server partners, cloud vendor adaptation—all these things together form Nvidia's moat.

So, when you buy an Nvidia computing card and start it up, you're not just buying a card; you're buying a complete, industry-validated AI ecosystem. For most companies, Nvidia's CUDA ecosystem saves them from 'reinventing the wheel,' thereby saving considerable effort and cost.

This is also why many AI companies, despite knowing Nvidia GPUs are expensive, still have to use them. Because during the AI explosion period, 'cost' could be ignored; the only metric was whether they could lead competitors or catch up faster. However, as large AI models enter the popularization phase, what everyone wants is no longer just 'speed.' Faced with a massive user base, efficiency and cost-effectiveness have become the new priorities.

Google has clearly seen this too, so they bet on TPUs and packaged them into a complete solution. The chips in this solution aren't meant to defeat Nvidia in raw computing performance, but rather to package Google's years of experience in chips, data centers, networks, storage, scheduling, and model training into a cloud service capability that enterprises can purchase directly.

This is where Google truly 'learned from Nvidia.' It's not learning to sell chips, but to sell systems and ecosystems, turning a series of hardware into 'productivity' that customers can use. This holds considerable appeal for enterprises wishing to keep their computing centers in their own hands.

So, is Nvidia panicking? Not necessarily, but it's definitely a headache. After all, while flagship computing cards are profitable, companies won't remain in a 'buy as much as available' state forever. They will gradually turn their attention to other more cost-effective chips. At that time, Google's TPU solution will inevitably impact Nvidia's share of this market.

However, for the entire AI industry, Nvidia remains the most recognized universal standard in the AI computing market, and the status of the CUDA ecosystem is not something that can be easily shaken by one or two generations of chips. Especially in large model training, many teams have accumulated extensive experience around Nvidia's system, making a rash platform switch risky.

For example, DeepSeek recently announced that its new model could be trained using Huawei's Ascend chips, and this was only possible after several version iterations following deep collaboration between Huawei and DeepSeek.

Image Source: Ascend

But from Google's perspective, it doesn't need to replace Nvidia in all scenarios. As long as it can capture a portion of enterprise clients and prove its efficiency is higher than that of other computing ecosystems, it can already carve out a slice of the AI infrastructure market.

Especially in the inference phase, Google's TPU computing servers clearly have an advantage. Everyone knows that once tokens are actually put to use, they drain about as fast as water through an open floodgate. There's the case of Uber burning through a year's budget in four months, a mysterious company spending $500 million on token fees in one month, and even deep-pocketed Microsoft restricting employee permissions and ordering them to use its own computing power.

It can be said that as AI usage frequency increases across various fields, more cases will prove that token cost is the key to future AI competition. Because whoever has lower token costs can popularize AI across more business lines to seize users and markets.
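The arithmetic behind this claim is simple enough to sketch. The function below is a hypothetical helper, and every number in the example (request volume, tokens per request, both prices) is an invented placeholder rather than a figure from any vendor or from the cases above; it only shows how the price per million tokens scales a fixed workload's monthly bill.

```python
def monthly_token_cost(
    requests_per_day: int,
    tokens_per_request: int,
    price_per_million_tokens: float,
    days: int = 30,
) -> float:
    """Return the bill for one month of inference at a flat per-token price."""
    total_tokens = requests_per_day * tokens_per_request * days
    return total_tokens / 1_000_000 * price_per_million_tokens


if __name__ == "__main__":
    # Hypothetical workload: 1 million requests a day, 2,000 tokens each.
    workload = dict(requests_per_day=1_000_000, tokens_per_request=2_000)

    # Two hypothetical prices per million tokens (illustrative only).
    premium = monthly_token_cost(price_per_million_tokens=10.0, **workload)
    budget = monthly_token_cost(price_per_million_tokens=2.0, **workload)

    print(f"premium: {premium:,.0f}  budget: {budget:,.0f}")
    print(f"the same workload costs {premium / budget:.0f}x more at the higher price")
```

Since the workload is held fixed, the ratio between the two bills equals the ratio between the two prices, which is why a lower token price translates directly into room to roll AI out across more business lines.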

Computing Power Becomes a Basic Resource, Cloud Vendors' Opportunity Arrives

I think one netizen's analogy is very apt: training a model is like buying a car, while inference service is like the gasoline burned daily. Even wealthy households can't burn premium 98-octane gas in all their cars every day; the computing power Google provides is like 92-octane. The power is slightly lower, but the car still runs, the work gets done, and it's cheaper.

Recently, I wrote an article mentioning that there's a consensus in the industry: AI computing power is becoming more and more like basic resources such as electricity, water, and broadband.

Moreover, users don't need to know how 'computing power' is produced, but they will care about its price the way they care about utility bills. This 'user' can be an individual, an enterprise, a city, or even a country.

Therefore, in the future AI market, Nvidia will still be important because without high-performance chips, nothing can start. But when computing power demand becomes a long-term, stable, and scalable basic resource, the bargaining power will gradually shift to cloud service vendors.

This is also why cloud vendors like Google, Microsoft, Amazon, Alibaba Cloud, and Huawei Cloud are no longer content with being just 'resellers' of Nvidia GPU computing power but are all building out their own computing ecosystems. This doesn't mean they will stop procuring Nvidia GPUs, because the market needs them, clients need them, and they still sell for a good price.

Image Source: Lei Technology

But at the same time, their true development focus will inevitably shift to their own ecosystems. This is also what Nvidia needs to be most vigilant about. After all, Nvidia's current market value is largely calculated on the premise of it being the 'AI foundational base.' Once Nvidia loses control over the non-top-tier computing card market, it may gradually return to its position in the gaming graphics card market five years ago: top-tier, but not indispensable.

In fact, if we shift our perspective back to China, similar changes have already appeared. When we discussed domestic AI computing cards before, the focus was often on computing performance comparisons, discussing how far single-card performance is from top-tier computing cards.

This issue is certainly important, but if we only focus on chip performance itself, we overlook another key point: Domestic cloud vendors are also turning chips, clusters, cloud platforms, model services, and industry solutions into a complete AI production system, and this is the core competitiveness of domestic AI.

This isn't just my opinion; it's what core cloud service providers like Huawei Cloud and Alibaba Cloud are doing. Take Huawei's Ascend Cloud Service: although it is the Ascend chip that repeatedly makes headlines, Huawei today already provides cloud-based toolchains, super node clusters, model migration, training/inference optimization, and industry deployment capabilities centered on Ascend computing power.

Image Source: Weibo

Moreover, Huawei is also promoting this computing ecosystem to more domestic AI companies. Besides DeepSeek mentioned earlier, there are also leading AI firms like Baidu, iFLYTEK, Zhipu AI, and MiniMax. It can be said that Huawei has gradually built its computing ecosystem. The next step is to bring more partners on board and then use lower token prices to capture the market.

Alibaba Cloud is doing the same. They released the Zhenwu M890 training-inference integrated AI chip in May this year, and before that, the Zhenwu 810E had already been deployed on a large scale in Alibaba Cloud's Lingjun Intelligent Computing Platform. At this year's Alibaba Cloud Summit, Alibaba Cloud also directly announced that the cumulative shipment of Pingtouge's Zhenwu series AI chips has reached 560,000 units, with annualized revenue crossing the 10-billion-yuan level.

It can be said that in learning from Nvidia, domestic cloud service vendors are not only moving faster but also started earlier.

The Strongest Computing Power? No, The World Needs 'Optimal Computing Power'

Of course, Nvidia won't suddenly lose its core position in the AI era just because Google started selling TPUs.

At least for a long time, GPUs, CUDA, and the developer ecosystem will still be the standard the entire AI industry cannot bypass. Especially in large model training, high-performance computing, and general AI development scenarios, Nvidia remains the most mature and industry-recognized choice currently.

But the problem is that the AI computing market is entering its next phase.

In the past, everyone competed over 'whose chip is stronger.' Now, what enterprises truly care about is increasingly 'who can make computing power cheaper.' At this point, the advantages of cloud service vendors like Google, Huawei Cloud, and Alibaba Cloud begin to show: they possess massive numbers of individual and enterprise clients, data, applications, and scenarios, and are also more adept at packaging various hardware into a productivity system that can be used directly.

In other words, what is truly scarce in the AI era is no longer just the chips themselves, but the systemic ability to turn chips into productivity.

When computing power becomes more and more like basic resources such as water, electricity, and broadband, the ultimate winning company may not necessarily be the one with the strongest single-card performance, but rather the one that can deliver AI computing power to clients continuously at lower cost and higher efficiency.

Therefore, in the view of Lei Technology, Google selling TPUs is actually a signal reminding the entire industry: The competition for AI infrastructure is no longer just a chip war, but a system war.

This article is from the WeChat public account 'Lei Technology,' author: Lei Technology

Related Questions

Q: What is a TPU and how does it differ from a CPU and a GPU?

A: TPU stands for Tensor Processing Unit. It is a chip specifically designed for matrix and tensor mathematical operations in AI computing, offering high efficiency in handling such calculations. Unlike CPUs and GPUs, which are more general-purpose processors, TPUs are specialized for the core computations that underpin large AI model technology.

Q: According to the article, why is Nvidia considered a crucial company in the AI era?

A: Nvidia's crucial role isn't solely due to powerful GPU performance. Its real strength lies in the comprehensive ecosystem it has built, including CUDA, NVLink, DGX systems, InfiniBand networking, AI software libraries, developer tools, and extensive partnerships. This ecosystem provides a proven, turnkey solution, saving companies time and resources, which is why many continue to use Nvidia's hardware despite the cost.

Q: What is the primary goal of Google's strategy in selling its TPU hardware and systems?

A: Google's primary goal is not just to sell individual TPU chips but to package its expertise in chips, data centers, networking, and model training into a complete, competitive system and ecosystem. This approach aims to offer enterprises an efficient, alternative AI infrastructure solution, particularly for inference workloads, with the potential for lower token costs and higher efficiency, thereby challenging Nvidia's dominance in non-top-tier markets.

Q: How does the article describe the evolving focus of the AI compute market?

A: The article describes a shift in focus from 'whose chip is the most powerful' to 'who can make compute power cheaper.' AI compute is increasingly becoming a fundamental resource, similar to electricity or water. The future competition is shifting from a pure chip battle to a systems battle, where the winner will be the company that can deliver AI compute to customers with lower cost and higher efficiency.

Q: What parallel domestic development in China does the article mention in relation to Google's TPU strategy?

A: The article mentions that Chinese cloud service providers like Huawei Cloud and Alibaba Cloud are pursuing a similar strategy. They are not just focusing on chip performance but are building comprehensive AI production systems by integrating their own chips (e.g., Huawei's Ascend, Alibaba's Zhenwu), clusters, cloud platforms, model services, and industry solutions. This focus on creating a complete ecosystem, often offering competitive token pricing, represents the core competitiveness of domestic AI infrastructure.
