Google Starts Selling TPUs, Big Tech Aims to Produce "Low-Cost Tokens" with AI Chips

marsbit | Published on 2026-06-24 | Last updated on 2026-06-24

Abstract

Google has begun selling its proprietary TPU chips and AI computing hardware directly to third-party data centers and clients, marking a strategic shift. Previously only accessible via cloud rentals, TPUs are specialized processors designed for the matrix and tensor operations central to AI models. By combining thousands into supercomputing clusters managed by CPUs, Google achieves high-efficiency AI processing. This move enables Google’s Gemini AI to offer competitive token pricing, challenging rivals like OpenAI. It also signals a broader industry trend where AI compute is becoming a commoditized resource like electricity. While NVIDIA remains dominant with its CUDA ecosystem and high-performance GPUs, the focus is shifting from raw power to cost efficiency and system integration. Google’s approach mirrors NVIDIA’s by selling an entire ecosystem—hardware, software, and data center expertise—rather than just chips. This threatens NVIDIA’s grip on the mid-range inference market, where lower-cost, efficient solutions are increasingly demanded. Similarly, cloud providers like Huawei Cloud and Alibaba Cloud in China are developing their own AI chip ecosystems (e.g., Ascend, Zhenwu), packaging chips, clusters, and tools into full-stack solutions. They aim to reduce token costs and capture market share through integrated systems. In summary, the AI infrastructure race is evolving from a competition for the strongest chips to a contest for the most efficient and cost-effective ...

Recently, Google announced that it has officially begun selling its self-developed TPU chips and supporting AI computing hardware directly to third-party data centers and customers. TPUs have long been Google's 'secret weapon' in AI, and until now third parties could only rent them through cloud data centers. The industry had assumed Google would never sell these chips externally, so the announcement in June this year came as a welcome surprise.

So the question is, what is a 'TPU'? Its full name is 'Tensor Processing Unit.' Unlike CPUs and GPUs, it is a chip specifically designed for 'matrix and tensor mathematical operations' in AI computing, capable of handling related calculations with extremely high efficiency.

That may sound like an auxiliary chip, but it is not. Today's large AI models essentially perform complex mathematical operations (primarily matrix multiplication) on massive amounts of data. So Google did one thing: it combined thousands of TPUs into supercomputing clusters and used CPU hosts for coordination (decomposing tasks, converting data), creating highly efficient AI computing centers.
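The division of labor described above can be illustrated with a toy sketch: a host splits one matrix multiplication into row blocks, hands each block to a worker, and stitches the partial results back together. This is only an illustration under stated assumptions. The worker threads stand in for accelerators, the function names (`split_rows`, `host_coordinated_matmul`) are hypothetical, and nothing here reflects how Google's actual TPU clusters schedule work.

```python
from concurrent.futures import ThreadPoolExecutor


def matmul(a, b):
    """Plain matrix multiply: the core operation accelerators are built to speed up."""
    cols = list(zip(*b))  # columns of b, so each output cell is a row-column dot product
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def split_rows(a, num_devices):
    """Host step 1: decompose the task into contiguous row blocks, at most one per device."""
    size = max(1, -(-len(a) // num_devices))  # ceiling division, never zero
    return [a[i:i + size] for i in range(0, len(a), size)]


def host_coordinated_matmul(a, b, num_devices=4):
    """Host step 2: dispatch blocks to simulated devices, then reassemble in order."""
    blocks = split_rows(a, num_devices)
    with ThreadPoolExecutor(max_workers=num_devices) as pool:
        # pool.map returns results in the same order as the input blocks
        partials = list(pool.map(lambda block: matmul(block, b), blocks))
    return [row for part in partials for row in part]


a = [[1, 2], [3, 4], [5, 6], [7, 8]]
identity = [[1, 0], [0, 1]]
result = host_coordinated_matmul(a, identity, num_devices=2)
print(result == a)  # multiplying by the identity matrix returns the input
```

Each row block of the output depends only on the matching row block of the left matrix, which is why this kind of work divides cleanly across many chips.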

Image Source: Google

This is also why Gemini can aggressively win users from OpenAI and other companies with lower subscription fees and higher usage limits. Even on token prices alone, Gemini's flagship models are among the cheaper ones offered by overseas AI products, and its mainstream model pricing is close to that of Chinese model makers like DeepSeek.

Moreover, TPUs are also better at handling the massive daily computing requests from users, making them 'professionally suited' for the future AI ecosystem, which is why the outside world has coveted this chipset for a long time. After announcing the sales plan, Google also revealed a $5 billion agreement to jointly build a large computing center with the famous private equity firm Blackstone, with a tentative capacity of 500 megawatts.

Lei Technology (ID: leitech) speculates that after this news is released, many enterprises, especially those wanting to build their own computing centers, will likely inquire with Google or seek cooperation. At this point, some might think Amazon should be worried, as this is clearly competing for cloud service business. Actually, not quite. The one most likely troubled now is probably Nvidia.

Has Google Dealt Nvidia a 'Fatal Blow'?

First, let me ask a question: Why has Nvidia become one of the most important companies in the AI era? If your answer is just 'GPU computing power is strong,' you're only half right.

Where Nvidia is truly formidable is that it has long ceased to be just a GPU seller. CUDA, NVLink, DGX, InfiniBand networks, AI software libraries, developer ecosystems, server partners, cloud vendor adaptation—all these things together form Nvidia's moat.

So, when you buy an Nvidia computing card and power it on, you're not just buying a card; you're buying a complete, industry-validated AI ecosystem. For most companies, Nvidia's CUDA ecosystem saves them from 'reinventing the wheel,' and with it considerable effort and cost.

This is also why many AI companies, despite knowing Nvidia GPUs are expensive, still have to use them. Because during the AI explosion period, 'cost' could be ignored; the only metric was whether they could lead competitors or catch up faster. However, as large AI models enter the popularization phase, what everyone wants is no longer just 'speed.' Faced with a massive user base, efficiency and cost-effectiveness have become the new priorities.

Google has clearly seen this too, so it has bet on TPUs and packaged them into a complete solution. The chips in this solution aren't meant to defeat Nvidia in raw computing performance. Instead, the solution packages Google's years of experience in chips, data centers, networks, storage, scheduling, and model training into a cloud service capability that enterprises can purchase directly.

This is where Google truly 'learned from Nvidia.' It's not learning to sell chips, but to sell systems and ecosystems, turning a series of hardware into 'productivity' that customers can use. This holds considerable appeal for enterprises wishing to keep their computing centers in their own hands.

So, is Nvidia panicking? Not necessarily, but it's definitely a headache. After all, while flagship computing cards are profitable, companies won't remain in a 'buy as much as available' state forever. They will gradually turn their attention to other more cost-effective chips. At that time, Google's TPU solution will inevitably impact Nvidia's share of this market.

However, for the entire AI industry, Nvidia remains the most recognized universal standard in the AI computing market, and the status of the CUDA ecosystem is not something that can be easily shaken by one or two generations of chips. Especially in large model training, many teams have accumulated extensive experience around Nvidia's system, making a rash platform switch risky.

For example, DeepSeek recently announced that its new model could be trained using Huawei's Ascend chips, and this was only possible after several version iterations following deep collaboration between Huawei and DeepSeek.

Image Source: Ascend

But from Google's perspective, it doesn't need to replace Nvidia in all scenarios. As long as it can win a portion of enterprise clients and prove its efficiency is higher than that of other computing ecosystems, it can carve out a slice of the AI infrastructure market.

The advantage of Google's TPU servers is clearest in the inference phase. Once tokens are put to real use, they drain about as fast as water through an open floodgate. Examples include Uber burning through a year's budget in four months, an unnamed company spending $500 million on token fees in a single month, and even deep-pocketed Microsoft restricting employee permissions and ordering staff to use its own computing power.

It can be said that as AI usage frequency increases across various fields, more cases will prove that token cost is the key to future AI competition. Because whoever has lower token costs can popularize AI across more business lines to seize users and markets.
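The arithmetic behind this point can be sketched briefly. Exhausting a 12-month budget in 4 months means spending at three times the planned rate, and a monthly token bill scales linearly with the price per million tokens. The workload size and both prices below are hypothetical placeholders chosen for round numbers, not figures from any vendor, and the function names are illustrative.

```python
def burn_multiplier(budget_months, months_to_exhaust):
    """How many times faster than planned a budget is being spent."""
    return budget_months / months_to_exhaust


def monthly_token_cost(tokens_per_month, price_per_million_tokens):
    """Monthly bill for a workload, given a per-million-token price."""
    return tokens_per_month / 1_000_000 * price_per_million_tokens


# A year's budget gone in four months, as in the Uber case cited above
overspend = burn_multiplier(12, 4)

# Hypothetical workload and prices (assumptions for illustration only)
tokens = 50_000_000_000  # 50 billion tokens per month
cost_high = monthly_token_cost(tokens, 10.0)  # at $10 per million tokens
cost_low = monthly_token_cost(tokens, 2.5)    # at $2.50 per million tokens

print(overspend)             # 3.0
print(cost_high, cost_low)   # 500000.0 125000.0
print(cost_high / cost_low)  # 4.0
```

Under these assumed numbers, the same workload costs a quarter as much at the lower price, which is the kind of gap that decides how many business lines can afford to use AI at all.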

Computing Power Becomes a Basic Resource, Cloud Vendors' Opportunity Arrives

A netizen's analogy captures this well: training a model is like buying a car, while inference is like the gasoline burned every day. Even wealthy households can't fill every car with premium 98-octane gas every day, and the computing power Google provides is like 92-octane: slightly less powerful, but the car still runs, the work gets done, and it's cheaper.

Recently, I wrote an article mentioning that there's a consensus in the industry: AI computing power is becoming more and more like basic resources such as electricity, water, and broadband.

Moreover, for users, they don't need to know how 'computing power' is produced, but they will care about its price like they care about utility bills. This 'user' can be an individual, an enterprise, a city, or even a country.

Therefore, in the future AI market, Nvidia will still be important because without high-performance chips, nothing can start. But when computing power demand becomes a long-term, stable, and scalable basic resource, the bargaining power will gradually shift to cloud service vendors.

This is also why cloud vendors like Google, Microsoft, Amazon, Alibaba Cloud, and Huawei Cloud are no longer content with being just 'resellers' of Nvidia GPU computing power but are all laying out their own computing ecosystems. This doesn't mean they will stop procuring Nvidia GPUs, because the market needs them, clients need them, and they can also sell for a good price.

Image Source: Lei Technology

But at the same time, their true development focus will inevitably shift to their own ecosystems. This is also what Nvidia needs to be most vigilant about. After all, Nvidia's current market value is largely calculated on the premise of it being the 'AI foundational base.' Once Nvidia loses control over the non-top-tier computing card market, it may gradually return to its position in the gaming graphics card market five years ago: top-tier, but not indispensable.

In fact, if we shift our perspective back to China, similar changes have already appeared. When we discussed domestic AI computing cards before, the focus was often on computing performance comparisons, discussing how far single-card performance is from top-tier computing cards.

This issue is certainly important, but if we only focus on chip performance itself, we overlook another key point: Domestic cloud vendors are also turning chips, clusters, cloud platforms, model services, and industry solutions into a complete AI production system, and this is the core competitiveness of domestic AI.

This isn't just my opinion; it's what core cloud service providers like Huawei Cloud and Alibaba Cloud are doing. Take Huawei's Ascend Cloud Service: although it is the Ascend chip that repeatedly makes headlines, Huawei today already provides cloud-based toolchains, super node clusters, model migration, training/inference optimization, and industry deployment capabilities built around Ascend computing power.

Image Source: Weibo

Moreover, Huawei is also promoting this computing ecosystem to more domestic AI companies. Besides DeepSeek mentioned earlier, there are also leading AI firms like Baidu, iFLYTEK, Zhipu AI, and MiniMax. It can be said that Huawei has gradually built its computing ecosystem. The next step is to bring more partners on board and then use lower token prices to capture the market.

Alibaba Cloud is doing the same. They released the Zhenwu M890 training-inference integrated AI chip in May this year, and before that, the Zhenwu 810E had already been deployed on a large scale in Alibaba Cloud's Lingjun Intelligent Computing Platform. At this year's Alibaba Cloud Summit, Alibaba Cloud also directly announced that the cumulative shipment of Pingtouge's Zhenwu series AI chips has reached 560,000 units, with annualized revenue crossing the 10-billion-yuan level.

It can be said that in learning from Nvidia, domestic cloud service vendors are not only moving faster but also started earlier.

The Strongest Computing Power? No, The World Needs 'Optimal Computing Power'

Of course, Nvidia won't suddenly lose its core position in the AI era just because Google started selling TPUs.

At least for a long time, GPUs, CUDA, and the developer ecosystem will still be the standard the entire AI industry cannot bypass. Especially in large model training, high-performance computing, and general AI development scenarios, Nvidia remains the most mature and industry-recognized choice currently.

But the problem is that the AI computing market is entering its next phase.

In the past, everyone competed over 'whose chip is stronger.' Now, what enterprises truly care about is 'who can make computing power cheaper.' This is where the advantages of cloud service vendors like Google, Huawei Cloud, and Alibaba Cloud begin to show: they have massive numbers of individual and enterprise clients, along with data, applications, and scenarios, and they are more adept at packaging hardware into a productivity system that can be used directly.

In other words, what is truly scarce in the AI era is no longer just the chips themselves, but the systemic ability to turn chips into productivity.

When computing power becomes more and more like basic resources such as water, electricity, and broadband, the ultimate winning company may not necessarily be the one with the strongest single-card performance, but rather the one that can deliver AI computing power to clients continuously at lower cost and higher efficiency.

Therefore, in the view of Lei Technology, Google selling TPUs is actually a signal reminding the entire industry: The competition for AI infrastructure is no longer just a chip war, but a system war.

This article is from the WeChat public account 'Lei Technology,' author: Lei Technology
