When US Giants Collectively "Defect" to Chinese AI Models

marsbit | Published on 2026-07-03 | Last updated on 2026-07-03

Abstract

When Silicon Valley Giants Turn to Chinese AI Models to Cut Costs

A surprising trend is emerging: major U.S. tech companies are significantly reducing AI costs by switching to Chinese models. Coinbase, the largest U.S. cryptocurrency exchange, reportedly halved its AI spending after migrating to China's GLM-5.2 and Kimi 2.7 models, despite increasing usage. They achieved this through a sophisticated three-part strategy: implementing an automatic routing system to select the most cost-effective model per task, boosting cache hit rates from 5% to 60% to reuse computations, and employing "context engineering" to provide AI with more precise, less cluttered information. They are not alone. AI startup Lindy switched from Claude to DeepSeek, saving millions, while Snowflake's tests found GLM-5.2 solved 66% of coding tasks compared to Claude Opus's 67%, but at a fraction of the cost (output pricing is 5-7 times lower). While the top Western models may offer slightly better stability, the massive price differential is leading many businesses to reconsider their value proposition. This shift signals a deeper change in the AI industry, moving beyond pure performance benchmarks to a fierce cost competition. As pressure mounts, even OpenAI and Anthropic have begun slashing prices. For users, this means more choices, lower costs, and a crucial lesson: using multiple models based on task complexity, optimizing with caching, and keeping contexts lean are now key to leveraging AI efficient...

Original Title: Largest US Crypto Exchange Quietly Switches to Chinese AI Models, Saves Half the Cost

Original Author: AI Hands-on Notes

A Data Point That Makes Silicon Valley Uneasy

A recent statement by Brian Armstrong, CEO of Coinbase, the largest US cryptocurrency exchange, caused a stir in tech circles:

"We switched our AI models to China's GLM 5.2 and Kimi 2.7, cutting AI expenses in half."

Cut in half? Did usage also drop?

On the contrary. Coinbase's token usage has been consistently increasing.

Using more while spending less is what truly makes OpenAI and Anthropic uneasy.

How Did They Do It? Three Cost-Saving Strategies

Coinbase didn't just swap to a cheaper model. They built a complete "cost-saving system":

First Move: Don't Lock into One Model, Let the System Choose

Coinbase built an automated routing system. For each incoming request, the system automatically selects the most suitable model based on task type, price, and cache status.

Not every task requires the most expensive model. Simple translations use cheaper ones; complex reasoning uses better ones—just like you wouldn't drive a sports car to buy groceries downstairs.
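To make the routing idea concrete, here is a minimal sketch in Python. It is not Coinbase's system, whose internals are not public. The prices are the per-million-token figures quoted in this article; the capability tiers, the task-to-tier mapping, and the 10% cached-input discount are all illustrative assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str
    input_price: float   # USD per million input tokens
    output_price: float  # USD per million output tokens
    tier: int            # hypothetical capability tier: higher = stronger


# Prices are the figures quoted in this article; tiers are assumptions.
MODELS = [
    Model("GLM-5.2", 1.40, 4.40, tier=1),
    Model("Claude Opus 4.7", 5.00, 25.00, tier=2),
]

# Hypothetical mapping from task type to the minimum tier it needs.
REQUIRED_TIER = {"translation": 1, "summarization": 1, "complex_reasoning": 2}

# Assumption: input tokens already cached on a model cost 10% of list price.
CACHED_INPUT_DISCOUNT = 0.1


def estimated_cost(model, input_tokens, output_tokens, cached_on=None):
    """Estimated USD cost of one request on the given model."""
    input_price = model.input_price
    if cached_on == model.name:
        input_price *= CACHED_INPUT_DISCOUNT
    total = input_tokens * input_price + output_tokens * model.output_price
    return total / 1_000_000


def route(task_type, input_tokens, output_tokens, cached_on=None):
    """Pick the cheapest model that meets the task's minimum tier."""
    # Unknown task types fall back to the strongest tier to stay safe.
    min_tier = REQUIRED_TIER.get(task_type, max(m.tier for m in MODELS))
    candidates = [m for m in MODELS if m.tier >= min_tier]
    return min(
        candidates,
        key=lambda m: estimated_cost(m, input_tokens, output_tokens, cached_on),
    )
```

Under these assumptions, `route("translation", 2_000, 500)` picks GLM-5.2, while `route("complex_reasoning", 2_000, 500)` picks Claude Opus 4.7. Cache status can flip the choice: a translation with 100,000 input tokens already cached on Claude Opus 4.7 and only 100 output tokens routes to Opus, because the discounted input outweighs the higher output price.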

Second Move: Boost Cache Hit Rate from 5% to 60%

This is the most impactful move. By optimizing caching strategies, Coinbase increased the cache hit rate from 5% to 60%.

Simply put, 60% of requests can reuse previous calculation results, significantly reducing the actual cost per call. This single optimization saved a substantial amount of money.
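A rough way to see why the hit rate matters is to compute the blended input price. The sketch below rests on two assumptions that are not from Coinbase: the hit rate is treated as the share of input tokens served from cache, and cached tokens are billed at 10% of the list price. Real discounts vary by provider.

```python
def blended_input_price(list_price, hit_rate, cached_fraction_of_price):
    """Average USD price per million input tokens at a given cache hit rate.

    hit_rate: share of input tokens served from cache (0.0 to 1.0).
    cached_fraction_of_price: what a cached token costs relative to
        list price (assumed value, e.g. 0.1 for a 90% discount).
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    uncached_share = 1.0 - hit_rate
    return list_price * (uncached_share + hit_rate * cached_fraction_of_price)


# GLM-5.2 input list price quoted in this article: $1.40 per million tokens.
before = blended_input_price(1.40, hit_rate=0.05, cached_fraction_of_price=0.1)
after = blended_input_price(1.40, hit_rate=0.60, cached_fraction_of_price=0.1)
saving = 1.0 - after / before

print(f"blended input price at 5% hits:  ${before:.3f}")
print(f"blended input price at 60% hits: ${after:.3f}")
print(f"input-cost reduction: {saving:.1%}")
```

With these assumed numbers, the blended input price falls from about $1.337 to about $0.644 per million tokens, a reduction of roughly 52% on input cost. Output tokens are not cached, so the effect on the total bill depends on the input/output mix.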

Third Move: Context Engineering

Coinbase requires developers to streamline context, start new sessions for new tasks, and avoid cramming too much into a single conversation.

This isn't laziness; it's a new field of study—known in the industry as Context Engineering. In a technical blog, Anthropic explicitly stated: when managing AI agents, context engineering is more effective than prompt engineering.

Simply put: it's not about making the AI smarter, but giving it more precise information.
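As a toy illustration of "less but more precise", the sketch below keeps only the background snippets that share words with the task, and stops once a word budget is used up. The keyword-overlap scoring and the word-count budget are simplifying assumptions chosen for brevity; production systems typically use embeddings and real token counts.

```python
def build_context(task, snippets, budget_words):
    """Select relevant snippets for a task without exceeding a word budget.

    Relevance is a naive count of words shared with the task description
    (an assumption for illustration, not a recommended scoring method).
    """
    task_terms = set(task.lower().split())

    scored = []
    for snippet in snippets:
        overlap = len(task_terms & set(snippet.lower().split()))
        if overlap > 0:  # drop snippets unrelated to the task
            scored.append((overlap, snippet))

    # Most relevant first; sorted() is stable, so ties keep original order.
    scored.sort(key=lambda pair: pair[0], reverse=True)

    chosen, used = [], 0
    for _, snippet in scored:
        size = len(snippet.split())
        if used + size <= budget_words:
            chosen.append(snippet)
            used += size
    return chosen


snippets = [
    "refund policy allows returns within 30 days",
    "office lunch menu for friday",
    "refund requests need an order number",
]
context = build_context("draft a reply about a refund request", snippets, 20)
print(context)
```

Here the lunch menu is dropped because it shares no words with the task, and the two refund snippets fit inside the 20-word budget. Starting a fresh session for a new task corresponds to calling `build_context` again with an empty history rather than appending to the old one.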

▲ More and more enterprises are starting to be meticulous about AI model costs

Not Just Coinbase, This is a Trend

Coinbase isn't the first to try this.

At Lindy, an AI startup with only 25 employees, CEO Flo Crivello completely replaced Claude with DeepSeek. He told CNBC: "AI costs have already surpassed human costs; this is unsustainable." After switching models, costs "plummeted," saving millions of dollars.

Snowflake's CEO Sridhar Ramaswamy conducted a hands-on comparison: on 103 coding tasks, GLM-5.2 solved 66%, while Claude Opus 4.7 solved 67%. The gap? Almost none.

But the price gap is real:

Price Comparison (Per Million Tokens)

GLM-5.2: Input $1.40 / Output $4.40
Claude Opus 4.7: Input $5 / Output $25
GPT-5.5: Input $5 / Output $30

Output prices differ by roughly 5.7 times (Claude Opus 4.7) to 6.8 times (GPT-5.5); input prices differ by about 3.6 times.
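The ratios can be checked directly from the listed prices. The short script below does that and then prices a hypothetical monthly workload of 100 million input tokens and 20 million output tokens; the workload size is an assumption for illustration only.

```python
# Per-million-token prices quoted in this article: (input, output) in USD.
PRICES = {
    "GLM-5.2": (1.40, 4.40),
    "Claude Opus 4.7": (5.00, 25.00),
    "GPT-5.5": (5.00, 30.00),
}


def price_ratios(baseline="GLM-5.2"):
    """Return (input ratio, output ratio) of each model versus the baseline."""
    base_in, base_out = PRICES[baseline]
    return {
        name: (round(inp / base_in, 2), round(out / base_out, 2))
        for name, (inp, out) in PRICES.items()
        if name != baseline
    }


def workload_cost(model, input_millions, output_millions):
    """USD cost of a workload measured in millions of tokens."""
    inp, out = PRICES[model]
    return input_millions * inp + output_millions * out


print(price_ratios())
for name in PRICES:
    # Hypothetical workload: 100M input tokens, 20M output tokens.
    print(f"{name}: ${workload_cost(name, 100, 20):,.2f}")
```

For that assumed workload the bill comes to about $228 on GLM-5.2, $1,000 on Claude Opus 4.7, and $1,100 on GPT-5.5, before any caching discounts.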

Cheap Means No Good? Don't Jump to Conclusions

Reading this, you might ask: It's so much cheaper, is the quality the same?

Honestly, not exactly the same, but the gap is smaller than you think.

Snowflake's tests showed that GLM-5.2 is indeed less stable on certain tasks—first-attempt success rate was 47.6%, lower than Opus's 53.7%. Also, GLM sometimes "perseverates" on the wrong approach: on one task, it spent 24 minutes making 411 tool calls and still failed. Opus solved it in 9 minutes with 49 calls.

But on most tasks, the final success rates of the two were almost equal. The key question is: are you willing to pay more than five times as much for a few percentage points of stability?
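One way to frame that question is cost per solved task rather than cost per token. The sketch below is a back-of-the-envelope calculation, assuming cost scales with output tokens only and using the final solve rates (66% and 67%) and output prices quoted above. It ignores input tokens, caching, and latency, so treat the result as a rough bound, not a measurement.

```python
def cost_per_solved_task(cost_per_attempt, solve_rate):
    """Expected spend per successfully solved task."""
    if not 0.0 < solve_rate <= 1.0:
        raise ValueError("solve_rate must be in (0, 1]")
    return cost_per_attempt / solve_rate


def break_even_token_multiplier(cheap_price, cheap_rate,
                                premium_price, premium_rate):
    """How many times more tokens per task the cheaper model could burn
    before its cost per solved task matches the premium model's."""
    premium = cost_per_solved_task(premium_price, premium_rate)
    cheap = cost_per_solved_task(cheap_price, cheap_rate)
    return premium / cheap


# Output prices (USD per million tokens) and solve rates from this article.
multiplier = break_even_token_multiplier(
    cheap_price=4.40, cheap_rate=0.66,
    premium_price=25.00, premium_rate=0.67,
)
print(f"break-even token multiplier: {multiplier:.2f}x")
```

Under these assumptions GLM-5.2 could use roughly 5.6 times as many output tokens per task as Opus and still break even. An outlier like the 411-call run eats into that margin, which is why the average across many tasks matters more than any single failure.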

For many companies, the answer is increasingly clear: No.

▲ The price gap between Chinese and Western AI models is reshaping the industry landscape

What Does This Mean for Us Ordinary People?

You might say: I'm not Coinbase, what does this have to do with me?

Actually, this trend offers three direct insights into how you use AI:

1. Don't Stick to Just One Model

Many people use AI and swear by just one—either ChatGPT or Claude. But professional players don't do that anymore. Using different models for different tasks is the most cost-effective approach.

Use cheaper ones for daily Q&A; use better ones for coding and analysis. It's like eating; you don't go to a Michelin-starred restaurant for every meal.

2. Caching and Reuse are Key to Saving Money

If you often use AI for similar tasks (like writing weekly reports or organizing notes daily), learning to leverage caching and templates can significantly reduce consumption.

3. Streamline Context = Better Results

Many people feed AI every bit of background information they have. But in practice, giving AI less but more precise information leads to better results. New task? Start a new conversation. Don't make the AI search through a pile of history for answers.

Deeper Change: AI Pricing Models are Being Reshaped

Behind this wave of "model migration" is a shake-up of the entire AI industry's pricing logic.

The high valuations of OpenAI and Anthropic are built on the assumption of "continued high-speed revenue growth." But if more and more companies, like Coinbase and Lindy, switch to cheaper alternatives, this assumption crumbles.

Reportedly, OpenAI and Anthropic have already begun a price war. In OpenAI's newly released GPT-5.6 series, the Terra model is half the price of GPT-5.5, and the Luna model focuses on being the lowest-cost option.

For users, this is good news. The fiercer the competition, the lower the prices, and the more choices available.

When US giants start using Chinese models to save money, it shows that AI competition is no longer just a benchmark race in the lab, but a real cost competition measured in hard cash. The real skill is achieving the same results while spending less.
