When US Giants Collectively "Defect" to Chinese AI Models

marsbit发布于2026-07-03更新于2026-07-03

文章摘要

When Silicon Valley Giants Turn to Chinese AI Models to Cut Costs A surprising trend is emerging: major U.S. tech companies are significantly reducing AI costs by switching to Chinese models. Coinbase, the largest U.S. cryptocurrency exchange, reportedly halved its AI spending after migrating to China's GLM-5.2 and Kimi 2.7 models, despite increasing usage. They achieved this through a sophisticated three-part strategy: implementing an automatic routing system to select the most cost-effective model per task, boosting cache hit rates from 5% to 60% to reuse computations, and employing "context engineering" to provide AI with more precise, less cluttered information. They are not alone. AI startup Lindy switched from Claude to DeepSeek, saving millions, while Snowflake's tests found GLM-5.2 solved 66% of coding tasks compared to Claude Opus's 67%—but at a fraction of the cost (output pricing is 5-7 times lower). While the top Western models may offer slightly better stability, the massive price differential is leading many businesses to reconsider their value proposition. This shift signals a deeper change in the AI industry, moving beyond pure performance benchmarks to a fierce cost competition. As pressure mounts, even OpenAI and Anthropic have begun slashing prices. For users, this means more choices, lower costs, and a crucial lesson: using multiple models based on task complexity, optimizing with caching, and keeping contexts lean are now key to leveraging AI efficient...

Original Title: US Largest Crypto Exchange Quietly Switches to Chinese AI Model, Saves Half the Cost

Original Author: AI Hands-on Notes

A Data Point That Makes Silicon Valley Uneasy

Recently, a statement made by Brian Armstrong, the CEO of the largest US cryptocurrency exchange, Coinbase, caused a stir in the tech circle:

"We switched our AI models to China's GLM 5.2 and Kimi 2.7, cutting AI expenses in half."

Cut in half? Did usage also drop?

On the contrary. Coinbase's token usage has been consistently increasing.

Using more while spending less is what truly makes OpenAI and Anthropic uneasy.

How Did They Do It? Three Cost-Saving Strategies

Coinbase didn't just swap to a cheaper model. They built a complete "cost-saving system":

First Move: Don't Lock into One Model, Let the System Choose

Coinbase built an automated routing system. For each incoming request, the system automatically selects the most suitable model based on task type, price, and cache status.

Not every task requires the most expensive model. Simple translations use cheaper ones; complex reasoning uses better ones—just like you wouldn't drive a sports car to buy groceries downstairs.

Second Move: Boost Cache Hit Rate from 5% to 60%

This is the most impactful move. By optimizing caching strategies, Coinbase increased the cache hit rate from 5% to 60%.

Simply put, 60% of requests can reuse previous calculation results, significantly reducing the actual cost per call. This single optimization saved a substantial amount of money.

Third Move: Context Engineering

Coinbase requires developers to streamline context, start new sessions for new tasks, and avoid cramming too much into a single conversation.

This isn't laziness; it's a new field of study—known in the industry as Context Engineering. In a technical blog, Anthropic explicitly stated: when managing AI agents, context engineering is more effective than prompt engineering.

Simply put: it's not about making the AI smarter, but giving it more precise information.

▲ More and more enterprises are starting to be meticulous about AI model costs

Not Just Coinbase, This is a Trend

Coinbase isn't the first to try this.

Lindy, an AI startup with only 25 people, had its CEO Flo Crivello completely replace Claude with Deepseek. He told CNBC: "AI costs have already surpassed human costs; this is unsustainable." After switching models, costs "plummeted," saving millions of dollars.

Snowflake's CEO Sridhar Ramaswamy conducted a hands-on comparison: on 103 coding tasks, GLM-5.2 solved 66%, while Claude Opus 4.7 solved 67%. The gap? Almost none.

But the price gap is real:

Price Comparison (Per Million Tokens)

GLM-5.2: Input $1.40 / Output $4.40
Claude Opus 4.7: Input $5 / Output $25
GPT-5.5: Input $5 / Output $30

Output prices differ by 5-7 times.

Cheap Means No Good? Don't Jump to Conclusions

Reading this, you might ask: It's so much cheaper, is the quality the same?

Honestly, not exactly the same, but the gap is smaller than you think.

Snowflake's tests showed that GLM-5.2 is indeed less stable on certain tasks—first-attempt success rate was 47.6%, lower than Opus's 53.7%. Also, GLM sometimes "perseverates" on the wrong approach: on one task, it spent 24 minutes making 411 tool calls and still failed. Opus solved it in 9 minutes with 49 calls.

But on most tasks, the final success rates of the two were almost equal. The key question is: Are you willing to pay 5 times more for a few percentage points of stability?

For many companies, the answer is increasingly clear: No.

▲ The price gap between Chinese and Western AI models is reshaping the industry landscape

What Does This Mean for Us Ordinary People?

You might say: I'm not Coinbase, what does this have to do with me?

Actually, this trend offers three direct insights into how you use AI:

1. Don't Stick to Just One Model

Many people use AI and swear by just one—either ChatGPT or Claude. But professional players don't do that anymore. Using different models for different tasks is the most cost-effective approach.

Use cheaper ones for daily Q&A; use better ones for coding and analysis. It's like eating; you don't go to a Michelin-starred restaurant for every meal.

2. Caching and Reuse are Key to Saving Money

If you often use AI for similar tasks (like writing weekly reports or organizing notes daily), learning to leverage caching and templates can significantly reduce consumption.

3. Streamline Context = Better Results

Many people feed AI with every bit of background information. But facts show that giving AI less but more precise information leads to better results. New task? Start a new conversation. Don't make the AI search through a pile of history for answers.

Deeper Change: AI Pricing Models are Being Reshaped

Behind this wave of "model migration" is a shake-up of the entire AI industry's pricing logic.

The high valuations of OpenAI and Anthropic are built on the assumption of "continued high-speed revenue growth." But if more and more companies, like Coinbase and Lindy, switch to cheaper alternatives, this assumption crumbles.

Reportedly, OpenAI and Anthropic have already begun a price war. In OpenAI's newly released GPT-5.6 series, the Terra model is half the price of GPT-5.5, and the Luna model focuses on being the lowest-cost option.

For users, this is good news. The fiercer the competition, the lower the prices, and the more choices available.

When US giants start using Chinese models to save money, it shows that AI competition is no longer just a benchmark race in the lab, but a real cost competition measured in hard cash. The real skill is achieving the same results while spending less.

你可能也喜欢

波场网络受稳定币结算驱动创下交易量与活跃地址记录

波场网络在六月创下历史新高，总交易量超过3.85亿笔，活跃钱包地址达到2690万个。这一增长主要由稳定币结算量（特别是OUSD/USDT）推动，表明网络使用活跃度显著提升，而非仅仅是市场价格波动。数据来源明确指向Tronscan.org的官方网络交易和活跃地址记录。文章强调，这一记录反映了稳定币活动在波场网络上的实际使用和结算规模，是具体的网络数据里程碑，不应直接等同于TRX代币价格将上涨的信号。分析师认为，这为市场提供了一个可验证的数据点，有助于判断资本流动和基础设施使用的结构性趋势，但投资者仍需关注后续的监管动态、市场流动性等风险因素。

bitcoinist24分钟前

bitcoinist24分钟前

TRON Nile测试网部署抗量子签名密码学

TRON Nile测试网已部署抗量子签名密码学。此次升级旨在保护账本免受未来量子计算可能带来的解密风险，是Layer 1安全方面的前瞻性举措。关键信息源自主网nileex.io及github.com。报道指出，此更新符合当前市场对稳定币等主题的关注，为投资者提供了一个需考量的具体进展。但需注意，该部署目前仅在测试网进行，而非TRON主网。这意味着应将其视为一个范围明确的确切进展，而非必然引发价格变动的市场广泛转向。在加密市场快速变化的环境中，单个数据点需结合流动性、市场结构和后续官方确认来综合评估。

bitcoinist1小时前

bitcoinist1小时前

BIS 报告合规观察：稳定币真正的风险，不只是“脱锚”

国际清算银行（BIS）最新报告指出，稳定币的真正风险不仅在于价格“脱锚”，更在于其能否融入可识别、可监测、可追责、可监管的金融体系。报告承认稳定币和代币化技术在支付效率、可编程性等方面的优势，但强调货币的核心在于背后的制度安排，包括兑付确定性、流动性支持、法律与监管框架以及金融完整性要求。从合规视角看，稳定币的风险是一组系统性问题：客户身份难以清晰识别、资金来源与目的不明、跨链交易路径碎片化、责任主体模糊。链上数据的公开性并不等于合规透明，“地址可见”不等于“身份可见”。稳定币的规模已不容忽视，其风险会通过出入金、支付机构等渠道传导至传统金融体系，要求银行等机构加强客户尽调与交易监测。 BIS报告建议，未来应将代币化技术嵌入以央行货币和受监管机构为基础的两层体系，实现“规则前置”——在交易流程中直接嵌入客户识别、风险预筛查、可审计数据轨迹等合规要求。这提示合规部门，金融创新的长远发展必须解决“谁识别、谁监测、谁负责”的根本问题。合规不应是创新的障碍，而应成为其可持续发展的基础设施。

链捕手1小时前

链捕手1小时前

当美国巨头集体“叛逃”中国 AI 模型

美国最大加密货币交易所Coinbase将AI模型切换至中国的GLM-5.2和Kimi-2.7，使AI支出减少一半，但使用量仍在增长。这一变化通过三招策略实现：建立自动路由系统根据任务选择最合适模型；将缓存命中率从5%大幅提升至60%；推行“上下文工程”，要求开发者精简对话内容以提高效率。这股趋势并非个例。AI创业公司Lindy用Deepseek替换Claude后成本“断崖式下降”；Snowflake的测试显示，GLM-5.2在多项任务上的表现与价格贵5-7倍的Claude Opus接近。虽然中国模型在稳定性上可能略逊一筹，但巨大的价格优势促使越来越多的企业重新考量性价比。这一现象正在重塑AI行业格局。用户可从中获得启示：不必只认一个模型，应根据任务灵活选用；善用缓存和模板能显著降低成本；与AI对话时提供更精准而非更冗长的上下文，效果更好。更深层次看，美国巨头的选择动摇了OpenAI等公司依赖高增长收入的定价逻辑，可能引发行业价格战，最终为用户带来更多选择和更低成本。AI竞争已从技术跑分进入真金白银的成本较量阶段。

链捕手1小时前

链捕手1小时前

Sui测试网更新v1.74.1通过协议版本128大幅降低交易Gas成本

Sui区块链的开发公司Mysten Labs在测试网部署了v1.74.1版本更新，并启用了协议版本128。此次升级的核心在于显著降低了用户在测试网进行交易时的Gas费用。此举旨在优化网络性能，为将来在主网上的规模化部署做好准备。文章强调，这一信息源自官方渠道（如GitHub上的发布记录），提供了基于事实的清晰进展，而非模糊的市场情绪。报道明确指出，当前更新仅适用于测试网环境，尚未部署至主网。因此，这应被视为一个具有明确范围的已确认技术进展，有助于评估Sui协议的发展动向，但并不能直接等同于未来的价格变动或广泛的市场趋势转变。对于市场参与者而言，这是一个需要结合流动性、市场结构以及其他官方后续信息来综合权衡的数据点。

bitcoinist1小时前

交易

现货

When US Giants Collectively "Defect" to Chinese AI Models

文章摘要

A Data Point That Makes Silicon Valley Uneasy

How Did They Do It? Three Cost-Saving Strategies

Not Just Coinbase, This is a Trend

Cheap Means No Good? Don't Jump to Conclusions

What Does This Mean for Us Ordinary People?

Deeper Change: AI Pricing Models are Being Reshaped

热门币种推荐

相关问答

你可能也喜欢

波场网络受稳定币结算驱动创下交易量与活跃地址记录

TRON Nile测试网部署抗量子签名密码学

BIS 报告合规观察：稳定币真正的风险，不只是“脱锚”

当美国巨头集体“叛逃”中国 AI 模型

Sui测试网更新v1.74.1通过协议版本128大幅降低交易Gas成本

交易

热门文章

自主AI经济的基石：Talus如何重塑链上智能代理

火币成长学院：AI与Crypto深度研报：算法与账本的共生时代

从H2A到A2A：AI Agent经济体与Crypto新机遇

相关讨论

热门问答

热门分类

热门标签