DeepSeek用V4重画了坐标系

marsbitPubblicato 2026-05-01Pubblicato ultima volta 2026-05-01

文 | 云涌AI ,作者 | 黄云皓

2026年4月23日,OpenAI在API价格页上线GPT-5.5:输入价(input)$5.00,输出价(output)$30.00,缓存命中价(cached input)$0.50。比上一代GPT-5.4的$2.50/$15/$0.25整整翻一倍,三档同时抬高。再往前推八个月,2025年8月,GPT-5的输入价还是$1.25;到这一天涨到$5.00,已经是当时的4倍。

4月26日,DeepSeek在V4系列价格页底部加了一行脚注:所有模型的缓存命中价,永久降至原价的1/10。V4-Pro这一档,缓存命中价从$0.145掉到$0.0145。

把两份价格表放到一起,缓存命中这一项:GPT-5.5是$0.50,V4-Pro是$0.0145,差34.5倍;如果再算上V4-Pro的“75% off”临时促销,两者相差138倍。

同一周里、两份价格表朝相反方向各走出两个数量级,一句“价格战”已经很难描述这种差距了。

这一周,模型厂商已经不在同一坐标系

01. 价格调整:脚注里写“永久”

DeepSeek这次调价同时有两个动作。

第一个是临时促销:V4-Pro挂着“75% off”,输入$1.74、输出$3.48、缓存命中$0.0145三档同步打折,折后落到$0.435、$0.87、$0.003625,5月31日后将恢复原价。

第二个写在脚注里:所有模型的缓存命中价永久降至原价的1/10。

真实的生产场景里,输入的提示词(prompt)往往包括每次重复的系统指令、角色设定、文档、工具定义,也包括这次新来的用户问题。在长期任务或重复工作中,前者通常占八九成,服务端只算一次、下次直接复用。这就是“缓存命中”,按低一档的“缓存命中价”计费。

DeepSeek把这一档永久砍到原价的1/10——账单里最大的一块,从此变成零头。七五折5月31日就到期,而缓存命中这一刀,不撤销。

DeepSeek敢这么砍,是因为V4在架构上把单token成本进一步压下来了。1M长上下文同口径下,V4-Pro处理同样任务消耗的算力(FLOPs)只有V3.2的27%,KV Cache(推理时保存上下文的显存)占用只有10%;V4-Flash再低一档,算力10%、KV Cache 7%。

所以$0.0145不是促销价,是架构压出来的。

DeepSeek最后给出的价格是:

  • V4-Flash:$0.14/$0.28/$0.0028(输入/输出/缓存命中)。同档OpenAI GPT-5.4 mini是$0.75/$4.50/$0.075,Anthropic Haiku 4.5是$1/$5/$0.10。
  • V4-Pro:$1.74/$3.48/$0.0145。同档OpenAI GPT-5.5是$5/$30/$0.50,Anthropic Opus 4.7是$5/$25/$0.50。

DeepSeek V4系列价格和脚注,来源:DeepSeek官方文档

要解释的不再是DeepSeek。这一周之后,其他模型厂要么跟着把小数点向左挪,要么留在原位,解释这30倍差价从哪里来。

02. 迁移成本:改两个字符串

价格表已经把差距摆出来了。下一步的问题不是“便不便宜”,而是“能不能换过去”。如果接入方式不兼容,开发者要改客户端、重写工具调用、重跑一批老任务,再低的单价也会先卡在工程成本里。

DeepSeek这次把这道门压低了。它同时挂出两个API入口地址(base URL):https://api.deepseek.com 兼容OpenAI Chat Completions,https://api.deepseek.com/anthropic 兼容Anthropic Messages。V4-Pro和V4-Flash两个模型,在两个入口下都能跑。

对原本接OpenAI Chat Completions或Anthropic Messages的人来说,迁移到DeepSeek现在变成了三步:改base_url,换API key,把模型名替成deepseek-v4-flash或deepseek-v4-pro。这还不能直接替换生产,但应用的API调用已经可以指向DeepSeek:先小范围放量,再对同一批任务比较回答质量和成本。

调通API,只是第一步。工具调用(tool calling)的参数、返回格式和失败路径要重测,长上下文里会不会漏信息、答偏、变慢,也要重新测试;企业采购还要过合规、内部SLA、私有部署和安全评估。最先能动起来的,还是那些把模型封装在API后面、随时可以切供应商的开发者和初创团队。

03. 市场反馈:4个月对7年

V4上线当天,2026年4月24日凌晨,AI编码助手Cline的创始人Saoud Rizwan在X上发了一条:

deepseek v4 is now the cheapest sota model available at 1/20th the cost of opus 4.7. for perspective, if uber used deepseek instead of claude their 2026 ai budget would have lasted 7 years instead of only 4 months.(DeepSeek v4现已成为市场上价格最低的SOTA模型,其成本仅为Opus 4.7的二十分之一。从另一个角度来看,如果Uber使用DeepSeek而非Claude,那么他们2026年的AI预算本可以维持7年,而非仅仅4个月。)

Saoud Rizwan(Cline创始人)2026年4月24日凌晨在X上的原帖,来源:x.comsdrzn

“4个月”这个数不是修辞。Uber CTO Praveen Neppalli Naga在2026年4月接受The Information采访时确认:Uber 2026年整年的AI预算,4月就已经烧完,主要烧在Claude Code在内部工程团队铺开上。

这条推文的杀伤力不在“7年”是否精确。真正重要的是,它把“企业AI预算被模型调用迅速吃完”这件事,和V4的公开定价摆在同一张账单上;在开发者社区里,这种对照就是迁移决策的导火索。

同一天,独立评测者Simon Willison发了V4上手测评,把V4-Flash、V4-Pro与GPT-5.5、Opus 4.7、Gemini 3.1 Pro等十多款前沿模型的定价摆进同一张对照表,结论是:V4-Flash是市面上最便宜的小模型,V4-Pro是最便宜的前沿大模型。

一周之内,第三方模型路由平台OpenRouter的V4-Pro模型页画出了一条上线即起飞的曲线:4月24日上线当天约5B prompt tokens,到4月29日已涨至46.1B prompt、705M reasoning、449M completion(分别对应用户输入提示词、模型推理过程、最终输出三类token),一周不到翻了近10倍——开发者侧的真实路由流量。

OpenRouter的V4-Pro模型流量数据,来源:OpenRouter

四件事在同一周里凑齐:实名站台(Saoud Rizwan、Simon Willison)、具体的成本对比(4个月对7年)、公开评测、第三方路由流量。

这不是“会迁移”的远期推论,是迁移开始的早期势头。

04. 反平台

价格表只能说明这一刀砍得多狠,不能说明DeepSeek站在哪儿。要看清V4的位置,得把三件事拆开看:架构成本、商业模式、战略意图。

架构成本:压低单token的物理上限

DeepSeek这一刀能下到$0.0145,并不源自定价部门的勇气,而是基于V4的架构换代。模型每读一段长文,都要把读过的内容暂存在显存里——这块“草稿纸”叫KV Cache,上下文越长、草稿纸越大、推理越烧钱。V4在注意力层用了一组新的混合压缩法:CSA(Compressed Sparse Attention)把KV物理压到1/4,再叠加“只看重点”的逻辑稀疏;HCA(Heavily Compressed Attention)压得更狠,物理压到1/128,再用全局注意力补漏。两类压缩法在网络中交替工作。

从V2的MLA、V3.2的DSA一路下来,DeepSeek每一代都在压同一件事——长上下文里的KV Cache和算力消耗。

到V4这一代,1M长上下文同口径下,V4-Pro比V3.2少消耗73%的算力(FLOPs只剩27%),KV Cache只占10%;V4-Flash再低一档,FLOPs 10%、KV Cache 7%。HuggingFace在V4解读里给了一个直观比对:V4的KV Cache只有同等条件下“业界标准省内存写法”(8-head GQA + BF16 KV)的2%——同样一段长对话,别家要占的显存,V4只用1/50。

V4 vs V3.2 架构同口径对照(1M 上下文),来源:DeepSeek V4 技术报告

物理空间往下压的同时,国产算力的适配在并行推进。V4在华为昇腾950上已跑通实测,智源FlagOS也把V4-Flash适配到了华为昇腾、海光、沐曦、昆仑芯等多款国产芯片。

在V4-Pro官方API页面以小字备注:V4-Pro受限于高端算力,预计下半年昇腾950超节点(把数十张芯片用高速互联拼成一台大机器、专门跑大模型推理)批量上市后,Pro的价格还将大幅下调。这一句把下半年的降价空间,直接挂在国产超节点的产能上。

商业模式:不靠API的毛利养现金流

主流玩家最近的动作是涨价。OpenAI在4月23日把GPT-5.5的价格在GPT-5.4基础上翻倍,同时在GPT-5.5之上新增一档GPT-5.5 Pro,定价$30/$180——一边是同档涨价,一边是把价格梯子的顶端再往上抬,只对愿为额外能力付高价的企业客户开放。Anthropic走的是同一条路径:换装的新tokenizer让同输入最多多产生35%的token,实际账单提高;同时在Opus 4.6上新开Fast mode顶端档$30/$150(6倍于标准价)。

中国头部厂商沿着同一方向走:阿里旗舰Qwen3.6-Max-Preview于4月20日首次以闭源形式发布;阿里云、百度云3月18日同日上调AI算力价5–34%、存储涨30%,阿里云4月15日又上调百炼平台部分MU模型单元服务价;智谱年内三次调价;月之暗面4月20日发布Kimi K2.6,API输入价从每百万token 4元提至6.5元,涨价58%。

一连串动作方向一致:单价上涨、通过细分市场把能力卖更高价、重心转向高毛利企业客户,提高API毛利撑住利润。

DeepSeek走的是反方向。母公司幻方2025年收益率56.55%,主营现金流不依靠卖API赚钱。融资这一头同样不缺:4月17日路透社首次报道DeepSeek新一轮估值至少100亿美元,4月22日彭博、The Information报道腾讯、阿里加入谈判,把估值推至200亿美元以上——6天里估值翻倍;彭博同时透露,腾讯在谈判桌上提出收购DeepSeek 20%股权,被DeepSeek回绝。云大厂主动加码抢入,DeepSeek却在挑钱的“形状”。

现金流不靠API、估值不靠API、控制权也不轻易让出,永久把缓存命中价格砍到1/10并不是打价格战,是“不用拼这场仗”

战略意图:技术生态拓展

梁文锋在2024年7月接受专访时讲过几句话:

我们不会闭源。我们认为先有一个强大的技术生态更重要。
开源更像一个文化行为,而非商业行为。
我们经常说中国AI和美国有一两年差距,但真实的gap是原创和模仿之差。
 这一波浪潮里,我们的出发点就不是趁机赚一笔,而是走到技术的前沿去推动整个生态发展。

这不是一时表态。DeepSeek创业时的第一篇技术报告标题就是《DeepSeek LLM: Scaling Open-Source Language Models with Longtermism》——长期主义和开源,是写在第一篇文章封面上的。

长期主义和开源写在论文标题里,来源:DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

这个意图落到V4上,是同时在做三件事。

  • 全档MIT开源。 V4-Pro(1.6T总参数/49B激活)、V4-Flash(284B/13B激活)两档全部以MIT放出,不保留。在阿里、字节、百度旗舰相继转闭源的当口,这本身是一种方向选择。
  • 落地形态开放。 DeepSeek自己运营双base URL的API入口,同时把V4送上阿里云百炼、火山方舟、华为云、腾讯云、英伟达云的主流第三方云;以MIT开源支持全档私有化部署(含1.6T旗舰);并作为基座供二次开发。API、第三方云、私有化、二次开发——四种落地入口平级展开,没有哪一种被定位为“主战场”,统统开放。
  • 主动适配多元芯片架构。 V4早期访问阶段先给到华为昇腾、寒武纪;4月24日上线当天,华为云首发适配V4-Flash并同步上线10+昇腾融合算子(针对昇腾芯片定制的核心计算模块),智源FlagOS再把V4-Flash适配到海光、沐曦、摩尔线程、昆仑芯等8+款国产芯片。不偏废NVIDIA——同日NVIDIA官方Developer Blog发文宣布Blackwell上day-0可用。

模型、入口、硬件,全部从DeepSeek手里放出去。MIT开源让模型触手可及;开放的落地形态让DeepSeek无孔不入;多元芯片适配让V4通行无阻。三层叠起来,V4进入一个自己也关不掉的技术生态。“一个强大的技术生态”,在V4上就是这个形状。

OpenAI、Anthropic、阿里、字节、百度等主流玩家方向一致:闭源旗舰、自营API,把客户圈进围栏,让生态围着自家平台转。这是平台路径,用模型当门票、用API当通道、用迁移成本当护城河,把“平台”建起来。

DeepSeek反过来,把这三样全往外放,它不是在搭一个属于自己的平台,而是在拆掉所有让自己变成平台的东西。

这种定位,可称之为——反平台。

反平台是名词,不是动词。从这一周起模型厂商走向分化,开篇那句“不在同一坐标系”,到这里才有了具体所指。“价格”只是表面那一瞥,把两边真正分开的,是平台与反平台的分化。

尾声

这不是一场“价格战”,是一次分化。同一周里,价格、协议、模型厂的位置都朝两边走:一边是平台,一边是反平台。

地图在重画,不只是价格在动。

END
 
作者 | 黄云皓
出品 | 云涌AI
云涌创新 | 在复杂中,看见涌现 

写完了,但涌现还在继续。欢迎补一个你的视角

参考资料:

  1. DeepSeek 官网|DeepSeek
  2. DeepSeek-V4 Technical Report|DeepSeek
  3. DeepSeek-V4: Better, Faster, Cheaper at Long Context|HuggingFace
  4. OpenAI 官网|OpenAI
  5. Anthropic 官网|Anthropic
  6. “deepseek v4 is now the cheapest sota model …”|Saoud Rizwan,X
  7. Uber CTO Shows How Claude Code Can Blow Up AI Budgets|The Information
  8. DeepSeek V4—almost on the frontier, a fraction of the price|Simon Willison
  9. OpenRouter 官网|OpenRouter
  10. 阿里 Qwen 官网|阿里 Qwen
  11. 阿里云官网|阿里云
  12. 百度智能云官网|百度智能云
  13. 月之暗面官网|月之暗面
  14. 智谱 AI 官网|智谱AI
  15. China’s DeepSeek is raising funds at $10 billion valuation, The Information reports|路透社
  16. Tencent, Alibaba in Talks to Join DeepSeek’s First Funding Round|彭博/The Information
  17. 智源 FlagOS 官网|智源研究院
  18. 华为云官网|华为云
  19. Build with DeepSeek V4 Using NVIDIA Blackwell and GPU-Accelerated Endpoints|NVIDIA Technical Blog
  20. DeepSeek LLM: Scaling Open-Source Language Models with Longtermism|arXiv
  21. 揭秘DeepSeek:一个更极致的中国技术理想主义故事|暗涌 Waves

Letture associate

Google TPU Shipments Revised Up by 50%

Recent industry research indicates a significant upward revision in the shipments of Google's TPU (Tensor Processing Unit) chips. Previous expectations for 2027 were set at around 10 million units, but new estimates now point to 15 million units, a 50% increase. This substantial boost directly translates to higher demand across the entire supporting supply chain. Google's TPU clusters utilize a standardized all-optical interconnect architecture. Consequently, key hardware components are deeply integrated and scaled in fixed ratios with the chips. The 15 million TPU target will drive corresponding demand increases for NPO optical engines (roughly a 1:1 match), 1.6T optical modules, OCS optical switches, high-end server power supplies, fiber optics & MPO connectors, and liquid cooling solutions. Among these, liquid cooling is highlighted as the sector experiencing the most significant transformation and offering the most stable potential for excess returns. As next-generation TPU chips reach power levels where traditional air cooling is insufficient, liquid cooling becomes essential. 2026 is forecasted as the first year of substantial adoption for Google's liquid cooling solutions. This shift, coupled with delivery and capacity bottlenecks faced by incumbent overseas manufacturers, is creating a prime window for domestic Chinese suppliers to enter and secure Google's core supply chain. The market size for Google-specific liquid cooling is projected to potentially triple from a baseline of hundreds of billions to around 300 billion units by 2028. The logic for the fiber optic sector is also being rewritten. Once considered a cyclical commodity tied to telecom operator procurement, fiber is now a strategic and scarce resource for AI Data Centers (AIDC). A severe supply-demand imbalance, driven by the long lead time for preform production (18-24 months) and surging demand from cloud giants, is supporting strong performance. Chinese fiber manufacturers are well-positioned to capture a significant share of global AIDC demand, with exports potentially reaching 200-300 million core kilometers in 2026. Overall, the investment focus within the AI computing industry is shifting from pure "chip performance speculation" towards the more certain incremental growth in computing infrastructure and its supporting ecosystem. The upward revision in Google TPU shipments, along with the potential for further doubling by 2028, is seen as solidifying performance visibility for the entire supporting supply chain over the next two years.

marsbit28 min fa

Google TPU Shipments Revised Up by 50%

marsbit28 min fa

What Wall Street Really Wants After the Crypto Story Recedes

The tide of speculative crypto narratives has receded, revealing Wall Street's true objective: building a controlled, yield-generating, and compliant financial pipeline on distributed ledgers. They are migrating core functions onto blockchains, not for decentralization, but for efficiency and new revenue streams. Key developments include BlackRock's BUIDL fund, a tokenized treasury fund acting as a foundational reserve asset, and the rise of Securitize, which is going public and partnering with the NYSE to build a 24/7 digital securities trading and settlement system. This signals a major shift of securities clearing to blockchain technology. To make volatile assets like Bitcoin palatable for institutional investors, firms like BlackRock and Goldman Sachs are creating "covered call" ETFs (e.g., BITA). These products systematically sell options on Bitcoin holdings, transforming price volatility into stable monthly income, effectively repackaging crypto as a yield-bearing asset. Stablecoins are being positioned not as speculative tools but as efficient payment rails. Companies like Stripe and Mastercard are integrating them for instant, low-cost merchant settlements and cross-border card payments, respectively. Critically, new legislation like the GENIUS Act shapes them as non-interest-bearing, heavily regulated extensions of the US dollar system. In summary, Wall Street is quietly constructing a parallel, blockchain-based financial infrastructure featuring tokenized traditional assets, structured crypto yields, and programmable dollar pipelines—all under its control and fully integrated with existing regulatory and credit frameworks.

marsbit45 min fa

What Wall Street Really Wants After the Crypto Story Recedes

marsbit45 min fa

Tying Itself to SpaceX: Cursor's $60 Billion Rise

This article recounts the rapid rise of AI-powered coding startup Cursor and its 25-year-old MIT graduate CEO, Michael Truell. Launched in 2023, Cursor achieved explosive growth, reaching over 10 billion USD in revenue by late 2025. However, its journey highlights a central dilemma for AI application companies: dependence on foundational model providers. Cursor initially relied heavily on Anthropic's models but faced an existential threat when Anthropic launched its own competing coding tool, Claude Code. In response, Cursor declared an internal emergency in early 2026 and accelerated development of its own model, Composer. To secure the immense computing power needed, Truell struck a pivotal deal with Elon Musk's SpaceX in April 2026. The collaboration grants Cursor access to SpaceX's supercomputing resources for Composer, while SpaceX's Grok model benefits from Cursor's programming data. The agreement includes a potential 600 billion USD acquisition of Cursor by SpaceX later in the year, though a substantial termination fee is in place if the deal falls through. The story explores Cursor's intense, sometimes controversial hiring practices involving lengthy unpaid "work trials," its complex partnership-turned-rivalry with Anthropic, and its high-stakes gamble to ensure independence through the SpaceX alliance. The core question remains: will Cursor evolve into a defining, independent "generational" software company, or become a key piece in a tech giant's AI arsenal?

marsbit49 min fa

Tying Itself to SpaceX: Cursor's $60 Billion Rise

marsbit49 min fa

Warsh's Debut: Will the FED Chair Who Knows Crypto Best Bring Surprises or Shocks to the Market?

Kevin Warsh, the new Federal Reserve Chairman, prepares for his inaugural press conference amidst a challenging macroeconomic landscape: resurgent inflation, a bond market sell-off, and political pressure from President Trump for rate cuts. Uniquely, Warsh holds indirect investments in over 20 crypto and Web3 entities (e.g., Solana, dYdX), making him the first Fed Chair with disclosed crypto exposure. His stance may combine a hawkish, inflation-focused monetary policy with a crypto-friendly regulatory philosophy that shifts from Powell’s “same risk, same rule” approach toward a framework acknowledging blockchain’s productivity value. Warsh’s leadership could impact crypto markets across three dimensions: a paradigm shift in regulation (potentially accelerating pro-innovation legislation and stable币 rules), a re-pricing of risk premiums based on clearer communication and his view of AI as a structural disinflationary force, and a long-term reallocation of global institutional capital driven by increased legitimacy. Two potential scenarios for the press conference are outlined. A “positive surprise” would involve a dovish-leaning tone on rates coupled with signals of regulatory openness, potentially boosting crypto asset valuations. Conversely, a “negative shock” would see a more hawkish-than-expected stance on inflation and rates, triggering a broad risk-asset selloff that crypto markets would not escape. While ethics rules required Warsh to divest his crypto holdings upon confirmation, his deep understanding of the technology may fundamentally lower policy uncertainty and build a more receptive long-term foundation for digital assets’ integration into the mainstream financial system.

marsbit11 h fa

Warsh's Debut: Will the FED Chair Who Knows Crypto Best Bring Surprises or Shocks to the Market?

marsbit11 h fa

Trading

Spot
Futures

Articoli Popolari

Come comprare 4

Benvenuto in HTX.com! Abbiamo reso l'acquisto di 4 (4) semplice e conveniente. Segui la nostra guida passo passo per intraprendere il tuo viaggio nel mondo delle criptovalute.Step 1: Crea il tuo Account HTXUsa la tua email o numero di telefono per registrarti il tuo account gratuito su HTX. Vivi un'esperienza facile e sblocca tutte le funzionalità,Crea il mio accountStep 2: Vai in Acquista crypto e seleziona il tuo metodo di pagamentoCarta di credito/debito: utilizza la tua Visa o Mastercard per acquistare immediatamente 44.Bilancio: Usa i fondi dal bilancio del tuo account HTX per fare trading senza problemi.Terze parti: abbiamo aggiunto metodi di pagamento molto utilizzati come Google Pay e Apple Pay per maggiore comodità.P2P: Fai trading direttamente con altri utenti HTX.Over-the-Counter (OTC): Offriamo servizi su misura e tassi di cambio competitivi per i trader.Step 3: Conserva 4 (4)Dopo aver acquistato 4 (4), conserva nel tuo account HTX. In alternativa, puoi inviare tramite trasferimento blockchain o scambiare per altre criptovalute.Step 4: Scambia 4 (4)Scambia facilmente 4 (4) nel mercato spot di HTX. Accedi al tuo account, seleziona la tua coppia di trading, esegui le tue operazioni e monitora in tempo reale. Offriamo un'esperienza user-friendly sia per chi ha appena iniziato che per i trader più esperti.

356 Totale visualizzazioniPubblicato il 2025.10.20Aggiornato il 2026.06.02

Come comprare 4

Discussioni

Benvenuto nella Community HTX. Qui puoi rimanere informato sugli ultimi sviluppi della piattaforma e accedere ad approfondimenti esperti sul mercato. Le opinioni degli utenti sul prezzo di 4 4 sono presentate come di seguito.

活动图片