和美国AI「御三家」聊DeepSeekV4和美团LongCat,有意外收获

marsbitPublished on 2026-04-28Last updated on 2026-04-28

随着DeepSeek-V4的发布,我问了一下闭源大模型的「御三家」如何看待这个竞争对手,它们各自的表态很有意思:

GPT-5.5比较「傲娇」,强调不是一次「终结比赛」的事件,但要「盯紧」这个对手,承认便宜的百万级上下文,会对整个行业都造成压力。

Gemini 3.1则是个「实诚人」,定性DeepSeek-V4属于「掀桌子」级别的危险竞品,当开源模型在自己的舒适区展现出巨大统治力时,「压迫感是极其真实的」。

Claude Opus 4.7显得最「从容」,表态「高兴对于不安」,甚至直言如果DeepSeek-V4在某些任务场景更好用那就切过去,自己「不需要在每件事上都是最优解」。

三个模型之间的细微差别,其实也和各家公司的风格高度相关,OpenAI的狂躁色彩,Google的家大业大,Anthropic的学究气质,完全吻合有没有?

除了DeepSeek V4,同一天,美团也开放测试了新的万亿级参数大模型LongCat-2.0-Preview,在继续向「御三家」追问时,话题开始拐向对于国产算力集群的惊讶,这就更有看点了。

简单来说,从今年开始,中国第一梯队的大模型,已经无碍的放在了数以万计的国产芯片上去做训练,不再是有限尺寸的小打小闹,也没有避重就轻的「只谈推理、不接训练」了。

这在过去被普遍认为是不可能实现的全链路替代,只能说一座冰山的水下体积,要数十倍于能够看到的大小。

行到水穷处,坐看云起时,看岁月之峥嵘啊。

1

前段时间,黄仁勋做客一档播客的对话切片火了。

因为从画面来看,老黄肉眼可见的急了,在关于英伟达迫切想把芯片卖给中国是不是在「资敌」这个话题上,两人竟然聊出了火药味。

黄仁勋连续用了「Childish」(幼稚)这个词来打断主持人的预设立场,并抛出了后来被各家媒体写进标题的那段话:

「如果DeepSeek的新模型首先在华为的芯片上发布,对于美国来说是一个相当糟糕的结果。」

考虑到播客录制的时间,黄仁勋言辞里的「如果」是多余的,因为这在行业里早就是一个公开的秘密了,实际上DeepSeek-V4发布的多次跳票,就是因为要花更多的时间适配国产芯片集群。

所以并没有什么「黄仁勋担心的事情终于成了事实」,他很清楚,自己是在「基于既定事实而去借题发挥」。

所以在上头时,黄仁勋几乎是在以一种斥责的口吻,批评主持人的禁运论及其支持者,认为这是一种失败主义的表现,美国根本不相信自己能够赢得竞争,所以要用拆台的手段,去遏制中国。

有人用视频模型整活,演绎了当时的「真实」气氛,你们可以感受一下:

无论如何,黄仁勋的即兴演出,可能会注定载入AI发展的史册,尤其是对于那些奋力越过山丘的国行模型厂商而言,自己说自己做对了,还不是那么有说服力,来自对手的肯定,才是货真价实的认证。

就,有一种脚下有路、心里有底的滋味。

2

算上DeepSeek-V4和美团LongCat-2.0-Preview,在全球范围内的万亿级参数模型数量上,中国也是第一次超过了美国,站在了「多数派」的一边。

在前些年的最高点,新加坡消费了英伟达近三成的GPU出口,这显然不是新加坡人突然有了拿芯片泡酒喝的兴致,如果不是特别无知的话,就得承认「转口贸易」其实占了很大比重。

训练领先的模型,尤其是大规模参数的模型,离不开英伟达的卡,这在行业里曾是雷打不动的常识。

但在今年这个晚春,那些曾经坚固的铁律,一条接着一条开始松动。

DeepSeek-V4和国产芯片进行了前所未有的深度适配,是大家都已经知道的了,相比它做这件事情的「政治正确」或是「历史使命」,美团的「主动为之」也很值得注意。

美团启动测试的LongCat-2.0-Preview,只看训练规模的话,可能创下了国产算力的新纪录。

据我所知,美团对国产算力的重视,已经持续许久了。通过几年前开始陆续投资半导体领域的那十多家公司,就能看出来——从新材料到圆晶制造,从芯片设计到TPU,不是明星级的头部企业,就是产业内的隐形冠军。

除了为自身的AI战略铺路,也客观上为国产算力贡献了充足的弹药。

现在的情况是,英伟达高端GPU在中国的名义市占率始终是零,却有两个国产的万亿级大模型同一天发布了。

万亿参数,不再是海外大厂的特权了。

3

去年的这个时候,Anthropic一直在抨击大厂囤积算力是「不负责任」的行为,虽然现在已经在被疯狂打脸了——Codex三天两头重置额度,从Claude Code这里骑脸抢人——但在当初的说法里,有一个理由其实是站得住的:

「你把未来一两年的芯片全买光,小公司根本拿不到卡,AI变成少数巨头的游戏,这是反竞争、反创新。」

之所以「天下苦英伟达久矣」,并非大家对这家公司天然抱有憎恶,而是在奇货可居的供需关系里,英伟达把铲子卖出了比金子更高的价格,这一定不合理。

甲骨文的老板埃里森曾说他和马斯克一起去约黄仁勋吃漂亮饭,全程两人几乎就只是在翻来覆去的说一句话:

「求你了,把钱收下。」

事实证明,有竞争是好事,自从Google的TPU被证明可堪大任以来,市场对于英伟达的垄断预期就有了挥之不去的疑虑,加上中国自主芯片集群成功托了新一代大模型,变数就更大了。

要知道,美团的主营业务战火正酣,AI又是个需要持续砸钱的业务,却依然能够掏出LongCat-2.0-Preview这种水平的模型,足以说明国产算力的经济性和第二选择有价值。

在技术社区Linux DO上,已经有开发者对LongCat-2.0-Preview表达了惊喜,「起码美团还在真做事情。」

很多新的地图,都是从一条不起眼的小路画起的。。

4

脱离英伟达苦心经营的生态,不是没有代价的,而DeepSeek-V4和LongCat-2.0-Preview要克服难以想象的困难:

之所以一直强调万亿级参数,是因为参数越大,对显存容量和带宽的要求更高,需要重构整个软硬件的协同工程;

更不用说CUDA的泛用性,模型团队需要针对国产芯片特性重写和优化核心算子,甚至自研全确定性的算子,以确保训练全程的精确可复现;

承认短期差距并不丢人,在万卡集群上长期训练,硬件故障是必然而非意外的,所以需要同步构建完整的容错、检测与恢复体系;

如果效率上不去,成本就是笑谈,要跑通全链路,就必须针对国产环境的特点,对训练框架和模型结构实现亲和设计,确保性能可以满足需求。

⋯⋯路虽远,行则将至。

公允的讲,自从DeepSeek-V3震惊世界以来又过去了一年多的时间,大家希望看到的开源模型赶超闭源模型的画面依然没有出现,Anthropic的年化收入屡创新高也在说明「一分钱一分货」的基本规律。

国产模型的任重道远和弯道超车,恐怕还是要结合中国产业结构的优势进行,我们的无人机、无人车、机器人乃至工业规模,都在真实生产层面拥有全球独一档的稀缺数据。

这些连接万物的场景,可以为芯片厂商提供长期连续、真实负载的场景,来验证芯片的稳定性和可靠性。

上半场拼算力资源,下半场比物理底座,时间还很长,长到足以产生任何结果的可能性。

本文来自微信公众号 “阑夕”(ID:techread),作者:阑夕

Trending Cryptos

Related Reads

GPT-5.6 Countdown: Abandon the Illusion of a Single API, Computational Iteration Can't Outpace a Single Page of Compliance

In mid-June, three seemingly independent industry events—the compliance-driven throttling of Fable 5, the open-sourcing of GLM-5.2, and the leaked release timeline for GPT-5.6—are pushing the global AI industry toward a watershed moment. These shifts signal a fundamental restructuring of the industry's underlying logic. First, **"usability" has substantially overtaken "advanced capabilities"** as the primary weight, pushing the global large language model (LLM) supply chain into a "dual-track" phase of controlled closed-source and local open-source coexistence. Second, **the competitive moats of closed-source giants are shifting**. Their technical focus is moving from "language intelligence" toward "spatial intelligence (world models)"—a domain heavily reliant on computing power. Third, faced with常态化 transnational compliance risks, **a "model-agnostic" decoupled design has become a survival necessity for application-layer developers to maintain business continuity.** The article details how Anthropic's Fable 5, despite its advanced engineering feats, was restricted for non-U.S. citizens within 72 hours of launch, highlighting how geopolitical compliance can instantly limit even the most advanced models. In response, the open-source camp, exemplified by Zhipu AI's MIT-licensed GLM-5.2, is gaining market share by offering stable performance improvements and significant cost advantages (up to 70% savings for enterprises), while achieving full adaptation with domestic semiconductor platforms. Meanwhile, closed-source leaders like OpenAI are pivoting. The anticipated GPT-5.6 reportedly shifts focus from language to spatial intelligence and world models, aiming to rebuild a generational gap in areas like 3D understanding, simulation, and industrial design that demand immense compute. The core conclusion is that the LLM supply chain's logic has changed. Enterprises must now evaluate infrastructure based on a composite of technical performance and policy compliance. For developers, complete reliance on a single closed-source API poses unacceptable risk. Implementing a truly model-agnostic architecture—enabling swift switches to compliant, locally deployable open-source alternatives—is no longer just good practice but a fundamental baseline for business continuity.

marsbit1h ago

GPT-5.6 Countdown: Abandon the Illusion of a Single API, Computational Iteration Can't Outpace a Single Page of Compliance

marsbit1h ago

Is the 'Token Subsidy War' Among AI Giants Almost Over?

The article discusses the ongoing "token subsidy war" among AI giants like OpenAI and Anthropic, questioning whether it's nearing its end. It reveals that current AI subscription prices are heavily subsidized, with some plans offering tokens at up to 70 times the actual cost to attract and retain heavy users, especially developers and enterprises. This strategy mirrors past internet-era subsidy battles, but with a key difference: AI tokens lack "lock-in" effects. Unlike ride-hailing or food delivery apps, users can easily switch between AI providers as APIs become standardized, making it difficult for companies to raise prices post-subsidy. The piece highlights a structural asymmetry in the competition. Giants like Google, with massive advertising revenue, can afford to subsidize tokens indefinitely, akin to using "tokens as a weapon." In contrast, venture-backed companies like OpenAI and Anthropic face pressure to become profitable, especially as they approach IPO. The article cites Google Ventures founder Bill Maris, who suggests Google could slash token prices by 80%, putting immense pressure on competitors. Two potential endgames are presented: the "internet service" model (subsidize, monopolize, then raise prices) and the "utility" model (tokens become a standardized, low-margin commodity like electricity). Given the low switching costs, the latter seems more likely. The competition may not have a single winner but could instead accelerate AI's evolution into a foundational, infrastructure-level technology, akin to a public utility. For now, users continue to benefit from heavily subsidized token costs.

marsbit1h ago

Is the 'Token Subsidy War' Among AI Giants Almost Over?

marsbit1h ago

Beyond the Stadium: The Profitable Games Surrounding the World Cup

"Beyond the Pitch: The Profit Game Around the World Cup" The FIFA World Cup transcends being a sporting spectacle, evolving into a massive global arena for speculation and profit-seeking. The 2026 tournament has amplified this dynamic, creating a multi-layered ecosystem of financial opportunism alongside the football. **Prediction markets** have surged into the mainstream. Platforms like Polymarket and Kalshi saw trading volumes for World Cup contracts soar, attracting new users with their financial trading model and high-profile, chain-based wealth stories that overshadow traditional sports betting in terms of growth and narrative. However, **traditional sportsbooks** remain the dominant force, leveraging established user habits, legal markets, and comprehensive product offerings to handle the vast majority of speculative wagers, with projections suggesting record-breaking betting volumes. Capital markets also react. **"Concept stocks"** in countries like South Korea and Japan experience volatile price swings based on team performance and anticipated fan spending on items like chicken, beer, and viewing parties, effectively becoming a stock market reflecting fan sentiment. The **ticket resale market** has become a sophisticated arena for arbitrage. Prices fluctuate wildly based on team draws and star power, with sellers sometimes listing tickets they don't yet own in a practice akin to short-selling, while FIFA's own "Right to Buy" tokens add another layer of speculative trading. **Collectibles and merchandise** offer another avenue. Panini sticker albums, with their inherent scarcity and nostalgic value, can become high-value collectibles. Limited-edition or locally themed jerseys command significant premiums on secondary markets, and even counterfeit vendors profit from fans' desire for affordable match-day identity. The **cryptocurrency** space has seen a frenzy of speculative, unauthorized World Cup-themed meme coins on chains like Solana. These tokens, often exploiting team names and player imagery, experience extreme pump-and-dump cycles, creating stories of massive gains for a few early entrants and steep losses for many others. Finally, an entire industry thrives on **providing information and tools** to other speculators. Developers create platforms like SeatSidekick to track ticket inventory and prices, while paid Telegram groups and subscriptions sell betting tips and predictions, monetizing the widespread desire for an informational edge. In essence, the World Cup has become a compressed, global laboratory for speculation. While the games determine champions on the field, a parallel, complex network of financial transactions—spanning prediction contracts, bets, stocks, tickets, collectibles, crypto, and information services—settles its own scores in the global market.

marsbit2h ago

Beyond the Stadium: The Profitable Games Surrounding the World Cup

marsbit2h ago

How Does Codex Use a Computer? Three Entry Points and Permission Boundaries

This article explains the three primary methods for Codex to interact with a computer, each with distinct use cases, permission boundaries, and trust levels. **1. Computer Use:** This offers the broadest access, allowing Codex to visually control and interact with the graphical user interface of authorized macOS/Windows apps, system settings, and even iOS simulators. It's ideal for tasks lacking APIs or structured tools, such as operating legacy software or multi-app workflows. However, it's the slowest method and has the widest permission scope, requiring careful supervision for sensitive actions. **2. Chrome Extension:** This grants Codex access to the user's logged-in Chrome browser state, including cookies, profiles, and open tabs. It's best for tasks requiring user identity across websites like Gmail, LinkedIn, Salesforce, or internal dashboards. Its key advantage is multi-tab control for complex workflows. While more powerful for browser-based tasks than Computer Use, it carries higher sensitivity as actions are performed under the user's identity. **3. In-App Browser:** This is a browser isolated within the Codex thread, separate from the user's personal browsing data. It excels in web development and debugging scenarios—previewing local servers, testing responsive layouts, or annotating designs directly on the page. Its isolation is a strength for development but a limitation for tasks requiring login sessions. The core principle is to choose the narrowest, safest, and most structured interface for the task. Use plugins or MCPs first, resort to visual control (Computer Use) only for GUI-dependent tasks, employ the Chrome extension for identity-reliant browser work, and prefer the In-App Browser for isolated development. **Appshots** are clarified as a fourth, complementary tool for *inputting* context—capturing a screenshot of a window to point Codex to something—rather than a method for Codex to *act*. Together, this layered approach highlights a key to AI agent productization: not granting unlimited permissions, but constraining them within clear boundaries for specific tasks while preserving user oversight.

marsbit3h ago

How Does Codex Use a Computer? Three Entry Points and Permission Boundaries

marsbit3h ago

Trading

Spot
Futures

Hot Articles

Discussions

Welcome to the HTX Community. Here, you can stay informed about the latest platform developments and gain access to professional market insights. Users' opinions on the price of AI (AI) are presented below.

活动图片