Sam Altman 和 AWS CEO 罕见同框:聊了智能体、harness、和云的下一仗

marsbitPublicado a 2026-04-30Actualizado a 2026-04-30

Stratechery创始人Ben Thompson同时采访了OpenAI CEO Sam Altman和AWS CEO Matt Garman。当时外界还不知道,仅仅三天后,微软和OpenAI就会宣布修改长达数年的独家协议,Azure不再是OpenAI模型的唯一云服务商。但这场合作的逻辑矛盾已经摆在了台面上——两家公司的掌舵人为什么会坐到一起?

背后的逻辑并不复杂。当初微软用“Azure独占OpenAI模型”锁定了巨大的竞争优势,但也绑住了OpenAI的手脚——大量企业数据已经躺在AWS上,客户不想为换模型而搬家。而Anthropic今年增长迅猛,正是吃到了“客户在哪云就想要哪的模型”的红利。对微软来说,继续卡独家权反而在损害自己对OpenAI这笔最重要的投资。松绑是痛苦的——Azure失去了一个核心差异化武器——但不松绑更亏:如果OpenAI的增长被独家协议限制,微软作为大股东的损失远大于Azure的得利。

于是有了这次联合发布:Bedrock Managed Agents,由OpenAI驱动。可以把它理解为“AWS版的Codex”——一个运行在云端、带有完整身份、权限、日志、治理和部署能力的智能体运行环境。客户数据留在AWS内部,OpenAI不接触原始数据。目标是让那些数据已经在AWS上的企业,不用迁移就能直接用上最前沿的AI能力。

下文本次采访对话的核心内容提炼,原文链接:https://stratechery.com/2026/an-interview-with-openai-ceo-sam-altman-and-aws-ceo-matt-garman-about-bedrock-managed-agents/

1

AI的“AWS时刻”:让智能体从能跑变成能用

Sam Altman:每次我看到用户用我们的模型,我既高兴他们觉得这是魔法,又崩溃于他们经历了多少不必要的折磨。用户需要把东西从一个地方复制粘贴到另一个地方,搞一串复杂提示词,反复试错——这些痛苦我看在眼里。

Matt Garman:在这套联合产品出来之前,客户想用AI智能体,得自己拼凑所有环节——模型调用、身份管理、数据库认证、与内部系统的集成、对自己数据的理解。每一个客户都重新干一遍。所有这些集成工作全都留给了客户自己处理。

Matt Garman:AWS过去20年为全球银行、医疗机构、政府机构建立的安全框架——VPC(虚拟私有云)、角色权限、网关——恰好可以帮上忙。客户最担心的就是“我热爱这项技术,但怎么确保我不会一失误就搞出一个让公司完蛋的事件”。这些问题都是可解的,关键在于给客户一个可控的沙盒环境。

Sam Altman:模型和编排层(harness)正在变得越来越不可分。以前很多需要在系统提示词层面费心调教的事,模型变聪明之后自己就会处理了。比如工具调用——最初我们觉得工具调用不需要融入训练流程,后来发现融合得越深越好用。编排层和模型的边界会持续模糊,甚至预训练和后训练最终也会更紧密地走到一起。但整个行业还处在“家酿计算机俱乐部”的年代——也就是个人计算机刚刚萌芽、没人知道最终形态会是什么的阶段。

2

本地运行 vs 云端运行:两条路最终要汇合

Sam Altman:Codex从云端转向本地,是因为本地环境更简单——你的文件、配置都在那,不需要想数据在哪。但这不是终点。最终的形态是云端智能体——你合上电脑时它在云端继续工作,你有高强度任务时它能在云端并行处理,你可以扩展到一个单台笔记本根本做不到的规模。

Matt Garman:没有任何计算环境曾经真正消灭客户端。iPhone App也有本地组件,本地运行就是有低延迟、简单易用的天然优势。但一旦进入企业场景——两个人间共享、权限边界、安全边界——本地就捉襟见肘了。最终一定是本地和云端两条路结合在一起。

Sam Altman:当智能体以“虚拟同事”的身份进入工作队伍后,我们关于软件和权限的所有心智模型都要被重写。你作为员工应该有一个账户,然后让你的智能体也用这个账户?还是应该给智能体单独一个账户,让服务器能分清楚谁在操作?如果一个人有十个智能体呢?我设想了一种还没被发明出来的“原语”:当Ben的智能体登录时,它用的是Ben的账户,但系统能标注这是智能体而非Ben本人。我们还没搞清楚这些,但智能体加入工作队伍并变得越来越自主,这些问题很快就会被推到台面上。

3

“智能工厂”与定价革命

Sam Altman:我们本质上是一家“token工厂”——不对,应该说是“智能工厂”。客户不关心你用的是什么芯片、模型跑了多少token,只关心一件事:以最低价格获得最好的智能单元,要多少有多少。刚发布的5.5模型,单token价格比前代高很多,但完成同样任务需要的token数量大幅减少,整体算下来更便宜。你不该关心用了多少token,你只该关心花了多少钱、活儿干没干完。按token定价长期来看会过时,最终会演变成按“完成一件工作”来收费。

Sam Altman:水电煤有弹性边界——水便宜了你也不会一天洗两次澡。但智能可能不一样。我没有见过任何其他效用,让我只想说“只要价格足够低,我就无限制地继续用”。目前更多客户是在求我“不管多贵多给我算力”,而不是在砍价。但我有信心持续大幅压低智能的成本。

Matt Garman:这和计算能力的历史轨迹完全一致。今天一个计算周期的成本比30年前便宜了不知多少数量级,但今天卖出的计算量比任何时代都多。AI现在还处于极早期——大家抢前沿模型是因为只有它能完成真正有用的工作。未来一定会有模型结构的混合:小而快的做专项任务,前沿巨型模型去攻克癌症。

4

智能体的终局是什么?

Ben Thompson:企业内部可能需要两层智能体。底层智能体的工作是不断钻入各种数据库、SaaS应用、文件系统去检索、整理和关联信息——这是一层“数据整合智能体”。上层智能体负责与人类交互、呈现结果、执行决策——这是“用户界面智能体”。

Sam Altman:最近跟大企业客户交流时,他们的需求越来越一致:想要一个智能体运行时环境、一个能连接数据并控制token消耗的管理层、以及一个给员工用的工作空间。这套东西大家描述得越来越相似,但产品还没完全做出来。但可能在某个时刻你会发现,这套多层级架构只是我们抱着旧世界不放,模型足够强之后整个东西应该推倒重来。

Matt Garman:我们现在还不完全知道最终形态是什么,这也是做这件事的乐趣——让客户用起来,从他们的实践中学习,再反过来让产品变得更快更好。

Ben Thompson:Google在Next大会上刚讲完从芯片(TPU)到模型(Gemini)到应用的全栈垂直整合,而你和Sam——一个没有前沿模型,一个不是云厂商——却坐在一起宣布合作。AWS到底是因为没有自研前沿模型而落后了,还是有意选择了这条开放路线?

Matt Garman:从AWS第一天起,拥抱合作伙伴就是我们最核心的策略之一。我们衡量成功的标准不是“我是否拥有一切”,而是“合作伙伴是否成功——他们成功了,我们就成功了”。这跟“我必须拥有全栈”的哲学不同,但两种路线各有人信。我们相信客户应该有权选最好的东西。如果最好的东西是我们自己做的,很好。如果最好的东西是合作伙伴的但跑在我们的基础设施上,对我们同样是胜利。没有任何一家公司能拥有所有最好的应用。

Sam Altman:我真心认为开发者现在能构建一类全新的产品。模型能力未来一年会以非常陡峭的曲线进步,我们选在这个时候共同打造平台,时机正好。我希望一年后人们回头看,讨论的重点不是“终于能在AWS上用OpenAI了”,而是“我们当时完全低估了这个新产品的重要性”。

Ben Thompson:上次我们做产品采访是跟微软Kevin Scott聊New Bing,你当时对挑战Google非常自信。现在回头看呢?

Sam Altman:ChatGPT的表现超过了当时的预期——它可能是自Facebook以来第一个真正大规模的新消费产品。API和Codex也做得不错。但Google在很多方面仍然是一家被低估的公司——他们的广度和深度令人敬畏。这次跟AWS的合作,不仅是商业层面的双赢,更是技术层面的一个新起点。当模型能力和编排工具终于走到一个交汇点上,开发者能做的事会完全不同。

本文来自微信公众号“硅星GenAI”

Lecturas Relacionadas

GPT-5.6 Countdown: Abandon the Illusion of a Single API, Computational Iteration Can't Outpace a Single Page of Compliance

In mid-June, three seemingly independent industry events—the compliance-driven throttling of Fable 5, the open-sourcing of GLM-5.2, and the leaked release timeline for GPT-5.6—are pushing the global AI industry toward a watershed moment. These shifts signal a fundamental restructuring of the industry's underlying logic. First, **"usability" has substantially overtaken "advanced capabilities"** as the primary weight, pushing the global large language model (LLM) supply chain into a "dual-track" phase of controlled closed-source and local open-source coexistence. Second, **the competitive moats of closed-source giants are shifting**. Their technical focus is moving from "language intelligence" toward "spatial intelligence (world models)"—a domain heavily reliant on computing power. Third, faced with常态化 transnational compliance risks, **a "model-agnostic" decoupled design has become a survival necessity for application-layer developers to maintain business continuity.** The article details how Anthropic's Fable 5, despite its advanced engineering feats, was restricted for non-U.S. citizens within 72 hours of launch, highlighting how geopolitical compliance can instantly limit even the most advanced models. In response, the open-source camp, exemplified by Zhipu AI's MIT-licensed GLM-5.2, is gaining market share by offering stable performance improvements and significant cost advantages (up to 70% savings for enterprises), while achieving full adaptation with domestic semiconductor platforms. Meanwhile, closed-source leaders like OpenAI are pivoting. The anticipated GPT-5.6 reportedly shifts focus from language to spatial intelligence and world models, aiming to rebuild a generational gap in areas like 3D understanding, simulation, and industrial design that demand immense compute. The core conclusion is that the LLM supply chain's logic has changed. Enterprises must now evaluate infrastructure based on a composite of technical performance and policy compliance. For developers, complete reliance on a single closed-source API poses unacceptable risk. Implementing a truly model-agnostic architecture—enabling swift switches to compliant, locally deployable open-source alternatives—is no longer just good practice but a fundamental baseline for business continuity.

marsbitHace 1 hora(s)

GPT-5.6 Countdown: Abandon the Illusion of a Single API, Computational Iteration Can't Outpace a Single Page of Compliance

marsbitHace 1 hora(s)

Is the 'Token Subsidy War' Among AI Giants Almost Over?

The article discusses the ongoing "token subsidy war" among AI giants like OpenAI and Anthropic, questioning whether it's nearing its end. It reveals that current AI subscription prices are heavily subsidized, with some plans offering tokens at up to 70 times the actual cost to attract and retain heavy users, especially developers and enterprises. This strategy mirrors past internet-era subsidy battles, but with a key difference: AI tokens lack "lock-in" effects. Unlike ride-hailing or food delivery apps, users can easily switch between AI providers as APIs become standardized, making it difficult for companies to raise prices post-subsidy. The piece highlights a structural asymmetry in the competition. Giants like Google, with massive advertising revenue, can afford to subsidize tokens indefinitely, akin to using "tokens as a weapon." In contrast, venture-backed companies like OpenAI and Anthropic face pressure to become profitable, especially as they approach IPO. The article cites Google Ventures founder Bill Maris, who suggests Google could slash token prices by 80%, putting immense pressure on competitors. Two potential endgames are presented: the "internet service" model (subsidize, monopolize, then raise prices) and the "utility" model (tokens become a standardized, low-margin commodity like electricity). Given the low switching costs, the latter seems more likely. The competition may not have a single winner but could instead accelerate AI's evolution into a foundational, infrastructure-level technology, akin to a public utility. For now, users continue to benefit from heavily subsidized token costs.

marsbitHace 1 hora(s)

Is the 'Token Subsidy War' Among AI Giants Almost Over?

marsbitHace 1 hora(s)

Beyond the Stadium: The Profitable Games Surrounding the World Cup

"Beyond the Pitch: The Profit Game Around the World Cup" The FIFA World Cup transcends being a sporting spectacle, evolving into a massive global arena for speculation and profit-seeking. The 2026 tournament has amplified this dynamic, creating a multi-layered ecosystem of financial opportunism alongside the football. **Prediction markets** have surged into the mainstream. Platforms like Polymarket and Kalshi saw trading volumes for World Cup contracts soar, attracting new users with their financial trading model and high-profile, chain-based wealth stories that overshadow traditional sports betting in terms of growth and narrative. However, **traditional sportsbooks** remain the dominant force, leveraging established user habits, legal markets, and comprehensive product offerings to handle the vast majority of speculative wagers, with projections suggesting record-breaking betting volumes. Capital markets also react. **"Concept stocks"** in countries like South Korea and Japan experience volatile price swings based on team performance and anticipated fan spending on items like chicken, beer, and viewing parties, effectively becoming a stock market reflecting fan sentiment. The **ticket resale market** has become a sophisticated arena for arbitrage. Prices fluctuate wildly based on team draws and star power, with sellers sometimes listing tickets they don't yet own in a practice akin to short-selling, while FIFA's own "Right to Buy" tokens add another layer of speculative trading. **Collectibles and merchandise** offer another avenue. Panini sticker albums, with their inherent scarcity and nostalgic value, can become high-value collectibles. Limited-edition or locally themed jerseys command significant premiums on secondary markets, and even counterfeit vendors profit from fans' desire for affordable match-day identity. The **cryptocurrency** space has seen a frenzy of speculative, unauthorized World Cup-themed meme coins on chains like Solana. These tokens, often exploiting team names and player imagery, experience extreme pump-and-dump cycles, creating stories of massive gains for a few early entrants and steep losses for many others. Finally, an entire industry thrives on **providing information and tools** to other speculators. Developers create platforms like SeatSidekick to track ticket inventory and prices, while paid Telegram groups and subscriptions sell betting tips and predictions, monetizing the widespread desire for an informational edge. In essence, the World Cup has become a compressed, global laboratory for speculation. While the games determine champions on the field, a parallel, complex network of financial transactions—spanning prediction contracts, bets, stocks, tickets, collectibles, crypto, and information services—settles its own scores in the global market.

marsbitHace 2 hora(s)

Beyond the Stadium: The Profitable Games Surrounding the World Cup

marsbitHace 2 hora(s)

How Does Codex Use a Computer? Three Entry Points and Permission Boundaries

This article explains the three primary methods for Codex to interact with a computer, each with distinct use cases, permission boundaries, and trust levels. **1. Computer Use:** This offers the broadest access, allowing Codex to visually control and interact with the graphical user interface of authorized macOS/Windows apps, system settings, and even iOS simulators. It's ideal for tasks lacking APIs or structured tools, such as operating legacy software or multi-app workflows. However, it's the slowest method and has the widest permission scope, requiring careful supervision for sensitive actions. **2. Chrome Extension:** This grants Codex access to the user's logged-in Chrome browser state, including cookies, profiles, and open tabs. It's best for tasks requiring user identity across websites like Gmail, LinkedIn, Salesforce, or internal dashboards. Its key advantage is multi-tab control for complex workflows. While more powerful for browser-based tasks than Computer Use, it carries higher sensitivity as actions are performed under the user's identity. **3. In-App Browser:** This is a browser isolated within the Codex thread, separate from the user's personal browsing data. It excels in web development and debugging scenarios—previewing local servers, testing responsive layouts, or annotating designs directly on the page. Its isolation is a strength for development but a limitation for tasks requiring login sessions. The core principle is to choose the narrowest, safest, and most structured interface for the task. Use plugins or MCPs first, resort to visual control (Computer Use) only for GUI-dependent tasks, employ the Chrome extension for identity-reliant browser work, and prefer the In-App Browser for isolated development. **Appshots** are clarified as a fourth, complementary tool for *inputting* context—capturing a screenshot of a window to point Codex to something—rather than a method for Codex to *act*. Together, this layered approach highlights a key to AI agent productization: not granting unlimited permissions, but constraining them within clear boundaries for specific tasks while preserving user oversight.

marsbitHace 3 hora(s)

How Does Codex Use a Computer? Three Entry Points and Permission Boundaries

marsbitHace 3 hora(s)

Trading

Spot
Futuros
活动图片