GPT Designs GPT

marsbit · Published 2026-06-25 · Updated 2026-06-25

Article Summary

OpenAI has unveiled its first custom AI chip, Jalapeño, a move signaling a strategic shift beyond being a mere model company. While many see it as a challenge to NVIDIA, its core aim is to control the entire intelligent production pipeline—from models and chips to data centers and energy. The key driver is the evolving competitive landscape: model advantages are shrinking, while the computational gap in areas like cost-per-token, system throughput, and energy efficiency is becoming the true long-term barrier. Jalapeño is primarily an inference chip, targeting the massive and growing "inference tax"—the daily operational cost of generating tokens for services like ChatGPT and APIs. By designing its own hardware optimized for its specific workloads and future product roadmaps (even using AI to aid the chip design process), OpenAI aims to drastically reduce token generation costs and improve system efficiency. This creates a potential flywheel: better models help design better chips, which lower costs for running next-generation models, supporting more users and products, which in turn provides more data to refine future chips. The strategy mirrors Apple’s integrated approach, building a closed loop where hardware, software, and applications are co-optimized. In the long term, OpenAI is not trying to become the next NVIDIA (a supplier of "shovels" to all AI companies) but to own and operate the entire "mine"—selling the end product of intelligence itself. This move marks OpenA...

OpenAI is finally making chips.

When many people saw this news, their first reaction was: Nvidia is in trouble.

But what I see is precisely the opposite.

The real significance of Jalapeño, OpenAI's first chip, is not that it takes direct aim at Nvidia.

It is that, for the first time, OpenAI has publicly admitted it is not satisfied with being just a model company.

What it wants to control is the entire process of producing intelligence.

From models, to chips. From data centers, to energy. From training, to inference. From producing Tokens, to selling Tokens.

Jalapeño appears to be a chip on the surface, but it's actually more like a roadmap.

OpenAI has finally laid its ambition on the table.

I. The Model Gap is Shrinking, the Compute Gap is Widening

Since the explosion of large models, almost all attention in the AI industry has been on the models.

The industry was shocked by GPT-4, then Claude caught up, Gemini caught up, DeepSeek delivered high cost-performance, and Meta pushed open source. With every release, everyone focuses on the same set of things: parameters, leaderboards, coding ability, math ability, long context, multimodal capabilities.

Models are, of course, important. But a change has already occurred: the window of model leadership is getting shorter. Today, within months of a model's release, the open-source community, competitors, and cloud providers catch up. Performance gaps still exist, but on their own they are increasingly unable to form a long-term moat.

What truly creates differentiation is moving to a deeper, more foundational level: compute supply, inference costs, system throughput, networking capabilities, data center construction, energy acquisition. These aren't as flashy as model launches, nor do they go viral immediately. But they determine whether an AI company can keep running over the long term.

Jensen Huang recently made this point: Nvidia systems might not have the lowest purchase price, but they can generate the lowest-cost Tokens, the highest Token throughput, and ultimately bring the highest revenue.

Huang's statement was direct. The industry has long complained that Nvidia is expensive. Huang didn't argue about the purchase price; he reframed the problem in another dimension: don't look at how much the machines cost to buy, look at the production cost per Token.

This is the new ledger for the AI era. Servers and GPUs are not the ultimate unit; the Token is.
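To make this ledger concrete, here is a minimal sketch in Python. Every figure in it is a hypothetical placeholder chosen for illustration, not a vendor number; the `System` fields and the `cost_per_million_tokens` helper are assumptions of this sketch, and it counts only purchase price and electricity.

```python
from dataclasses import dataclass


@dataclass
class System:
    """A hypothetical inference system; all fields are illustrative assumptions."""
    name: str
    purchase_price_usd: float   # up-front hardware cost
    power_kw: float             # average power draw while serving
    tokens_per_second: float    # sustained Token throughput


def cost_per_million_tokens(system: System, lifetime_years: float,
                            usd_per_kwh: float) -> float:
    """Amortized cost (hardware + electricity) per one million Tokens."""
    hours = lifetime_years * 365 * 24
    energy_cost = system.power_kw * hours * usd_per_kwh
    total_cost = system.purchase_price_usd + energy_cost
    total_tokens = system.tokens_per_second * hours * 3600
    return total_cost / total_tokens * 1_000_000


# Made-up numbers: the "premium" box costs more to buy but produces more Tokens.
cheap = System("cheap", purchase_price_usd=200_000, power_kw=10, tokens_per_second=50_000)
premium = System("premium", purchase_price_usd=300_000, power_kw=12, tokens_per_second=120_000)

for s in (cheap, premium):
    cost = cost_per_million_tokens(s, lifetime_years=4, usd_per_kwh=0.10)
    print(f"{s.name}: ${cost:.4f} per million Tokens")
```

With these made-up inputs, the system that costs 50% more to buy ends up with the lower cost per Token, because its throughput is more than double. That is the shift in accounting unit the argument rests on.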

OpenAI happens to be at the very center of this problem.

ChatGPT handles a massive volume of requests daily, Codex consumes even more inference steps, and still to come are Agents, video generation, robotics, and long reasoning chains. The more useful the model, the greater the Token consumption. The more successful the product, the larger the inference bill.

Here is the brutal part: the more users OpenAI has, the more money Nvidia makes. The stronger OpenAI's products, the heavier the underlying compute tax.

If every Token has to pass through an external hardware platform paying a toll, it's hard for OpenAI to have a complete moat. It can have the strongest model, a super entry point, a developer ecosystem. But the core production cost is always in someone else's hands.

This is the essence of Jalapeño. OpenAI has started building its own Token factory.

II. GPT Begins Designing GPT

The most underestimated detail about the Jalapeño chip is the nine-month tape-out.

Traditional high-performance ASIC projects typically have cycles of 18 to 36 months. Advanced process nodes make things harder still: architecture, verification, physical implementation, packaging, software stack, debugging. A hiccup in any of them can rapidly escalate costs. OpenAI and Broadcom compressed the cycle to nine months.

This should not be read as the chip industry suddenly becoming simple. OpenAI did not spontaneously grow a semiconductor supply chain. Broadcom has deep experience in custom chips and network infrastructure; Celestica handles boards, racks, and systems engineering.

What OpenAI truly contributed is something scarcer: it knows how future models will run.

Many chip companies building AI accelerators face the challenge of guessing the workload. Model architectures change, inference methods change, service patterns change. And once a chip is taped out, it cannot be rolled back the way software can.

OpenAI doesn't have to rely entirely on guesswork. Running ChatGPT, Codex, and APIs daily, it knows which kernels are used most, which memory transfers are most wasteful, which network bottlenecks most affect cluster efficiency, which latencies directly hurt product experience. It also knows how future Agent products will consume inference resources.

This experience was once just backend engineering knowledge; now it's being written into the chip architecture.

A crucial statement in OpenAI's official press release: OpenAI used its own models to accelerate parts of the design and optimization process. It also said that models provided to users are helping improve the infrastructure that will run future models.

GPT has started participating in designing the machines for the next generation of GPT.

For decades, the chip chain was: first design the chip, the chip runs the software, the software runs the AI. Now, the chain is turning back: AI helps humans design chips, which then run the next generation of AI.

Once this loop is established, nine months might just be the beginning. The future could be six months, three months, or even more frequent iterations.

The chip industry had its own rhythm, the model industry had its own rhythm. The former was slow, the latter fast. Jalapeño is pulling these two rhythms together.

If this step succeeds, OpenAI's flywheel will become formidable. Better models help design better chips, better chips lower the running cost of the next model generation, lower costs support more users and products, more users and products generate more real workload data, which in turn defines the next generation of chips.

This is the cycle OpenAI truly wants.

III. Cutting the Inference Tax, Controlling Cash Flow

Jalapeño is not a training chip; it targets large language model inference. This is key.

Training is like building an aircraft carrier. A huge one-time investment, requiring extremely strong general-purpose capability, and constant adaptation to new models, architectures, and experiments. The training market still heavily depends on Nvidia—not just the GPUs, but the entire platform: CUDA, networking, systems, software libraries, developer ecosystem.

Inference is more like a fleet of taxis. Running daily, hourly, by the minute. Every time a user asks a question, an API responds, an Agent takes a step forward, inference happens. It cares more about low latency, low cost, high throughput, high utilization.

Training burns big money in phases; inference burns daily cash flow.

This is also the most painful problem for AI companies as they enter the commercialization stage. GPT training is expensive once, but inference happens every day. The Agent era will further amplify this problem—one task may involve dozens or even hundreds of model calls. Long context, chain-of-thought reasoning, multimodal generation, code execution—all continue to push Token consumption higher.
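A back-of-the-envelope sketch shows how the Agent pattern multiplies the daily bill. The function name, the workload sizes, and the price per million Tokens below are all hypothetical values for illustration; none of them come from OpenAI.

```python
def daily_inference_cost(tasks_per_day: int, calls_per_task: int,
                         tokens_per_call: int,
                         usd_per_million_tokens: float) -> float:
    """Daily cost of serving a workload, given an assumed price per million Tokens."""
    tokens = tasks_per_day * calls_per_task * tokens_per_call
    return tokens / 1_000_000 * usd_per_million_tokens


# Hypothetical workloads: same number of tasks, same Tokens per model call.
chat_cost = daily_inference_cost(
    tasks_per_day=1_000_000, calls_per_task=1,
    tokens_per_call=1_000, usd_per_million_tokens=2.0)

agent_cost = daily_inference_cost(
    tasks_per_day=1_000_000, calls_per_task=50,   # one task = dozens of model calls
    tokens_per_call=1_000, usd_per_million_tokens=2.0)

print(f"chat:  ${chat_cost:,.0f} per day")
print(f"agent: ${agent_cost:,.0f} per day ({agent_cost / chat_cost:.0f}x)")
```

In this simplified model the bill scales linearly with calls per task, so an Agent that makes 50 calls per task costs 50 times as much per day as a single-call chat at the same task volume. Real workloads would also differ in Tokens per call, which this sketch holds fixed.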

Jalapeño targets precisely this inference tax. It's more like OpenAI's own TPU. Google, Amazon, Meta, and Microsoft have all taken similar routes: once the workload is sufficiently large, a custom ASIC makes economic sense.

OpenAI now meets these conditions. Real requests, a product roadmap, a model team, industry partners like Broadcom, and immense cost pressure.

Jalapeño doesn't need to be sold externally to prove its value. As long as it makes ChatGPT answers cheaper, makes Codex run faster, and makes API margins higher, it's meaningful.

OpenAI also mentioned that Jalapeño will reduce data transfers, balance compute, memory, and network resources, bringing actual utilization closer to theoretical peaks. Compute is expensive often because it's not fully utilized—GPUs waiting for networks, memory transfers slowing down computation, poor scheduling causing idle time—all waste eventually turns into electricity bills and capital expenditure.

The purchase price is only the first layer; system efficiency is the final account.
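The link between utilization and cost can be sketched in a few lines. This is a simplified model under stated assumptions: a fixed hourly cost for the hardware (capital plus electricity), a theoretical peak throughput, and a utilization fraction. The function name and all numbers are hypothetical.

```python
def effective_cost_per_million_tokens(hourly_cost_usd: float,
                                      peak_tokens_per_second: float,
                                      utilization: float) -> float:
    """Cost per million Tokens when only a fraction of peak throughput is achieved."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    tokens_per_hour = peak_tokens_per_second * utilization * 3600
    return hourly_cost_usd / tokens_per_hour * 1_000_000


# Same hypothetical hardware, same hourly cost, two utilization levels.
low = effective_cost_per_million_tokens(
    hourly_cost_usd=100.0, peak_tokens_per_second=100_000, utilization=0.30)
high = effective_cost_per_million_tokens(
    hourly_cost_usd=100.0, peak_tokens_per_second=100_000, utilization=0.60)

print(f"30% utilization: ${low:.3f} per million Tokens")
print(f"60% utilization: ${high:.3f} per million Tokens")
```

Because the hourly cost is paid whether or not the chip is busy, doubling utilization halves the cost per Token in this model without touching the purchase price.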

IV. OpenAI is Looking More and More Like Apple

Many interpret Jalapeño as OpenAI challenging Nvidia, but I think OpenAI doesn't want to become the next Nvidia; it is emulating Apple.

Apple's greatest strength has never been any single point. The iPhone is strong, iOS is strong, the A-series and M-series chips are strong, the App Store is strong. But what is truly hard to compete against is that all of these sit inside the same closed loop.

Chips are optimized for the system, the system is optimized for applications, and the application experience in turn defines the next generation of chips. This closed loop allows Apple to deliver experiences under the same battery, same size, and same thermal constraints that others find hard to replicate.

OpenAI is building something similar. The model is the intelligence kernel, ChatGPT is the super entry point, Codex is the development tool, API is the ecosystem distribution layer, Jalapeño is the custom chip, and data centers are the AI factories.

Over the past two years, OpenAI CEO Altman has repeatedly discussed chips, energy, nuclear fusion, and data centers. Looking back now, he might not have been chasing trends at all; he had simply stopped planning OpenAI the way one would plan an AI startup.

If Nvidia sells shovels, then OpenAI wants to own the mine.

Nvidia wants to be the factory equipment supplier for all AI companies, selling GPUs, networking, systems, software ecosystems, AI factory solutions—its ideal customers are every company that needs to produce Tokens.

OpenAI wants to build a factory for itself, selling not the equipment, but the final, generated intelligence.

In the short term, OpenAI still depends on Nvidia. Training and general-purpose computing still require the GPU platform, and Jalapeño likely won't cover all workloads quickly. It will probably go first into the inference scenarios where OpenAI's demand is most certain, the scale is largest, and the return on optimization is highest.

In the long term, cracks have appeared. When model companies start having their own chip roadmaps, Nvidia's customers are no longer just customers. They also become another type of player in the AI infrastructure landscape.

Words Beyond the Layout

Over the past two decades, the most important asset on the internet was traffic. Whoever controlled the users, controlled the value.

Today, new rules are emerging in the AI era.

Models are becoming more like traffic, while compute is becoming more like land.

Models will iterate, products will change, leaderboards will keep refreshing. But those factories that produce intelligence—chips, networks, data centers, energy—will increasingly concentrate in the hands of a few players.

GPT designing GPT looks like just another tape-out.

But what it truly announces is this:

OpenAI is no longer satisfied with being the smartest company; it wants to be the company that controls the production of intelligence.

This article is from the WeChat public account Layout Beyond; author: Huahua; title image: AI-generated.
