Both OpenAI and Anthropic are 'Developing Their Own Chips' — Beyond Cost, the Control Over Computing Power is Paramount

marsbit | Published 2026-07-03 | Last updated 2026-07-03

Summary

OpenAI and Anthropic are both advancing plans to develop custom AI chips, driven by the need to control computing power and reduce costs. According to reports, Anthropic is in early-stage development of its own chips and in talks with Samsung for manufacturing, while OpenAI is collaborating with Broadcom and TSMC, aiming to deploy its first inference chip by late 2026. The primary motivation extends beyond just lowering expenses. For these large model companies, chips are core production assets. By designing specialized hardware (ASICs) tailored to their specific model architectures—OpenAI's being more sparse and Anthropic's more dense—they aim to achieve deeper software-hardware co-design. This synergy can significantly improve inference speed, energy efficiency, and overall unit economics, offering advantages that off-the-shelf GPUs cannot. This move does not signify an immediate replacement for suppliers like Nvidia. The process from design to deployment takes 18-24 months, and Nvidia's GPU ecosystem remains deeply entrenched. Instead, custom chips provide a strategic alternative and negotiating leverage, allowing companies to use them for specific, high-volume workloads like inference while still relying on external GPUs and TPUs for other tasks. The trend reflects a broader industry shift where AI competition is evolving from pure algorithmic prowess to integrated control over the entire software-hardware stack. Companies like Google, Amazon, Meta, and Microsoft are a...

According to a report by The Information on Thursday, Anthropic is in talks with Samsung about custom AI chips and has initiated early-stage development work on its own AI chips. If these custom server chips eventually enter mass production, it will mark a significant step for the company behind Claude in advancing hardware autonomy.

This move is seen as Anthropic following in OpenAI's footsteps.

OpenAI started its custom AI chip project earlier, working with chip design and manufacturing partners to build a more independent and efficient computing infrastructure for products like ChatGPT. The actions of both companies point to the same trend: large model companies are shifting from pure algorithmic competition to integrated software-hardware competition.

The market impact first falls on three fronts: the bargaining environment for external GPU suppliers like Nvidia, opportunities for foundries like Samsung in AI chip orders, and the future financing and IPO pace of AI startups.

According to Barron's, Deutsche Bank analysts recently suggested that OpenAI and Anthropic should not delay their IPOs for too long, one reason being that developing their own chips and computing infrastructure requires massive long-term capital.

Developing Their Own Chips Is Primarily About Control of Computing Power

Training and running large models currently requires vast amounts of high-performance computing resources. The AI computing market relies heavily on Nvidia's GPU architecture, and supply that lags demand keeps model training and inference costs high. For model companies like OpenAI and Anthropic, chips are no longer just procurement items but core means of production.

Demand for Anthropic's Claude models has grown significantly in 2026. TradingKey reported that Anthropic executives previously disclosed the company's annualized revenue has exceeded $30 billion, compared to about $9 billion at the end of 2025. Business expansion rapidly increases demand for computing power, and it also amplifies the impact of external chip supply uncertainty on the company's operations.

Anthropic still relies on various third-party chip solutions, including TPUs designed by Alphabet's Google and Amazon's self-developed chips. Reports indicate Anthropic has also entered long-term TPU supply agreements with Google and Broadcom, related to its previously announced $50 billion U.S. computing infrastructure investment plan.

This means that developing one's own chips does not equate to completely breaking away from external suppliers. A more realistic goal is to master core design capabilities, create technical alternatives, and enhance leverage in future business negotiations.

Cost is Just the Entry Point; Hardware-Software Co-design is Key

The most direct reason for developing one's own chips is to reduce costs. Through custom ASICs, AI companies can optimize computing processes around their own model architectures, reducing unnecessary modules in general-purpose chips, thereby improving energy efficiency. If Anthropic's chips are successfully taped out and deployed, reports suggest they could significantly lower API call costs and influence pricing structures in the enterprise AI application market.

But cost is not the only variable. Dylan Patel, founder of SemiAnalysis, emphasized in an interview that the greatest room for AI efficiency improvement doesn't come solely from faster chips, but from co-design across models, kernels, and silicon. He believes optimizing a single layer might yield a 2x improvement, while cross-layer co-design could deliver gains far greater than simply multiplying the per-layer improvements.
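As a rough illustration of that baseline, the sketch below multiplies per-layer speedups together. The three layer names follow Patel's list, but assigning 2x to each layer is an assumption made here for illustration; only the single-layer 2x figure comes from the interview, and his point is that real co-design can exceed even this multiplied result.

```python
from math import prod

# Hypothetical per-layer speedups (assumed values, for illustration only).
layer_speedups = {"model": 2.0, "kernels": 2.0, "silicon": 2.0}


def combined_speedup(speedups):
    """Multiply independent per-layer speedups into one overall factor."""
    return prod(speedups.values())


# Simple-multiplication baseline that co-design is claimed to beat.
print(combined_speedup(layer_speedups))  # 8.0
```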

This explains why OpenAI and Anthropic are moving towards deeper hardware involvement. Model architectures are not naturally suited to all chips. Dylan Patel stated that OpenAI models are more sparse-oriented, while Anthropic models are relatively more dense. They have significant differences in areas like matrix multiplication unit size, attention mechanism structure, and expert layer shapes, which naturally inclines the two companies towards different hardware directions. "In fact, given the direction OpenAI models are heading, using TPUs could be a bad decision for them; similarly, given the direction of Anthropic and Google models, using GPUs for training could be a bad decision for them," he said.
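To make the sparse versus dense distinction concrete, here is a minimal sketch that counts feed-forward weights for a dense block and for a mixture-of-experts block. Reading "sparse" as mixture-of-experts is an assumption based on the mention of expert layers, and every dimension below is hypothetical rather than taken from any OpenAI or Anthropic model.

```python
def dense_ffn_params(d_model: int, d_ff: int) -> int:
    """Weights in one dense feed-forward block (up and down projections, no biases)."""
    return 2 * d_model * d_ff


def moe_ffn_params(d_model: int, d_expert: int, n_experts: int, top_k: int) -> tuple[int, int]:
    """Return (total, active-per-token) weights for a mixture-of-experts block.

    Router weights are ignored for simplicity.
    """
    per_expert = 2 * d_model * d_expert
    return n_experts * per_expert, top_k * per_expert


# Hypothetical sizes chosen only to show the contrast.
dense = dense_ffn_params(d_model=4096, d_ff=16384)
total, active = moe_ffn_params(d_model=4096, d_expert=2048, n_experts=64, top_k=8)

print(f"dense:  {dense:,} weights, all used for every token")
print(f"sparse: {total:,} weights stored, {active:,} used per token ({active / total:.1%})")
```

With these assumed sizes, both blocks do the same amount of multiplication per token, but the sparse block stores eight times as many weights and works on much narrower matrices. That difference in matrix shape and memory footprint is the kind of mismatch that can push two model families toward different hardware.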

In other words, developing one's own chips isn't just about replacing Nvidia GPUs with proprietary ones. The real goal is to allow models, from their initial design, to fit the underlying hardware, thereby improving inference speed, energy consumption, throughput, and unit economics.

Not an Immediate Replacement for Nvidia, but Long-term Balancing

The path from R&D through tape-out and verification to mass production and deployment of an in-house AI chip typically takes 18 to 24 months. Even if Anthropic reaches an agreement with Samsung, its own chips are unlikely to replace a substantial share of its existing computing supply in the short term.

OpenAI is further along. TradingKey reported that OpenAI chose to collaborate with Broadcom and TSMC, planning to deploy its first inference chip in the second half of 2026. Compared to Anthropic, OpenAI has been more proactive and is closer to deployment on the custom chip path.

Large model companies developing their own chips does point towards reduced dependence on suppliers like Nvidia. But this doesn't mean Nvidia's position will weaken quickly. Dylan Patel noted in the interview that Nvidia GPUs still hold advantages in generality, as many models and the open-source ecosystem are already optimized for GPUs. He also mentioned that the so-called CUDA moat isn't just CUDA itself, but the fact that a vast downstream ecosystem of models and software has been adapted to the shape of Nvidia's hardware. If a model's expert structure, hidden dimensions, and communication patterns are inherently better suited to GPUs, migrating to other chips might not be straightforward, even when those chips offer advantages.

Therefore, developing proprietary chips is more like establishing a second route. OpenAI and Anthropic will likely continue using GPU, TPU, Trainium, and other computing resources, while deploying self-developed ASICs for more specific, stable, and high-frequency workloads, especially inference scenarios.

Industry-wide "Computing Power Autonomy" Race Fully Underway

The shared logic behind OpenAI's and Anthropic's self-developed chips can be summarized in three points: reducing long-term computing costs, decreasing reliance on external supply, and improving model efficiency through hardware-software co-design.

Among these, the third point might be the most critical. As model companies scale, general-purpose computing power cannot fully meet the needs of differentiated architectures. Self-developed chips allow companies to place model design, system software, and underlying silicon within the same optimization framework.

The direction is clear: competition among large models is extending from "whose model is stronger" to "who can better control computing power, capital, and the hardware stack." This is the real reason both OpenAI and Anthropic are moving towards developing their own chips.

Anthropic's exploration is not an isolated case. From Google's decade-long TPU series, to Amazon's Trainium series focused on training, to Meta's MTIA series for inference, and Microsoft's ongoing Maia series, leading tech companies have all deeply engaged in the self-developed chip race.

For Samsung, securing Anthropic's chip manufacturing order would significantly strengthen the standing of its foundry business in AI. Samsung is currently competing fiercely with foundries like TSMC for advanced-node customers. Bringing in high-growth AI clients like Anthropic would help expand its footprint in the AI semiconductor landscape.

This article is from the WeChat public account Wall Street News. Author: Zhao Ying

Related Questions

Q: What are the main reasons OpenAI and Anthropic are developing their own AI chips?

A: The primary reasons are: 1) To gain control over core computing resources, reducing dependence on external suppliers like Nvidia. 2) To lower long-term computing costs. 3) To achieve deeper hardware-software co-design, optimizing chip architecture for their specific model architectures to improve efficiency, inference speed, and unit economics.

Q: How does developing custom AI chips help with hardware-software synergy?

A: Custom chip development allows companies to co-design models, system software, and underlying silicon within a single optimization framework. This enables chips to be tailored specifically to the characteristics of their AI models (e.g., OpenAI's models being more sparse vs. Anthropic's being denser). This cross-layer co-design can lead to significantly greater performance gains than optimizing just one layer, improving inference speed, energy consumption, throughput, and cost-effectiveness.

Q: Will self-developed chips from OpenAI and Anthropic immediately replace Nvidia GPUs?

A: No, they will not immediately replace Nvidia GPUs. The process from R&D to mass deployment typically takes 18-24 months. Even if successful, these custom ASICs are more likely to be used for specific, stable, and high-frequency workloads (especially inference), while companies will continue to use a mix of GPUs, TPUs, and other external chips. Nvidia's ecosystem and the adaptation of many models to its hardware remain significant advantages, making self-developed chips a strategic alternative for long-term balancing rather than a direct, immediate replacement.

Q: According to the article, what is a key market impact of AI companies developing their own chips?

A: A key market impact is on the bargaining environment for external GPU suppliers like Nvidia. It also affects opportunities for foundries (like Samsung) to secure AI chip orders and influences the future financing and IPO timelines for AI startups, as building self-developed chips and compute infrastructure requires massive long-term capital investment.

Q: Which other major tech companies are mentioned as already having in-house AI chip projects?

A: The article mentions several other tech giants with in-house AI chip projects: Google with its long-standing TPU series, Amazon with its Trainium chips for training, Meta with its MTIA series for inference, and Microsoft with its ongoing Maia chip series.
