The Underlying Logic of Bottleneck Propagation in the AI Computing Power Industry Chain

Q: What are the four sequential bottleneck stages in the AI computing power supply chain as described in the article, and which one is identified as the 'ultimate bottleneck'?

The four sequential bottleneck stages are: 1) GPU/Computing, 2) Memory (HBM), 3) Optical Interconnect, and 4) Power + Liquid Cooling. The article identifies the fourth stage, Power and Liquid Cooling, as the 'ultimate bottleneck' or final physical constraint, as even if all other components are ready, a lack of power and cooling prevents the AI clusters from running.

Q: Why did High Bandwidth Memory (HBM) become a critical bottleneck after the initial GPU shortage was alleviated?

HBM became the critical bottleneck because as GPU computing power increased to handle massive AI models with trillions of parameters, the need for faster data transfer (memory bandwidth) created a 'memory wall.' HBM, which is much faster than traditional DDR memory, is essential for feeding data to these powerful GPUs. Its complex manufacturing process (involving TSVs and stacking) and limited suppliers (SK Hynix, Samsung, Micron) made its supply unable to keep up with explosive demand, delaying entire AI cluster deployments even when GPU chips were available.

Q: According to the article, what is the fundamental reason the industry is transitioning from copper cables to optical interconnects for scaling AI clusters?

The fundamental reason is the physical limitations of copper cables. While usable within a single server rack, copper cables face severe signal attenuation, excessive weight (e.g., over 1.36 tons for an NVL72 rack), high power consumption for signal integrity, and distance constraints when scaling to multi-rack clusters with thousands of GPUs. Optical interconnects (like CPO and silicon photonics) offer higher bandwidth density, lower power per bit, and longer transmission distances, making them a necessity for breaking the performance ceiling of large-scale AI data centers.

Q: How does the article characterize the nature of bottlenecks in the AI computing power supply chain, and what investment shift does this logic explain?

The article characterizes the bottlenecks as forming a system-level 'Leontief production function,' where components like GPU, HBM, interconnect, power, and cooling are complementary constraints: the system's capacity is determined by the lowest-performing (most bottlenecked) component. This logic explains the shift in investment focus from earlier leaders like NVIDIA and TSMC to companies in subsequent bottleneck areas: HBM suppliers (SK Hynix, etc.), optical component makers (Lumentum, Coherent), and power/cooling infrastructure providers (Vertiv, power companies), as each bottleneck shift reshapes value distribution in the industry chain.

Q: What specific data points from major investment banks does the article cite to illustrate the scale and unpredictability of current AI infrastructure demand?

The article cites several independent data points: Morgan Stanley noted a 2.5x increase in global weekly LLM token consumption in 3 months. J.P. Morgan identified a 122 GW financing gap for data center projects over 5 years and that 44% of new U.S. power projects face over 4-year grid connection waits. Bank of America significantly raised Alphabet's 2026 CAPEX forecast to $181.5 billion (a doubling year-over-year), expecting a 62% drop in free cash flow. These figures from different research paths collectively show AI demand has exceeded all traditional planning models for power, semiconductor equipment, and memory pricing.

marsbit | Published 2026-05-22 | Updated 2026-05-22

Introduction

The article analyzes the evolving bottleneck progression within the AI compute supply chain. Initially constrained by GPU chip and advanced packaging capacity (2022-2024), the primary bottleneck shifted to HBM memory (2024-2025) due to massive model parameter growth. As cluster scale expands, physical limits of copper interconnects are making optical interconnect technologies the next critical phase (2025-2026). The ultimate, emerging constraint is power delivery and advanced liquid cooling (from 2026 onward), driven by skyrocketing rack power densities exceeding traditional infrastructure limits. The core thesis is that AI compute demand follows a "Leontief" production function where solving one bottleneck immediately exposes the next in the sequence: Compute (GPU) → Memory (HBM) → Interconnect (Optics) → Power & Cooling. Each shift reallocates value and investment across the semiconductor and infrastructure landscape.

Author: qinbafrank

In February, the article "What Does This War of Capital Expenditure Mean?" argued that key segments of the computing power industry chain can still capture the greatest value: chips, packaging & testing, memory, optical modules, etc. Segments whose capacity is difficult to expand rapidly, or that have extremely high moats, will enjoy the dividends of massive capital expenditures.

There is still significant room for efficiency optimization: on the inference side, distillation, quantization, MoE, dedicated chips, liquid cooling, and (in the long term) nuclear fusion may reduce the energy consumption and cost per unit of computing power by another 10 to 100 times. Opportunities should be sought in these segments.

Recently, multiple investment banks including Morgan Stanley, J.P. Morgan, Bank of America, Goldman Sachs, UBS, Citi, Bernstein, and HSBC have published updated reports on AI, semiconductors, power, and memory. The bottlenecks for AI hardware have expanded from the single dimension of "GPU supply" to collective tension across five dimensions: power, chips, memory, equipment, and materials.

The scale of AI demand has broken through the forecast intervals of all traditional power planning, semiconductor equipment capacity, memory price models, and robot installation assumptions.

Morgan Stanley's global thematic research review points out that global weekly large language model token consumption soared from 6.4 trillion to 22.7 trillion within 3 months, an increase of roughly 2.5 times (about 3.5 times the starting level). The U.S. data center power gap for 2025-28 is put at 55 GW. J.P. Morgan's inaugural coverage of data center high-performance computing project debt gives a figure of a "122 GW financing gap in the next 5 years". U.S. 5-year power planning has surged from 101 GW to 230 GW, with 44% of new projects facing grid connection waits of more than 4 years. Bank of America's latest target price report for Alphabet revises its 2026 capital expenditure forecast upward to $181.5 billion, doubling year-on-year, with free cash flow declining 62%. These three sets of data are not outputs of the same framework but independent portraits from three separate institutions following different research paths.
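The token-consumption growth figure is easy to misread, so a short arithmetic check may help. The sketch below uses only the two token counts quoted above; the implied monthly rate additionally assumes exactly three months of smooth compounding, which is an assumption made here and not a claim from the Morgan Stanley review.

```python
# Weekly LLM token consumption quoted in the article (tokens per week).
start_tokens = 6.4e12
end_tokens = 22.7e12

multiple = end_tokens / start_tokens   # end level relative to the start level
increase = multiple - 1                # growth expressed as "an increase of N times"

# Assumption (not from the source): exactly 3 months of constant compounding.
months = 3
monthly_rate = multiple ** (1 / months) - 1

print(f"end/start multiple: {multiple:.2f}x")
print(f"increase over start: {increase:.2f}x ({increase:.0%})")
print(f"implied monthly growth: {monthly_rate:.1%}")
```

Read this way, consumption ended the period at about 3.5 times its starting level, which is the same thing as an increase of about 2.5 times.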

The evolution of bottlenecks in the semiconductor industry chain (especially in AI computing power) follows a clear sequence: "Computing (GPU) → Memory (HBM, etc.) → Optical Interconnect → Power/Liquid Cooling". This is the industry consensus for 2025-2026. As AI training and inference clusters scale from single cabinets (dozens of GPUs) to super-large deployments (thousands to hundreds of thousands of GPUs), each time the bottleneck in one segment is resolved, the next physical or supply chain constraint is immediately exposed. The result is a set of "Leontief-style" complementary constraints: if one input is missing, nothing can be shipped.

It is necessary to understand why this evolution occurs, the current status, and the underlying physical/engineering reasons:

1. First Phase Bottleneck: GPU Computing (Dominant from 2022-2024)

Core Constraint: High-end GPU (e.g., NVIDIA Hopper H100 → Blackwell B200 → Rubin) wafer capacity itself + advanced packaging.

Why it was the bottleneck: Large AI models require massively parallel computing. TSMC's 4nm/3nm/2nm logic processes and CoWoS (2.5D/3D packaging) capacity became the biggest choke point. Even when front-end wafers were sufficient, the back-end capacity to package logic dies together with HBM stacks could not keep up, so finished GPUs could not be produced.

Easing situation: TSMC aggressively expanded CoWoS (capacity doubling over 2024-2025), and NVIDIA Blackwell is shipping in large volumes. But this only unlocked the "computing" segment and immediately exposed new problems.

2. Second Phase Bottleneck: Memory (HBM High Bandwidth Memory, becoming the tightest from 2024-2025)

Core Constraint: HBM3/HBM3e/HBM4 capacity.

Why it became the next bottleneck: GPU computing power increased, but model parameters exploded (trillions to tens of trillions of parameters), making data movement (memory bandwidth) the "memory wall." HBM can transmit several TB of data per second, over 20 times faster than conventional DDR memory. Because HBM is adjacent to the logic chip, data doesn't need to travel far, thus saving energy.

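A single B200 GPU requires 192GB+ of HBM3e. Across the 72 GPUs of a single cabinet (NVL72), that is about 13.8TB of HBM (72 × 192GB); the 30-40TB sometimes quoted per cabinet appears to count total fast memory, including CPU-attached memory, and not HBM alone. Either way, bandwidth demands far exceed traditional DRAM.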

Supply chain status: Only SK Hynix, Samsung, and Micron can mass-produce HBM, with complex processes (TSV + stacking). 2025 supply is already sold out, 2026 remains in short supply, with prices soaring 246% year-on-year. Even if GPU chips are ready, without HBM, assembly and delivery are impossible, causing delays in entire AI cluster deployments.

Result: Memory transformed from a "commodity" into a strategic choke point, potentially accounting for 30% of capital expenditures.

3. Third Phase Bottleneck: Optical Interconnect (Transition underway in 2025-2026)

Core Constraint: Physical limits of copper cables (NVLink/NVSwitch) in bandwidth, distance, power consumption, and weight.

Why a shift to optics is inevitable: Copper still works within a single cabinet (72 GPUs), but its limits show when scaling to multi-cabinet interconnects or thousands of GPUs. Attenuation is severe (effective distance <1 meter at 1.8TB/s bandwidth), weight explodes (an NVL72 cabinet holds more than 5,000 copper cables with a total weight of 1.36 tons), and power is a constraint as well (replacing that copper with pluggable optical modules would add an extra 20,000W). Signal integrity, latency, and cooling cannot support larger clusters.

Solution: Shift to optical interconnect (CPO, i.e. Co-Packaged Optics, plus Silicon Photonics). Optical engines are embedded directly next to the GPU/ASIC and fiber is used for scale-out, achieving higher bandwidth density, lower per-bit power consumption, and longer distances.

NVIDIA bet heavily on this at GTC 2026 and has invested in optical companies. Demand for 800G/1.6T optical modules is exploding, and companies like Lumentum, Broadcom, Coherent, and Ayar Labs are becoming the new winners.

Current progress: Copper has reached its limit. Optics are shifting from "optional" to "mandatory," breaking through AI data center performance ceilings.

4. Fourth Phase Bottleneck (The Current Frontier): Power + Liquid Cooling (Becoming the ultimate physical constraint from 2026 onwards)

Core Constraint: Power Wall + Cooling Wall + Grid Access.

Why it's the ultimate bottleneck: Each GPU's power consumption rose from 300W→700-1200W. Single cabinet power surged from 10-20kW (CPU era) to 120-200kW+ or even higher. Traditional air cooling has a physical limit of only 20-50kW, with unacceptable noise, airflow, and energy consumption.
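The gap between these figures and the air-cooling ceiling can be checked with simple arithmetic. The sketch below multiplies the per-GPU power range quoted above by the 72 GPUs of an NVL72 cabinet. It counts GPUs only, so CPUs, switches, and power-conversion losses are deliberately left out, and the function names are illustrative.

```python
GPUS_PER_CABINET = 72          # NVL72 cabinet size cited in the article
AIR_COOLING_CEILING_KW = 50    # upper end of the 20-50 kW air-cooling limit


def gpu_load_kw(watts_per_gpu, gpus=GPUS_PER_CABINET):
    """GPU-only electrical load of one cabinet, in kW."""
    return watts_per_gpu * gpus / 1000


def needs_liquid_cooling(cabinet_kw, ceiling_kw=AIR_COOLING_CEILING_KW):
    """True when the load exceeds what air cooling is said to handle."""
    return cabinet_kw > ceiling_kw


low = gpu_load_kw(700)     # lower end of the per-GPU range
high = gpu_load_kw(1200)   # upper end of the per-GPU range
print(f"GPU-only load: {low:.1f}-{high:.1f} kW per cabinet")
print("exceeds air cooling:", needs_liquid_cooling(low), needs_liquid_cooling(high))
```

Even before any non-GPU load is counted, a 72-GPU cabinet already sits above the air-cooling ceiling, which is in line with the 120-200kW+ cabinet figures once the rest of the system is included.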

Power side: Data centers require GW-level power supply, with grid connection queues potentially lasting years. Delivery cycles for transformers, solid-state transformers, and other equipment are extending to 100 weeks. Microsoft's CEO once bluntly stated, "We have GPUs but no electricity to plug them into."

Liquid cooling side: Must switch to Direct-to-Chip liquid cooling or immersion cooling, combined with microfluidics, cold plates, and other technologies. TSMC has demonstrated silicon-based liquid cooling on the CoWoS platform, supporting >2.6kW TDP. Liquid cooling/thermal management companies like Vertiv (VRT) are becoming new infrastructure core players.

Chain reaction: PUE (Power Usage Effectiveness) requirements are <1.2. Waste heat recovery, nuclear/new energy grid integration have become new topics. Even if all previous segments are solved, without power and cooling, cabinets cannot be racked and operated.
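PUE is conventionally defined as total facility power divided by IT equipment power, so a PUE target translates directly into a budget for cooling and other overhead. The sketch below works through that relationship. The 1,000 MW IT load is a hypothetical round number standing in for a "GW-level" site, not a figure from the article.

```python
def pue(total_facility_mw, it_load_mw):
    """Power Usage Effectiveness: total facility power / IT equipment power."""
    return total_facility_mw / it_load_mw


def overhead_budget_mw(it_load_mw, pue_target):
    """Maximum non-IT power (cooling, conversion losses, etc.) allowed by a PUE target."""
    return it_load_mw * (pue_target - 1)


it_load_mw = 1000.0   # hypothetical GW-level IT load (assumption)
target = 1.2          # PUE requirement cited in the article

budget = overhead_budget_mw(it_load_mw, target)
print(f"overhead budget at PUE {target}: {budget:.0f} MW")
print(f"check: PUE = {pue(it_load_mw + budget, it_load_mw):.2f}")
```

At this scale a PUE of 1.2 leaves about 200 MW for everything that is not compute, which is why cooling efficiency and waste heat recovery become part of the power problem.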

The Essential Logic of AI Computing Power Industry Chain Bottleneck Shifts

AI computing power is not a "single-point" issue but a systemic Leontief production function: GPU, HBM, interconnect, power, and cooling must be matched to one another, and the lowest-capacity component sets the output of the whole system. Each time hyperscalers (Google, Microsoft, Meta, etc.) solve one constraint, they immediately push capital and innovation to the next segment.
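The Leontief logic can be made concrete with a small model in which output is the minimum, across all inputs, of supply divided by the amount each deployed cluster needs. The sketch below is illustrative only: the `Segment` class, the function name, and every supply number are hypothetical and chosen to show the mechanism, not taken from the article.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    available: float    # supply on hand, in the segment's own units
    per_cluster: float  # amount one deployed cluster consumes


def deployable_clusters(segments):
    """Return (clusters, bottleneck_name) under a Leontief production function.

    Output is the minimum over all inputs of available / per_cluster, so
    surplus in any other segment adds nothing.
    """
    ratios = {s.name: s.available / s.per_cluster for s in segments}
    bottleneck = min(ratios, key=ratios.get)
    return ratios[bottleneck], bottleneck


# Hypothetical supply figures, chosen only to show the mechanism.
supply = [
    Segment("gpu", available=100, per_cluster=1),
    Segment("hbm", available=80, per_cluster=1),
    Segment("optics", available=90, per_cluster=1),
    Segment("power_cooling", available=60, per_cluster=1),
]
before = deployable_clusters(supply)

# Doubling power and cooling does not double output: another segment binds next.
relieved = [
    Segment(s.name, s.available * 2 if s.name == "power_cooling" else s.available, s.per_cluster)
    for s in supply
]
after = deployable_clusters(relieved)
print("before:", before)
print("after: ", after)
```

In this toy example, relieving the binding constraint raises output only up to the level of the next-scarcest input, which is the propagation pattern the article describes.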

Currently (2026), we are in the transition period of "accelerated optical interconnect deployment + large-scale commercialization of power/liquid cooling." New bottlenecks may yet emerge (e.g., lasers, fiber materials, or grid transformers), but this chain of "computing → memory → optics → power/cooling" has become the recognized industry path.

This also explains why the investment logic is shifting from NVIDIA/TSMC to the HBM trio (SK Hynix, etc.), optical manufacturers (Lumentum, Coherent), and liquid cooling/power infrastructure companies (Vertiv, related power supply companies).

Every bottleneck shift is reshaping the value distribution across the entire semiconductor + data center industry chain.
