Author: Claude, Shenchao TechFlow
Shenchao Insight: The rental price for Nvidia B200 chips has fallen from a late-May high of $6.11 per hour to $4.22 per hour, a drop of approximately 30% in three weeks. Meanwhile, a rare divergence has emerged in the semiconductor sector: the SMH Semiconductor ETF gained 15% over the past month, with Micron and SanDisk each soaring nearly 60%, while Nvidia declined by 3% during the same period. For those holding Nvidia stock or considering AI infrastructure investments, the key takeaway is this: AI money isn't disappearing; it's just moving somewhere else.
Nvidia is still up about 12% year-to-date, but the market's attention seems to have shifted away from it for now.
Over the past month, the VanEck Semiconductor ETF (SMH) surged 15%, with Micron Technology and SanDisk each skyrocketing nearly 60%. Nvidia not only failed to keep pace but instead fell by about 3%. More tellingly, the B200 chip's cloud rental price, a core metric supporting Nvidia's valuation narrative, is also softening simultaneously.
According to GPU compute pricing platform Ornn, the hourly rental price for B200 chips hit a three-month high of $6.11 on May 30th, then declined continuously, dropping to $4.22 by last weekend, a decline of about 30%. Rich Privorotsky, Head of Goldman Sachs' One-Delta Trading Desk, directly addressed the topic last week: the myth of AI "compute scarcity" might be falling from its pedestal.
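The roughly 30% figure can be checked directly from the two Ornn prices quoted above. The snippet below is only a minimal sketch of that arithmetic; the function name pct_change is illustrative and not taken from any data provider's tooling.

```python
def pct_change(old: float, new: float) -> float:
    """Return the percentage change from old to new (negative means a decline)."""
    if old == 0:
        raise ValueError("old price must be non-zero")
    return (new - old) / old * 100.0


# Prices quoted from Ornn in the text above, in USD per GPU-hour.
peak_price = 6.11    # three-month high on May 30th
latest_price = 4.22  # level at the end of last weekend

decline = pct_change(peak_price, latest_price)
print(f"B200 rental price change: {decline:.1f}%")
```

The result lands just under a 31% decline, consistent with the "about 30%" cited in the text.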
B200 Rental Price Drops 30% in Three Weeks, "Compute Scarcity" Narrative Under Pressure
The Nvidia B200 is the core compute chip for current hyperscale data centers, and its rental price is seen as a barometer for AI infrastructure supply and demand. Data from multiple third-party tracking platforms suggests B200 pricing is loosening.
Ornn data shows the hourly rental price for B200 chips fell from a May 30th high of $6.11, ending last weekend at $4.22. The monthly price index compiled by AIMultiple from 63 cloud service providers shows a median B200 price of $6.11/hour, but new cloud (neocloud) vendors are offering floor prices as low as $3.44. Data tracked by GetDeploying from 26 B200 cloud service providers is more extreme: an average price of $4.99/hour, with the lowest quote at only $2.25/hour (for a three-year reserved contract).
Three factors are driving the price decline: improved yield rates for TSMC's 4NP manufacturing process are lowering B200 production costs; HBM3e supply from SK Hynix and Micron has notably eased in Q2 2026; and more new cloud vendors have secured B200 inventory—RunPod, Lambda, Nebius, Spheron, etc., have all listed spot availability, increasing competition and pushing down overall prices.
Pressure will likely intensify in the second half of the year. As Nvidia's next-generation Blackwell Ultra B300 chips begin entering the spot pool, some B200 capacity will shift from on-demand to spot (bidding) mode. Spot prices for B300 have already been quoted as low as $2.45/hour, below the $3.44 neocloud floor price for B200 cited above, though still above the $2.25 three-year reserved quote. Providers such as Spheron and Thunder Compute predict B200 on-demand prices may stabilize in the $2.50 to $3.00 range by Q4 2026.
For investors holding Nvidia stock, softening rental prices imply margin pressure for Nvidia's downstream customers (cloud providers, new cloud platforms). The purchasing willingness of these customers directly determines Nvidia's order cadence.
Major Divergence in Semiconductor Sector: Memory Soars, Nvidia Lags
The data for this divergence is quite stark.
Nvidia is up about 12% year-to-date in 2026 but down about 3% over the past month. The SMH Semiconductor ETF, by contrast, is up 84% year-to-date and gained 15% over the past month. Micron Technology surged nearly 60% over the past month, with its stock hitting an all-time high of around $1,089, a year-to-date gain exceeding 700%, and a market cap above $1.2 trillion. SanDisk also surged nearly 60% over the past month and has gained more than 4,400% over the past 52 weeks.
The market might not be losing faith in AI; it might simply believe the bottleneck in the AI value chain is shifting.
The previous logic was "GPU scarcity → Nvidia has pricing power → upstream makes the most money." The current logic seems to be: GPU supply is easing, but AI models' demand for high-bandwidth memory (HBM) and storage is exploding, making memory the new bottleneck.
Micron's latest quarterly earnings (fiscal Q2 2026) reported revenue of $23.8 billion, nearly tripling year-over-year (compared to $8 billion in the same quarter last year). SanDisk, which was spun off from Western Digital, reported Q3 FY2026 revenue of $5.95 billion, a 97% year-over-year increase.
Data released by TrendForce on June 16th shows memory contract prices soared over 100% in the first half of 2026, with structural shortages expected to persist into the second half. Apple CEO Tim Cook acknowledged last week in an interview that Apple can no longer continue absorbing the pressure of rising memory costs. When even a buyer with Apple's bargaining power publicly states it "can't shoulder it anymore," the pricing power of memory manufacturers is evident.
Micron will report its third-quarter earnings after the market close tomorrow (June 24th), with widespread market expectations for another record-breaking report. This earnings report will be a key test for whether the "memory supercycle" can continue.
Goldman Sachs Trading Head: The Core Metric is Rental Price
Goldman Sachs One-Delta Trading Desk Head Rich Privorotsky outlined a clear analytical framework last week:
If compute resources are truly scarce, rental prices should remain firm, justifying continued capital expenditure. If supply increases and rental prices keep declining, the core assumption of "compute scarcity" underpinning the valuation of the entire AI hardware chain will be shaken.
He further noted that this pressure would first manifest at the hardware layer. The true beneficiaries are companies selling complete systems and monetizing through usage, not just those selling "picks and shovels" upstream. The greater risk lies in the upstream hardware and infrastructure stack, where valuations still rely on the premise of "persistent shortage."
The implication is clear: Nvidia's business model is selling chips (picks and shovels), not charging based on usage. If downstream customers' rental prices are falling while Nvidia's chip prices aren't, a margin squeeze emerges in the middle, ultimately translating into slower orders.
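One way to picture this squeeze is through the payback period on a single GPU: with the purchase price held fixed, a lower rental price stretches the time a cloud provider needs to recover its outlay. The sketch below is purely illustrative. The two rental prices are the Ornn figures quoted earlier, but the hardware cost, utilization rate, and operating cost are hypothetical placeholders, not reported numbers, and payback_months is an illustrative name.

```python
HOURS_PER_MONTH = 730  # average hours in a month (8,760 / 12)


def payback_months(hardware_cost: float, rental_price: float,
                   utilization: float, opex_per_hour: float) -> float:
    """Months needed for net rental income to cover the hardware cost.

    Returns infinity when net hourly income is zero or negative,
    meaning the outlay is never recovered.
    """
    net_per_hour = rental_price * utilization - opex_per_hour
    if net_per_hour <= 0:
        return float("inf")
    return hardware_cost / (net_per_hour * HOURS_PER_MONTH)


# HYPOTHETICAL inputs, chosen only to illustrate the mechanism.
hardware_cost = 40_000.0  # USD per GPU (placeholder, not a quoted price)
utilization = 0.70        # share of hours actually rented (placeholder)
opex_per_hour = 0.80      # power, hosting, staff in USD per hour (placeholder)

at_peak = payback_months(hardware_cost, 6.11, utilization, opex_per_hour)
at_latest = payback_months(hardware_cost, 4.22, utilization, opex_per_hour)
print(f"Payback at $6.11/hour: {at_peak:.1f} months")
print(f"Payback at $4.22/hour: {at_latest:.1f} months")
```

Whatever placeholder values are used, the direction is the same: if the chip price does not fall alongside the rental price, the payback period lengthens, which is the mechanism behind slower orders.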
A recent "Tokenomics" report from Citadel Securities echoes a similar judgment: the core constraint for AI adoption has shifted from "model capability" to "cost and compute scarcity," with users accelerating migration to cheaper models. The token price index fell for seven consecutive days, marking its longest decline this year.
Santa Clara University Finance Professor Seoyoung Kim put it more bluntly: Most buyers don't know how much compute they'll need next year, suppliers don't know how many GPUs to order, and Nvidia doesn't know how many to produce. All three parties are guessing, and when the collective guess shifts from "won't be enough" to "might be too much," prices come under pressure.
SpaceX-Google's $30 Billion Mega-Contract: Long-Term Market Still Hot
While spot rental prices are falling, the long-term contract market tells a different story.
According to an SEC filing by SpaceX on June 5th, Google has agreed to pay SpaceX $920 million per month from October 2026 to June 2029 to lease approximately 110,000 Nvidia GPUs along with supporting processors, memory, and other components. The total contract value is about $30 billion. Earlier, in May, Anthropic signed a similar agreement with SpaceX, paying $1.25 billion per month to lease all available compute power at its Colossus 1 data center in Memphis, with a total value nearing $45 billion.
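The headline contract values follow from the monthly payments and the lease terms. The sketch below reproduces the arithmetic under one assumption: both October 2026 and June 2029 are billed as full months. The Anthropic term is not stated in the text, so the code only derives the term implied by the quoted totals; months_inclusive is an illustrative helper name.

```python
from datetime import date


def months_inclusive(start: date, end: date) -> int:
    """Count calendar months from start to end, including both end months."""
    return (end.year - start.year) * 12 + (end.month - start.month) + 1


# Google lease: $920 million per month, October 2026 through June 2029.
google_months = months_inclusive(date(2026, 10, 1), date(2029, 6, 1))
google_total = google_months * 920e6

# Anthropic lease: $1.25 billion per month, total value near $45 billion.
# The term is not given in the text, so it is inferred here.
anthropic_implied_months = 45e9 / 1.25e9

print(f"Google term: {google_months} months")
print(f"Google total: ${google_total / 1e9:.2f} billion")
print(f"Anthropic implied term: {anthropic_implied_months:.0f} months")
```

Under that assumption the Google lease runs 33 months, which puts the total slightly above the $30 billion cited.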
The background for these contracts is that after SpaceX completed its merger with xAI in February 2026, it converted xAI's previously self-built Colossus supercomputing cluster into a commercial asset for external leasing, locking in massive revenue ahead of its IPO (targeting a $1.75 trillion valuation).
For Nvidia, this is a mixed signal. On one hand, a long-term contract for 110,000 GPUs proves that major customers are still aggressively locking in compute capacity. RBC Capital Markets stated after the deal's announcement that Nvidia is "in the most favorable position among peers," suggesting these GPU leasing agreements can at least temporarily alleviate market concerns about ASICs eroding Nvidia's market share.
On the other hand, Google needs to lease compute from SpaceX precisely because its own build-out capacity can't keep up with demand. Google's 2026 capital expenditure is between $180 billion and $190 billion; its $920 million monthly payment to SpaceX, roughly $11 billion annualized, amounts to about 6% of that annual budget, essentially serving as "bridge capacity." As these super-customers' own data centers come online in 2027-2028, whether external leasing demand can maintain its current scale remains a question.
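The comparison between the SpaceX payment and Google's capital expenditure only works on an annualized basis. A quick check, assuming twelve equal monthly payments in a full year and using the capex range given above (annualized_share is an illustrative name):

```python
def annualized_share(monthly_payment: float, annual_budget: float) -> float:
    """Twelve monthly payments as a percentage of an annual budget."""
    return monthly_payment * 12 / annual_budget * 100.0


monthly_payment = 920e6                # Google's monthly payment to SpaceX, USD
capex_low, capex_high = 180e9, 190e9   # Google's 2026 capex range, USD

annual_payment = monthly_payment * 12
share_of_high = annualized_share(monthly_payment, capex_high)
share_of_low = annualized_share(monthly_payment, capex_low)

print(f"Annualized payment: ${annual_payment / 1e9:.2f} billion")
print(f"Share of capex: {share_of_high:.1f}% to {share_of_low:.1f}%")
```

The annualized payment comes to about $11 billion, or roughly 5.8% to 6.1% of the capex range, so "about 6%" holds at either end.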
The contract also includes a 90-day notice period for early termination. This doesn't look like a clause negotiated under conditions of "extreme compute scarcity"; it more resembles a buyer leaving themselves an exit strategy.
Nvidia's Risk: Not on Demand Side, But on Pricing Power
Stringing these clues together, the issue Nvidia faces is the shifting profit distribution within the AI value chain.
On the GPU supply side, three factors—improved TSMC yields, more vendors securing inventory, and the imminent large-scale arrival of B300—are alleviating the extreme shortages of 2024-2025. On the demand side, super-customers are still purchasing en masse, but the nature of procurement is shifting from "scrambling for supply at any cost" to "price comparison, long-term contracts locking in volume, and retaining exit rights." On the profit side, downstream cloud vendors' rental prices are already falling. If Nvidia's own chip prices cannot decline synchronously, the profit squeeze in the middle will eventually undermine order volume.
The newfound popularity of memory chips is the other side of this value chain migration.
The larger the AI model and the more inference tasks, the more inelastic the demand for high-bandwidth memory becomes. GPUs can improve efficiency through architectural upgrades (e.g., B200's FP4 precision halves bytes per parameter relative to FP8), but memory bandwidth is a physical bottleneck with no shortcuts. Micron's HBM capacity is already sold out for all of 2026, a state of "can't buy even with money" that contrasts starkly with the declining B200 rental prices.
Micron's earnings report tomorrow will provide the next crucial data point. If revenue and guidance exceed expectations again, the narrative of "AI value chain migration from GPU to memory" will be further reinforced. For investors, this isn't about being bearish on AI; it's about needing to reconsider whose pricing power is strengthening and whose is weakening on the AI chain.






