How Much of Your Claude Subscription Fee Reaches Optical Module Companies?

marsbit | Published 2026-06-17 | Updated 2026-06-17

Introduction

How much of your $20 Claude Pro subscription actually goes to AI model companies like Anthropic? A viral breakdown image highlights the fundamental valuation challenge for AI applications versus traditional SaaS. Unlike SaaS with high software margins, AI subscriptions face variable "inference costs": every user query consumes GPU time, power, and cloud resources. This creates a tension between fixed subscription fees and usage-driven expenses. While the specific dollar splits are illustrative, the core question is whether AI revenue can achieve SaaS-like margins as usage scales. Currently, infrastructure providers (cloud platforms, GPU makers like Nvidia, HBM suppliers, power/data centers) capture more certain revenue from growing AI usage. Their financials reflect pricing power and faster earnings validation. The bullish case hinges on efficiency improvements: model optimization, caching, smaller models, and custom chips could lower per-token costs over time. The key debate is whether cost declines can outpace increases in user workload complexity and volume. Ultimately, for AI companies to command high SaaS-like valuations, they must demonstrate not just user growth but also improving gross margins after accounting for inference costs. Investors will scrutinize not just subscriber numbers, but usage patterns, enterprise pricing tiers, and real efficiency gains.

TL;DR

A chart circulating online estimates how the roughly $20 monthly fee for Claude Pro in the U.S. is split among model companies, cloud computing power, GPU depreciation, electricity, and the supply chain. It is prompting investors to re-evaluate how AI application revenue should be valued.

This chart is not official revenue-sharing data from Anthropic, Amazon Web Services, or NVIDIA, nor should it be treated as the real financials of any specific company. Its value lies in raising a more fundamental question: How much of the subscription fee users pay to AI applications can solidify into software gross profit, as seen with traditional SaaS?

The valuation story for traditional SaaS is clear. Once the software is developed, selling an additional account typically incurs low incremental costs. Mature pure software companies commonly achieve gross margins of 70% or even over 80%. Investors are willing to assign high multiples because profitability often has room to improve further as revenue scales.

The trouble with AI applications is that every user query, whether writing code, analyzing documents, or calling an agent, consumes GPU time, electricity, memory bandwidth, and cloud resources behind the scenes. On the surface it is a fixed monthly fee, but underneath lies a cost chain that varies with usage volume. Light users may be highly profitable, but costs can escalate rapidly for heavy users who run tasks continuously up to their usage limits or through the tools bundled with their plan.

The point of the $20 breakdown chart is therefore not how many dollars a specific company takes, but whether "AI application revenue inherently equals SaaS revenue." For AI companies to justify high valuation multiples, they need to demonstrate not just that users are willing to pay, but also that the usage-weighted gross margin can keep improving.

Behind the Subscription Fee Lies an Inference Cost Chain

The biggest difference between an AI subscription and a regular software subscription is that the marginal cost of "one use" is no longer close to zero.

In traditional SaaS, when a team adds an account, the service provider also incurs server, customer support, and bandwidth costs, but these costs typically don't increase linearly with every click. The real expenses are upfront R&D, sales, and customer acquisition. After the product scales, a significant portion of new revenue can be retained as profit.

Large language model products are different. A user inputs a question, and the model generates an answer. This process is called inference—the actual computation when the model is invoked by the user. Tokens are the basic unit for measuring the text the model reads and writes. The more users ask, the longer the context, and the more complex the generated content, the more tokens and computing power are consumed.

This creates a tension between fixed subscription fees and variable costs. The monthly fee for Claude Pro in the U.S. is roughly $20 (subject to region, taxes, and adjustments by Anthropic). Users see a fixed price, but the model company faces widely varying usage behaviors. Some users only write emails and look up information, while others process long documents, run coding tasks, or invoke more complex automated workflows.
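The tension between a flat fee and usage-driven cost can be made concrete with a small calculation. The sketch below is illustrative only: the cost per million tokens, the user mix, and the token volumes are hypothetical placeholders, not figures disclosed by Anthropic or any cloud provider. Only the roughly $20 fee comes from the article.

```python
from dataclasses import dataclass

MONTHLY_FEE_USD = 20.0  # approximate U.S. Claude Pro price cited in the article

# Hypothetical blended inference cost; not a disclosed figure from any company.
COST_PER_MILLION_TOKENS_USD = 4.0


@dataclass
class Segment:
    name: str
    share_of_users: float    # fraction of subscribers (hypothetical)
    tokens_per_month: float  # average monthly tokens per user (hypothetical)


def gross_margin(fee, tokens, cost_per_million_tokens):
    """Gross margin for one subscriber: (fee - inference cost) / fee."""
    inference_cost = tokens / 1_000_000 * cost_per_million_tokens
    return (fee - inference_cost) / fee


def weighted_gross_margin(segments, fee, cost_per_million_tokens):
    """User-weighted margin across segments.

    Every subscriber pays the same fee, so (with shares summing to 1)
    this equals the revenue-weighted gross margin.
    """
    return sum(
        s.share_of_users * gross_margin(fee, s.tokens_per_month, cost_per_million_tokens)
        for s in segments
    )


# Hypothetical user mix: mostly light users plus a minority of heavy users.
SEGMENTS = [
    Segment("light", share_of_users=0.8, tokens_per_month=500_000),
    Segment("heavy", share_of_users=0.2, tokens_per_month=6_000_000),
]

for s in SEGMENTS:
    m = gross_margin(MONTHLY_FEE_USD, s.tokens_per_month, COST_PER_MILLION_TOKENS_USD)
    print(f"{s.name}: {m:.0%}")

blended = weighted_gross_margin(SEGMENTS, MONTHLY_FEE_USD, COST_PER_MILLION_TOKENS_USD)
print(f"blended: {blended:.0%}")
```

Under these made-up inputs the heavy segment is loss-making while the blended margin stays positive, which is why the mix of light and heavy users matters more than the subscriber count alone.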

The circulating breakdown chart attempts to visualize this: out of the $20, a portion remains with the model company, and another portion goes to cloud and computing power providers. Computing power costs include electricity, operations, and GPU depreciation. GPU procurement flows further upstream to NVIDIA, TSMC, HBM (High-Bandwidth Memory) suppliers, optical modules, ODMs, and electricity-related companies.

Here, "GPU depreciation" means that the high purchase cost of a GPU is not expensed all at once but is amortized gradually into the cost of the AI service according to useful life, intensity of use, or accounting rules. The actual allocation depends on factors such as plan limits, the mix of light and heavy users, internal cloud provider pricing, reserved capacity discounts, GPU utilization, and depreciation periods. Average cost is not the same as marginal cost.
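A minimal sketch of the depreciation idea, assuming straight-line amortization and treating electricity as the only usage-driven cost. The purchase price, useful life, utilization, power draw, and electricity price are all hypothetical; real accounting and cloud pricing are more involved.

```python
HOURS_PER_YEAR = 365 * 24


def gpu_depreciation_per_hour(purchase_price, useful_life_years, utilization):
    """Straight-line depreciation spread over the hours the GPU is actually busy."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    busy_hours = useful_life_years * HOURS_PER_YEAR * utilization
    return purchase_price / busy_hours


def power_cost_per_hour(draw_kw, price_per_kwh):
    """Electricity cost of running the GPU for one hour."""
    return draw_kw * price_per_kwh


# Hypothetical inputs, chosen only to show the mechanics.
depreciation = gpu_depreciation_per_hour(
    purchase_price=30_000, useful_life_years=4, utilization=0.5
)
power = power_cost_per_hour(draw_kw=1.0, price_per_kwh=0.08)

# Average cost carries the amortized hardware; the marginal cost of one more
# busy hour on an already-purchased, idle GPU is approximated here by power alone.
average_cost_per_busy_hour = depreciation + power
marginal_cost_per_extra_hour = power

print(f"average:  ${average_cost_per_busy_hour:.2f} per busy GPU-hour")
print(f"marginal: ${marginal_cost_per_extra_hour:.2f} per extra GPU-hour")
```

Raising utilization in this sketch lowers the amortized cost per busy hour, which is one reason cluster scheduling shows up in the efficiency debate.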

The direction investors truly need to watch is this: AI application companies cannot just disclose revenue growth; they must also answer whether computing power costs are growing in sync with that revenue growth. If usage volume expands faster than model efficiency improves, the higher the subscription revenue, the more pressure there may be on gross margins. Only with sufficient improvement in efficiency can model companies have a chance to approach the profit structure of software companies again.

Infrastructure Gets More Certain Revenue First

At this stage, the revenue generated by growing AI usage flows more directly to infrastructure than it settles at the application layer.

Regardless of whether users engage with models through Claude, ChatGPT, Gemini, or internal enterprise agents, inference ultimately lands on computing power, electricity, memory, and networks. The application layer may see product turnover, but the consumption of underlying resources is more rigid. As long as AI usage continues to rise, cloud capital expenditures, GPU procurement, HBM demand, and data center electricity consumption will be driven up.

This is also why infrastructure chain players like NVIDIA, TSMC, and SK Hynix continue to be revalued by the market. NVIDIA's overall gross margin has been at high levels in recent years, with FY2026 GAAP and Non-GAAP gross margins at approximately 71.1% and 71.3%, respectively, with subsequent quarterly guidance remaining high. Note that individual quarters can be affected by specific charges, and public filings don't always allow a clean breakout of the real gross margin structure for AI data centers, but the pricing power of scarce infrastructure is already reflected in the financials.

HBM is a typical link in this chain. It's not ordinary memory but a key component in AI accelerators that supports high-throughput computing. As model scale, context length, and concurrent inference demands increase, AI chips become more reliant on high-bandwidth memory. Supply chain estimates indicate HBM accounts for a growing share of the cost in new-generation AI chips, which is a key reason why SK Hynix, Samsung, and Micron are being revalued in this AI cycle.

Electricity and data centers have also moved from background costs to a key investment theme. The energy consumption of a single ordinary text query might not be staggering, but complex agents, long contexts, code generation, and multi-step tasks amplify the computational load. For cloud providers and data center operators, the key isn't how much electricity one query uses, but what happens when massive inference requests occur continuously: cluster utilization, electricity prices, cooling, facility capacity, and grid access all become cost factors and bottlenecks.

The advantage on the infrastructure side is that results are validated faster. Cloud providers' AI capital expenditures are already happening; NVIDIA's revenue and gross margins appear in its reports; HBM vendor orders and pricing also reach P&L statements relatively quickly. The model application layer trades more on future expectations: subscription conversion, enterprise penetration, API revenue, and the profits released once the cost curve declines.

Efficiency Improvements Remain the Core Argument for Bulls

Software investors and AI bulls do have counterarguments. The core view of the efficiency camp is that today's relatively high inference costs are merely a phenomenon of the early stage. Model optimization, caching, smaller models, custom chips, and higher cluster utilization will continuously drive down unit costs. If costs fall fast enough, AI applications may still return to a high-margin software logic.

This rebuttal has a basis in reality. Some mainstream models have seen significant price drops per token while maintaining or improving capability. OpenAI has disclosed that GPT-4o mini's cost per token has decreased by 99% compared to the earlier text-davinci-003. The pace isn't uniform across all companies—Anthropic has recently focused more on same-price upgrades and model tiering—but the industry direction is still to provide stronger capabilities at lower cost.

Model companies also have multiple ways to improve unit economics. Simple tasks can be routed to smaller models; common requests can be reused via caching; long-context and complex tasks go to stronger models. Cloud providers, meanwhile, aim to lower unit computing costs through custom chips and cluster scheduling. Google has TPUs, Microsoft has introduced Maia for inference, and Amazon is advancing Trainium and Inferentia.
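The routing and caching levers described above can be expressed as an expected cost per request. This is a simplified sketch, not any vendor's actual serving logic: the per-request costs, cache hit rate, and share of traffic routed to the small model are hypothetical.

```python
def blended_cost_per_request(
    cache_hit_rate,
    small_model_share,
    cost_small,
    cost_large,
    cost_cached=0.0,
):
    """Expected cost of one request.

    Cache hits cost `cost_cached`; cache misses are split between a small
    and a large model according to `small_model_share`.
    """
    miss_rate = 1 - cache_hit_rate
    cost_on_miss = small_model_share * cost_small + (1 - small_model_share) * cost_large
    return cache_hit_rate * cost_cached + miss_rate * cost_on_miss


# Hypothetical per-request costs in USD.
COST_SMALL = 0.001
COST_LARGE = 0.02

# Baseline: no caching, every request goes to the large model.
baseline = blended_cost_per_request(0.0, 0.0, COST_SMALL, COST_LARGE)

# With caching and routing: 20% cache hits, 60% of misses served by the small model.
optimized = blended_cost_per_request(0.2, 0.6, COST_SMALL, COST_LARGE)

print(f"baseline:  ${baseline:.5f} per request")
print(f"optimized: ${optimized:.5f} per request")
```

The sketch treats cache hits as free by default, which overstates the saving; the `cost_cached` parameter is there so a nonzero lookup or cached-read cost can be plugged in.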

Looking purely at technological progress, there is indeed room for AI application margins to improve. Cheaper inference, better model routing, and stronger compression can all let the same $20 subscription cover more usage. A higher share of light users, higher-priced enterprise plans, tiered API pricing, and stricter usage limits can also improve overall unit economics.

The difficulty lies in the fact that cost reduction isn't the only variable. AI applications are evolving from simple chat to heavier workloads. In the past, users might have only asked questions and rewritten text. Now, more demand comes from coding agents, long document processing, video and multimodal generation, and enterprise automation workflows. These scenarios offer higher value but also higher consumption. The more useful the model, the more likely users are to entrust it with more complex, longer-duration tasks.

The disagreement thus becomes more specific: Can the speed of inference cost decline outpace the growth in usage volume and task complexity? If unit costs fall rapidly, but average user consumption grows even faster, the model company's weighted gross margin will still face pressure. Conversely, if model routing, caching, custom chips, and price tiering are effective enough, AI subscriptions could gradually shed today's heavy-cost characteristics.
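The race between falling unit costs and rising usage can be sketched as a simple compounding model. It assumes constant annual rates and a flat $20 fee, and every rate and starting cost below is hypothetical; it is meant to show the mechanics of the disagreement, not to forecast anything.

```python
def margin_path(fee, cost_year0, unit_cost_decline, usage_growth, years):
    """Gross margin per year when unit cost falls and per-user usage grows.

    Per-user cost compounds by (1 - unit_cost_decline) * (1 + usage_growth)
    each year; the fee is held constant.
    """
    factor = (1 - unit_cost_decline) * (1 + usage_growth)
    return [(fee - cost_year0 * factor**year) / fee for year in range(years + 1)]


def breakeven_usage_growth(unit_cost_decline):
    """Usage growth at which per-user cost stays flat: d / (1 - d)."""
    return unit_cost_decline / (1 - unit_cost_decline)


FEE = 20.0
COST_YEAR0 = 8.0  # hypothetical per-user inference cost in year 0

# Scenario A: unit cost falls 40% a year, usage grows 30% a year.
improving = margin_path(FEE, COST_YEAR0, unit_cost_decline=0.4, usage_growth=0.3, years=3)

# Scenario B: same cost decline, but usage doubles every year.
worsening = margin_path(FEE, COST_YEAR0, unit_cost_decline=0.4, usage_growth=1.0, years=3)

print("A:", [f"{m:.0%}" for m in improving])
print("B:", [f"{m:.0%}" for m in worsening])
print(f"break-even usage growth at 40% cost decline: {breakeven_usage_growth(0.4):.0%}")
```

In this model margins improve only while usage growth stays below the break-even rate, which is the bull and bear disagreement reduced to a single inequality.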

Subscriber Count Is Not Gross Margin

The $20 breakdown chart shouldn't be interpreted as the final outcome. It's more like a valuation reminder for the current stage: When the market still lacks sufficiently transparent gross margin data from model companies, investors need to discount the assumption that "AI application revenue inherently equals SaaS."

For unlisted model companies like OpenAI and Anthropic, external investors have little access to a complete financial picture. Clues will come from fundraising materials, partner disclosures, cloud cost structures, enterprise plan pricing, the proportion of API revenue, and usage limits. The truly valuable data isn't how many paying users there are, but the proportion of light versus heavy users, whether enterprise clients are willing to pay higher prices for intense usage, whether cloud settlement costs are declining, and whether unit inference cost reductions are flowing into company gross margins.

Validation in the public company chain will appear faster in financial reports. NVIDIA's overall gross margin and data center revenue growth, TSMC's demand for advanced processes and packaging, HBM vendor pricing and profit margins, and the intensity of cloud providers' capital expenditures will all continue to reflect whether AI usage is still being channeled to the infrastructure end. If these indicators remain strong while the model application layer lacks evidence of gross margin improvement, the market will likely continue to assign a more certain valuation premium to infrastructure.

Ultimately, for model companies to reclaim a higher valuation anchor, they need to prove not just that users are willing to pay $20, but that these subscription fees, after accounting for heavy usage, can still leave behind sufficient gross profit. The next round of valuation divergence likely won't be over the headline ARR numbers, but over whether inference costs, plan limits, and enterprise pricing can all work out simultaneously.

Related Questions

Q: What is the key challenge highlighted in the article regarding how AI application subscription revenue should be valued compared to traditional SaaS?

A: The key challenge is that AI application revenue does not naturally equate to SaaS revenue. Unlike traditional SaaS with high gross margins (70-80%) due to low marginal costs per additional user, AI subscriptions incur significant and variable marginal costs (for GPU time, electricity, memory bandwidth, etc.) with each user interaction (inference). Investors need proof that the weighted gross margin can sustainably improve despite rising usage, not just that users are willing to pay.

Q: According to the article, why are infrastructure companies like Nvidia currently seeing more certain and direct financial benefits from the AI boom than AI application companies?

A: Infrastructure companies see more certain and direct benefits because regardless of which specific AI application users engage with (e.g., Claude, ChatGPT), the underlying demand for computing resources (GPUs, HBM memory, cloud infrastructure, and power) is rigid and directly tied to usage volume. Their revenue and margins (like Nvidia's ~71% GAAP gross margin) are more immediately verifiable in financial reports, whereas AI app companies' valuations are based more on future expectations of profitability after potential efficiency gains.

Q: What is the core argument of AI 'efficiency optimists' or bulls in response to concerns about high inference costs?

A: Efficiency optimists argue that high inference costs are an early-stage phenomenon. They believe that continuous improvements, such as model optimization, caching, smaller specialized models, custom AI chips (like TPU, Maia, Inferentia), and better cluster utilization, will significantly drive down the unit cost of inference. If costs fall fast enough, AI applications could eventually achieve the high-margin economics of traditional software.

Q: What complicating factor could prevent AI application companies from achieving significantly improved gross margins, even as unit inference costs decrease?

A: The increasing complexity and intensity of user workloads could counteract falling unit costs. As AI models become more capable, users delegate more complex, longer, and resource-intensive tasks (e.g., code generation, long document analysis, multi-modal generation, enterprise automation). If the growth in average user consumption and task complexity outpaces the rate of unit cost reduction, the weighted gross margin for AI companies may still face pressure.

Q: What does the article suggest investors should focus on beyond headline subscription user numbers to properly value AI application companies?

A: Investors should focus on metrics related to profitability and cost structure, not just user counts. Key data points include the mix of light vs. heavy users, whether enterprise clients pay premium prices for high-intensity use, the trend in cloud/compute settlement costs, and evidence that unit inference cost reductions are translating into improved company gross margins. Transparency on these factors is needed to justify a high SaaS-like valuation multiple.
