Author: Haseeb
Compiled by: Jiahuan, ChainCatcher
@SemiAnalysis_ recently highlighted a striking quirk in the economics of AI programming subscriptions. If you max out your usage, the subscription works out 20 to 70 times cheaper than buying the same tokens through the API.
Many see this and say: My God, look at how much these LLM companies are subsidizing tokens. The bubble is bound to burst soon.

This reaction is wrong. LLM companies are willing to offer such generous packages precisely because most users rarely hit the ceiling. The product is like a gym membership: generous allowances exist because the vast majority of people hardly use them.
But the longer I think about this, the more something seems odd.
We don't know their actual blended margins on subscriptions, but according to SemiAnalysis, at 20% average utilization, Anthropic's Max 5x plan barely breaks even. A 20% utilization rate might even be optimistic, especially in organizations where everyone (including non-programmers) has a subscription but only uses it occasionally. Most institutions I know, including Dragonfly, are generous with Claude Code subscriptions and encourage non-technical staff to try them.

But what SemiAnalysis didn't delve into is that this is purely a small business phenomenon. Large enterprises cannot use this subscription pricing.
Here's why: When you reach 150 people or more, you are forced out of the "Team" subscription model. You must switch to "Enterprise," priced at a base of $20 per seat plus API fees based on actual token usage. Enterprises pay linearly for token costs, and SemiAnalysis estimates API token gross margins to be around 75%. This is a massive price hike that kicks in suddenly at 150 people.
So, if you are a small business or startup (or an individual), your perception of AI spending is distorted. Your token pricing is actually heavily subsidized; Anthropic likely maintains extremely low or even negative margins on you.
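As a back-of-the-envelope check on how these figures could fit together, here is a minimal sketch. It is not SemiAnalysis's model. It assumes that a maxed-out subscription consumes tokens worth `api_multiple` times its price at API rates, and that serving a subscription token costs the same as serving an API token, namely `1 - gross_margin` of the API price. The function name and both parameters are illustrative.

```python
def break_even_utilization(api_multiple: float, gross_margin: float) -> float:
    """Share of the usage cap at which serving cost equals the subscription fee.

    api_multiple: API-priced value of a fully used subscription, divided by
                  the subscription price (the article cites 20x to 70x).
    gross_margin: assumed gross margin on API tokens (the article cites ~75%).
    """
    if api_multiple <= 0 or not 0 <= gross_margin < 1:
        raise ValueError("need api_multiple > 0 and 0 <= gross_margin < 1")
    # Serving cost at full utilization, measured in units of the subscription fee.
    cost_at_full_use = api_multiple * (1 - gross_margin)
    return 1 / cost_at_full_use


for multiple in (20, 70):
    share = break_even_utilization(multiple, 0.75)
    print(f"{multiple}x multiple: break-even at {share:.1%} utilization")
```

Under these assumptions a 20x multiple breaks even at 20% utilization, and a 70x multiple at roughly 5.7%. Users above those levels are served at a loss, and users below them are profitable.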
You might wonder why Microsoft and Uber fret so much about token spend and talk about "tokenmaxxing." This is why: their structural cost per token is much higher than it is for startups and individuals.
But Anthropic doesn't care! For a B2B company, extracting maximum value from small companies or individuals isn't very meaningful. Look at companies like Datadog or Cloudflare; 80% to 90% of their revenue comes from large contracts (annual recurring revenue over $100,000). Making zero profit on the long tail is just a customer acquisition cost.
This is classic B2B sales thinking.
But there's another way to view the same situation: through the lens of tax policy.
If tokens are replacing labor, then the gross margin OpenAI and Anthropic charge on tokens is essentially a tax on AI labor.
Viewing token pricing this way leads to two major consequences.
Token Pricing as Tax Policy
Assume the margins in the SemiAnalysis article hold: subscriptions roughly break even, while API usage by large enterprises carries a 75% gross margin. The initial reaction is to call this a 75% AI labor tax on large organizations, and a 0% tax on startups.
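It helps to be precise about what a 75% gross margin means as a "tax." A gross margin is measured against the price, not the cost, so the mark-up over cost is much larger than 75%. The helper names below are illustrative, and the 75% figure is the SemiAnalysis estimate quoted above.

```python
def markup_over_cost(gross_margin: float) -> float:
    """Convert a gross margin (share of price) into a mark-up (share of cost)."""
    if not 0 <= gross_margin < 1:
        raise ValueError("gross_margin must be in [0, 1)")
    return gross_margin / (1 - gross_margin)


def price_for_cost(cost: float, gross_margin: float) -> float:
    """Price that yields the given gross margin on a given serving cost."""
    if not 0 <= gross_margin < 1:
        raise ValueError("gross_margin must be in [0, 1)")
    return cost / (1 - gross_margin)


print(markup_over_cost(0.75))     # mark-up as a multiple of cost
print(price_for_cost(1.0, 0.75))  # price charged per $1 of serving cost
```

At a 75% gross margin, $1 of serving cost is sold for $4. In tax terms that is a 75% tax-inclusive rate, or a 300% rate on the underlying cost.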
Standard tax analysis would say this discourages large companies from using AI labor internally, pushing them on the margin to reduce automation and retain more human labor. (Obviously, it also encourages using smaller or open-source models, but the net effect is both are incentivized. Remember, we're talking about the margin here.)
However, what drives behavior more strongly is not the average tax rate. In tax policy, it never is. What we really care about is the marginal tax rate.
For startups on flat-rate subscriptions, the marginal price of the next token is zero until they hit the cap. A zero marginal price is about the strongest distortion a pricing policy can create.
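The difference in marginal price can be sketched in a few lines. This is a toy model, not any vendor's actual billing logic: the class names, the fee, the cap, and the per-token price are all hypothetical, and the flat plan is assumed to sell no extra tokens once the cap is reached.

```python
from dataclasses import dataclass


@dataclass
class FlatPlan:
    monthly_fee: float  # hypothetical flat subscription fee
    token_cap: int      # hypothetical usage cap per billing period

    def marginal_price(self, tokens_used: int) -> float:
        # Below the cap, the next token costs nothing extra.
        # Assumption: at the cap, no further tokens can be bought on this plan.
        return 0.0 if tokens_used < self.token_cap else float("inf")


@dataclass
class MeteredPlan:
    price_per_token: float  # hypothetical API price

    def marginal_price(self, tokens_used: int) -> float:
        # Every token costs the same, no matter how many came before it.
        return self.price_per_token


startup = FlatPlan(monthly_fee=100.0, token_cap=1_000_000)
enterprise = MeteredPlan(price_per_token=0.00001)

for used in (0, 500_000, 999_999):
    print(used, startup.marginal_price(used), enterprise.marginal_price(used))
```

On the flat plan the decision to run one more experiment never shows up on the bill. On the metered plan it always does.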
For startups, the subscription model is basically an innovation subsidy. The overwhelming incentive is to figure out how to spend the entire token budget as efficiently as possible. This means running Ralph loops, filling screens with Claude Code sessions, scheduling swarms of agents to work together.

Until the cap is hit, exploration is free. So startups are essentially racing to squeeze the last drop of value from their subscription, out-producing each other. Paradoxically, the more you use, the lower the average token price. Every startup wants to be the one making Anthropic lose the most on subscriptions.
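The falling average price is simple arithmetic: the fee is fixed, so spreading it over more tokens lowers the price of each one. A minimal sketch, with a hypothetical $100 fee and made-up usage levels:

```python
def average_price_per_million(monthly_fee: float, tokens_used: int) -> float:
    """Effective price per million tokens on a flat-rate plan."""
    if tokens_used <= 0:
        raise ValueError("tokens_used must be positive")
    return monthly_fee / (tokens_used / 1_000_000)


FEE = 100.0  # hypothetical monthly subscription fee, in dollars

for tokens in (10_000_000, 100_000_000, 1_000_000_000):
    price = average_price_per_million(FEE, tokens)
    print(f"{tokens:>13,} tokens: ${price:.2f} per million")
```

Each tenfold increase in usage cuts the effective price per token by a factor of ten, which is why heavy users come out so far ahead of the API rate.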
Large enterprises face the opposite incentive. Once you reach 150 seats, every exploratory token is billed at the full API price, which on the SemiAnalysis estimate carries a 75% gross margin. So every step they take exploring the frontier is penalized linearly.
Big companies will still automate large, obvious bulk tasks. But marginal, experimental, risky automation will never be discovered because the cost of discovery is too high. This tax structure ultimately encourages them to retain more human labor, preserving their overall organizational structure.
This is the opposite of Japan. Due to a declining population, Japan faces a huge labor shortage. Historically, this meant Japan pursued intense automation, as high human costs incentivized it. That's why Japan has robots in restaurants, factories, hotels, hospitals.
Strangely, large enterprises find themselves in the opposite position from Japan: if they have to pay an extremely high tax to use AI, the incentive to automate weakens and the motivation to retain existing employees grows (even more so if wages stagnate during this period).
So where does labor substitution flow in this model?
Everyone is staring at big companies, waiting for AI layoffs. But with a 75% tax, aggressively replacing your own employees with AI may simply be uneconomical; token budgets would explode.
But this doesn't mean substitution won't happen; it just manifests differently.
When big firms lose market share to AI-native startups that carry minimal human labor costs, the resulting declines in revenue and stock price trigger layoffs. But those eliminated jobs never reappear at the winning startups. The net reduction in jobs is the same; the substitution has simply moved to a lower-taxed part of the economy.
This is also why "AI-washing" may not be a flash in the pan. AI-washing is when a company attributes layoffs to newfound AI efficiency when it is really just masking ordinary business weakness.
Many think this is just a blip in the current AI hype cycle. But even though everyone is primed to witness big companies doing real AI layoffs, "replacing jobs" with AI, that may never happen at scale.
Labor substitution might unfold a different way: startups beat incumbents, incumbents disguise their decline with AI-washing all the way to death, and the startups never rebuild those old jobs. Job substitution still happens, just not where everyone is looking.
That's the first consequence of this model. But there's a second, even weirder consequence.
The 150-Person Cliff
A regulatory notch is a threshold at which obligations change abruptly, inducing large jumps in behavior. Example: the 30-hour-a-week full-time employment threshold, which created tons of jobs that clocked in at exactly 29 hours a week.
France famously has extremely rigid labor laws that kick in once a company hits 50 employees (works councils, mandatory profit-sharing, firing protections). Smaller companies are exempt. This gives employers a huge incentive to desperately stay below 50 people.

Source: Garicano, Luis, Claire Lelarge, and John Van Reenen, 2016, Firm Size Distortions and the Productivity Distribution: Evidence from France.
Extend this analogy to AI. LLM companies have established a tax threshold that punishes companies for reaching 150 seats. This means you must stay small to keep that wonderful subsidized subscription price, with tokens taxed at ~0% (or even negative) rather than 75%.
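To make the notch concrete, here is a rough sketch of a company's monthly AI bill as a function of seat count. Only the 150-seat threshold and the $20 Enterprise base come from the pricing described earlier. The flat per-seat subscription price and the metered API spend per seat are hypothetical placeholders, chosen only to show the shape of the jump.

```python
ENTERPRISE_THRESHOLD = 150         # seats at which Enterprise pricing applies (from the article)
ENTERPRISE_BASE_PER_SEAT = 20.0    # dollars per seat (from the article)
SUBSCRIPTION_PER_SEAT = 100.0      # hypothetical flat subscription price per seat
API_SPEND_PER_SEAT = 500.0         # hypothetical metered token bill per seat


def monthly_ai_bill(seats: int) -> float:
    """Total monthly bill under the two-tier pricing sketched above."""
    if seats < 0:
        raise ValueError("seats must be non-negative")
    if seats < ENTERPRISE_THRESHOLD:
        # Flat-rate tier: token usage does not affect the bill.
        return seats * SUBSCRIPTION_PER_SEAT
    # Metered tier: base fee plus usage-based API charges for every seat.
    return seats * (ENTERPRISE_BASE_PER_SEAT + API_SPEND_PER_SEAT)


for seats in (148, 149, 150, 151):
    print(seats, monthly_ai_bill(seats))
```

With these placeholder inputs the bill jumps from $14,900 at 149 seats to $78,000 at 150. The exact size of the jump depends entirely on the assumed usage, but the discontinuity appears whenever metered spend per seat exceeds the subscription price, because crossing the threshold reprices every seat, not just the 150th.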
This could spawn an entirely new management philosophy for companies. Startups will become increasingly obsessed with solving everything with agents, smaller teams, more frequent layoffs, more outsourcing, doing everything possible to keep human-touch points to an absolute minimum.
Not because it's the "optimal" level of automation, but because the incentives force them there. If the magic number is 149, then every seat is precious; you can't afford to spend a single one on anyone outside the company's core functions.
This cliff might be touted by Harvard Business School types as "the new wave of AI-first management." But properly understood, it's just a rational response to enterprise pricing schemes.
This might sound exaggerated. But you can already see the behavioral divergence across organizations. Talk to developers at big companies; they are carefully counting tokens, increasingly anxious about leadership cutting token budgets. Meanwhile, developers at startups are furiously maxing out usage (tokenmaxxing), launching swarms of agents overnight, and checking logs in the morning. I expect this trend to accelerate.
No one designed this intentionally. No committee decided to subsidize innovation for startups and tax incumbents. It all follows directly from tried-and-true traditional enterprise pricing strategies.
But this is how tax laws have always been: a bunch of ancillary rules that ultimately determine which companies can be built, and how those companies distort themselves to minimize their tax burden.
You might argue this is temporary; LLM companies will eventually meter everyone. GitHub Copilot already made this shift. Maybe, maybe not. But before pricing normalizes, 149-person companies and this new wave of AI-first management may have already exploded, gobbled up market share, and written the playbook for the next generation of startups.
Tax policy is crucial. The entire "gig economy" concept only exists because of the legal line between W-2 and 1099. As more and more labor is eaten by AI, token pricing may be the most impactful tax policy of the next decade. Yet no one will ever vote on it.
(Don't be surprised if the fastest-growing companies of the next cycle are all conspicuously stuck at 149 seats.)





