Dragonfly Partner Haseeb: Why the Fastest-Growing Companies of the Future May All Stagnate at 149 Employees

链捕手Published on 2026-06-24Last updated on 2026-06-24

Abstract

Dragonfly partner Haseeb analyzes AI service pricing, particularly Anthropic's model, as a form of "tax policy" on AI labor. He observes a stark contrast: small companies (under ~150 users) benefit from low or zero marginal cost via flat-rate subscriptions, incentivizing them to maximize AI usage ("tokenmaxxing"). In contrast, large enterprises pay per-token API costs with high (e.g., 75%) markups, creating a disincentive for experimental or marginal automation. This pricing cliff at 150 users acts like a regulatory notch, potentially causing the fastest-growing AI-native companies to artificially cap their headcount to retain subsidized rates. Consequently, large-scale direct AI-for-human substitution within big corporations may not happen as expected; instead, job displacement may occur indirectly as lean, AI-heavy startups outcompete and erode their market share. The article concludes that token pricing, though not designed as such, functions as a powerful de facto tax policy shaping company structure and automation incentives.

Author: Haseeb

Compiler: Jiahuan, ChainCatcher

@SemiAnalysis_ recently discovered an incredible phenomenon in the economics of AI programming subscriptions. If you max out your usage, you actually pay 20 to 70 times less per token than buying tokens via the API.

Many people see this and say: My God, look at how much these LLM companies are subsidizing on tokens, the bubble must burst soon.

This reaction is wrong. The reason LLM companies are willing to offer such generous plans is naturally because most users rarely hit the cap. This product is like a gym membership: the allowance is generous because the vast majority of people hardly use it.

But I spent a long time pondering this; there's something genuinely odd here.

We don't know their actual comprehensive profit margin on subscriptions, but according to SemiAnalysis's estimates, at an average utilization rate of 20%, Anthropic's Max 5x plan barely breaks even. A 20% utilization rate might even be on the high side, especially in organizations where everyone (including non-programmers) has a subscription but only uses it occasionally. Most institutions I know, including Dragonfly, generously distribute Claude Code subscriptions and encourage non-programming staff to try them.

But what SemiAnalysis didn't delve into is that this is entirely a phenomenon for small businesses. Large enterprises cannot access this subscription pricing.

Here's why: When you exceed 150 users, you are forced out of the subscription model called "Team." You must switch to "Enterprise," priced at a base of $20/seat plus API fees calculated based on actual token usage. Enterprises can only pay linearly by token cost, and SemiAnalysis estimates the gross margin on API tokens to be around 75%. This is a massive price hike that suddenly kicks in at 150 users.

So, if you're a small business or startup (or an individual user), your perception of AI spending is distorted. Your token pricing is actually heavily subsidized; Anthropic likely operates on very low or even negative margins on you.

You might wonder why Microsoft and Uber are making a fuss about token spending and talking about "token-mining." This is the reason. They pay a structurally higher cost per token than startups and individuals.

But Anthropic doesn't care! For a B2B company, extracting maximum value from small companies or individuals isn't very meaningful. Look at companies like Datadog or Cloudflare: 80% to 90% of their revenue comes from large contracts (Annual Recurring Revenue > $100k). Making zero profit on the long tail is just a customer acquisition cost.

This is classic B2B sales thinking.

But there's another way to view the same situation: through the lens of tax policy.

Because if tokens are replacing labor, then the gross margin that OpenAI and Anthropic charge on tokens is effectively a tax on AI labor.

Viewing token pricing this way leads to two main consequences.

Token Pricing as Tax Policy

Assuming the profit margins from SemiAnalysis's article hold: subscriptions break even, large enterprise API gross margin is 75%. The immediate reaction is to call this a 75% tax on AI labor for large organizations and a 0% tax for startups.

Standard tax analysis would say this discourages large companies from using AI labor internally, incentivizing them, at the margin, to automate less and retain more human labor. (Obviously, it also encourages using smaller or open-source models, but the net effect incentivizes both. Remember, we're talking about the margin here.)

However, what drives behavior more strongly is not the average tax rate. In tax policy, it never is. What we really care about is the marginal tax rate.

For startups on flat-rate subscriptions, the marginal price of the next token is zero until the cap is hit. And a zero marginal price is the maximum distortion a policy can create.

For startups, the subscription model is essentially an innovation subsidy. The most overwhelming incentive is to figure out how to spend the entire token budget as efficiently as possible. This means running Ralph loops, keeping screens full of Claude Code sessions, and scheduling swarms of agents to work together.

Until the cap is reached, exploration is free. So startups are essentially racing to squeeze every last drop of value from the subscription, overwhelming competitors with output. Paradoxically, the more you use, the lower the average token price gets. Every startup wants to be the one that costs Anthropic the most on the subscription.

Large enterprises face the opposite incentive. If you exceed 150 seats, every token in exploration is charged at full markup (with a 75% surcharge!). So they face a linearly increasing penalty for every step of exploratory frontier they probe.

Large firms will still automate the obvious high-volume tasks, but the marginal, experimental, risky automation is never discovered because the cost of discovery is too high. This tax structure ultimately encourages them to retain more human labor and maintain their original organizational structure.

This is the exact opposite of Japan. Due to a declining population, Japan faces a huge labor shortage. Historically, this meant Japan pursued high automation because expensive human labor incentivized it. That's why Japan has robots in restaurants, factories, hotels, and hospitals.

But curiously, large enterprises find themselves in the opposite dilemma: if they have to pay a high tax on using AI, it weakens the incentive to automate and strengthens the motive to retain existing employees (which is even more pronounced if wages stagnate during this period).

So where does the labor substitution flow in this model?

Everyone is watching large companies, waiting for the wave of AI layoffs. But with a 75% tax, aggressively replacing your own employees with AI might simply not be cost-effective; the token budget would explode.

But that doesn't mean substitution won't happen; it just manifests differently.

When large enterprises lose market share to AI-native startups with extremely low composite human costs, the large firms' declining revenue and stock prices trigger layoffs. But those eliminated jobs never reappear in the winning startups. The net reduction effect is the same; this unemployment gap is simply transferred to a lower-taxed part of the economy.

This is also why "AI-washing" (framing ordinary layoffs as newfound AI efficiency) might not be a temporary phenomenon. AI-washing refers to a company attributing layoffs to AI efficiency when it's really just masking ordinary business weakness.

Many think this is just a fad in the current AI hype cycle. But even though everyone is ready to witness large enterprises conducting real AI layoffs, replacing positions with AI, this might never happen on a large scale.

Labor substitution may unfold differently: startups defeat large companies, large companies disguise decline under the banner of AI until they fail, and startups never rebuild those old jobs. Job substitution still occurs, just not where everyone is looking.

This is the first consequence of this model. But there's a second, even more bizarre consequence.

The 150-Person Cliff

A regulatory notch is a regulatory boundary that incentivizes a huge behavioral jump. For example: the 30-hour weekly threshold for full-time employment has spawned a lot of jobs that are exactly 29 hours per week.

It's well known that France has extremely strict labor regulations that kick in once a company reaches 50 employees (work councils, mandatory profit-sharing, firing protections), while small companies are exempt. This gives employers a huge incentive to desperately keep their headcount below 50.

From: Garicano, Luis, Claire Lelarge, and John Van Reenen, 2016, "Firm Size Distortions and the Productivity Distribution: Evidence from France."

Extend this analogy to AI. LLM companies have created a tax threshold that punishes companies exceeding 150 seats. This means you must stay small to keep that wonderful subsidized subscription price, where tokens are taxed at roughly 0% (or even negative), not 75%.

This could give rise to an entirely new philosophy of corporate management. Startups will become obsessed with using agents for everything, having smaller teams, frequent layoffs, more outsourcing, exhausting all means to minimize the parts that require humans.

Not because it's the "optimal" level of automation, but because the incentives push them there. If the magic number is 149, every seat is crucial; you can't waste a single person outside the company's core functions.

This discontinuity might be hailed by Harvard Business School types as "the new generation of AI-first management." But understood correctly, it's simply a rational response to enterprise pricing plans.

This might sound exaggerated. But you can already see the behavioral divergence between organizations. Talk to developers at large enterprises; they are meticulously counting tokens, growing anxious as leadership cuts token budgets. Meanwhile, developers at startups are aggressively tokenmaxxing, launching swarms of agents overnight to check the logs in the morning. I expect this trend to accelerate.

No one designed this intentionally. No committee decided to subsidize innovation for startups and tax incumbents. It flows directly from those tried-and-true enterprise pricing strategies.

But tax law has always been this way: a pile of incidental rules that ultimately determine which companies get built and how those companies contort themselves to minimize their tax burden.

You might argue this is temporary, that LLM companies will eventually meter everyone. Github Copilot already made this transition. Maybe, maybe not. But before pricing normalizes, the 149-person company and this new AI-first management style might have already exploded, swallowed large market shares, and written the playbook for the next generation of startups.

Tax policy matters immensely. The entire "gig economy" concept exists precisely because of the legal distinction between W-2 (employee) and 1099 (independent contractor). As more and more labor is cannibalized by AI, token pricing could become the most impactful tax policy of the next decade. Yet, no one will ever vote on it.

(Don't be surprised if the fastest-growing companies in the next cycle are all conspicuously stuck at 149 seats.)

Related Questions

QAccording to the article, what is the primary reason why AI model companies offer generous subscription plans to small businesses and individuals?

AThe primary reason is that most users rarely hit their usage limits. These subscription plans function like gym memberships, where the generous allowances are possible because the vast majority of users do not utilize them heavily. This allows companies to attract customers and cover costs based on low average utilization, making it a customer acquisition strategy rather than a direct profit center for these segments.

QWhat is the key difference in pricing and cost structure for companies with over 150 users compared to smaller ones, as described in the text?

AFor companies with over 150 users, they are forced to switch from the 'Team' subscription model to the 'Enterprise' model. The Enterprise pricing charges a base fee per seat plus API fees based on actual token usage at a much higher marginal cost. The article suggests the gross margin on API tokens is around 75%, representing a significant price hike that suddenly takes effect at the 150-user threshold.

QHow does the article frame the AI token pricing for large enterprises in terms of economic policy?

AThe article frames it as a form of taxation on AI labor. Since tokens are replacing human labor, the high gross margin (around 75%) charged by companies like OpenAI and Anthropic on API tokens for large enterprises acts as a 'tax' on using AI workforce. This high marginal cost discourages large companies from aggressively automating marginal or experimental tasks, incentivizing them to retain more human labor instead.

QWhat behavioral incentive does the 'zero marginal price' of tokens create for startups under the subscription model?

AThe zero marginal price for tokens until the subscription cap is reached creates a powerful incentive for startups to maximize token usage. It acts as an innovation subsidy, driving startups to figure out how to spend their entire token budget as efficiently as possible. This leads to behaviors like running automated loops, maximizing concurrent AI sessions, and scheduling swarms of AI agents to work, all to extract the maximum value from their fixed-cost subscription.

QWhat is the '150-person cliff' analogy, and what potential long-term consequence does the article suggest it might lead to?

AThe '150-person cliff' is compared to a regulatory notch, like France's labor laws that trigger at 50 employees. It's the pricing threshold where companies lose subsidized subscription rates and face much higher 'taxed' API costs. The article suggests this could lead to a new management philosophy where the fastest-growing companies consciously stay under 149 employees. They would become obsessed with using AI agents for everything, keeping teams extremely lean, outsourcing, and avoiding human roles outside core functions—not because it's optimally efficient, but to avoid crossing the costly pricing threshold.

Related Reads

Google Starts Selling TPUs, Big Tech Aims to Produce "Low-Cost Tokens" with AI Chips

Google has begun selling its proprietary TPU chips and AI computing hardware directly to third-party data centers and clients, marking a strategic shift. Previously only accessible via cloud rentals, TPUs are specialized processors designed for the matrix and tensor operations central to AI models. By combining thousands into supercomputing clusters managed by CPUs, Google achieves high-efficiency AI processing. This move enables Google’s Gemini AI to offer competitive token pricing, challenging rivals like OpenAI. It also signals a broader industry trend where AI compute is becoming a commoditized resource like electricity. While NVIDIA remains dominant with its CUDA ecosystem and high-performance GPUs, the focus is shifting from raw power to cost efficiency and system integration. Google’s approach mirrors NVIDIA’s by selling an entire ecosystem—hardware, software, and data center expertise—rather than just chips. This threatens NVIDIA’s grip on the mid-range inference market, where lower-cost, efficient solutions are increasingly demanded. Similarly, cloud providers like Huawei Cloud and Alibaba Cloud in China are developing their own AI chip ecosystems (e.g., Ascend, Zhenwu), packaging chips, clusters, and tools into full-stack solutions. They aim to reduce token costs and capture market share through integrated systems. In summary, the AI infrastructure race is evolving from a competition for the strongest chips to a contest for the most efficient and cost-effective systems. Google’s TPU sales highlight this transition, emphasizing that future success lies in delivering affordable, scalable AI compute as a foundational service.

marsbit20m ago

Google Starts Selling TPUs, Big Tech Aims to Produce "Low-Cost Tokens" with AI Chips

marsbit20m ago

Trading

Spot
Futures
活动图片