Dragonfly Partner Haseeb: The Fastest-Growing Companies of the Future May All Get Stuck at 149 Employees

marsbit | Published on 2026-06-24 | Last updated on 2026-06-24

Abstract

Dragonfly partner Haseeb explores the distorted economics of AI model pricing, drawing parallels to tax policy. He notes that startups and small teams (under 150 seats) enjoy heavily subsidized, fixed-price AI subscriptions (like Claude Code), where the marginal cost of an additional token is effectively zero until the usage cap. This creates a powerful incentive for them to maximize token usage ("tokenmaxxing") and innovate aggressively with AI automation. In contrast, large enterprises (150 seats or more) are forced onto "Enterprise" plans, paying per-token API fees that carry a high (~75%) gross margin. This acts like a steep "tax" on AI-powered labor, disincentivizing marginal automation and experimental use, and encouraging them to retain more human workers. Haseeb argues this pricing creates a "150-person cliff," a regulatory notch similar to labor laws in France that discourage firms from growing past 50 employees. He predicts the fastest-growing future companies may deliberately cap their headcount at 149 to avoid the punitive enterprise pricing. This would foster an "AI-first" management philosophy obsessed with automation and outsourcing to stay lean. While not intentionally designed, this bifurcated pricing could become one of the most influential de facto tax policies, shaping how AI replaces labor: not through mass layoffs at big firms, but through agile, AI-native startups outcompeting them.

Author: Haseeb

Compiled by: Jiahuan, ChainCatcher

@SemiAnalysis_ recently pointed out something remarkable about the economics of AI coding subscriptions: if you max out your usage, the subscription costs 20 to 70 times less than buying the same tokens through the API.

Many see this and say: My God, look at how much these LLM companies are subsidizing tokens. The bubble is bound to burst soon.

This reaction is wrong. LLM companies are willing to offer such generous packages precisely because most users rarely hit the ceiling. The product is like a gym membership: generous allowances exist because the vast majority of people hardly use them.

But I have been thinking about this for a long time, and something here is odd.

We don't know their actual blended margins on subscriptions, but according to SemiAnalysis, at 20% average utilization, Anthropic's Max 5x plan barely breaks even. A 20% utilization rate might even be optimistic, especially in organizations where everyone (including non-programmers) has a subscription but only uses it occasionally. Most institutions I know, including Dragonfly, are generous with Claude Code subscriptions and encourage non-technical staff to try them.
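The break-even claim can be made concrete with a little arithmetic. The sketch below takes the 20% break-even utilization from the SemiAnalysis figure quoted above and assumes a hypothetical flat fee of $100 per month; the function names and the assumption that serving cost scales linearly with usage are illustrative, not anything Anthropic or SemiAnalysis has published.

```python
def full_utilization_cost(monthly_fee: float, break_even_utilization: float) -> float:
    """Serving cost of a subscriber who uses 100% of the allowance.

    If the provider breaks even at `break_even_utilization`, then
    fee = utilization * full_cost, so full_cost = fee / utilization.
    """
    return monthly_fee / break_even_utilization


def provider_margin(monthly_fee: float, utilization: float,
                    break_even_utilization: float) -> float:
    """Provider profit per subscriber (negative means a loss),
    assuming serving cost scales linearly with utilization."""
    cost = utilization * full_utilization_cost(monthly_fee, break_even_utilization)
    return monthly_fee - cost


FEE = 100.0        # hypothetical monthly fee in dollars (assumption)
BREAK_EVEN = 0.20  # break-even utilization quoted from SemiAnalysis

for u in (0.05, 0.20, 0.50, 1.00):
    margin = provider_margin(FEE, u, BREAK_EVEN)
    print(f"utilization {u:>4.0%}: provider margin per seat = ${margin:,.2f}")
```

Under these assumptions, a seat that sits mostly idle is profitable for the provider, while a seat that is maxed out costs several times the fee to serve. That is the gym-membership logic in numbers.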

But what SemiAnalysis didn't delve into is that this is purely a small business phenomenon. Large enterprises cannot use this subscription pricing.

Here's why: When you reach 150 people or more, you are forced out of the "Team" subscription model. You must switch to "Enterprise," priced at a base of $20 per seat plus API fees based on actual token usage. Enterprises pay linearly for token costs, and SemiAnalysis estimates API token gross margins to be around 75%. This is a massive price hike that kicks in suddenly at 150 people.
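A rough sketch of how large that jump could be is shown below. The $20 base per seat, the 150-seat limit, and the 75% gross margin come from the text; the Team fee per seat and the provider's serving cost per seat are hypothetical inputs chosen only for illustration, and the real plans certainly have more detail than this.

```python
ENTERPRISE_BASE_PER_SEAT = 20.0  # from the article: $20 base per seat
API_GROSS_MARGIN = 0.75          # SemiAnalysis estimate quoted in the article
SEAT_LIMIT = 150                 # from the article: Team plan ends at 150 seats


def api_price(serving_cost: float, gross_margin: float = API_GROSS_MARGIN) -> float:
    """Price charged for tokens that cost `serving_cost` to serve.

    Gross margin m = (price - cost) / price, so price = cost / (1 - m).
    A 75% gross margin therefore means the price is 4x the cost.
    """
    return serving_cost / (1.0 - gross_margin)


def monthly_bill(seats: int, team_fee_per_seat: float,
                 serving_cost_per_seat: float) -> float:
    """Total monthly bill: flat Team fee below the limit, metered Enterprise at or above it."""
    if seats < SEAT_LIMIT:
        return seats * team_fee_per_seat
    return seats * (ENTERPRISE_BASE_PER_SEAT + api_price(serving_cost_per_seat))


# Hypothetical inputs: a $100 Team seat whose usage costs the provider $100
# to serve, i.e. a subscription that exactly breaks even.
for seats in (149, 150):
    print(seats, "seats ->", f"${monthly_bill(seats, 100.0, 100.0):,.0f} per month")
```

With these illustrative numbers, the same usage pattern costs several times more the moment the 150th seat is added, even though nothing about the work has changed.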

So, if you are a small business or startup (or an individual), your perception of AI spending is distorted. Your token pricing is actually heavily subsidized; Anthropic likely maintains extremely low or even negative margins on you.

You might wonder why Microsoft and Uber fret so much about token spend and talk about "token-mining." That's the reason. Their structural cost per token is much higher than for startups and individuals.

But Anthropic doesn't care! For a B2B company, extracting maximum value from small companies or individuals isn't very meaningful. Look at companies like Datadog or Cloudflare; 80% to 90% of their revenue comes from large contracts (annual recurring revenue over $100,000). Making zero profit on the long tail is just a customer acquisition cost.

This is classic B2B sales thinking.

But there's another way to view the same situation: through the lens of tax policy.

Because if tokens are replacing labor, then the gross margin OpenAI and Anthropic charge on tokens is essentially a tax on AI labor.

Viewing token pricing this way leads to two major consequences.

Token Pricing as Tax Policy

Assume the margins in the SemiAnalysis article hold: subscriptions break even, and the gross margin on large-enterprise API usage is 75%. The first reaction is to call this a 75% AI labor tax on large organizations and a 0% tax on startups.

Standard tax analysis would say this discourages large companies from using AI labor internally, pushing them on the margin to reduce automation and retain more human labor. (Obviously, it also encourages switching to smaller or open-source models, but the net effect is that both responses are incentivized. Remember, we're talking about the margin here.)

However, what drives behavior more strongly is not the average tax rate. In tax policy, it never is. What we really care about is the marginal tax rate.

For startups on flat-rate subscriptions, the marginal price of the next token is zero until they hit the cap. And a zero marginal price creates the maximum possible distortion a policy can create.

For startups, the subscription model is basically an innovation subsidy. The overwhelming incentive is to figure out how to spend the entire token budget as efficiently as possible. This means running Ralph loops, filling screens with Claude Code sessions, scheduling swarms of agents to work together.

Until the cap is hit, exploration is free. So startups are essentially racing to squeeze the last drop of value from their subscription, out-producing each other. Paradoxically, the more you use, the lower the average token price. Every startup wants to be the one making Anthropic lose the most on subscriptions.

Large enterprises face the opposite incentive. Once you reach 150 seats, every exploratory token is billed at the full API price (with that 75% gross margin built in!). So every step they take toward the frontier is punished linearly.
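The two regimes described above differ mainly in the price of the next token. A minimal sketch follows, with hypothetical function names, and with the assumption that usage beyond the subscription cap is billed at some overage price (in practice a subscriber may simply be rate-limited instead).

```python
def flat_plan_marginal_price(tokens_used: int, cap: int, overage_price: float) -> float:
    """Price of the next token on a flat subscription: zero until the cap is hit."""
    return 0.0 if tokens_used < cap else overage_price


def flat_plan_average_price(monthly_fee: float, tokens_used: int) -> float:
    """Average price per token actually used; it falls as usage rises."""
    if tokens_used <= 0:
        raise ValueError("average price is undefined with no usage")
    return monthly_fee / tokens_used


def metered_marginal_price(api_price_per_token: float) -> float:
    """Price of the next token on a metered plan: always the full API price."""
    return api_price_per_token


# Hypothetical numbers, for illustration only.
FEE, CAP, API_PRICE = 100.0, 1_000_000, 0.0004

for used in (10_000, 100_000, 999_999):
    print(
        f"{used:>9,} tokens used | "
        f"flat plan: next token ${flat_plan_marginal_price(used, CAP, API_PRICE):.4f}, "
        f"average ${flat_plan_average_price(FEE, used):.6f} | "
        f"metered: next token ${metered_marginal_price(API_PRICE):.4f}"
    )
```

On the flat plan the next token is free right up to the cap and the average price keeps falling, so the rational move is to use everything. On the metered plan every additional experiment has a visible price tag.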

Big companies will still automate large, obvious bulk tasks. But marginal, experimental, risky automation will never be discovered because the cost of discovery is too high. This tax structure ultimately encourages them to retain more human labor, preserving their overall organizational structure.

This is the opposite of Japan. Due to a declining population, Japan faces a huge labor shortage. Historically, this meant Japan pursued intense automation, as high human costs incentivized it. That's why Japan has robots in restaurants, factories, hotels, hospitals.

Strangely, large enterprises find themselves in the reverse of Japan's situation: if they must pay an extremely high tax to use AI, the incentive to automate weakens and the motivation to retain existing employees grows (even more so if wages stagnate during this period).

So where does labor substitution flow in this model?

Everyone is staring at big companies, waiting for AI layoffs. But with a 75% tax, aggressively replacing your own employees with AI may simply be uneconomical; token budgets would explode.

But this doesn't mean substitution won't happen; it just manifests differently.

When big firms lose market share to AI-native startups with minimal blended human labor costs, the big firms' falling revenue and stock prices trigger layoffs. But those eliminated jobs never reappear at the winning startups. The net reduction in jobs is the same; the substitution has simply moved to a lower-taxed part of the economy.

This is also why "AI-washing" may not be a flash in the pan. AI-washing is when a company attributes layoffs to newfound AI efficiency when it is actually just masking ordinary business weakness.

Many think this is just a blip in the current AI hype cycle. But even though everyone is primed to witness big companies doing real AI layoffs, "replacing jobs" with AI, that may never happen at scale.

Labor substitution might unfold a different way: startups beat incumbents, incumbents disguise their decline with AI-washing all the way to death, and the startups never rebuild those old jobs. Job substitution still happens, just not where everyone is looking.

That's the first consequence of this model. But there's a second, even weirder consequence.

The 150-Person Cliff

A regulatory notch is a threshold in a rule that induces large jumps in behavior. Example: the 30-hour-a-week threshold for full-time employment in the United States, which created tons of jobs clocking in at exactly 29 hours a week.

France famously has extremely rigid labor laws that kick in once a company hits 50 employees (works councils, mandatory profit-sharing, firing protections). Smaller companies are exempt. This gives employers a huge incentive to stay below 50 people at almost any cost.

Source: Garicano, Luis, Claire Lelarge, and John Van Reenen, 2016, Firm Size Distortions and the Productivity Distribution: Evidence from France.

Extend this analogy to AI. LLM companies have established a tax threshold that punishes companies that reach 150 seats. To keep that wonderful subsidized subscription price, with tokens taxed at ~0% (or even negative) rather than 75%, you must stay small.

This could spawn an entirely new management philosophy for companies. Startups will become increasingly obsessed with solving everything with agents, smaller teams, more frequent layoffs, more outsourcing, doing everything possible to keep human-touch points to an absolute minimum.

Not because it's the "optimal" level of automation, but because the incentives force them there. If the magic number is 149, then every seat is precious; you can't afford to spend a single seat on anyone outside the core functions of the company.
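The bunching logic behind this can be illustrated with a toy model. Everything numeric below is hypothetical: a firm earns a fixed net value per head each month, pays a flat AI cost per seat below 150 seats and a higher metered cost per seat at 150 or above, and chooses between growing to its desired size or stopping at 149. It is a sketch of the incentive, not an estimate of any real firm's economics.

```python
SEAT_LIMIT = 150  # from the article: the Team plan ends at 150 seats


def monthly_profit(headcount: int, value_per_head: float,
                   team_cost_per_seat: float, enterprise_cost_per_seat: float) -> float:
    """Toy profit: net value per head minus the AI cost per seat for the applicable plan."""
    ai_cost = team_cost_per_seat if headcount < SEAT_LIMIT else enterprise_cost_per_seat
    return headcount * (value_per_head - ai_cost)


def chosen_headcount(desired: int, value_per_head: float,
                     team_cost_per_seat: float, enterprise_cost_per_seat: float) -> int:
    """A firm either grows to its desired size or stops just below the limit."""
    if desired < SEAT_LIMIT:
        return desired
    capped = SEAT_LIMIT - 1
    stay = monthly_profit(capped, value_per_head, team_cost_per_seat, enterprise_cost_per_seat)
    grow = monthly_profit(desired, value_per_head, team_cost_per_seat, enterprise_cost_per_seat)
    return desired if grow > stay else capped


# Hypothetical inputs: $1,000 net value per head, $100 flat vs $420 metered AI cost per seat.
VALUE, TEAM_COST, ENTERPRISE_COST = 1000.0, 100.0, 420.0

desired_sizes = range(100, 301, 10)
choices = [chosen_headcount(d, VALUE, TEAM_COST, ENTERPRISE_COST) for d in desired_sizes]
stuck = sum(1 for c in choices if c == SEAT_LIMIT - 1)
print(f"{stuck} of {len(choices)} firms stop at {SEAT_LIMIT - 1} seats")
```

In this toy setup, firms that would like to be somewhat larger than 150 people are better off stopping at 149, and only firms with much larger ambitions find it worth crossing. That gap in the size distribution just above the threshold is the same pattern documented around the 50-employee line in France.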

This cliff might be touted by Harvard Business School types as "the new wave of AI-first management." But properly understood, it's just a rational response to enterprise pricing schemes.

This might sound exaggerated. But you can already see the behavioral divergence across organizations. Talk to developers at big companies; they are carefully counting tokens, increasingly anxious about leadership cutting token budgets. Meanwhile, developers at startups are furiously maxing out usage (tokenmaxxing), launching swarms of agents overnight, and checking logs in the morning. I expect this trend to accelerate.

No one designed this intentionally. No committee decided to subsidize innovation for startups and tax incumbents. It all follows directly from tried-and-true traditional enterprise pricing strategies.

But this is how tax laws have always been: a bunch of ancillary rules that ultimately determine which companies can be built, and how those companies distort themselves to minimize their tax burden.

You might argue this is temporary; LLM companies will eventually meter everyone. GitHub Copilot already made this shift. Maybe, maybe not. But before pricing normalizes, 149-person companies and this new wave of AI-first management may have already exploded, gobbled up market share, and written the playbook for the next generation of startups.

Tax policy is crucial. The entire "gig economy" concept only exists because of the legal line between W-2 and 1099. As more and more labor is eaten by AI, token pricing may be the most impactful tax policy of the next decade. Yet no one will ever vote on it.

(Don't be surprised if the fastest-growing companies of the next cycle are all conspicuously stuck at 149 seats.)
