From Tokens to Machine Labor: AI is Shifting from Tool to "Worker"

marsbit · Published on 2026-05-31 · Last updated on 2026-05-31

Abstract

The article "From Tokens to Machine Labor: AI is Shifting from Tool to 'Worker'" argues that the business model for AI is shifting beyond simply selling computational resources (tokens, GPU hours) or model access. Instead, a new "machine labor market" is emerging, where the core economic transaction is the purchase of economically useful work directly performed by software. The central thesis is that AI pricing will evolve through four stages: 1) raw tokens, 2) standardized LLM capabilities (e.g., text generation), 3) industry-specific labor markets (e.g., legal review, radiology), and finally 4) a programmable outcomes market where tasks like resolving a support ticket are bid on and priced based on outcome. In this future, buyers will care less about *which* model or GPU completes a task and more about whether the work meets specified standards for accuracy, latency, and cost. This transition reframes the impact of AI on human labor. Rather than simple replacement, it suggests a re-coordination where machines handle standardized, verifiable work, freeing humans for roles involving oversight, context management, responsibility, and final judgment. In some cases, this "last 1%" of human input becomes more valuable as it enables the other 99% to be automated. Furthermore, as AI reduces the cost of work, demand may expand, creating larger markets (e.g., 24/7 customer service) rather than just cheaper versions of existing ones. The article concludes that while infrastructure (G...

Editor's Note: As AI begins to write code, handle customer service tickets, and review legal documents, a more fundamental question is emerging: what are businesses truly purchasing—tokens, GPU hours, or completed work?

This article proposes a noteworthy framework: the commercialization of AI should not merely be understood as a "computing power market" or a "model invocation market," but is moving toward a new "machine labor market." In this market, tokens are merely a unit of measurement, GPUs are input goods, models are production tools, and the real objects being priced and traded are economic tasks directly completed by software.

The article's core thesis is that AI pricing mechanisms will evolve from raw tokens and standardized model capabilities to industry-specific labor, and finally to a programmable outcomes market. This means that in the future, enterprises might no longer care which specific model or GPU type completes a task, but rather whether it delivers a result meeting defined standards within specified latency, accuracy, reliability, and cost constraints.

This also implies that AI's impact on the human labor market may not be simple replacement. As machines take on more standardized, verifiable work, the human role may shift towards review, accountability, context management, and final judgment. In some scenarios, the final 1% of human judgment could become even more valuable, as it unlocks large-scale automation of the other 99%.

From this perspective, the next stage of competition in the AI market may no longer be just about model capability, nor a simple price war over computing power, but about who can first standardize, verify, and price "work," and ultimately turn machine labor into a new type of factor of production that can be procured, settled, and traded.

Original article follows:

Waves of productivity have historically come from providing humans with tools and software to optimize how work gets done. Spreadsheets aided accountants and analysts, conveyor belts increased throughput, and hammers amplified human leverage. But the actual labor always came from humans.

Now, AI is producing work outcomes end-to-end, directly executing the labor itself. It can write code, handle customer service tickets, and review legal documents. A compression is happening at the far end of the tech stack: the old stack supported labor; the new stack is starting to produce it.

If you've listened to recent discussions on AI financialization, you've likely heard Jensen and others say that LLM tokens and/or GPU hours are becoming the new commodities. The intuition is understandable: tokens are measurable, billable, and easy to chart, and billions of dollars are flowing into GPU hours. But tokens are merely a meter and GPU hours merely an input; no one buys them for their own sake. What people actually want is to get work done. AI is turning the tech stack itself into a source of labor.

Machine Labor: Work performed by software, for an economic purpose, and sold into the production process.

The market is already moving in this direction. Benchmark's Sarah Tavel prefers to understand this opportunity through the lens of outsourcing labor markets, not software categories. If a repeatable task is already performed by a dedicated offshore team or professional services firm, it's often also a good candidate for AI delivery. a16z's Alex Rampell calls this "software eating labor": software's next act is to do the work itself. Sequoia's Julien Bek describes the same shift from another angle: services are turning into software, copilots sell tools, while autopilots sell the work.

The Missing Market Behind Outcomes-Based Pricing

Seat pricing charges for access; token pricing charges for usage. Outcomes-based pricing charges when work is completed. Outcomes pricing moves us a step forward, but it still doesn't answer one question: who decides the price?

If machine labor can be bought directly, price should come from competition among suppliers. These suppliers must be able to meet the same class of tasks or work completion standards, which requires standardization within different industries and tasks.

The current approach uses LLM tokens, but raw tokens are just the lowest layer. A barrel of oil is just a unit of measurement; what's actually traded is a barrel of a specific grade of oil, with defined quality, delivery terms, and market price. A barrel of Brent crude and a barrel of high-sulfur heavy crude are not the same commodity. It's the same with LLM tokens. Tokens are just the unit; what matters is the intelligence behind them: model quality, benchmark floors, latency, context window, reliability, and delivery guarantees. One million tokens from a frontier coding model are not the same commodity as one million tokens from a cheap general-purpose model. The market needs standardized inference grades, just as the energy market needs standardized oil grades.

Anjali Shriva points this out directly: a token is not a fixed cost unit. Its economics vary with context length, task structure, input/output ratio, retry counts, tool calls, and agent workflows. A token in a short prompt and a token buried in a long agent loop are not the same economic object.
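To make the point about token economics concrete, here is a minimal sketch comparing a short prompt with an agent loop that re-sends a growing context on every step. The prices, token counts, and the `agent_loop_tokens` helper are all hypothetical assumptions chosen for illustration, not figures from the article or any provider.

```python
def cost_usd(input_tokens: int, output_tokens: int,
             in_price_per_mtok: float, out_price_per_mtok: float) -> float:
    """Cost of one request given per-million-token prices."""
    return (input_tokens * in_price_per_mtok
            + output_tokens * out_price_per_mtok) / 1_000_000


def agent_loop_tokens(steps: int, base_context: int,
                      growth_per_step: int, output_per_step: int) -> tuple[int, int]:
    """Total (input, output) tokens when every step re-sends the accumulated context."""
    total_in = sum(base_context + k * growth_per_step for k in range(steps))
    return total_in, steps * output_per_step


# Hypothetical prices in USD per million tokens (assumption, not a real price list).
IN_PRICE, OUT_PRICE = 3.0, 15.0

# Case 1: a single short prompt.
short_out = 200
short_cost = cost_usd(500, short_out, IN_PRICE, OUT_PRICE)

# Case 2: a 10-step agent loop whose context grows by 1,500 tokens per step.
loop_in, loop_out = agent_loop_tokens(steps=10, base_context=2_000,
                                      growth_per_step=1_500, output_per_step=300)
loop_cost = cost_usd(loop_in, loop_out, IN_PRICE, OUT_PRICE)

# Same list prices, very different cost per million *useful* output tokens.
short_per_mtok_out = short_cost / short_out * 1_000_000
loop_per_mtok_out = loop_cost / loop_out * 1_000_000

print(f"short prompt: ${short_cost:.4f} per task, ${short_per_mtok_out:.2f} per M output tokens")
print(f"agent loop:   ${loop_cost:.4f} per task, ${loop_per_mtok_out:.2f} per M output tokens")
```

Under these assumed numbers the list price per token is identical in both cases, yet the agent loop pays several times more per output token because the context is billed again at every step.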

We already do this in human labor markets. No one hires a radiologist as a generic "human hour." They look at training background, certifications, specialty, years of experience, availability, reputation, liability, etc. Different human contract specs correspond to different minimum standards and grade expectations.

Human labor markets already run on these specs; it's just that these specs are often muddled, qualitative, and full of proxy signals. Machine labor will make these specs more explicit and quantifiable.

For an LLM or an agent, metrics like skill, experience, speed, and reliability can be written directly into a contract: benchmark scores, latency, throughput, context window, max output length, tool-use accuracy, uptime, error rates. We can procure labor based on quantifiable expectations and outcomes.

TheGrid.ai's contract spec is essentially a qualification filter, plus price competition for LLM outputs. Any supplier meeting the spec can enter the competition:

Intelligence Benchmark ≥ Floor

Latency ≤ Ceiling

Throughput ≥ Floor

Uptime ≥ Floor

Error Rate ≤ Ceiling

Once suppliers all meet the same minimum bar, they compete on price. The buyer asks: which supplier can deliver the required labor at the best price?
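The qualification filter plus price competition described above can be sketched in a few lines. This is an illustrative sketch only: the `ContractSpec` and `Offer` types, the field names, the thresholds, and the supplier data are assumptions, not TheGrid.ai's actual schema or API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ContractSpec:
    """Minimum bar a supplier must clear (all thresholds are hypothetical)."""
    benchmark_floor: float       # Intelligence Benchmark >= Floor
    latency_ceiling_ms: float    # Latency <= Ceiling
    throughput_floor_tps: float  # Throughput >= Floor
    uptime_floor: float          # Uptime >= Floor
    error_rate_ceiling: float    # Error Rate <= Ceiling


@dataclass(frozen=True)
class Offer:
    supplier: str
    benchmark: float
    latency_ms: float
    throughput_tps: float
    uptime: float
    error_rate: float
    price_usd: float  # quoted price per unit of work


def qualifies(offer: Offer, spec: ContractSpec) -> bool:
    """Step 1: the spec acts purely as a pass/fail qualification filter."""
    return (
        offer.benchmark >= spec.benchmark_floor
        and offer.latency_ms <= spec.latency_ceiling_ms
        and offer.throughput_tps >= spec.throughput_floor_tps
        and offer.uptime >= spec.uptime_floor
        and offer.error_rate <= spec.error_rate_ceiling
    )


def best_offer(offers: list[Offer], spec: ContractSpec) -> Optional[Offer]:
    """Step 2: among qualified suppliers, the lowest price wins."""
    qualified = [o for o in offers if qualifies(o, spec)]
    return min(qualified, key=lambda o: o.price_usd, default=None)


# Made-up example data.
spec = ContractSpec(80.0, 800.0, 50.0, 0.999, 0.01)
offers = [
    Offer("alpha", 85.0, 600.0, 70.0, 0.9995, 0.005, 2.40),
    Offer("beta", 82.0, 750.0, 55.0, 0.9990, 0.008, 1.90),
    Offer("gamma", 78.0, 300.0, 90.0, 0.9999, 0.002, 0.90),  # cheapest, but below the benchmark floor
]

winner = best_offer(offers, spec)
print(winner.supplier if winner else "no qualified supplier")
```

Note the design choice: exceeding the floor earns no extra credit. Once a supplier clears the bar, only price matters, which is what lets otherwise different suppliers be treated as the same commodity grade.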

Hiring a radiologist, in the LLM context, becomes a measurable question: which LLMs can read X-rays with high proficiency while meeting the defined latency, context window, and other outcome-based contract specs?

Outcome is how the buyer measures success; Labor is the economic activity being supplied; Token is the fuel the machine consumes while doing the work.

The Grid is the machine labor market.

From Tokens to Machine Labor Market

The market can price the inputs of the tech stack, but pricing outputs requires a machine labor market. Buyers don't care about GPU hours. Model endpoints themselves are unstable: they get renamed, deprecated, wrapped, or simply retired.

Users and liquidity hate frequent changes. GPUs and models will keep evolving, but the stable unit is the work itself.

I believe the market will evolve along the following path. With each step up the ladder, what is being purchased becomes more abstract and more valuable, but also harder to verify. The Grid should climb this ladder progressively:

Raw Tokens → Commoditized LLM Capability Market → Commoditized Labor Market → Programmable Outcomes Market

Stage 1: Raw Tokens

Claude 4.7, GPT 5.5, Kimi 2.6, DeepSeek V4, GLM 5, etc.

Today, buyers purchase raw model output from inference providers. They send their prompts, receive inference results, and pay per use. This is easy to verify, but it's still just raw material. Buyers don't actually want tokens; they want useful intelligence at the best price.

Stage 2: Commoditized LLM Capability Market

E.g., text/usd, code/usd, agent/usd, etc.

The buyer no longer chooses a specific model, but the category of intelligence they need. The buyer still owns the workflow, prompts, data, and application logic. The Grid just routes each request to the qualified model that meets the contract spec and offers the lowest price.

Note: This is the first real abstraction above raw tokens, and where TheGrid.ai currently sits.

Stage 3: Commoditized Labor Market

E.g., accounting/usd, support_agent/usd, legal/usd, healthcare/usd, radiology/usd, etc.

As models become more specialized, the capability market can evolve further into industry-specific markets. This is analogous to specialization in human labor markets.

At this layer, we're selling inference suited for workflows in specific labor verticals. This category will expand rapidly as niche industry models become common. Examples include Cursor's Composer, Harvey for legal work, and OpenEvidence for healthcare.

Stage 4: Programmable RFQ and Outcomes Market for Agents

E.g., support_ticket_resolved/usd, pr_merged/usd, claim_processed/usd, etc.

The final layer is where The Grid moves from an inference market to a machine labor market.

This layer requires RFQs (Request for Quote), escrow, delayed settlement, buyer confirmation, supplier reputation, clawback, dispute resolution, etc. It will likely start with RFQs rather than order books. Buyers define the job, constraints, acceptance criteria, and settlement terms; agents bid to complete it. The Grid helps route, price, verify, and settle these jobs.
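The RFQ flow described above (buyer defines the job and terms, agents bid, funds sit in escrow until buyer confirmation) can be sketched as a small state machine. Everything here is a hypothetical illustration: the `RFQ` and `Bid` types, the reputation score, and the settlement rule are assumptions, and real dispute resolution and delayed settlement would be far more involved.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Status(Enum):
    OPEN = "open"
    ESCROWED = "escrowed"
    SETTLED = "settled"
    REFUNDED = "refunded"


@dataclass(frozen=True)
class Bid:
    agent: str
    price_usd: float
    reputation: float  # hypothetical 0..1 supplier reputation score


@dataclass
class RFQ:
    job: str                  # e.g. "support_ticket_resolved"
    max_price_usd: float      # buyer's price constraint
    min_reputation: float     # buyer's supplier constraint
    status: Status = Status.OPEN
    winner: Optional[Bid] = None
    escrow_usd: float = 0.0

    def award(self, bids: list[Bid]) -> Optional[Bid]:
        """Pick the cheapest eligible bid and lock its price in escrow."""
        if self.status is not Status.OPEN:
            raise ValueError("RFQ is no longer open")
        eligible = [b for b in bids
                    if b.price_usd <= self.max_price_usd
                    and b.reputation >= self.min_reputation]
        if not eligible:
            return None
        self.winner = min(eligible, key=lambda b: b.price_usd)
        self.escrow_usd = self.winner.price_usd
        self.status = Status.ESCROWED
        return self.winner

    def settle(self, buyer_accepted: bool) -> float:
        """Release escrow to the supplier on acceptance; otherwise refund the buyer."""
        if self.status is not Status.ESCROWED:
            raise ValueError("nothing in escrow")
        paid = self.escrow_usd if buyer_accepted else 0.0
        self.status = Status.SETTLED if buyer_accepted else Status.REFUNDED
        self.escrow_usd = 0.0
        return paid


rfq = RFQ(job="support_ticket_resolved", max_price_usd=1.00, min_reputation=0.90)
bids = [
    Bid("agent-a", 0.80, 0.95),
    Bid("agent-b", 0.50, 0.60),  # cheapest, but reputation too low
    Bid("agent-c", 0.65, 0.92),
    Bid("agent-d", 1.50, 0.99),  # over the buyer's maximum price
]
awarded = rfq.award(bids)
paid = rfq.settle(buyer_accepted=True)
print(awarded.agent, paid, rfq.status.value)
```

The sketch keeps payment strictly behind buyer confirmation, which is the simplest way to model the verification problem; clawbacks after a reopened ticket would need an additional post-settlement state.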

This is the most valuable layer, but also the hardest to verify, as outcomes can be delayed, subjective, and easily gamed. A customer service ticket might be reopened; a pull request might pass tests but still introduce poor architecture.

Total Price = Cost of doing the work + Cost of bearing the risk
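One simple way to read this formula is to treat the cost of bearing risk as an expected loss: the probability that the outcome fails after delivery, times what the supplier loses when it does. That expected-loss model and all of the numbers below are assumptions for illustration; the article itself does not specify how risk should be priced.

```python
def risk_cost(p_failure: float, loss_given_failure: float) -> float:
    """Expected loss from a failed outcome (e.g. a reopened ticket or reverted PR)."""
    if not 0.0 <= p_failure <= 1.0:
        raise ValueError("p_failure must be a probability")
    return p_failure * loss_given_failure


def total_price(work_cost: float, p_failure: float, loss_given_failure: float) -> float:
    """Total Price = Cost of doing the work + Cost of bearing the risk."""
    return work_cost + risk_cost(p_failure, loss_given_failure)


# Hypothetical support ticket: cheap to do, modest downside if it is reopened.
ticket_price = total_price(work_cost=0.40, p_failure=0.08, loss_given_failure=2.50)

# Hypothetical pull request: passes tests, but bad architecture is costly to unwind.
pr_price = total_price(work_cost=3.00, p_failure=0.15, loss_given_failure=40.00)

print(f"ticket: ${ticket_price:.2f}")
print(f"pull request: ${pr_price:.2f}")
```

With these assumed inputs the risk term is a third of the ticket price but two thirds of the pull-request price, which illustrates why harder-to-verify outcomes may be priced mostly on risk and not on the compute consumed.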

A workflow does not automatically become a market just because intelligence has a market or intelligence gets cheaper. Some work is deeply dependent on private context, like customer history or internal policies. The more context-dependent the work, the less likely it is to trade cleanly in an open market. [@hypersoren https://hypersoren.xyz/posts/cybernetic-arbitrage/]

The market needs to reveal which labor categories will expand and which will contract.

"Machine Labor vs. Human Labor" or "Machine Labor & Human Labor"

Anjali Shriva notes in her mechanism design draft that the AI narrative is too often described as replacement. But in reality, it's more of a coordination problem: how work, attribution, incentives, and value get reorganized when both humans and machines participate in production.

Today, much AI usage inside companies remains stuck because employees use AI privately, workflows stay locked on individuals, and the firm cannot price these productivity gains or scale them.

Most automatable work will likely shift to machines. Some work will turn into human review, accountability, training, and context management. In some cases, the final 1% of human judgment will become more valuable, as it unlocks the other 99% of automation at scale.

Rachel Su Park's "Brave New World of AI Markets" points out that AI's TAM should not be simply modeled as a replacement for existing human labor spend, because it changes both price and quantity. As the cost of work falls, unit price may decline, but quantity consumed may expand because existing work gets consumed more often, and entirely new work that wasn't economical before becomes possible. The article summarizes it as:

P × Q: Market Size = Price per unit of work × Quantity of work consumed

If AI makes a customer service interaction cheaper, companies can afford to offer 24/7 coverage. The market won't just be a cheaper version of the old customer service labor market; it might become a much larger market for customer interactions.

AI is an expansionary market because demand does not stay constant when the cost of work falls.
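A small worked example shows when the P × Q market actually grows as price falls. The sketch below assumes a constant-elasticity demand curve, which is a textbook simplification and not something the article specifies; the prices, volumes, and elasticity values are made up.

```python
def quantity_after(q0: float, p0: float, p1: float, elasticity: float) -> float:
    """Constant-elasticity demand (assumed): Q1 = Q0 * (P1 / P0) ** (-elasticity)."""
    return q0 * (p1 / p0) ** (-elasticity)


def market_size(price: float, quantity: float) -> float:
    """Market Size = Price per unit of work x Quantity of work consumed."""
    return price * quantity


# Hypothetical customer service market: price per interaction falls from $5 to $1.
P0, Q0, P1 = 5.0, 1_000_000.0, 1.0
before = market_size(P0, Q0)

for elasticity in (0.5, 1.0, 1.5):
    q1 = quantity_after(Q0, P0, P1, elasticity)
    after = market_size(P1, q1)
    trend = "expands" if after > before * 1.000001 else (
        "contracts" if after < before * 0.999999 else "is unchanged")
    print(f"elasticity {elasticity}: Q {Q0:,.0f} -> {q1:,.0f}, "
          f"market ${before:,.0f} -> ${after:,.0f} ({trend})")
```

Under this assumed model the market in dollar terms only expands when demand is elastic (elasticity above 1). The article's 24/7 coverage example corresponds to that case, where cheaper interactions lead to disproportionately more of them, including work that was not economical before.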

The Labor Layer

The machine labor market should start with work that can be crisply defined in specs. GPU hours are too close to the input side; they only tell you what powered the work. Full outcomes are too complex and too context-dependent to price today. As verification, reputation, and risk/insurance pricing get handled by machines, the market will move further into the pure outcomes layer.

Machine labor can become tradable because buyers will increasingly not care which model or which GPU produced the work, but rather whether the work itself meets the minimum standards and grade from the contract spec, at the right price. Agents will care even less about the underlying source.

Machines can now directly execute work for an economic purpose, and that work can be defined, measured, priced, procured, and eventually traded. Electricity, compute, models, and tokens are still important, of course, but they're all still upstream.

Downstream is where the work actually gets done, and the market is moving toward a simpler object: machine labor.

Related Questions

Q: According to the article, what is the core evolution path of the AI pricing mechanism?

A: The article states that the AI pricing mechanism will evolve from raw tokens, to standardized model capabilities, to industry-specific labor, and finally to a programmable outcomes market.

Q: What does the article define as "machine labor"?

A: The article defines "machine labor" as work performed by software, for an economic purpose, and sold into the production process.

Q: What are the four evolutionary stages of the AI market, from primitive to advanced, as outlined in the article?

A: The four stages are: 1) Raw Tokens, 2) Commoditized LLM Capability Market, 3) Commoditized Labor Market, and 4) Programmable RFQ and Outcomes Market for Agents.

Q: How does the article suggest AI will change the human job market, rather than simply replacing it?

A: The article suggests that as machines take over standardized work, human roles will shift towards review, accountability, context management, and final judgment. In some cases, the final 1% of human judgment becomes more valuable as it unlocks large-scale automation of the other 99%.

Q: What key factor does the article identify as necessary for a true machine labor market to function?

A: The article identifies the need for standardization and verifiable specifications within different industries and tasks. Suppliers must meet defined standards (like benchmark scores, latency, error rates) for a specific type of work, enabling price competition based on the delivered labor, not the underlying tokens or compute.
