Understanding the New Economic Model of Tokenization

marsbitPublished on 2026-05-19Last updated on 2026-05-19

Abstract

Understanding the New Token Economics Model The commercialization of AI applications is evolving from selling software and subscriptions to selling token call capacity. Tokens, the fundamental unit of information processing for large language models (LLMs), have become the basis for API billing and consumption. With call volumes exploding, tokens themselves are now being traded—procured, routed, split, and resold—forming a new intermediary market. This layer connects upstream LLM providers with downstream developers and enterprises, acting as a global wholesale-to-retail liquidity network. The rise of this business is fueled by a massive surge in China's daily token call volume—growing over a thousandfold from 100 billion in early 2024 to over 140 trillion by March 2026—and significant improvements in domestic LLM capabilities, which are now competitive globally. The core value of token distribution platforms extends beyond simple arbitrage. Key functions include aggregating multiple models (like GPT, Claude, and domestic models such as Kimi and DeepSeek) under a unified API, lowering network and payment barriers, and providing enterprise services like model selection, prompt engineering, and system integration. Profit models are diversifying: (1) resale margins; (2) technical premiums from proprietary inference acceleration (e.g., reducing costs to 1/10 of the industry standard); and (3) enterprise value-added services. High-consumption scenarios like marketing, short-f...

Author: Zhao Ying

Source: Wall Street News

The commercialization of AI applications is extending from selling software and memberships to selling token-calling capabilities. Here, Tokens refer to the smallest units of information processed by large models, serving as the basis for API billing, settlement, and consumption. As the volume of calls increases, Tokens themselves are beginning to be procured, routed, split, and resold like a form of "inventory."

Chen Liangdong, an analyst at Huayuan Securities, summarized the core change in a recent media industry report: "Token operations are forming a new intermediary market, which involves exploring token distribution models to connect upstream large model providers with downstream developers, enterprises, and individuals. The essence is the liquidity infrastructure for a global network of token wholesale to retail."

The background for this business is not complex: On one hand, China's daily token call volume has surged rapidly, rising from 100 billion tokens per day at the beginning of 2024 to 100 trillion by the end of 2025, surpassing 140 trillion by March 2026. On the other hand, domestic large models have improved significantly, entering the global top tier in certain rankings and call volumes. With increasing demand and a growing number of models, the real barriers to transactions have become payment, network access, interfaces, compliance, distribution channels, and scenario implementation.

However, token distribution cannot be simply understood as "reselling API quotas." The thinnest layer of profit comes from resale margins, while the thicker portion comes from inference acceleration, unified interfaces, enterprise-level prompt engineering, Agent orchestration, model selection, and integration with business systems. Precisely because the entry barrier is not high, the risks in this market are equally direct: intensified competition, funding requirements for upfront payments, bad debts, and policy changes from upstream model providers can all squeeze the profits of the intermediary layer.

Tokens Now Have "Wholesalers" and "Retailers"

The basic chain of token distribution includes three types of roles.

Upstream are the model providers, including ByteDance's Seedance series, Alibaba's Qwen series, Zhipu's GLM series, Moonshot AI's Kimi series, DeepSeek series, etc. They are the original suppliers of tokens.

In the middle are agency platforms responsible for procuring resources from upstream model providers and distributing them to end-users. Their work is not just about reselling quotas; they also convert the interface protocols of different models into a unified API format, enabling downstream users to access multiple models through a single API Key.

Downstream are the actual consumers of tokens, including individual users, developers, enterprise clients, and possibly lower-tier distributors.

The value of this intermediary layer focuses on several areas: reducing network barriers through domestic direct connections; enabling a single codebase to adapt to multiple models; supporting both personal and corporate payments; potentially obtaining lower costs through bulk procurement; and aggregating models like GPT, Claude, DeepSeek, and Kimi on one platform to reduce the cost of repeated integration for developers.

Thus, token distribution appears to be asset-light, requiring neither the training of large models nor massive server clusters. The core assets become the API routing and scheduling system, upstream model resources, channel clients, and service capabilities.

The Surge in Call Volume is the Most Direct Fuel for This Business

For the token operation model to succeed, there must first be a sufficiently large consumption volume.

China's daily token call volume increased more than a thousandfold in two years, from 100 billion to over 140 trillion tokens. This expansion stems from the deployment of various vertical Agents and the embedding of generative AI into more business processes by enterprises.

IDC data presents an even more aggressive trajectory: the number of active intelligent agents in Chinese enterprises is expected to exceed 350 million by 2031, with a compound annual growth rate (CAGR) exceeding 135%. As the density and complexity of agent tasks increase, the annual growth rate in token consumption by agents is projected to exceed 30-fold.

This change is already visible in execution-oriented agents. The weekly token consumption of OpenClaw on the OpenRouter platform increased from 0.81T between February 2 and March 16, 2026, to 4.97T, with its share rising from 8.31% to 24.36%.

Once tokens become a mass-consumed commodity, their procurement, pricing, routing, and settlement naturally stratify. Model providers may not directly serve every client, and end customers may not be willing to integrate with each model individually, creating space for the intermediary layer.

The Cost-Effectiveness of Domestic Models Opens the Door for Token Export

The improvement in domestic large model capabilities is a key variable enabling token distribution to expand from domestic to cross-border markets.

Data from SuperCLUE shows that domestic models like ByteDance's Doubao and the DeepSeek series have achieved overall scores exceeding 70 points, narrowing the gap with leading overseas models like GPT-5.4 and Gemini. Models like Tongyi Qianwen, Kimi, and Zhipu GLM have also formed a relatively clear tiered structure.

According to OpenRouter data, for the week ending May 10, 2026, Tencent's Hy3 preview (free) topped the call volume list. Among the top 5, top 10, and top 20 models, there were 2, 6, and 9 domestic large models, respectively.

A more significant change occurred in Q1 2026. From February 9 to 15, the call volume of Chinese models on OpenRouter reached 4.12 trillion tokens, surpassing the 2.94 trillion tokens of US models for the first time in the same period. From February 16 to 22, the weekly call volume of Chinese models further increased to 5.16 trillion tokens. Among the top five models on the platform by call volume, four were from Chinese providers: MiniMax M2.5, Kimi K2.5, Zhipu GLM-5, and DeepSeek V3.2, collectively accounting for 85.7% of the total call volume of the top five.

The price advantage is also prominent. The input price for both MiniMax M2.5 and GLM-5 is $0.3 per million tokens, compared to $5 for Claude Opus 4.6. For output, MiniMax M2.5 is $1.1, GLM-5 is $2.55, and Claude Opus 4.6 is $25. The cost-effectiveness of domestic models becomes more pronounced in high-token-consumption scenarios like AI Agents and code development.

Global AI Resource Imbalance Makes Routing Platforms the "Transit Hubs"

Token distribution doesn't just solve price issues; it also addresses resource mismatches.

Leading overseas large models face barriers like regional access restrictions, compliance rules, and payment hurdles, preventing them from directly reaching certain user groups, including developers in mainland China. Similarly, high-quality domestic models expanding overseas encounter challenges in localization, channel development, and user acquisition.

This imbalance fuels the demand for cross-border flow, aggregated routing, and layered distribution.

OpenRouter is already a typical example. The volume of tokens processed on its platform increased from 5-7 trillion per week in 2025 to over 20 trillion per week by April 2026. Its annualized revenue in 2026 exceeded $50 million, a roughly fivefold increase from the over $10 million annualized revenue disclosed in October 2025.

Similar platforms exist domestically. Silicon Flow is a one-stop large model cloud service platform based on its self-developed inference engine for efficient inference acceleration, while also providing enterprise-grade large model services. As of December 2025, the platform had over 9 million registered users, more than 10,000 enterprise users, and over 150 models available.

Even politically connected capital in the US has entered this field. On May 5, 2026, WLFI, a cryptocurrency company closely linked to Trump and his family, partnered with WorldClaw to launch WorldRouter, integrating over 300 models including Claude, GPT, and Gemini. Settled in USD, its pricing is approximately 30% lower than official public rates.

Real Profits May Not Lie in "Resale Margins"

There are three ways to profit from token distribution.

The first is resale margins. Platforms purchase API quotas in bulk from upstream model providers and resell them at a markup to downstream clients. OpenRouter, which adds about a 5.5% premium to supplier costs, exemplifies this model.

The second is technological premium. Platforms use self-developed inference acceleration engines to reduce the cost per token. Even when selling at prices close to or lower than official rates, they can generate gross profit through computational efficiency advantages. Silicon Flow's SiliconLLM and OneDiff technologies improve language model inference speeds by 10 times and text-to-image efficiency by 3 times, reducing the cost of large model API calls to as low as 1/10th of the industry average.

The third is enterprise value-added services. The cost of deploying AI for enterprises isn't just in token unit prices; it also includes prompt engineering, multi-model selection, business system integration, workflow orchestration, operational scheduling, and employee AI skills development. As basic token prices decline, these hidden costs become more likely points for monetization.

Silicon Flow's enterprise-level MaaS platform follows this direction: providing enterprise users with capabilities across three layers—model training and fine-tuning, deployment and inference, and application development support—covering data processing, model fine-tuning, prompt engineering, and RAG, ultimately delivered in the form of standardized APIs to industries like energy, finance, and government.

Marketing, Short Videos, Gaming, and E-commerce Are Scenarios That Consume Tokens More Easily

To be profitable, token distribution must ultimately land in real-world scenarios.

Generative AI applications are entering industries like healthcare, transportation, and industrial manufacturing and are starting to participate in core processes like corporate decision support and strategic management. However, many enterprises have weak foundations for digital transformation, insufficient data asset accumulation, and limited computing power investment, making direct AI deployment challenging.

In contrast, marketing and advertising companies already possess clients and scenarios, especially in short videos, webtoons, gaming, and e-commerce. Their token consumption demand is more direct and sustained. For such companies, the opportunity isn't just about reselling model capabilities but embedding tokens into client workflows for content generation, ad placement, asset production, and video creation.

Investment leads also unfold along two main lines:

One category includes companies with strong model capabilities, such as Alibaba, Tencent Holdings, Kuaishou, Kunlun Tech, Zhipu, MiniMax, etc.

The other category includes companies with strong token consumption scenarios and quality client sources, especially those with overseas client resources and marketing scenarios, and a willingness to actively invest in AI marketing and AI videoization. Examples include EasyClick and BlueFocus.

Risks Are Also Concrete: Low Barriers, Upfront Funding Requirements, Upstream Dependence

The token distribution business model is asset-light, but its moat is not inherently deep.

Peer competition is the first risk. The technical barrier for distribution is relatively low. Once leading distributors with capital, client, and channel advantages enter, they can quickly replicate the model, compressing profit margins.

Upfront funding requirements and bad debts are the second risk. Distributors often offer monthly or quarterly settlements to downstream clients but need to fund the upfront purchase of API quotas from upstream providers. The larger the token consumption scale, the greater the funding pressure. If clients delay payment, bad debt risks amplify simultaneously.

Policy changes by upstream model providers are the third risk. Large model providers control API pricing and access rules and may adjust prices or tighten policies for third-party access. For the intermediary layer, this is the most difficult factor to control.

Trending Cryptos

CitreaCTR

wrapped stUSDTWSTUSDT

Velodrome FinanceVELODROME

BrevisBREV

ZRX（0X）ZRX

PancakeSwapCAKE

Wall Street Morning Report: U.S. Stocks Suffer Collective Setback, Apple Hits New High Against the Trend, Tonight's CPI and Waller's Hearing to Determine Interest Rate Path

Wall Street Morning Report: U.S. stocks fell collectively, while Apple hit a new record high. Key events including tonight's CPI data and Fed Chair Walsh's testimony will set the direction for interest rates. Markets experienced a sharp "risk-off" move due to escalating Middle East tensions and unexpected hawkish signals from the Federal Reserve. Major indices declined, with the Nasdaq Composite leading losses, down 1.55%. The VIX fear index surged over 14%. Geopolitical tensions spiked as the U.S. conducted consecutive airstrikes on Iran and announced a maritime blockade of Iranian ports, set to begin on July 15. This triggered a panic-driven rally in oil, with WTI crude soaring over 9% to breach $80 per barrel. Safe-haven flows bolstered the U.S. dollar and Treasury yields, while gold plunged nearly 3%, losing its $4,000 level. Rate markets now price a nearly 50% chance of a Fed rate hike in July, up significantly from prior expectations, following hawkish commentary from Fed Governor Waller. The tech sector, particularly AI-related stocks, faced intense selling pressure. The Philadelphia Semiconductor Index plunged 4%. Notable decliners included SK Hynix (down 9% on its second trading day as an ADR), Nvidia, AMD, and Intel. In contrast, Apple shares rose 0.63% to a record high, viewed as a stable haven away from the costly AI data center arms race. Key events to watch include the U.S. June CPI inflation data and Fed Chair Walsh's Congressional testimony tonight, which will critically influence the Fed's policy path. Major bank earnings also begin today. The formal implementation of the U.S. maritime blockade against Iran on July 15 and upcoming events like TSMC's Q2 earnings and a SpaceX Starship test flight remain in focus.

marsbit44m ago

Wall Street Morning Report: U.S. Stocks Suffer Collective Setback, Apple Hits New High Against the Trend, Tonight's CPI and Waller's Hearing to Determine Interest Rate Path

marsbit44m ago

SBI partners with Solana to expand Japan’s on-chain finance, but can it become Asia’s hub?

SBI Holdings and the Solana Foundation have formed a strategic partnership, creating SBI Solana Global, to develop Japan's on-chain financial markets. This initiative focuses on utilizing Solana's infrastructure for regulated services like JPY stablecoins, tokenized real-world assets, and cross-border payments. A core goal is to modernize Japan's financial sector rather than disrupt it, positioning the country as a potential hub for institutional on-chain finance in Asia. Success hinges on driving sustained institutional adoption, transaction growth, and expanding cross-border payment activity, moving beyond policy to execution.

ambcrypto47m ago

SBI partners with Solana to expand Japan’s on-chain finance, but can it become Asia’s hub?

ambcrypto47m ago

Crude Oil Surges 10% in Intraday Trading! US Military Restores Blockade on Iran, Trump Warns of 'Heavy Blows' Tonight and Tomorrow, Imposes 20% Fee on Strait Shipping

Crude oil prices surged up to 10% intraday as tensions between the US and Iran escalated sharply. President Trump announced the reinstatement of a maritime blockade against Iran, targeting its ships and their customers, while stating the Strait of Hormuz would remain open to all other nations. He further declared that the US, as the "guardian" of the strait, would impose a 20% fee on all cargo transiting the strategic waterway to compensate for security costs. The US Central Command confirmed the naval blockade would commence on July 14 (GMT). Following Trump's statements, reports emerged of explosions near Iran's Larak Island and the port of Bandar Abbas, though Iranian authorities did not confirm their nature. Iran's Foreign Minister criticized the proposed 20% fee as excessive, asserting Iran's role as the guardian of the strait. An Iranian military spokesperson warned of a strong response to any unauthorized US interference in the waterway. The United Nations' International Maritime Organization (IMO) expressed opposition to charging tolls for passage through international straits, stating such measures lack legal foundation. Market reactions were pronounced, with Brent crude briefly surpassing $83 and WTI above $78. The situation intensified as Trump notified Congress of renewed hostilities with Iran, following US airstrikes on Iranian targets.

marsbit47m ago

Crude Oil Surges 10% in Intraday Trading! US Military Restores Blockade on Iran, Trump Warns of 'Heavy Blows' Tonight and Tomorrow, Imposes 20% Fee on Strait Shipping

marsbit47m ago

How to Regulate Single-Stock Leveraged ETFs? On Thursday, the Entire Market Is Watching This Korean Government Meeting

The highest-level economic coordination body in South Korea, the "F4" comprising the Ministry of Economy and Finance, the Financial Services Commission, the Bank of Korea, and the Financial Supervisory Service, will hold an emergency meeting on Thursday to discuss regulatory measures for single-stock leveraged ETFs. These products, launched just six weeks ago, have been widely blamed for exacerbating market volatility. The KOSPI's 8% plunge on Monday triggered the year's seventh trading halt, intensifying scrutiny. Regulators have expressed rare public regret over approving the products. FSS Governor Lee Bok-hyun stated he "regretted not doing everything possible to prevent" their introduction and acknowledged structural problems, citing massive retail investments and legal complications from a rushed rollout. Possible countermeasures under discussion include raising margin requirements, imposing daily price limits, and adjusting leverage caps. However, officials admit these may be temporary fixes. Data confirms the amplified volatility. Since the ETFs' launch, days with KOSPI moves exceeding 3% have nearly doubled. Trading halts have reached record levels, surpassing the 2008 financial crisis peak. The products allow 2x leveraged bets on giants like Samsung Electronics and SK Hynix. Their daily rebalancing to match returns is seen as mechanically fueling market swings. The outcome of Thursday's F4 meeting is highly anticipated, with expectations leaning towards stricter controls on leverage, investor access, or other structural constraints to curb the products' market impact.

marsbit1h ago

How to Regulate Single-Stock Leveraged ETFs? On Thursday, the Entire Market Is Watching This Korean Government Meeting

marsbit1h ago

AI Overhauled Terence Tao's 30-Year-Old Website, Uncovering Two Bugs Hidden for Over Two Decades in the Process

AI Revamps Terence Tao's 30-Year-Old Website, Unearthing Two 20-Year-Old Bugs in His Code Terence Tao, a renowned mathematician, has enlisted an AI agent to overhaul his personal academic website, which was built in 1997 with a static HTML, manually-maintained "Web 1.0" architecture. In just one day, the agent migrated 560 papers and preprints, 374 travel logs, 68 courses, 19 books, and 29 old math applets to a new system on GitHub Pages. The new site is structured around YAML files as the "single source of truth," with static HTML pages automatically generated from this data—a fundamental shift from maintaining individual documents to managing a centralized database. During the migration, the AI uncovered inconsistencies, outdated entries, and broken links that had accumulated over nearly three decades of manual updates. It also successfully ported a set of small educational Java 1.0 applets to JavaScript. Notably, while reviewing this translation, Tao found only one new bug introduced by the AI. Conversely, the AI identified two subtle bugs in his original Java code that he was previously unaware of. Tao emphasizes the project highlights AI's potential for automating tedious "digital housekeeping"—routine tasks like data migration and website maintenance that are costly and error-prone when done manually. He also revived a 27-year-old stalled project: a special relativity visualizer or "Minkowskian Inkscape." With AI assistance, a working alpha version was built in two hours. While AI still requires human oversight for critical work, Tao argues that for such structured, non-core tasks, "AI + human review" can result in lower error rates and drastically lower correction costs compared to purely manual maintenance over decades.

marsbit1h ago

AI Overhauled Terence Tao's 30-Year-Old Website, Uncovering Two Bugs Hidden for Over Two Decades in the Process

marsbit1h ago

Trading

Spot

Hot Articles

The Cornerstone of the Autonomous AI Economy: How Talus is Reshaping On-Chain Intelligent Agents

Talus is a decentralized AI Agent framework built on the Sui, designed to solve the structural problems of current AI systems: centralization, opacity, and a lack of native economic identity.

43.3k Total ViewsPublished 2026.03.18Updated 2026.03.18

The Cornerstone of the Autonomous AI Economy: How Talus is Reshaping On-Chain Intelligent Agents

In-depth Analysis of AI and Crypto: The Era of Symbiosis between Algorithms and Ledgers

By 2026, the integration of artificial intelligence and cryptocurrency has advanced from proof-of-concept to a new stage of "system-level integration".

2.6k Total ViewsPublished 2026.03.26Updated 2026.03.26

In-depth Analysis of AI and Crypto: The Era of Symbiosis between Algorithms and Ledgers

U.S. Equity TradFi Assets: Traditional Finance as a Steady Anchor Amid the AI IPO Boom

In 2026, the U.S. IPO market has regained momentum.

16.7k Total ViewsPublished 2026.07.08Updated 2026.07.08

U.S. Equity TradFi Assets: Traditional Finance as a Steady Anchor Amid the AI IPO Boom

Discussions

Welcome to the HTX Community. Here, you can stay informed about the latest platform developments and gain access to professional market insights. Users' opinions on the price of AI (AI) are presented below.

Understanding the New Economic Model of Tokenization

Abstract

Tokens Now Have "Wholesalers" and "Retailers"

The Surge in Call Volume is the Most Direct Fuel for This Business

The Cost-Effectiveness of Domestic Models Opens the Door for Token Export

Global AI Resource Imbalance Makes Routing Platforms the "Transit Hubs"

Real Profits May Not Lie in "Resale Margins"

Marketing, Short Videos, Gaming, and E-commerce Are Scenarios That Consume Tokens More Easily

Risks Are Also Concrete: Low Barriers, Upfront Funding Requirements, Upstream Dependence

Trending Cryptos

Related Questions

Related Reads

Wall Street Morning Report: U.S. Stocks Suffer Collective Setback, Apple Hits New High Against the Trend, Tonight's CPI and Waller's Hearing to Determine Interest Rate Path

SBI partners with Solana to expand Japan’s on-chain finance, but can it become Asia’s hub?

Crude Oil Surges 10% in Intraday Trading! US Military Restores Blockade on Iran, Trump Warns of 'Heavy Blows' Tonight and Tomorrow, Imposes 20% Fee on Strait Shipping

How to Regulate Single-Stock Leveraged ETFs? On Thursday, the Entire Market Is Watching This Korean Government Meeting

AI Overhauled Terence Tao's 30-Year-Old Website, Uncovering Two Bugs Hidden for Over Two Decades in the Process

Trading

Hot Articles

The Cornerstone of the Autonomous AI Economy: How Talus is Reshaping On-Chain Intelligent Agents

In-depth Analysis of AI and Crypto: The Era of Symbiosis between Algorithms and Ledgers

U.S. Equity TradFi Assets: Traditional Finance as a Steady Anchor Amid the AI IPO Boom

Discussions

Top Questions

Hot Categories

Hot Tags