By Silicon Base Quadrant
When users are no longer debating whether to upgrade their monthly data plan, they may soon start debating how many Token services to purchase each month.
Tokens are about to be packaged and sold as standardized services by telecom operators, just like data, broadband, and SMS.
Recently, China's three major telecom operators have successively launched Token plan products: monthly subscription-based Token schemes for individual users, and tiered computing power packages for developers and enterprise customers. They have announced the integration of dozens to hundreds of large models into their platforms, allowing for "monthly purchase, multi-model calls, and payment via phone bill."
China Telecom has launched personal and enterprise Token plans, with a minimum monthly fee of 9.9 yuan for 10 million Tokens; local operators like Shanghai Mobile and Shanghai Telecom have introduced billing models based on quota points or general Tokens, with Shanghai Mobile offering 400,000 Tokens for 1 yuan.
As operators begin selling Token services, the cost for users to switch between large models will significantly decrease. For large model companies, this means "user stickiness" will be weakened, and only by "competing more fiercely" can they retain their market.
In the future, large model vendors like Doubao, Qwen, and DeepSeek will not only compete on "price" and "Token quality per unit of energy consumption," but also on "higher-value AI application solution capabilities."
01 What is Token Service?
To understand Token service, first understand what a Token is.
Computers cannot directly recognize text; they can only recognize 0 and 1 codes. Therefore, every word, character, punctuation, or piece of speech we input is converted into 0 and 1 codes through a specific encoding mechanism.
In the context of large models, they also first recognize numeric codes, and the number of bits in the code converted from each character varies slightly.
A Token is the smallest unit of computation for a large model to process information. User input, context memory, and model output are all calculated in Tokens. More complex model calls, longer contexts, and deeper Agent execution chains consume more Tokens.
Typically: In English, one Token is roughly equivalent to 4 letters; In Chinese, due to the higher information density of Chinese characters, one Chinese character, one punctuation mark, or one phrase often corresponds to 1 to 2 Tokens.
Since large models think and output Token by Token, the industry sells and settles the cost and usage quota of large models in the form of "Per Million Tokens" or "quota points."
Currently, large model companies implement tiered pricing for Tokens. Ordinary users using general modes of models like Doubao or Qwen are free; for enterprise-level heavy usage, one can purchase different tiers of API monthly packages or metered services.
Starting last year, operators opened large model "computing power supermarkets." Model vendors are the "tenant merchants," and operators charge "platform fees + computing power fees + channel fees." What users buy is not the "operator's model," but rather: on the telecom platform, using telecom computing power, to call any large model, billed per Token.
In July 2025, China Mobile launched the model service platform MoMA (Mobile Model Access); in April, China Telecom launched the Xingchen TokenHub operation service platform; in May, "Unicom Xingluo" Token service platform was released. These platforms have integrated mainstream large models from companies like Baidu, Alibaba, ByteDance, and DeepSeek, offering unified API, unified authentication, and unified billing.
Operators' platforms internally adapt to multiple large models; users only need to change the model name (Model ID) to switch smoothly.
02 Why Are Operators Selling Tokens?
The explosion of Token service is not accidental.
First, changes in billing models. In the traditional cloud computing era, users were accustomed to paying for "server rental time" or "fixed bandwidth" (i.e., computing power payment at the IaaS layer), buying bandwidth speed and time. However, with the development of large models, the capabilities provided by different models and the costs consumed by different tasks vary greatly. For example, a stronger model costs more per Token; longer contexts consume more Tokens; higher inference complexity leads to higher actual costs. Billing per Token aligns "the level of intelligence consumed by the user" with "the computing power cost paid by the vendor."
Second, lowering technical barriers and "trial-and-error costs." The R&D and deployment of large models often require investments of tens of millions or even billions of dollars. For the vast majority of SMEs and individual developers, building their own models is not realistic. Token service breaks down "Artificial General Intelligence (AGI)" capabilities into pieces, packaging them so developers don't need to worry about how many tens of thousands of GPUs are burning electricity underneath; they only need to call APIs on demand and pay Token fees.
Finally, urgent demand driven by the explosion at the application layer. Entering 2026, application-layer scenarios such as AI Agents, AI-assisted programming, and multimodal content generation have exploded. These applications, in their daily operation, require frequent "throughput" interactions with underlying large models. An automated AI code-writing tool might consume millions of Tokens overnight. This high-frequency, massive-volume interaction forces the market to provide more standardized, stable, and price-competitive Token plan services.
Over the past two decades, operator business models have undergone three core changes in measurement units.
The first stage was the voice era, where operators sold minutes; the second stage was the mobile internet era, selling data GB; and entering the AI era, operators are beginning to experiment with selling Tokens.
Tokens are undergoing a similar evolution to data. Initially, they were just technical metrics; then they became billing units; finally, they evolved into standardized commodities.
The entry of operators marks that Tokens have begun to move beyond the technical realm and enter the consumption system.
In the coming years, the way users purchase AI capabilities may fundamentally change: individual users purchase "AI monthly packages," enterprises procure "Token resource pools," home broadband comes with AI quotas, and government/enterprise dedicated lines integrate Agent services. Tokens will become a basic resource, like electricity, water, and data.
However, this does not mean operators will replace large model vendors.
03 How to Buy Tokens Appropriately?
Should Token service be purchased directly from native large model vendors or from operator platforms? What are the pros and cons of the two current business models?
The first is the native model vendor model, which bills per million Tokens. Vendors like OpenAI, Anthropic, DeepSeek, Qwen, etc., commonly use this system. Users pay separately for input Tokens and output Tokens. Some, like Qwen, might use a pre-purchase at the beginning of the month, settle at the end of the month format.
The second is the operator's monthly subscription Token quota. For example, Shanghai Telecom offers a minimum of 9.9 yuan for 10 million Tokens, with additional purchases for excess, and plans to integrate Token rights into the family's "Beautiful Home" digital space, supporting one-click payment via phone bill.
This "all-in-one price" or "bill integration" model allows Chinese users to purchase large model computing power like they buy data packages.
While overseas markets are dominated by the API tiered pricing of native large model enterprises, the domestic market is pushing Token service into a "packaged" era similar to mobile phone plans.
Currently, both billing models have their advantages, as the user base for Token plans can be divided into three main types.
The first is independent developers and technology enthusiasts (Geeks). They use the API interfaces provided by various vendors to build their own personalized AI applications, such as productivity tools, automatic translation plugins, personal knowledge bases, etc.
The second category is SMEs, startups, and B2B independent software vendors (ISVs). This is the core customer group for Token service. Whether purchasing Tokens for company employees for programming, developing industry-specific AI Agents, or embedding AI assistance into existing enterprise ERP, CRM systems, SMEs need to subscribe to "team edition Token plans" from cloud vendors or operators.
The third category is "AI-heavy dependent" professionals and ordinary households, who need to frequently use AI for copywriting, code writing in home settings, or require AI to tutor their children with homework.
For SMEs and startups, from a techno-economic perspective, the pure Token billing model of native large models is more scientific.
The operator's package model has two advantages. On one hand, independent developers are not tied to one specific large model; they can independently choose from multiple models through the platform provider. On the other hand, Token service may reach mass consumers faster. Most people know what 100GB of data means, but cannot perceive what 10 million Tokens represent.
Operators adopting monthly subscriptions are essentially lowering the cognitive barrier. Users don't need to understand Tokens; they just need to start with a basic package like 9.9 yuan/10 million Tokens to understand their needs.
As operators start selling Token services, "Doubaos" are about to begin competing fiercely at three levels.
From "competing on parameters" to "competing on energy efficiency ratio": For large model companies, they can no longer blindly pursue large parameters and high energy consumption for large models. Instead, they must focus on capabilities like model distillation, quantization, and inference optimization that can output higher quality Tokens with smaller energy consumption.
Price competition will further intensify. After operators aggregate hundreds of models, user switching costs decrease. If model A raises prices, it can be replaced with model B via the platform. When model capability differences are insufficient, price becomes the core competitive factor.
The profit center for large model enterprises will shift. Simply selling APIs offers limited profits. Future profit focus may shift to Agents, industry applications, and enterprise solutions. The model itself gradually becomes infrastructure, while the application layer becomes the value center.
Perhaps, a "two-sided market" is forming: operators control the entry point, model vendors control the capabilities.






