More and More 'Model Supermarkets' Are Opening: ByteDance, Alibaba, and Tencent Compete to Integrate

marsbit | Published on 2026-04-24 | Last updated on 2026-04-24

Abstract

Chinese tech giants such as ByteDance, Alibaba, and Tencent are accelerating the rollout of integrated AI model subscription services, dubbed “model supermarkets,” which give developers bundled access to multiple leading domestic large language models (LLMs). ByteDance’s Volcano Engine recently upgraded its "Coding Plan" by adding newer models such as GLM-5.1, MiniMax M2.7, and Kimi K2.6, allowing subscribers to use various top models under a single monthly fee starting at ¥40. However, user feedback reveals significant issues, including rapid consumption of usage limits (e.g., hitting caps within hours), frequent rate-limit errors (HTTP 429), and slow response times during peak hours. Complaints about misleading deduction rates, where calls to advanced models consume more quota, are also common. The trend is industry-wide: Alibaba, Tencent, and Baidu have all launched similar multi-model coding plans. While these platforms reduce trial costs for developers, they also expose the difficulty of balancing affordability with service quality and computational stability. Amid this shift, independent AI companies such as Zhipu, MiniMax, and Moonshot AI (Kimi) are developing strategies to avoid becoming mere “pipes” in this ecosystem, focusing on vertical applications, autonomous agents, and long-context models to stay competitive. Analysts suggest that, while platform aggregation may pressure model firms in the short term, specialized and vertical AI capabilities will remain differentia...

ByteDance's Volcano Engine recently launched GLM-5.1 in its Coding Plan, with the official announcement describing it as "aligned with the original full capabilities, no purchase limits." Before this, the Coding Plan had long offered only older models such as GLM-4.7. The update not only introduced GLM-5.1 but also added several of the latest domestic large models, including MiniMax M2.7, Kimi K2.6, and DeepSeek-V3.2.

This means developers can access multiple leading models with a single subscription fee. Market feedback indicates that this bundling significantly reduces developers' trial-and-error costs. The Lite plan is currently priced at 40 yuan per month and the Pro plan at 200 yuan per month, and many developers are willing to "buy a spot first."

Zhipu's GLM-5.1 itself demonstrated impressive engineering capabilities in an update in early April 2026. Two official videos released by Zhipu, "Building a Linux Desktop from Scratch in 8 Hours" and "655 Iterations, Increasing Query Throughput of the Vector Database to 6.9 Times the Initial Official Version," reshaped public expectations of what "8-hour effective execution" by a large model can look like.

Reporter Visits a Developer Community: Most Users Say the Quota "Doesn't Last"

After joining a Volcano Engine Coding Plan developer group, the reporter found that, alongside posts sharing usage feedback, many users reported a gap between expectations and actual experience. A few pages of scrolling turned up numerous posts complaining and requesting refunds, with many users saying they "feel cheated."

The controversies mainly focus on two points:

The first is that usage limits are consumed too quickly. A user named "Hakimi" posted that "a few rounds of dialogue in one task and the 5-hour limit is almost used up." Another user shared that their 5-hour limit was triggered because the account is metered over a continuous 5-hour sliding window, and the actual number of requests in that window had exceeded 6004, above the system limit.

The second is a degraded experience caused by pressure on compute scheduling. Many users reported encountering 429 errors (too many requests) and said that "first-token delays of over one minute during peak hours are the norm." One user stated bluntly: "The 5-hour limit triggers too frequently, making it unusable for serious development."
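The "continuous 5-hour sliding window" that users describe can be illustrated with a minimal sketch. This is not Volcano Engine's implementation: the class name, the per-request counting, and the small limit used in the example are assumptions chosen only to show why a burst of requests can exhaust a rolling quota and why capacity returns gradually as old requests age out.

```python
from collections import deque


class SlidingWindowQuota:
    """Hypothetical model of a rolling request quota (not the vendor's code)."""

    def __init__(self, limit: int, window_seconds: float = 5 * 3600):
        self.limit = limit                    # max requests inside the window (assumed)
        self.window_seconds = window_seconds  # 5 hours by default
        self._timestamps = deque()            # times of accepted requests, oldest first

    def _evict(self, now: float) -> None:
        # Drop requests that have aged out of the rolling window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()

    def try_request(self, now: float) -> bool:
        """Return True if the request fits; False models a rate-limit (429-style) rejection."""
        self._evict(now)
        if len(self._timestamps) >= self.limit:
            return False
        self._timestamps.append(now)
        return True

    def used(self, now: float) -> int:
        """Number of accepted requests still counted against the window."""
        self._evict(now)
        return len(self._timestamps)


if __name__ == "__main__":
    quota = SlidingWindowQuota(limit=3, window_seconds=10)  # tiny numbers for illustration
    for t in (0, 1, 2, 3, 10):
        print(t, quota.try_request(t))
```

Because the window rolls continuously rather than resetting on a fixed schedule, a user who spends the whole quota in a short burst has to wait until the earliest requests are five hours old before any capacity comes back.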

Behind the low price of 40 yuan per month, the Coding Plan also has a less visible catch: a single call request is deducted from the quota at a different coefficient depending on the model. A user posted an image in the developer group showing these differences: the Doubao and Qwen series have a deduction coefficient of 1, the DeepSeek series 2, and MiniMax-M2.7, Kimi-K2.6, and GLM-5.1 a coefficient of 5.
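The effect of these coefficients on a fixed quota is simple arithmetic, sketched below. The coefficient values are the ones reported in the user's screenshot; the dictionary keys, the function names, and the 1,000-unit budget are hypothetical and used only for illustration, since the article does not state the plan's actual quota size.

```python
# Deduction coefficients as reported by a user in the developer group.
# The keys are illustrative labels, not official model identifiers.
COEFFICIENTS = {
    "doubao": 1,
    "qwen": 1,
    "deepseek": 2,
    "minimax-m2.7": 5,
    "kimi-k2.6": 5,
    "glm-5.1": 5,
}


def units_consumed(calls: dict[str, int]) -> int:
    """Total quota units deducted for a mix of calls, keyed by model label."""
    return sum(COEFFICIENTS[model] * count for model, count in calls.items())


def max_calls(budget_units: int, model: str) -> int:
    """How many calls to one model fit in a given (hypothetical) unit budget."""
    return budget_units // COEFFICIENTS[model]


if __name__ == "__main__":
    mix = {"doubao": 10, "deepseek": 10, "glm-5.1": 10}
    print(units_consumed(mix))         # 10*1 + 10*2 + 10*5 = 80
    print(max_calls(1000, "doubao"))   # 1000
    print(max_calls(1000, "glm-5.1"))  # 200
```

Under these figures, a quota that covers 1,000 calls to a coefficient-1 model covers only 200 calls to GLM-5.1, which is consistent with users finding that the limit runs out quickly once they switch to the newest models.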

All of this shows that building a "model supermarket" is harder than it looks. Developers are drawn in by the cost-effectiveness, but early shortcomings in areas such as compute scheduling have made many of them hesitate after trying the service. These are the growing pains of bundling in its early stage: as users flock in, the capacity of the computing platform is put to the test. Finding a sustainable balance between attracting users with low prices and maintaining service quality will be a long-term challenge for Volcano Engine and its followers.

Cloud Vendors Collectively Shift to "Model Supermarkets": Initial Signs of Stratification and Solidification

This "integrative" update by Volcano Engine's Coding Plan is not an isolated incident.

Since early 2026, mainstream cloud vendors including Alibaba Cloud, Baidu Intelligent Cloud, and Tencent Cloud have all been rolling out multi-model integration. Alibaba Cloud, an early mover, launched the multi-model subscription package "Bailian Coding Plan," which currently supports the Qwen series, kimi-k2.5, glm-5, MiniMax-M2.5, and other models. The Pro plan is priced at 200 yuan per month; the Lite plan stopped accepting new purchases on March 20 and stopped renewals and upgrades on April 13.

Tencent Cloud's large model Coding Plan subscription service was fully updated in March 2026 and supports several of the latest models, including Tencent HY 2.0 Instruct, GLM-5, Kimi-K2.5, and MiniMax-M2.5. Baidu Qianfan officially launched its AI coding subscription service, Coding Plan, in February 2026, making it one of the earlier domestic cloud vendors to offer such a service.

The "model supermarket" is not one company's choice; it is becoming a track on which cloud vendors are racing to stake out positions. Looking past the aggregation strategy itself, the new core competitive factors are who can provide more stable service, more transparent quota rules, and more flexible disaster recovery; who can extend beyond programming into broader enterprise-level services; and whether renewal rates can keep up.

Internationally, Amazon Bedrock and Microsoft Azure's model aggregation service platforms, though different in scenarios from the domestic Coding subscription model, belong to the same integration trend.

Overall, industry competition is shifting from "single model capability competition" to "platform integration capability + ecosystem service capability" competition, and industry concentration will rapidly increase.

Wang Kai, Chief Asset Allocation Analyst at Guosen Securities, told reporters that although industry differentiation is accelerating, it may be slightly premature to call this a period of consolidation. "More accurately, this is the refinement and iteration of the division of labor in the industry chain. Model vendors focus on algorithms, cloud vendors focus on engineering delivery, each leveraging the advantages of its main business." He believes that regardless of whether other cloud vendors follow suit, the competitive landscape will evolve from every company going it alone toward differentiated ecological niches.

Increased Pressure for Large Model Companies to Become "Pipelined"?

So-called "pipelining" does not mean that model companies disappear. It means they lose their product premium, their direct connection to users, and their bargaining power, while profits shift toward the computing platform and the model company is reduced to a subordinate role.

Under the cloud vendors' aggregation wave, "pipelining" has become a Sword of Damocles hanging over independent large model companies. In this quiet contest, leading players such as Zhipu AI, Moonshot AI (Kimi), and MiniMax have not accepted a passive compromise. Each is building on its own strengths and pursuing a different way out.

Zhipu AI CEO Zhang Peng, in a public dialogue on April 8th, clearly stated that Zhipu's ultimate goal is never to become a "replaceable calling tool" but to build a fully autonomous agent. This positioning attempts to upgrade Zhipu from a "model supplier" to a "task executor," thereby bypassing the low-price trap of pure API pipelines.

Moonshot AI (Kimi) adopts a strategy of "decentralized layout + deep cultivation of long text." Its models are offered on multiple mainstream cloud platforms at the same time, including Volcano Engine and Alibaba Cloud, which gives it multi-source computing supply, avoids dependence on a single channel, and supports service stability and cost control. Kimi K2.6, launched in April 2026, uses a Mixture of Experts (MoE) architecture with a standard context window of 256K tokens.

MiniMax concentrates its core investment on vertical fields such as content creation, intelligent customer service, education, enterprise services, and social entertainment, with a particular focus on scenarios such as game AI, digital humans, and multimodal interaction, creating "customized capabilities difficult for cloud platforms to replace."

Will platform integration by major vendors accelerate the "pipelining" of model companies? Wang Kai, Chief Asset Allocation Analyst at Guosen Securities, believes it is necessary to distinguish between short-term and long-term perspectives.

"In the short term, distribution channels being controlled by the platform, partial ceding of pricing power, and profits of model vendors shifting to the entry point side are business norms. But in the long run, general models are prone to homogenization; deep learning models in vertical scenarios like finance, healthcare, and law have professional barriers that cannot be erased simply by centralized aggregation," he said.

In responding to the risk of being platformized, the strategies of OpenAI and Anthropic offer a reference. On one hand, they strengthen channels that face end users directly, such as the independently operated ChatGPT and Claude, which establishes user connections that bypass the platforms. On the other hand, the speed of technological iteration and users' brand recognition are two effective moats, so model companies need to balance R&D investment with productization.

The final outcome of this game of "pipelining vs. platformization" might not be about who eats whom, but a further clarification of division of labor. Cloud vendors act as pipes, model companies focus on technology, and both sides gradually find their respective survival boundaries in the game.

As for who eats whom, at this stage, it is far from the end of the story.

This article is from the WeChat public account "Sci-Tech Innovation Board Daily," author: Wang Nai
