Can Alibaba Cloud Rewrite Itself?

marsbit · Published 2026-05-20 · Last updated 2026-05-20

Summary

Over the past five months, Alibaba Cloud's MaaS (Model as a Service) revenue has surged 15x, marking a strategic overhaul where the company is shifting its 17-year-old system designed for "humans using cloud" to a new paradigm centered on "Agents consuming Tokens." At its recent summit, Alibaba Cloud announced a full-stack upgrade encompassing "chip-cloud-model-inference," all optimized for AI Agents. Key launches include the new AI product portal "QianWen Cloud," hyper-node servers powered by the in-house AI chip Zhenwu M890, and the latest flagship model, Qwen3.7-Max. Senior VP Liu Weiguang described this as building "China's largest AI factory," where chips are raw materials, the cloud is the workshop, models are machines, and the inference platform is the assembly line, with Tokens as the final product. The company is now emphasizing its chip strategy, unveiling the Zhenwu M890 and a two-year roadmap for future chips. With over 560,000 chips deployed across 400+ clients, Alibaba Cloud aims to control the marginal cost per Token, mirroring Google's integration of TPU and Gemini for optimal cost-performance. The cloud infrastructure itself is being rewritten. Traditional cloud interfaces are being transformed into standardized, Agent-callable Skills. A new scheduling logic focuses on "task scheduling" over "resource scheduling" to handle the unpredictable, elastic workloads of Agents. Liu noted that AI applications now automatically provision cloud resources, with one cu...

Over the past five months, Alibaba Cloud's MaaS revenue has grown 15-fold, and that is just one facet of the company's self-reconstruction. At the summit, Alibaba Cloud announced the completion of a full-stack Agentization upgrade covering "chip-cloud-model-inference," and simultaneously launched its new official AI product website "Qianwen Cloud," a hyper-node server equipped with the self-developed AI chip Zhenwu M890, and its latest flagship model, Qwen3.7-Max.

In the words of Alibaba Cloud Senior Vice President Liu Weiguang, "We are building China's largest AI factory." The factory metaphor implies a complete production logic: chips are the raw materials, the cloud is the workshop, models are the machines, the inference platform is the assembly line, and the final commodity produced is the Token.

The essence of this reconstruction is to transform the entire system built over the past 17 years around "humans using the cloud" into a new system centered on "Agents consuming Tokens."

Why Play the Chip Card Now?

Alibaba Cloud rarely emphasized chips in public before. At this summit, it not only released the new-generation Zhenwu M890, an AI chip that integrates training and inference, but also disclosed its chip roadmap for the next two years for the first time, with successive generations, Zhenwu V900 and Zhenwu J900, to follow year by year.

The Zhenwu M890 is equipped with 144GB of video memory, an inter-chip interconnect bandwidth of 800GB/s, and performance three times that of the previous generation Zhenwu 810E. Paired with the self-developed ICN Switch interconnect chip, 128 AI chips can be combined into a single machine, with P2P latency compressed to within 150 nanoseconds.

But beyond the specifications, the more crucial information is scale. The Zhenwu series has cumulatively shipped 560,000 units and has already entered over 400 clients across more than 20 industries, including telecommunications, FAW, and SPDB.

Liu Weiguang repeatedly used Google as an analogy. The deep integration of Google's TPU and Gemini allowed Google to achieve optimal cost-effectiveness within its own framework. Alibaba Cloud certainly wants to follow the same path. He summarized the competitive logic in one sentence: "If the future competition is about every chip generating more high-quality Tokens than competitors, then we win."

Combined with the Yitian CPU, Panmai Smart NIC, and Zhenyue storage controller chip, T-Head's chip portfolio has expanded from a single point to complete coverage of computing power, networking, and storage. When inference demand expands exponentially, a provider can control the marginal cost of each Token only by holding the chips in its own hands.

The reasoning is not complicated. Model companies can compete on parameters, but cloud providers ultimately compete on whose Tokens are cheaper, more stable, and faster. Chips are the starting point of this cost war.

The Cloud Itself Must Be Rewritten

Chips solve the problem of "being able to run," but Agents' demands on the cloud extend far beyond computing power.

The interaction logic of traditional cloud products is designed for humans: opening the console, looking at menus, configuring parameters, clicking buttons. This setup is completely unusable for Agents. Agents don't view web pages or click buttons; they need structured capability descriptions, standardized calling protocols, and predictable feedback.

Alibaba Cloud CTO Li Feifei used a set of comparisons to illustrate the problem: Traditional cloud workloads are steady-state; an ECS instance, once launched, might run for months or even years. But Agent workloads are characterized by "irregular elasticity, short lifecycles, and instant scaling up and down." After an Agent completes a task, its sandbox is destroyed. The next request might come in a few milliseconds, or it might not come for several hours.

To address this, Alibaba Cloud has done three things.

First, it made cloud products Skill-based, MCP-based, and CLI-based. Simply put, each cloud product is packaged into a standardized interface that Agents can directly call, using the cloud like calling functions.

Second, it built a dedicated runtime environment for Agents—lightweight sandboxes, multi-Agent collaboration, cross-task memory, and data flow channels.

Third, it rebuilt the scheduling logic, shifting from "resource scheduling" to "task scheduling," because traditional resource orchestration methods cannot withstand the concurrency of massive numbers of Agents.
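The first of these changes, packaging a cloud product so an Agent can call it like a function, can be illustrated with a minimal sketch. Everything here is hypothetical: the `Skill` class, the `create_sandbox` handler, and the parameter names are assumptions made for illustration, not Alibaba Cloud's actual interfaces. The sketch only shows the three things the article says Agents need: a structured capability description, a standard way to call it, and predictable feedback.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class Skill:
    """A structured capability description that an Agent can read and call."""
    name: str
    description: str
    parameters: Dict[str, type]            # parameter name -> expected type
    handler: Callable[..., Dict[str, Any]]

    def call(self, **kwargs: Any) -> Dict[str, Any]:
        # Always return a structured result so the caller gets predictable feedback.
        missing = sorted(set(self.parameters) - set(kwargs))
        if missing:
            return {"ok": False, "error": f"missing parameters: {missing}"}
        unknown = sorted(set(kwargs) - set(self.parameters))
        if unknown:
            return {"ok": False, "error": f"unknown parameters: {unknown}"}
        for key, expected in self.parameters.items():
            if not isinstance(kwargs[key], expected):
                return {"ok": False, "error": f"{key} must be {expected.__name__}"}
        return {"ok": True, "result": self.handler(**kwargs)}


def create_sandbox(image: str, ttl_seconds: int) -> Dict[str, Any]:
    # Hypothetical stand-in for a real provisioning call.
    return {"sandbox_id": f"sbx-{image}", "ttl_seconds": ttl_seconds}


create_sandbox_skill = Skill(
    name="create_sandbox",
    description="Create a short-lived sandbox for one Agent task.",
    parameters={"image": str, "ttl_seconds": int},
    handler=create_sandbox,
)
```

An Agent would read `name`, `description`, and `parameters` to decide whether and how to call the Skill, then invoke `create_sandbox_skill.call(image="python3", ttl_seconds=60)` and branch on the `ok` field of the result.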

Liu Weiguang stated that some AI applications, after going online, automatically provision cloud resources in the background—virtual machines, database instances, sandbox environments—without any human intervention. The volume of resources automatically provisioned for one customer in a single day is equivalent to two weeks of manual operations in the past.

"This essentially means Agents are using the cloud by themselves." Liu Weiguang provided an internal conversion formula: Token consumption can be proportionally converted into GPU usage, and each additional GPU card roughly drives a one-to-one increase in CPU. In other words, the growth in Token revenue is not cannibalizing traditional cloud revenue but is pulling it up, provided the cloud platform can handle the Agent workload.
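Liu's conversion rule can be written as simple arithmetic. The sketch below is only an illustration: the article gives no actual ratios, so `tokens_per_gpu_per_day` and the sample numbers are placeholder assumptions. Only the roughly one-to-one pull from GPU to CPU comes from his remarks.

```python
import math
from typing import Tuple


def implied_capacity(
    tokens_per_day: float,
    tokens_per_gpu_per_day: float,   # assumed throughput per GPU card (placeholder)
    cpus_per_gpu: float = 1.0,       # "roughly one-to-one", per Liu's description
) -> Tuple[int, int]:
    """Convert a daily Token volume into implied GPU and CPU counts."""
    gpus = math.ceil(tokens_per_day / tokens_per_gpu_per_day)
    cpus = math.ceil(gpus * cpus_per_gpu)
    return gpus, cpus


# Placeholder numbers, chosen only to make the arithmetic easy to follow.
gpus, cpus = implied_capacity(tokens_per_day=1_000_000, tokens_per_gpu_per_day=250_000)
print(gpus, cpus)  # 4 4
```

Under these assumptions, every increase in Token volume raises the GPU count and the CPU count together, which is the sense in which Token revenue pulls traditional cloud consumption up rather than replacing it.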

Therefore, Alibaba Cloud is not merely adding a layer of AI capability to the original system. It is rewriting everything from interaction methods, scheduling logic, and billing models to product forms.

Models Are Not for Chatting

The third layer of the full-stack reconstruction is the model. Qwen3.7-Max ranked first among domestic models on the overall leaderboard of the global Arena blind test, surpassing Kimi-K2.6, DeepSeek-v4-pro, and GLM-5.1. The real focus of this release, however, is Alibaba's redefinition of the direction of model capabilities.

Alibaba Group's Tongyi large model leader Zhou Jingren said, "In the past, we pursued how well the model 'spoke.' Now we demand that the model 'gets things done.'"

Take Alibaba Cloud's own chip work as an example. On the Zhenwu M890, a chip it had never been trained on, Qwen3.7-Max worked autonomously for 35 hours from scratch, relying solely on a task description, and independently completed the writing and optimization of a production-grade AI computing kernel. The final performance was 10 times higher than the official version, with no human intervention or intermediate guidance throughout the entire process.

This demonstrates the model's core capability in Agent scenarios: long-range autonomous execution. It takes a task, breaks it down, plans, writes code, debugs—working continuously for 35 hours without stopping.

To support this level of inference demand, the Bailian platform has also undergone corresponding upgrades: pooled scheduling to improve GPU utilization, context caching to eliminate redundant calculations, and elastic throughput scheduling to handle concurrency peaks.
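Of these upgrades, context caching is the easiest to picture in code. The toy sketch below is not Bailian's implementation. The `ContextCache` class and the `expensive_prefill` function are hypothetical names, and a real inference platform would cache model state rather than strings. It only shows the principle: when many requests share the same prompt prefix, the prefix is computed once and reused.

```python
import hashlib
from typing import Callable, Dict


class ContextCache:
    """Toy cache keyed by a hash of the shared prompt prefix."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, prefix: str, compute: Callable[[str], str]) -> str:
        key = hashlib.sha256(prefix.encode("utf-8")).hexdigest()
        if key in self._store:
            self.hits += 1          # prefix already processed: reuse the stored state
            return self._store[key]
        self.misses += 1            # first time this prefix is seen: pay the full cost
        state = compute(prefix)
        self._store[key] = state
        return state


def expensive_prefill(prefix: str) -> str:
    # Hypothetical stand-in for the costly computation over a long shared context.
    return f"state({len(prefix)} chars)"


cache = ContextCache()
system_prompt = "You are an operations Agent. Follow the runbook."
first = cache.get_or_compute(system_prompt, expensive_prefill)
second = cache.get_or_compute(system_prompt, expensive_prefill)
```

After the two calls above, `cache.misses` is 1 and `cache.hits` is 1: the second request skips the prefix computation entirely, which is the redundant calculation the article refers to.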

In terms of ecosystem, Bailian remains open for access. Besides the Qianwen model matrix, it has also onboarded third-party models such as Zhipu's GLM-5.1, MiniMax's M2.7, and Moonshot AI's Kimi K2.6.

Liu Weiguang mentioned, "Clients in practice don't use just one model; they use a combination of multiple models. We provide the combinations; clients find the mix that best suits them on the platform." At the summit, executives from six leading domestic model companies collectively took the stage, creating a scene reminiscent of a "domestic AI alliance."

Within the last three months, the Qianwen flagship model has iterated through three versions: 3.5, 3.6, and 3.7. This release rhythm itself sends a signal: the competition in model capabilities is far from over, and Alibaba intends to establish a long-term advantage through the vertical integration of self-developed chips and self-developed models.

The Real Bet of This Reconstruction

Looking back, the underlying logic of Alibaba Cloud's full-stack reconstruction is simple and pure. When AI revenue growth far outpaces traditional cloud business, when Tokens might replace ECS as the largest product line, when Agents start automatically provisioning cloud resources without humans needing to log into the console, the entire technical system designed for humans has reached a point where it must be changed.

But the difficulty of execution is another matter. Liu Weiguang himself admitted that the transformation is "easy to talk about, very hard to do." In the past, the sales team interacted with clients' IT departments. Now, doing MaaS requires dialogue with business units or even the CEO.

"Your conversational ability, your experience, are requirements of a completely different level." Alibaba Cloud has already established dedicated MaaS sales roles for large enterprise clients, operating and being assessed separately from traditional IaaS sales.

Performance metrics are also changing. Call volume alone no longer counts; what matters is "high-quality Tokens," meaning Tokens that solve real problems rather than chit-chat Tokens. There are three core metrics: daily growth in paying customers, the number of core business systems integrated with models, and the efficiency of Agents autonomously completing task loops.

These adjustments at the organizational and mechanism levels often reveal a company's true judgment better than technical announcements do. Alibaba Cloud wants to rebuild its revenue structure, customer relationships, and sales system. Liu Weiguang stated, "When we were building the cloud before, the client's IT budget was calculable: how many servers they had offline, roughly how much it would cost to move them up to the cloud. You could see the problem. But with MaaS, the answer to this problem is unknown; once you get in, it might exceed your imagination."

The problem statement can't be seen, and the answer is uncertain, but Alibaba Cloud has still decided to dismantle and rewrite its entire system because the only certainty is that AI is an opportunity ten or even a hundred times larger than any before.

This is probably the most noteworthy information from this summit: not which chip has more computing power, or which model ranks where, but that China's largest cloud provider is betting on a future it believes will come, with an aggressive posture approaching that of a startup. (Author: Zhang Shuai, Editor: Yang Lin)

Related Questions

Q: What is the core reason behind Alibaba Cloud's comprehensive 'chip-cloud-model-inference' stack reconstruction according to the article?

A: The core reason is to transform its entire 17-year-old system built around 'people using cloud' into a new system centered on 'Agents consuming Tokens'. This is driven by the recognition that AI revenue growth vastly outpaces traditional cloud business, and Tokens could replace ECS as the largest product line as Agents begin to automatically provision cloud resources.

Q: Why does the article emphasize Alibaba Cloud's focus on developing its own AI chips like Zhenwu M890 now?

A: The article emphasizes it because, in an era where inference demand is expanding exponentially, controlling the cost of each Token is crucial for competition. By developing its own chips (like Zhenwu), Alibaba Cloud aims to control the marginal cost per Token from the source, similar to Google's strategy with TPU and Gemini, to ultimately provide cheaper, more stable, and faster Tokens.

Q: How do Agents' demands on the cloud differ from those of traditional human users, as described in the text?

A: Agent demand differs fundamentally: it is 'irregularly elastic, short-lived, and instantly scalable then gone,' requiring structured capability descriptions, standardized calling protocols, and predictable feedback. In contrast, traditional cloud workloads are steady-state (e.g., an ECS instance running for months). Agents do not interact with web consoles; they need cloud services encapsulated as standardized, function-like interfaces.

Q: What key shift in Alibaba Cloud's model capability focus is highlighted with the release of Qwen3.7-Max?

A: The key shift is moving from pursuing models that 'speak well' to models that 'can accomplish tasks.' The focus is now on long-term autonomous execution capabilities. For example, Qwen3.7-Max autonomously wrote and optimized a production-level AI computing kernel for the new Zhenwu M890 chip over 35 hours without human intervention, improving performance tenfold.

Q: According to the article, what organizational and operational changes is Alibaba Cloud making to support its MaaS (Model-as-a-Service) transformation?

A: Alibaba Cloud is making several changes: 1) Establishing dedicated MaaS sales roles for large clients, separate from traditional IaaS sales with independent assessments. 2) Shifting key performance indicators (KPIs) from mere call volume to 'high-quality Tokens' that solve real problems. Core metrics now include daily growth of paying customers, number of core business systems integrated with models, and the efficiency of Agents completing task loops autonomously. 3) Changing sales dialogue from IT departments to business units or CEOs, requiring different skills and experience levels.
