Xiaomi and MiniMax Unleash Major Upgrades Simultaneously, Officially Kicking Off the Agent Pricing War

marsbit | Published on 2026-03-20 | Last updated on 2026-03-20

Abstract

Chinese AI companies MiniMax and Xiaomi's MiMo have both launched major Agent-focused models, M2.7 and V2-Pro, respectively, within two days in March. Both models rank in the top tier globally on Agent benchmarks but are priced significantly lower than leading Western models—MiniMax at $1.2 per million tokens (1/21 of Claude Opus) and MiMo at $3 (1/8 of Claude Opus). The two represent divergent technical strategies. MiMo-V2-Pro adopts a scale-driven approach with over 1 trillion parameters and a hybrid attention mechanism optimized for long-context and multi-tool agent tasks. In contrast, MiniMax’s M2.7 uses a self-iterative optimization method, autonomously refining its architecture over 100+ cycles to improve performance without disclosing parameter count. Their release rhythms also differ: MiniMax iterates rapidly with four versions in five months, while Xiaomi releases fewer but more substantial upgrades. Notably, Xiaomi debuted V2-Pro anonymously on OpenRouter as "Hunter Alpha," topping the platform’s usage chart before revealing its identity—a first for a Chinese AI model gaining global developer traction through pure performance.

On March 18 and 19, two Chinese companies successively released their major models in the Agent direction. Domestic AI startup MiniMax launched M2.7, while Xiaomi's large model team MiMo introduced V2-Pro. Both models have entered the global top tier on the Agent benchmark, but their API output pricing is 1/21 and 1/8 of Claude Opus 4.6, respectively.

They played their cards in the same week, but with completely different hands. They represent two entirely different technical routes, betting on two different futures for the Agent era.

The Same Exam, 1/17 the Tuition

First, let's look at the most direct comparison.

According to data from OpenRouter and the official pricing pages of various companies, based on API output price (per million tokens), MiniMax M2.7 is $1.2, and MiMo-V2-Pro is $3. As a reference, Claude Opus 4.6's output price is $25, GPT-5.2 is $14, and Claude Sonnet 4.6 is $15.
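The "1/21" and "1/8" figures follow directly from the prices quoted above. A minimal sketch of that arithmetic, using only the numbers in this article (the dictionary and function names are illustrative, not taken from any vendor API):

```python
# Output prices in USD per million tokens, as quoted in the paragraph above.
OUTPUT_PRICE_USD = {
    "MiniMax M2.7": 1.2,
    "MiMo-V2-Pro": 3.0,
    "GPT-5.2": 14.0,
    "Claude Sonnet 4.6": 15.0,
    "Claude Opus 4.6": 25.0,
}


def price_ratio(model: str, reference: str = "Claude Opus 4.6") -> float:
    """Return how many times more expensive `reference` is than `model`."""
    return OUTPUT_PRICE_USD[reference] / OUTPUT_PRICE_USD[model]


for name in ("MiniMax M2.7", "MiMo-V2-Pro"):
    # Rounds 20.83 to 21 and 8.33 to 8, matching the fractions in the text.
    print(f"{name}: about 1/{round(price_ratio(name))} of Claude Opus 4.6")
```

Passing a different `reference` (for example `"GPT-5.2"`) gives the same comparison against the other models listed.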

The price gap is roughly an order of magnitude, but the capability gap is not. On SWE-bench Verified (the current mainstream benchmark for measuring code engineering capabilities), MiMo-V2-Pro scored 78%, while Sonnet 4.6 scored 79.6%, a difference of less than two percentage points. M2.7's SWE-Pro score was 56.22%, on par with GPT-5.3-Codex. On VIBE-Pro (end-to-end project delivery capability), M2.7 scored 55.6%, close to the level of Opus 4.6.

The key point of this comparison is not who scores higher or lower: the benchmark systems of the various companies are not fully aligned, so direct comparisons should be made cautiously. The key point is the "price-performance scissors gap": domestic Agent models have already squeezed into the same capability band but sit in completely different price ranges.

Trillion Parameters vs. Self-Evolution

Price is only the surface. The two companies have revealed two completely different hole cards.

MiMo-V2-Pro follows the "more is better" route. According to Xiaomi's official announcement, V2-Pro has over 1 trillion total parameters, 42B activated parameters, and supports an ultra-long context of 1 million tokens. Its core innovation is the Hybrid Attention mechanism, which sets the ratio of Sliding Window Attention (SWA) to Global Attention (GA) at 7:1, up from 5:1 in the previous-generation V2-Flash. This architecture makes the model more stable in Agent scenarios that involve long documents and parallel calls to multiple tools. On PinchBench (an evaluation of Agent tool-calling capability), MiMo-V2-Pro scored 84%.
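To make the 7:1 and 5:1 ratios concrete, here is a rough sketch of what such a layer schedule could look like. Xiaomi's announcement, as summarized here, gives only the ratios, so the interleaving pattern (a run of SWA layers followed by one GA layer) and the 48-layer depth are assumptions chosen purely for illustration.

```python
from collections import Counter


def hybrid_schedule(num_layers: int, swa_per_ga: int) -> list[str]:
    """Assumed layout: every (swa_per_ga + 1)-th layer is global, the rest sliding-window."""
    period = swa_per_ga + 1
    return ["GA" if (i + 1) % period == 0 else "SWA" for i in range(num_layers)]


NUM_LAYERS = 48  # hypothetical depth, chosen because it divides evenly by 6 and 8

v2_flash = hybrid_schedule(NUM_LAYERS, swa_per_ga=5)  # 5:1, previous generation
v2_pro = hybrid_schedule(NUM_LAYERS, swa_per_ga=7)    # 7:1, V2-Pro

print("V2-Flash style:", Counter(v2_flash))
print("V2-Pro style:  ", Counter(v2_pro))

# Activated share of parameters per token, from the figures quoted above.
# "Over 1 trillion" total means this is an upper bound on the true share.
active_fraction = 42e9 / 1e12
print(f"Activated parameters: at most about {active_fraction:.1%} of the total")
```

Under these assumptions, moving from 5:1 to 7:1 reduces the number of global-attention layers at a fixed depth, which is the usual motivation for such hybrids at long context lengths; the actual layer layout of either model may differ.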

M2.7 takes a completely different path. According to the official technical blog released by MiniMax on March 18, M2.7's parameter count is not disclosed, but the model demonstrates a "self-iterative evolution" mechanism: it autonomously runs over 100 rounds of optimization cycles, each consisting of analyzing failure trajectories, planning modifications, modifying its own code architecture, running evaluations, and cycling again, ultimately achieving a 30% performance improvement on the internal evaluation set. On MLE Bench Lite (an evaluation based on machine learning competition tasks) with 22 high-difficulty problems, M2.7 won 9 gold, 5 silver, and 1 bronze, with an average medal rate of 66.6%.
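MiniMax has not published the implementation behind this loop, so the following is only a generic propose-evaluate-keep sketch of the cycle described above, not MiniMax's pipeline. Every name in it (`self_iterate`, `toy_evaluate`, `toy_propose`) is hypothetical, and the "system" being improved is a single number standing in for a real code base.

```python
from typing import Callable

Config = dict[str, float]


def self_iterate(
    config: Config,
    evaluate: Callable[[Config], float],
    propose: Callable[[Config, int], Config],
    rounds: int = 100,
) -> tuple[Config, float]:
    """Generic improvement loop: propose a change, evaluate it, keep it only if it helps."""
    best, best_score = dict(config), evaluate(config)
    for step in range(rounds):
        candidate = propose(best, step)  # stands in for "analyze failures, plan, modify"
        score = evaluate(candidate)      # stands in for "run evaluations"
        if score > best_score:           # discard modifications that do not improve the score
            best, best_score = candidate, score
    return best, best_score


# Toy stand-ins (hypothetical): the score is highest when x equals 3.0.
def toy_evaluate(cfg: Config) -> float:
    return -abs(cfg["x"] - 3.0)


def toy_propose(cfg: Config, step: int) -> Config:
    delta = 0.5 if step % 2 == 0 else -0.5  # alternate between nudging up and down
    return {"x": cfg["x"] + delta}


best, score = self_iterate({"x": 0.0}, toy_evaluate, toy_propose, rounds=100)
print(best, score)
```

In a real agent system the proposal step would be driven by the model reading its own failure logs, and the evaluation step would be a held-out benchmark; the sketch keeps only the accept-if-better structure.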

Across five dimensions, the strengths of the two routes point in completely different directions: MiMo-V2-Pro has clear advantages in context length and code engineering, while M2.7 pulls ahead in office automation and self-iterative capabilities. According to the same MiniMax technical blog, M2.7 scored ELO 1495 on GDPval-AA (office document processing evaluation), ranking first among open-source models, and maintained a 97% skill adherence rate in the MM-Claw test covering over 40 complex skills.

Four Versions in Five Months

The two companies not only have different technical routes but also completely different iteration rhythms.

According to public release records, MiniMax iterated four major versions from the release of M2 in October 2025 to the release of M2.7 in March 2026—a new version every 49 days on average. The interval between M2.5 and M2.7 was only about 30 days.

Xiaomi MiMo's rhythm is different: MiMo-7B (a 7B parameter open-source reasoning model) was released in April 2025, V2-Flash (309B total parameters) in December 2025, and V2-Pro (1T total parameters) in March 2026. The parameter scale leap between each generation is larger, but the version intervals are also longer.

MiniMax chose small, quick steps: each iteration is modest in scope but the frequency is extremely high, and M2.7's self-iterative mechanism is itself designed for "continuous evolution." Xiaomi chose to gather strength for a single strike, with each version representing a major leap in parameter scale and architecture.

Anonymous for 8 Days, Topping OpenRouter

Beyond the technical route, Xiaomi's release strategy also broke industry conventions.

According to a Reuters report, on March 11, an anonymous model named Hunter Alpha appeared on OpenRouter, the world's largest API aggregation platform. No brand endorsement, no launch event, no technical blog. Its API pricing was extremely low, yet its performance was surprisingly strong.

The community began speculating about its origin. According to Republic World and multiple tech media reports, the most mainstream guess was DeepSeek V4, as MiMo team leader Luo Fuli had previously conducted research at DeepSeek. Call volume surged rapidly, exceeding 1 trillion tokens during the anonymous period, topping the OpenRouter weekly chart.

In the early hours of March 19, Xiaomi revealed the answer: Hunter Alpha was MiMo-V2-Pro. According to the same Reuters report, Xiaomi's Hong Kong stock saw a gain of up to 5.8% after the reveal.

This was the first time a domestic large model proved itself on a global platform through pure blind testing. It relied not on brand or publicity but on developers voting with their feet over 8 days.

Related Questions

Q: What are the two Chinese companies that recently released their Agent-oriented large models, and what are the model names?

A: MiniMax released the M2.7 model, and Xiaomi's MiMo team released the V2-Pro model.

Q: How does the API output pricing of MiniMax M2.7 and MiMo-V2-Pro compare to Claude Opus 4.6?

A: The API output price for MiniMax M2.7 is $1.2 per million tokens, which is 1/21 of Claude Opus 4.6's $25. MiMo-V2-Pro is $3 per million tokens, which is 1/8 of Claude Opus 4.6's price.

Q: What are the core technical approaches of MiMo-V2-Pro and MiniMax M2.7?

A: MiMo-V2-Pro follows a 'scale-up' approach with over 1 trillion total parameters and a Hybrid Attention mechanism. MiniMax M2.7 uses a 'self-iterative evolution' mechanism where the model autonomously runs optimization cycles to improve its own performance.

Q: What was unique about Xiaomi's release strategy for the MiMo-V2-Pro model?

A: Xiaomi first released the model anonymously on OpenRouter under the name 'Hunter Alpha' for 8 days. It gained significant developer traction and topped the OpenRouter weekly chart before Xiaomi revealed it was their model.

Q: How did the iteration rhythms of MiniMax and Xiaomi's MiMo team differ?

A: MiniMax iterated rapidly, releasing four versions in five months (approx. every 49 days). Xiaomi's MiMo team had longer release intervals with larger parameter scale jumps between versions, such as from 7B parameters to 309B, and then to 1T.
