K2.6 is Yang Zhilin's First Roadshow

marsbitPublished on 2026-04-22Last updated on 2026-04-22

Abstract

K2.6 is Yang Zhilin's first roadshow for Moonshot AI's upcoming IPO. The release of K2.6 marks a strategic shift from a developer-focused approach to one aimed at investors and enterprise clients. Key changes include a 58% price increase for API inputs, structured to favor locked-in enterprise users over casual customers. The model is positioned against previous-generation international models like GPT-5.4 rather than the latest competitors, framing it as "first-tier." It also introduces large-scale Agent clusters (300+ agents) and open-source offerings, targeting enterprise automation scenarios. These moves are seen as preparations for K3, a anticipated 3-4 trillion parameter model. With Moonshot's valuation rising to $18 billion, K2.6 is a crucial step to demonstrate commercial viability and market positioning ahead of a potential late-2026 IPO.

By Xiang Xianzhi

The night before last, Moonshot AI released Kimi K2.6 and raised the API input price from $0.60 per million tokens to $0.95 per million tokens.

A 58% increase. The first price hike since the K2 series launched.

But it seems no one is paying attention to this.

Four months ago, in an internal letter on the last day of 2025, Yang Zhilin wrote that Moonshot AI was "not in a hurry for an IPO in the short term." At that time, Zhipu and MiniMax had already submitted their prospectuses to the Hong Kong Stock Exchange. This was clearly a deliberate positioning strategy.

He also wrote in that letter that the company's cash reserves exceeded $1.4 billion, and the Series C round of $500 million was oversubscribed—the subtext being that the potential of the primary market had not been fully utilized, and there was no rush for the secondary market.

Three months later, Bloomberg reported that he had begun talks with CICC and Goldman Sachs. Three weeks after that, K2.6 was launched.

A person who dislikes "rushing" did in four months what he previously said he wouldn't do.

K2.6 is certainly not the last product release before Moonshot AI's IPO. But this version release is Yang Zhilin's first roadshow after Moonshot AI planned to go public.

Kimi Has Never Released a Model Version Like This Before

Kimi had a set routine for releasing models in the past.

Publish a technical report, open-source the weights, top the HuggingFace leaderboard, and then await scrutiny from the tech community. K1.5 countered o1 with a reasoning methodology, with technical details outweighing benchmark numbers; K2 Thinking directly dumped the weights on HuggingFace, letting developers run their own tests. These moves were aimed at developers and researchers.

The rhetoric was also from the tech community: what problem did we solve, why is our method better, welcome to reproduce.

K2.6's moves are somewhat different.

First, the price increase. In RMB terms, the input price for K2.6 is 6.5 yuan per million tokens (cache miss), compared to 4 yuan for K2.5. The output price increased from 21 yuan to 27 yuan. The cache hit price is 1.1 yuan.

This is a structured price increase. Superficially, all tiers are increasing, but the cache hit tier has the smallest increase—from 0.7 yuan to 1.1 yuan, which is $0.16 per million tokens in USD.

This $0.16 is the key to understanding this price hike.

For enterprise users who repeatedly call the same system prompt: code assistants, Agent orchestration frameworks, smart customer service—their prefix is highly reusable, and cache hit rates can reach 75% to 83%. Moonshot AI left a nearly flat price for these customers.

For scattered customers who use it occasionally with different prompts each time, this price increase falls squarely on them.

This is a friendly price adjustment for "enterprises already tied to Kimi" and an unfriendly one for "scattered customers still comparing prices". The former are the "enterprise locked-in clients" in the IPO story, the latter are the "long-tail users" that won't appear on the roadshow PPT. Moonshot AI knows very well who its valuation assets are.

The compute structure of the Agent era is different from the chat era. Chat models are dozens of tokens back and forth, Agents are thousands of tool calls and hundreds of thousands of token consumption. Official K2.6 use cases—Mac local deployment Qwen3.5 model calling tools over 4000 times, running for 12 hours; refactoring the open-source matching engine exchange-core, 1000+ tool calls over 13 hours; more extreme, 5 days of autonomous operation monitoring alerts, fault response—the token consumption for these single tasks is hundreds or even thousands of times that of chat scenarios in the K2.5 era.

Of course, these cases are meant to illustrate long-range reasoning capabilities, but coupled with K2.6's 300-agent cluster, the token consumption must be staggering.

At the old price of $0.60, this kind of Agent task might lose money per call. At $0.95, it barely covers the inference cost.

So the price increase isn't confidence, it's necessity. Moonshot AI has raised $2.5 billion cumulatively, with $1.4 billion cash reserve from Series C to C+, but if the next-gen K3 is truly a 3-4 trillion parameter scale, a single pre-training run might eat up half of that.

Without a price increase, the gross margin data for the last few quarters before the IPO would look bad. The prospectus must disclose gross margin.

This could have been explained openly—the Agent era requires a new pricing model. But Moonshot AI didn't. Because C-end users just came from the free era of K2 Thinking, and telling them "I raised prices" now is not a good product narrative.

It's a story for another audience—Kimi already has a group of enterprise clients who can't leave it, and they'll use it even if it's more expensive. (Like myself)

The second thing is benchmark comparisons. K2.6's official chosen references are GPT-5.4, Claude Opus 4.6, Gemini 3.1 Pro. All three are previous-generation flagships.

The same week, Anthropic released Claude Mythos, and Opus 4.7 just launched—both are a generation stronger than Opus 4.6. K2.6 didn't benchmark against them.

This is actually an active choice. Benchmarking against Mythos, K2.6 falls into the "catch-up" position; benchmarking against Opus 4.6, K2.6 falls into the "first tier" position. An $18 billion valuation needs the latter.

Kimi didn't really do this in the past. When K2 Thinking was released, the official ran full benchmarks, good and bad results, all released for developers to judge for themselves. That was the tech community's way—the community understands where you are strong and weak, and is willing to accept a model with obvious shortcomings but a clear roadmap.

Roadshow PPTs are not. Roadshow PPTs need a conclusion a fund manager can understand in 30 seconds: "on par with or superior to top international closed-source models." This sentence is verbatim from the K2.6 official blog.

The third thing is the Agent cluster and open-source dual track. K2.6 upgraded something called Claw Groups—a heterogeneous Agent ecosystem where Agents with different devices, different models, and different toolchains run in a collaborative space, with K2.6 acting as the scheduler. 300 sub-Agents in parallel, 4000 steps of collaboration, 5 days of autonomous operation.

These numbers are written for enterprise clients. Not for developers. For a developer, "300 Agents in parallel" has no practical meaning—they won't run 300 Agents in a local project. This configuration only makes sense for one type of client: large enterprises that need an Agent matrix to automate entire operational processes.

It's targeting the Salesforce story, not the HuggingFace story.

Meanwhile, K2.6 is fully open-sourced. Yang Zhilin said at the Zhongguancun Forum on March 26th that open source will be an absolute victory.

Open source + enterprise Agent clusters—this is a position between DeepSeek and Anthropic, half and half of both models. It sounds like a good story. But occupying both ends means having to prove both.

The capital market doesn't really care if these questions have answers. It only requires you to have a story for each line.

Price increase, benchmarking, Agent cluster—these three things together have an反常的共同点 (abnormal common point). None are for the tech community.

Kimi's underlying logic for releasing models in the past was—if developers like us, enterprise clients will eventually follow, and the capital market will follow even later. This playbook has a name: technical sincerity.

K2.6 isn't waiting. The price increase is a direct declaration of B-end pricing power; benchmarking against GPT-5.4 is preemptively securing a valuation position; Agent clusters and Claw Groups are the showroom for the enterprise service story.

Each thing corresponds to a question on the roadshow PPT: What is your commercialization capability? What is your benchmark position? What is your B-end moat?

Compressing the time from Preview to GA to 8 days is also this logic. Previous versions of the K2 series all went through 2-3 month preview periods, letting the community test enough, provide feedback, and iterate enough. K2.6 didn't give itself this space. It's not that the technology matured faster; the window won't wait.

An IPO in the second half of 2026 requires 4 to 6 months for filing, inquiry, hearing, roadshow, pricing, and cooling-off period according to HKEX procedures. Starting the roadshow in September means the product must be ready by April.

If GA isn't released in April, there's no window later.

K3 is the Real Grand Finale

But K2.6 is also not the strongest card Moonshot AI can play.

There is a very restrained sentence in the official blog—K2.6 is the "runway prepared for K3".

12-hour long-range coding, 300-Agent cluster, context compressor—these are not the final form of the K2 series; they are the execution layer infrastructure that a larger base model can support. Moonshot AI wouldn't spend effort making this work unless it was certain a larger model would consume these capabilities.

Rumors about K3 leaked on Reddit earlier, targeting a parameter scale of 3-4 trillion. Compared to the trillion-scale of the K2 series, this is a base leap.

If K3 can be released during the roadshow window—that is the real answer sheet. The runway paved by K2.6 allows K3 to take off.

The question is whether it can make it. How long does it take to train a 3-4 trillion parameter model? GPT-5 and Claude Opus 4.6 both had roughly 6-9 month pre-training cycles, plus several months for post-training and safety evaluation. Can Moonshot AI's existing compute—judging from the Alibaba Cloud cooperation and current cash reserves—compress this cycle to 5-6 months?

This bet is placed on K2.6.

Eight days from Preview to GA, Agent cluster expanding from 100 to 300 in one go, long-range execution stretching from hundreds of steps to 4000 steps—every move compresses time, making room for the possibility of K3.

If K3 can be released before August or September—that's the grand finale on the roadshow.

If it doesn't make it—K3 becomes a "model that can only be released after the IPO," and K2.6 has to shoulder the entire valuation narrative alone.

Moonshot AI is betting it can be done.

What Does the $18 Billion Valuation Anchor?

Back to valuation.

Three months ago, Moonshot AI was valued at $4.3 billion; two months ago, $5.5 billion; now, $18 billion.

It's not that Moonshot AI became four times stronger in these three months. It's that Zhipu and MiniMax went public and rose 4x, pushing the ceiling of the entire sector up. Zhipu's HK market cap is HK$305 billion, MiniMax's is HK$309.2 billion—both exceeding SenseTime's historical peak.

The valuation logic for these two is not "what the next-gen technology can do," but "how much AI assets can be priced in the Hong Kong market pool."

Moonshot AI's $18 billion valuation anchors the same thing. It is no longer proving it is the strongest Chinese AI company; it is proving it is a priceable Chinese AI company.

All of K2.6's moves—price increase, benchmarking, Agent cluster, open-source dual track—respond to this proposition.

But there is one thing K2.6 has not yet proven. Will Kimi's C-end users be willing to pay for the more expensive K2.6? Will paying subscribers churn to DeepSeek or MiniMax? How many enterprise clients are actually running Claw Groups, and how many just signed a POC?

These are numbers investors will definitely ask during the roadshow. K2.6 can only put the product out now. Whether it turns into numbers depends on the next three months.

When Zhipu went public, it submitted a prospectus where profits weren't yet positive; MiniMax did too. Investors accepted this story because the grand narrative of "Chinese AI assets" had just opened. Moonshot AI is half a year late. For the same question, Zhipu and MiniMax could say "we are validating," Moonshot AI must say "we are monetizing."

This pressure falls entirely on the three months between K2.6 and K3.

So back to the initial question—Is K2.6 Moonshot AI's final roadshow before the IPO?

No.

If K3 catches the roadshow window, K3 is the real grand finale. K2.6 is just the runway paved for it. If K3 misses the roadshow window, K2.6 has to carry the entire IPO narrative. Then it is Yang Zhilin's被迫提前开讲的第一场 (first, forced-to-start-early one).

Neither outcome was what Yang Zhilin wanted four months ago.

But everything that happened in these four months—Zhipu MiniMax IPO, valuation ceiling pushed up, window period compressed—forced a person who dislikes "rushing" to have to rush.

When K3 is released, it will be the second act.

Related Questions

QWhat is the significance of the K2.6 model release for Moonshot AI's IPO plans?

AThe K2.6 model release is Yang Zhilin's first roadshow for Moonshot AI's planned IPO. It is a strategic move to demonstrate the company metrics crucial for valuation, such as enterprise pricing power, competitive positioning, and B2B capabilities, ahead of a potential listing in the second half of 2026.

QHow did Moonshot AI adjust its API pricing with the K2.6 release, and what was the strategic reason?

AMoonshot AI raised its API input price from $0.60 to $0.95 per million tokens, a 58% increase. This was a structured price hike designed to be friendly to enterprise clients with high cache hit rates (who saw a smaller increase) while passing the full increase to more casual, price-comparing users. The move was necessary to improve gross margin figures for the IPO prospectus and to align pricing with the high token consumption of the new Agent era.

QWhy did the K2.6 benchmark choose to compare itself to older models like GPT-5.4 instead of the latest ones?

AK2.6 was benchmarked against previous-generation flagship models like GPT-5.4 and Claude Opus 4.6 rather than the newer, stronger contemporaries (e.g., Claude Mythos) to position itself in the 'first tier' of models for its roadshow narrative. This creates a more favorable comparison for fund managers, supporting a valuation story of being 'on par with or superior to top international closed-source models'.

QWhat is the 'Claw Groups' feature in K2.6, and which audience is it targeting?

AClaw Groups is a feature for heterogeneous Agent ecosystems, allowing up to 300 different Agents across various devices, models, and toolchains to operate collaboratively with K2.6 as the scheduler. This targets enterprise clients, not developers, as it demonstrates a solution for large corporations seeking to automate full-process operations with an Agent matrix, akin to a Salesforce enterprise story.

QWhat is the relationship between the K2.6 release and the anticipated K3 model?

AK2.6 is described as the 'runway' for the much larger K3 model (rumored to be 3-4 trillion parameters). Features like long-context execution and the Agent cluster infrastructure are built to be consumed by a more powerful base model. The rushed 8-day preview-to-GA cycle for K2.6 is a bet that K3 can be ready in time to be the centerpiece of the IPO roadshow; if not, K2.6 must carry the entire valuation narrative alone.

Related Reads

North Korean Hackers Loot $500 Million in a Single Month, Becoming the Top Threat to Crypto Security

North Korean hackers, particularly the notorious Lazarus Group and its subgroup TraderTraitor, have stolen over $500 million from cryptocurrency DeFi platforms in less than three weeks, bringing their total theft for the year to over $700 million. Recent major attacks on Drift Protocol and KelpDAO, resulting in losses of approximately $286 million and $290 million respectively, highlight a strategic shift: instead of targeting core smart contracts, attackers are now exploiting vulnerabilities in peripheral infrastructure. For instance, the KelpDAO attack involved compromising downstream RPC infrastructure used by LayerZero's decentralized validation network (DVN), allowing manipulation without breaching core cryptography. This sophisticated approach mirrors advanced corporate cyber-espionage. Additionally, North Korea has systematically infiltrated the global crypto workforce, with an estimated 100 operatives using fake identities to gain employment at blockchain companies, enabling long-term access to sensitive systems and facilitating large-scale thefts. According to Chainalysis, North Korean-linked hackers stole a record $2 billion in 2025, accounting for 60% of all global crypto theft that year. Their total historical crypto theft has reached $6.75 billion. Post-theft, they employ specialized money laundering methods, heavily relying on Chinese OTC brokers and cross-chain mixing services rather than standard decentralized exchanges. Security experts, while acknowledging the increased sophistication, emphasize that many attacks still exploit fundamental weaknesses like poor access controls and centralized operational risks. Strengthening private key management, limiting privileged access, and enhancing coordination among exchanges, analysts, and law enforcement immediately after an attack are critical to improving defense and fund recovery chances. The industry's challenge now extends beyond secure smart contracts to safeguarding operational security at the infrastructure level.

marsbit39m ago

North Korean Hackers Loot $500 Million in a Single Month, Becoming the Top Threat to Crypto Security

marsbit39m ago

Circle CEO's Seoul Visit: No Korean Won Stablecoin Issuance, But Met All Major Korean Banks

Circle CEO Jeremy Allaire's recent activities in Seoul indicate a strategic shift for the company, moving away from issuing a Korean won-backed stablecoin and instead focusing on embedding itself as a key infrastructure provider within Korea’s financial and crypto ecosystem. Despite Korea accounting for nearly 30% of global crypto trading volume—with a market characterized by high retail participation and altcoin dominance—Circle has chosen not to compete for the role of stablecoin issuer. Instead, Allaire met with major Korean banks (including Shinhan, KB, and Woori), financial groups, leading exchanges (Upbit, Bithumb, Coinone), and tech firms like Kakao. This approach reflects a broader industry transition: the core of stablecoin competition is shifting from issuance rights to systemic positioning. With Korean regulators still debating whether banks or tech companies should issue stablecoins, Circle is avoiding regulatory uncertainty by strengthening its role as a service and technology partner. The company is deepening integration with trading platforms, building connections, and promoting stablecoin infrastructure. This positions Circle to benefit regardless of which entity eventually issues a won stablecoin. Allaire also noted the potential for a Chinese yuan stablecoin in the next 3–5 years, underscoring a regional trend of stablecoins becoming more regulated and integrated with traditional finance. Ultimately, Circle’s strategy highlights that future influence in the stablecoin market will belong not necessarily to the issuers, but to the foundational infrastructure layers that enable cross-system transactions.

marsbit1h ago

Circle CEO's Seoul Visit: No Korean Won Stablecoin Issuance, But Met All Major Korean Banks

marsbit1h ago

SpaceX Ties Up with Cursor: A High-Stakes AI Gambit of 'Lock First, Acquire Later'

SpaceX has secured an option to acquire AI programming company Cursor for $60 billion, with an alternative clause requiring a $10 billion collaboration fee if the acquisition does not proceed. This structure is not merely a potential acquisition but a strategic move to control core access points in the AI era. The deal is designed as a flexible, dual-path arrangement, allowing SpaceX to either fully acquire Cursor or maintain a binding partnership through high-cost collaboration. This "option-style" approach minimizes immediate regulatory and integration risks while ensuring long-term alignment between the two companies. At its core, the transaction exchanges critical AI-era resources: SpaceX provides its Colossus supercomputing cluster—one of the world’s most powerful AI training infrastructures—while Cursor contributes its AI-native developer environment and strong product adoption. This synergy connects compute power, models, and application layers, forming a closed-loop AI capability stack. Cursor, founded in 2022, has achieved rapid growth with over $1 billion in annual revenue and widespread enterprise adoption. Its value lies in transforming software development through AI agents capable of coding, debugging, and system design—positioning it as a gateway to future software production. For SpaceX, this move is part of a broader strategy to evolve from a aerospace company into an AI infrastructure empire, integrating xAI, supercomputing, and chip manufacturing. Controlling Cursor fills a gap in its developer tooling layer, strengthening its AI narrative ahead of a potential IPO. The deal reflects a shift in AI competition from model superiority to ecosystem and entry-point control. With programming tools as a key battleground, securing developer loyalty becomes crucial for dominating the software production landscape. Risks include questions around Cursor’s valuation, technical integration challenges, and potential regulatory scrutiny. Nevertheless, the deal underscores a strategic bet: controlling both compute and software development access may redefine power dynamics in the AI-driven future.

marsbit1h ago

SpaceX Ties Up with Cursor: A High-Stakes AI Gambit of 'Lock First, Acquire Later'

marsbit1h ago

Trading

Spot
Futures

Hot Articles

Discussions

Welcome to the HTX Community. Here, you can stay informed about the latest platform developments and gain access to professional market insights. Users' opinions on the price of S (S) are presented below.

活动图片