# Related Articles on MoE

The HTX News Center offers the latest articles and in-depth analysis on "MoE", covering market trends, project news, technology developments, and regulatory policy in the crypto industry.

Chinese Large Models: This Time, the Script Is Different

By early 2026, Chinese large language models (LLMs) had gained significant global traction, accounting for six of the ten most-used models on the AI model aggregation platform OpenRouter. The shift, led by models such as Xiaomi's MiMo-V2-Pro, came after Chinese models' weekly token usage surpassed that of U.S. models in February 2026. A key driver is the substantial price gap: Chinese models are often 10 to 20 times cheaper for input tokens and up to 60 times cheaper for output tokens than leading U.S. models such as OpenAI's GPT-5.4 and Anthropic's Claude Opus. This cost advantage became critical with the rise of agentic applications such as OpenClaw, which automate complex tasks (e.g., programming and testing) and consume far more tokens than traditional chat interfaces. U.S. models still lead on complex reasoning benchmarks, but Chinese models have nearly closed the gap in programming, as shown by near-parity scores on the SWE-Bench coding evaluation. That has let cost-conscious developers, especially AI startups building on open-source stacks, adopt a "layered" approach: Chinese models for routine tasks, with premium U.S. models reserved for harder problems. Rising demand led Chinese firms such as Zhipu and Tencent to raise API prices in early 2026, yet usage kept growing sharply. Analysts attribute China's cost edge to large-scale, efficient compute infrastructure and widespread adoption of the Mixture of Experts (MoE) architecture. Contrary to the "AI-era Foxconn" analogy with low-margin electronics manufacturing, Chinese LLM firms are demonstrating pricing power and rapid technical progress, which suggests a trajectory different from traditional assembly-line roles.

marsbit 04/07 11:00

Google's Open-Source Large Model Gemma 4 Imminent Announcement: Parameter Count Quadrupled

Google is set to announce Gemma 4, its next-generation open-source large language model and a significant upgrade over Gemma 3, which was released a year earlier. The new model is expected to include a 120B-parameter version, four times larger than its predecessor, while using a Mixture-of-Experts (MoE) architecture to keep activated parameters at just 15B, which would allow it to run locally on consumer-grade hardware. Gemma 4 is also expected to improve context length, reasoning, and performance on complex tasks. The move is seen as part of Google's strategy to compete in the open-source arena, where Chinese tech firms have become increasingly influential. By releasing Gemma 4 some months after Gemini 3.0, its flagship closed-source model, Google aims to balance commercial interests with developer engagement. The model emphasizes local and offline usability, positioning it as a direct competitor to Chinese open-source alternatives. Industry observers note that Gemma 4 raises the bar for open-source models by combining scale with efficiency. Although Google's primary focus remains on closed-source systems, its technical strength could make Gemma 4 a strong contender in the global open-source ecosystem.

marsbit 04/02 06:45

The Next Earthquake in AI: Why the Real Danger Isn't the SaaS Killer, But the Computing Power Revolution?

The next seismic shift in AI isn't about SaaS disruption but a fundamental revolution in computing power. While many focus on AI applications like Claude Cowork replacing traditional software, the real transformation is happening beneath the surface: a dual revolution in algorithms and hardware that threatens NVIDIA’s dominance. First, algorithmic efficiency is advancing through architectures like MoE (Mixture of Experts), which activates only a fraction of a model’s parameters during computation. DeepSeek-V2, for example, uses just 9% of its 236 billion parameters to match GPT-4’s performance, decoupling AI capability from compute consumption and slashing training costs by up to 90%. Second, specialized inference hardware from companies like Cerebras and Groq is replacing GPUs for AI deployment. These chips integrate memory directly onto the processor, eliminating latency and drastically reducing inference costs. OpenAI’s $10 billion deal with Cerebras and NVIDIA’s acquisition of Groq signal this shift. Together, these trends could collapse the total cost of developing and running state-of-the-art AI to 10-15% of current GPU-based approaches. This paradigm shift undermines NVIDIA’s monopoly narrative and its valuation, which relies on the assumption that AI growth depends solely on its hardware. The real black swan event may not be an AI application breakthrough but a quiet technical report confirming the decline of GPU-centric compute.

marsbit 02/12 04:38

The Next Earthquake in AI: Why the Real Danger Isn't the SaaS Killer, but the Computing Power Revolution?

The next seismic shift in AI is not the threat of "SaaS killers" but a fundamental revolution in computing power. While many focus on how AI applications like Claude Cowork are disrupting traditional software, the real transformation is happening beneath the surface, in the infrastructure that powers AI. Two converging technological paths are challenging NVIDIA's GPU dominance:

1. **Algorithmic Efficiency**: DeepSeek's Mixture-of-Experts (MoE) architecture allows massive models (e.g., DeepSeek-V2 with 236B parameters) to activate only a small fraction of "experts" (9%) during computation, achieving GPT-4-level performance at 10% of the computational cost. This decouples AI capability from sheer compute power.
2. **Specialized Hardware**: Inference-optimized chips from companies like Cerebras and Groq integrate memory directly onto the chip, eliminating data transfer delays. This "zero-latency" design drastically improves speed and efficiency, prompting even OpenAI to sign a $10B deal with Cerebras.

Together, these advances could cause a cost collapse: training costs may drop by 90%, and inference costs could fall by an order of magnitude. The total cost of running world-class AI may plummet to 10-15% of current GPU-based solutions. This paradigm shift threatens NVIDIA's valuation, built on the assumption of perpetual GPU dominance. If the market realizes that GPUs are no longer the only option, or the best one, the foundation of NVIDIA's trillions in market cap could crumble. The real black swan event may not be a new AI application, but a quiet technical breakthrough that reshapes the compute landscape.

marsbit 02/11 01:58
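
The summaries above describe MoE as activating only a fraction of a model's parameters for each computation. As a rough illustration of that idea, the toy sketch below routes an input to the top-k experts chosen by a gating score and computes the share of parameters that are active. It is a minimal sketch, not the design of any model named above: the function names (`top_k_route`, `moe_forward`, `active_fraction`), the scalar "experts", and the gate scores are all hypothetical, and the only figures taken from the summaries are 120B total / 15B active (Gemma 4, as reported) and 9% of 236B (DeepSeek-V2, as reported).

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(gate_scores, k):
    """Pick the k highest-scoring experts and renormalize their weights.

    Returns a list of (expert_index, weight) pairs whose weights sum to 1.
    """
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    return [(i, probs[i] / norm) for i in top]


def moe_forward(x, experts, gate_scores, k):
    """Evaluate only the selected experts and mix their outputs.

    Experts that are not selected are never called, which is where the
    compute saving of a sparse MoE layer comes from.
    """
    routes = top_k_route(gate_scores, k)
    output = sum(weight * experts[i](x) for i, weight in routes)
    return output, routes


def active_fraction(total_params, active_params):
    """Share of parameters used per computation."""
    return active_params / total_params


if __name__ == "__main__":
    # Hypothetical toy experts: each just scales its input.
    experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
    # Hypothetical gate scores; a real model computes these from the input.
    gate_scores = [0.1, 2.0, 0.3, 1.5]

    output, routes = moe_forward(1.0, experts, gate_scores, k=2)
    print("selected experts:", [i for i, _ in routes])
    print("mixed output:", output)

    # Figures as reported in the summaries above.
    print("Gemma 4 active share:", active_fraction(120e9, 15e9))
    print("DeepSeek-V2 active params (B):", round(0.09 * 236, 1))
```

With four experts and k=2, only half of the experts run for a given input. Production MoE layers apply the same selection per token across many more experts, which is how total parameter count can grow much faster than per-token compute.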

MiniMax's Funding Story: 7 Rounds in 4 Years, Who is Driving China's First AI Capital Feast

MiniMax, a Chinese AI startup, has completed an IPO on the Hong Kong Stock Exchange after raising $1.5 billion across 7 funding rounds in just 4 years. Founded in early 2022 by former SenseTime executives Yan Junjie and Yun Yeyi, the company focuses on developing multimodal and large language models with the vision of "Intelligence with everyone." Key investors include Alibaba (largest external shareholder), Hillhouse Capital (first investor and largest financial backer), and Mihoyo, among others. MiniMax’s strategy combines foundational model development with practical applications, generating revenue through three balanced segments: companion AI apps (e.g., Talkie), content generation tools (e.g., Hailuo AI), and API services. The company navigated the competitive "war of a hundred models" in China, emphasizing both technological risk-taking—such as early bets on MoE architecture and linear attention models—and capital efficiency. Its approach reflects lessons from the previous AI wave: avoid over-customized enterprise solutions and prioritize scalable consumer applications. Despite market uncertainties and intense competition from tech giants, MiniMax went public not as a finale but as a means to secure more resources for ongoing R&D and expansion. The stock surged over 78% on its first trading day, reaching an HK$89.8 billion market cap.

marsbit 01/09 05:14

L1 Native Tokens' Final 'Monetary Premium' Is Collapsing!

The monetary premium on Layer 1 (L1) native tokens is rapidly fading as stablecoins increasingly dominate as the preferred medium of exchange (MoE) in on-chain economies. While L1 valuations historically incorporated a monetary premium derived from their use as a store of value (SoV) or MoE, data shows a clear shift. On Ethereum, ETH is no longer the primary MoE; stablecoins like USDC and USDT now lead in on-chain transaction volume and top Uniswap pools. A similar trend is observed on L2s like Arbitrum. On Solana, SOL remains the primary MoE because platforms like Pumpfun and Raydium use it as a quote currency, but USDC is gaining significant traction, especially with new launchpads like MetaDAO adopting it as the default pairing asset. On BNB Chain, USDT has overtaken BNB, previously the dominant MoE, in trading volume. The author argues that the market has chosen stablecoins as the superior on-chain MoE. Attempts by ecosystems to enforce their native token as a quote currency add friction and costs for users while having minimal price impact. Improved on-chain UX and liquidity are reducing the historical necessity of using volatile native tokens for transactions. The next wave of users will likely use stablecoins, not native tokens, for everyday exchange, further eroding this monetary premium.

比推 12/08 20:16
