The Next Earthquake in AI: Why the Real Danger Isn't the SaaS Killer, but the Computing Power Revolution?

marsbit | Published 2026-02-11 | Updated 2026-02-11

Introduction

The next seismic shift in AI is not the threat of "SaaS killers" but a fundamental revolution in computing power. While many focus on how AI applications like Claude Cowork are disrupting traditional software, the real transformation is happening beneath the surface, in the infrastructure that powers AI. Two converging technological paths are challenging NVIDIA’s GPU dominance:

1. **Algorithmic Efficiency**: DeepSeek’s Mixture-of-Experts (MoE) architecture allows massive models (e.g., DeepSeek-V2 with 236B parameters) to activate only a small fraction of their parameters (about 9%) during computation, achieving GPT-4-level performance at 10% of the computational cost. This decouples AI capability from sheer compute power.
2. **Specialized Hardware**: Inference-optimized chips from companies like Cerebras and Groq integrate memory directly onto the chip, cutting out off-chip data transfer delays. This "zero-latency" design drastically improves speed and efficiency, prompting even OpenAI to sign a $10B deal with Cerebras.

Together, these advances could cause a cost collapse: training costs may drop by 90%, and inference costs could fall by an order of magnitude. The total cost of running world-class AI may plummet to 10-15% of current GPU-based solutions. This paradigm shift threatens NVIDIA’s valuation, built on the assumption of perpetual GPU dominance. If the market realizes that GPUs are no longer the only (or best) option, the foundation of NVIDIA’s trillions in market cap could crumble. The ...

Written by: Bruce

Lately, the entire tech and investment communities have been fixated on the same thing: how AI applications are "killing" traditional SaaS. Since @AnthropicAI's Claude Cowork demonstrated how easily it can help you write emails, create PPTs, and analyze Excel spreadsheets, a panic about "software is dead" has begun to spread. This is indeed frightening, but if your gaze stops here, you might be missing the real seismic shift.

It's like we're all looking up at a drone dogfight in the sky, but no one notices that the entire continental plate beneath our feet is quietly shifting. The real storm is hidden beneath the surface, in a corner most people can't see: the computing power foundation that supports the entire AI world is undergoing a "silent revolution."

And this revolution might end the grand party hosted by AI's shovel seller: NVIDIA @nvidia, much sooner than anyone imagined.

Two Revolutionary Paths Converging

This revolution isn't a single event, but rather the convergence of two seemingly independent technological paths. They are like two armies closing in, forming a pincer movement against NVIDIA's GPU hegemony.

The first path is the algorithmic slimming revolution.

Have you ever thought about whether a super brain really needs to mobilize all its cells when thinking about a problem? Obviously not. DeepSeek figured this out with their Mixture of Experts (MoE) architecture.

You can think of it like a company with hundreds of experts in different fields. But every time you need to solve a problem, you only call upon the two or three most relevant experts, rather than having everyone brainstorm together. This is the cleverness of MoE: it allows a massive model to activate only a small portion of its "experts" during each computation, drastically saving computing power.
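The "call only the most relevant experts" idea can be sketched as generic top-k gating. This is a toy illustration under stated assumptions, not DeepSeek's actual design: the experts are plain scalar functions, the router scores in `gate_scores` are made-up numbers rather than the output of a learned router, and the helper names (`route_top_k`, `moe_forward`) are hypothetical.

```python
import math

def softmax(scores):
    """Convert raw router scores into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route_top_k(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest-scoring experts."""
    probs = softmax(gate_scores)
    top = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)  # renormalize over the chosen experts
    return [(i, probs[i] / norm) for i in top]

def moe_forward(x, experts, gate_scores, k=2):
    """Run only the selected experts and mix their outputs by gate weight."""
    selected = route_top_k(gate_scores, k)
    output = sum(weight * experts[i](x) for i, weight in selected)
    return output, [i for i, _ in selected]

# Eight toy "experts": expert i simply multiplies its input by (i + 1).
experts = [lambda x, s=s: s * x for s in range(1, 9)]
# Hypothetical router scores for one input; a real router computes these from x.
gate_scores = [0.1, 2.0, -1.0, 0.3, 1.5, 0.0, -0.5, 0.2]

output, used = moe_forward(3.0, experts, gate_scores, k=2)
print(used)  # [1, 4]: only two of the eight experts did any work
```

The other six experts are never called, which is where the compute saving comes from: the model's total size grows with the number of experts, but the work per input grows only with `k`.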

What's the result? The DeepSeek-V2 model has 236 billion parameters in total, but it activates only 21 billion of them for each token, less than 9% of the total. Yet its performance is billed as comparable to GPT-4, without running the full model on every token. What does this mean? AI capability is decoupling from the computing power it consumes!
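The arithmetic behind the "less than 9%" figure is easy to check. The sketch below uses the two parameter counts quoted above and one assumption that is not from the article: the common rule of thumb that a forward pass costs roughly 2 FLOPs per parameter used, which ignores attention costs that grow with context length.

```python
TOTAL_PARAMS = 236e9   # DeepSeek-V2 total parameters (from the article)
ACTIVE_PARAMS = 21e9   # parameters activated per token (from the article)

def forward_flops_per_token(params):
    """Rule-of-thumb forward-pass cost: about 2 FLOPs per parameter used."""
    return 2 * params

active_fraction = ACTIVE_PARAMS / TOTAL_PARAMS
dense_flops = forward_flops_per_token(TOTAL_PARAMS)    # if every parameter ran
sparse_flops = forward_flops_per_token(ACTIVE_PARAMS)  # with MoE routing

print(f"active fraction: {active_fraction:.1%}")  # 8.9%
print(f"per-token compute vs. a same-size dense model: {sparse_flops / dense_flops:.1%}")
```

Under this approximation the compute ratio equals the active fraction, so the comparison is against a hypothetical dense model of the same total size, not against any specific competitor.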

In the past, we all assumed that the stronger the AI, the more GPUs it would need. Now, DeepSeek shows us that through clever algorithms, the same effect can be achieved at one-tenth the cost. This puts a huge question mark over how indispensable NVIDIA's GPUs really are.

The second path is the hardware "lane change" revolution.

AI work is divided into two phases: training and inference. Training is like going to school, requiring reading countless books (data); here, GPUs with their "brute force" parallel computing are indeed useful. But inference is like our daily use of AI, where response speed is more critical.

GPUs have an inherent weakness in inference: their memory (HBM) sits outside the compute die, and moving data back and forth adds latency. It's like a chef whose ingredients are in a fridge in the next room; every time they cook, they have to run over to get them, and no matter how fast they run, it can't be instant. Companies like Cerebras and Groq started from scratch, designing dedicated inference chips that build the memory (SRAM) directly into the chip, putting the ingredients right at hand and achieving what is pitched as "zero latency" access.
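Why memory placement matters can be shown with a back-of-the-envelope bound. When generating one token at a time, the chip has to read the active weights from memory for every token, so speed is capped at roughly bandwidth divided by bytes read per token. The sketch below is illustrative only: the bandwidth figures and the 8-bit weight size are assumptions chosen to show the scaling, not measurements of any real GPU, Cerebras, or Groq part.

```python
def decode_tokens_per_second(active_params, bytes_per_param, bandwidth_bytes_per_s):
    """Upper bound on single-stream decode speed when weight reads dominate."""
    bytes_per_token = active_params * bytes_per_param
    return bandwidth_bytes_per_s / bytes_per_token

ACTIVE_PARAMS = 21e9        # per-token active parameters (from the article)
BYTES_PER_PARAM = 1         # assumption: 8-bit weights

# Hypothetical bandwidth figures, chosen only to show the scaling.
OFF_CHIP_BANDWIDTH = 3e12   # assumed 3 TB/s for external HBM-class memory
ON_CHIP_BANDWIDTH = 300e12  # assumed 100x higher for on-chip SRAM

off_chip = decode_tokens_per_second(ACTIVE_PARAMS, BYTES_PER_PARAM, OFF_CHIP_BANDWIDTH)
on_chip = decode_tokens_per_second(ACTIVE_PARAMS, BYTES_PER_PARAM, ON_CHIP_BANDWIDTH)

print(f"off-chip bound: {off_chip:.0f} tokens/s")
print(f"on-chip bound:  {on_chip:.0f} tokens/s")
print(f"speedup from bandwidth alone: {on_chip / off_chip:.0f}x")
```

The bound scales linearly with bandwidth and inversely with active parameters, which is why the two paths in this article compound. It is a ceiling for a single request; batching many requests changes the picture, so real-world gaps will differ.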

The market has voted with real money. OpenAI, while complaining about NVIDIA's GPU inference performance, turned around and signed a $10 billion deal with Cerebras specifically to rent their inference services. NVIDIA itself panicked, putting a reported $20 billion into a deal for Groq, precisely so as not to fall behind in this new race.

When the Two Paths Converge: A Cost Avalanche

Now, let's put these two things together: run an algorithmically "slimmed-down" DeepSeek model on a hardware platform with "zero latency" like a Cerebras chip.

What happens?

A cost avalanche.

First, the slimmed-down model is small enough to be loaded entirely into the chip's built-in memory at once. Second, without the external memory bottleneck, the AI's response speed becomes astonishingly fast. The final result: training costs drop by 90% due to the MoE architecture, and inference costs drop by another order of magnitude due to specialized hardware and sparse computation. In total, the cost of owning and operating a world-class AI could be just 10%-15% of the traditional GPU solution.
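How the two savings combine into a single total depends on how the budget splits between training and inference. The sketch below is a hypothetical weighting exercise: the 90% training reduction is the article's claim, while the budget split (one part training to ten parts inference) and the inference ratios tried are assumptions used to see what it takes to land in the 10%-15% range.

```python
def blended_cost_ratio(training_share, training_ratio, inference_ratio):
    """Total new cost as a fraction of the old total.

    training_share:  fraction of the old budget spent on training (0..1)
    training_ratio:  new training cost / old training cost
    inference_ratio: new inference cost / old inference cost
    """
    if not 0.0 <= training_share <= 1.0:
        raise ValueError("training_share must be between 0 and 1")
    inference_share = 1.0 - training_share
    return training_share * training_ratio + inference_share * inference_ratio

TRAINING_RATIO = 0.10     # article's claim: training costs drop by 90%
TRAINING_SHARE = 1 / 11   # assumption: inference spend is ten times training spend

# Hypothetical inference outcomes: a full 10x drop, and a more modest one.
for inference_ratio in (0.10, 0.15):
    total = blended_cost_ratio(TRAINING_SHARE, TRAINING_RATIO, inference_ratio)
    print(f"inference at {inference_ratio:.0%} of old cost -> total at {total:.1%}")
```

With these inputs the total comes out at 10.0% and about 14.5%, so the headline range is consistent with the stated reductions as long as inference dominates the budget. Because the inference share is so large in this split, the total tracks the inference ratio closely, and the training saving moves it very little.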

This isn't an improvement; it's a paradigm shift.

The Carpet is Being Pulled from Under NVIDIA's Throne

Now you should understand why this is more fatal than the "Cowork panic."

NVIDIA's multi-trillion dollar valuation today is built on a simple story: AI is the future, and the future of AI depends on my GPUs. But now, the foundation of that story is being shaken.

In the training market, even if NVIDIA maintains its monopoly, if customers can do the work with one-tenth the GPUs, the overall size of this market could shrink significantly.

In the inference market, a pie ten times larger than training, NVIDIA not only lacks an absolute advantage but also faces a siege from players like Google and Cerebras. Even OpenAI, one of its most prominent customers, is starting to look elsewhere.

Once Wall Street realizes that NVIDIA's "shovels" are no longer the only, or even the best, option, what will happen to the valuation built on the expectation of "permanent monopoly"? I think we all know.

Therefore, the biggest black swan in the next six months might not be which AI application has killed what, but a seemingly insignificant piece of tech news: for example, a new paper on the efficiency of MoE algorithms, or a report showing a significant increase in market share for dedicated inference chips, quietly announcing a new phase in the computing power war.

When the "shovel seller's" shovels are no longer the only choice, his golden age may well be over.

Related Questions

Q: What is the core argument of the article regarding the next major shift in AI?

A: The article argues that the next major shift in AI is not the threat of AI applications killing traditional SaaS, but rather a 'silent revolution' in the computational power (compute) that underpins the entire AI world. This revolution, driven by algorithmic efficiency and new hardware, threatens to disrupt the dominance of companies like NVIDIA.

Q: What are the two technological paths converging to challenge NVIDIA's GPU dominance?

A: The two converging paths are: 1) The algorithmic 'slimming revolution,' exemplified by architectures like Mixture of Experts (MoE) from DeepSeek, which drastically reduces the computational power needed for a given level of performance. 2) The hardware 'lane-changing revolution,' with companies like Cerebras and Groq designing specialized inference chips that eliminate memory bottlenecks, offering vastly faster and more efficient processing than traditional GPUs.

Q: How does the Mixture of Experts (MoE) architecture, as used in DeepSeek-V2, achieve its efficiency?

A: The MoE architecture works like a company of experts. Instead of activating the entire massive model for every task, it only activates the most relevant small subset of 'experts' (a fraction of the total parameters). For example, DeepSeek-V2 has 236 billion parameters but only activates 21 billion (less than 9%) for a given task, achieving performance comparable to models that require 100% activation, thus decoupling AI capability from compute consumption.

Q: What specific market action is cited as evidence of the shift away from NVIDIA's GPUs for AI inference?

A: The article cites OpenAI's actions as key evidence: while complaining about the inefficiency of NVIDIA GPUs for inference, OpenAI signed a $10 billion deal to rent inference services from Cerebras, a company specializing in dedicated inference chips. Additionally, NVIDIA's own response, spending $20 billion to acquire Groq, is presented as a move to avoid falling behind in this new hardware paradigm.

Q: What is the potential financial impact on NVIDIA if the described compute revolution succeeds?

A: The article suggests a potential 'cost avalanche' where the total cost of owning and operating a world-class AI could drop to just 10-15% of the traditional GPU-based solution. This would severely challenge NVIDIA's business model, which is built on the premise that AI's future is dependent on its GPUs. If the market realizes NVIDIA's 'shovels' are no longer the only or best option, the 'permanent monopoly' expectation underpinning its multi-trillion dollar valuation could collapse.
