# Related Articles on Inference

The HTX news center offers the latest articles and in-depth analysis on "Inference", covering market trends, project news, technology developments, and regulatory policy in the crypto industry.

After Institutional Support and Price Surge, Revisiting the True Value of Bittensor's 128 Subnets

Looking past institutional support and the recent price surge, this article re-evaluates the real value of Bittensor's 128 subnets. Bittensor operates as a decentralized AI ecosystem where each subnet functions like an independent startup with its own token (Alpha), revenue model, and team. There are two primary ways to earn: TAO emissions (protocol subsidies based on staking inflows) and Alpha token PnL (capital gains from subnet performance). Since the Taoflow update in November 2025, subnets with negative net staking flow receive zero emissions, creating a competitive environment. Approximately 3,600 TAO (around $960k daily) is distributed, with the top 10 subnets controlling 56% of emissions. Key case studies include Chutes (SN64), which demonstrates product-market fit with 400k users and 9.1 trillion tokens processed at 85% lower cost than AWS, and Templar (SN3), which offers asymmetric upside by training frontier LLMs in a fully decentralized manner. The investment framework positions TAO as an index fund for the entire network, while Alpha staking represents concentrated bets on specific subnets. The ecosystem is attracting institutional interest, with significant holdings from DCG and Polychain Capital. The conclusion emphasizes evaluating subnets based on product utility, staking flow, team execution, organic demand, and liquidity conditions.
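The emission mechanics described above can be sketched numerically. This is an illustrative toy, not Bittensor's actual code: the subnet names, weights, and staking flows below are hypothetical, and only the headline figures (3,600 TAO ≈ $960k per day, zero emissions for negative net staking flow) come from the article.

```python
# Illustrative sketch of post-Taoflow emission splitting: subnets with
# negative net staking flow are excluded, and the daily pool is divided
# among the rest by relative weight. All subnet data here is made up.

DAILY_TAO = 3600          # total TAO distributed per day (per the article)
TAO_PRICE_USD = 267       # implied by ~$960k/day over 3,600 TAO

# (subnet, relative weight, net staking flow in TAO)
subnets = [
    ("SN64 Chutes", 0.20, +1500),
    ("SN3 Templar", 0.10, +400),
    ("SN99 Example", 0.05, -200),   # negative flow -> zero emissions
]

eligible = [(name, w) for name, w, flow in subnets if flow >= 0]
total_w = sum(w for _, w in eligible)

for name, w in eligible:
    tao = DAILY_TAO * w / total_w
    print(f"{name}: {tao:.0f} TAO (~${tao * TAO_PRICE_USD:,.0f}/day)")
```

Under these made-up weights, the excluded subnet's share is redistributed to the remaining two, which is what makes the post-Taoflow environment competitive.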

marsbit · 03/17 13:32

China's AI Computing Counterattack

Eight years after the ZTE crisis, China's AI industry is fighting back against U.S. chip restrictions. In 2018, ZTE nearly collapsed under U.S. sanctions but survived with heavy fines and oversight. Today, Chinese AI firms like DeepSeek are pivoting away from NVIDIA by developing domestic alternatives and optimizing algorithms to reduce reliance on foreign technology. DeepSeek’s V4 model will use entirely domestic chips, signaling a strategic shift toward computational independence. The real challenge isn’t just hardware—it’s NVIDIA’s CUDA ecosystem, which dominates global AI development with over 4.5 million developers. U.S. export controls have tightened since 2022, banning high-end chips like the A100, H100, and their downgraded versions. In response, Chinese companies are adopting technical workarounds like Mixture-of-Experts models, which activate only parts of the network during inference, slashing costs. DeepSeek’s API is up to 75x cheaper than competitors, driving rapid global adoption. By early 2026, Chinese models accounted for nearly 60% of API calls on OpenRouter. Domestic chips, such as Huawei’s Ascend series, are now capable of full-scale training, not just inference. Production lines in cities like Xinghua manufacture servers with homegrown processors, supporting major AI training projects. Meanwhile, the U.S. faces an electricity shortage as data centers consume growing power, while China benefits from greater energy capacity and lower costs. Chinese AI is also going global via “Token exports,” with services reaching users in India, Indonesia, and beyond. The situation echoes Japan’s semiconductor decline in the 1980s, but China is building an independent ecosystem rather than relying on global supply chains. Domestic chip firms report surging revenues but ongoing losses—reflecting the high cost of achieving true technological independence. The battle is difficult, but progress is underway.

marsbit · 03/04 05:09

The Next Earthquake in AI: Why the Real Danger Isn't the SaaS Killer, But the Computing Power Revolution?

The next seismic shift in AI isn't about SaaS disruption but a fundamental revolution in computing power. While many focus on AI applications like Claude Cowork replacing traditional software, the real transformation is happening beneath the surface: a dual revolution in algorithms and hardware that threatens NVIDIA’s dominance. First, algorithmic efficiency is advancing through architectures like MoE (Mixture of Experts), which activates only a fraction of a model’s parameters during computation. DeepSeek-V2, for example, uses just 9% of its 236 billion parameters to match GPT-4’s performance, decoupling AI capability from compute consumption and slashing training costs by up to 90%. Second, specialized inference hardware from companies like Cerebras and Groq is replacing GPUs for AI deployment. These chips integrate memory directly onto the processor, eliminating latency and drastically reducing inference costs. OpenAI’s $10 billion deal with Cerebras and NVIDIA’s acquisition of Groq signal this shift. Together, these trends could collapse the total cost of developing and running state-of-the-art AI to 10-15% of current GPU-based approaches. This paradigm shift undermines NVIDIA’s monopoly narrative and its valuation, which relies on the assumption that AI growth depends solely on its hardware. The real black swan event may not be an AI application breakthrough but a quiet technical report confirming the decline of GPU-centric compute.
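The MoE cost arithmetic cited above can be checked in a few lines. This is a back-of-envelope sketch using the figures quoted for DeepSeek-V2; the cost model (FLOPs per token roughly proportional to active parameters, ~2 FLOPs per parameter) is a standard approximation, not a measurement.

```python
# Why sparse MoE activation cuts inference cost: per token, only the
# routed experts run, so compute scales with active (not total) params.

total_params = 236e9        # DeepSeek-V2 total parameters (per the article)
active_fraction = 0.09      # ~9% of parameters active per token
active_params = total_params * active_fraction

flops_per_token_dense = 2 * total_params   # if every parameter were used
flops_per_token_moe = 2 * active_params    # only routed experts run

print(f"Active parameters: {active_params / 1e9:.1f}B")
print(f"Compute per token vs. dense: "
      f"{flops_per_token_moe / flops_per_token_dense:.0%}")
```

The ~9% figure is what "decoupling capability from compute" means in practice: a 236B-parameter model pays per-token compute closer to that of a ~21B dense model.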

marsbit · 02/12 04:38

The Next Earthquake in AI: Why the Real Danger Isn't the SaaS Killer, but the Computing Power Revolution?

The next seismic shift in AI is not the threat of "SaaS killers" but a fundamental revolution in computing power. While many focus on how AI applications like Claude Cowork are disrupting traditional software, the real transformation is happening beneath the surface—in the infrastructure that powers AI. Two converging technological paths are challenging NVIDIA’s GPU dominance: 1. **Algorithmic Efficiency**: DeepSeek’s Mixture-of-Experts (MoE) architecture allows massive models (e.g., DeepSeek-V2 with 236B parameters) to activate only a small fraction of "experts" (9%) during computation, achieving GPT-4-level performance at 10% of the computational cost. This decouples AI capability from sheer compute power. 2. **Specialized Hardware**: Inference-optimized chips from companies like Cerebras and Groq integrate memory directly onto the chip, eliminating data transfer delays. This "zero-latency" design drastically improves speed and efficiency, prompting even OpenAI to sign a $10B deal with Cerebras. Together, these advances could cause a cost collapse: training costs may drop by 90%, and inference costs could fall by an order of magnitude. The total cost of running world-class AI may plummet to 10-15% of current GPU-based solutions. This paradigm shift threatens NVIDIA’s valuation, built on the assumption of perpetual GPU dominance. If the market realizes that GPUs are no longer the only—or best—option, the foundation of NVIDIA’s trillions in market cap could crumble. The real black swan event may not be a new AI application, but a quiet technical breakthrough that reshapes the compute landscape.

marsbit · 02/11 01:58

Understanding Jensen Huang's Physical AI: Why Is Crypto's Opportunity Also Hidden in the 'Nooks and Crannies'?

Jensen Huang's recent speech at Davos signals a pivotal shift in AI: the transition from the training-focused "brute force" era of AI 1.0 to the new paradigm of "Physical AI" and inference. This marks the next phase after Generative AI, focusing on real-world application and embodiment. Physical AI aims to solve the "last-mile" problem of AI: moving from digital intelligence to physical action. While LLMs have consumed vast digital data, they lack understanding of the physical world—like how to twist open a bottle cap. Physical AI requires three core capabilities: 1. Spatial Intelligence: AI must perceive and interpret 3D environments in real-time, understanding object properties, depth, and interaction dynamics. 2. Virtual Training Grounds: Systems like NVIDIA’s Omniverse enable simulation-to-real (Sim-to-Real) training, allowing robots to learn through vast virtual iterations without costly physical failures. 3. Electronic Skin and Touch Data: Sensors that capture tactile feedback—temperature, pressure, texture—are critical. This data is a new, untapped asset class. This shift opens significant opportunities for Crypto and Web3 ecosystems. DePIN networks can crowdsource hyperlocal spatial data from "every corner" of the world through token incentives. Distributed computing networks can provide edge-based rendering and inference power for low-latency physical responses. Tokenized data ownership and privacy-preserving sharing mechanisms can enable the scalable, ethical collection of sensitive tactile data. In short, Physical AI isn’t just the next chapter for Web2—it’s a catalyst for Web3 domains like DePIN, DeData, and decentralized AI.

marsbit · 01/23 00:35

Jensen Huang Announces 8 New Products in 1.5 Hours, NVIDIA Fully Bets on AI Inference and Physical AI

NVIDIA CEO Jensen Huang unveiled eight major announcements during his CES 2026 keynote, focusing on advancing AI inference and physical AI technologies. The centerpiece was the NVIDIA Vera Rubin POD AI supercomputer, which integrates six custom chips—Vera CPU, Rubin GPU, NVLink 6 Switch, ConnectX-9 SuperNIC, BlueField-4 DPU, and Spectrum-X CPO—designed to work in concert. The Rubin GPU offers 5x higher inference and 3.5x higher training performance than Blackwell, with support for HBM4 memory. The Vera Rubin NVL72 system delivers 3.6 EFLOPS in NVFP4 inference performance in a single rack, with enhanced memory bandwidth. NVIDIA also introduced the Spectrum-X Ethernet CPO for improved power efficiency, an inference context memory storage platform to optimize KV cache storage and reduce recomputation, and the DGX SuperPOD based on Rubin architecture, cutting token costs for large MoE models to 1/10. On the software side, NVIDIA expanded its open-source offerings, including new models and datasets, and emphasized the rise of physical AI. The company open-sourced the Alpha-Mayo model for autonomous driving, enabling reasoning-based decision-making, and announced production-ready NVIDIA DRIVE platforms for Mercedes-Benz. Partnerships with Siemens and robotics firms like Boston Dynamics were highlighted, underscoring NVIDIA's full-stack approach to AI infrastructure and real-world AI applications.
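The "context memory storage" idea mentioned above is prefix KV caching: if a new request shares a prompt prefix with an earlier one, the attention keys/values for that prefix can be loaded from storage instead of recomputed. The following is a minimal toy of that bookkeeping, not NVIDIA's actual platform or API.

```python
# Toy prefix cache: track which token prefixes already have "KV state"
# stored, and count how many tokens a new request must actually compute.

cache: dict[tuple[str, ...], str] = {}   # prefix tokens -> stored "KV state"

def run_inference(tokens: list[str]) -> int:
    """Process a request, returning how many tokens had to be computed."""
    # Find the longest already-cached prefix of this request.
    best = 0
    for i in range(len(tokens), 0, -1):
        if tuple(tokens[:i]) in cache:
            best = i
            break
    # Store KV state for every new prefix (stubbed as the joined tokens).
    for i in range(best + 1, len(tokens) + 1):
        cache[tuple(tokens[:i])] = "|".join(tokens[:i])
    return len(tokens) - best

cost1 = run_inference(["system", "prompt", "user", "question-1"])
cost2 = run_inference(["system", "prompt", "user", "question-2"])
print(cost1, cost2)   # 4 1: the second request recomputes only its suffix
```

The second request reuses the three-token shared prefix, which is the recomputation saving such a platform targets, at the price of storing KV state between requests.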

marsbit · 01/06 04:36
