# Memory Related Articles

HTX News Center provides the latest articles and in-depth analysis on "Memory", covering market trends, project updates, tech developments, and regulatory policies in the crypto industry.

The Two-Day AI Bear Market Is Over: Why Did Funds Buy Back Memory Stocks First?

After a severe two-day selloff in early June that erased over $1 trillion from U.S. chip stock market value, capital is flowing back first to the memory sector. The correction was not driven by a collapse in AI demand but rather a market reassessment of high expectations. Stocks like Broadcom faced selling pressure despite strong AI revenue guidance, signaling a shift in focus from who has an "AI story" to who can most rapidly translate AI demand into verifiable profits and earnings per share (EPS). Memory companies, such as Micron and SK Hynix, are leading the recovery because their EPS growth is more immediately verifiable. The AI server boom directly increases demand for high-bandwidth memory (HBM) and high-capacity server DRAM, tightening supply and driving up contract prices for conventional DRAM and NAND Flash. This price increase, coupled with a shift to higher-margin products, flows directly into near-term revenue and profitability, as evidenced in recent earnings reports. In contrast, other AI semiconductor segments like GPUs, ASICs, and optical modules, while central to the long-term AI infrastructure story, face longer and less certain paths to EPS validation. Their growth depends more on future product cycles, customer adoption timelines, and capital expenditure plans. The rebound in memory stocks highlights a market preference for assets with shorter, more transparent EPS conversion cycles following the recent de-risking phase. However, this does not negate the potential of other AI hardware segments should they provide clearer near-term order visibility. The episode has raised the validation bar for all AI-related investments.

marsbit 06/09 07:57

Jensen Huang Dramatically 'Saves' South Korean Stock Market

In early June, South Korea's stock market experienced a sharp decline, with the KOSPI index dropping over 5% and triggering a trading halt. Amid this volatility, NVIDIA CEO Jensen Huang's visit to Seoul provided a dramatic boost to market sentiment. During his trip, Huang held a dinner meeting with SK Group Chairman Chey Tae-won and SK Hynix CEO Kwak Noh-Jung. He announced that NVIDIA's new Vera CPU would utilize SK Hynix DRAM and confirmed a multi-year technical collaboration between the two companies. This partnership aims to co-develop next-generation memory for NVIDIA's AI infrastructure roadmap, covering products from data center supercomputers to personal AI devices. Huang also publicly commented that AI company stocks were attractively priced. A key announcement was that NVIDIA's upcoming Vera Rubin AI supercomputer systems will use HBM4 memory, with supply qualifications granted to all three major suppliers: SK Hynix, Samsung Electronics, and Micron Technology. Despite this multi-sourcing strategy, Huang warned that the industry-wide chip shortage, affecting everything from wafers to packaging, is expected to persist for several years due to relentless demand from global AI factory construction. The collaboration extends beyond memory supply. SK Hynix will employ NVIDIA's AI platforms and Omniverse digital twin technology to enhance its own semiconductor design, simulation, and manufacturing processes, aiming for more autonomous factory operations. This visit builds upon a prior October 2025 agreement for SK Group to build a large-scale AI data center using over 50,000 NVIDIA GPUs. Huang's itinerary also included meetings with other Korean giants like Hyundai, LG, and Samsung, indicating NVIDIA's broader strategy to deepen ties with South Korea's tech industry.

ChainCatcher 06/08 15:45

Crossing the 'Memory Wall': The Wafer-Level Revolution and Computing Power Routes in the AI Inference Era

In 2026, a historic shift occurred in AI as major cloud providers' inference spending surpassed training spending for the first time, signaling a move from "building large models" to "using large models." This shifts the core challenge from computing power to the "memory wall": the bottleneck of data movement (model weights, activations, KV Cache) between external DRAM and processors, where the energy and latency of data transfer far exceed those of computation itself. Companies like Nvidia face GPU idle time due to bandwidth limits. In contrast, Cerebras Systems adopts a radical "wafer-scale" approach with its Wafer-Scale Engine (WSE). Instead of cutting a silicon wafer into many chips, Cerebras uses almost the entire wafer as one massive chip (WSE-3). This design provides 44GB of on-chip SRAM, delivering memory bandwidth thousands of times higher than traditional HBM (e.g., 21 PB/s vs. Nvidia B200). For LLM inference, weights are streamed layer-by-layer from external MemoryX storage to the chip, avoiding HBM bottlenecks. This results in token generation speeds 1.5–5 times faster than Nvidia's B200 in some models and significant advantages in first-token latency and long-context tasks. Additionally, Cerebras's architecture offers much lower interconnect power consumption (0.15 pJ/bit vs. GPU's ~10 pJ/bit). However, Cerebras faces challenges: SRAM scaling has slowed with advanced nodes, limiting future capacity gains; the chip requires specialized liquid cooling and custom software stacks; and its external I/O bandwidth (150 GB/s) is low compared to NVLink, hindering multi-system scaling for very large models.

Competition is intensifying. Major players are pursuing three paths: 1) Developing proprietary inference ASICs (e.g., Google TPU, Microsoft Maia), 2) Leveraging advanced packaging (e.g., TSMC's SoW) to democratize wafer-scale-like integration, potentially eroding Cerebras's process advantage within a few years, and 3) Exploring optical interconnects for ultimate bandwidth. Commercially, Cerebras is transitioning from a hardware vendor to a service provider, facing the immense challenge of building high-power, specialized data centers to meet large contracts (e.g., 250MW/year from 2026–2028). In conclusion, the AI inference era presents a fundamental architectural trade-off. Cerebras opts for extreme physical optimization for low-latency, single-task performance, while Nvidia prioritizes versatility and massive cluster throughput. The path forward remains uncertain, with technology and business models still evolving in the race toward advanced AI.
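To make the "memory wall" argument concrete, the sketch below works through the interconnect energy figures quoted in the summary (0.15 pJ/bit vs. roughly 10 pJ/bit) and a crude bandwidth-bound ceiling on token rate. The 1 GB payload and the 10 GB-per-token traffic figure are hypothetical values chosen only for illustration, and the ceiling ignores compute time, capacity limits, and batching.

```python
PJ_TO_J = 1e-12  # joules per picojoule


def transfer_energy_joules(num_bytes: float, pj_per_bit: float) -> float:
    """Energy needed to move num_bytes over a link that costs pj_per_bit per bit."""
    return num_bytes * 8 * pj_per_bit * PJ_TO_J


def bandwidth_bound_tokens_per_s(bytes_per_token: float,
                                 bandwidth_bytes_per_s: float) -> float:
    """Upper bound on decode rate if each token requires moving bytes_per_token."""
    if bytes_per_token <= 0:
        raise ValueError("bytes_per_token must be positive")
    return bandwidth_bytes_per_s / bytes_per_token


# Per-bit energy figures quoted in the summary.
WAFER_PJ_PER_BIT = 0.15
GPU_PJ_PER_BIT = 10.0

# Hypothetical payload: 1 GB of weights or activations moved once.
payload_bytes = 1e9
wafer_j = transfer_energy_joules(payload_bytes, WAFER_PJ_PER_BIT)
gpu_j = transfer_energy_joules(payload_bytes, GPU_PJ_PER_BIT)
print(f"wafer-scale: {wafer_j:.4f} J, GPU interconnect: {gpu_j:.4f} J")
print(f"energy ratio: {gpu_j / wafer_j:.1f}x")

# Hypothetical traffic of 10 GB per generated token, against the 21 PB/s
# on-chip bandwidth quoted in the summary.
ceiling = bandwidth_bound_tokens_per_s(10e9, 21e15)
print(f"bandwidth-bound ceiling: {ceiling:.0f} tokens/s")
```

The point of the second function is only that, once data movement dominates, token rate scales with bandwidth divided by bytes moved per token, which is why both architectures in the article compete on where the weights and KV Cache live.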

marsbit 06/05 11:07

55TB to 28TB? The Rumor and Panic Behind Rubin's Memory Being Halved

On June 4th, a report from SemiAnalysis suggested NVIDIA's next-gen Vera Rubin NVL72 AI rack may ship with roughly 28TB of SOCAMM DRAM per rack instead of the anticipated 55TB, primarily using 96GB modules. This sparked a market panic, causing Micron's stock to drop over 10% on fears of halved memory demand. However, the article argues this panic is misguided for several key reasons. First, SOCAMM modules are socketed and upgradeable, not soldered, so a lower initial configuration does not mean permanent demand loss. Second, the primary driver is a severe 2026 LPDDR5X supply shortage, not diminished need; NVIDIA is likely prioritizing rack shipments with available components. Third, with total LPDDR5X supply fixed, using less per rack could allow NVIDIA to ship *more* racks, which would not necessarily reduce overall memory orders. Micron's sharp drop was also attributed to a broader semiconductor sell-off triggered by Broadcom's earnings, with the SemiAnalysis report providing a convenient narrative for profit-taking after Micron's massive rally. In summary: the report on lower default configurations is likely accurate, but interpreting it as a demand collapse is wrong. The real risk for Micron lies in its reportedly minimal HBM4 share for Rubin, not in potentially flexible SOCAMM demand. The sell-off appears more like a correction amplified by coinciding negative catalysts.
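The third argument above is simple arithmetic, sketched here under stated assumptions. The total supply figure of 154,000 TB is hypothetical (picked so the division comes out even); only the 55TB and 28TB per-rack configurations come from the summary.

```python
def racks_shippable(total_supply_tb: int, tb_per_rack: int) -> int:
    """Number of complete racks that a fixed memory supply can populate."""
    if tb_per_rack <= 0:
        raise ValueError("tb_per_rack must be positive")
    return total_supply_tb // tb_per_rack


# Hypothetical fixed LPDDR5X supply for the year, in TB (not from the article).
SUPPLY_TB = 154_000

for config_tb in (55, 28):
    racks = racks_shippable(SUPPLY_TB, config_tb)
    consumed = racks * config_tb
    print(f"{config_tb} TB/rack -> {racks} racks, {consumed} TB consumed")
```

With supply as the binding constraint, both configurations consume the same total memory; the smaller configuration simply spreads it over about twice as many racks, which is why a lower per-rack figure need not imply lower memory orders.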

marsbit 06/05 01:15

Can DeepSeek Save China One Trillion Dollars?

The article examines whether DeepSeek's AI optimization breakthroughs could save China $1 trillion in future AI infrastructure costs. The analysis begins with Nvidia's upcoming Vera Rubin AI platform, costing ~$7.8 million, where memory (HBM4/LPDDR5X) constitutes $2 million, a 435% cost increase in one year, highlighting how AI hardware spending is shifting toward expensive memory components. DeepSeek's approach works in the opposite direction. Through three key technical innovations showcased in DeepSeek V4, the company dramatically improves hardware efficiency:

1. **Memory Compression (MLA)**: Re-engineers the attention mechanism to compress long-context memory (KV Cache) by over 90%, drastically reducing expensive HBM usage.
2. **Selective Activation (MoE)**: Employs a Mixture-of-Experts architecture where only a small fraction of parameters (e.g., 49B out of 1.6T in V4-Pro) are activated per token, allowing most parameters to reside in cheaper memory/SSD.
3. **Computation Caching**: Reuses previously computed results via cache hits, replacing expensive GPU computations with cheap memory reads.

Combined, these optimizations allow the same hardware to produce approximately 4x as many tokens, effectively reducing required hardware investment by 75%. DeepSeek's pricing reflects this: a 10-billion token workload costs ~$522 monthly versus ~$9,000-$10,000 for competitors. The $1 trillion savings projection stems from McKinsey's estimate that global AI infrastructure will require ~$5.2 trillion investment by 2030. As China's daily token consumption grows toward quadrillions, even marginal efficiency gains scale massively. With a conservative 4x throughput improvement, China could avoid building tens of thousands of AI data centers, equivalent to ~7 trillion RMB ($1 trillion) in saved investment.

Critically, this strategy shifts dependency from scarce, expensive GPU/HBM, where China lags, toward more accessible storage, caching, and systems engineering where domestic suppliers like CXMT are gaining strength. Rather than "replacing Nvidia," DeepSeek rebalances AI's value chain away from monolithic hardware dependency. Ultimately, DeepSeek's technical breakthroughs could lower the barrier to AI adoption across Chinese industries by making advanced capabilities affordable at scale, transforming who can access next-generation AI.
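The efficiency figures in this summary can be checked with a few lines of arithmetic. The sketch below uses only numbers quoted above (4x throughput, 49B of 1.6T parameters, ~$522 vs. ~$9,000-$10,000); the function names are illustrative, and the model assumes hardware need scales inversely with throughput, which is a simplification.

```python
def hardware_reduction(throughput_multiplier: float) -> float:
    """Fraction of hardware saved for the same token output,
    assuming hardware need is inversely proportional to throughput."""
    if throughput_multiplier <= 0:
        raise ValueError("throughput multiplier must be positive")
    return 1.0 - 1.0 / throughput_multiplier


def active_fraction(active_params: float, total_params: float) -> float:
    """Share of MoE parameters activated per token."""
    return active_params / total_params


def price_ratio(competitor_cost: float, own_cost: float) -> float:
    """How many times more the competitor workload costs."""
    return competitor_cost / own_cost


# 4x throughput implies 1 - 1/4 of the hardware is no longer needed.
print(f"hardware saved at 4x throughput: {hardware_reduction(4):.0%}")

# 49B active parameters out of 1.6T total (V4-Pro figures from the summary).
print(f"active parameter share: {active_fraction(49e9, 1.6e12):.2%}")

# Monthly cost for the 10-billion-token workload quoted in the summary.
for competitor in (9_000, 10_000):
    print(f"${competitor} vs $522: {price_ratio(competitor, 522):.1f}x")
```

Note that the quoted price gap (well above 4x) is larger than the 4x throughput gain alone would explain, so pricing presumably reflects factors beyond the three optimizations listed.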

marsbit 06/03 00:47

AI Competition's New Battlefield: Long-term Memory Becomes the Pain Point, How Users Can Secure Their Own Context Ownership

A new front is emerging in the AI competition: user ownership of long-term memory and context. As AI models like ChatGPT evolve from chat tools into persistent digital assistants that learn user preferences and workflows, a critical question arises: who owns this accumulated "memory"? Currently, this personalized data is siloed within each platform (e.g., OpenAI, Anthropic, Google), creating a fragmented experience when users switch models. The article highlights ZetaChain's strategic pivot from blockchain interoperability to addressing this AI "memory" challenge. Its new focus is on building a "Private Memory Layer" and an "AI Consumer Layer." Through its consumer product Anuma, ZetaChain aims to give users encrypted, portable memory that can be used across different AI models. This system also envisions programmable, auditable permissions for AI agents and a framework where user knowledge can be monetized as shareable assets. Ultimately, ZetaChain's transformation reflects a broader infrastructure shift. The future bottleneck is less about raw model capability and more about continuous context, user-controlled identity, and permission management across multiple collaborating AI agents. The company's ZETA token is being repositioned as an "AI infrastructure token" to facilitate access, payments, and permissions within this proposed ecosystem. The core narrative advocates for returning control of personal context and AI relationships to users, rather than leaving them locked within proprietary platforms.

marsbit 06/02 04:30
