AI PCs Are Here, Going Toe-to-Toe with 120B Models Locally! NVIDIA Redefines the "Personal AI Computer" Foundation with RTX Spark

marsbit, published 2026-06-01, updated 2026-06-01

Introduction

NVIDIA has redefined the "AI PC" standard with the launch of the RTX Spark superchip at GTC 2026. Boasting 1 petaflop (1000 TOPS) of AI performance, it dwarfs the 45-50 TOPS NPUs in current AI PCs. The SoC features a Blackwell GPU, a 20-core Arm CPU co-designed with MediaTek, and crucially, up to 128GB of unified memory shared between CPU and GPU. This architectural shift enables local execution of 120-billion-parameter large language models with million-token context windows, a massive leap from the 9B-40B models typical on current consumer hardware. Beyond AI, use cases include 12K video editing and high-fps ray-traced gaming. Key to enterprise adoption is a security collaboration with Microsoft. Windows security is upgraded, and NVIDIA's OpenShell sandbox runtime is integrated to safely contain AI agent actions. Major software support comes from Adobe, which announced a deep, low-level rewrite of Photoshop and Premiere to leverage the unified memory for up to 2x performance gains. Six OEMs, including Dell, HP, Lenovo, and Microsoft Surface, will release RTX Spark-based thin-and-light laptops and compact desktops this fall. However, questions remain about real-world performance, power consumption, thermal management in laptops, pricing, and the actual impact of the OpenShell sandbox. The RTX Spark represents a fundamental power shift in the PC industry, moving from an x86 CPU-centric model to a GPU-centric SoC platform, but its ultimate success hinges on the upcoming product rollouts and ecosystem validatio...

For the past two years, PC manufacturers have repeatedly mentioned one parameter when promoting "AI PCs": NPU performance. Whether it's Intel Lunar Lake's 45 TOPS or AMD Strix Point's 50 TOPS, these numbers have consistently remained at a relatively modest level. They can handle background blur, voice noise reduction, and run some small-scale on-device models—but that's about it.

On May 31st, at the GTC 2026 conference, NVIDIA unveiled the RTX Spark superchip, raising this figure to 1 petaflop, or 1000 TOPS. This isn't a 30% or 50% improvement; it is a roughly 20-fold jump, more than a full order of magnitude.

Several other key developments were announced alongside it. Microsoft upgraded Windows' native security mechanisms in coordination with RTX Spark and integrated NVIDIA's open-source sandbox runtime, OpenShell, into the Windows platform. Adobe announced a ground-up redesign of Photoshop and Premiere to take advantage of RTX Spark's unified memory architecture. Six initial OEMs confirmed they will launch thin-and-light laptops and compact desktops featuring the chip this fall.

What NVIDIA is doing at this GTC isn't just releasing a new chip. It is attempting to set a new hardware standard for the "Personal AI Computer" category.

When the GPU Becomes the Star of the PC

First, let's examine the chip itself. According to data NVIDIA revealed at GTC, RTX Spark integrates a Blackwell architecture GPU with 6144 CUDA cores, paired with a 20-core Arm architecture Grace CPU jointly designed with MediaTek, manufactured using TSMC's 3nm process. The key change lies in the memory architecture: up to 128GB of unified memory, where the CPU and GPU share a single memory pool, eliminating the need to move data back and forth between the two.

This is the opposite of traditional PC architecture logic.

The fundamental structure of a traditional PC is "x86 CPU as the main processor, with a discrete GPU as an optional component." Even with the rise of the AI PC concept in recent years, the approach by Intel and AMD has been to embed an NPU within the CPU as an add-on module for AI acceleration, typically offering performance in the range of 40-50 TOPS. The GPU remains "external."

RTX Spark reverses those roles. The SoC makes the GPU the protagonist and relegates the CPU to a supporting role. NVIDIA claims AI performance of 1 petaflop at FP4 precision, equivalent to 1000 TOPS, which is 20 to 22 times the 45-50 TOPS of the NPUs built into the previous generation of AI PCs. This isn't just speeding up on the same track; it's starting the race on an entirely different one.
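The "20 times" comparison can be checked with simple arithmetic. The sketch below uses only the headline figures quoted in this article. Note that NPU TOPS ratings are typically quoted at INT8 while NVIDIA's figure is at FP4, so the ratio is a nominal comparison of vendor numbers rather than a like-for-like benchmark.

```python
# Headline figures quoted in the article (vendor claims, not measurements).
RTX_SPARK_TOPS = 1000  # 1 petaflop at FP4, as claimed by NVIDIA
NPU_TOPS = {"Intel Lunar Lake": 45, "AMD Strix Point": 50}


def speedup(new_tops: float, old_tops: float) -> float:
    """Return how many times larger new_tops is than old_tops."""
    return new_tops / old_tops


ratios = {name: speedup(RTX_SPARK_TOPS, tops) for name, tops in NPU_TOPS.items()}

for name, ratio in ratios.items():
    print(f"RTX Spark vs {name}: {ratio:.1f}x")
```

Against the 45 TOPS part the ratio is about 22x, and against the 50 TOPS part it is 20x.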

The rapid response from OEMs confirms this assessment. According to NVIDIA's official announcement and subsequent reports from DIGITIMES, Asus, Dell, HP, Lenovo, Microsoft Surface, and MSI will launch thin-and-light laptops and compact desktops powered by RTX Spark this fall, with models from Acer and Gigabyte to follow. Virtually all major Windows PC brands have joined the fray.

RTX Spark isn't a product born from nothing. In early 2025, the same core Blackwell + Grace chip was introduced as Project DIGITS and DGX Spark, but it was positioned then as a Linux desktop supercomputer for developers, roughly the size of a small desktop PC. A year later, this architecture has been squeezed into the thermal envelope of a thin-and-light laptop, the operating system switched from Linux to Windows, and the target audience expanded from AI developers to general consumers and enterprise users. This is the most noteworthy change in the consumer-facing announcements at GTC 2026: NVIDIA isn't releasing a developer toy; it's pushing open the door to the consumer market.

Running a 120B Model Locally—Is It Enough?

The numbers for performance and memory ultimately need to answer one question: What can you do with it?

The answer NVIDIA gave at the launch is that RTX Spark can run a 120B-parameter large language model locally, with a context window of up to 1 million tokens. What does 120B mean? For reference, the current mainstream practice for running local models on consumer hardware is to use a quantized 30B to 40B parameter model on an RTX 4090 with 24GB of VRAM. Smaller models that run quickly on consumer GPUs are in the 9B range. Jumping from 9B to 120B redefines what counts as "sufficient" for on-device AI.

The 128GB unified memory is the prerequisite for all this. In traditional PC architectures, the CPU has its own system memory, and the GPU has its own VRAM, with a physical boundary between them. A large model exceeding the VRAM capacity either won't run at all or requires complex model partitioning and memory swapping, causing a drastic slowdown. The unified memory architecture eliminates this bottleneck, allowing model data to reside directly in the shared 128GB pool accessible to both the CPU and GPU. Apple first demonstrated the consumer viability of this technical path with Apple Silicon; now NVIDIA is bringing it to the Windows camp.
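A rough back-of-the-envelope estimate shows why memory capacity is the gating factor. The sketch below counts only the storage needed for model weights at a given precision. It ignores the KV cache, activations, and runtime overhead, all of which add to the real footprint, and the helper name `weight_gb` is illustrative rather than taken from any library.

```python
def weight_gb(params_billion: float, bits_per_param: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    Weights only: KV cache, activations, and runtime overhead are excluded.
    """
    return params_billion * bits_per_param / 8


VRAM_GB = 24      # RTX 4090, as cited in the article
UNIFIED_GB = 128  # RTX Spark maximum, as cited in the article

for params in (9, 40, 120):
    for bits in (16, 8, 4):
        gb = weight_gb(params, bits)
        print(
            f"{params:>4}B @ {bits:>2}-bit: {gb:6.1f} GB | "
            f"fits 24 GB: {gb <= VRAM_GB} | fits 128 GB: {gb <= UNIFIED_GB}"
        )
```

Under these assumptions, a 40B model at 4 bits needs about 20 GB and just fits in 24 GB of VRAM, while a 120B model at 4 bits needs about 60 GB, which exceeds any 24 GB card but fits comfortably in a 128 GB pool.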

Beyond large model inference, NVIDIA listed use cases including 12K video editing, 3D scene rendering exceeding 90GB, and ray-traced gaming at 1440p resolution with over 100 fps. The common characteristic of these scenarios is the extremely large volume of data processed in a single operation, where traditional PCs either require wait times many times longer than the processing time itself or simply cannot handle the task at all.

There remains a gap between "supports running" and "runs fluidly." NVIDIA did not disclose the actual inference speed for a 120B model on RTX Spark, nor did it provide first-token latency data for million-token contexts. A key metric determining inference speed with large models and long contexts is memory bandwidth. For reference, the DGX Spark, which uses the same GB10 chip, achieved a measured memory bandwidth of approximately 301 GB/s. That level is adequate for running a 120B model, but with context windows in the million-token range, users might need to wait several seconds to see the first output token. The notebook version of RTX Spark might see this bandwidth reduced due to power limitations.
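To see why bandwidth matters, a common rule of thumb is that token-by-token generation is limited by how fast the weights can be streamed from memory. The sketch below applies that rule to the 301 GB/s figure cited above. It assumes a dense model whose weights are all read once per generated token and ignores KV-cache traffic; mixture-of-experts models read only a fraction of their weights per token and would score higher. This is an upper-bound estimate, not a benchmark of any device.

```python
def decode_tokens_per_s(
    bandwidth_gb_s: float, active_params_billion: float, bits_per_param: float
) -> float:
    """Upper bound on generation speed when decoding is memory-bandwidth bound.

    Assumes every active weight is read exactly once per generated token.
    """
    gb_read_per_token = active_params_billion * bits_per_param / 8
    return bandwidth_gb_s / gb_read_per_token


DGX_SPARK_BANDWIDTH_GB_S = 301  # measured figure cited in the article

dense_120b_4bit = decode_tokens_per_s(DGX_SPARK_BANDWIDTH_GB_S, 120, 4)
print(f"Dense 120B at 4-bit: at most {dense_120b_4bit:.1f} tokens/s")
```

Under these assumptions the ceiling for a dense 120B model at 4 bits is about 5 tokens per second, which illustrates the distance between "supports running" and "runs fluidly."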

Adding a Safety Cage for AI Agents

Beyond raw performance, the other core announcement is the system-level collaboration between NVIDIA and Microsoft. It may be the most easily overlooked part of the GTC 2026 consumer launch, and potentially the one with the greatest impact on the industry.

When a computer capable of running a 120B model is handed to an AI agent that can autonomously operate the desktop, click buttons, and read and write files, the security question shifts from "could data be lost" to "could the agent do something you don't want it to do." Without solving this problem, enterprises cannot deploy such devices to their employees.

The solution from Microsoft and NVIDIA is a two-layer defense. First, Microsoft upgraded Windows' native security mechanisms to provide monitoring and constraints for AI agent behavior at the operating system level. Second, NVIDIA formally introduced the OpenShell runtime to the Windows platform. According to NVIDIA's official documentation, OpenShell is an open-source sandbox runtime offering kernel-level isolation. It creates a controlled operational boundary for an AI agent, within which the agent can autonomously execute tasks, but its permissions are strictly limited, preventing unauthorized access to core system files, network connections, or user-sensitive data.
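The idea of a "controlled operational boundary" can be illustrated with a toy allowlist check. The sketch below is purely hypothetical: `AgentPolicy` and `may_access` are invented names for illustration, not OpenShell's API, and a user-space path check like this is far weaker than the kernel-level isolation the article describes. It only shows the basic principle that an agent's requests are tested against an explicit policy before being allowed.

```python
import posixpath
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AgentPolicy:
    """Hypothetical policy: directories an agent may touch, plus a network flag."""

    allowed_roots: Tuple[str, ...]
    allow_network: bool = False

    def may_access(self, path: str) -> bool:
        # Normalize first so "../" segments cannot escape an allowed root.
        normalized = posixpath.normpath(path)
        return any(
            normalized == root.rstrip("/")
            or normalized.startswith(root.rstrip("/") + "/")
            for root in self.allowed_roots
        )


policy = AgentPolicy(allowed_roots=("/workspace/agent",))

for candidate in (
    "/workspace/agent/report.txt",        # inside the boundary
    "/workspace/agent/../../etc/passwd",  # traversal attempt
    "/workspace/agent2/notes.txt",        # sibling directory with a similar name
):
    print(candidate, "->", "allow" if policy.may_access(candidate) else "deny")
```

A real sandbox enforces such rules below the application, so that a misbehaving agent cannot simply skip the check.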

This combination has clear significance for enterprise procurement. Prior to this, the concept of "local AI agents" remained at the stage of technical demos. The hardware might be capable, but the security framework was non-existent. No enterprise IT department would dare to include devices in that state on their procurement list. By inserting a standardized isolation layer between hardware and application, NVIDIA and Microsoft are transforming "usable" into "manageable."

The performance overhead of OpenShell itself remains an open variable. Sandbox isolation typically incurs some performance penalty, and NVIDIA has not yet publicly quantified how much it affects inference speed or system responsiveness. Practical challenges such as deployment complexity for enterprise IT and compatibility with existing security policies will need to be validated once OEM devices reach the market.

Why Adobe Is Willing to "Redesign from the Ground Up"

The level of cooperation from software vendors is often a key indicator of whether a new hardware platform can gain a foothold.

Adobe's announcement during GTC is the most significant signal from the software side of this launch. According to confirmations from NVIDIA's official blog and Adobe executives, Adobe has initiated a ground-up redesign of Photoshop and Premiere to specifically adapt to RTX Spark's Unified Memory Architecture, claiming potential performance improvements of up to 2x for AI and graphics processing.

"Redesign from the ground up" isn't about adding a plugin or an adaptation layer. On traditional PCs, where the CPU and GPU have separate memory spaces, processing a massive PSD file or an 8K video timeline involves repeatedly moving data between the two memory pools—a major source of performance waste. RTX Spark's unified memory allows the CPU and GPU to directly share the same 128GB space. This structural change holds real value for professional creators' workflows. Adobe's willingness to alter its foundational code for this indicates it views this architectural direction as more than a one-off marketing gimmick.

However, NVIDIA and Adobe have not disclosed the baseline for this "2x acceleration" claim. Is it compared to a current-generation x86 processor paired with a discrete GPU, or to the NPU solutions in the previous generation of AI PCs? The implications are vastly different. Until the benchmark testing conditions are made public, the true value of this number remains an open question.
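To put the data-movement cost of separate memory pools in rough perspective, the sketch below estimates the time spent copying a working set between system RAM and VRAM. The 32 GB/s link speed is an assumed nominal figure in the range of a PCIe 4.0 x16 connection, and the 20 GB working set and four round trips are made-up illustrative values, not Adobe or NVIDIA measurements. Real applications overlap transfers with compute and rarely move the full working set each time.

```python
def transfer_seconds(size_gb: float, link_gb_s: float) -> float:
    """Time for a one-way copy of size_gb over a link of link_gb_s."""
    return size_gb / link_gb_s


def copy_overhead_seconds(size_gb: float, link_gb_s: float, round_trips: int) -> float:
    """Total time for round_trips copies to the GPU and back again."""
    return 2 * round_trips * transfer_seconds(size_gb, link_gb_s)


ASSUMED_LINK_GB_S = 32.0  # assumption: nominal PCIe 4.0 x16 class link
WORKING_SET_GB = 20.0     # assumption: illustrative project size

one_way = transfer_seconds(WORKING_SET_GB, ASSUMED_LINK_GB_S)
total = copy_overhead_seconds(WORKING_SET_GB, ASSUMED_LINK_GB_S, round_trips=4)
print(f"One-way copy: {one_way:.3f} s, four round trips: {total:.1f} s")
```

With unified memory this copy term disappears, which is the structural saving the redesign targets. How much of the claimed 2x it accounts for cannot be determined until the benchmark conditions are published.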

Other announced supporters include Blackmagic Design, ComfyUI, llama.cpp, OTOY, and several game developers. The follow-up from ComfyUI and llama.cpp is noteworthy because they are among the most active open-source tools in current local AI workflows. Early support from the developer community often provides a more genuine reflection of a platform's ecosystem potential than promises from large corporations.

NVIDIA is leveraging the CUDA ecosystem and unified memory architecture to build an experience akin to Apple's tight software-hardware integration within the Windows camp. The difference is that Apple built its own walled garden, while NVIDIA needs to persuade Microsoft and ISVs to build it together. Adobe's willingness to undertake a foundational redesign suggests that at least the first brick of that wall has been laid.

Beyond the Paper Specs

Returning to the most practical question: Can you actually buy these devices, and what will the experience be like in hand?

According to information released by NVIDIA, the first RTX Spark devices are scheduled to launch in the fall of this year, spanning thin-and-light laptops and compact desktops from Asus, Dell, HP, Lenovo, Microsoft Surface, and MSI. Models from Acer and Gigabyte will follow. Specific pricing and exact launch dates for all OEMs have not been announced.

More critical than pricing are several physical unknowns. How will power consumption and thermal management be balanced when squeezing a 1 petaflop chip into a thin-and-light laptop? How does RTX Spark perform in non-AI scenarios like everyday office tasks and battery life? Will the actual memory bandwidth of the 128GB unified memory in a notebook form factor be significantly reduced due to power constraints?

These questions represent the real test of industrial implementation. The peak performance of a chip in an engineering prototype and its actual performance in a consumer's hands over 8 hours a day are often two different things. NVIDIA emphasized RTX Spark's energy efficiency during the launch but did not provide specific TDP values or battery life data.

From the perspective of the PC industry landscape, the emergence of RTX Spark signals the formation of a new division of labor model. Over the past three decades, the authority over core PC chips has resided with x86 processor manufacturers. GPU makers, while increasingly important, have always been "components plugged into the motherboard." What NVIDIA is offering this time is a complete SoC, integrating everything from the CPU and GPU to the memory controller, with the Arm-based CPU portion designed in partnership with MediaTek. The power structure of the PC industry chain is shifting from "x86 CPU plus optional GPU" towards "GPU-centric SoC platforms."

This shift won't happen overnight. The OEMs' pricing strategies, the actual energy efficiency performance of the products, the adaptation progress of ISV software, and the validation cycles for enterprise customer procurement—each link will determine whether RTX Spark becomes a new benchmark for the PC industry or merely another high-profile technical demo that fails to meet expectations. The answer will have to wait at least until this fall.

Related Questions

Q: What is the key hardware specification that sets NVIDIA's RTX Spark apart from previous AI PC chips, and by what magnitude?

A: The key specification is its AI compute performance, which reaches 1 petaflop (or 1000 TOPS) at FP4 precision. This is roughly 20 times the performance of the previous generation of AI PC chips from Intel and AMD, which offered around 45-50 TOPS.

Q: What is the significance of the unified memory architecture in the RTX Spark SoC, and how much memory is available?

A: The CPU and GPU share a single, unified memory pool of up to 128GB. This eliminates the need for data to be copied back and forth between separate system RAM and GPU VRAM, which is a major bottleneck for running large AI models or processing large datasets like high-resolution video.

Q: Which major software company announced a significant commitment to the RTX Spark platform, and what did they promise to do?

A: Adobe announced a major, low-level redesign of its flagship applications Photoshop and Premiere to specifically optimize for RTX Spark's unified memory architecture, promising AI and graphics processing performance improvements of up to 2x.

Q: What are the two main security components introduced by Microsoft and NVIDIA to make local AI agents safe for enterprise use?

A: First, Microsoft is upgrading Windows' native security mechanisms to monitor and constrain AI agent behavior at the OS level. Second, NVIDIA is bringing its OpenShell sandbox runtime to Windows, which provides kernel-level isolation to strictly limit what an AI agent can do, preventing unauthorized access to core files or sensitive data.

Q: What major shift in PC industry dynamics does the RTX Spark chip represent according to the article?

A: It represents a shift in the fundamental power structure of the PC industry. For decades, the x86 CPU was the central, controlling processor. The RTX Spark, an Arm-based SoC with the GPU as the primary compute element, marks a move towards a "GPU-centric SoC platform," challenging the traditional "x86 CPU plus optional GPU" model.
