Original Author: Li Hailun, Su Yang
Original Editor: Xu Qingyang
Original Source: Tencent Technology
June 1, 2026 - NVIDIA Founder and CEO Jensen Huang delivered a keynote speech at the NVIDIA GTC Taipei conference held during COMPUTEX 2026.
It had only been three months since the last GTC.
At that time, NVIDIA announced the Vera Rubin "chip family bundle": the Vera CPU, Rubin GPU, Groq 3 LPU, ConnectX-9, BlueField-4 DPU, and Spectrum-6 switch. Together, these six chips form a rack-scale AI supercomputer. NVIDIA said the number of GPUs required to train large MoE models would fall to one-quarter, inference throughput per watt would improve 10x, and the cost per token would drop to one-tenth.
Three months later at COMPUTEX, rather than emphasizing system-level solutions such as the "chip family bundle" or "computing power family bundle," Jensen Huang turned his focus to the workload this infrastructure will serve: Agents.
In his speech, Jensen Huang revealed: Vera Rubin has officially entered mass production, Vera CPUs have begun shipping globally, DGX Station for the first time comes to enterprise desktops in a Windows form factor, Cosmos 3 redefines the perceptual framework for physical AI, and DSX becomes the operating system for AI factories. NVIDIA also partnered with Unitree to launch the H2 Plus—the first humanoid robot reference design based on Isaac GR00T, extending Agent boundaries from the digital world to physical form.
NVIDIA is reorganizing its complete technical system around the Agent ecosystem, from chips and data centers to models, software, and robotics platforms.
Jensen Huang said: "The era of Agent AI and practical artificial intelligence has arrived. Now tokens are the unit of profit, AI is the 'generator' of GDP, and the number of software engineers is increasing. People talk about AI reducing jobs; that's complete nonsense. In fact, more software engineers are being hired."
The Same AI Factory Runs 10x More Agent Tasks
The Vera Rubin platform is now in full production.
Unlike earlier platforms, which focused mainly on large model training and inference, Vera Rubin was designed from the start with Agents as a key workload.
In his speech, Jensen Huang stated that an Agent task is often not just a single model inference, but includes multiple steps such as inference, search, tool calls, code execution, and result validation, potentially involving thousands of steps behind the scenes. In the future, data centers will need to handle not just individual model requests but more of these continuous, collaborative Agent tasks.
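To make the multi-step shape of an Agent task concrete, here is a minimal sketch in Python. It is not NVIDIA's software: the `AgentTask` class, the `run_agent_task` function, and the four stand-in steps are all hypothetical, and each step simply tags a string so the control flow is visible.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class AgentTask:
    """A hypothetical Agent task: a goal plus a trace of the steps taken."""
    goal: str
    trace: list[str] = field(default_factory=list)


# Stand-in steps. A real Agent would call a model, a search index,
# an external tool, and a code sandbox here.
def infer(state: str) -> str:
    return state + " | plan"


def search(state: str) -> str:
    return state + " | docs"


def call_tool(state: str) -> str:
    return state + " | tool-output"


def execute_code(state: str) -> str:
    return state + " | exec-ok"


PIPELINE: list[tuple[str, Callable[[str], str]]] = [
    ("inference", infer),
    ("search", search),
    ("tool_call", call_tool),
    ("code_execution", execute_code),
]


def validate(state: str) -> bool:
    """Stand-in result validation: accept once code execution succeeded."""
    return state.endswith("exec-ok")


def run_agent_task(task: AgentTask, max_rounds: int = 3) -> str:
    """Run every step in order, then validate; retry up to max_rounds."""
    state = task.goal
    for _ in range(max_rounds):
        for name, step in PIPELINE:
            state = step(state)
            task.trace.append(name)
        task.trace.append("validation")
        if validate(state):
            return state
    raise RuntimeError("task did not validate within max_rounds")


task = AgentTask(goal="summarize build failures")
result = run_agent_task(task)
print(task.trace)
```

Even this toy task issues five steps for one request. A long-running Agent repeats such rounds many times, which is why the article describes thousands of steps behind a single task.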
NVIDIA describes the platform as a massive, unified AI supercomputer that acts as a single unit of compute, built specifically for Agent workloads spanning inference, retrieval, and tool use. In a hyperscale data center of the same size, running autonomous AI Agent tasks on Vera Rubin is 10 times as efficient as on the previous-generation Grace Blackwell platform.
Beyond the compute platform itself, networking has also become a focus of the Vera Rubin upgrade.
In the past, data transmission between GPUs in data centers relied primarily on traditional optical modules and switch architectures. However, as cluster sizes continue to expand, power consumption, cooling, and deployment complexity rapidly increase. To address this, NVIDIA introduced the Spectrum-X Ethernet Photonics networking system into the Vera Rubin platform.
This marks NVIDIA's first large-scale introduction of Co-Packaged Optics (CPO) technology into AI data center networks.
Simply put, traditional solutions plug optical modules into the switch from the outside, while CPO integrates the optical components directly inside the switch, reducing energy consumption and signal loss.
Security is another core capability that the Vera Rubin platform emphasizes.
To this end, NVIDIA extended Confidential Computing capabilities across the entire Vera Rubin platform. Through trusted execution environments, hardware-level verification, and end-to-end encryption mechanisms, enterprises can achieve higher levels of security assurance when handling private data, industry-sensitive information, and critical models.
Jensen Huang revealed that Vera Rubin has entered mass production. As a third-generation MGX rack-level system, it involves over 150 partners, more than 350 factories, and a supply chain covering over 30 countries and regions. According to NVIDIA's announced plan, Vera Rubin will begin official shipments this fall.
The "Born-for-Agent" Processor
NVIDIA launched Vera, a new type of processor designed specifically for the Agent era, which has entered full production.
Jensen Huang pointed out that advancements in memory systems will drive innovation and modernization in storage systems. All CPUs to date have been built for humans, but Vera is a CPU designed for the AI era, built for Agents.
As the successor to Grace, Vera adopts NVIDIA's self-designed "Olympus" CPU core architecture, increasing core count from 72 to 88 cores and significantly improving memory and data processing capabilities. According to NVIDIA, in testing with Agent-related workloads, Vera achieved task execution speeds 1.8 times faster than contemporary x86 server CPUs.
More important than pure performance gains is the change in the relationship between Vera and the Rubin GPU: Vera connects to the Rubin GPU via second-generation NVLink-C2C, with an interconnect bandwidth reaching 1.8TB/s, further reducing the overhead of data transfer between CPU and GPU during Agent operation.
Jensen Huang stated that Vera Rubin uses HBM (High-Bandwidth Memory) from Micron, SK Hynix, and Samsung, and that its supply chain is "twice" the scale of the previous-generation Blackwell's. Even so, while deploying a large Blackwell rack took two hours, the time for Vera Rubin has been cut to about five minutes.
Moving AI Factories from "Construction" to "Operation"
The DSX launched by NVIDIA this time can be understood as an "AI Factory Construction and Operation Toolkit."
In the past, building an AI data center required clients to consider servers, networking, power, cooling, facility design, and operational systems separately, with many steps relying on coordination among different suppliers. DSX aims to integrate these previously fragmented steps into a single framework, giving clients a standardized, verifiable reference plan that runs from design and simulation through construction and operation.
Jensen Huang stated at the launch event: "NVIDIA isn't just selling chips; we're providing infrastructure builders with a complete AI factory blueprint."
DSX adds two particularly important new capabilities.
The first is DSX MaxLPS. It addresses the most practical problem for AI factories: given a fixed power budget, how to deploy more GPUs and generate more tokens.
According to NVIDIA, MaxLPS, combined with liquid cooling and intra-rack power optimization, allows operators to run up to 40% more GPUs without significantly impacting performance.
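The arithmetic behind a fixed power budget can be sketched as follows. Only the "up to 40%" figure comes from NVIDIA's claim above. The 10 MW budget and the 2 kW provisioned per GPU are hypothetical numbers chosen for illustration, and the sketch does not model how MaxLPS actually achieves the gain.

```python
def gpus_within_budget(budget_kw: int, provisioned_kw_per_gpu: int) -> int:
    """How many GPUs fit if each one reserves a fixed slice of the budget."""
    return budget_kw // provisioned_kw_per_gpu


def with_uplift(baseline_gpus: int, uplift_percent: int) -> int:
    """Apply a percentage uplift using integer math to avoid rounding drift."""
    return baseline_gpus * (100 + uplift_percent) // 100


BUDGET_KW = 10_000            # hypothetical 10 MW facility budget
PROVISIONED_KW_PER_GPU = 2    # hypothetical per-GPU power reservation
MAX_UPLIFT_PERCENT = 40       # upper bound quoted by NVIDIA for MaxLPS

baseline = gpus_within_budget(BUDGET_KW, PROVISIONED_KW_PER_GPU)
expanded = with_uplift(baseline, MAX_UPLIFT_PERCENT)

# With the budget unchanged, the average power reserved per GPU must shrink.
implied_kw_per_gpu = BUDGET_KW / expanded

print(baseline, expanded, round(implied_kw_per_gpu, 3))
```

Under these assumptions the same budget hosts 7,000 GPUs instead of 5,000, which means the average power reserved per GPU has to fall to roughly 71% of the baseline (1 / 1.4).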
The second is DSX OS. It acts as the operational software for the AI factory, responsible for lifecycle management, intelligent scheduling, health monitoring, failure recovery, multi-tenancy management, and more. Simply put, if an AI factory is a complex facility, DSX OS ensures its continuous, stable operation.
Within the DSX product matrix, Reference Design provides AI factory reference designs, telling clients how to build facilities, racks, networking, power, and cooling systems; DSX Sim handles simulation, allowing clients to verify designs before construction; DSX Flex connects the AI factory to the power grid, enabling data centers to adjust tasks based on electricity prices, loads, and demand response signals; DSX Exchange is responsible for data interfaces between IT systems, operational systems, energy, and cooling systems.
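The idea of adjusting tasks based on electricity prices and load can be illustrated with a small scheduling sketch. This is not the DSX Flex interface: the `Job` class, the `plan_hour` function, the price ceiling, and the grid cap are all hypothetical, and a real system would respond to actual grid signals.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Job:
    name: str
    power_kw: int
    deferrable: bool  # True if the job can wait for a cheaper hour


def plan_hour(
    jobs: list[Job],
    price_per_kwh: float,
    price_ceiling: float,
    grid_cap_kw: int,
) -> tuple[list[str], list[str]]:
    """Split jobs into (run_now, deferred) for one hour.

    Non-deferrable jobs always run. Deferrable jobs wait when power is
    expensive or when running them would exceed the grid cap.
    """
    run_now: list[str] = []
    deferred: list[str] = []
    used_kw = 0
    expensive = price_per_kwh > price_ceiling
    # sorted() is stable, so non-deferrable jobs (False) are placed first.
    for job in sorted(jobs, key=lambda j: j.deferrable):
        over_cap = used_kw + job.power_kw > grid_cap_kw
        if job.deferrable and (expensive or over_cap):
            deferred.append(job.name)
        else:
            run_now.append(job.name)
            used_kw += job.power_kw
    return run_now, deferred


jobs = [
    Job("batch-finetune", 400, deferrable=True),
    Job("live-inference", 500, deferrable=False),
    Job("nightly-eval", 200, deferrable=True),
]

cheap_hour = plan_hour(jobs, price_per_kwh=0.08, price_ceiling=0.12, grid_cap_kw=1000)
peak_hour = plan_hour(jobs, price_per_kwh=0.20, price_ceiling=0.12, grid_cap_kw=1000)
print(cheap_hour)
print(peak_hour)
```

In the cheap hour the evaluation job is deferred only because it would exceed the cap. In the peak hour both flexible jobs wait and only live inference runs.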
On the ecosystem side, cloud partners such as CoreWeave, Crusoe, and Lambda are deploying DSX Sim, MaxLPS, and DSX OS to reduce risk and improve GPU utilization. Manufacturers including Dell, HPE, Lenovo, Supermicro, ASUS, Foxconn, GIGABYTE, and Wistron are building DSX-compatible systems.
Teaming Up with Windows and ARM
During the keynote, Jensen Huang unveiled the "DGX Station for Windows" workstation, which NVIDIA defines as a desktop-level AI supercomputer for the Windows ecosystem.
On the hardware side, it features the GB300 Grace Blackwell Ultra Desktop Superchip, which connects the Blackwell Ultra GPU and a 72-core Grace CPU via NVLink-C2C. It offers up to 748GB of unified memory, 20 PFLOPS of FP4 performance, and up to 800Gb/s of networking.
What matters most about this product is how it changes the way Agents are deployed.
NVIDIA hopes enterprises can run multiple Agents locally, securely, and manageably within a Windows environment and integrate them into workflows like design, engineering, data science, inference, and Physical AI. OpenShell, launched at the same time, handles Agent runtime security through isolated sandboxes and system-level policy control, preventing Agents from performing unauthorized operations or leaking credentials and private data.
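The kind of policy decision described here can be sketched in a few lines. This is not OpenShell's API: `SandboxPolicy`, its fields, and the secret markers are hypothetical, and the sketch shows only the policy checks. Real isolation would also need operating-system-level sandboxing.

```python
import posixpath
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class SandboxPolicy:
    """A hypothetical allowlist policy applied to every Agent action."""
    allowed_tools: frozenset[str]
    readable_roots: tuple[str, ...]
    secret_markers: tuple[str, ...] = ("API_KEY", "PASSWORD", "TOKEN")

    def check_tool(self, tool: str) -> bool:
        """Only tools on the allowlist may be invoked."""
        return tool in self.allowed_tools

    def check_read(self, path: str) -> bool:
        """Allow reads only under approved roots.

        The path is normalized first so that '..' segments cannot be
        used to escape an approved directory.
        """
        normalized = PurePosixPath(posixpath.normpath(path))
        return any(normalized.is_relative_to(root) for root in self.readable_roots)

    def redact(self, text: str) -> str:
        """Replace any output line that looks like it carries a credential."""
        lines = []
        for line in text.splitlines():
            leaked = any(marker in line.upper() for marker in self.secret_markers)
            lines.append("[REDACTED]" if leaked else line)
        return "\n".join(lines)


policy = SandboxPolicy(
    allowed_tools=frozenset({"read_file", "run_tests"}),
    readable_roots=("/workspace",),
)

print(policy.check_tool("run_tests"))                 # permitted tool
print(policy.check_tool("delete_repo"))               # not on the allowlist
print(policy.check_read("/workspace/src/main.py"))    # inside approved root
print(policy.check_read("/workspace/../etc/passwd"))  # traversal attempt
print(policy.redact("build ok\nAPI_KEY=abc123"))
```

The design choice worth noting is default deny: anything not explicitly listed is refused, so a newly added tool or directory stays off limits until someone approves it.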
Beyond products for enterprise desktops, Jensen Huang also announced the RTX Spark SoC, which integrates the N1X CPU and a Blackwell GPU into a single chip with a unified memory architecture and targets thin-and-light laptops and small form-factor desktops.
Among these, N1X is NVIDIA's first PC processor co-developed with Microsoft, based on Arm architecture, custom-designed by MediaTek, and manufactured using TSMC's 3nm process. It will debut this fall on laptops from Microsoft, Dell, HP, ASUS, Lenovo, and MSI, with over 30 models initially, targeting high-end thin-and-light notebooks.
This is NVIDIA's "super chip" prepared for the AI PC era, which Jensen Huang sees as a significant redefinition of the PC form factor.
The Agent's "Two Brains"
At this launch event, NVIDIA announced the latest progress on two core model product lines, corresponding to two scenarios for Agents: one running within enterprise systems and one running in the physical world.
NVIDIA released Nemotron 3 Ultra, a 550-billion-parameter Mixture-of-Experts (MoE) model, providing top-tier intelligence for long-running agents in code development, scientific research, and enterprise business processes. Compared to mainstream open-source frontier models of similar scale, this model offers up to 5x faster inference speed and up to 30% lower usage cost, helping agents complete tasks more efficiently and affordably.
Around the Nemotron open models, NVIDIA announced a series of software releases, open-source models, and partnerships, aiming to enable enterprises to build "digital colleagues" that assist employees in areas like engineering design, healthcare, software development, and business operations.
In this combination, Nemotron provides foundational model capabilities, NemoClaw organizes models into Agents, OpenShell handles runtime security, and Agent Toolkit transforms NVIDIA software libraries like CUDA-X into tools directly callable by Agents. Agents can use tools, call data, execute tasks, and integrate into existing enterprise systems within a controlled environment.
Jensen Huang stated that global software companies are bringing AI Agents into real work systems, enabling them to help employees complete complex tasks faster. NemoClaw provides open components needed to build long-running Agents, including orchestration, context, memory, tool calling, and security control capabilities.
In the past, enterprise discussions on AI focused more on what models could answer; now, NVIDIA aims to solve how Agents can securely integrate tools, data, and business processes and operate continuously in real work.
There's also Cosmos 3, officially launched as the third generation of the Cosmos series, representing an architectural-level redesign.
Cosmos 3 is a world foundation model for physical AI, providing the underlying ability to "understand the physical world, predict what will happen, and decide what to do."
Earlier Cosmos versions primarily targeted robotics and autonomous driving developers and focused on video generation and physical world simulation, making them essentially a single-modality generation framework. Cosmos 3 adopts a new hybrid Transformer architecture that, for the first time, unifies visual reasoning, world generation, and action prediction in a single system.
It can natively understand and generate text, images, videos, ambient sounds, and actions, achieving leading levels of physical accuracy, and is the world's first fully open, all-capable model. NVIDIA claims it has the potential to compress physical AI training and evaluation cycles from months to days.
Jensen Huang predicted that due to breakthroughs in multimodal reasoning for language, vision, and world models, the big bang of physical AI is imminent.
The Cosmos 3 series of open frontier all-capable models provides developers with generational leap capabilities for building robots, autonomous vehicles, and visual AI that can perceive, reason, plan, and act in the physical world.
Lowering the Barrier to Physical AI
NVIDIA and Unitree jointly launched the H2 Plus, a reference humanoid robot for researchers and developers.
"Reference" means that Unitree is responsible for the robot body and NVIDIA is responsible for the software and computing platform, with the two sides pre-integrating hardware and software. Development teams can start skill development immediately upon receiving it, without spending time solving underlying integration issues. It is also the world's first open humanoid robot built on the NVIDIA Isaac GR00T development platform.
This reference design targets a long-standing pain point in humanoid robot development: hardware integration, data collection, simulation, training, evaluation, and deployment each operate in silos, making the entire process highly fragmented.
NVIDIA stated that research teams receiving a robot body often spend significant time on underlying integration, delaying actual skill development. What H2 Plus attempts to do is streamline this path, allowing research teams to skip underlying integration and directly enter skill development and real-world scenario validation.
In Jensen Huang's view, humanoid robots will bring physical AI to the world's largest industries, unlocking multi-trillion-dollar economic opportunities, and H2 Plus is the starting point for pushing frontier research into real scenarios like factories, warehouses, and logistics systems.
Additionally, NVIDIA announced the official open-sourcing of a set of Physical AI Skills toolkits, covering core scenarios like robotics, autonomous driving, visual AI, and industrial digital twins.
These "Skills" can be understood as standardized ways of using NVIDIA platforms such as Cosmos, Omniverse, Isaac, and Metropolis, written as operational instructions that Agents can directly read and execute. Open-sourcing these packaged instructions forms the toolkit released this time.
When an Agent receives a task, such as generating a batch of training data for defect detection, it knows which model to call, what format to output, and how to validate the results, and it runs the entire process automatically without a human operating each stage step by step.
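A Skill of this kind can be pictured as a small declarative record that an Agent looks up and follows. The sketch below is an illustration only, not the format of NVIDIA's released toolkit: the `Skill` class, the registry, the model identifier, and the stand-in generator are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable

Sample = dict[str, str]


@dataclass(frozen=True)
class Skill:
    """A hypothetical Skill: which model to call, what to emit, how to check it."""
    name: str
    model: str
    output_format: str
    validate: Callable[[list[Sample]], bool]


def has_both_labels(samples: list[Sample]) -> bool:
    """Accept a batch only if it is non-empty and covers both classes."""
    labels = {s.get("label") for s in samples}
    return bool(samples) and labels == {"defect", "ok"}


SKILLS = {
    "defect-detection-data": Skill(
        name="defect-detection-data",
        model="hypothetical-world-model",  # placeholder, not a real model id
        output_format="jsonl",
        validate=has_both_labels,
    ),
}


def fake_generate(model: str, prompt: str, count: int) -> list[Sample]:
    """Stand-in for a model call; alternates labels so the output is checkable."""
    return [
        {"image": f"{model}-{i}.png", "label": "defect" if i % 2 else "ok"}
        for i in range(count)
    ]


def run_skill(task: str, prompt: str, count: int) -> dict:
    """Look up the Skill for a task, call its model, and validate the output."""
    skill = SKILLS[task]
    samples = fake_generate(skill.model, prompt, count)
    if not skill.validate(samples):
        raise ValueError(f"skill {skill.name!r} produced invalid output")
    return {"format": skill.output_format, "samples": samples}


batch = run_skill("defect-detection-data", "scratched metal panels", count=4)
print(batch["format"], len(batch["samples"]))
```

Because the model choice, output format, and validation rule travel with the Skill, the Agent needs only the task name to run the whole process.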
Upgrading AI Storage: From "Fast Running" to "Managed Control"
At the March GTC in San Jose, NVIDIA launched the Vera BlueField-4 STX. At that time, Jensen Huang focused on "AI-native storage architecture," with the core selling point being high-performance KV Cache storage support for Agents' long-context reasoning.
Now, NVIDIA announced the addition of a new set of security capabilities to STX, shifting the focus from "storage performance" to "storage security."
The reasoning stems from how enterprise AI usage is changing. Many enterprises are now actively deploying Agents. When Agents access enterprise systems, read and write continuously, and share information across systems without direct human supervision, questions such as who is accessing what data, whether access is authorized, and whether data is leaking become major headaches.
NVIDIA's solution is to add a layer of security on top of accelerated storage. Through unified NVIDIA DOCA security software and policies enforced in hardware directly on the BlueField-4 chip, STX-based platforms can inspect and control interactions among agents, data, and contextual memory in real time, helping enterprises achieve continuous policy enforcement across the AI data path.
Jensen Huang explained: "Agents have turned enterprise data into a real-time, living system, and this system must be protected wherever data moves, wherever context is stored, wherever Agents act. What Vera BlueField-4 STX aims to do is use inherently secure design to enforce trust at chip-level, at AI speeds."
Being "Mutual Suppliers" with TSMC
A particularly interesting point at this conference was the collaboration between NVIDIA and TSMC: TSMC is now using NVIDIA technology to improve cycle time, energy efficiency, yield, and operational productivity in its advanced wafer fabs.
For the past thirty years, the relationship between TSMC and NVIDIA had only one form: TSMC manufacturing chips for NVIDIA. But now, roles have subtly changed; NVIDIA has begun helping TSMC "manage factories."
Jensen Huang stated: "NVIDIA and TSMC have collaborated for nearly thirty years, continuously pushing the limits of computing. TSMC is bringing NVIDIA's AI and accelerated computing inside the wafer fab, using simulation, optimization, and AI to tackle the world's most complex design and manufacturing challenges, improving speed, efficiency, and yield for next-generation chips."
Their relationship has evolved from a one-way client-vendor dynamic to one of mutual interdependence.
Conclusion
Looking back at this launch event, NVIDIA is piecing together a new blueprint around "Agents."
The Vera CPU schedules tasks for Agents, Vera Rubin provides compute power for Agents, BlueField-4 STX secures data for Agents, Cosmos 3 enables Agents to understand the physical world, Nemotron, NemoClaw, and OpenShell together enable Agents to be organized, invoked, and constrained, DGX Station for Windows brings Agents to enterprise employee desktops, H2 Plus gives Agents a physical body, and DSX and Skills enable all of this to be mass-produced and deployed.
From this perspective, Jensen Huang is attempting to depict a new computing era. This echoes his opening statement: "The era of Agent AI and practical artificial intelligence has arrived."
Ultimately, Jensen Huang wanted to convey one thing this time: when Agents become AI infrastructure, NVIDIA can be present at every layer.