Over the past two years, AI hardware has essentially had one core focus: the GPU.
From large-model training to inference clusters, and from the edge to the cloud, the whole industry has been asking who can secure more GPUs and who can pack more accelerator cards into data centers. It is fair to say the AI industry has revolved around GPUs, which has also driven Nvidia's stock price to record highs.
However, at COMPUTEX 2026, Intel presented a different perspective: the next stage of AI should not focus solely on GPUs. At the core of this argument is a term that CEO Lip-Bu Tan emphasized repeatedly in his keynote: Agentic AI, or what are commonly called agents.
Image Source: Intel
Agents Are Changing the Computing Ecosystem
The difference between agents and traditional AI is actually quite significant. Traditional AI operates like a turn-based Q&A machine, while agents are meant to integrate into real-world workflows, proactively completing cycles of "thinking, planning, acting, and reflecting." In other words, they must learn to read data, call tools, execute tasks, check results, and continuously adjust their next steps based on feedback.
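The cycle described above can be sketched as a small loop. This is only a toy illustration, not Intel's or any vendor's agent framework: `AgentRun`, `run_agent`, and the two tools are hypothetical names, and a fixed plan stands in for real model calls, which are merely counted.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class AgentRun:
    """Records what one agent run did (hypothetical structure)."""
    goal: str
    steps: List[str] = field(default_factory=list)
    model_calls: int = 0
    tool_calls: int = 0


def run_agent(goal: str, tools: Dict[str, Callable[[str], str]],
              plan: List[str], max_rounds: int = 3) -> AgentRun:
    """Toy think/plan/act/reflect loop driven by a fixed plan."""
    run = AgentRun(goal=goal)
    pending = list(plan)
    for _ in range(max_rounds):
        if not pending:
            break
        # Think and plan: a real agent would call a model here; we only count it.
        run.model_calls += 1
        tool_name = pending.pop(0)
        # Act: invoke the chosen tool.
        result = tools[tool_name](goal)
        run.tool_calls += 1
        run.steps.append(f"{tool_name}: {result}")
        # Reflect: a second model call checks the result; retry the tool on failure.
        run.model_calls += 1
        if result == "error":
            pending.insert(0, tool_name)
    return run


tools = {"read_data": lambda goal: "42 rows", "write_report": lambda goal: "done"}
run = run_agent("summarize sales", tools, ["read_data", "write_report"])
print(run.model_calls, run.tool_calls)  # 4 model calls and 2 tool calls
```

Even in this toy version, a two-step task costs four model calls plus two tool calls, where a single-turn question would cost one model call.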
This means AI inference is no longer a "one-off deal" but a continuously running system that makes decisions and reasons on its own, which fundamentally changes how computing power is used. Intel's core message this time is therefore that Agentic AI will reshape the ratio of compute resources within data centers.
In frontier model training today, the CPU-to-GPU ratio can approach 1:8, with GPUs bearing the vast majority of the computational load. In agentic inference, however, CPUs have to handle task orchestration, tool invocation, data movement, and system coordination. In this scenario, the CPU-to-GPU ratio will gradually move towards 1:1, and even higher CPU density may be needed to decompose tasks quickly.
When an agent not only generates an answer but also has to keep invoking models, tools, and external systems, it behaves very differently from traditional AI. Intel cited a statistic in the presentation: compared to single-turn inference, an agent's token consumption can increase by up to 1000 times.
In other words, agents do not simply increase inference volume; they create system loads that are more complex, more frequent, and more fragmented. Pushing all of this load onto GPUs would be inefficient and expensive.
The Xeon 6+ processor launched by Intel is built on the Intel 18A process, featuring up to 288 efficiency cores and equipped with up to 576MB of L3 cache. Targeting cloud-native, Agentic AI, and network-intensive workloads, it promises higher energy efficiency and more stable sustained performance.
In Intel's proposed solution, a single liquid-cooled rack occupying 32U of compute space can provide 36,864 cores; the rack power consumption is only about 100kW, sufficient to support high-density agent deployment. While 100kW may sound daunting, compared to previous server racks with equivalent performance, power consumption has already been significantly reduced.
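As a quick sanity check on these rack figures, the arithmetic can be worked through directly. The sketch below uses only the numbers quoted above and assumes every socket is the top 288-core part; the per-core power figure divides the whole rack budget by the core count, so it includes memory, networking, and cooling overhead and is an upper bound rather than a CPU specification.

```python
CORES_PER_RACK = 36_864    # from the article
CORES_PER_CPU = 288        # Xeon 6+ maximum core count, from the article
RACK_UNITS = 32            # compute space in U, from the article
RACK_POWER_W = 100_000     # "about 100kW", from the article

# Assumption: the rack is populated entirely with 288-core processors.
cpus_per_rack = CORES_PER_RACK // CORES_PER_CPU
cpus_per_u = cpus_per_rack / RACK_UNITS
watts_per_core = RACK_POWER_W / CORES_PER_RACK

print(f"{cpus_per_rack} CPUs per rack")           # 128 CPUs per rack
print(f"{cpus_per_u:.0f} CPUs per rack unit")     # 4 CPUs per rack unit
print(f"{watts_per_core:.2f} W per core (rack-level budget)")  # 2.71
```

Under that assumption the rack holds 128 processors, or four per rack unit, at roughly 2.7 W of rack-level power per core.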
Beyond Xeon 6+, there's something even more noteworthy: Intel's restructuring of the inference architecture.
In the presentation, Intel announced a partnership with SambaNova, Vista Equity Partners, Cambium Capital, and others to officially launch a new, fully disaggregated inference solution. The solution runs on the Vector Core Compute Agent Cloud: Intel Xeon 6 processors handle orchestration and execution, SambaNova SN40 RDUs handle decoding, and NVIDIA Blackwell GPUs handle prefill.
This new architecture is designed specifically for agentic workloads. Many past AI systems offloaded most of the inference pipeline to GPUs. In this system, CPUs, RDUs, and GPUs each have a specific role (system scheduling, decoding, and prefill, respectively), so that each inference phase runs on the hardware best suited to it.
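The division of labor can be pictured as a simple stage-to-device lookup. The sketch below is illustrative only: `Stage`, `PLACEMENT`, and `place` are hypothetical names, the mapping just restates the roles given in the article, and a real disaggregated system would also have to move KV-cache state between devices, which is not modeled here.

```python
from collections import Counter
from enum import Enum
from typing import Iterable, List, Tuple


class Stage(Enum):
    ORCHESTRATE = "orchestrate"   # scheduling, tool calls, result checks
    PREFILL = "prefill"           # process the whole prompt in one pass
    DECODE = "decode"             # generate output tokens one at a time


# Stage-to-device mapping as described in the article.
PLACEMENT = {
    Stage.ORCHESTRATE: "cpu",
    Stage.PREFILL: "gpu",
    Stage.DECODE: "rdu",
}


def place(stages: Iterable[Stage]) -> List[Tuple[Stage, str]]:
    """Pair each pipeline stage with the device class assigned to it."""
    return [(stage, PLACEMENT[stage]) for stage in stages]


# One hypothetical agent turn: dispatch, run the model, then act on the result.
turn = [Stage.ORCHESTRATE, Stage.PREFILL, Stage.DECODE, Stage.ORCHESTRATE]
device_load = Counter(device for _, device in place(turn))
print(dict(device_load))
```

In this hypothetical turn the CPU is involved twice for every single prefill and decode pass, which is the intuition behind the shift in CPU-to-GPU ratio.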
After introducing Xeon 6+, Intel also brought back the recently launched 3rd Gen Core Ultra processors. They represent another link in Intel's AI ecosystem: the core of edge-side AI. The hybrid local server that Intel and Perplexity demonstrated in the presentation was built on 3rd Gen Core Ultra devices together with Xeon 6+ cloud servers.
It can dynamically allocate workloads between the local device and the cloud based on device capability and functional requirements, further reducing reliance on cloud compute. This is also the ideal form for future AI PCs: by distributing work dynamically, it lowers token costs while keeping tasks responsive and data private.
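A dispatch policy of this kind might look something like the following sketch. It is not the logic Intel and Perplexity demonstrated, which the article does not describe; the `Task` fields, the `choose_target` function, and the 4,000-token local budget are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """A unit of work to place (hypothetical fields)."""
    name: str
    est_tokens: int
    contains_private_data: bool


def choose_target(task: Task, local_token_budget: int = 4_000) -> str:
    """Return "local" or "cloud" for a task under a simple assumed policy."""
    # Privacy first: private data stays on the device regardless of size.
    if task.contains_private_data:
        return "local"
    # Small jobs run locally to save cloud token costs and round trips.
    if task.est_tokens <= local_token_budget:
        return "local"
    # Everything else goes to the cloud servers.
    return "cloud"


tasks = [
    Task("summarize inbox", 20_000, contains_private_data=True),
    Task("rename files", 500, contains_private_data=False),
    Task("deep web research", 50_000, contains_private_data=False),
]
for task in tasks:
    print(task.name, "->", choose_target(task))
```

A real scheduler would also weigh the device's current load and battery state; the point here is only that privacy and size can decide placement before any cloud tokens are spent.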
Beyond PCs, Intel is extending the 3rd Gen Core Ultra to gaming handhelds and edge computing. The newly announced Arc G3 series of processors is based on the same architecture, optimized for handheld gaming devices, and will be available later this month. For handheld gamers, this means the integrated graphics they have been waiting for is on its way.
From General-Purpose to Custom, Intel Aims to Be 'Everywhere'
Beyond general-purpose processors, Intel also emphasized custom chips this time, a business segment CEO Lip-Bu Tan has been championing since taking the helm.
Intel believes the custom chip market will be vast in the future because as AI penetrates various industries, customers will become increasingly dissatisfied with general-purpose compute power. In pursuit of higher efficiency and performance, they will gradually lean towards custom chips to maintain their competitiveness.
In the presentation, Intel cited several collaborations. One is with Google on IPUs, chips that cloud service providers rely on to improve infrastructure performance. Intel is also partnering with telecom clients such as Ericsson to supply advanced wireless infrastructure chips worldwide.
This reveals another theme of Tan's speech: Intel is no longer relying on a single general-purpose chip to win the market. Instead, it is packaging chips, systems, software, and industry partnerships into complete solutions that can be customized to the needs of different enterprises, thereby making the most of Intel's strengths.
From the perspective of Lei Technology, Intel is essentially redefining its position in the ecosystem: data centers need CPUs for agent orchestration; inference systems require heterogeneous disaggregation to reduce costs; PCs need local AI to handle privacy and compliance; edge and embodied intelligence require high-efficiency chips; and industry clients need customized chips.
By meeting the needs of enterprises across different fields and various points in the value chain, Intel aims to become even more "everywhere" than Nvidia.
Of course, the pressure on Intel remains immense. Nvidia's advantages in AI accelerators and software ecosystems are still evident, and AMD continues its offensive in server CPUs and AI chips. Whether Intel can make this strategy work ultimately depends on how quickly the 18A process reaches volume production, whether the Xeon 6+ rack-level solutions can be deployed quickly, and whether customers actually see significant benefits from the new architecture.
But at least this time, Intel's direction is clearer than before.
As AI enters the era of agents, competition is no longer just about the peak performance of a single chip. It is about how efficiently the entire computing system works together. GPUs remain important, but CPUs, edge devices, local AI, and custom chips are regaining critical importance as well.
And what Intel aims to seize is precisely this window of opportunity where AI infrastructure is undergoing a re-division of labor.











