By Sleepy
In the early hours of June 9, 2026, Beijing time, Apple's WWDC 2026 opened as scheduled.
At the event, Apple rebranded Siri as Siri AI and announced a deep collaboration with Google, using Gemini's model capabilities to train its new generation of foundation models. For the first time, it extended Private Cloud Compute to Google Cloud and Nvidia GPUs.
It released five Apple Foundation Models: the smallest, an on-device model, has 3 billion parameters, and the largest, a cloud model, is optimized specifically for Nvidia GPUs. Almost every core app was rewritten. Siri even got its own standalone app that can save conversations, sync across devices, and retain memory.
This was Apple's most information-dense keynote in recent years.
Taming a Future
Apple's AI story can be traced back to the fall of 2011, at the iPhone 4S launch event, where Siri first took the stage.
At that time, Steve Jobs was gravely ill, and Apple stood at the crossroads of an era. Siri was like a little creature that had escaped from a sci-fi movie. You could ask it about the weather, restaurants, or set an alarm. It would answer you in a slightly mechanical tone, making you feel for the first time that a phone was more than just a piece of cold glass.
Siri grew out of SRI International's CALO project, originally a military-grade AI assistant funded by DARPA. Apple acquired Siri in 2010; according to TechCrunch, the deal likely exceeded $200 million. A year later, Siri debuted with the iPhone 4S. Apple claimed it could understand natural language and handle tasks as a personal assistant.
At that moment, Apple secured the world's best entry point for personal intelligence. Then it stalled for over a decade.
Looking back today, what Siri initially changed was how people talked to machines. In 2011, the iPhone was turning mobile phones from communication tools into personal computing devices. The App Store redefined software distribution, and the mobile internet migrated from the PC desktop into the palm of your hand. Siri appeared at the crest of a rising wave. But after entering Apple, it quickly shrank from an ambitious personal assistant into an obedient voice remote control.
At its core, Apple believes in being closed and in control. But a true personal assistant must integrate with more services, understand more context, and tolerate more uncertainty. Uncertainty means errors, privacy risks, and the disorder that Apple is least equipped to handle.
Thus, Siri was only allowed to perform deterministic tasks, like a future that had been tamed. It had a name, a voice, a personality wrapper, but lacked the initiative and memory required for a genuine persona. Users were initially amazed by it, later made jokes about it, and eventually stopped using it much.
Apple was the first to put a "personal assistant" into a phone and also the first to lock it away.
Looking back, Siri in 2011 was almost a prototype of the agent the entire industry is now building. One could say Apple was the earliest company to create an agent prototype, yet it ended up being one of the last to fully realize it.
The AI That Doesn't Look Like AI
During the years Siri didn't grow up, did Apple's AI stagnate?
Quite the opposite. Apple did a lot of AI; it just didn't look like AI.
If measured by the volume of keynote announcements, Apple seemed to only start talking seriously about AI in 2024. But if you trace the technical path backward, Apple had been in motion for a decade.
In 2015, it acquired two companies in succession—one to bolster natural language dialogue, another to explore running deep learning directly on phones. That same year at WWDC, it discussed the Proactive Assistant, attempting to make the system offer suggestions before the user even asked. This idea was ahead of its time but, under the technological constraints of that period, it was more like a slogan.
The following year, it launched SiriKit, cautiously opening Siri up to developers with limited functionality. It also publicly discussed Differential Privacy, stating its intention to learn from large-scale data while protecting individual privacy. In 2017, the iPhone X brought the Neural Engine. Face ID and the camera began relying on on-device machine learning. Apple simultaneously introduced Core ML, allowing developers to run models on Apple devices, and acquired Workflow, which later became Shortcuts.
This was a very Apple-like set of answers. It wanted AI, but not by betting everything on the cloud and massive personal data like Google. It wanted developers, but didn't want Siri to become a messy stew. So Apple chose the hardest and slowest path: focus on the device, privacy, and system integration.
Around 2020, Apple bought several more companies focused on low-power edge AI and speech understanding. That same year, the M1 chip was released, bringing a 16-core Neural Engine to the Mac, pushing on-device AI compute from phones in pockets all the way to computers. The next year, Live Text and Visual Look Up landed. Text in photos could be copied directly, the camera could identify plants and flowers, and more voice requests could be processed on the device without needing the cloud.
Apple indeed didn't release a standalone AI app in these years, but it did make the phone smarter.
There were reasons for choosing this path. AI on a phone isn't just a Q&A machine. It needs to see photos, hear voices, understand contacts, launch apps, and sense battery level, location, and time. Ideally it can do some things offline, and it should not bundle up a user's life and upload it to the cloud with every request. Apple's control over its hardware put it in a position to take this path.
But between localized intelligence and holistic intelligence lies a deep chasm. Apple excels at breaking technology down into reliable components, but generative AI demands it reassemble those components into a whole.
These components quietly lay buried within the system, waiting for a catalyst.
The catalyst didn't arrive first. ChatGPT did.
When ChatGPT emerged at the end of 2022, Apple wasn't unprepared. Tim Cook repeatedly emphasized in various forums that AI and machine learning had been core technologies in Apple products for many years. In 2023, Bloomberg also reported on Apple's internal Ajax large model framework and an internal chatbot project.
The problem wasn't whether Apple had cards in its hand; the problem was that the rules of the game had changed.
ChatGPT shifted user attention from "functions" to "capabilities." Users began to expect AI on their phones by default, and then compared whose was stronger. When ChatGPT could already organize messy thoughts into a coherent email, Siri was still saying, "I found this on the web."
At WWDC 2024, Apple put Apple Intelligence on the table. Writing tools, notification summaries, photo search, personalized Siri understanding, ChatGPT integration. Apple finally admitted that relying solely on in-house models, at least in 2024, couldn't meet user expectations. But the vision it painted ultimately failed to land on the announced schedule.
Hiring Google as a Tutor
Behind the Apple Intelligence delay lay not just lagging technology but an entire Siri team structure that had failed to keep pace with this round of AI.
Multiple media outlets confirmed that Apple's former AI head, John Giannandrea, stepped down. Craig Federighi took over AI direction, and Vision Pro head Mike Rockwell was transferred to lead the Siri team. A large number of Siri engineers were sent to learn AI programming tools. This wasn't a graceful rotation. Internally, Apple had realized that with the original people and the original pace, it couldn't catch up.
In January 2026, Apple and Google issued a joint statement: Apple would use Gemini technology to tailor Apple Intelligence features for the iPhone and other products. According to reports, Apple plans to pay Google approximately $1 billion a year for a custom Gemini model with roughly 1.2 trillion parameters to support the Siri overhaul. Apple had previously tested models from OpenAI and Anthropic but ultimately chose Google.
This is completely different from the ChatGPT integration in 2024. Back then, ChatGPT was more like a lifeline users could authorize when Siri couldn't answer—its brand was OpenAI's, and the interface was pop-up style. This time, Gemini goes directly into the underlying layers, becoming part of Apple's new generation of foundation models.
The key action is distillation. Google gave Apple full access to Gemini. Apple uses the large model within Google's data centers to generate high-quality answers and reasoning processes, then uses those results to train smaller, cheaper models that can run on the iPhone.
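The distillation loop described above can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration: `teacher` is a hypothetical stand-in for a large model, the student is a two-parameter logistic model, and none of it reflects Apple's or Google's actual pipeline. It shows only the shape of the technique: the student never sees ground-truth labels, only the teacher's outputs.

```python
import math
import random


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def teacher(x: float) -> float:
    """Hypothetical stand-in for a large model: returns P(answer = 1 | x)."""
    return sigmoid(3.0 * x - 1.5)


def student_predict(w: float, b: float, x: float) -> float:
    """The small model: a single logistic unit with parameters w and b."""
    return sigmoid(w * x + b)


def train_student(inputs, soft_labels, epochs=3000, lr=0.5):
    """Fit the student to the teacher's soft labels.

    Uses plain gradient descent on cross-entropy against the teacher's
    probabilities; no ground-truth labels are involved.
    """
    w, b = 0.0, 0.0
    n = len(inputs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, target in zip(inputs, soft_labels):
            error = student_predict(w, b, x) - target
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


random.seed(0)
# Step 1: the teacher answers a batch of "prompts" (here, just numbers).
prompts = [random.uniform(-2.0, 2.0) for _ in range(200)]
teacher_outputs = [teacher(x) for x in prompts]

# Step 2: the student is trained only on what the teacher produced.
w, b = train_student(prompts, teacher_outputs)

# Step 3: compare student and teacher on inputs neither was trained on.
grid = [i / 10.0 for i in range(-20, 21)]
max_gap = max(abs(student_predict(w, b, x) - teacher(x)) for x in grid)
print(f"largest student/teacher disagreement: {max_gap:.4f}")
```

In a real setting the teacher's outputs would be text and reasoning traces rather than a single probability, and the student would be a neural network, but the division of labor is the same: the expensive model generates, the cheap model imitates.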
A technical paper published by Apple the day before WWDC framed this partnership as the third-generation Apple Foundation Models. Working with Google, Apple developed five custom models. On-device, there are the 3-billion-parameter AFM 3 Core and a 20-billion-parameter sparse model, AFM 3 Core Advanced, which activates only part of its parameters for each request. For the cloud, there are AFM 3 Cloud, the image model ADM 3 Cloud, and the most powerful, AFM 3 Cloud Pro.
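The article does not say how AFM 3 Core Advanced decides which parts to activate. One common way to build a sparse model is mixture-of-experts routing, where a gate scores a set of expert sub-networks and only the top few run for a given input. The sketch below assumes that design purely for illustration; the function names, the expert count, and the parameter split are hypothetical and are not taken from Apple's paper.

```python
def route_top_k(gate_scores, k):
    """Return the indices of the k highest-scoring experts, in index order."""
    ranked = sorted(
        range(len(gate_scores)),
        key=lambda i: gate_scores[i],
        reverse=True,
    )
    return sorted(ranked[:k])


def active_fraction(num_experts, k, expert_params, shared_params):
    """Fraction of all parameters that run for one request.

    Shared layers always run; only k of the num_experts experts do.
    """
    total = shared_params + num_experts * expert_params
    active = shared_params + k * expert_params
    return active / total


# Hypothetical gate scores for one request across four experts.
chosen = route_top_k([0.1, 0.7, 0.05, 0.9], k=2)
print("experts used for this request:", chosen)

# Hypothetical split (in billions): 4 shared + 16 experts x 1 = 20 total.
fraction = active_fraction(num_experts=16, k=2, expert_params=1.0, shared_params=4.0)
print(f"share of parameters active per request: {fraction:.0%}")
```

Under these made-up numbers, a model that stores 20 billion parameters would touch only 6 billion of them per request, which is the appeal of sparsity on a memory- and power-constrained device.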
The more pragmatic change is in compute. However capable on-device models become, they can't handle every task, and Apple's own Private Cloud Compute infrastructure could not carry Gemini-level inference alone. Some requests would run on Nvidia GPUs within Google Cloud. Apple subsequently confirmed that this is PCC's first extension beyond its own data centers, with a tech stack covering Nvidia Confidential Computing, Intel TDX, and Google Titan chips. Apple emphasized that it still controls the PCC software and that devices trust only programs encrypted and approved by Apple. The related binaries would also be open to inspection by security researchers.
Apple didn't truly relinquish control, but it gave up the dignity of full self-reliance.
Borrowed Bones
To understand Apple's position in the AI era, one must first see its most important asset.
It's not chips, not models, but devices. Devices hold photos, emails, calendars, maps, payments, carrying the fragments of billions of ordinary lives. Whichever AI can mobilize these fragments is no longer just a chatbot; it can become the true personal intelligence hub.
Apple started paving the way for this hub long ago. Workflow, which it acquired in 2017, later became Shortcuts, deeply integrated with Siri and system automation. App Intents, introduced in 2022, lets third-party apps expose their capabilities to system entry points. In the Apple Intelligence era, these interfaces become the hands and feet through which AI carries out real-world actions.
With these interfaces, OpenAI can enter, Gemini can enter, and in the future, local partners can be found for the Chinese market. But entering does not mean taking over the iPhone; each of them is fitted into Apple's permission framework and privacy rules.
What Apple fears most isn't whose model is stronger. It fears users starting to bypass the system and directly hand over their lives to another entry point. If one day users opened not apps but an AI assistant that could orchestrate everything for them, Apple would be relegated to a well-crafted shell.
So from now on, the "Apple" in Apple Intelligence represents product control more than complete technological sovereignty. The skin is its own, the clothes are tailored by itself, but the bones are borrowed. Google provides the skeleton, Nvidia provides the joints, and Apple's job is to dress this body in its own clothes and send it out into the world.
What Google gains from this deal is a massive endorsement: even Apple acknowledges that Gemini's underlying capabilities are more reliable. What Nvidia gains is further proof that even a company with the strongest consumer-grade chips and ambitions for its own servers cannot avoid GPU clouds when it comes to cutting-edge inference and complex agent tasks.
But the more bones are borrowed, the less the body is truly one's own. Behind every borrowed bone lie a supplier's business calculations, regulatory constraints, and technology cadence. If one day someone decides to pull those bones back, can Apple stand on its own? It's a question Apple doesn't need to answer for now, but it will have to eventually.
A New Tenant Moving into the System
Ordinary people don't care about model parameters. They care if their phone can bother them less.
On the WWDC 2026 stage, Apple said: "There are times when you expect more from Siri."
For Apple, this almost counts as an apology.
Then it tried to show you a different morning.
You wake up, and the screen is cluttered with twenty notifications. In the past, you'd have to swipe them away one by one. Now the system has already sorted them by priority for you: your boss's messages are at the top, and ads and promotions are condensed into a line of gray text. You open your email. A long work email has been summarized into three sentences. You decide to reply, and Siri drafts a response in the tone you usually take with this person. You remember that you need to call a merchant in the afternoon to return an item. Before you even dial, the system has pulled the order number from an email you received a couple of days ago and placed it on the call interface.
This is the story Apple wants to tell—a layer of intelligence laid beneath the system, saving you from the daily cognitive chores. Read less nonsense, search for files less, get interrupted by notifications less.
To tell this story well, Apple almost completely redesigned Siri's entry point. On the iPhone, it's placed in the Dynamic Island, accessible with a pull-down. On iPad and Mac, it's merged with Spotlight. It has its own standalone app, capable of saving and continuing past conversations, syncing across devices via iCloud. Apple wants Siri to become an AI assistant living within the system, with memory and context, but tries hard not to make it look like ChatGPT.
Vision is also a crucial direction. The camera adds a Siri mode—point it at food for nutritional info, point it at something unrecognizable for identification and search. System-wide dictation is no longer just speech-to-text; it automatically adds punctuation, adjusts formatting, turning spoken words into text ready to send.
The path is also being paved on the developer side. Apple opened the Core AI framework, allowing third parties to load their own models on devices. Upgraded App Intents make it easier for Siri to understand third-party apps. The Foundation Models Framework no longer just calls Apple's on-device models; it also supports integrating external providers like Claude and Gemini. Apple is paving a path for the entire ecosystem: for Siri to perform tasks across apps in the future, developers must hand over content and actions for the system to understand.
If these plans materialize, Apple AI will no longer be just "Siri that can chat."
But Apple is more cautious this time than in the past. Siri AI will only open to users in beta later this year, starting with English. And the same Apple Intelligence, when it reaches China, will likely be a different product.
For Chinese users, watching Apple's AI announcements is mostly a spectator sport. The keynote is lively and the features look good, but in China the verdict is "not available in your region."
The Chinese market has a whole set of rules for generative AI: filing, content safety, and data localization. Apple needs to find local model partners and pass regulatory approvals. Apple Intelligence in China isn't just a matter of launching months later; it may be a fundamentally different product from the ground up.
What U.S. users see is a combination of in-house models and Gemini. What Chinese users may see is a version kneaded together by Apple's system permissions, local cloud services, domestic models, and regulatory requirements. They are both called Apple Intelligence, but their actual capabilities and reachable boundaries could be entirely different.
iCloud services in Mainland China are operated by GCBD (Guizhou-Cloud Big Data). The cloud drive saves files, and AI needs to understand files; the cloud drive stores photos, and AI needs to understand photos; the cloud drive syncs notes, and AI needs to extract your plans, habits, and relationships from those notes. This data takes on new uses in the AI era and naturally faces varying degrees of scrutiny.
A more immediate threat comes from competition. Domestic smartphone manufacturers are moving fast with on-device large models, Chinese-language assistants, and imaging AI. For Chinese users, spending ten to twenty thousand yuan on a new iPhone only to find its core AI features unavailable might just prompt them to switch brands.
The daily scenarios in the Chinese market are particularly tricky for Apple. WeChat, Alipay, Meituan, Douyin, ride-hailing apps, government services, hospital registrations—these are what many people actually use their phones for every day. An AI assistant that cannot access these scenarios, cannot understand group chats, receipts, verification codes, and expressions only locals instantly grasp, can hardly be called "intelligent."
Understanding a Person
Apple Intelligence also has another issue: it doesn't cover all iPhones.
iOS 27 supports devices as far back as the iPhone 11 and the second-generation iPhone SE, but Apple Intelligence is limited to the iPhone 15 Pro and newer models, M-series iPads, and Macs. The strongest on-device models demand even more: an iPhone 17 Pro, an iPhone Air, an M4 iPad with at least 12GB of unified memory, or an M3 Mac.
In recent years, upgrade cycles have been stretching longer. Screens are good enough, cameras are sufficient, and many people no longer change phones every year. AI may become Apple's new reason for users to upgrade. On-device AI does require stronger chips and more memory, which makes hardware thresholds inevitable. A capability packaged as "understanding you better" ultimately becomes a price barrier.
For over a decade, Apple has kept asking, "What comes after the iPhone?" It tried watches, headphones, TVs, and the rumored car project that ran for ten years before being canceled. In 2024, some of the car team's staff were transferred to the generative AI team.
AI arrived just in time. It gives Apple a next-generation story without needing to create a new hardware category from scratch—just transform the devices already in the hands of over a billion users. What comes after the iPhone might still be the iPhone, but it must become something else.
The future hardware plans overseen by John Ternus, Tim Cook's successor, hint at Apple's next steps. He is advancing a set of unreleased AI devices: glasses with cameras and wearables that use computer vision to understand the surrounding environment. If these products come to fruition, Apple Intelligence will extend outward from the phone, with phones, headphones, glasses, and home hubs potentially becoming new senses.
But no matter how the senses extend, the core question remains the same.
The relationship between people and their phones is less about sitting down for long conversations than about mutual interruptions in the most trivial moments. You're rushing for the subway, the kid is crying, the boss is pressing, and the screen is piled with twenty notifications. The most concrete meaning of Apple Intelligence for ordinary people isn't an omnipotent assistant, but a phone starting to shoulder part of your cognitive load. Read less nonsense, search for files less, get interrupted by notifications less.
Apple has always positioned itself as being on the user's side. It says privacy is a fundamental human right, that devices belong to users, that technology should serve people. In the AI era, this rhetoric will face its real test. Because once a system starts understanding you, it's not just protecting your data; it's also shaping your actions. It gives you summaries, suggestions, filters information for you, decides for you what's important and what can be ignored.
The difficulty of personal intelligence has never been just intelligence; it's also "personal." A person's life isn't a database. It contains emotions, misunderstandings, awkward moments, and corners one doesn't want any system to see. For AI to enter these places, efficiency alone cannot be the pass.
Kazuo Ishiguro wrote about an AI companion named Klara in "Klara and the Sun." She spent her entire existence trying to understand a girl. She learned to observe changes in light, to read expressions and silence, to know when to be quiet.
But the most moving part of the book is when Klara finally understands there is a part of the girl she can never touch. It isn't that she isn't smart enough; she simply learns one thing: understanding a person and possessing a person's data are two completely different matters.
It took Apple fifteen years to reach the point of admitting Siri wasn't good enough. On this WWDC night, it borrowed models from Google, compute from Nvidia, and another year of patience from users. It proved it's willing to bow its head, but bowing is just the beginning.
What it needs to learn next is what Klara already knew. It's not about becoming smarter, but knowing where to stop after stepping into someone's life.