Google Using AI to 'Kill' Google: This Keynote Left People Breathless

marsbitPublished on 2026-05-19Last updated on 2026-05-19

Abstract

Google's latest I/O event showcased a sweeping AI evolution, positioning Gemini as the core intelligence across its ecosystem. The launch of Gemini Omni pushes video generation towards a "world model," understanding physics and enabling complex edits through conversation. For developers, the ultra-fast Gemini 3.5 Flash model powers Antigravity 2.0, an agent-first platform capable of orchestrating tasks like building a basic OS from scratch. Google Search is being reimagined with AI Overviews, persistent information-gathering Agents, and generative UI components. A key consumer debut is Gemini Spark, a persistent personal AI Agent that runs tasks in the cloud even when a user's device is off, managing multi-step workflows. The Gemini App itself received a visual overhaul with a new Neural Expressive design and a "Daily Brief" morning summary feature. The company also announced a suite of creative tools (Pics, Stitch, Flow), upcoming AI-powered audio glasses, and expanded content verification with SynthID. Underlying these advances is a significant strategic shift: Google is transitioning from a free, ad-supported model to a future built on AI subscriptions and enterprise services, betting users will pay for deeply integrated, agentic assistance that handles complex tasks across work and life.

The Gemini App has over 900 million monthly active users, processes 3200 trillion tokens per month, and Nano Banana has generated over 50 billion images...

These are the numbers Google CEO Sundar Pichai opened with at today's early morning Google I/O keynote.

Over the past year, AI has become the main theme across all industries. Gemini's role at Google has also evolved from a standalone app to becoming the core AI underlying capability across all Google products.

This keynote also began with models, then moved on to coding and Agent products.

Gemini Omni pushes Google's video generation towards a "world model" direction, while Gemini 3.5 Flash, alongside AI programming tools, is positioned as an Agent development platform.

These two capabilities were then integrated into Google's full ecosystem: Search, the Gemini App, Flow, Spark, Chrome, XR glasses, and e-commerce scenarios.

Enter Gemini Omni, The "Nano Banana" Moment for Video is Here

Gemini Omni was the first to be highlighted in detail at the keynote. We created a comparison video with Seedance 2.0 to show the differences.

Google describes Gemini Omni as a new model capable of "creating any content from any input."

It combines Gemini's reasoning capabilities with Google's existing generative media models, aiming to enhance the model's understanding of the world, its multimodal generation capabilities, and editing abilities.

Google emphasized that while models like Veo, Nano Banana, and Genie can already generate videos, images, and interactive simulations, Gemini Omni goes a step further by beginning to handle problems closer to the physical world, like motion and gravity.

The case shown on stage involved an explanatory video on protein folding. Users simply input a prompt like "Generate a claymation explanation about protein folding," and Omni transforms the abstract scientific concept into video content.

It also supports more natural video editing. Users can upload their own videos, then modify the style, add elements, adjust details conversationally—even turning a plain circle into a black hole or transforming a nighttime walk into a more dramatic scene.

Google's claim is that Gemini Omni is starting with video but will gradually progress towards "any input to any output." This is also why Google has consistently designed Gemini as a multimodal model from the beginning.

The first model in the Omni family, Gemini Omni Flash, is already being integrated into Google products; more information about Omni Pro will be released later. Omni features in the Gemini App are also now available to Google AI Plus, Pro, and Ultra subscribers.

This means Gemini Omni isn't just a video generation model. Google wants to frame it within the narrative of a "world model": the model doesn't just generate images; it must understand the physical relationships, motion relationships, and scene logic within them.

As it enters applications like the Gemini App, Google Flow, and YouTube Shorts, Omni will also expand Google's generative creation tools from image editing to video editing.

Gemini 3.5 Flash Launches, AI Coding Enters Hyperdrive

If Gemini Omni corresponds to generation and editing, Gemini 3.5 Flash corresponds to speed, cost, and execution capability.

Google introduced Gemini 3.5 Flash at the keynote, calling it one of the first models in the Gemini 3.5 series, focused on agentic coding, long-horizon tasks, and real-world workflows.

Compared to 3.1 Pro, 3.5 Flash shows significant improvements in almost all benchmarks, especially in coding capabilities and evaluations like GDPVal that are closer to real-world economic tasks.

Beyond good benchmark performance, 3.5 Flash generates output tokens 4 times faster than other leading models. After specific optimization in Antigravity, speeds can reach up to 12 times.

It's worth noting that in March, Google's internal development-related tasks processed about 500 billion tokens daily, doubling every few weeks to now exceed 3 trillion tokens daily. Google calls this a feedback loop, using massive real-world usage to further improve 3.5 Flash.

Launched alongside the model is Antigravity 2.0.

It has evolved from an agent-powered IDE into a standalone desktop application, shifting focus to being "agent-first." Users are no longer just using AI for code assistance within an editor but completing development tasks through Agent conversations, Agent outputs, and multi-Agent collaboration.

Antigravity 2.0 adds a full CLI, Antigravity SDK, native voice support via Gemini's audio model, and integrates services like Android, Firebase, and Google AI Studio. Antigravity 2.0, as a standalone desktop app, is now available globally.

Google used a high-intensity demo on stage to illustrate Antigravity 2.0's direction: having an Agent build a runnable operating system from scratch. This task involved 93 sub-Agents executing in parallel over 12 hours, initiating over 15,000 model calls, processing 2.6 billion tokens, and generating core modules like a scheduler, memory management, and file system from an empty project.

Google claimed this task was impossible with Gemini 3.1 Pro, and using Gemini 3.5 Flash cost less than $1000 in API credits.

The demo also showed this system running an SL train program and Doom. Since the system initially lacked video and keyboard drivers, Antigravity proceeded to generate the relevant code and fix it, enabling Doom to run. Google also stated that similar approaches have been tested on projects like a photo editing suite, real-time messaging app, and multi-user collaboration platforms, compressing engineering work that previously took days into hours or even less.

Gemini 3.5 Flash is now available to all users, covering Google products and API. Gemini 3.5 Pro is still under internal use and improvement, expected to open up next month.

From Search Box to Information Agent: Google Reinvents AI Search

Following models and development tools, Google shifted focus to search. Google Search is now AI Search.

Google stated that AI Mode already has over 1 billion monthly active users, with query volume doubling quarterly since its launch.

Starting today, AI Mode is being upgraded to Gemini 3.5. The new intelligent search box is also rolling out. It supports text, image, file, and video inputs and provides AI suggestions as users type questions.

AI Overviews and AI Mode have also been merged into a more continuous AI search experience. Users can first see AI answers on the main search results page, then enter AI Mode to continue asking follow-ups while retaining context. This new search experience launched globally on desktop and mobile on the same day as the keynote.

A bigger change is the Search Agent. Users will be able to create information Agents within Search this summer, allowing them to persistently track certain types of information.

For example, a user could set one to monitor large-cap biotech stocks with a P/E ratio below 15, positive cash flow, and low debt; or have it track rental listings, sneaker collaborations, or new product drops long-term. When conditions change, the Agent will send users a synthesized update.

Google is also bringing Antigravity's agentic coding capability into search.

Soon, search won't just return webpages, summaries, or cards; it can also generate interactive interfaces for specific queries. For instance, if a user asks "How do black holes affect spacetime?", Search might generate an interactive visual component. Following up with "How do binary black holes produce gravitational waves?" would cause Search to regenerate a dynamic interface with adjustable parameters. Generative UI with Antigravity will launch for free to all users this summer.

More complex custom experiences are on the way.

Google demonstrated a weekend planner on stage, where Search would combine weather, maps, user preferences, Gmail, Calendar, and other information to generate a small tool that can be further modified, shared, and synced to a calendar. Such custom experiences will open first to subscribers in the coming months.

Runs When Powered Off: Gemini Spark Brings Agent Capabilities into Personal Life

The most important new product for consumers is Gemini Spark.

Gemini Spark is a personal AI Agent that runs on dedicated virtual machines in Google Cloud and can perform tasks around the clock. It's powered by Gemini 3.5 and Antigravity harness, supporting long-running background tasks.

Even after a user shuts down their computer, Spark continues working. It first integrates with Google's own tools, with third-party tool integration via MCP coming in the next few weeks.

The keynote demonstrated several typical Spark scenarios.

Users can have it summarize Gemini Live releases and progress from the past week, extract information from Docs, Gmail, and chat history, and generate a team email in their personal writing style.

They can also have it manage a block party, maintain an RSVP spreadsheet in Google Sheets, track who's bringing what, generate draft reminder emails for neighbors who haven't signed up, and automatically create a promotional Google Slides page.

Spark also supports voice input on mobile.

Users can speak multiple tasks at once, like flagging all meetings with Sundar in bright pink, drafting an invitation letter for new neighbors, and creating a to-do document before the school year ends for their child. Spark splits this into multiple independent tasks and executes them in the background, with results syncing between phone and computer.

Gemini Spark is opening to select testers this week, with a beta rolling out next week to US Google AI Ultra subscribers.

Google simultaneously introduced a new $100/month Ultra plan and lowered the price of the highest-tier Ultra plan from $250/month to $200/month.

Later this summer, Spark will come to Chrome, becoming an intelligent browser agent that can perform tasks within webpages.

Gemini App Major Overhaul, Plus Google's Version of "AI Morning Briefing"

The Gemini App itself also received a major, transformative overhaul.

Google introduced a new design language called Neural Expressive, featuring fluid animations, vibrant colors, new fonts, and haptic feedback.

The new Gemini App no longer presents answers as large blocks of text. Instead, it dynamically generates layouts better suited for reading and interaction based on the content, including interactive images, timelines, embedded videos, etc. Neural Expressive is now rolling out globally on Android, iOS, and web.

Gemini Live has also been redesigned, opening directly into a real-time conversation. Regional accent selection will roll out in the coming weeks.

The Gemini App also added Daily Brief. This is a personalized summary Agent designed for morning use, synthesizing information from Gmail, Calendar, Tasks, etc., to organize items users need to focus on for the day, and providing next-step action entry points.

Daily Brief is launching today for US Google AI Plus, Pro, and Ultra subscribers.

Beyond the larger Gemini narrative, Google also updated several everyday products.

Google Maps recently underwent its biggest upgrade in a decade and added Ask Maps. It allows users to ask longer, more complex questions. For example, the keynote presented a scenario: a child falls into a duck pond, a wedding starts in 30 minutes, and the user wants to know where they can walk to buy a new dress.

Docs also gained new voice creation capabilities. Users don't need to input precise prompts; they can directly speak their thoughts, have Gemini pull a resume from Drive and event info from Gmail, then generate a Google Docs draft. This capability will launch for Pro and Ultra subscribers this summer, with similar voice capabilities coming to Gmail.

As generative capabilities advance, identifying content sources becomes increasingly important.

Google stated that since SynthID launched three years ago, it has added invisible watermarks to over 100 billion images and videos, and audio equivalent to 60,000 years of content. Next, SynthID and content credential verification will expand to Search and Chrome.

Users can circle content in search or right-click in Chrome to ask if content was AI-generated. The system will indicate if the content came from AI, a camera, or was edited by a generative AI tool.

Google also announced that OpenAI, Kakao, and ElevenLabs will adopt SynthID 2. NVIDIA has previously joined the SynthID ecosystem. For Google, SynthID isn't just a security feature; it's part of the effort to establish AI content transparency standards.

Google's Creation Suite Begins Assault on Images, Design, and Video

In the creative tools space, Google densely released multiple heavyweight products.

Google Pics is a new image creation and editing product within Google Workspace, targeting scenarios like party posters, infographics, and promotional graphics. Users can start from a base image, delete elements, resize objects, edit text, and translate text. Pics-generated content carries SynthID watermarks. Google Pics will launch this summer.

The design product Stitch also received updates. Users can generate a website or app interface with a single prompt, then continue modifying via text or voice—like enlarging titles, adjusting menus, highlighting more pizza options. Stitch supports exporting designs as code or publishing websites directly. These updates are live now.

Updates to Google Flow were particularly noteworthy. With Gemini Omni integrated into Flow, users can change environments, add visual effects, and insert new characters based on original videos while preserving the original performance as much as possible.

Flow also added a new Agent supporting the execution of multiple actions at once. For instance, generating 16 different camera-angle videos from a single image, or batch-transforming a set of morning scenes into night scenes.

Flow Tools allow users to create their own creative tools within Flow, like video effects, hand-drawn animation, and text layering tools, with support for sharing and remixing.

Google Flow Music can expand a piano riff into a music demo with stylistic direction. These new features for Google Flow and Google Flow Music are now live.

Betting on Smart Glasses, Google Ventures into the Next-Generation Entry Point Again

On the hardware side, Google is also extending the Android XR operating system platform from headsets and XR devices further into the smart glasses form factor.

Android XR is a platform Google developed in collaboration with Samsung and optimized for Qualcomm Snapdragon.

Google stated that AI glasses will be divided into two categories: One type are display glasses with small lenses, the other are audio glasses. Display glasses were shown last year at I/O; this year, first developers have begun creating display experiences, and a trusted tester program will expand later this year.

Audio glasses will come to market sooner.

The first audio glasses will launch this fall, with Samsung involved in hardware and experience building, and Warby Parker and Gentle Monster handling eyewear design. These glasses connect to phones, supporting Android and iOS. Gemini's responses are played privately through earbuds, not displayed on lenses.

During the demo, the presenter used glasses to have Gemini navigate to the place they met a friend last week, adding a coffee shop stop along the way; or have Gemini open DoorDash and automatically order coffee, waiting for user confirmation;

They could also have it summarize muted messages and write a family dinner into the calendar. The glasses can also work with a watch, allowing users to take a photo of the scene, generate a cartoon image with Nano Banana, and preview it on the watch.

Towards the end, Gemini's use cases were extended into cybersecurity scenarios.

Google introduced CodeMender. It's a code security Agent capable of automatically finding and fixing critical software vulnerabilities. Google will invite a group of experts to test the CodeMender API before a broader rollout.

After watching the entire keynote, the sheer volume of information was almost overwhelming. But when these AI features truly open up to tens of millions or even hundreds of millions of users, a very practical question of economics immediately presents itself: How does Google plan to recoup this enormous computational expense?

For over two decades, Google represented a classic free internet model. Users exchange attention and data for services; Google monetizes through advertising and distribution. This model made Google the most powerful infrastructure company of the internet era.

But the cost of large model inference is not on the same order of magnitude as performing a single web search.

Long-context memory, multimodal generation, cross-application Agents, enterprise-level automation—these capabilities are all backed by continuously running computational power consumption. The deeper AI integrates, the harder it becomes for Google to continue absorbing costs through "free feature upgrades."

That's why, throughout the keynote, Google I/O seemingly discussed experience upgrades, but the underlying direction pointed towards subscriptions, enterprise contracts, compute billing, and long-term service fees.

Free entry points certainly won't disappear, as they remain the foundation for Google to acquire users, data, and ecosystem positioning. But on top of these entry points, Google is adding a new layer of intelligent services: more powerful models, longer memory, deeper system permissions, more complex task execution, and more stable enterprise-level services.

In other words, Google is transforming further from a free internet services company into an AI subscription infrastructure company.

However, a question follows: are users willing to pay for search? Typically, no.

But what if it's a "super versatile assistant" that can handle your emails, coordinate tasks, analyze reports, manage your smart home around the clock, and even help you write code and develop apps? Would you be willing to pay tens to hundreds of dollars per month for it?

This is precisely the core business proposition that this year's Google I/O urgently wants to validate. And looking around the current frenzied market, the answer seems almost self-evident.

This article comes from the WeChat public account "APPSO," author: Discovering Tomorrow's Products

Related Questions

QWhat are the key new AI models announced by Google at I/O, and what are their primary functions?

AGoogle announced two key new AI models: Gemini Omni and Gemini 3.5 Flash. Gemini Omni is a multimodal model designed to generate and edit video content, with a future goal of handling 'any input to any output' and moving towards being a 'world model'. Gemini 3.5 Flash is focused on speed, cost-efficiency, and execution, particularly excelling at agentic coding and long-duration tasks.

QHow is Google integrating AI into its core Search product, and what new capabilities are being introduced?

AGoogle is deeply integrating AI into Search. AI Overviews and AI Mode are being merged for a seamless experience powered by Gemini 3.5. New capabilities include the ability to create persistent information Agents for tracking topics (like stocks or products), and 'Generative UI with Antigravity' which will generate interactive tools (like a visualizer for black holes) directly in search results based on user queries.

QWhat is Gemini Spark, and what makes it a significant personal AI product?

AGemini Spark is a personal AI Agent that runs on a dedicated Google Cloud virtual machine. Its key feature is the ability to perform long-running, multi-step tasks even when the user's computer is off. It integrates with Google tools and can handle complex workflows like summarizing information, event planning, and managing communications. It also supports breaking down complex voice commands into multiple tasks.

QAccording to the article, what is the fundamental business challenge Google faces with its advanced AI services, and how is it addressing this?

AThe fundamental challenge is the massive computational cost of running advanced AI models for features like long context, multimodal generation, and persistent Agents, which is far higher than traditional web search. Google is addressing this by building a new layer of paid, subscription-based intelligent services (like Gemini Ultra plans with Spark access) on top of its free foundational services, shifting towards becoming an 'AI subscription infrastructure company'.

QWhat are some of the new creative and design tools Google announced, and what are their main features?

AGoogle announced several new creative tools: Google Pics for image creation and editing within Workspace; updates to Stitch for generating and modifying app/website UIs with prompts; and significant updates to Google Flow for video editing, including using Gemini Omni to modify environments/effects and new Agents for batch processing videos (like changing scenes from day to night).

Related Reads

Warsh's First Day in Office, Markets Deliver a 'Wake-up Call': Rate Hike Expected This Year

On his first day in office, newly inaugurated Federal Reserve Chairman Warsh received a stark market warning, with expectations now fully pricing in a 25-basis-point interest rate hike this year. The shift was triggered by hawkish remarks from Fed Governor Waller, who stated that inflation is now the key policy "driver" and that the odds of a hike or cut are evenly split. This sent short-term Treasury yields higher. Waller signaled a significant pivot in his stance, citing disappointing inflation and labor data. He suggested removing "easing bias" language from Fed statements and did not rule out future rate increases if inflation fails to recede, though he noted immediate action isn't warranted without signs of unanchored inflation expectations. Chairman Warsh faces immediate pressure at his first FOMC meeting in June. With the preferred inflation gauge at a three-year high, analysts warn that failing to hike could be interpreted as an implicit easing of policy. The geopolitical situation in the Middle East is adding to existing price pressures. The market's expectation for a hike contrasts sharply with earlier forecasts for multiple cuts. While long-term Treasury yields have been contained by lower energy prices recently, analysts note they remain under structural upward pressure. Warsh's swearing-in at the White House highlights political scrutiny over Fed independence. However, the market has made it clear that inflation is the most urgent challenge, leaving the new chairman little time to settle in.

marsbit2h ago

Warsh's First Day in Office, Markets Deliver a 'Wake-up Call': Rate Hike Expected This Year

marsbit2h ago

Has Microsoft Lost Its Way in the AI Race, and Can Copilot Bring It Back on Track?

Microsoft, once seen as an early AI frontrunner due to its investment in OpenAI, is navigating a strategic shift amid increased competition. Its initial reliance on OpenAI’s GPT models has been complicated by OpenAI’s growing ambitions as a direct competitor, rapid advancements from rivals like Claude and Gemini, and the disruptive rise of AI agents, which challenge its traditional SaaS business model. These factors contributed to stock declines and slower-than-expected adoption of its flagship Copilot products. In response, CEO Satya Nadella has taken a hands-on role in product development, signaling the urgency of change. Microsoft is pivoting from a model-centric strategy to a "model-agnostic" enterprise platform approach. It aims to become the foundational layer connecting various AI models—from OpenAI, Anthropic, or its own new "Superintelligence" team—with enterprise workflows, data, security, and cloud services. Recent organizational changes merged consumer and enterprise Copilot teams to accelerate innovation, exemplified by new products like Copilot Tasks and Copilot Cowork. However, this transformation comes at a high cost. Microsoft faces massive capital expenditures, potentially reaching ~$190 billion by 2026, to support AI infrastructure. While its platform strategy shows early signs of traction with growing Azure AI revenue, it must balance startup-like agility with the reliability expected by enterprise clients. The core challenge is no longer being the sole AI winner but defending its position as the essential enterprise software entry point amidst rapid technological commoditization and the shift towards always-on AI agents.

marsbit3h ago

Has Microsoft Lost Its Way in the AI Race, and Can Copilot Bring It Back on Track?

marsbit3h ago

Why Haven't Forex Stablecoins Taken Off?

Why FX Stablecoins Never Took Off: A Path Forward via Synthetic FX Despite the explosive growth of stablecoin-powered digital banking, which has seen ~$6B in VC investment and a 24x surge in crypto card spending in under a year, a major limitation persists: these banks are essentially dollar-only accounts. This leaves 95-99% of global accounts, which are denominated in non-USD currencies, underserved. Attempts to create native foreign currency (FX) stablecoins (like EURC) have largely failed, with total FX stablecoin TVL at ~$600M compared to $400B for USD stablecoins—a 700x gap. These FX tokens face critical challenges: fragile pegs due to low liquidity, limited exchange/FinTech acceptance, poor on/off-ramps, complex regional compliance, and a chicken-and-egg adoption problem. The article argues that the solution lies not in competing with entrenched USD stablecoin networks (USDT/USDC), but in adopting a synthetic FX model inspired by traditional finance. Specifically, it advocates for Mark-to-Market Non-Deliverable Forwards (NDFs)—cash-settled FX derivatives that allow users to maintain underlying USD stablecoin holdings while having their account balance and P&L denominated in a foreign currency. This approach offers key advantages: strong oracle-based pegs, retention of deep USD stablecoin liquidity and yield, superior on/off-ramps, scalability to any currency with a reliable feed, and capital efficiency. It mirrors how modern institutional FX markets operate. Primary use cases for on-chain NDFs include: 1. **Digital Banks/Wallets:** Enabling multi-currency accounts for international users without leaving the USD stablecoin ecosystem, boosting deposits and retention. 2. **FX Carry Trade Vaults:** Offering access to sovereign interest rate differentials (e.g., earning yield on BRL) in a more stable and scalable format than crypto-native products like Ethena. 3. **Global Enterprise Payments:** Allowing merchants to receive payments in local currency equivalents while settling in USD stablecoins, similar to services offered by Stripe for fiat. The conclusion is that synthetic FX, not native FX stablecoins, is the viable path to integrating foreign exchange into the growing stablecoin digital banking landscape, potentially unlocking the next phase of institutional DeFi and multi-trillion-dollar global adoption.

链捕手3h ago

Why Haven't Forex Stablecoins Taken Off?

链捕手3h ago

Trading

Spot
Futures

Hot Articles

How to Buy PEOPLE

Welcome to HTX.com! We've made purchasing ConstitutionDAO (PEOPLE) simple and convenient. Follow our step-by-step guide to embark on your crypto journey.Step 1: Create Your HTX AccountUse your email or phone number to sign up for a free account on HTX. Experience a hassle-free registration journey and unlock all features.Get My AccountStep 2: Go to Buy Crypto and Choose Your Payment MethodCredit/Debit Card: Use your Visa or Mastercard to buy ConstitutionDAO (PEOPLE) instantly.Balance: Use funds from your HTX account balance to trade seamlessly.Third Parties: We've added popular payment methods such as Google Pay and Apple Pay to enhance convenience.P2P: Trade directly with other users on HTX.Over-the-Counter (OTC): We offer tailor-made services and competitive exchange rates for traders.Step 3: Store Your ConstitutionDAO (PEOPLE)After purchasing your ConstitutionDAO (PEOPLE), store it in your HTX account. Alternatively, you can send it elsewhere via blockchain transfer or use it to trade other cryptocurrencies.Step 4: Trade ConstitutionDAO (PEOPLE)Easily trade ConstitutionDAO (PEOPLE) on HTX's spot market. Simply access your account, select your trading pair, execute your trades, and monitor in real-time. We offer a user-friendly experience for both beginners and seasoned traders.

6.8k Total ViewsPublished 2024.03.29Updated 2025.03.21

How to Buy PEOPLE

Discussions

Welcome to the HTX Community. Here, you can stay informed about the latest platform developments and gain access to professional market insights. Users' opinions on the price of PEOPLE (PEOPLE) are presented below.

活动图片