Gemini 3.5 is Here! Tonight, Google Overtakes Google

链捕手 · Published 2026-05-20 · Last published 2026-05-20

Introduction

At Google I/O 2026, the company unveiled a suite of AI advancements headlined by three major releases. First, **Gemini Omni**, a true "omnimodal" model, can generate high-quality, coherent videos from any combination of text, image, audio, or video inputs, maintaining character consistency and physical logic across iterative edits. Second, the new flagship **Gemini 3.5 Flash** was introduced, outperforming the previous Gemini 3.1 Pro on key benchmarks for coding, agent tasks, and multimodal reasoning. It is also reported to be over 4 times faster than GPT-5.5 and Opus 4.7. This model powers the upgraded **Antigravity 2.0**, a standalone Agent development platform that demonstrated the ability to orchestrate 93 sub-agents to build a functional operating system from scratch in just 12 hours. Third, **Gemini Spark** debuted as a personal, always-on AI agent. Running 24/7 in the cloud and integrated with Google Workspace, it can autonomously execute complex multi-step tasks like drafting emails, managing schedules, and planning events by accessing apps like Gmail, Docs, and Sheets. Together, these releases move AI beyond simple generation towards autonomous understanding, decision-making, and task execution.

Author: XinZhiYuan

Google I/O 2026 goes all out!

Just now, Pichai and Demis Hassabis took the stage together, unveiling in one go all the major releases Google had been saving up for half a year.

Without any suspense, the biggest star of the night, Gemini Omni, officially debuted!

As a truly "omni" model, Omni can accept any form of input and generate any content. It debuts with video output support, making it the "video version of Nano Banana".

Another highlight of the night belongs to Gemini 3.5 Flash.

In almost all benchmarks, 3.5 Flash has achieved a crushing victory over its predecessor flagship, Gemini 3.1 Pro. Its output speed has also doubled, and it is over 4 times faster than GPT-5.5 and Opus 4.7. The more powerful 3.5 Pro will be released next month.

In addition, a slew of other major new products were unveiled:

Antigravity 2.0: A brand-new standalone desktop application, evolving from an IDE to an Agent development platform.
Gemini Spark: A personal AI agent, running 24/7 in the cloud.
Gemini App Redesign: Code-named "Neural Expressive," switching to compute-based billing.
AI Ultra Subscription Plan: Adds a new $100 tier; highest tier reduced from $250 to $200.
Google Search's Biggest Upgrade in 25 Years: Integrated with 3.5 Flash, adds intelligent search box, automatic mini-app generation, etc.

......

Without exaggeration, the density of substantive announcements at this I/O is the highest in years.

Gemini Omni Debut: The Birth of an 'Omni' AI

As hinted by the teaser video, the highly anticipated Gemini Omni has finally arrived. Hassabis personally took the stage to announce, "We are taking the next important step—Gemini Omni, a new model that can create content from any input."

This prominence says it all. What Google aims to build this time is an "omni" AI creation engine. It integrates Gemini's intelligence with the strongest generative AI, pushing its capabilities in world understanding, multimodality, and editing to the maximum. Put simply, given any combination of images, audio, video, and text, it can generate a high-quality video, and you can then edit that video through conversation.

More crucially, Omni doesn't just "look like it"; it truly understands the physical world. Hassabis stated, "Previous systems often stumbled when simulating concepts like gravity and momentum, but Omni achieves a 'step change.'" It injects Gemini's "world knowledge" and "reasoning ability" into video generation.

Given the prompt "Explain protein folding using clay animation," the generated video accurately depicts amino acid chains folding into α-helices and β-sheets at every step, visually presented as exquisite stop-motion animation.

Another example: assigning corresponding objects to the 26 letters of the English alphabet. C for Capybara, D for Disco Ball, L for Lava Lamp. Omni isn't just pasting assets; it's genuinely connecting language, images, and semantics.

It has to be said, the leap from realism to meaningfulness is enormous.

On stage, Hassabis pulled out a selfie video and began editing it live. A circle drawn on a palm turned into a black hole; an evening street stroll transformed into a cyberpunk scene. One sentence rewrites the scene, another changes the world. Anything can become a canvas for creating new realities, such as conjuring fire in your palm from a selfie.

Moreover, this isn't a one-time generation. You can continue the conversation. Characters remain consistent in Gemini Omni's video output, physical logic holds, and scene memory is coherent.

Round one: the demo starts from an original performance clip.
Round two: "Teleport the violinist into the environment of this picture," attaching a reference image of snowy mountains and meadows. The scene instantly switches, with actions and lighting fully adapting to the new environment.
Round three: "Cut the shot to behind the violinist's shoulder." The perspective rotates, but the performance actions and music remain completely continuous.

No matter how the scene changes, the main subjects in the video stay intact.

What's even more thought-provoking is Omni's input flexibility. Images, text, video, audio—any references can be mixed as input to generate a coherent output. You can even create your own avatar, allowing an AI version of you to appear in any scene, speaking with your voice and doing things you haven't done.

Omni Flash is available now, with the API version opening in the coming weeks. The more powerful Omni Pro is also on the way. Leveraging Google's integration capabilities, Omni is integrated at launch with Gemini App, Google Flow, and YouTube Shorts, and is even free for YouTube Shorts users.

Flash Overtakes Pro: 3.5 Redefines 'Flagship'

Following Gemini Omni, another major highlight of this I/O is the release of the new flagship, Gemini 3.5 Flash. Google defines it as the strongest coding and agent model to date.

On stage, Pichai personally announced, "3.5 Flash outperforms Gemini 3.1 Pro across virtually all benchmarks!" Remember, 3.1 Pro was the flagship model Google launched just three months ago. Now, a Flash-tier model is crushing it.

Unexpectedly, Google delivered such impressive results in such a short time:

Terminal-Bench 2.1 (Coding): 76.2%
GDPval-AA (Real-world Agent Tasks): 1656 Elo
MCP Atlas (Large-scale Tool Usage): 83.6%
CharXiv Reasoning (Multimodal Understanding): 84.2%

In the four major benchmarks above, compared to Gemini 3.1 Pro, 3.5 Flash represents a massive leap forward. In terms of speed, 3.5 Flash occupies its own quadrant at 289 tokens/second, over 4 times faster than other frontier models. Additionally, 3.5 Flash matches or even surpasses GPT-5.5 and Claude Opus 4.7 in some benchmarks. It must be said, 3.5 Flash is both fast and powerful, with virtually no rivals.

Numbers are abstract; let's look at real demonstrations. In an instant, 3.5 Flash can digest an abstruse academic paper and write a fully interactive, visual website. In agent tasks, via Antigravity, it can complete multi-step workflows, automatically categorizing and naming sprawling assets. Or, using two agents, it reproduced the AlphaZero paper in just six hours and coded a fully playable game.

93 Agents Build an OS in Just 12 Hours

It's evident that all these capabilities of 3.5 Flash are enabled by the new Antigravity 2.0. Today, Google's agent development platform, Antigravity, has been upgraded to version 2.0, evolving from an IDE to a standalone desktop application, fully embracing an Agent-first design.

Varun took the stage and gave a demo that left the audience breathless. He tasked Antigravity powered by 3.5 Flash with building an operating system from scratch. 93 sub-agents worked in parallel, making over 15,000 model calls, processing 2.6 billion tokens. Twelve hours later, a completely blank project transformed into a fully functional OS kernel. Scheduler, memory management, file system—every line of code was written by agents, tested by agents, and audited by agents. The API cost was under $1,000.
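The totals quoted in the demo can be turned into rough per-unit figures. The sketch below is only back-of-the-envelope arithmetic on the numbers reported above; it treats "over 15,000" calls as a lower bound and "under $1,000" as an upper bound, and the function and variable names are illustrative, not part of any Google tooling.

```python
# Figures quoted in the demo, used here as approximate bounds.
SUB_AGENTS = 93
MODEL_CALLS = 15_000        # "over 15,000", so a lower bound
TOKENS = 2_600_000_000      # 2.6 billion tokens processed
HOURS = 12
API_COST_USD = 1_000        # "under $1,000", so an upper bound


def derived_figures() -> dict[str, float]:
    """Derive rough per-unit numbers from the quoted totals."""
    seconds = HOURS * 3600
    return {
        "tokens_per_call": TOKENS / MODEL_CALLS,
        "calls_per_agent": MODEL_CALLS / SUB_AGENTS,
        # Aggregate rate across all parallel sub-agents, not per agent.
        "tokens_per_second": TOKENS / seconds,
        "usd_per_million_tokens": API_COST_USD / (TOKENS / 1_000_000),
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions, the run averages at most about 173,000 tokens per model call, and the blended cost works out to under roughly $0.39 per million tokens.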

Then, he attempted to run DOOM on this AI-written operating system. The first attempt failed, lacking video and keyboard drivers. So he immediately entered a fix command in Antigravity 2.0, and the agents began automatically writing the driver code. After a moment, the DOOM screen appeared, and the venue erupted.

To summarize, Antigravity 2.0's core upgrades include:

Sub-agents can be dynamically generated; the main agent splits tasks into subtasks and assigns them out, running in parallel without interference.
Asynchronous task management prevents long-running operations from blocking the main thread.
Scheduled Tasks allow setting "timed tasks" for agents to execute automatically, like checking PR status once a day or running a health check script every hour.
New slash commands: /goal lets the agent run to completion; /grill-me makes the agent clarify requirements before acting; /browser explicitly controls browser usage.
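The fan-out pattern described in the first two upgrades (a main agent splitting work into subtasks that run in parallel without blocking) can be sketched with Python's standard `asyncio` library. This is a generic illustration, not Antigravity's API: `split_task`, `run_subagent`, and `main_agent` are hypothetical names, and a short sleep stands in for real model calls.

```python
import asyncio


async def run_subagent(name: str, subtask: str) -> str:
    """Hypothetical sub-agent: a short sleep stands in for a model call."""
    await asyncio.sleep(0.01)  # placeholder for real, long-running work
    return f"{name} finished: {subtask}"


def split_task(task: str) -> list[str]:
    """Hypothetical planner: a real main agent would ask a model to plan."""
    parts = ("scheduler", "memory management", "file system")
    return [f"{task} / {part}" for part in parts]


async def main_agent(task: str) -> list[str]:
    """Split the task, spawn one sub-agent per subtask, and wait for all."""
    subtasks = split_task(task)
    jobs = [
        asyncio.create_task(run_subagent(f"agent-{i}", sub))
        for i, sub in enumerate(subtasks, start=1)
    ]
    # gather() returns results in the order the jobs were passed in.
    return await asyncio.gather(*jobs)


if __name__ == "__main__":
    for line in asyncio.run(main_agent("build OS kernel")):
        print(line)
```

Because each sub-agent is its own task, a slow subtask delays only its own result; the event loop keeps servicing the others in the meantime.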

These capabilities have already been proven inside Google. Internal token throughput on Antigravity was 500 billion per day in March; it now runs at 3 trillion per day, a sixfold increase. Moreover, this 12x accelerated Flash is available in Antigravity starting today.

3.5 Flash is now the default model for both the Gemini App and Google Search AI Mode, available to all users worldwide. Developers can access it via Antigravity 2.0, Gemini API, and Google AI Studio. Enterprise users can onboard via Gemini Enterprise Agent Platform. Even more explosive, 3.5 Pro is currently in internal testing and will be released next month.

24/7 Personal Assistant: Gemini Spark Finally Arrives

The third major announcement tonight is undoubtedly Gemini Spark! Pichai's positioning for it is very clear: your personal AI agent. It doesn't stop even when you close your laptop. It runs on a dedicated virtual machine in the cloud, enabling 24/7 availability.

Gemini Spark is powered by Gemini 3.5 + the Antigravity framework, deeply integrated with Google's "Workspace suite." Product VP Josh Woodward took the stage to demonstrate two scenarios that drove the audience wild.

The first is a work scenario: Input an instruction, "Draft an email for the team summarizing all information from the past week about the Gemini Live launch." Spark automatically pulls information across Gmail, Docs, and chat logs, and also invokes a "ghostwriter" skill Woodward wrote himself, making the email automatically match his personal tone. The entire process is done in the background; a human only needs to review and send. Yes, Spark supports custom skills, allowing it to learn your voice, your preferences, and your work style.

The second is a life scenario: Planning a neighborhood block party. Upon receiving the task, Spark executes step by step. It creates an RSVP tracking sheet in Google Sheets, directly linked to Gmail, updating automatically as people reply. For neighbors who haven't signed up, Spark automatically drafts reminder emails, creating drafts for confirmation before sending. Then, it also generates a promotional deck in Google Slides, even including information about placing an inflatable castle in the neighborhood. The entire process didn't involve opening a single app.

Moreover, Spark possesses powerful voice input capabilities. Live on stage, Woodward pulled out his phone and directly issued three tasks via voice: "Find all meetings with Sundar and mark them bright pink," "Write an invitation for new neighbor John to join the block party list," "Create a doc listing things to do for the kids before the school year ends, sorted by deadline."

The voice directly converted into text instructions, and Spark automatically split the continuous voice input into three independent task threads, executing them in parallel in the background.
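The behavior described here (one transcript split into independent tasks that run side by side) can be approximated in a few lines. The sketch below is a deliberately naive assumption about how such splitting might work: `split_transcript` simply cuts on periods and `run_task` is a hypothetical stand-in for real tool calls. Nothing here reflects how Spark is actually implemented.

```python
from concurrent.futures import ThreadPoolExecutor


def split_transcript(transcript: str) -> list[str]:
    """Naive, hypothetical splitter: one task per period-terminated sentence."""
    return [part.strip() for part in transcript.split(".") if part.strip()]


def run_task(task: str) -> str:
    """Hypothetical executor: a real agent would call calendar, mail, or docs tools."""
    return f"done: {task}"


def handle_voice_input(transcript: str) -> list[str]:
    """Split the transcript into tasks and run them concurrently."""
    tasks = split_transcript(transcript)
    with ThreadPoolExecutor(max_workers=max(1, len(tasks))) as pool:
        # map() preserves input order even though tasks run concurrently.
        return list(pool.map(run_task, tasks))


if __name__ == "__main__":
    spoken = (
        "Find all meetings with Sundar and mark them bright pink. "
        "Write an invitation for new neighbor John to join the block party list. "
        "Create a doc listing things to do for the kids before the school year ends, "
        "sorted by deadline."
    )
    for result in handle_voice_input(spoken):
        print(result)
```

A production system would presumably use a language model rather than punctuation to find task boundaries, since spoken input rarely arrives with clean sentence breaks.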

Regarding pricing, the $100/month AI Ultra subscription provides access to the Spark Beta. The highest-tier Ultra plan has been reduced from $250 to $200. Spark will be available as a Beta next week, initially for U.S. AI Ultra subscribers.

Tonight, Google Unveils the Gateway to ASI

Looking back at this I/O, what's truly chilling isn't any single product. It's that all these capabilities arrived simultaneously.

Full multimodal understanding, full multimodal generation, and 24/7 online Agents—these three puzzle pieces were all put in place by Google in one night. Omni turns a sentence into a world without humans providing any assets; 93 agents create an operating system from scratch without humans writing a single line of code; Spark works for you 24/7 without humans opening an app.

When AI no longer needs humans to "feed it," but understands, decides, executes, and iterates on its own—the end of this road is called ASI (Artificial Superintelligence).

No one can give a definitive timeline. But tonight's Google I/O made everyone realize one thing: On the path to superintelligence, the obstacle of "technically impossible" no longer exists. What remains is merely the speed of engineering deployment. Half a year ago, we were debating whether AGI was a bubble. Half a year later, Google is already writing operating systems with agents. The acceleration in this industry has already surpassed what human intuition can perceive.

References:

https://youtu.be/wYSncx9zLIU
https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/
https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-omni/
https://antigravity.google/blog/introducing-google-antigravity-2-0
https://antigravity.google/blog/google-io-2026-feature-deep-dive

Edited by: Peach Moses

Related Questions

Q: What are the three major announcements made at Google I/O 2026 according to the article?

A: The three major announcements at Google I/O 2026 were: 1) The debut of Gemini Omni, a truly "omni" model capable of video output from any input. 2) The launch of the new flagship model Gemini 3.5 Flash, which significantly outperforms its predecessor. 3) The introduction of the personal AI Agent, Gemini Spark, which runs 24/7 in the cloud.

Q: What is the core capability of Gemini Omni as described in the article?

A: Gemini Omni is described as a truly "omni" AI creation engine. It can receive any combination of inputs (images, audio, video, text) and generate high-quality, meaningful videos. Key features include its understanding of the physical world, conversational video editing, and character and scene consistency across edits.

Q: How does the Gemini 3.5 Flash model compare to the Gemini 3.1 Pro according to Google's presentation?

A: According to Google CEO Sundar Pichai, Gemini 3.5 Flash outperforms the previous flagship model, Gemini 3.1 Pro, in almost all benchmark tests. It is described as achieving a massive leap forward in areas like coding, real-world agent tasks, and multimodal understanding, while also being over 4 times faster than competing models like GPT-5.5 and Claude Opus 4.7.

Q: What impressive feat did the upgraded Antigravity 2.0 platform with Gemini 3.5 Flash accomplish in the demonstration?

A: In a demonstration, the Antigravity 2.0 platform, powered by Gemini 3.5 Flash, coordinated 93 sub-agents to build a fully functional operating system kernel from scratch in just 12 hours. The agents made over 15,000 model calls and processed 2.6 billion tokens to create components like a scheduler, memory manager, and file system, and later successfully ran the classic game DOOM on this AI-built OS after a fix.

Q: What is the primary function of Gemini Spark, and what makes it unique?

A: Gemini Spark is a personal AI Agent designed to perform tasks autonomously on behalf of the user. Its primary function is to act as a 24/7 personal assistant that runs continuously on a dedicated cloud VM. It is unique because it keeps working even when the user's laptop is closed, integrates deeply with Google Workspace apps to perform complex, multi-step workflows, and executes multiple tasks parsed from a single voice command in parallel.
