Gemini 3.5 is Here! Tonight, Google Overtakes Google

ChainCatcher (链捕手) | Published 2026-05-20 | Last updated 2026-05-20

Introduction

At Google I/O 2026, the company unveiled a suite of AI advancements headlined by three major releases. First, **Gemini Omni**, a true "omnimodal" model, can generate high-quality, coherent videos from any combination of text, image, audio, or video inputs, maintaining character consistency and physical logic across iterative edits. Second, the new flagship **Gemini 3.5 Flash** outperforms the previous Gemini 3.1 Pro on key benchmarks for coding, agent tasks, and multimodal reasoning, with an output speed over four times that of GPT-5.5 and Opus 4.7. It powers the upgraded **Antigravity 2.0**, a standalone agent development platform that orchestrated 93 sub-agents to build a functional operating system kernel from scratch in just 12 hours. Third, **Gemini Spark** debuted as a personal, always-on AI agent. Running 24/7 in the cloud and integrated with Google Workspace, it can autonomously execute complex multi-step tasks like drafting emails, managing schedules, and planning events by accessing apps like Gmail, Docs, and Sheets. Together, these releases move AI beyond simple generation towards autonomous understanding, decision-making, and task execution.

Author: XinZhiYuan

 

Google I/O 2026 goes all out!

Just now, Pichai and Demis Hassabis took the stage together and unveiled, all at once, the major releases Google has been saving up for half a year.

Without any suspense, the biggest star of the night, Gemini Omni, officially debuted!

As a truly "omni" model, Omni can accept any form of input and generate any content. It debuts with video output support, making it the "video version of Nano Banana".

Another highlight of the night belongs to Gemini 3.5 Flash.

In almost all benchmarks, 3.5 Flash has achieved a crushing victory over its predecessor flagship, Gemini 3.1 Pro. Its output speed has also doubled, and it is over 4 times faster than GPT-5.5 and Opus 4.7. The more powerful 3.5 Pro will be released next month.

In addition, a slew of other major new products were unveiled:

  • Antigravity 2.0: A brand-new standalone desktop application, evolving from an IDE to an Agent development platform.

  • Gemini Spark: A personal AI agent, running 24/7 in the cloud.

  • Gemini App Redesign: Code-named "Neural Expressive," switching to compute-based billing.

  • AI Ultra Subscription Plan: Adds a new $100 tier; highest tier reduced from $250 to $200.

  • Google Search's Biggest Upgrade in 25 Years: Integrated with 3.5 Flash, adds intelligent search box, automatic mini-app generation, etc.

    ......

Without exaggeration, the density of substantive announcements at this I/O is the highest in years.

Gemini Omni Debut: The Birth of an 'Omni' AI

As hinted by the teaser video, the highly anticipated Gemini Omni has finally arrived. Hassabis personally took the stage to announce, "We are taking the next important step—Gemini Omni, a new model that can create content from any input."

That Hassabis made the announcement himself says it all. What Google aims to build this time is an "omni" AI creation engine. It combines Gemini's intelligence with Google's strongest generative models, pushing world understanding, multimodality, and editing as far as they will go. Put simply, given any combination of images, audio, video, and text, it can generate a high-quality video, and you can then edit that video through conversation.

More crucially, Omni doesn't just produce footage that looks right; it truly understands the physical world. Hassabis stated, "Previous systems often stumbled when simulating concepts like gravity and momentum, but Omni achieves a 'step change.'" It injects Gemini's "world knowledge" and "reasoning ability" into video generation.

  • Given the prompt "Explain protein folding using clay animation," the generated video accurately depicts amino acid chains folding into α-helices and β-sheets at every step, visually presented as exquisite stop-motion animation.

  • Another example: assigning corresponding objects to the 26 letters of the English alphabet. C for Capybara, D for Disco Ball, L for Lava Lamp. Omni isn't just pasting assets; it's genuinely connecting language, images, and semantics.

It has to be said, the leap from realism to meaningfulness is enormous.

On stage, Hassabis pulled out a selfie video and began live editing. A circle drawn on a palm turned into a black hole; an evening street stroll transformed into a cyberpunk scene. Rewrite the scene with a sentence, change the world with another. Anything can become a canvas for creating new realities. For instance, conjuring fire in your palm from a selfie, or a circle drawn on paper instantly becoming a black hole—all sorts of imaginative possibilities are now achievable.

Moreover, this isn't a one-time generation. You can continue the conversation. Characters remain consistent in Gemini Omni's video output, physical logic holds, and scene memory is coherent.

  • Round one starts from an original performance clip. Round two: "Teleport the violinist into the environment of this picture," with a reference image of snowy mountains and meadows attached. The scene switches instantly, and the movements and lighting adapt fully to the new environment.

  • Round three: "Cut the shot to behind the violinist's shoulder." The perspective rotates, but the performance actions and music remain completely continuous.

No matter how the scene changes, the main subjects in the video do not break.

What's even more thought-provoking is Omni's input flexibility. Images, text, video, audio—any references can be mixed as input to generate a coherent output. You can even create your own avatar, allowing an AI version of you to appear in any scene, speaking with your voice and doing things you haven't done.

Omni Flash is available now, and the API version will open in the coming weeks. The more powerful Omni Pro is also on the way. Thanks to Google's integration capabilities, Omni ships at launch inside the Gemini App, Google Flow, and YouTube Shorts, and it is free for YouTube Shorts users.

Flash Overtakes Pro: 3.5 Redefines 'Flagship'

Following Gemini Omni, another major highlight of this I/O is the release of the new flagship, Gemini 3.5 Flash. Google defines it as the strongest coding and agent model to date.

On stage, Pichai personally announced, "3.5 Flash outperforms Gemini 3.1 Pro across virtually all benchmarks!" Remember, 3.1 Pro was the flagship model Google launched just three months ago. Now, a Flash-tier model is crushing it.

Unexpectedly, Google delivered such impressive results in such a short time:

  • Terminal-Bench 2.1 (Coding): 76.2%

  • GDPval-AA (Real-world Agent Tasks): 1656 Elo

  • MCP Atlas (Large-scale Tool Usage): 83.6%

  • CharXiv Reasoning (Multimodal Understanding): 84.2%

In the four major benchmarks above, compared to Gemini 3.1 Pro, 3.5 Flash represents a massive leap forward. In terms of speed, 3.5 Flash occupies its own quadrant at 289 tokens/second, over 4 times faster than other frontier models. Additionally, 3.5 Flash matches or even surpasses GPT-5.5 and Claude Opus 4.7 in some benchmarks. It must be said, 3.5 Flash is both fast and powerful, with virtually no rivals.

Numbers are abstract; let's look at real demonstrations. In an instant, 3.5 Flash can digest an abstruse academic paper and write a fully interactive, visual website. In agent tasks, via Antigravity, it can complete multi-step workflows, automatically categorizing and naming sprawling assets. Or, using two agents, it reproduced the AlphaZero paper in just six hours and coded a fully playable game.

93 Agents Build an OS in Just 12 Hours

It's evident that all these capabilities of 3.5 Flash are enabled by the new Antigravity 2.0. Today, Google's agent development platform, Antigravity, has been upgraded to version 2.0, evolving from an IDE to a standalone desktop application, fully embracing an Agent-first design.

Varun took the stage and gave a demo that left the audience breathless. He tasked Antigravity powered by 3.5 Flash with building an operating system from scratch. 93 sub-agents worked in parallel, making over 15,000 model calls, processing 2.6 billion tokens. Twelve hours later, a completely blank project transformed into a fully functional OS kernel. Scheduler, memory management, file system—every line of code was written by agents, tested by agents, and audited by agents. The API cost was under $1,000.
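
To put the demo's figures in perspective, here is a rough back-of-the-envelope calculation. It is a sketch that uses only the numbers quoted above; since "over 15,000" calls and "under $1,000" are bounds rather than exact values, the derived figures are bounds too, and the article does not say how the tokens split between input and output. The function and constant names are illustrative.

```python
# Derived averages for the OS-building demo, using only the article's figures.
# These are bounds and averages, not measurements.

SUB_AGENTS = 93
MODEL_CALLS = 15_000        # "over 15,000", so a lower bound
TOKENS = 2_600_000_000      # 2.6 billion tokens processed
HOURS = 12
MAX_COST_USD = 1_000        # "under $1,000", so an upper bound


def demo_stats() -> dict[str, float]:
    """Return rough per-call, per-second, and per-token figures."""
    return {
        # More calls than 15,000 would lower this average.
        "tokens_per_call_max": TOKENS / MODEL_CALLS,
        # Aggregate throughput across all sub-agents over the 12 hours.
        "tokens_per_second": TOKENS / (HOURS * 3600),
        # Average calls per sub-agent if work were spread evenly.
        "calls_per_agent_min": MODEL_CALLS / SUB_AGENTS,
        # Blended cost ceiling per million tokens.
        "usd_per_million_tokens_max": MAX_COST_USD / (TOKENS / 1_000_000),
    }


if __name__ == "__main__":
    for name, value in demo_stats().items():
        print(f"{name}: {value:,.2f}")
```

Under these assumptions, each call averaged at most about 173,000 tokens, and the blended cost stayed below roughly $0.39 per million tokens.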

Then, he attempted to run DOOM on this AI-written operating system. The first attempt failed, lacking video and keyboard drivers. So he immediately entered a fix command in Antigravity 2.0, and the agents began automatically writing the driver code. After a moment, the DOOM screen appeared, and the venue erupted.

To summarize, Antigravity 2.0's core upgrades include:

  • Sub-agents can be dynamically generated; the main agent splits tasks into subtasks and assigns them out, running in parallel without interference.

  • Asynchronous task management prevents long-running operations from blocking the main thread.

  • Scheduled Tasks allow setting "timed tasks" for agents to execute automatically, like checking PR status once a day or running a health check script every hour.

  • New slash commands: /goal lets the agent run to completion; /grill-me makes the agent clarify requirements before acting; /browser explicitly controls browser usage.
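
The fan-out pattern in the first two bullets (a main agent splitting work into subtasks that run in parallel without blocking) can be sketched with Python's standard `asyncio` library. This is a generic illustration, not Antigravity's API: `run_subagent` and `orchestrate` are hypothetical names, and the sleep stands in for a real model call.

```python
import asyncio


async def run_subagent(name: str, subtask: str) -> str:
    """Hypothetical stand-in for a sub-agent; a real one would call a model."""
    await asyncio.sleep(0.01)  # simulate long-running work without blocking
    return f"{name} finished: {subtask}"


async def orchestrate(subtasks: list[str]) -> list[str]:
    """Spawn one sub-agent per subtask and run them all concurrently."""
    jobs = [
        asyncio.create_task(run_subagent(f"agent-{i}", task))
        for i, task in enumerate(subtasks, start=1)
    ]
    # gather() waits for every job and returns results in submission order.
    return await asyncio.gather(*jobs)


if __name__ == "__main__":
    parts = ["scheduler", "memory management", "file system"]
    for line in asyncio.run(orchestrate(parts)):
        print(line)
```

Because each sub-agent yields control while it waits, the three subtasks finish in roughly the time of one, which is the point of keeping long-running operations off the main thread.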

These capabilities have already been proven inside Google. In March, internal use of Antigravity at Google processed 500 billion tokens per day. Now the figure is 3 trillion per day, a sixfold increase. Moreover, this 12x-accelerated Flash is available in Antigravity starting today.

3.5 Flash is now the default model for both the Gemini App and Google Search AI Mode, available to all users worldwide. Developers can access it via Antigravity 2.0, Gemini API, and Google AI Studio. Enterprise users can onboard via Gemini Enterprise Agent Platform. Even more explosive, 3.5 Pro is currently in internal testing and will be released next month.

24/7 Personal Assistant: Gemini Spark Finally Arrives

The third major announcement tonight is undoubtedly Gemini Spark! Pichai's positioning for it is very clear: your personal AI agent. It doesn't stop even when you close your laptop. It runs on a dedicated virtual machine in the cloud, enabling 24/7 availability.

Gemini Spark is powered by Gemini 3.5 + the Antigravity framework, deeply integrated with Google's "Workspace suite." Product VP Josh Woodward took the stage to demonstrate two scenarios that drove the audience wild.

  • The first is a work scenario: Input an instruction, "Draft an email for the team summarizing all information from the past week about the Gemini Live launch." Spark automatically pulls information across Gmail, Docs, and chat logs, and also invokes a "ghostwriter" skill Woodward wrote himself, making the email automatically match his personal tone. The entire process is done in the background; a human only needs to review and send. Yes, Spark supports custom skills, allowing it to learn your voice, your preferences, and your work style.

  • The second is a life scenario: Planning a neighborhood block party. Upon receiving the task, Spark executes step by step. It creates an RSVP tracking sheet in Google Sheets, directly linked to Gmail, updating automatically as people reply. For neighbors who haven't signed up, Spark automatically drafts reminder emails, creating drafts for confirmation before sending. Then, it also generates a promotional deck in Google Slides, even including information about placing an inflatable castle in the neighborhood. The entire process didn't involve opening a single app.

Moreover, Spark possesses powerful voice input capabilities. Live on stage, Woodward pulled out his phone and directly issued three tasks via voice: "Find all meetings with Sundar and mark them bright pink," "Write an invitation for new neighbor John to join the block party list," "Create a doc listing things to do for the kids before the school year ends, sorted by deadline."

The voice directly converted into text instructions, and Spark automatically split the continuous voice input into three independent task threads, executing them in parallel in the background.

Regarding pricing, the $100/month AI Ultra subscription provides access to the Spark Beta. The highest-tier Ultra plan has been reduced from $250 to $200. Spark will be available as a Beta next week, initially for U.S. AI Ultra subscribers.

Tonight, Google Unveils the Gateway to ASI

Looking back at this I/O, what's truly chilling isn't any single product. It's that all these capabilities arrived simultaneously.

Full multimodal understanding, full multimodal generation, and 24/7 online Agents—these three puzzle pieces were all put in place by Google in one night. Omni turns a sentence into a world without humans providing any assets; 93 agents create an operating system from scratch without humans writing a single line of code; Spark works for you 24/7 without humans opening an app.

When AI no longer needs humans to "feed it," but understands, decides, executes, and iterates on its own—the end of this road is called ASI (Artificial Superintelligence).

No one can give a definitive timeline. But tonight's Google I/O made everyone realize one thing: On the path to superintelligence, the obstacle of "technically impossible" no longer exists. What remains is merely the speed of engineering deployment. Half a year ago, we were debating whether AGI was a bubble. Half a year later, Google is already writing operating systems with agents. The acceleration in this industry has already surpassed what human intuition can perceive.

References:

  • https://youtu.be/wYSncx9zLIU

  • https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-3-5/

  • https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-omni/

  • https://antigravity.google/blog/introducing-google-antigravity-2-0

  • https://antigravity.google/blog/google-io-2026-feature-deep-dive

Edited by: Peach Moses

Related Questions

Q: What are the three major announcements made at Google I/O 2026 according to the article?

A: The three major announcements at Google I/O 2026 were: 1) The debut of Gemini Omni, a truly "omni" model that generates video from any input. 2) The launch of the new flagship model Gemini 3.5 Flash, which significantly outperforms its predecessor. 3) The introduction of the personal AI agent Gemini Spark, which runs 24/7 in the cloud.

Q: What is the core capability of Gemini Omni as described in the article?

A: Gemini Omni is an "omni" AI creation engine. It accepts any combination of inputs (images, audio, video, text) and generates high-quality, meaningful videos. Key features include its understanding of the physical world, conversational video editing, and character and scene consistency across edits.

Q: How does the Gemini 3.5 Flash model compare to the Gemini 3.1 Pro according to Google's presentation?

A: According to Google CEO Sundar Pichai, Gemini 3.5 Flash outperforms the previous flagship model, Gemini 3.1 Pro, across virtually all benchmarks. The article describes it as a massive leap forward in coding, real-world agent tasks, and multimodal understanding, with an output speed over 4 times that of competing models like GPT-5.5 and Claude Opus 4.7.

Q: What impressive feat did the upgraded Antigravity 2.0 platform with Gemini 3.5 Flash accomplish in the demonstration?

A: In a demonstration, the Antigravity 2.0 platform, powered by Gemini 3.5 Flash, coordinated 93 sub-agents to build a fully functional operating system kernel from scratch in just 12 hours. The agents processed 2.6 billion tokens across more than 15,000 model calls to create components like a scheduler, memory manager, and file system, and later ran the classic game DOOM on the AI-built OS after a driver fix.

Q: What is the primary function of Gemini Spark, and what makes it unique?

A: Gemini Spark is a personal AI agent designed to perform tasks autonomously on behalf of the user. Its primary function is to act as a 24/7 personal assistant that runs continuously on a dedicated cloud VM. It is unique because it keeps working when the user's device is off, integrates deeply with Google Workspace apps to perform complex, multi-step workflows, and executes in parallel multiple tasks parsed from a single voice command.
