Tens of Millions of Errors Per Hour: Investigation Reveals the 'Accuracy Illusion' of Google AI Search

marsbit | Published on 2026-04-10 | Last updated on 2026-04-10

Abstract

A New York Times investigation, conducted with AI startup Oumi, reveals significant accuracy and reliability problems with Google's AI Overviews search feature. Across more than 4,300 test queries, accuracy improved from 85% (Gemini 2) to 91% (Gemini 3). However, at Google's scale of roughly 5 trillion searches per year, the remaining 9% error rate translates to over 57 million incorrect answers generated hourly. A more critical issue is the prevalence of unsupported citations. Among correct answers, the share with "unsupported citations" (source links that do not back up the AI's claims) worsened, rising from 37% with Gemini 2 to 56% with Gemini 3. This makes it difficult for users to verify the information. The AI also relies heavily on low-quality sources, with Facebook and Reddit being its second and fourth most cited domains. Furthermore, the system is highly susceptible to manipulation. A BBC journalist successfully "poisoned" it by publishing a fake article; Google's AI began presenting the false information as fact within 24 hours. Google disputed the study's methodology, criticizing both the SimpleQA benchmark and the use of one AI model (Oumi's HallOumi) to grade another AI's output. The company maintains that its internal safeguards and ranking systems improve accuracy beyond the base model's performance.

Author: Claude, Deep Tide TechFlow

Deep Tide Introduction: The latest test by The New York Times in collaboration with AI startup Oumi shows that the accuracy rate of Google Search's AI Overviews feature is about 91%. However, given Google's scale of processing 5 trillion searches annually, this translates to tens of millions of incorrect answers generated every hour. More troublingly, even when the answers are correct, over half of the cited links fail to support their conclusions.

Google is delivering misinformation to users on an unprecedented scale, and most people are completely unaware.

According to The New York Times, Oumi, an AI startup commissioned by the publication, used SimpleQA, an industry-standard benchmark developed by OpenAI, to evaluate the accuracy of Google's AI Overviews feature. The test covered 4,326 search queries and was run twice: once in October last year, when AI Overviews was powered by Gemini 2, and again in February this year, after the upgrade to Gemini 3. Accuracy was about 85% with Gemini 2 and improved to 91% with Gemini 3.

91% sounds good, but the picture changes at Google's scale. Google processes approximately 5 trillion search queries annually. At a 9% error rate, AI Overviews generates over 57 million inaccurate answers per hour, nearly 1 million per minute.
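The per-hour estimate can be sanity-checked with simple division. The sketch below is only a back-of-the-envelope check under two assumptions that are not stated in the report: a 365-day year, and every search returning an AI Overview (`overview_share` is a hypothetical parameter for relaxing that). Under those assumptions a 9% error rate works out to roughly 51 million wrong answers per hour, while the 57 million figure corresponds to a rate closer to 10%, so the number is best read as an order-of-magnitude estimate.

```python
# Back-of-the-envelope check of the hourly error estimate.
ANNUAL_SEARCHES = 5e12        # ~5 trillion queries per year (figure from the article)
HOURS_PER_YEAR = 365 * 24     # assumption: non-leap year


def errors_per_hour(error_rate: float, overview_share: float = 1.0) -> float:
    """Estimate wrong AI Overview answers per hour.

    overview_share is a hypothetical knob: the fraction of searches that
    actually show an AI Overview. 1.0 assumes every search does.
    """
    searches_per_hour = ANNUAL_SEARCHES / HOURS_PER_YEAR
    return searches_per_hour * overview_share * error_rate


per_hour_at_9 = errors_per_hour(0.09)
per_hour_at_10 = errors_per_hour(0.10)

print(f"9% error rate:  {per_hour_at_9 / 1e6:.1f} million per hour")
print(f"10% error rate: {per_hour_at_10 / 1e6:.1f} million per hour")
print(f"9% error rate:  {per_hour_at_9 / 60 / 1e3:.0f} thousand per minute")
```

If only a fraction of searches trigger an AI Overview, passing a smaller `overview_share` scales the estimate down proportionally.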

Correct Answers, Wrong Sources

More alarming than the accuracy rate is the problem of unsupported citations.

Oumi's data shows that in the Gemini 2 era, 37% of correct answers had "unsupported citations," meaning the links attached to the AI summaries did not support the information provided. After upgrading to Gemini 3, this proportion increased instead of decreasing, jumping to 56%. In other words, while the model gives correct answers, it's increasingly failing to "show its work."
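One way to read the two sets of figures together is to multiply them. The sketch below is a derived illustration rather than a number reported by Oumi: it assumes the unsupported-citation rate applies to the correct answers only, as described above, and computes the share of all answers that are both correct and backed by their cited links. The function name is hypothetical. On these inputs that share falls from roughly 54% with Gemini 2 to roughly 40% with Gemini 3, even though headline accuracy rose.

```python
def correct_and_supported(accuracy: float, unsupported_rate: float) -> float:
    """Share of all answers that are correct AND backed by their citations.

    accuracy:          fraction of answers that are correct
    unsupported_rate:  fraction of *correct* answers whose cited links
                       do not support the claim
    """
    return accuracy * (1.0 - unsupported_rate)


gemini2_share = correct_and_supported(accuracy=0.85, unsupported_rate=0.37)
gemini3_share = correct_and_supported(accuracy=0.91, unsupported_rate=0.56)

print(f"Gemini 2: {gemini2_share:.0%} of answers correct and supported")
print(f"Gemini 3: {gemini3_share:.0%} of answers correct and supported")
```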

Oumi CEO Manos Koukoumidis asked pointedly: "Even if the answer is correct, how do you know it's correct? How do you verify it?"

The problem is exacerbated by AI Overviews' heavy reliance on low-quality sources. Oumi found that Facebook and Reddit are the second and fourth most cited sources for AI Overviews, respectively. In inaccurate answers, Facebook was cited 7% of the time, higher than the 5% in accurate answers.

BBC Journalist's Fake Article "Poisoned" Results Within 24 Hours

Another serious flaw of AI Overviews is its susceptibility to manipulation.

A BBC journalist tested the system by publishing a deliberately fabricated article. In less than 24 hours, Google's AI Overview was presenting the false information from that article to users as fact.

This means anyone who understands how the system works could potentially "poison" AI search results by publishing false content and boosting its traffic. Google spokesperson Ned Adriance responded by saying the search AI feature is built on the same ranking and security mechanisms that block spam, and claimed that "most examples in the test are unrealistic queries that people wouldn't actually search for."

Google's Rebuttal: The Test Itself Is Flawed

Google raised several objections to Oumi's research. A Google spokesperson called the study "seriously flawed," citing reasons including: the SimpleQA benchmark itself contains inaccurate information; Oumi used its own AI model HallOumi to judge another AI's performance, potentially introducing additional errors; and the test content doesn't reflect real user search behavior.

Google's internal tests also showed that when Gemini 3 operates independently outside the Google Search framework, it produces false outputs at a rate as high as 28%. But Google emphasized that AI Overviews leverages the search ranking system to improve accuracy, performing better than the model itself.

However, PCMag's commentary pointed out the logical paradox: if your defense is that "the report pointing out our AI's inaccuracies itself uses potentially inaccurate AI," that probably does not enhance users' confidence in your product's accuracy.
