When AI Traffic Surpasses Humans, How Do You Prove You're Human?

marsbit | Published on 2026-06-12 | Last updated on 2026-06-12

Abstract

With AI-generated web traffic surpassing human activity, websites face a crisis as AI agents bypass ads, avoid clicks, and scrape data without generating revenue. This disrupts the ad-based internet economy, diverting traffic and reducing site visits. In response, sites are blocking AI crawlers and deploying traps like Cloudflare's "honeypot" pages. Traditional CAPTCHAs are now ineffective against advanced AI. The focus has shifted to behavioral biometrics: analyzing unique human patterns such as cursor movement, typing rhythm, and keystroke dynamics. Companies like IBM and BioCatch use this data to distinguish humans from bots, even detecting fraud through behavioral inconsistencies. Two competing approaches aim to verify human identity. Sam Altman's World (formerly Worldcoin) uses iris scanning to create unique credentials, though it faces privacy concerns and regulatory bans. Alternatively, cryptographic zero-knowledge proofs offer anonymous verification without revealing personal data, an approach championed by Vitalik Buterin to avoid centralized surveillance. However, both systems have flaws. Centralized solutions risk biometric data misuse, while decentralized models may be exploited through identity rental markets in economically unequal regions. Despite these challenges, the author favors cryptographic methods, which preserve privacy, over pervasive behavioral monitoring that permanently captures and controls personal biometric data.

Author: Vaidik Mandloi

Compiled by: Luffy, Foresight News

Since its launch at the end of 2022, ChatGPT has spawned a vast ecosystem of AI agents. The total web traffic generated by such programs has now surpassed that of all human users worldwide. The online behavior of AI agents is fundamentally different from that of humans: they don't view ads, click on links, or shop online; they simply crawl web data to complete tasks and leave once finished.

The internet's original architecture and business logic were built around human behavior and usage patterns. Yet today, the vast majority of web visits are not from real people, a situation that deeply troubles many websites. Currently, 2.5 million websites have begun blocking AI crawlers, with platforms like Perplexity getting embroiled in related lawsuits. Cloud service provider Cloudflare has even built "honeypot mazes," using AI-generated nonsensical text to create infinite-loop pages designed to trap various data crawlers.
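One common way for a site to declare AI crawlers unwelcome is a robots.txt file. The article does not say which blocking mechanism these websites use, so the following is only a minimal sketch under that assumption. The robots.txt content and the example.com URL are hypothetical, and GPTBot and ClaudeBot are used here as the user-agent tokens commonly associated with OpenAI's and Anthropic's crawlers. The sketch uses Python's standard urllib.robotparser to show how a rule-abiding crawler would read such a file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: turn away two AI crawlers, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: *
Allow: /
"""


def allowed_agents(robots_txt: str, agents: list[str], url: str) -> dict[str, bool]:
    """Return, for each user agent, whether the rules permit fetching `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


result = allowed_agents(
    ROBOTS_TXT,
    ["GPTBot", "ClaudeBot", "Mozilla/5.0"],
    "https://example.com/article",  # placeholder URL
)
print(result)
```

Compliance with robots.txt is voluntary: the file only states the site's wishes, and a crawler that ignores it is not stopped by it. That limitation is consistent with the move toward traps and stronger verification described here.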

However, some advanced AI agents have already developed the ability to bypass such protective measures. In the face of escalating human-machine conflict, the industry is now focusing on developing a more reliable human identity verification mechanism. Such a system needs to accurately identify whether the operator behind the screen is human: real human operators exhibit hesitation, typing errors, and cursor movements with the subtle tremors unique to the human nervous system. This article analyzes the causes behind this transformation, the two mainstream technological solutions, and the choice people will face: either accept centralized biometric monitoring or adopt cryptographic zero-knowledge proofs for anonymous human verification.

AI Disrupts the Internet's Business Model

The root cause of websites blocking AI programs lies in AI undermining the commercial foundation of the internet from both ends. The profitability of the traditional internet is built on user attention: users visit pages, view ads, and content publishers earn revenue. If an AI handles shopping, it might search 5,000 websites at once, whereas an ordinary person typically browses only four or five pages.

AI reads far faster than humans, capable of comparing prices across the entire web and even placing orders directly within minutes, a process that generates no ad views. This means websites bear server costs without earning any revenue.

Simultaneously, AI search is continuously diverting website traffic. After Google added AI-generated summaries at the top of search results, only 8% of users clicked through to the original webpages, leading to a direct 33% drop in referral traffic for major content sites from Google. Within just a year of its launch, this feature's monthly active users exceeded 1 billion, and platform retrieval volume has doubled every quarter since its debut.

Surely everyone remembers Chegg, the study help platform. It originally operated a homework Q&A business relying on strong search rankings, but has now officially shut down its Q&A section, attributing its demise to the impact of ChatGPT. Content creators are caught in a double bind: crawlers scrape content on one side, while AI summaries intercept traffic before users even reach the website.

The data gap is even more staggering. For every visit that OpenAI's crawler refers to a partner website, it has already scraped 400 pages; for Anthropic, the ratio reaches 38,000:1. These companies use publicly available data from across the web to train AI models for free, then use the finished products to divert traffic that originally belonged to the websites.

In any other industry, such predatory data collection would have sparked countless lawsuits, yet in the AI field, these companies secure valuations in the trillions.

Your Body is the New Password

For the past 25 years, the internet has primarily relied on CAPTCHAs to distinguish humans from machines. People had to identify traffic signs or type in distorted characters. This mechanism worked because machines' image recognition capabilities used to be far inferior to those of humans.

Now the situation has completely reversed. In simulations of Google's human verification system, OpenAI's agent scores far higher than humans, accurately clicking interface elements and copying and pasting content; AI-generated photos can fool identity verification systems, and criminals have even used deepfake video calls to complete bank transfers. The design premise of traditional verification methods, that machines are weaker than humans, no longer holds.

The industry is now forced to focus on areas where AI still struggles to replicate human capabilities: the physical behavioral characteristics displayed when humans operate electronic devices, also known as behavioral biometrics. Companies like IBM and BioCatch are developing related systems. This technology not only verifies identity at login but also monitors user behavior throughout the session, collecting data on cursor movement speed, page scrolling patterns, typing rhythm, keystroke pressure, text editing habits, and even phone holding angles, with the phone's gyroscope recording relevant information throughout.

The system can also recognize details like the user's dominant hand and finger sliding trajectory. IBM needs to collect usage data just eight times to establish a unique user behavioral profile, which is then continuously compared against benchmark data for every subsequent operation.
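The enroll-then-compare idea can be sketched in a few lines. This is not IBM's or BioCatch's actual method, which the article does not describe; the feature names, sample values, and threshold below are all hypothetical. The sketch builds a baseline (mean and standard deviation per feature) from eight enrollment sessions, then scores a new session by how many standard deviations it sits from that baseline.

```python
from statistics import mean, stdev

# Eight hypothetical enrollment sessions for one user (made-up numbers).
ENROLLMENT = [
    {"key_interval_ms": 180, "cursor_speed_px_s": 400},
    {"key_interval_ms": 175, "cursor_speed_px_s": 420},
    {"key_interval_ms": 190, "cursor_speed_px_s": 390},
    {"key_interval_ms": 185, "cursor_speed_px_s": 410},
    {"key_interval_ms": 170, "cursor_speed_px_s": 430},
    {"key_interval_ms": 195, "cursor_speed_px_s": 405},
    {"key_interval_ms": 182, "cursor_speed_px_s": 415},
    {"key_interval_ms": 178, "cursor_speed_px_s": 395},
]

THRESHOLD = 3.0  # assumed cut-off, in standard deviations


def build_profile(sessions):
    """Per-feature (mean, standard deviation) over the enrollment sessions."""
    profile = {}
    for feature in sessions[0]:
        values = [s[feature] for s in sessions]
        profile[feature] = (mean(values), stdev(values))
    return profile


def anomaly_score(profile, session):
    """Mean absolute z-score of a session against the stored baseline."""
    z_scores = []
    for feature, (mu, sigma) in profile.items():
        sigma = max(sigma, 1e-9)  # guard against a zero-variance feature
        z_scores.append(abs(session[feature] - mu) / sigma)
    return mean(z_scores)


profile = build_profile(ENROLLMENT)
same_user = {"key_interval_ms": 185, "cursor_speed_px_s": 415}
scripted = {"key_interval_ms": 20, "cursor_speed_px_s": 3000}

print(anomaly_score(profile, same_user) < THRESHOLD)  # close to the baseline
print(anomaly_score(profile, scripted) > THRESHOLD)   # far from the baseline
```

A production system would presumably use many more signals and a learned model rather than a fixed threshold; the point here is only the shape of the comparison against a stored baseline.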

BioCatch's technology can even identify online scam scenarios. When a victim reads out account passwords following a scammer's phone instructions, the panicked and disjointed typing rhythm is precisely captured by the system. Within just one year, the system helped 257 banks identify approximately 2 million money laundering accounts. The EU has also begun piloting gait recognition technology. Just three years into the era of AI agents, EU border personnel are already collecting data on people's walking gaits.

Related research also incorporates the Stroop effect: when the word "blue" is written in green font, the human brain experiences conflict between word meaning and visual color, significantly slowing reaction time, but AI remains unaffected. Research finds this cognitive interference is directly reflected in typing behavior. Platforms may not even need specific test questions; based on keystroke rhythm alone, they can judge whether the operator is human. Human typing habits contain unique characteristics of brain information processing.
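As a rough illustration of judging from keystroke rhythm alone, one simple heuristic looks at how irregular the gaps between keystrokes are: scripted input tends to arrive at near-uniform intervals, while human typing speeds up and stalls. This heuristic is chosen only for illustration and is not the method used in the research mentioned above; the timestamps and the cut-off value are invented.

```python
from statistics import mean, pstdev


def inter_key_gaps(timestamps_ms):
    """Gaps between consecutive keystroke timestamps."""
    return [later - earlier for earlier, later in zip(timestamps_ms, timestamps_ms[1:])]


def rhythm_variability(timestamps_ms):
    """Coefficient of variation (standard deviation / mean) of the gaps."""
    gaps = inter_key_gaps(timestamps_ms)
    return pstdev(gaps) / mean(gaps)


def looks_scripted(timestamps_ms, min_cv=0.15):
    """Flag input whose rhythm is suspiciously regular (assumed cut-off)."""
    return rhythm_variability(timestamps_ms) < min_cv


# Hypothetical keystroke timestamps in milliseconds.
human = [0, 140, 390, 480, 910, 1010, 1320, 1400]
bot = [0, 50, 100, 150, 200, 250, 300, 350]

print(looks_scripted(human))
print(looks_scripted(bot))
```

A bot could presumably add random jitter to defeat a check this simple, which is one reason real systems would need to combine many behavioral signals rather than rely on a single statistic.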

Previous web tracking mainly recorded user browsing, clicking, and consumption behaviors. Users could evade this by blocking cookies, using VPNs, or turning off location services. But behavioral biometrics collects instinctive human characteristics: cursor movement patterns and typing rhythms are difficult to consciously alter.

Each person's behavioral characteristics are as unique as fingerprints. Unlike passwords or keys, this biometric profile cannot be changed or reset. Once this technology becomes widespread, major platforms will be forced to adapt. Voice simulation technology can already deceive in phone calls, and video deepfake technology is following closely. If this is the future, the core question emerges: Who will ultimately control this human data?

Who Controls the Human Verification System?

Currently, the industry is divided into two main camps exploring human identity verification solutions.

The first is Sam Altman's World (formerly Worldcoin). Users need to approach a spherical iris-scanning device. The device collects iris information and generates an encrypted credential to prove the user is a unique natural person. Currently, 18 million people across 160 countries have completed iris registration. In April 2026, World formed user verification partnerships with dating app Tinder, video conferencing platform Zoom, and e-signature service DocuSign. It also collaborated with Coinbase to launch the AgentKit tool, allowing users to link their AI agents to their verified identity. Platforms can confirm a human is behind the agent without leaking personal information.

However, iris scanning technology has been explicitly banned by multiple countries. The core reason for this resistance is that the public is unclear about the potential risks of authorizing biometric data collection. An investigation by MIT Technology Review also found that World, without valid authorization, had collected vital-sign data such as heart rate and respiration in addition to iris data.

The second camp relies on zero-knowledge proofs, a cryptographic technique that allows you to prove you are human without revealing your real identity, location, or appearance. Vitalik Buterin proposed this concept as early as 2023. He argued that if a decentralized human identity system cannot be built, the internet will ultimately move toward centralized identity control. Once identity verification authority is held by companies or governments, surveillance mechanisms will become embedded in the network's foundation.
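To give a feel for what proving something without revealing it means, the sketch below runs the Schnorr identification protocol, a classic zero-knowledge proof of knowledge of a secret key. It is not the scheme Buterin describes, nor any deployed proof-of-personhood system, both of which would need considerably more machinery. The group parameters are toy-sized and insecure, and the function names are hypothetical.

```python
import secrets

# Toy group: G = 2 has prime order Q = 11 modulo P = 23.
# Far too small for real use; chosen so the arithmetic is easy to follow.
P, Q, G = 23, 11, 2


def keygen():
    """Secret x in [1, Q-1] and the matching public key y = G^x mod P."""
    x = secrets.randbelow(Q - 1) + 1
    return x, pow(G, x, P)


def commit():
    """Prover picks a random nonce r and sends t = G^r mod P."""
    r = secrets.randbelow(Q)
    return r, pow(G, r, P)


def challenge():
    """Verifier replies with a random challenge c."""
    return secrets.randbelow(Q)


def respond(x, r, c):
    """Prover answers with s = r + c*x mod Q; x itself is never sent."""
    return (r + c * x) % Q


def verify(y, t, c, s):
    """Verifier checks G^s == t * y^c (mod P)."""
    return pow(G, s, P) == (t * pow(y, c, P)) % P


x, y = keygen()
r, t = commit()
c = challenge()
s = respond(x, r, c)
print(verify(y, t, c, s))
```

The verifier sees only y, t, c, and s. Because the fresh random nonce r masks the secret in s, the transcript shows that the prover knows x without disclosing it. In this toy group a cheater could still guess the challenge with probability 1/11 per round, which is why real parameters are vastly larger.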

Decentralized human identity systems have seen large-scale implementation attempts before, but ultimately failed. Idena was among the first blockchain projects promoting "one person, one identity." Within just two years of launch, 40% of network accounts and 48% of rewards were controlled by 23 institutions. Account operation teams in places like India and Russia hired ordinary people to lend their identities for less than a dollar per hour, profiting up to 55 times. Researchers also found that even children's identities were used as puppet accounts.

Vitalik had anticipated such risks earlier. He stated that for human identity verification systems, the lowest-cost attack method is not deepfakes or advanced hacking, but paying people in low-income regions to lend their personal identities. Any human identity verification system requires financial support: iris-scanning devices and on-chain verification nodes need continuous investment.

Yet once identity credentials gain economic value, a black market for identity lending inevitably emerges. In a world of stark wealth inequality, the capital-strong will always control such markets.

"Forcing a one-person-one-vote rule in a system with actual economic incentives will only repeat the failures of 20th-century social experiments."

Objectively, both development paths have clear flaws. Centralized solutions can achieve scale, but they leave users' biometric data in the hands of companies that are prone to over-collection and that themselves benefit from the current bot proliferation. The cryptographic route theoretically protects privacy but struggles to escape real-world economic imbalances, and ends up being exploited by gray-market industries.

If forced to choose, I'd still bet on the cryptographic solution, because behavioral biometrics and centralized iris scanning permanently record your bodily information, and ownership of that information belongs to whoever deploys the system. Once they have your data, you cannot delete or transfer it; it is locked with the company that collected it.

Even knowing that zero-knowledge proofs might be exploited, they are still worth developing, because such a proof can confirm you are human without revealing anything more. Abandoning this path, by contrast, means that in the future every website we visit will retain our physical behavioral data. At present, the centralized, surveillance-based solution is being implemented far faster than the cryptographic route.

Related Questions

Q: According to the article, what is the fundamental reason why many websites are banning AI crawlers?

A: The fundamental reason is that AI disrupts the core business model of the internet. AI traffic generates zero advertising revenue for websites while incurring server costs, and AI search summaries divert human traffic away from the original content sources, leaving websites with no financial return for their content.

Q: What technology is the industry shifting towards to distinguish humans from AI, and what does it measure?

A: The industry is shifting towards behavioral biometrics. It measures unique, subconscious human physical behaviors during device interaction, such as cursor movement speed/patterns, typing rhythm/errors, scrolling style, key pressure, phone tilt, and even gait. These are difficult for AI to perfectly replicate.

Q: What are the two main approaches to human verification discussed in the article, and what are their key challenges?

A: 1. Centralized biometric systems (e.g., Worldcoin's iris scanning): The key challenge is user privacy and centralized control of sensitive, immutable biological data by corporations or governments. 2. Cryptographic zero-knowledge proof systems: The key challenge is economic attacks, where people in low-income regions can be paid to rent out their verified identities, undermining the 'one-person-one-identity' principle.

Q: How does the article describe the impact of AI search summaries on website traffic?

A: The impact is severe. Google's AI overview feature has led to only 8% of users clicking through to the original websites, resulting in a 33% drop in referral traffic from Google to content sites. This creates a 'traffic interception' problem where AI provides answers before users visit the source.

Q: What example does the article give to illustrate the cognitive difference between humans and AI that can be used for verification?

A: It cites the Stroop effect. When a word like 'blue' is written in green ink, a human's brain experiences conflict, slowing their reaction time and affecting their typing rhythm. An AI, which processes text and color separately, shows no such delay. This cognitive dissonance manifests in typing behavior and can be used for passive verification.

