Thousands of people around the world are selling their identities to train AI, but at what cost?

marsbit | Published on 2026-03-23 | Last updated on 2026-03-23

Abstract

A global investigation reveals a growing gray market where thousands of people worldwide are selling their biometric data—voices, faces, call logs, and daily videos—to train AI models for small payments. Examples include individuals in South Africa, India, and the U.S. earning modest sums through apps like Kled AI, Silencio, and Neon Mobile. While this provides crucial income, especially in economically strained regions, it raises serious concerns about privacy, exploitation, and long-term risks. Contributors often grant broad, irreversible rights to their data, potentially exposing them to deepfakes, identity theft, and unauthorized commercial use. Experts warn that this practice is unstable, offers no career progression, and primarily benefits tech companies in wealthier nations, leaving workers vulnerable with little recourse. Cases like an actor discovering his AI likeness promoting medical products without consent highlight the ethical and personal consequences of this emerging data-for-cash economy.

Author: The Guardian

Compiled by: Deep Tide TechFlow

Deep Tide Introduction: This investigative report reveals a rapidly growing gray industry: thousands of people worldwide are earning money by selling their voices, faces, call records, and daily videos for AI training.

This is not a general discussion about privacy controversies, but an investigation with real people, real amounts of money, and real consequences—an actor who sold his face later saw "himself" promoting an unknown medical product on Instagram, with people in the comments evaluating his "looks."

When the data hunger of AI companies meets global economic disparity, the result is an unequal transaction.

Full text as follows:

One morning last year, Jacobus Louw, who lives in Cape Town, South Africa, went out for his usual walk, feeding seagulls along the way. But this time he recorded a few videos, filming his footsteps and his view as he walked along the sidewalk. The footage earned him $14, about 10 times the country's minimum hourly wage and equivalent to half a week's food expenses for the 27-year-old.

This was a "city navigation" task Louw completed on Kled AI. Kled AI is an app that pays users to upload photos, videos, and other data for training AI models. In just a few weeks, Louw earned $50 by uploading photos and videos from his daily life.

Thousands of miles away, in Ranchi, India, 22-year-old student Sahil Tigga regularly earns money through Silencio, an app that crowdsources audio data for AI training by accessing his phone's microphone to capture ambient noise in places such as restaurants and busy intersections. He also uploads recordings of his own voice. Tigga makes special trips to locations not yet recorded on Silencio's map, such as hotel lobbies. He earns over $100 per month from this, enough to cover all his food expenses.

In Chicago, 18-year-old welding apprentice Ramelio Hill sold his private phone conversations with friends and family to Neon Mobile—a conversational AI training platform that pays $0.50 per minute—earning a few hundred dollars. For Hill, the calculation was simple: he believes tech companies already have vast amounts of his private data, so he might as well get a share of the profits.

These "AI training gigs"—uploading surroundings, personal photos, videos, and audio—are at the forefront of a new global data gold rush. As Silicon Valley's hunger for high-quality human data exceeds what can be scraped from the open internet, a booming data market industry has emerged to bridge this gap. From Cape Town to Chicago, thousands of people are micro-licensing their biometric identities and private data to the next generation of AI.

But this new gig economy comes at a cost. Behind the few dollars earned, these trainers are fueling an industry that may ultimately render their skills obsolete, while exposing themselves to future risks of deepfakes, identity theft, and digital exploitation—risks they are only beginning to understand.

Keeping the AI Gears Turning

AI language models like ChatGPT and Gemini require massive amounts of material to continuously improve, but they are facing a data shortage. In the most commonly used training datasets (C4, RefinedWeb, and Dolma), about a quarter of the data from the highest-quality sources is now restricted from use by generative AI companies. Researchers estimate that AI companies could run out of available fresh, high-quality text as early as 2026. Although some labs have begun training models with synthetic data generated by AI itself, this recursive process leads to models outputting error-filled "garbage," eventually causing a collapse.

This is where apps like Kled AI and Silencio come in. In these data markets, millions of people are feeding and training AI by selling their identity data. Beyond Kled AI, Silencio, and Neon Mobile, AI trainers have many choices: Luel AI, backed by the well-known incubator Y Combinator, acquires multilingual conversation material at about $0.15 per minute; ElevenLabs lets users digitally clone their voice and make it available for others to use at a base rate of $0.02 per minute.

Bouke Klein Teeselink, an economics professor at King's College London, says AI training gigs are an emerging work category that will grow significantly.

AI companies know that paying people for data licensing helps avoid copyright disputes that might arise from relying entirely on web-scraped content, Teeselink says. AI researcher Veniamin Veselovsky adds that these companies also need high-quality data to model new, improved behaviors for their systems. "For now, human data is the gold standard for sampling outside the model's distribution," he said.

The humans driving these machines—especially those in developing countries—often need the money and have few alternatives. For many AI training gig workers, taking on this work is a pragmatic response to economic disparity. In countries with high unemployment rates and depreciating local currencies, earning dollars is often more stable and lucrative than local jobs. Some struggle to find entry-level work and are forced into AI training out of necessity. Even in wealthier countries, rising living costs make selling oneself a logical financial choice.

Cape Town-based AI trainer Louw is well aware of the privacy cost. Although the income is unstable and not enough to cover all his monthly expenses, he is willing to accept these conditions to earn money. He has suffered from neurological diseases for years, making it difficult to find work, but the money he earned from AI data markets (including Kled AI) allowed him to save $500 to enroll in a spa training course to become a massage therapist.

"As a South African, receiving dollars is more valuable than people think," Louw said.

Mark Graham, a professor of internet geography at Oxford University and author of "Feeding the Machine," acknowledges that for individuals in developing countries, the money may have practical significance in the short term, but he warns, "Structurally, this work is unstable, has no upward mobility, and is essentially a dead end."

Graham added that AI data markets rely on "a race to the bottom in wages" and a "temporary demand for human data." Once that demand shifts, "workers will have no security, no transferable skills, and no safety net."

Graham said the only winners are "the platforms in the Global North, which capture all the lasting value."

Full Authorization

Chicago-based AI trainer Hill has mixed feelings about selling his private phone calls to Neon Mobile. About 11 hours of call content earned him $200, but he said the app often goes offline and delays payments. "Neon always seemed suspicious to me, but I kept using it just to earn some extra pocket money to pay bills," Hill said.

Now he is reconsidering whether the money was really that easy. Last September, Neon Mobile went offline just weeks after launching, when TechCrunch discovered a security vulnerability that allowed anyone to access users' phone numbers, call recordings, and transcripts. Hill said Neon Mobile never notified him of the vulnerability, and he now worries that his voice could be misused online.

Jennifer King, a data privacy researcher at Stanford University's Institute for Human-Centered Artificial Intelligence, is concerned that AI data markets are not clear about how and where user data will be used. She added that without understanding their rights or being able to negotiate them, "consumers face the risk of their data being reused in ways they dislike, do not understand, or did not anticipate, with almost no recourse."

When AI trainers share data on Neon Mobile and Kled AI, they grant a full authorization (global, exclusive, irrevocable, transferable, and royalty-free) allowing the platform to sell, use, publicly display, and store their likeness, and even create derivative works based on it.

Kled AI founder Avi Patel said his company's data agreement limits use to AI training and research purposes. "The entire business model relies on user trust. If contributors think their data might be misused, the platform cannot function." He said the company vets buyers before selling datasets, avoiding cooperation with "suspicious-intent" organizations, such as the pornography industry, and "government agencies" they believe might use the data in ways that violate that trust.

Neon Mobile did not respond to requests for comment.

Enrico Bonadio, a law professor at City, University of London, pointed out that these agreement terms allow the platform and its clients to "do almost anything with that material, permanently, without additional payment, and contributors have no practical way to withdraw consent or renegotiate."

A more worrying risk is that trainers' data could be used to create deepfakes and to impersonate them. Although data markets claim to strip identifying information (such as names and locations) from data before sale, biometric patterns are inherently difficult to anonymize meaningfully, Bonadio added.

Seller's Remorse

Even if AI trainers could negotiate more detailed protections for how their data is used, they might still regret it. In 2024, New York-based actor Adam Coy sold his likeness for $1,000 to Captions—an AI video editing software now renamed Mirage. His agreement stipulated that his identity would not be used for any political purposes, not for promoting alcohol, tobacco, or pornography, and the license would last for one year.

Captions did not respond to requests for comment.

Soon after, Coy's friends began forwarding videos they had found online featuring his face and voice, which had garnered millions of views. In one Instagram video, Coy's AI replica claimed to be a "vaginal doctor" promoting unverified medical supplements for pregnant and postpartum women.

"It's embarrassing to explain this to others," Coy said.

"The comments were weird because they were evaluating my appearance, but it wasn't even me," Coy added. "My thinking when I made the decision (to sell my likeness) was that most models would scrape data and portraits from the internet anyway, so I might as well get paid."

Coy said he has not taken any AI data gigs since. He said he would only consider doing it again if a company offered significant compensation.

Related Questions

Q: What is the main concern raised in the article about people selling their personal data for AI training?

A: The article highlights that while individuals earn money by selling their biometric identities and private data (like voice, face, and conversations) to train AI models, they face risks such as deepfakes, identity theft, and digital exploitation, often without fully understanding the long-term consequences or having recourse if their data is misused.

Q: Why are AI companies turning to data markets like Kled AI and Silencio for training data?

A: AI companies are facing a shortage of high-quality training data from the open internet due to restrictions on datasets and potential copyright issues. Data markets provide a way to acquire fresh, human-generated data directly from individuals, which is considered the 'gold standard' for training AI models to improve their behavior and avoid using synthetic data that can lead to model degradation.

Q: How do economic disparities play a role in the growth of AI training gigs?

A: Economic disparities drive the growth of AI training gigs because people in developing countries, or those facing high unemployment and currency devaluation, can earn valuable dollars from these platforms. For many, it is a pragmatic response to financial need, offering income that is more stable and lucrative than local jobs, even though the work is unstable and lacks long-term career prospects.

Q: What are some specific risks mentioned regarding the authorization agreements signed by AI trainers?

A: The authorization agreements often grant platforms global, exclusive, irrevocable, transferable, and royalty-free rights to use, sell, display, and create derivative works from the contributors' data. This means trainers have little control over how their data is used long-term, and risks include data being used for deepfakes, identity impersonation, or in ways they did not anticipate, with no practical way to withdraw consent or renegotiate.

Q: Can you provide an example from the article where an AI trainer regretted selling their data?

A: Actor Adam Coy from New York sold his likeness for $1,000 to Captions (now Mirage) with specific restrictions, but later found his AI replica used in Instagram videos promoting unverified medical supplements, with millions of views. He felt embarrassed and noted that the comments were evaluating his appearance based on the AI clone, which was not actually him. He stated he would only consider such gigs again for significant payment due to the negative experience.
