Thousands of people around the world are selling their identities to train AI, but at what cost?

marsbit | Published 2026-03-23 | Updated 2026-03-23

Summary

A global investigation reveals a growing gray market where thousands of people worldwide are selling their biometric data—voices, faces, call logs, and daily videos—to train AI models for small payments. Examples include individuals in South Africa, India, and the U.S. earning modest sums through apps like Kled AI, Silencio, and Neon Mobile. While this provides crucial income, especially in economically strained regions, it raises serious concerns about privacy, exploitation, and long-term risks. Contributors often grant broad, irreversible rights to their data, potentially exposing them to deepfakes, identity theft, and unauthorized commercial use. Experts warn that this practice is unstable, offers no career progression, and primarily benefits tech companies in wealthier nations, leaving workers vulnerable with little recourse. Cases like an actor discovering his AI likeness promoting medical products without consent highlight the ethical and personal consequences of this emerging data-for-cash economy.

Author: The Guardian

Compiled by: Deep Tide TechFlow

Deep Tide Introduction: This investigative report reveals a rapidly growing gray industry: thousands of people worldwide are earning money for AI training by selling their voices, faces, call records, and daily videos.

This is not a general discussion about privacy controversies, but an investigation with real people, real amounts of money, and real consequences—an actor who sold his face later saw "himself" promoting an unknown medical product on Instagram, with people in the comments evaluating his "looks."

When the data hunger of AI companies combines with global economic disparities, it is creating an unequal transaction.

Full text as follows:

One morning last year, Jacobus Louw, who lives in Cape Town, South Africa, went out for his usual walk, feeding seagulls along the way. But this time he recorded a few videos—filming his footsteps and the view as he walked along the sidewalk. The footage earned him $14, about 10 times the country's minimum wage and equivalent to half a week's food expenses for the 27-year-old.

This was a "city navigation" task Louw completed on Kled AI. Kled AI is an app that pays users to upload photos, videos, and other data for training AI models. In just a few weeks, Louw earned $50 by uploading photos and videos from his daily life.

Thousands of miles away, in Ranchi, India, 22-year-old student Sahil Tigga regularly earns money through Silencio—an app that crowdsources audio data for AI training, accessing his phone's microphone to capture ambient noises like inside restaurants or busy intersections. He also uploads recordings of his own voice. Sahil makes special trips to unique locations, such as hotel lobbies not yet recorded on Silencio's map. He earns over $100 per month from this, enough to cover all his food expenses.

In Chicago, 18-year-old welding apprentice Ramelio Hill sold his private phone conversations with friends and family to Neon Mobile—a conversational AI training platform that pays $0.50 per minute—earning a few hundred dollars. For Hill, the calculation was simple: he believes tech companies already have vast amounts of his private data, so he might as well get a share of the profits.

These "AI training gigs"—uploading surroundings, personal photos, videos, and audio—are at the forefront of a new global data gold rush. As Silicon Valley's hunger for high-quality human data exceeds what can be scraped from the open internet, a booming data market industry has emerged to bridge this gap. From Cape Town to Chicago, thousands of people are micro-licensing their biometric identities and private data to the next generation of AI.

But this new gig economy comes at a cost. Behind the few dollars earned, these trainers are fueling an industry that may ultimately render their skills obsolete, while exposing themselves to future risks of deepfakes, identity theft, and digital exploitation—risks they are only beginning to understand.

Keeping the AI Gears Turning

AI language models such as those behind ChatGPT and Gemini require massive amounts of material to keep improving, but they are facing a data shortage. The websites behind the most commonly used training datasets—C4, RefinedWeb, and Dolma—which together account for a quarter of the highest-quality data on the web, are now restricting generative AI companies from using their content to train models. Researchers estimate that AI companies could run out of fresh, high-quality text as early as 2026. Although some labs have begun training models on synthetic data generated by AI itself, this recursive process leads models to output error-filled "garbage" and can eventually cause model collapse.

This is where apps like Kled AI and Silencio come in. In these data markets, millions of people feed and train AI by selling their identity data. Beyond Kled AI, Silencio, and Neon Mobile, AI trainers have many choices: Luel AI, backed by the well-known incubator Y Combinator, acquires multilingual conversation material at about $0.15 per minute; ElevenLabs lets you digitally clone your voice and make it available for others to use at a base rate of $0.02 per minute.
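To put the quoted rates in perspective, here is a minimal sketch converting the article's per-minute figures into hourly equivalents. The hourly framing and the `hourly_equivalent` helper are our own illustrative additions; in practice contributors rarely record continuously for a full hour, so these are ceilings, not typical earnings.

```python
# Per-minute payout rates quoted in the article (USD).
RATES_PER_MINUTE = {
    "Neon Mobile (call audio)": 0.50,
    "Luel AI (multilingual conversation)": 0.15,
    "ElevenLabs (cloned-voice base rate)": 0.02,
}

def hourly_equivalent(rate_per_minute: float) -> float:
    """Convert a per-minute payout rate into its hourly equivalent."""
    return rate_per_minute * 60

for platform, rate in RATES_PER_MINUTE.items():
    print(f"{platform}: ${hourly_equivalent(rate):.2f}/hour")
```

Even at the top quoted rate, an uninterrupted hour of recording yields tens of dollars at most, which is why the article's subjects treat this as supplemental income rather than a job.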

Bouke Klein Teeselink, an economics professor at King's College London, says AI training gigs are an emerging work category that will grow significantly.

AI companies know that paying people for data licensing helps avoid copyright disputes that might arise from relying entirely on web-scraped content, Teeselink says. AI researcher Veniamin Veselovsky adds that these companies also need high-quality data to model new, improved behaviors for their systems. "For now, human data is the gold standard for sampling outside the model's distribution," Veselovsky added.

The humans driving these machines—especially those in developing countries—often need the money and have few alternatives. For many AI training gig workers, taking on this work is a pragmatic response to economic disparity. In countries with high unemployment rates and depreciating local currencies, earning dollars is often more stable and lucrative than local jobs. Some struggle to find entry-level work and are pushed into AI training out of necessity. Even in wealthier countries, rising living costs make selling one's data a logical financial choice.

Cape Town-based AI trainer Louw is well aware of the privacy cost. Although the income is unstable and not enough to cover all his monthly expenses, he is willing to accept these conditions to earn money. He has suffered from neurological diseases for years, making it difficult to find work, but the money he earned from AI data markets (including Kled AI) allowed him to save $500 to enroll in a spa training course to become a massage therapist.

"As a South African, receiving dollars is more valuable than people think," Louw said.

Mark Graham, a professor of internet geography at Oxford University and author of "Feeding the Machine," acknowledges that for individuals in developing countries, the money may have practical significance in the short term, but he warns, "Structurally, this work is unstable, has no upward mobility, and is essentially a dead end."

Graham added that AI data markets rely on "a race to the bottom in wages" and a "temporary demand for human data." Once that demand shifts, "workers will have no security, no transferable skills, and no safety net."

Graham said the only winners are "the platforms in the Global North, which capture all the lasting value."

Full Authorization

Chicago-based AI trainer Hill has mixed feelings about selling his private phone calls to Neon Mobile. About 11 hours of call content earned him $200, but he said the app often goes offline and delays payments. "Neon always seemed suspicious to me, but I kept using it just to earn some extra pocket money to pay bills," Hill said.

Now he is reconsidering whether the money was really that easy. Last September, Neon Mobile went offline just weeks after launching, after TechCrunch discovered a security vulnerability that allowed anyone to access users' phone numbers, call recordings, and transcripts. Hill said Neon Mobile never notified him of this situation, and now he is worried his voice could be misused online.

Jennifer King, a data privacy researcher at Stanford University's Institute for Human-Centered Artificial Intelligence, is concerned that AI data markets are not clear about how and where user data will be used. She added that without understanding their rights or being able to negotiate them, "consumers face the risk of their data being reused in ways they dislike, do not understand, or did not anticipate, with almost no recourse."

When AI trainers share data on Neon Mobile and Kled AI, they grant a full authorization (global, exclusive, irrevocable, transferable, and royalty-free) allowing the platform to sell, use, publicly display, and store their likeness, and even create derivative works based on it.

Kled AI founder Avi Patel said his company's data agreement limits use to AI training and research purposes. "The entire business model relies on user trust. If contributors think their data might be misused, the platform cannot function." He said the company vets buyers before selling datasets, avoiding cooperation with "suspicious-intent" organizations, such as the pornography industry, and "government agencies" they believe might use the data in ways that violate that trust.

Neon Mobile did not respond to requests for comment.

Enrico Bonadio, a law professor at City, University of London, pointed out that these agreement terms allow the platform and its clients to "do almost anything with that material, permanently, without additional payment, and contributors have no practical way to withdraw consent or renegotiate."

More worrying risks include: trainers' data being used to create deepfakes and identity impersonation. Although data markets claim to strip identifying information (such as names and locations) from data before sale, biometric patterns are inherently difficult to anonymize meaningfully, Bonadio added.

Seller's Remorse

Even if AI trainers could negotiate more detailed protections for how their data is used, they might still regret it. In 2024, New York-based actor Adam Coy sold his likeness for $1,000 to Captions—an AI video editing software now renamed Mirage. His agreement stipulated that his identity would not be used for any political purposes, not for promoting alcohol, tobacco, or pornography, and the license would last for one year.

Captions did not respond to requests for comment.

Soon after, Coy's friends began forwarding videos they found online featuring his face and voice, which had garnered millions of views. In one Instagram video, his AI replica claimed to be a "vaginal doctor" promoting unverified medical supplements for pregnant and postpartum women.

"It's embarrassing to explain this to others," Coy said.

"The comments were weird because they were evaluating my appearance, but it wasn't even me," Coy added. "My thinking when I made the decision (to sell my likeness) was that most models would scrape data and portraits from the internet anyway, so I might as well get paid."

Coy said he has not taken any AI data gigs since. He said he would only consider doing it again if a company offered significant compensation.

