The Small-Town Youth Labeling AI Giants

marsbit | Published on 2026-04-07 | Last updated on 2026-04-07

Abstract

In China's hinterland cities like Datong, Shanxi, thousands of young people work as data annotators, the invisible workforce behind AI development. They perform repetitive tasks like drawing bounding boxes on images or rating AI-generated responses, earning piece-rate wages as low as a few cents per task. These workers, mostly from rural areas or small towns, endure intense labor conditions: strict monitoring, near-zero tolerance for errors, and mental exhaustion. Despite the cognitive nature of their work, they are paid meagerly, with some earning as little as ¥30 ($4) for a day’s work. As the AI industry evolves, even highly educated workers, including master’s graduates, are being drawn into similarly precarious freelance roles, evaluating complex AI outputs under vague and shifting standards. The industry is structured through layers of outsourcing, so most profits flow to tech giants like OpenAI and Microsoft while annotators see dwindling incomes. Worse, as AI models become more self-sufficient, the demand for human annotators is declining. Companies like Li Auto have slashed annotation costs by using AI-powered tools that complete in hours what used to take humans a year. These annotators, who helped train the very systems now replacing them, face an uncertain future, a stark contrast to the booming valuations and optimistic narratives of the global AI industry. No one seems to see a problem with any of this.

Datong, Shanxi—once a city propped up by coal—has shaken off its dust and now wields a sharp pickaxe, striking down upon another invisible mine.

In the office buildings of Jinmao International Center in Pingcheng District, there are no mine hoists or coal trucks. Instead, thousands of tightly packed computer workstations fill several floors. The Shanghai Runxun Yunzhong Shengu Big Data Smart Service Base occupies entire levels, where thousands of young employees, wearing headphones, stare at screens, clicking, dragging, and drawing boxes.

According to official data, as of November 2025, Datong had put into operation 745,000 servers, attracted 69 call center and data labeling companies, created over 30,000 local jobs, and generated 750 million yuan in output value. In this digital mine, 94% of the workers are local residents.

It’s not just Datong. Among the first batch of data labeling bases designated by the National Data Administration, counties in central and western China like Yonghe in Shanxi, Bijie in Guizhou, and Mengzi in Yunnan are prominently listed. In Yonghe County’s data labeling base, 80% of the employees are women, mostly rural stay-at-home moms or returning youth who couldn’t find suitable jobs.

A hundred years ago, Manchester’s textile mills were filled with landless farmers. Today, the computer screens in these remote counties are manned by young people who found no place in the real economy.

They are engaged in a job that feels both futuristic and primitive—piecework—producing the essential data feed for AI giants in Beijing, Shenzhen, and Silicon Valley.

No one sees anything wrong with this.

The New Assembly Line on the Loess Plateau

At its core, data labeling is about teaching machines to recognize the world.

Self-driving cars need to identify traffic lights and pedestrians; large models need to distinguish cats from dogs. Machines have no innate common sense; humans must first draw boxes on images, telling them “this is a pedestrian,” so that after digesting millions of pictures, they can learn to recognize on their own.

This job doesn’t require advanced degrees—just patience and an index finger that can click incessantly.

In the golden year of 2017, a simple 2D box could fetch over ten cents, with some companies even offering fifty cents. Fast labelers, working over ten hours a day, could earn five to six hundred yuan. In a small town, this was undoubtedly a high-paying, respectable job.

But as large models evolved, the brutal side of this assembly line began to show.

By 2023, the price for simple image labeling had plummeted to 3-4 cents, a drop of over 90% from the fifty-cent peak. More complex work fares little better. In 3D point cloud images, dense point matrices that require extreme zoom to discern edges, labelers must draw a 3D box in space, specifying length, width, height, and yaw angle, to wrap tightly around vehicles or pedestrians. Yet such an intricate 3D box earns only five cents.
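To make the geometry concrete, here is a minimal Python sketch of the kind of 3D box a labeler draws. The `Box3D` class, its field names, and the `contains` and `points_inside` helpers are illustrative assumptions, not the format of any real labeling tool; the yaw angle is assumed to be a rotation about the vertical axis.

```python
import math
from dataclasses import dataclass


@dataclass
class Box3D:
    """Hypothetical 3D label: a center, three extents, and a yaw angle in radians."""
    cx: float
    cy: float
    cz: float
    length: float  # extent along the box's own forward axis
    width: float   # extent along the box's own sideways axis
    height: float  # vertical extent
    yaw: float     # rotation about the vertical (z) axis

    def contains(self, x: float, y: float, z: float) -> bool:
        """Return True if the point lies inside the rotated box."""
        dx, dy, dz = x - self.cx, y - self.cy, z - self.cz
        # Rotate the offset into the box's own frame (undo the yaw).
        local_x = dx * math.cos(self.yaw) + dy * math.sin(self.yaw)
        local_y = -dx * math.sin(self.yaw) + dy * math.cos(self.yaw)
        return (
            abs(local_x) <= self.length / 2
            and abs(local_y) <= self.width / 2
            and abs(dz) <= self.height / 2
        )


def points_inside(box: Box3D, points: list[tuple[float, float, float]]) -> int:
    """Count how many point-cloud points fall inside the box."""
    return sum(1 for p in points if box.contains(*p))
```

In this sketch, a box with the wrong yaw leaves some of the object's points outside it, which is one way a label can fail a tightness check.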

The direct consequence of the unit price crash is a dramatic increase in labor intensity. To cling to a base salary of two to three thousand yuan a month, labelers must constantly, relentlessly, increase their speed.

This is no easy white-collar job. In many labeling bases, management is stiflingly strict: no phone calls allowed during work hours, phones must be locked in storage compartments. The system meticulously tracks each employee’s mouse movements and idle time. If you stop for more than three minutes, a warning from the backend lashes out like a whip.

Even more crushing is the accuracy requirement. The industry’s passing threshold is usually above 95%, with some companies demanding 98%-99%. Under a 99% requirement, if you draw 100 boxes and just 2 are wrong, the entire image is sent back for rework.
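The threshold arithmetic can be sketched in a few lines of Python. The function name and the rule that a score exactly at the threshold passes are assumptions for illustration; actual quality-check rules vary by company and project.

```python
def passes_quality_check(total_boxes: int, wrong_boxes: int, threshold_percent: int) -> bool:
    """Return True if the share of correct boxes meets the required percentage."""
    if total_boxes <= 0:
        raise ValueError("total_boxes must be positive")
    correct = total_boxes - wrong_boxes
    # Integer comparison avoids floating-point rounding at the boundary.
    return correct * 100 >= threshold_percent * total_boxes
```

With 100 boxes and 2 mistakes, `passes_quality_check(100, 2, 95)` holds, while `passes_quality_check(100, 2, 99)` does not, so the image would go back for rework under the strictest requirement.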

Video frames are linked: when vehicles change lanes and get occluded, labelers must infer their positions and find them one by one. In 3D point cloud images, any object with over 10 points must be boxed. In a complex parking space project, lines are drawn too long or labels are missed, and quality checks always find flaws. An image being sent back four or five times is commonplace. In the end, after an hour’s work, the pay might be just a few dimes.

A labeler from Hunan posted her settlement slip on social media: after a day’s work, she drew over 700 boxes at 4 cents each, totaling 30.2 yuan.
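The piece-rate arithmetic behind that slip can be checked with a short Python sketch. The function names are illustrative, and the figure of 755 boxes is inferred from the stated total and rate, not reported in the post, which says only "over 700".

```python
from decimal import Decimal, ROUND_CEILING


def daily_pay(boxes: int, rate_yuan: Decimal) -> Decimal:
    """Pay for a day's accepted boxes at a flat per-box rate."""
    return boxes * rate_yuan


def boxes_needed(target_yuan: Decimal, rate_yuan: Decimal) -> int:
    """Smallest number of accepted boxes that reaches the target pay."""
    return int((target_yuan / rate_yuan).to_integral_value(rounding=ROUND_CEILING))


rate = Decimal("0.04")                          # 4 cents per box, from the settlement slip
boxes = boxes_needed(Decimal("30.2"), rate)     # inferred box count, not stated in the source
pay = daily_pay(boxes, rate)
```

`Decimal` is used so that amounts in yuan are not distorted by binary floating-point rounding.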

It’s a profoundly split reality.

On one side, glamorous tech giants at press conferences talk about how AGI will liberate humanity; on the other, young people in counties on the Loess Plateau and southwestern mountains stare at screens for eight to ten hours a day, mechanically drawing boxes—thousands, tens of thousands—so many that at night, their fingers twitch in the air, tracing lane lines in their dreams.

Someone once said, the exterior of artificial intelligence is a luxury car speeding by, but if you open the door, you’ll find a hundred people inside, pedaling bicycles furiously, gritting their teeth.

No one sees anything wrong with this.

Piecework Labor Teaching Machines "How to Love"

As the bottlenecks of image recognition were broken, large models evolved deeper, needing to learn to think, converse, and even show "empathy" like humans.

This gave rise to the most central, and most expensive, part of large model training: RLHF (Reinforcement Learning from Human Feedback).

Simply put, it involves real people scoring AI-generated responses, telling it which answer is better, more aligned with human values and emotional preferences.

ChatGPT seems "human-like" precisely because countless RLHF labelers are teaching it.

On crowdsourcing platforms, such labeling tasks are often priced clearly: 3 to 7 yuan per task. Labelers must assign highly subjective emotional scores to AI responses, judging whether an answer is "warm," "empathetic," or "attentive to the user's emotions."

A low-wage worker, struggling in the mire of reality, with no time to tend to their own emotions, must now serve as the AI's emotional tutor and values judge within the system.

They must break down complex, subtle human emotions like warmth and empathy, forcibly quantifying them into cold scores of 1 to 5. If their scores don’t match the system’s preset standard answers, they are marked as below the accuracy standard, and money is deducted from their already meager piece-rate pay.
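A minimal Python sketch of how such a check might work is shown here. It assumes the platform counts a score as correct only when it exactly matches the preset answer; the function name and the exact-match rule are hypothetical, since the article does not describe the real scoring logic.

```python
def agreement_rate(labeler_scores: list[int], reference_scores: list[int]) -> float:
    """Fraction of a labeler's 1-to-5 scores that exactly match the preset answers."""
    if len(labeler_scores) != len(reference_scores):
        raise ValueError("score lists must have the same length")
    if not labeler_scores:
        raise ValueError("at least one score is required")
    for score in labeler_scores + reference_scores:
        if not 1 <= score <= 5:
            raise ValueError("scores must be between 1 and 5")
    matches = sum(1 for mine, ref in zip(labeler_scores, reference_scores) if mine == ref)
    return matches / len(labeler_scores)
```

Under this assumed rule, a labeler who rates a response 3 for empathy where the preset answer says 1 loses accuracy regardless of how defensible the judgment was.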

This is a cognitive hollowing-out. Intricate, profound human emotions, morality, and compassion are being dragged into the algorithm's funnel. In the icy quantification and standardization, they are drained of their last warmth. While you marvel at the cyber behemoth on screen learning to write poetry, compose music, offer comfort, and even don a sentimental skin, outside the screen those once-vibrant humans, through daily mechanical judgments, are regressing into emotionless scoring machines.

This is the most hidden side of the entire industry chain, never appearing in any funding news or technical white papers.

No one sees anything wrong with this.

The Master's Graduate and the Small-Town Youth

As low-level box-drawing work is crushed under AI’s treads, this cyber assembly line is expanding upward, beginning to devour higher-level intellectual labor.

The appetite of large models has changed. They are no longer satisfied with chewing simple common sense; they need to devour human expertise and advanced logic.

Major recruitment platforms frequently flash special part-time jobs, such as "Large Model Logical Reasoning Labeling" or "AI Humanities Trainer." These roles have extremely high barriers, often requiring a "Master's degree or above from 985/211 universities" and involving specialized fields like law, medicine, philosophy, and literature.

Many top-university graduates are attracted, flooding into these outsourcing groups for big tech companies. But they soon discover this is no light mental exercise; it’s mental torture.

Before officially taking tasks, they must read dozens of pages of scoring dimensions and evaluation criteria and undergo two or three rounds of trial labeling. After they meet the standard, if their accuracy during formal labeling falls below the average, they lose eligibility and are kicked out of the group.

The most suffocating part is that these standards are not fixed. Facing similar questions and answers, using the same reasoning to score, the results can be completely opposite. It’s like taking an endless exam with no standard answer. There’s no way to improve accuracy through self-effort or study; you can only spin in place, consuming mental and physical energy.

This is the new exploitation of the large model era—class folding.

Knowledge, once seen as a golden ladder to break barriers and climb upward, has now been reduced to more complex digital fodder, chewed up for the algorithm. Before the absolute power of algorithms and systems, the 985 master’s graduate in the ivory tower and the small-town youth on the Loess Plateau have reached the most bizarre convergence.

They fall together into this bottomless cyber mine pit, stripped of their halos, their differences flattened, all reduced to cheap, replaceable cogs on the conveyor belt.

It’s the same abroad. In 2024, Apple directly cut an AI voice labeling team of 121 people in San Diego. These employees were responsible for improving Siri’s multilingual processing. They once thought they were on the edge of the core business of a major company, but instantly fell into the abyss of unemployment.

In the eyes of tech giants, whether it’s the auntie drawing boxes in a county town or the logic trainer who graduated from a prestigious school, they are essentially disposable "consumables."

No one sees anything wrong with this.

The Trillion-Dollar Babel, Built with Pennies of Sweat

According to data released by the China Academy of Information and Communications Technology, China’s data labeling market reached 6.08 billion yuan in 2023, with projections of 20-30 billion yuan by 2025. It is predicted that by 2030, global data labeling and service market sales will soar to 117.1 billion yuan.

Behind these numbers lies the valuation frenzy of OpenAI, Microsoft, ByteDance, and other tech giants, with valuations reaching trillions of dollars.

But this immense wealth does not flow to those who truly "feed" the AI.

China’s data labeling industry exhibits a typical inverted pyramid outsourcing structure. At the very top are the tech giants, firmly grasping the core algorithms. The second tier consists of large data service suppliers. The third tier is made up of data labeling bases and small-to-medium outsourcing companies scattered across the country. At the very bottom are the piece-rate labelers, the "muddy-legged" workers.

Each layer of outsourcing skims off a hefty portion. When the big company pays 50 cents per unit, after layers of exploitation, what reaches the county labeler might be less than 5 cents.
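The compounding effect of layered skimming can be illustrated with a short Python sketch. The individual cut rates are hypothetical, since the article gives only the starting price (50 cents) and the approximate end price (under 5 cents); the two intermediary layers correspond to the second and third tiers described above.

```python
def payout_after_layers(unit_price: float, cuts: list[float]) -> float:
    """Price left for the labeler after each intermediary keeps its fractional cut."""
    remaining = unit_price
    for cut in cuts:
        if not 0 <= cut < 1:
            raise ValueError("each cut must be a fraction in [0, 1)")
        remaining *= 1 - cut
    return remaining


def uniform_cut(start_price: float, end_price: float, layers: int) -> float:
    """Equal per-layer cut that would turn start_price into end_price."""
    return 1 - (end_price / start_price) ** (1 / layers)


# Hypothetical split: how large an equal cut at two layers takes 50 cents down to 5.
per_layer = uniform_cut(50, 5, 2)
```

Under the assumption of equal cuts, each of the two intermediaries would have to keep roughly 68% of what it receives for the labeler to end up with a tenth of the original price.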

Yanis Varoufakis, former Finance Minister of Greece, in his book "Technofeudalism," presents a penetrating view: today’s tech giants are no longer traditional capitalists but "Cloudalists."

They don’t own factories and machines but algorithms, platforms, and computing power: the digital territories of the cyber age. In this new feudal system, users are not consumers but digital serfs; our every like, comment, and browse on social media is data offered up for free to the cloud lords.

And the data labelers in lower-tier markets are the lowest digital serfs in this system. They not only produce data but also clean, classify, and score massive amounts of raw data, turning it into high-quality feed digestible by large models.

This is a hidden cognitive enclosure movement. Just as the 19th-century English enclosure movement drove farmers into textile mills, today’s AI wave drives youth who find no place in the real economy to screens.

AI has not leveled the class divide; instead, it has built a "data and sweat conveyor belt" stretching from counties in central and western China straight to the headquarters of tech giants in Beijing, Shanghai, Shenzhen, and Guangzhou. The narrative of technological revolution is always grand and splendid, but its underlying color is always the large-scale consumption of cheap labor.

No one sees anything wrong with this.

The Tomorrow That No Longer Needs Humans

The cruelest outcome is coming, faster and faster.

As large model capabilities leap, those labeling tasks that once required human day-and-night labor are being taken over by AI itself.

In April 2023, Li Xiang, founder of Li Auto, revealed on a forum that in the past, the company did about 10 million frames of manual autonomous driving image calibration per year, with outsourcing costs approaching 100 million yuan. But when they used large models for automated labeling, what used to take a year could be done in basically 3 hours.

The efficiency is 1000 times that of humans, and this was back in 2023. Just this past March, Li Auto also released its new-generation MindVLA-o1 automatic labeling engine.

A painfully true piece of industry self-mockery goes: "There is as much intelligence as there is manual labor." But now, big tech companies’ investment in data labeling outsourcing has seen a cliff-like drop of 40%-50%.

The small-town youth who sat countless days and nights before computers, straining their eyes red, have personally fed a giant beast. And now, this beast is turning around to smash their rice bowls.

Night falls, and the office buildings in Datong’s Pingcheng District remain starkly white. Young people changing shifts, weary shells of themselves, pass one another silently in the elevator. In this folded space imprisoned by countless polygonal boxes, no one cares about the epic leaps in the Transformer architecture across the ocean, nor can anyone understand the roar of computing power behind billions of parameters.

Their gaze is welded solely to the red and green progress bar in the backend representing the "qualifying line," calculating whether the piece-rate pennies and dimes can piece together a decent life by month’s end.

On one side, the Nasdaq bell rings and tech media coverage abounds, with giants raising glasses to celebrate the advent of AGI; on the other, these digital serfs, who fed the AI mouthful by mouthful with their own flesh and blood, can only wait in trepidation, in aching sleep, for the beast they personally raised to casually kick away their rice bowls on some ordinary morning.

No one sees anything wrong with this.
