The Small-Town Youth Labeling Data for AI Giants

marsbit · Published 2026-04-07 · Last updated 2026-04-07

Abstract

In hinterland Chinese cities like Datong, Shanxi, thousands of young people work as data annotators, the invisible workforce behind AI development. They perform repetitive tasks such as drawing bounding boxes on images or rating AI-generated responses, earning piece rates as low as a few cents per task. These workers, mostly from rural areas or small towns, endure intense labor conditions: strict monitoring, near-zero tolerance for errors, and mental exhaustion. Although the work is cognitive in nature, pay is meager, with some earning as little as ¥30 ($4) for a day’s work. As the AI industry evolves, even highly educated workers, including master’s graduates, are being drawn into similarly precarious freelance roles, evaluating complex AI outputs under vague and shifting standards. The industry is structured in layers of outsourcing, so most profits flow to tech giants like OpenAI and Microsoft while annotators see dwindling incomes. Worse, as AI models become more self-sufficient, demand for human annotators is declining. Companies like Li Auto have slashed annotation costs by using AI-powered tools that complete in hours what used to take humans a year. These annotators, who helped train the very systems now replacing them, face an uncertain future, a stark contrast to the booming valuations and optimistic narratives of the global AI industry. No one seems to see a problem with any of this.

Datong, Shanxi—once a city propped up by coal—has shaken off its dust and now wields a sharp pickaxe, striking down upon another invisible mine.

In the office buildings of Jinmao International Center in Pingcheng District, there are no mine shafts or coal trucks. Instead, thousands of tightly packed computer workstations fill several floors. The Shanghai Runxun Yunzhong Shengu Big Data Smart Service Base occupies entire levels, where thousands of young employees wearing headphones stare at screens, clicking, dragging, and drawing boxes.

According to official data, as of November 2025, Datong had put into operation 745,000 servers, attracted 69 call center and data labeling companies, created over 30,000 local jobs, and generated 750 million yuan in output value. In this digital mine, 94% of the workers are local residents.

It’s not just Datong. Among the first batch of data labeling bases designated by the National Data Administration, counties in central and western China like Yonghe in Shanxi, Bijie in Guizhou, and Mengzi in Yunnan are prominently listed. In Yonghe County’s data labeling base, 80% of the employees are women, mostly rural stay-at-home moms or returning youth who couldn’t find suitable jobs.

A hundred years ago, Manchester’s textile mills were filled with landless farmers. Today, the computer screens in these remote counties are manned by young people who found no place in the real economy.

They are engaged in a job that feels both futuristic and primitive—piecework—producing the essential data feed for AI giants in Beijing, Shenzhen, and Silicon Valley.

No one sees anything wrong with this.

The New Assembly Line on the Loess Plateau

At its core, data labeling is about teaching machines to recognize the world.

Self-driving cars need to identify traffic lights and pedestrians; large models need to distinguish cats from dogs. Machines have no innate common sense; humans must first draw boxes on images, telling them “this is a pedestrian,” so that after digesting millions of pictures, they can learn to recognize on their own.

This job doesn’t require advanced degrees—just patience and an index finger that can click incessantly.

In the golden year of 2017, a simple 2D box could fetch over ten cents, with some companies even offering fifty cents. Fast labelers, working over ten hours a day, could earn five to six hundred yuan. In a small town, this was undoubtedly a high-paying, respectable job.

But as large models evolved, the brutal side of this assembly line began to show.

By 2023, the price for simple image labeling had plummeted to 3-4 cents, a drop of over 90% from the fifty-cent peak. More complex work fares little better. In 3D point cloud images, dense point matrices that require extreme zoom to discern edges, labelers must draw a 3D box in space, specifying length, width, height, and yaw angle, so that it wraps tightly around a vehicle or pedestrian. Yet such an intricate 3D box earns only five cents.

The direct consequence of the unit price crash is a dramatic increase in labor intensity. To cling to a base salary of two to three thousand yuan a month, labelers must constantly, relentlessly, increase their speed.
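How steep the crash looks depends on which 2017 price is taken as the baseline. A minimal Python sketch of that arithmetic, using only the per-box prices quoted above (the function name `percent_drop` is illustrative and not taken from any labeling platform):

```python
def percent_drop(old_price: float, new_price: float) -> float:
    """Return the percentage decline from old_price to new_price."""
    return (old_price - new_price) / old_price * 100


# Per-box prices in yuan, as quoted in the article.
PRICE_2017_TYPICAL = 0.10   # "over ten cents"
PRICE_2017_PEAK = 0.50      # "some companies even offering fifty cents"
PRICE_2023_RANGE = (0.03, 0.04)

for label, old in [("typical", PRICE_2017_TYPICAL), ("peak", PRICE_2017_PEAK)]:
    smallest = percent_drop(old, max(PRICE_2023_RANGE))
    largest = percent_drop(old, min(PRICE_2023_RANGE))
    print(f"vs. 2017 {label} price: {smallest:.0f}% to {largest:.0f}% lower")
```

Against the fifty-cent peak the decline is 92% to 94%; against a ten-cent baseline it is closer to 60% to 70%.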

This is no easy white-collar job. In many labeling bases, management is stiflingly strict: no phone calls allowed during work hours, phones must be locked in storage compartments. The system meticulously tracks each employee’s mouse movements and idle time. If you stop for more than three minutes, a warning from the backend lashes out like a whip.

Even more crushing is the accuracy requirement. The industry’s passing threshold is usually above 95%, with some companies demanding 98%-99%. At the strictest end, if you draw 100 boxes and just 2 are wrong, the entire image is sent back for rework.
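The pass-or-rework rule described here can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and treating the threshold as inclusive (exactly 98% passes a 98% bar) is an assumption, since the article does not say how platforms handle the boundary.

```python
def passes_quality_check(total_boxes: int, wrong_boxes: int, threshold_percent: int) -> bool:
    """True if the share of correct boxes meets the threshold (inclusive, by assumption)."""
    correct = total_boxes - wrong_boxes
    # Integer comparison avoids floating-point rounding at the boundary.
    return correct * 100 >= threshold_percent * total_boxes


for threshold in (95, 98, 99):
    verdict = "pass" if passes_quality_check(100, 2, threshold) else "rework"
    print(f"{threshold}% threshold, 2 wrong out of 100: {verdict}")
```

Under these assumptions, two mistakes in a hundred boxes clear the 95% and 98% bars but fail the 99% one, so a single extra slip decides whether the whole image comes back.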

Video frames are linked in sequence: when vehicles change lanes and get occluded, labelers must infer their positions and find them one by one. In 3D point cloud images, any object with more than 10 points must be boxed. In a complex parking-space project, lines are drawn too long or labels are missed, and quality checks always find flaws. It is commonplace for an image to be sent back four or five times. In the end, an hour’s work might pay just a few dimes.
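To make the 3D task concrete, here is a minimal Python sketch of a box defined by center, length, width, height, and yaw, together with the "more than 10 points" rule mentioned above. The class and function names are hypothetical, and real annotation tools may define yaw or box axes differently; this version assumes yaw is measured counterclockwise from the x-axis.

```python
import math
from dataclasses import dataclass


@dataclass
class Box3D:
    """An upright 3D box: center, size, and rotation about the vertical axis."""
    cx: float
    cy: float
    cz: float
    length: float   # extent along the box's own forward axis
    width: float    # extent along the box's own sideways axis
    height: float   # vertical extent
    yaw: float      # heading in radians, counterclockwise from the x-axis (assumed)

    def contains(self, x: float, y: float, z: float) -> bool:
        dx, dy, dz = x - self.cx, y - self.cy, z - self.cz
        # Rotate the offset by -yaw to express it in the box's local frame.
        cos_y, sin_y = math.cos(self.yaw), math.sin(self.yaw)
        local_x = cos_y * dx + sin_y * dy
        local_y = -sin_y * dx + cos_y * dy
        return (abs(local_x) <= self.length / 2
                and abs(local_y) <= self.width / 2
                and abs(dz) <= self.height / 2)


def count_points_inside(box: Box3D, points) -> int:
    """Count the (x, y, z) points that fall inside the box."""
    return sum(1 for p in points if box.contains(*p))


def must_be_boxed(point_count: int, min_points: int = 10) -> bool:
    """The rule from the article: objects with more than 10 points get a box."""
    return point_count > min_points


# A car-sized box turned 90 degrees, so its long side runs along the y-axis.
car = Box3D(0.0, 0.0, 0.0, length=4.0, width=2.0, height=1.5, yaw=math.pi / 2)
sample_points = [(0.0, 1.5, 0.0), (1.5, 0.0, 0.0)]
print(count_points_inside(car, sample_points))
```

With the box rotated a quarter turn, only the first sample point lies inside, which is why a small error in yaw is enough for a tight box to miss points and fail review.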

A labeler from Hunan posted her settlement slip on social media: after a day’s work, she drew over 700 boxes at 4 cents each, totaling 30.2 yuan.
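The slip can be checked with simple arithmetic. The Python sketch below uses the two figures from the slip; the 8-to-10-hour shift length is an assumption borrowed from the working hours described elsewhere in this article, since the slip itself does not state hours.

```python
from decimal import Decimal

RATE_PER_BOX = Decimal("0.04")   # yuan per box, from the settlement slip
DAY_TOTAL = Decimal("30.2")      # yuan for the day, from the settlement slip

boxes_drawn = DAY_TOTAL / RATE_PER_BOX
print(f"boxes implied by the slip: {boxes_drawn}")

# Assumption: a shift of 8 to 10 hours; the slip does not state hours worked.
for hours in (8, 10):
    hourly = DAY_TOTAL / hours
    print(f"{hours}-hour day: {hourly:.2f} yuan per hour")
```

The slip implies 755 boxes, and under the assumed shift lengths that works out to between roughly 3 and 4 yuan per hour.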

It’s a profoundly split reality.

On one side, glamorous tech giants at press conferences talk about how AGI will liberate humanity; on the other, young people in counties on the Loess Plateau and southwestern mountains stare at screens for eight to ten hours a day, mechanically drawing boxes—thousands, tens of thousands—so many that at night, their fingers twitch in the air, tracing lane lines in their dreams.

Someone once said, the exterior of artificial intelligence is a luxury car speeding by, but if you open the door, you’ll find a hundred people inside, pedaling bicycles furiously, gritting their teeth.

No one sees anything wrong with this.

Piecework Labor Teaching Machines "How to Love"

As the bottlenecks of image recognition were broken, large models evolved deeper, needing to learn to think, converse, and even show "empathy" like humans.

This gave rise to the most central, and most expensive, part of large model training: RLHF (Reinforcement Learning from Human Feedback).

Simply put, it involves real people scoring AI-generated responses, telling it which answer is better, more aligned with human values and emotional preferences.

ChatGPT seems "human-like" precisely because countless RLHF labelers are teaching it.

On crowdsourcing platforms, such labeling tasks are often priced clearly: 3 to 7 yuan per task. Labelers must assign highly subjective emotional scores to AI responses, judging whether an answer is "warm," "empathetic," or "attentive to the user's emotions."

A low-wage worker, struggling in the mire of reality, with no time to tend to their own emotions, must now serve as the AI's emotional tutor and values judge within the system.

They must break down complex, subtle human emotions like warmth and empathy, forcibly quantifying them as cold scores from 1 to 5. If their scores don’t match the system’s preset standard answers, they are marked as falling below the accuracy standard, and money is deducted from their already meager piece-rate pay.
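A rough Python sketch of how such a check against preset answers could work is shown here. It is only an assumption that agreement means an exact match on the 1-to-5 score; the function name and the sample scores are made up for illustration, and no platform's actual rule is documented in this article.

```python
def agreement_rate(labeler_scores: list[int], reference_scores: list[int]) -> float:
    """Fraction of items where the labeler's 1-5 score equals the reference score."""
    if len(labeler_scores) != len(reference_scores):
        raise ValueError("score lists must have the same length")
    if not labeler_scores:
        raise ValueError("no scores to compare")
    matches = sum(1 for mine, ref in zip(labeler_scores, reference_scores) if mine == ref)
    return matches / len(labeler_scores)


# Made-up scores for five responses, for illustration only.
labeler = [4, 5, 3, 2, 4]
reference = [4, 4, 3, 2, 5]
rate = agreement_rate(labeler, reference)
print(f"agreement with preset answers: {rate:.0%}")
```

In this made-up case, being one point off on two of five responses, a difference most readers would call a matter of taste, leaves the labeler at 60% agreement.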

This is a cognitive hollowing-out. Intricate, profound human emotion, morality, and compassion are dragged into the algorithm's funnel and, through icy quantification and standardization, drained of their last warmth. While you marvel at the cyber behemoth on screen learning to write poetry, compose music, and offer comfort, even donning a sentimental skin, the once-vibrant humans outside the screen are regressing, through daily mechanical judgments, into emotionless scoring machines.

This is the most hidden side of the entire industry chain, never appearing in any funding news or technical white papers.

No one sees anything wrong with this.

The Master's Graduate and the Small-Town Youth

As low-level box-drawing work is crushed under AI’s treads, this cyber assembly line is expanding upward and beginning to devour higher-level intellectual labor.

The appetite of large models has changed. They are no longer satisfied with chewing on simple common sense; they need to devour human expertise and advanced logic.

Major recruitment platforms are frequently flashing special part-time jobs, such as "Large Model Logical Reasoning Labeling" or "AI Humanities Trainer." These roles have extremely high barriers, often requiring a "Master's degree or above from 985/211 universities" in specialized fields like law, medicine, philosophy, and literature.

Many top-university graduates are attracted and flood into these outsourcing groups for big tech companies. But they soon discover this is no light mental exercise. It is mental torture.

Before officially taking tasks, they must read dozens of pages of scoring dimensions and evaluation criteria and go through two or three rounds of trial labeling. Even after meeting the standard, if their accuracy during formal labeling falls below the average, they lose eligibility and are kicked out of the group.

The most suffocating part is that these standards are not fixed. Facing similar questions and answers, using the same reasoning to score, the results can be completely opposite. It’s like taking an endless exam with no standard answer. There’s no way to improve accuracy through self-effort or study; you can only spin in place, consuming mental and physical energy.

This is the new exploitation of the large model era—class folding.

Knowledge, once seen as a golden ladder for breaking barriers and climbing upward, has been reduced to more complex digital fodder, chewed up for the algorithm. Before the absolute power of algorithms and systems, the 985 master’s graduate in the ivory tower and the small-town youth on the Loess Plateau have reached the most bizarre convergence.

They fall together into this bottomless cyber mine pit, stripped of their halos, their differences flattened, all reduced to cheap, replaceable cogs on the conveyor belt.

It’s the same abroad. In 2024, Apple directly cut an AI voice labeling team of 121 people in San Diego. These employees were responsible for improving Siri’s multilingual processing. They once thought they were on the edge of the core business of a major company, but instantly fell into the abyss of unemployment.

In the eyes of tech giants, whether it’s the auntie drawing boxes in a county town or the logic trainer who graduated from a prestigious school, they are essentially disposable "consumables."

No one sees anything wrong with this.

The Trillion-Dollar Babel, Built with Pennies of Sweat

According to data released by the China Academy of Information and Communications Technology, China’s data labeling market reached 6.08 billion yuan in 2023, with projections of 20-30 billion yuan by 2025. It is predicted that by 2030, global data labeling and service market sales will soar to 117.1 billion yuan.

Behind these numbers lies a valuation frenzy: the extreme valuations of OpenAI, Microsoft, ByteDance, and other tech giants, reaching trillions of dollars.

But this immense wealth does not flow to those who truly "feed" the AI.

China’s data labeling industry exhibits a typical inverted pyramid outsourcing structure. At the very top are the tech giants, which keep a firm grip on the core algorithms. The second tier consists of large data service suppliers. The third tier is made up of data labeling bases and small-to-medium outsourcing companies scattered across the country. At the very bottom are the piece-rate labelers, the "muddy-legged" workers.

Each layer of outsourcing skims off a hefty portion. When the big company pays 50 cents per unit, after layer upon layer of skimming, what reaches the county labeler might be less than 5 cents.
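To get a feel for what that implies per layer, here is a back-of-the-envelope Python sketch. It assumes three hand-offs (one between each of the four tiers) and, purely for illustration, that every layer passes on the same fraction; the article gives only the top and bottom prices, so the per-layer split is hypothetical.

```python
def payout_after_layers(top_price: float, keep_fractions: list[float]) -> float:
    """Price left after each hand-off passes on only a fraction of what it receives."""
    price = top_price
    for keep in keep_fractions:
        price *= keep
    return price


TOP_PRICE = 0.50      # yuan per unit paid by the big company (from the article)
BOTTOM_PRICE = 0.05   # yuan per unit reaching the labeler (the article says "less than" this)
HAND_OFFS = 3         # assumption: one hand-off between each of the four tiers

# Uniform pass-through fraction that would turn 0.50 into 0.05 over three hand-offs.
uniform_keep = (BOTTOM_PRICE / TOP_PRICE) ** (1 / HAND_OFFS)
print(f"each layer passes on about {uniform_keep:.0%} and keeps about {1 - uniform_keep:.0%}")
print(f"check: {payout_after_layers(TOP_PRICE, [uniform_keep] * HAND_OFFS):.3f} yuan")
```

Under these assumptions each intermediary would be keeping a little over half of what it receives, and at least 90% of the original price never reaches the person drawing the box.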

Yanis Varoufakis, former Finance Minister of Greece, in his book "Technofeudalism," presents a penetrating view: today’s tech giants are no longer traditional capitalists but "Cloudalists."

They own not factories and machines but algorithms, platforms, and computing power, the digital territories of the cyber age. In this new feudal system, users are not consumers but digital serfs; our every like, comment, and browse on social media offers up data to the cloud lords for free.

And the data labelers in lower-tier markets are the lowest digital serfs in this system. They not only produce data but also clean, classify, and score massive amounts of raw data, turning it into high-quality feed digestible by large models.

This is a hidden cognitive enclosure movement. Just as the 19th-century English enclosure movement drove farmers into textile mills, today’s AI wave drives youth who find no place in the real economy to screens.

AI has not leveled the class divide; instead, it has built a "data and sweat conveyor belt" stretching from counties in central and western China straight to the headquarters of tech giants in Beijing, Shanghai, Shenzhen, and Guangzhou. The narrative of technological revolution is always grand and splendid, but its underlying color is always the large-scale consumption of cheap labor.

No one sees anything wrong with this.

The Tomorrow That No Longer Needs Humans

The cruelest outcome is coming, faster and faster.

As large model capabilities leap, those labeling tasks that once required human day-and-night labor are being taken over by AI itself.

In April 2023, Li Xiang, founder of Li Auto, revealed at a forum that the company used to label about 10 million frames of autonomous-driving images by hand each year, at an outsourcing cost approaching 100 million yuan. But once it used large models for automated labeling, what used to take a year could be done in roughly 3 hours.

The efficiency is 1000 times that of humans, and this was back in 2023. Just this past March, Li Auto also released its new-generation MindVLA-o1 automatic labeling engine.
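The "1000 times" figure is an order-of-magnitude claim, and the exact ratio depends on how "a year" of manual work is counted. A small Python sketch under two assumed readings (the 250-day, 8-hour working year is hypothetical, not from the article), along with the per-frame cost implied by the quoted totals:

```python
AUTOMATED_HOURS = 3

# Two readings of "a year" of manual work; both are assumptions.
CALENDAR_YEAR_HOURS = 365 * 24   # round-the-clock
WORKING_YEAR_HOURS = 250 * 8     # hypothetical: 250 working days of 8 hours

for label, hours in [("calendar year", CALENDAR_YEAR_HOURS),
                     ("working year", WORKING_YEAR_HOURS)]:
    print(f"{label}: about {hours / AUTOMATED_HOURS:.0f}x faster")

# Manual cost per frame: roughly 100 million yuan for roughly 10 million frames.
cost_per_frame = 100_000_000 / 10_000_000
print(f"manual labeling cost: about {cost_per_frame:.0f} yuan per frame")
```

Depending on the reading, the speedup lands between several hundred and a few thousand times, which brackets the quoted 1000x, while the manual cost comes to at most about 10 yuan per frame.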

A painfully true piece of industry self-mockery goes: "There is only as much intelligence as there is manual labor." But now, big tech companies’ spending on data labeling outsourcing has fallen off a cliff, dropping 40%-50%.

The small-town youth who sat countless days and nights before computers, straining their eyes red, have fed a giant beast with their own hands. And now this beast is turning around to smash their rice bowls.

Night falls, and the office buildings in Datong’s Pingcheng District remain starkly white. Young people changing shifts silently exchange weary shells in the elevator. In this folded space, imprisoned by countless polygonal boxes, no one cares about the epic leaps in the Transformer architecture across the ocean, nor can anyone understand the roar of computing power behind billions of parameters.

Their gaze is welded solely to the red and green progress bar in the backend representing the "qualifying line," calculating whether the piece-rate pennies and dimes can piece together a decent life by month’s end.

On one side, the Nasdaq bell rings and tech media coverage abounds as giants raise glasses to celebrate the advent of AGI; on the other, these digital serfs, who fed the AI mouthful by mouthful with their flesh and blood, can only wait in trepidation, in aching sleep, for the beast they raised with their own hands to casually kick away their rice bowls on some ordinary morning.

No one sees anything wrong with this.

Related Questions

Q: What is the main job of the young people in small towns like Datong, as described in the article?

A: They are data annotators, performing tasks like drawing boxes (2D/3D annotation) and providing feedback (RLHF) to train AI models, often for tech giants.

Q: How did the pay for simple image annotation tasks change from 2017 to 2023?

A: The price for a simple 2D box annotation dropped from over 0.1 yuan (and as much as 0.5 yuan at some companies) to 3-4 fen (0.03-0.04 yuan), a decrease of more than 90% from the peak.

Q: What is RLHF, and what ironic role do the annotators play in this process?

A: RLHF (Reinforcement Learning from Human Feedback) is a process in which humans rate AI responses to teach the model human-like values and empathy. The irony is that low-wage workers, struggling with their own lives, must judge and quantify complex human emotions like warmth and empathy for the AI.

Q: According to the article, what is the predicted future trend for the data annotation job market?

A: The job market is shrinking rapidly as AI automation improves. For example, automated labeling is described as roughly 1000 times faster than humans, and outsourcing spending by major companies has dropped 40-50%, threatening these annotators' jobs.

Q: What term does the article use to describe the new economic structure where tech giants act as 'cloud lords'?

A: The article references the term 'technofeudalism' from Yanis Varoufakis's book, describing tech giants as 'Cloudalists' who own digital territories (algorithms, platforms, computing power), while users are 'digital serfs' and data annotators are the lowest tier of digital serfs.
