In the AI Era, Data is the New Oil: How Can We Ordinary People Go from Exploration to Selling Gasoline?

marsbit · Published 2026-01-19 · Last updated 2026-01-19

Introduction

The article reinterprets the popular metaphor "data is the new oil" for the AI era, arguing that it is not just relevant for tech giants but also actionable for ordinary individuals. It breaks down the process into a practical framework mirroring the oil industry: exploration (finding data), refining (processing data), and selling (monetizing outputs). For exploration, individuals should tap into two types of "oil fields": personal private data (e.g., work notes, emails, decision logs) and public data sources (e.g., articles, podcasts, social media), while avoiding low-quality AI-generated content. The refining stage involves using AI tools (like ChatGPT or Gemini) not as ends in themselves but as part of a broader "refinery"—a personal system of methodologies, prompts, and workflows. The final and most challenging step is monetization: AI outputs are often non-standard (e.g., custom scripts, reports, or advice) and require identifying a target audience—whether for personal use, businesses, or consumers. The article also emphasizes the need for "environmental cleanup" to avoid digital clutter, outdated tools, and unnecessary subscriptions. Ultimately, it urges readers to focus on building their own data reserves, refining processes, and defining their "gas station" (distribution channel) rather than being passive consumers of AI hype.

Author: Huang Shiliang

"Data is the new oil"—this phrase is almost worn out in the AI circle. But in the mainstream narrative, it seems to have nothing to do with us ordinary folks—it's a capital game for tech giants, competing with GPUs and trillions of parameters.

But after thinking it over, I found that this metaphor is actually a very good compass for navigating the AI world.

I. A Severely Misunderstood Metaphor

"Data is the new oil"—this phrase has almost become the bible of the AI era.

But honestly, most people's first reaction upon hearing this is: This is a damn big-company thing, what the hell does it have to do with an ordinary person like me?

Because in the mainstream narrative, the "data" they talk about is stuff on the scale of the entire internet, Wikipedia—petabyte-level things; "refining technology" is tens of thousands of H100 GPUs + a bunch of scientists with million-dollar salaries; the "final product" is an omniscient, all-powerful God-model like GPT-5.

This logic is fine commercially, but the problem is—it basically says: Don't participate, you're not at the table.

We ordinary people are directly kicked out of the game.

Even darker, there's another version of this saying that pisses me off more the more I think about it:

Data is the new oil, our consumer data is the oil field in Venezuela; and companies like Meituan, Alibaba, Douyin are the US's Trump.

They accidentally (actually deliberately) come to our place, stick pipes in to extract oil, take our data for free, refine it into "98-octane gasoline" (precise algorithms, big data price discrimination), and then forcefully sell it back to us.

The result is: we become the suckers—not only contributing the raw materials for free, but also helping the platforms count their money after being sold out.

In this version of the story, the players are only the giants. We have neither massive data nor capital, let alone the ability to train a large model. So "data is the new oil" becomes a slogan that sounds awesome but is utterly useless, even somewhat disgusting, to individuals.

II. Change Your Perspective, and There's Hope

I think this consensus is problematic. We need to look at it from a different angle.

If we insist on applying the concept of "data is the new oil" to ordinary people, then the question is no longer "is this metaphor correct", but rather: how exactly does this thing guide me to work?

The reason the oil industry is awesome is that it has a very clear, unavoidable logical chain:

Find oil fields (exploration) → Build refineries (processing) → Standardize products (gasoline) → Build channels (gas stations) → Sell to users.

For us ordinary people, the "data oil" of the AI era must also be broken down step by step according to this logic. Miss one step, and your AI anxiety will never turn into productivity, only into the mental drain of "scrolling news + saving links + watching others get rich".

Below, I'll break down how ordinary folks should proceed according to this logic.

III. Step One: Where is the Oil Field?—Find the "Micro Rich Mines" Around You

In traditional industries, you go to places like Saudi Arabia, Russia to find oil. But on our path, the oil field is actually right next to you. I think there are at least two major categories.

1. Personal Private Data: Your Own Backyard

This is the most easily overlooked, but the most stable type of data. It doesn't need to be large in scale, but its purity is extremely high.

For example, your work processes, the logic behind your decisions, the pitfalls you've stepped into (failure reviews), and the unwritten rules you've learned from years in the industry.

Also, your digital footprint: notes, code repositories, drafts, emails written over the past decade... these all count.

The value of this lies in: it belongs entirely to you. A "personal digital twin" or "domain expert Agent" trained on this data cannot be replaced by any general-purpose large model.

If in the past 5 years you haven't really used a computer in your work and life, relying solely on a smartphone to get by, then you probably won't evolve into an AI producer, destined to be just an AI consumer.

If you really want to make money with AI, I think you need to buy a computer. Why?

Because without a computer, you most likely have no systematically accumulated data; you are a complete "oil-poor country". Don't expect the few pictures in your phone's gallery, or the dozens of GB of voice messages and fragmented chat records on WeChat, to do anything big. There are too many impurities and too little structure. You can't refine qualified 92-octane gasoline from that; at best you might get some 29-octane stuff.

2. Public Data Rich Mines: Assemble Your "Exploration Team"

The second category is data that everyone can see, but 99% of people are just "consuming" rather than "exploring": X.com, public accounts, arXiv, YouTube... these are the "high seas" of the data era.

The internet today, especially social media, is deteriorating too fast. I dare say, definitely over 50%, maybe even over 90% of the content is AGRC (AI Generated Rubbish Content).

These people use AI to mass-produce nonsense, directly polluting the stratum. If you're not conscious during geological exploration, you'll just dig up garbage.

Worse: if you feed garbage to your brain or to AI, what you refine in the end can only be garbage, and it might even clog your refinery.

So to ensure what you dig up isn't AGRC, I suggest you create a strictly curated "inspiration source portfolio". But note: just reading is useless, this is hoarding crude oil. You must learn crude oil pre-processing: run each source through AI to turn it into fuel the machine can read:

Deep Sedimentary Rock (Books): This is the ballast. Set an annual reading list, must include professional classics and literature.

AI Method: Don't just read passively. Use Gemini or ChatGPT as a reading companion: discuss each chapter with it and let it generate thought questions. After reading, write up electronic reading notes and feed them to AI. This is your knowledge base.

Frontier Exploration Zone (Papers & Reports): Browse arXiv or Google Scholar more often. Force yourself to chew through one paper a week in a "paper lunch meeting".

AI Method: Can't read the raw meat? Throw the PDF directly to NotebookLM or ChatGPT, let it summarize the core arguments and data for you, turn the "tough bones" into "rich broth" to store.

Surface Runoff (News & Information): Use RSS or customized feeds. I scan headlines, only deeply bookmark the truly awesome ones.

AI Method: Don't just bookmark links. Copy the content, let AI help you tag it, extract keywords, categorize and save it to your note-taking software, otherwise bookmarks just gather dust.
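As a rough illustration of "tag it, extract keywords, categorize and save it", here is a minimal Python sketch. It is only a sketch under assumptions: the keyword step uses a plain word-frequency count as a stand-in for the AI call, and the function names, the stopword list, and the JSON Lines notes file are all hypothetical choices, not anything the article prescribes.

```python
import json
import re
from collections import Counter
from datetime import date

# Tiny illustrative stopword list (assumption); a real one would be far longer.
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "are",
             "for", "on", "with", "that", "this", "it", "as", "by", "be"}


def extract_keywords(text, top_n=5):
    """Stand-in for the AI step: return the most frequent non-stopwords."""
    words = re.findall(r"[a-z][a-z\-]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [word for word, _ in counts.most_common(top_n)]


def make_note(title, url, text, category):
    """Bundle the copied content with tags so it can be searched later."""
    return {
        "title": title,
        "url": url,
        "category": category,
        "keywords": extract_keywords(text),
        "saved_on": date.today().isoformat(),
        "text": text,
    }


def append_note(note, path):
    """Append one note per line (JSON Lines) to a local notes file."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(note, ensure_ascii=False) + "\n")
```

In practice you would swap `extract_keywords` for a call to whichever AI assistant you use; the point is that the full text and its tags get stored together, rather than a bare link.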

Associated Gas Field (Podcasts & Lectures): Listen to stuff like TED Radio Hour during commutes. Force yourself to attend one or two offline salons each month.

AI Method: When you hear a good point, don't just nod. Use Whisper to transcribe the audio to text, then let AI organize it into structured notes. Sound cannot be searched, but text can.
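To make "sound cannot be searched, but text can" concrete, here is a minimal Python sketch. It assumes the open-source `openai-whisper` package (plus ffmpeg) is what "Whisper" refers to here; the helper names and the timestamp format are hypothetical. The transcription function imports the package lazily, so the formatting and search helpers run without it.

```python
def transcribe(audio_path, model_name="base"):
    """Transcribe an audio file with the openai-whisper package.

    Assumption: `pip install openai-whisper` and ffmpeg are available locally.
    """
    import whisper  # imported lazily so the helpers below work without it

    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    return result["segments"]  # list of dicts with "start", "end", "text"


def format_timestamp(seconds):
    """Turn 75.4 into '01:15' so a hit can be traced back to the audio."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def segments_to_lines(segments):
    """Convert Whisper-style segments into timestamped, searchable lines."""
    return [f"[{format_timestamp(s['start'])}] {s['text'].strip()}" for s in segments]


def search(lines, keyword):
    """Case-insensitive keyword search over the transcript lines."""
    return [line for line in lines if keyword.lower() in line.lower()]
```

A typical flow would be `search(segments_to_lines(transcribe("talk.mp3")), "refinery")`, where `talk.mp3` is a placeholder file name; the timestamped lines can then be handed to an AI assistant to organize into structured notes.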

High-Yield Oil Well (Social Media): Follow a group of real experts on Twitter/X. Regularly clean your follow list, unfollow those posting emotional garbage.

AI Method: See an awesome Thread, copy it directly to AI, let it analyze the logical flaws, or integrate its viewpoints into your knowledge system.

Field Expedition (Life Observation, Field Research): Deliberately practice "viewing life with questions". This is perceptual data that AI crawlers can never scrape.

AI Method: When inspiration strikes, don't type, just talk via voice, then throw it to AI to organize into a diary. Let AI help you turn ramblings into logical insights.

We must develop the habit of picking up the phone at any moment and blurting out a stream of words to Doubao (or a similar AI assistant).

These six sources are your "hybrid oil field". Only if your input is wild and diverse enough, and all pre-processed by AI, will the stuff you refine not be clichéd.

IV. Step Two: Where is the Refining Equipment?—Don't Just Stare at Large Models

Found the oil, next step is to refine it. Mainstream media hypes you into buying GPUs every day, but for individuals, the real refinery must be your own software stack + thinking process.

1. Large Models are Just a "Boiler"

Paying for a ChatGPT Plus subscription doesn't make us awesome. It's like buying a boiler and then standing next to it admiring the glow, without ever starting work!

Large models like ChatGPT, DeepSeek are, frankly, basic power units, the foundation. They can burn, but that doesn't mean you can produce oil.

2. The Real Refinery is the "Personal Tool System"

An efficient personal refinery needs these components:

Pipelines (Toolchain): VS Code, Python, agent Skills, and the like.

Process (Methodology): This is the core barrier. It's how you write Prompts, how you build a RAG knowledge base, how you make several Agents (skills) cooperate.

The focus is never "how strong the model is", but rather: how do you interact with AI, how do you translate the tacit experience in your brain into instructions AI can understand.

This set of "personal engineering system" is your refinery, not the model itself.
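The "RAG knowledge base" mentioned above can be sketched in a few lines of Python. This is a toy under stated assumptions: it ranks notes by bag-of-words cosine similarity instead of a real embedding model, and every name in it (`vectorize`, `retrieve`, `build_prompt`) is hypothetical. It only shows the shape of the process: retrieve your own notes first, then put them in front of the model.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Bag-of-words vector; a toy stand-in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question, notes, k=2):
    """Return the k notes most similar to the question."""
    q = vectorize(question)
    ranked = sorted(notes, key=lambda n: cosine(q, vectorize(n)), reverse=True)
    return ranked[:k]


def build_prompt(question, notes, k=2):
    """Assemble a prompt that grounds the model in your own notes."""
    context = "\n".join(f"- {n}" for n in retrieve(question, notes, k))
    return (
        "Answer using only the notes below. Say so if they are insufficient.\n"
        f"Notes:\n{context}\n\nQuestion: {question}"
    )
```

The string returned by `build_prompt` is what you would paste or send to the model. The model is interchangeable here; the notes and the retrieval step are the part that belongs to you.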

V. Step Three: The Product is Not the End, Selling it is the Real Battle

This is the cruelest link in the whole chain. Sinopec just needs to transport oil to gas stations, and car owners naturally queue up. But in the AI era, productization and sales are really damn difficult.

1. The "Gasoline" Refined by AI is Extremely Non-Standard

The stuff you refine using "personal data" + "large model" is most likely not universal gasoline, but rather:

  • A Python script only you can use
  • An article with a unique style
  • An AI-processed report on your medical check-up results
  • A set of personalized legal advice

These things are not universal, not standard, and very scenario-specific.

2. The Real Big Problem: Who to Sell To?

So before you start, you must ask in reverse: Who the hell am I going to sell this thing to? Working backwards from the buyer tells us what oil we should refine.

Sell to yourself (Self-use): Saving time is making money, this is the easiest closed loop to achieve.

Sell to businesses (B2B): Package your Prompt or workflow into a solution. This requires extremely strong pre-sales ability (the ability to hustle and persuade).

Sell to the public (B2C): Make it into an App or content column. This depends on whether you have traffic distribution ability.

Actually: Refining oil (generating content) in the AI era is getting easier and easier, but building gas stations (distribution & sales) is unprecedentedly difficult.

VI. Don't Forget Environmental Protection: Don't Let Waste Bury You

Traditional oil refining produces waste residue, wastewater, exhaust. If you don't treat it, you will be choked to death by the fumes before the refinery makes any money.

Data refining is the same, **"cyber pollution"** is extremely serious, you must have an "environmental department" to clean up regularly.

1. Clean Up Expired "Tool Waste"

AI evolves way too damn fast, ridiculously fast.

Of the "Top 10 Must-Use AI Navigation Sites for 2025" you bookmarked last month, five might have shut down this week; the AI drawing parameters you are struggling with today might be made obsolete by "one-click generation" tomorrow.

Don't be a "cyber scavenger", hoarding a bunch of outdated tools unwilling to throw away. Uninstall what needs uninstalling, unfollow what needs unfollowing. Tools are for using, not for worshiping.

Hoarding outdated tools is like filling your house with rusty scrap iron, it only slows down your operating speed.

2. Discard Drained "Data Shells"

Many people have "squirrel syndrome": download upon seeing a PDF, bookmark upon seeing a video, fill their hard drives with several TB of materials, and feel like they own the world.

That's not knowledge, that's landfill garbage.

The true environmentally friendly approach is: use AI to squeeze the "oil" out of PDFs, videos, long articles. Generate summaries, extract golden quotes, convert them into your notes.

Once a file is squeezed dry, throw away the original (or archive it to cold storage). Your attention is an extremely expensive limited resource, don't let these raw files eat up your bandwidth.

Only keep "refined fuel", discard "crude oil shells", this is an efficient refinery.
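The "squeeze, then archive to cold storage" routine could look something like the following Python sketch. The convention it relies on is an assumption made for illustration: a raw file counts as "squeezed dry" once a Markdown note with the same file stem exists in your notes folder. The function name and folder layout are hypothetical.

```python
import shutil
from pathlib import Path


def archive_processed(inbox, notes_dir, archive_dir):
    """Move raw files to cold storage once a note with the same stem exists.

    Assumed convention: inbox/paper.pdf is archived only if
    notes_dir/paper.md exists. Files without a note stay in the inbox.
    """
    inbox, notes_dir, archive_dir = Path(inbox), Path(notes_dir), Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() materializes the listing before any file is moved.
    for raw in sorted(inbox.iterdir()):
        if not raw.is_file():
            continue
        note = notes_dir / f"{raw.stem}.md"
        if note.exists():
            shutil.move(str(raw), str(archive_dir / raw.name))
            moved.append(raw.name)
    return moved
```

Moving instead of deleting keeps the step reversible; whatever is left in the inbox afterwards is exactly the material you have hoarded but not yet refined.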

3. Cut Off Those "Blood-Sucking Zombie Bills"

AI anxiety makes us do many stupid things, the stupidest of which is: rushing to spend money just to soothe that anxiety.

Signing up for classes, buying courses, rushing to conferences, buying Plus memberships... the costs are not low. What's worse, once you subscribe to many of these things, especially the ones billed monthly, you often forget to cancel.

I once bought a server for a test and used it only on the day of that test. For over three years it silently charged me every month, hidden among a pile of bills, and I had no idea.

Also, in a moment of brain fever, I bought ChatGPT, Gemini, Claude, Perplexity... a bunch of auto-renewals, and bought some APIs. Result? Most of the time they were gathering dust.

Damn, what a waste.

These are things that must be cleaned up for "environmental protection". Otherwise, before you refine any sellable oil, your family assets will be stolen by these pollutants.
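One way to hunt for such zombie bills is to scan a statement export for charges that repeat month after month. The Python sketch below is a minimal version under assumptions: the `(date, merchant, amount)` tuple format, the three-month threshold, and the function name are all hypothetical, and real statements would need parsing and merchant-name cleanup first.

```python
from collections import defaultdict


def find_recurring(transactions, min_months=3):
    """Flag merchants that charged the same amount in several distinct months.

    `transactions` is a list of (date, merchant, amount) tuples, e.g. parsed
    from a card statement export (the format is an assumption).
    """
    months_seen = defaultdict(set)
    for day, merchant, amount in transactions:
        months_seen[(merchant, amount)].add((day.year, day.month))

    recurring = []
    for (merchant, amount), months in months_seen.items():
        if len(months) >= min_months:
            recurring.append({
                "merchant": merchant,
                "amount": amount,
                "months": len(months),
                # Amount multiplied by the number of distinct months seen.
                "total": round(amount * len(months), 2),
            })
    # Most expensive subscriptions first.
    return sorted(recurring, key=lambda r: r["total"], reverse=True)
```

Anything this flags that you cannot remember using recently is a candidate for cancellation; one-off purchases never reach the month threshold and are ignored.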

VII. Final Words: An Action Map

When we strip off the grand exterior of "data is the new oil", it is no longer a distant capital story, but a cold, hard roadmap for ordinary people.

In this era, if you want to win, quickly check your "balance sheet":

  • Reserves: Are you still scrolling Douyin? Or are you already consciously accumulating high-quality data through "inspiration sources" + AI assistance? (Remember to avoid AGRC garbage)
  • Production Capacity: Do you have your own set of tools and methodology (refinery), and do you know what oil to refine?
  • Channels: Have you figured out who you are going to sell these non-standard products to? The answer works backwards into production capacity: whether to refine 92-octane or 98-octane oil.
  • Environmental Protection: Are you hoarding a bunch of digital junk? Have you checked your credit card bills to cut off those zombie subscriptions?

Final advice: Forget the news about billions of parameters. Start today: buy a computer, establish your "inspiration data sources", drill your first micro oil well, and sell to yourself first by refining an automated tool that turns your work into a process where AI does the bulk and you play the supporting role.

Actually, I'm also very confused. I've been tinkering with AI for over three years and haven't really refined anything. I only refined an AI to manage my to-do list and another to manage my reading notes. I'm still thinking: what can I refine?

Related Questions

Q: What are the two main types of 'micro-rich mines' of data that the article suggests ordinary people can tap into?

A: The two main types are: 1. Personal private data (your own backyard), such as your work processes, decision-making logic, failure reviews, industry insights, and digital footprints like notes and code. 2. Public data rich mines (publicly available sources), like X.com, arXiv, YouTube, which require careful curation to avoid AI-generated rubbish content (AGRC).

Q: According to the article, what is the true 'refinery' for an individual in the AI era, and what does it consist of?

A: The true 'refinery' is not the large language model itself, but an individual's 'personal engineering system.' This consists of a toolchain (e.g., VS Code, Python) and a methodology (e.g., how to write prompts, build a RAG knowledge base, and orchestrate multiple AI agents). The core barrier is the process of translating one's tacit experience into instructions the AI can understand.

Q: What are the three potential customer groups the article identifies for the 'non-standard gasoline' (AI-generated products) that individuals create?

A: The three potential customer groups are: 1. Yourself (for personal use, saving time is earning money). 2. Enterprises (B2B, packaging your prompts or workflows as solutions). 3. The general public (B2C, creating apps or content columns, which requires traffic distribution capabilities).

Q: What does the article mean by 'cyber pollution' and what are the three specific types of 'environmental cleanup' it recommends?

A: 'Cyber pollution' refers to the digital waste and inefficiencies that can slow you down. The three types of cleanup are: 1. Cleaning up expired 'tool waste' (uninstalling outdated AI tools and navigation sites). 2. Discarding drained 'data shells' (using AI to extract value from files like PDFs and then archiving or deleting the originals, keeping only the refined notes). 3. Cutting off 'blood-sucking zombie bills' (canceling unused subscriptions and API services that automatically renew).

Q: What is the final actionable advice the article gives to ordinary people who want to start participating in the 'data is oil' paradigm?

A: The final advice is to stop focusing on news about billion-parameter models and start today by: buying a computer, establishing your 'inspiration data sources,' drilling your first 'micro oil well,' and first selling to yourself. The initial goal is to refine your work into AI-driven automated tools where AI is the primary worker and you are the secondary assistant.
