Why Are We So Persistent About 'Laborious and Unrewarding' Data Cleaning?

marsbit | Published 2026-01-24 | Updated 2026-01-24

Summary

In this article, the RootData team reflects on its second bounty event, which focused on improving data transparency in Web3. The event drew over 140 participants and 1,220 submissions, of which 564 data points were validated, a 46.2% acceptance rate. Key improvements included identifying key team members from projects like MOMO.FUN and Subhub (often not publicly listed), correcting inaccuracies in token unlock details and TGE timelines, and updating outdated information such as misattributed founders and deprecated social accounts. The author emphasizes that ensuring data transparency, though challenging, is critical for protecting investors' "right to know." In Web3, where misinformation is common (e.g., inconsistent token unlock data across platforms), RootData aims to serve as a reliable source of validated information. The team notes that core team changes around TGE events often signal project risks, yet such details are frequently overlooked. To uphold transparency, RootData publishes monthly reports on false fundraising claims, conducts in-depth analyses (e.g., exchange listing reports), and cross-verifies data rigorously, even declining unverified submissions. They also engage with industry leaders like Binance to align on data accuracy goals. The long-term vision is to transform isolated data points into structured, actionable transparency reports that support informed investmen...

Author: @BlockCookies

Hello everyone, I am the Data Activity Lead at RootData.

The second round of RootData's Bounty Activity has concluded successfully. In this review, rather than just presenting cold numbers, I'd like to discuss a question: why is promoting 'data transparency' in Web3 extremely challenging, yet something that must be done?

First, here is the data for this round: over 140 unique users participated and provided 1,220 pieces of feedback, which ultimately yielded 564 validated data points, an approval rate of 46.2%.
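As a quick sanity check on these figures, the approval rate is simply the number of validated data points divided by the total pieces of feedback. A minimal Python sketch, with variable names chosen only for illustration:

```python
# Figures reported for the second bounty round.
total_feedback = 1220      # pieces of feedback submitted
validated_points = 564     # data points that passed review

# Approval rate = validated / submitted.
approval_rate = validated_points / total_feedback

# Format as a percentage with one decimal place.
approval_rate_text = f"{approval_rate:.1%}"
print(approval_rate_text)
```

The result matches the 46.2% quoted above, which suggests the figure is the overall ratio for the round rather than an average over participants.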

Overview of Round 2 Bounty Activity Data

This activity helped RootData add more than 300 'People Behind the Alpha,' such as executives and leads from MOMO.FUN, Subhub, and boop. These individuals often do not list their positions in their X bios or on LinkedIn, but they may appear at events or be active in communities.

Additionally, we corrected about 120 token unlock records. Some had inaccurate TGE times, and others had unlock rules that were not disclosed promptly; the community's efforts resolved these issues.

Furthermore, we made in-depth corrections to 150 existing data points. For instance, we found that the founder of Fanable had been mistakenly recorded as a non-Web3 individual with the same name, and that its Managing Director, Sergio, had already left; the AINFT project had long since changed its Twitter account...

Why are we pushing for transparency in the Web3 space?

This data might seem mundane, and RootData already specializes in aggregating off-chain data, so why spend our own funds and mobilize the community for such 'grunt work'?

Honestly, when my boss @yubopan1 assigned me this task, I hesitated too. But one thing he said struck a chord: "From the ICO era to the FTX incident, the biggest tragedy for users has been the lack of a fair 'right to know' as investors. As crypto moves towards compliance, data platforms must be at the forefront, acting as that mirror."

As the data lead, I am convinced his judgment is correct: relying on a single source is not enough for accuracy. Data that has not been verified by multiple parties is not enough to make RootData a platform that investors trust.

Take token unlock data alone: it is very 'fragmented.' The same project might have 5 different versions across 5 mainstream unlock platforms.

As is well known, a Binance listing requires submitting at least 3 team members. RootData has cataloged over 18,000 industry figures. How many of them rush to update their resumes before TGE, and how many 'quietly leave' after securing funding?

This round revealed that notable projects often see frequent core team changes around TGE. For investors, this is often a 'barometer' of the project's direction. If no one verifies and discloses these changes, they get lost in the daily information overload.

To ensure 'transparency' isn't just a slogan, the measures we have already put in place include:

  • Monthly disclosures of false funding intelligence.
  • Regular in-depth research, like the recently published "Exchange Listing Decision Report".
  • More frequent dynamic scraping and verification of LinkedIn profiles.

Moreover, we insist on rigorous review standards. In this round, a user provided detailed information on the River development team, but the only source was a post by a third-party account on Binance Square. Despite the level of detail, we chose not to approve it because it lacked official endorsement or multi-source cross-verification.
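The review rule described here can be sketched roughly as follows. This is only an illustration, not RootData's actual review system: the `Source` type, its fields, the `can_approve` function, and the threshold of two independent publishers are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    publisher: str   # hypothetical field: who published the claim
    official: bool   # hypothetical field: True for the project's own channel


# Assumed threshold for "multi-source"; the article does not state a number.
MIN_INDEPENDENT_SOURCES = 2


def can_approve(sources: list[Source]) -> bool:
    """Approve only with official endorsement or multi-source agreement."""
    # An official source counts as endorsement on its own.
    if any(s.official for s in sources):
        return True
    # Otherwise require several distinct publishers backing the claim.
    independent_publishers = {s.publisher for s in sources}
    return len(independent_publishers) >= MIN_INDEPENDENT_SOURCES


# The River case: one detailed post from a single third-party account.
river_submission = [
    Source(publisher="third-party Binance Square account", official=False)
]
print(can_approve(river_submission))
```

Under these assumptions the River submission is rejected however detailed it is, because the rule weighs where the information comes from rather than how much of it there is.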

This round focused on 'Binance Alpha,' and we also tried to communicate with the Binance team. We are not targeting any specific exchange; on the contrary, we hope to stand together with industry giants.

We once reached out to the Binance team to confirm some key dimensions, and the response was very positive: "If there's any information regarding Alpha that needs confirmation, feel free to communicate anytime."

Correcting single data points is just the beginning. In the future, RootData will connect 'discrete data points' into 'logically rigorous transparency reports,' and even turn them into practical investment strategies.

Transparency is a long-term battle and an inevitable path for Web3 to go mainstream. We need more 'data hunters' to join us in lifting the fog. Everyone is welcome to leave comments and discuss.

Related Questions

Q: Why does RootData insist on the laborious task of data cleaning in the Web3 space?

A: RootData believes that data transparency is crucial to giving users a fair 'right to know' as investors, especially after events like the ICO era and the FTX incident. It aims to be a reliable platform by verifying data through multiple sources, since unverified data cannot be trusted.

Q: What were the key outcomes of RootData's second bounty event?

A: The event had over 140 unique participants who provided 1,220 pieces of feedback, resulting in 564 validated data points and an approval rate of 46.2%. It helped add 300+ 'People Behind the Alpha' and corrected about 120 token unlock details.

Q: What challenges exist in maintaining data accuracy for Web3 projects, according to the article?

A: Data is highly fragmented; for example, token unlock information for the same project can differ across five mainstream platforms. Additionally, core team members often change around TGE, which is a critical signal for investors but is easily overlooked without verification.

Q: How does RootData ensure the reliability of the data it collects?

A: RootData applies rigorous verification, including cross-referencing multiple sources and rejecting data without official backing. It also publishes monthly reports on false funding information, conducts in-depth research such as exchange listing reports, and checks LinkedIn profiles more frequently.

Q: What is RootData's long-term goal regarding data transparency?

A: RootData aims to turn discrete data points into logically coherent transparency reports and eventually into practical investment strategies. It seeks to work alongside industry leaders like Binance and encourages more 'data hunters' to join in demystifying Web3 information.
