The DATA Foundation Launches to Tackle AI’s Multi-Billion Dollar Training Data Bottleneck

TheNewsCrypto | Published on 2026-06-25 | Last updated on 2026-06-25

Abstract

Story has rebranded as The DATA Foundation and launched the DATA Network alongside Trace, a public audit layer for AI training data provenance and licensing. The launch includes a flagship integration with Kled, the world's largest opt-in human data marketplace, registering 1.5 billion user-contributed records. This move addresses the AI industry's multi-billion dollar bottleneck in sourcing high-quality, legally compliant training data, as the supply of scrapable public web data is exhausted. Trace provides immutable receipts for each data contribution, enabling verification of consent, licensing, and provenance. The ecosystem also includes Poseidon's data processing and the contributor app Numo. The foundation's thesis is that proving data origin and ensuring fair compensation is critical for advancing trusted AI. The existing $IP token will migrate to $DATA on a one-to-one basis.

Palo Alto, United States, June 25th, 2026, Chainwire

Story rebrands as The DATA Foundation, launches DATA Network with flagship Kled AI integration, registering 1.5 billion user-contributed records on the platform

The Foundation also introduces Trace, the first public audit layer for consent, licensing, and data provenance at scale

Today, Story announces a strategic transition to become The DATA Foundation (“DATA”) and launches Trace, an onchain registry for AI training data provenance and licensing. The launch includes a flagship integration with Kled, the world’s largest opt-in human data marketplace, registering 1.5 billion user-contributed records on the Network. Andrea Muttoni becomes CEO of The DATA Foundation, and Kled’s founder, Avi Patel, joins in an advisory capacity as Chief Data Officer.

AI’s Training Data Has Hit a Bottleneck

The shift to DATA reflects where the market is pulling hardest. AI training data has emerged as the most valuable and least solved category of IP. Frontier AI labs have hit a multi-billion-dollar data bottleneck, where the internet has been effectively exhausted for scraping. The remaining supply is either expensive and bespoke or legally undocumented, leaving labs without a way to source data at scale, prove its provenance, or guarantee its quality.

The legal stakes are rising, as frontier labs stake market-defining products on data sourced through opaque networks, often without clear records of consent or jurisdiction. Scraped and undocumented data is no longer an option for enterprise-grade AI.

“The challenge in AI has shifted from compute and architecture to sourcing and provenance. As the scrapable web fractures, the question for labs now is who is keeping the receipts,” said Andrea Muttoni, CEO of The DATA Foundation. “With Kled, we combine full data transparency and auditability with the largest pool of AI training data on the planet.”

Building the Infrastructure for Trusted AI Data

DATA builds on the original mission to deliver a data and intellectual property (IP) layer for the internet, recognizing that the form of data and IP most critical in this era is AI training data. DATA Network brings essential infrastructure for training AI, anchored by a flagship integration with Kled. Starting today, Kled’s licensing rails and contributor receipts run on DATA Network with added support for stablecoin payouts, a move that registers 1.5 billion user-contributed records with programmatic legal safeguards.

“Frontier labs have exhausted the supply of high-quality, human-generated public text available on the open web. Suppliers showing data-sourcing provenance will win the next decade of deals, and that’s our bet,” said Avi Patel, CEO and founder of Kled and part-time advisory CDO of The DATA Foundation. “Instead of sourcing data blindly, Kled’s data marketplace and DATA’s auditable chain of custody converge on what labs actually need to license data with confidence and transparency.”

Trace Launches as the Public Audit Layer for AI Training Data

Trace, The DATA Foundation’s public audit and search platform, also launches today alongside the Kled integration. Trace generates immutable, confidential receipts for every contribution, allowing labs to verify the legitimacy of datasets in seconds. For every single record uploaded by users worldwide, a receipt on DATA will be generated, enabling upstream compensation for contributors’ data and intellectual property. This addresses an urgent need for a verifiable and compliant AI training data market, which has become a legal and operational minefield.

A Wider Contributor Network

DATA’s thesis was validated by Poseidon, the AI data processing project incubated by Story, which cleans, normalizes, and scores raw human data for authenticity and quality, ensuring every record that reaches a buyer is model-ready. Poseidon’s early traction with frontier labs proved the AI training data opportunity. Backed by a16z and now running entirely on DATA, its contributor app Numo is live today, bringing thousands of contributors into the AI economy in exchange for real-time payouts.

“We started Story to build an IP layer for the internet, and the most important IP of this era is the data you can’t scrape: how a surgeon’s hands move, how a robot grips, how people speak, drive, and work in the real world,” said SY Lee, CEO of PIP Labs and strategic adviser to The DATA Foundation. “DATA is where that conviction goes next: an end-to-end network that proves real-world data’s origin, licenses it, and pays the people who made it.”

Token Migration and Ecosystem Continuity

The $IP token migrates to $DATA one-to-one with no action required from existing holders. Migration guidance, exchange timing, and an FAQ are available here.

About The DATA Foundation

Data is the biggest bottleneck in frontier AI. The data that models need most either sits siloed with people and companies or doesn’t exist yet, and won’t until incentives are aligned to create it. DATA Network is the world’s AI audit rails, built to answer the three questions every lab asks: can you source data at scale, prove where it came from, and guarantee its quality? Contributor apps including Numo and Kled supply opt-in human data; Trace gives every record a public, tamper-proof receipt; Poseidon turns it into model-ready datasets, so frontier AI can keep advancing on a foundation it can trust. $IP is now $DATA. More information is available at datafdn.org.

Contact

HV
henri.vies@piplabs.xyz

Related Questions

Q: What is the main purpose of The DATA Foundation's launch and its DATA Network?

A: The main purpose of The DATA Foundation's launch and its DATA Network is to tackle the multi-billion dollar AI training data bottleneck by providing infrastructure for sourcing, proving provenance, and guaranteeing the quality of AI training data at scale. It focuses on opt-in, legally compliant data to replace scraped and undocumented sources.

Q: What key problem does Trace, the new platform launched by The DATA Foundation, aim to solve?

A: Trace aims to solve the problem of data provenance and licensing transparency for AI training data. It acts as a public audit layer by generating immutable, confidential receipts for every data contribution, allowing AI labs to verify dataset legitimacy, consent, and licensing in seconds, thereby creating a verifiable and compliant AI training data market.

Q: How does the integration with Kled contribute to The DATA Foundation's network?

A: The integration with Kled contributes 1.5 billion user-contributed records to the DATA Network. It brings Kled's licensing rails and contributor receipts onto the network, adding support for stablecoin payouts. This combines Kled's large pool of opt-in human data with DATA's infrastructure for full transparency and auditability, providing AI labs with a trusted source of scalable, documented data.

Q: According to the article, why has scraped data become a problematic source for enterprise AI development?

A: Scraped data has become problematic because the internet has been effectively exhausted as a high-quality source, frontier AI labs face a data bottleneck, and such data often lacks clear records of consent, jurisdiction, or legal documentation. Using opaque, undocumented data poses significant legal risks and is no longer viable for building enterprise-grade AI products that require provenance and compliance.

Q: What changes are occurring with the $IP token as part of The DATA Foundation's launch?

A: The $IP token is migrating to the new $DATA token on a one-to-one basis. Existing holders do not need to take any action for this migration. Guidance on the migration process, exchange timing, and FAQs are provided by the foundation.
