Author: 137Labs
At the end of June 2026, Chainalysis publicly released a data framework called the 'Blockchain Tracing Ontology', which aims to establish a more unified system for describing data in blockchain analysis. Unlike a typical product or feature announcement, the document reads more like a proposed industry standard: it attempts to redefine the fundamental concepts of on-chain data analysis and to establish a data model for blockchain tracing that is interpretable, verifiable, and reproducible.
After its release, the proposal quickly drew attention in the blockchain analysis and digital asset compliance fields. Although it remains an industry initiative under public discussion, it has already prompted a basic question: does on-chain analysis need a more unified and transparent data standard?
A Long-standing Problem: Why Do Different Companies Arrive at Different Analysis Results?
Blockchain data is inherently open and transparent, but there has long been a lack of unified standards for interpreting this data.
Currently, most on-chain analysis platforms rely on 'Address Clustering', which infers from transaction behavior which addresses are likely controlled by the same entity. However, the algorithms, rules, and evidence sources differ from one institution to the next. Consequently, the same address may be attributed to completely different entities on different platforms.
For example, one analysis firm might identify an address as belonging to a major exchange, while another marks it as an unknown wallet; the same batch of addresses might also be assigned to different clusters across platforms. Such discrepancies may have limited impact on market analysis, but when it comes to judicial investigations, asset freezes, anti-money laundering, or law enforcement forensics, they can lead to significant controversy.
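To make this divergence concrete, here is a minimal Python sketch, not any vendor's actual method. It assumes the widely described common-input heuristic (addresses spent together in one transaction share a controller) plus a deliberately naive, hypothetical change-output rule. The function name, the transaction format, and the addresses are all illustrative assumptions.

```python
from collections import defaultdict


def cluster_addresses(transactions, link_change_output=False):
    """Group addresses with a union-find over co-spent inputs.

    transactions: list of dicts with "inputs" and "outputs" address lists
    (an assumed toy format, not a real blockchain data schema).
    link_change_output: hypothetical second rule. When True, a transaction
    with exactly two outputs treats the last output as change and links it
    to the inputs. Real change detection is far more careful than this.
    """
    parent = {}

    def find(addr):
        parent.setdefault(addr, addr)
        while parent[addr] != addr:
            parent[addr] = parent[parent[addr]]  # path halving
            addr = parent[addr]
        return addr

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for tx in transactions:
        inputs, outputs = tx["inputs"], tx["outputs"]
        for addr in inputs + outputs:
            find(addr)  # register every address seen
        for other in inputs[1:]:
            union(inputs[0], other)  # co-spent inputs assumed to share a controller
        if link_change_output and inputs and len(outputs) == 2:
            union(inputs[0], outputs[-1])  # naive "last output is change" rule

    groups = defaultdict(set)
    for addr in list(parent):
        groups[find(addr)].add(addr)
    return sorted(sorted(group) for group in groups.values())


txs = [
    {"inputs": ["A", "B"], "outputs": ["X", "C"]},
    {"inputs": ["C", "D"], "outputs": ["Y"]},
]
firm_one = cluster_addresses(txs)                           # inputs rule only
firm_two = cluster_addresses(txs, link_change_output=True)  # inputs + change rule
```

On these toy transactions, the first rule set keeps A and B apart from C and D, while the second merges all four addresses into one cluster. The data is identical; only the rules differ.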
For courts, simply concluding that 'this is an exchange's wallet' is far from sufficient; a more critical question to answer is: Why can this judgment be made?
What Chainalysis Proposes Is Not a New Algorithm, But a 'Language'
Seeing the word 'Ontology', many might mistakenly think Chainalysis has introduced another new clustering algorithm. That is not the case.
An ontology is a concept from knowledge engineering: a shared system of concepts and relationships that standardizes how different objects are defined and how they relate to one another. Internet search, medical knowledge bases, and AI knowledge graphs all rely heavily on ontologies to ensure that data is interpreted consistently.
What Chainalysis hopes to achieve is establishing a similar 'common language' for blockchain analysis.
In other words, it does not require all companies to adopt the same clustering algorithm, but rather encourages everyone to express analysis results in a unified data structure. This would make the analytical process more transparent and make it easier for third parties to understand, verify, and reproduce the results.
'Cluster' Is No Longer Sufficient
In the past, the industry widely used 'Clusters' as the basic unit of analysis, assuming multiple addresses collectively belonged to a single wallet or entity.
While simple and intuitive, this method's limitations have become increasingly apparent with the evolution of blockchain infrastructure.
Today, the wallet system of a large exchange might contain millions of addresses serving quite different functions, such as deposits, withdrawals, hot and cold wallet management, consolidation, and change. Grouping them all into a single Cluster makes it difficult to describe this complex wallet structure accurately.
Therefore, in its proposal, Chainalysis introduces the new concept of 'Wallet Segment'.
In the new model, an Entity can have multiple Wallets, each Wallet can be divided into multiple Wallet Segments, and each Segment contains specific Addresses. This hierarchy reflects the wallet management model of large institutions more realistically than the traditional Cluster approach, and it allows a more granular description of the control relationships between addresses.
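The hierarchy can be pictured with a small Python sketch. This is an illustration under stated assumptions rather than the proposal's actual schema: the class and field names, the segment role labels, and the example addresses are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class WalletSegment:
    role: str  # assumed labels such as "deposit", "hot", "cold"
    addresses: List[str] = field(default_factory=list)


@dataclass
class Wallet:
    name: str
    segments: List[WalletSegment] = field(default_factory=list)


@dataclass
class Entity:
    name: str
    wallets: List[Wallet] = field(default_factory=list)

    def locate(self, address: str) -> Optional[Tuple[str, str]]:
        """Return (wallet name, segment role) for an address, or None."""
        for wallet in self.wallets:
            for segment in wallet.segments:
                if address in segment.addresses:
                    return wallet.name, segment.role
        return None

    def flat_cluster(self) -> List[str]:
        """Collapse the hierarchy into the traditional single-cluster view."""
        return sorted(
            addr
            for wallet in self.wallets
            for segment in wallet.segments
            for addr in segment.addresses
        )


exchange = Entity(
    name="Example Exchange",  # hypothetical entity
    wallets=[
        Wallet("BTC treasury", [
            WalletSegment("deposit", ["addr_dep_1", "addr_dep_2"]),
            WalletSegment("hot", ["addr_hot_1"]),
            WalletSegment("cold", ["addr_cold_1"]),
        ])
    ],
)
```

Here `exchange.locate("addr_hot_1")` reports which wallet and segment an address sits in, while `exchange.flat_cluster()` returns only a bare list of addresses. That lost role information is what the Segment level is meant to preserve.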
From 'Trust in Results' to 'Trust in the Process'
Beyond the model itself, a more significant change lies in the second layer of the design.
Traditional on-chain analysis focused mainly on the final result: who an address belongs to, where funds flow, and whether illicit activity is involved.
The new Ontology emphasizes the inference process itself.
For every analytical conclusion, several questions should be explicitly answered:
- What on-chain evidence supports this conclusion?
- Which analytical rules were applied?
- Was off-chain information referenced?
- What is the confidence level of this inference?
- Can a third party re-validate this process?
In other words, it's not just about telling someone 'what', but also explaining 'why'.
Chainalysis refers to this part as the Evidence and Confidence layer.
In the future, marking an address as an exchange wallet would no longer be just a simple label; it would come with the full basis for the inference, including transaction patterns, address relationships, public information, and investigation records, along with a corresponding confidence level. This design aligns better with the explainability that judicial evidence requires and makes cross-validation between institutions easier.
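As a rough illustration of what a label with its reasoning attached might look like, the following Python sketch models an attribution record that answers the questions listed above. It is an assumption-laden sketch, not the proposal's real format: every class, field, rule name, and reference value is hypothetical, and the confidence value is simply stored rather than computed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Evidence:
    description: str  # what was observed
    rule: str         # which analytical rule produced it (hypothetical names)
    on_chain: bool    # False means off-chain information was referenced
    reference: str    # placeholder pointer a third party could re-check


@dataclass
class Attribution:
    address: str
    label: str
    confidence: float  # assumed to be a score between 0 and 1
    evidence: List[Evidence]

    def __post_init__(self):
        # A label without stated grounds is exactly what the layer rules out.
        if not self.evidence:
            raise ValueError("an attribution needs at least one piece of evidence")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")

    def rules_applied(self) -> List[str]:
        return sorted({item.rule for item in self.evidence})

    def uses_off_chain(self) -> bool:
        return any(not item.on_chain for item in self.evidence)


label = Attribution(
    address="addr_hot_1",
    label="exchange hot wallet",
    confidence=0.9,
    evidence=[
        Evidence("co-spent with known deposit addresses", "common-input", True, "tx-placeholder-1"),
        Evidence("receives sweeps from deposit segment", "deposit-pattern", True, "tx-placeholder-2"),
        Evidence("address listed in a public disclosure", "public-record", False, "doc-placeholder-1"),
    ],
)
```

In this sketch, `label.rules_applied()` and `label.uses_off_chain()` let a reviewer see at a glance which rules were used and whether off-chain material was involved, and constructing an `Attribution` with an empty evidence list raises an error instead of producing a bare label.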
The Insight from the Bitcoin Fog Case
In fact, this proposal did not emerge in a vacuum; it is closely related to the prominent U.S. Bitcoin Fog money laundering case.
Bitcoin Fog was once one of the longest-running Bitcoin mixing services. During its investigation, the U.S. Department of Justice relied heavily on analysis from Chainalysis Reactor as key evidence.
During the trial, the court held a notable Daubert hearing, subjecting Chainalysis's analysis methods to rigorous scrutiny, including:
- Whether address clustering has a scientific basis;
- Whether the analysis method can be repeatedly verified;
- Whether it constitutes an unexplainable 'black-box algorithm';
- Whether other experts can independently reproduce the analysis process.
Ultimately, the court recognized that Chainalysis's analytical methods possessed sufficient scientific reliability to be admissible as judicial evidence.
However, the case also exposed a weakness in the industry: if different analysis firms apply different standards, similar cases in the future could face more challenges. Establishing a unified framework for expressing data and evidence therefore became a key driver behind Chainalysis's push for the Ontology.
Blockchain Analysis Cannot Directly Identify Real-World Identities
It is worth noting that Chainalysis specifically emphasizes a key point in this proposal: on-chain analysis itself cannot directly identify real-world individual identities.
On-chain data can only reveal relationships between addresses and fund flow paths. Determining the real-world controller behind an address typically still relies on off-chain evidence, such as exchange KYC information, data subpoenaed by courts, server logs obtained by law enforcement, etc.
This means blockchain analysis provides high-quality inference from data, not definitive proof of identity. A complete chain of judicial evidence requires combining on-chain data with off-chain investigation.
From Data Quality to Industry Standard
Beyond the Ontology itself, the overall framework presented systematically addresses data quality, analytical transparency, and judicial admissibility. It is evident that Chainalysis aims to encourage the industry to focus not just on analysis results, but on whether the analytical process can be explained, verified, and reproduced.
This also indicates that the future competitive focus of the industry may shift from 'who covers more addresses' or 'who identifies more labels' to 'whose data quality is higher', 'whose analysis is more transparent', and 'whose evidence is more readily admissible in court'.
For regulators, law enforcement agencies, and large financial institutions, a system capable of explaining its analytical logic, supporting independent audits, and possessing reproducible verification capabilities is clearly more trustworthy than a 'black-box model' that only outputs results.
What Does This Proposal Mean?
From a longer-term perspective, what Chainalysis released this time is not an ordinary software upgrade, but more akin to an effort to push the blockchain analysis industry from being 'experience-driven' to 'standard-driven'.
If this Ontology is widely adopted by the industry, different analysis firms, exchanges, regulatory bodies, and even judicial authorities could potentially share analysis results under a unified data model, reducing communication costs, improving evidence consistency, and providing a more reliable foundation for cross-border law enforcement, anti-money laundering investigations, and digital asset regulation.
Of course, standards are not established overnight. Balancing commercial secrecy with transparency, encouraging different institutions to adopt unified norms, and continuously refining the evidence model will all require joint effort across the industry.
What seems clear, however, is that as digital assets become increasingly integrated into the global financial system, the focus of competition in blockchain analysis is changing: the industry's future value will be determined not just by the accuracy of algorithms, but also by the explainability of the analysis process, data quality, and evidence credibility. This is precisely the new direction that Chainalysis hopes to open up with the Blockchain Tracing Ontology.





