Your Backtest Is Lying: Why You Must Use Point-in-Time Data

insights.glassnode, published 2026-03-13, updated 2026-03-13

Summary

The article warns that backtests run on revised historical data are misleading because of look-ahead bias. It illustrates this with a hypothetical trading strategy based on Bitcoin outflows from Binance: enter the market when the 5-day moving average of the exchange's BTC balance falls below the 14-day average, and exit on the reverse. A backtest on current data looked reasonably successful and ended up comparable to buy-and-hold. The author argues this is deceptive: metrics such as exchange balances are often revised retroactively as new data arrives, so the backtest uses information that was not available in real time. Re-running the identical strategy on Point-in-Time (PiT) data, which is immutable and reflects only what was known when each value was first published, produced significantly worse performance. The PiT-based strategy captured key market upswings less effectively and finished with meaningfully lower cumulative returns. The key takeaway is that only immutable Point-in-Time data lets a backtest replay history without look-ahead bias; revised data produces misleading backtests.

Let's build a simple, hypothetical trading strategy. The premise is straightforward and rooted in a widely discussed narrative: when coins leave exchanges, it tends to be bullish. The reasoning is intuitive: coins moving off exchanges typically signal that holders are withdrawing to self-custody, reducing the available supply for selling. Conversely, coins flowing onto exchanges may indicate that holders are preparing to sell.

A single day of outflows, however, is just noise. To identify a genuine trend, we apply a moving average crossover to the exchange balance. When the short-term average falls below the long-term average, it confirms that coins have been leaving exchanges consistently, as a sustained pattern rather than as isolated events.

Using Glassnode's exchange balance for Binance, we define the following:

  • Enter the market when the 5-day moving average of Binance's BTC balance falls below its 14-day moving average, signaling a sustained outflow trend.
  • Exit the market when the 5-day average rises back above the 14-day average, signaling that the outflow trend has reversed and coins are returning to the exchange.
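The two rules above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the article's actual implementation: the function names `moving_average` and `crossover_signal` are hypothetical, the balance values are synthetic, and treating the days before 14 observations exist as "out of the market" is an assumption.

```python
from typing import List, Optional


def moving_average(values: List[float], window: int) -> List[Optional[float]]:
    """Trailing simple moving average; None until `window` points exist."""
    out: List[Optional[float]] = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        out.append(running / window if i >= window - 1 else None)
    return out


def crossover_signal(balance: List[float], short: int = 5, long: int = 14) -> List[int]:
    """1 = in the market (short MA below long MA), 0 = out of the market."""
    short_ma = moving_average(balance, short)
    long_ma = moving_average(balance, long)
    signal = []
    for s, l in zip(short_ma, long_ma):
        # Assumption: stay out of the market until both averages are defined.
        in_market = s is not None and l is not None and s < l
        signal.append(1 if in_market else 0)
    return signal


# Synthetic balances (not real data): 14 flat days, then steady outflows.
balance = [100.0] * 14 + [99.0, 98.0, 97.0, 96.0, 95.0]
print(crossover_signal(balance))
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
```

The signal switches to 1 on the first day of outflows because the 5-day average reacts faster than the 14-day average.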

We then benchmark this strategy against simply holding BTC over the same period, from January 1, 2024 through March 9, 2026, with an initial capital of $1,000 and a 0.1% trading fee applied to each trade.
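A rough sketch of how such a benchmark could be computed follows. Several details are assumptions the article does not spell out: the hypothetical `backtest` function trades at the next day's price after a signal is observed, goes all-in or all-out, charges the 0.1% fee on the traded amount, and marks any open position to market on the last day without an exit fee.

```python
from typing import List, Tuple


def backtest(prices: List[float], signal: List[int],
             capital: float = 1000.0, fee: float = 0.001) -> Tuple[float, float]:
    """Return (strategy_value, buy_and_hold_value) at the end of the period.

    Assumption: a signal observed on day t is traded at day t+1's price,
    so the strategy never acts on a value before it could have been seen.
    """
    cash, coins = capital, 0.0
    for t in range(1, len(prices)):
        target = signal[t - 1]  # yesterday's signal drives today's trade
        if target == 1 and coins == 0.0:
            coins = cash * (1 - fee) / prices[t]  # buy, paying the fee
            cash = 0.0
        elif target == 0 and coins > 0.0:
            cash = coins * prices[t] * (1 - fee)  # sell, paying the fee
            coins = 0.0
    strategy_value = cash + coins * prices[-1]
    # Benchmark: buy on day 0, pay the entry fee once, hold to the end.
    hold_value = capital * (1 - fee) / prices[0] * prices[-1]
    return strategy_value, hold_value


# Made-up prices and signal, for illustration only.
prices = [100.0, 100.0, 110.0, 121.0, 121.0]
signal = [0, 1, 1, 0, 0]
strategy_value, hold_value = backtest(prices, signal)
print(round(strategy_value, 2), round(hold_value, 2))
```

In this toy run the strategy enters at 110 and exits at 121, paying two fees, while buy-and-hold pays one fee and captures the full move from 100 to 121.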

This is a simplified trading strategy, designed primarily for illustrative purposes. It is not investment advice, nor is it meant to suggest that exchange balances are a robust foundation for a trading system.
[Chart: exchange balance strategy versus buy-and-hold, with the binary trading signal]

Here's how to read this chart:

🟫 The brown line at the bottom is the binary trading signal, toggling between in the market (1) and out of the market (0).

🟦 The blue line tracks the strategy's portfolio value over time.

🟩 The green line is the buy-and-hold portfolio benchmark.

We can observe that the exchange balance strategy performed reasonably well, although at times the buy-and-hold strategy outperformed it. In the final days of the research period, however, the exchange balance strategy caught up. While some investors may find the combination of reduced volatility and an ultimately comparable performance to buy-and-hold appealing, the final numbers are misleading – and here’s why.

The Problem: Data Mutation and Look-Ahead Bias

Metrics are not static. Many are retroactively revised as new information becomes available. This is particularly true for metrics that depend on address clustering or entity labeling, such as on-chain exchange balances. It also applies to metrics such as trading volume or price, because individual exchanges occasionally submit their data with slight delays.

This means that a value you see today for, say, January 15, 2024, may not be the same value that was published on January 15, 2024. The data has been revised with hindsight. When you backtest a strategy on this revised data, you are implicitly using information that was not available at the time the trading decisions would have been made. This introduces a look-ahead bias.
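To make the mechanism concrete, here is a small sketch of a revision log in which one data point has two versions. Everything in it is hypothetical: the `Revision` record, the `value_as_of` helper, the revision date, and the balance figures are invented for illustration and do not describe Glassnode's data model.

```python
from datetime import date
from typing import List, NamedTuple, Optional


class Revision(NamedTuple):
    observed: date    # the day the value describes
    published: date   # the day this version of the value became available
    value: float


def value_as_of(log: List[Revision], observed: date, as_of: date) -> Optional[float]:
    """Latest version of `observed`'s value that existed on `as_of`."""
    known = [r for r in log if r.observed == observed and r.published <= as_of]
    if not known:
        return None
    return max(known, key=lambda r: r.published).value


# Hypothetical revision history for one data point (numbers are made up).
log = [
    Revision(date(2024, 1, 15), date(2024, 1, 15), 560_000.0),  # first print
    Revision(date(2024, 1, 15), date(2024, 6, 1), 548_500.0),   # later relabeling
]

seen_live = value_as_of(log, date(2024, 1, 15), as_of=date(2024, 1, 15))
seen_today = value_as_of(log, date(2024, 1, 15), as_of=date(2026, 3, 9))
print(seen_live, seen_today)  # 560000.0 548500.0
```

A backtest that reads `seen_today` for a decision made on January 15, 2024 is using the June revision, which a live trader could not have seen.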

The Honest Backtest: Using Point-in-Time Data

Let's therefore repeat the exact same backtest – same signal logic, same parameters, same dates, same fees – but this time using the Point-in-Time (PiT) variant of the Exchange Balance metric, available in Glassnode Studio.

PiT metrics are strictly append-only and immutable. Each historical data point reflects only the information that was known at the time it was first computed. No retroactive revisions, no look-ahead bias.
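The append-only property can be sketched as a tiny data structure that refuses to overwrite history. This is an illustrative toy under stated assumptions, not how Glassnode stores PiT metrics: the `PointInTimeSeries` class and its methods are hypothetical names, and the values are made up.

```python
from datetime import date
from typing import List, Tuple


class PointInTimeSeries:
    """Append-only store: every date is written once and never changed."""

    def __init__(self) -> None:
        self._rows: List[Tuple[date, float]] = []

    def append(self, day: date, value: float) -> None:
        # Reject any write at or before the latest stored date, which
        # rules out both revisions and back-filled history.
        if self._rows and day <= self._rows[-1][0]:
            raise ValueError(f"{day} is already fixed; PiT rows are immutable")
        self._rows.append((day, value))

    def history(self) -> List[Tuple[date, float]]:
        return list(self._rows)  # copy, so callers cannot mutate the store


series = PointInTimeSeries()
series.append(date(2024, 1, 14), 561_200.0)  # made-up values
series.append(date(2024, 1, 15), 560_000.0)

try:
    series.append(date(2024, 1, 15), 548_500.0)  # attempted revision
    revised = True
except ValueError:
    revised = False

print(revised, series.history()[-1][1])  # False 560000.0
```

Because the first published value is the only one ever stored, replaying `history()` reproduces exactly what a live strategy would have seen on each day.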

While we are using the same metric, the strategy now produces significantly different results, as illustrated by the purple line in the new chart below. The overall performance is notably worse.

Although both strategies behave similarly for much of 2024, we observe that the PiT-based version fails to capture the strong upticks in November 2024 and March 2025 as effectively. As a result, the cumulative performance diverges meaningfully and ends up considerably lower.
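One simple way to locate where two such backtests part ways is to compare their signals day by day. The sketch below assumes both signals are already available as aligned lists of 0 and 1; the `divergent_days` helper and the sample values are hypothetical and do not reproduce the article's results.

```python
from typing import List, Tuple


def divergent_days(dates: List[str], revised: List[int],
                   pit: List[int]) -> List[Tuple[str, int, int]]:
    """Days on which the two backtests hold different positions."""
    if not (len(dates) == len(revised) == len(pit)):
        raise ValueError("all three series must have the same length")
    return [(d, r, p) for d, r, p in zip(dates, revised, pit) if r != p]


# Made-up signals for illustration only.
dates = ["2024-11-04", "2024-11-05", "2024-11-06", "2024-11-07"]
revised_signal = [0, 1, 1, 1]  # revised data enters on 11-05
pit_signal = [0, 0, 0, 1]      # PiT data enters two days later

print(divergent_days(dates, revised_signal, pit_signal))
# [('2024-11-05', 1, 0), ('2024-11-06', 1, 0)]
```

Each listed day is one where the revised-data backtest was positioned differently from what the as-published data would have allowed, which is where the return gap accumulates.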

[Chart: the same backtest with the PiT-based strategy added as a purple line]

Key Takeaway

In this example, the purple strategy, which only has access to information as it was available at the time, performs noticeably worse.

► Backtests will lie if fed with wrong or revised data. Only immutable, Point-in-Time metrics ensure you’re replaying history as it actually happened.

Related questions

Q: What is the main premise of the hypothetical trading strategy described in the article?

A: The strategy is based on the narrative that when coins leave exchanges, it is a bullish signal, as it indicates holders are moving to self-custody and reducing available supply. It enters the market when the 5-day moving average of Binance's BTC balance falls below its 14-day average and exits when it rises back above.

Q: What is the primary problem identified with using standard historical data for backtesting?

A: The primary problem is data mutation and look-ahead bias. Many metrics are retroactively revised as new information becomes available, meaning a backtest uses data that was not known at the time of the original trading decision, leading to inaccurate and overly optimistic results.

Q: What is Point-in-Time (PiT) data and how does it differ from standard data?

A: Point-in-Time data is strictly append-only and immutable. Each historical data point reflects only the information that was known and published at that specific time, with no retroactive revisions. This prevents look-ahead bias in backtesting.

Q: How did the performance of the strategy change when backtested with Point-in-Time data?

A: The performance was significantly worse. The PiT-based strategy failed to capture strong market upticks as effectively, leading to a meaningfully lower cumulative performance compared to the backtest using revised data.

Q: What is the key takeaway from the article regarding backtesting and data?

A: The key takeaway is that backtests will produce misleading results if they use revised data. Only immutable, Point-in-Time metrics ensure an accurate historical replay of what was actually knowable at the time, preventing look-ahead bias.
