Working paper

Robustness in Backtest Inference

How much of a historical investment edge survives multiple testing, costs, and out-of-sample evaluation.

Status: Working paper
Version: Working paper
Date: 2026-05
Authors: Aryan Patel

Abstract

This method-focused study examines how inference choices, model selection, and repeated testing inflate the apparent performance of historical trading strategies, and what survives once those effects are controlled for. It develops a structure-preserving randomization test that holds the trade structure and price path fixed and re-randomizes only entry placement, separating overall profitability from genuine entry-timing skill. A companion working paper presents the method and results in full.
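
As a rough illustration of the randomization idea described above, the Python sketch below keeps the price path and each trade's holding length fixed and redraws only the entry positions. It is a minimal sketch under stated assumptions, not the paper's implementation: long-only trades, total log return as the measure, uniform redraws over admissible bars, and overlapping trades allowed under the null are all simplifications, and the names `entry_randomization_test`, `total_log_return`, `entries`, and `holds` are hypothetical.

```python
import numpy as np


def total_log_return(log_prices, entries, holds):
    """Sum of log returns for long trades entered at `entries` and held `holds` bars."""
    entries = np.asarray(entries, dtype=int)
    holds = np.asarray(holds, dtype=int)
    return float(np.sum(log_prices[entries + holds] - log_prices[entries]))


def entry_randomization_test(prices, entries, holds, n_draws=2000, seed=0):
    """Compare observed performance with a null that re-randomizes entries only.

    The price path and every trade's holding length stay fixed (assumed
    definition of "trade structure" for this sketch).
    Returns (observed, null_mean, p_value).
    """
    rng = np.random.default_rng(seed)
    log_prices = np.log(np.asarray(prices, dtype=float))
    holds = np.asarray(holds, dtype=int)

    # Latest admissible entry per trade so that each exit stays on the path.
    max_entry = len(log_prices) - 1 - holds
    if np.any(max_entry < 0):
        raise ValueError("a holding length exceeds the price path")

    observed = total_log_return(log_prices, entries, holds)

    null = np.empty(n_draws)
    for i in range(n_draws):
        # One uniform redraw per trade; holding lengths are untouched.
        random_entries = rng.integers(0, max_entry, endpoint=True)
        null[i] = total_log_return(log_prices, random_entries, holds)

    # One-sided p-value with the usual +1 correction for randomization tests.
    p_value = (1 + np.sum(null >= observed)) / (1 + n_draws)
    return observed, float(null.mean()), float(p_value)
```

Because the null draws trade the same path with the same holding lengths, a rule that is profitable only because the path drifted upward would show a null mean close to its observed return and a large p-value. In this sketch, only entry placement can separate the two.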

Research question

How much of a historical investment edge survives multiple testing, costs, and out-of-sample evaluation?

Methods

Structure-preserving randomization
Multiple-testing control
Out-of-sample evaluation
Sensitivity to declared measures
Cross-asset robustness
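
Multiple-testing control can take several forms, and the specific procedure used in the study is not stated here. As one hypothetical illustration, the sketch below applies the Holm step-down adjustment, which controls the family-wise error rate, to a set of per-rule p-values such as those from a randomization test. The function name `holm_adjust` and the example p-values are assumptions made for illustration.

```python
import numpy as np


def holm_adjust(p_values):
    """Holm step-down adjusted p-values, returned in the input order."""
    p = np.asarray(p_values, dtype=float)
    m = len(p)
    order = np.argsort(p)

    # Multipliers m, m-1, ..., 1 applied from the smallest p-value upward.
    scaled = (m - np.arange(m)) * p[order]

    # Enforce monotonicity across the sorted sequence, then cap at 1.
    adjusted_sorted = np.minimum(np.maximum.accumulate(scaled), 1.0)

    # Map the adjusted values back to the original rule order.
    adjusted = np.empty(m)
    adjusted[order] = adjusted_sorted
    return adjusted
```

For instance, `holm_adjust([0.01, 0.04, 0.03])` gives approximately `[0.03, 0.06, 0.06]`, so only the first of those three hypothetical rules would still pass a 0.05 threshold after adjustment, although all three do before it.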

Data

Gold futures and a cross-asset panel of public daily market data; provenance recorded.

Results

Across the examined panel, profitability is common, but no rule shows robust, measure-invariant entry-timing skill after multiplicity control. These results come from a working paper that has not been published.

Limitations

Findings are primarily single-asset; the cross-asset panel is exploratory, and conclusions depend on the declared re-randomization measures.

Code availability: Internal. Release forthcoming.
Data availability: Public market data.
AI disclosure: AI agents assisted with literature retrieval, code generation, analysis support, critique, and drafting. Deterministic systems produced and verified the estimates. A human researcher approved the question, design, interpretation, and release.
Reproduction status: Internal validation.