
Chapter 2: Pre-optimization Issues

Assessing and Improving Stationarity

Understanding Stationarity

Stationarity Definition:
- Stationarity of a time series means its statistical properties (mean, variance, etc.) remain constant over time.
- In practical terms, a trading system fitted to historical data can be expected to keep working only if the statistical behavior it was fitted to persists, which is why stationarity is critical for designing trading systems.
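
The definition can be made concrete by comparing summary statistics across consecutive windows of a series. The sketch below is illustrative only and is not part of STATN.CPP; the names WindowStats and window_stats are hypothetical. For a stationary series, the per-window means and variances should look alike apart from sampling noise.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Mean and (population) variance of one window of a series.
struct WindowStats {
    double mean;
    double variance;
};

// Split x into consecutive non-overlapping windows of length `window`
// and summarize each one. A trailing partial window is dropped.
std::vector<WindowStats> window_stats(const std::vector<double>& x, std::size_t window) {
    assert(window > 0);
    std::vector<WindowStats> out;
    for (std::size_t start = 0; start + window <= x.size(); start += window) {
        double sum = 0.0;
        for (std::size_t i = start; i < start + window; ++i)
            sum += x[i];
        const double mean = sum / static_cast<double>(window);
        double ss = 0.0;
        for (std::size_t i = start; i < start + window; ++i)
            ss += (x[i] - mean) * (x[i] - mean);
        out.push_back({mean, ss / static_cast<double>(window)});
    }
    return out;
}
```

A mean that drifts steadily from window to window, or a variance that changes by a large factor, is the kind of behavior the rest of this chapter is concerned with.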

Key Points About Market Nonstationarity

Inherent Nonstationarity:
- Financial markets, and indicators derived from them, are inherently nonstationary.
- The properties of market data constantly change.
Challenges:
- Traditional statistical tests for nonstationarity are of little practical use here, because on market data they almost invariably report significant nonstationarity and so cannot separate harmless cases from harmful ones.
- Nonstationarity can manifest in various ways:
  - The mean may wander while the variance remains constant, or vice versa.
  - Skewness and other statistical properties may also change.
Impact on Trading Systems:
- Some types of nonstationarity may not affect trading systems, while others can be detrimental.
- Different trading systems may be sensitive to different forms of nonstationarity.

Evaluating Stationarity

Visual Analysis:
- Plotting indicators can reveal nonstationarity that impacts trading models.
- Look for slow changes in central tendency or variance that can disrupt model predictions.
Equity Curve Analysis:
- Study the equity curve of your trading system:
  - Look for periods of excellent performance versus mediocre or poor performance.
  - Consider whether the strong periods simply reflect market conditions that happened to favor the system during the backtest.

Strategies for Improving Stationarity

Avoid Overfitting:
- Ensure the trading system performs well across different market conditions, not just in a favorable segment of the backtest history.
- Regularly update and tweak the system to adapt to changing market conditions.
Use Progressive Walkforward Testing:
- Progressive walkforward testing is a robust method to validate trading systems after development.
- This method helps assess how well a system adapts to new data over time.
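
The notes do not spell out the mechanics of the walkforward procedure, so the following is only a minimal sketch of the generic idea, assuming a fixed-length training window that slides forward by one test block at a time. Fold and walkforward_folds are hypothetical names, and all ranges are half-open index ranges into the price history.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// One walkforward fold: train on [train_begin, train_end),
// then test out of sample on [test_begin, test_end).
struct Fold {
    std::size_t train_begin, train_end;
    std::size_t test_begin, test_end;
};

// Assumed scheme: fixed training length, window advances by test_len each fold.
std::vector<Fold> walkforward_folds(std::size_t n, std::size_t train_len, std::size_t test_len) {
    assert(train_len > 0 && test_len > 0);
    std::vector<Fold> folds;
    for (std::size_t start = 0; start + train_len + test_len <= n; start += test_len) {
        const std::size_t split = start + train_len;
        folds.push_back({start, split, split, split + test_len});
    }
    return folds;
}
```

With 10 bars, a training length of 4, and a test length of 2, this yields three folds whose test blocks cover bars 4-5, 6-7, and 8-9. Each test block lies strictly after the data used to fit the system for that fold.
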
Monitor Indicators:
- Regularly review plots of your indicators to detect slow wandering or changes in variance.
- Be vigilant about prolonged periods where indicators deviate from expected behavior.
Dynamic Adjustment:
- Be prepared to tweak or redesign the system when market conditions change.
- Ensure that the development and testing period includes a variety of market conditions to account for potential nonstationarity.

Conclusion

Expect Market Changes:
- Understand that market conditions will change, affecting the performance of trading systems.
- Design systems that can adapt to these changes to maintain consistent performance.
Focus on Robustness:
- Aim for a trading system that performs well across different conditions rather than excelling only in specific scenarios.
- Continuous monitoring and adjustment are key to managing the impact of nonstationarity on trading systems.

The STATN Program

Introduction

For those who prefer concrete data over subjective analysis, the STATN.CPP program provides a quantitative approach to evaluating market stationarity. This program checks the trend and volatility over time and can be easily modified to include other market indicators.

Program Overview

The principle behind the program is that a trading system developed under one set of market conditions will likely perform poorly under different conditions. To develop robust models, the market conditions reflected in our indicators should vary regularly and randomly, so that the development data covers many regimes rather than one long stretch of similar behavior. The program looks for "slow wandering" in market properties, which is a sign of dangerous nonstationarity.

Command Structure

The program is run with the following command:

STATN Lookback Fractile Version Filename

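Lookback: Number of historical bars used to compute trend and volatility.
Fractile: Fractile (0–1) of the indicator's sorted values that serves as the threshold for gap analysis; 0.5 corresponds to the median.
Version: Determines the modification of indicators:
- 0 for raw indicators.
- 1 for differenced indicators (the current value minus the value Lookback bars earlier).
- >1 for the raw indicator minus the same indicator computed over an extended lookback of Version * Lookback bars.
Filename: The market history file, one bar per line in the format YYYYMMDD Open High Low Close.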

Example Command

STATN 20 0.5 1 market_history.txt

Code Snippets

Full Lookback Calculation

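full_lookback is the number of prices needed to compute a single indicator value, and it depends on the Version parameter. nind is the number of indicator values that can then be computed from nprices prices.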

if (version == 0)
    full_lookback = lookback;
else if (version == 1)
    full_lookback = 2 * lookback;
else if (version > 1)
    full_lookback = version * lookback;
nind = nprices - full_lookback + 1; // This many indicators
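
To make the arithmetic concrete, the same rule can be wrapped in two small functions. This is a sketch for checking numbers by hand; the function names are hypothetical and do not appear in STATN.CPP.

```cpp
#include <cassert>

// Prices needed for one indicator value, following the Version rule above.
int compute_full_lookback(int lookback, int version) {
    assert(lookback > 0 && version >= 0);
    if (version == 0)
        return lookback;            // raw indicator
    if (version == 1)
        return 2 * lookback;        // current window plus the one before it
    return version * lookback;      // extended window
}

// Number of indicator values available from nprices prices.
int indicator_count(int nprices, int full_lookback) {
    assert(nprices >= full_lookback);
    return nprices - full_lookback + 1;
}
```

With 1000 prices and a Lookback of 20, Version 0 gives 981 indicator values, Version 1 gives 961, and Version 3 gives 941. Version 1 and Version 2 both need 40 prices per value.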

Indicator Calculation

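For each bar with enough history, the program computes the (possibly modified) trend indicator. Here k is the index of the most recent price in the window, and each value is also copied to trend_sorted for the sort that follows: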

for (i = 0; i < nind; i++) {
    k = full_lookback - 1 + i;
    if (version == 0)
        trend[i] = find_slope(lookback, close + k);
    else if (version == 1)
        trend[i] = find_slope(lookback, close + k) - find_slope(lookback, close + k - lookback);
    else
        trend[i] = find_slope(lookback, close + k) - find_slope(full_lookback, close + k);
    trend_sorted[i] = trend[i];
}
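
The body of find_slope is not reproduced in these notes. The indexing in the loop implies that its second argument points at the most recent price and that the function looks backward over lookback bars. The sketch below is one plausible stand-in under that assumption: an ordinary least-squares slope of log price against bar index. The name find_slope_sketch, the use of log prices, and the choice of least squares are all assumptions, and the real routine may differ.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical stand-in for find_slope: least-squares slope of log price
// versus bar index over the `lookback` bars ending at *x (x = newest bar).
double find_slope_sketch(int lookback, const double* x) {
    assert(lookback >= 2);
    const double* first = x - lookback + 1;      // oldest bar in the window
    const double t_mean = 0.5 * (lookback - 1);  // mean of 0, 1, ..., lookback-1
    double y_mean = 0.0;
    for (int i = 0; i < lookback; i++)
        y_mean += std::log(first[i]);
    y_mean /= lookback;
    double num = 0.0, den = 0.0;
    for (int i = 0; i < lookback; i++) {
        const double dt = i - t_mean;
        num += dt * (std::log(first[i]) - y_mean);
        den += dt * dt;
    }
    return num / den;                            // change in log price per bar
}
```

For a price series that grows by a constant 1% per bar in log terms, this returns approximately 0.01 for every window, so the Version 1 (differenced) indicator would be approximately zero.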

Sorting and Gap Analysis

The program sorts the trend values to find the specified quantile and performs gap analysis.

qsortd(0, nind-1, trend_sorted);
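k = (int)(fractile * (nind + 1)) - 1;
if (k < 0) k = 0;                 // Fractile near 0
if (k > nind - 1) k = nind - 1;   // Fractile of 1 would otherwise index past the end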
trend_quantile = trend_sorted[k];
gap_analyze(nind, trend, trend_quantile, ngaps, gap_size, gap_count);
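
The index arithmetic is easy to get wrong at the extremes, so here is a small sketch that isolates it. quantile_index is a hypothetical helper, not a function in STATN.CPP. It clamps at both ends, since a fractile of exactly 1 makes the formula yield nind, one past the last valid index.

```cpp
#include <cassert>

// Index into a sorted array of nind values for the requested fractile.
int quantile_index(double fractile, int nind) {
    assert(nind > 0);
    int k = static_cast<int>(fractile * (nind + 1)) - 1;
    if (k < 0)
        k = 0;              // fractile near 0
    if (k > nind - 1)
        k = nind - 1;       // guard for fractile == 1
    return k;
}
```

For 9 sorted values and a fractile of 0.5 this selects index 4, the median. For 1000 values it selects index 499.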

Gap Size Initialization

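Defines the gap-size bins for the analysis. The first ngaps - 1 bins have upper bounds that double from 1 to 512 bars; the last bin has no upper bound and collects every run longer than 512 bars.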

#define NGAPS 11 /* Number of gaps in analysis */
ngaps = NGAPS;
k = 1;
for (i = 0; i < ngaps - 1; i++) {
    gap_size[i] = k;
    k *= 2;
}

Gap Analysis Function

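This function measures how long the indicator stays on one side of the threshold. Each time the indicator crosses the threshold (or the series ends), the length of the run just completed is assigned to a gap-size bin and the count for that bin is incremented. Many counts in the large bins indicate slow wandering.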

void gap_analyze(int n, double *x, double thresh, int ngaps, int *gap_size, int *gap_count) {
    int i, j, above_below, new_above_below, count;
    for (i = 0; i < ngaps; i++)
        gap_count[i] = 0;
    count = 1;
    above_below = (x[0] >= thresh) ? 1 : 0;
    for (i = 1; i <= n; i++) {
        if (i == n) // Passing end of array counts as a change
            new_above_below = 1 - above_below;
        else
            new_above_below = (x[i] >= thresh) ? 1 : 0;
        if (new_above_below == above_below)
            ++count;
        else {
            for (j = 0; j < ngaps - 1; j++) {
                if (count <= gap_size[j])
                    break;
            }
            ++gap_count[j];
            count = 1;
            above_below = new_above_below;
        }
    }
}
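
A small worked example helps confirm how runs are binned. The sketch below restates the same logic with std::vector so that it can be run on its own; run_length_histogram is a hypothetical name and the function is an illustration, not a replacement for gap_analyze.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Histogram of run lengths on either side of thresh.
// gap_size holds the upper bounds of every bin except the last,
// which is open-ended, so the result has gap_size.size() + 1 entries.
std::vector<int> run_length_histogram(const std::vector<double>& x, double thresh,
                                      const std::vector<int>& gap_size) {
    assert(!x.empty());
    std::vector<int> gap_count(gap_size.size() + 1, 0);
    int count = 1;                       // length of the current run
    bool above = x[0] >= thresh;
    for (std::size_t i = 1; i <= x.size(); ++i) {
        // Running off the end of the series closes the final run.
        const bool new_above = (i == x.size()) ? !above : (x[i] >= thresh);
        if (new_above == above) {
            ++count;
            continue;
        }
        std::size_t j = 0;               // first bin whose bound covers the run
        while (j < gap_size.size() && count > gap_size[j])
            ++j;
        ++gap_count[j];
        count = 1;
        above = new_above;
    }
    return gap_count;
}
```

For the series 1, 1, 0, 0, 0, 1, 0, 0 with a threshold of 0.5 and bin bounds {1, 2, 4}, the runs have lengths 2, 3, 1, and 2, so the counts are {1, 2, 1, 0}. A series that never crosses the threshold produces a single run that falls in whichever bin covers its full length.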