I just read a great blog post by Howard Bandy, which you can find here.
… my criticism is of blindly mining a set of data in search of the best fit, expecting the result to be a robust and reliable trading system.
If the data being processed is not a financial time series, then that best fit probably is both descriptive of the in-sample data and predictive of the out-of-sample data. But financial time series data is different. Its characteristics change, influenced by economic cycles, government policies, political actions, global events, and corporate actions. Models that accurately fit one time period lose accuracy as time passes and the important patterns in the data shift.
Proper development of trading systems requires that the model be able to, or be allowed to, adapt to changes in the data. That can be accomplished either by including logic that recognizes changes and applies different rules or different parameter values to different conditions, or by periodically repeating the search for the best fit to the recent data, that is, reoptimizing.
In some circumstances, such as moving averages, adaptive algorithms are well known and efficiently implemented. Attempting to recognize and adapt to more general changes greatly increases the complexity of the logic and is usually not practical.
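As an aside that is not part of Howard's post, here is a minimal sketch of one well-known adaptive moving average, Kaufman's adaptive moving average. It is only meant to illustrate what "adaptive" means here: the smoothing constant speeds up when prices trend cleanly and slows down when they chop. The function name and the default periods (10, 2, 30) are my own choices for illustration.

```python
def kaufman_ama(prices, er_period=10, fast=2, slow=30):
    """Kaufman-style adaptive moving average (illustrative sketch).

    The first `er_period` outputs are just the raw prices, used as a seed.
    """
    if len(prices) <= er_period:
        raise ValueError("need more than er_period prices")

    fast_sc = 2.0 / (fast + 1)   # smoothing constant of a fast EMA
    slow_sc = 2.0 / (slow + 1)   # smoothing constant of a slow EMA

    out = list(prices[:er_period])
    for t in range(er_period, len(prices)):
        # Efficiency ratio: net move divided by the sum of bar-to-bar moves.
        change = abs(prices[t] - prices[t - er_period])
        volatility = sum(
            abs(prices[i] - prices[i - 1])
            for i in range(t - er_period + 1, t + 1)
        )
        er = change / volatility if volatility > 0 else 0.0

        # Blend between the slow and fast constants, then square the result.
        sc = (er * (fast_sc - slow_sc) + slow_sc) ** 2
        out.append(out[-1] + sc * (prices[t] - out[-1]))
    return out
```

On a perfectly straight trend the efficiency ratio is 1, so the average tracks price at the fast rate. On a flat or noisy series it falls toward the slow rate.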
The other approach, to periodically reoptimize, requires that the length of the in-sample period be consistent with the length of time the characteristics of the data remain stationary. There is no general rule for determining that length. It can only be determined by experimentation. Given the length of time the data is in a stable state, relative to the specific model, the system must use the early portion to synchronize the logic to the data and the later portion to trade it. That is, the length of the in-sample period plus the length of the out-of-sample period must be no greater than the length of the period of stability. Toward the end of the out-of-sample period, system performance degrades as the characteristics of the data change beyond the ability of the logic to adapt. The trader must either reoptimize / resynchronize on a time schedule such that the system performance is not allowed to degrade, or he must be able to monitor the health of the system and resynchronize it as necessary.
The trading system developer gains the necessary confidence in the resynchronization process by practicing. He tries different combinations of logic and parameters, optimizes over an in-sample period, selects the best alternative of those tested, and tests over an out-of-sample period. The walk forward technique is an automated process that does exactly that. It repeatedly searches an in-sample period, selects the best, and tests the following out-of-sample period. Each step is a practice event in anticipation of the time when the developer moves the system from development to trading. The success of the walk forward process depends on accurately fitting the model to the data over the in-sample period, and choosing the alternative that the developer prefers.
The final result of the walk forward process is a set of several out-of-sample periods, each of which resulted from using the best in-sample fit. That set of combined out-of-sample trades is the very best estimate of future performance available. It is the gold standard. We can use it as a baseline with which to determine the best position size and to compare actual performance.
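To make the walk forward idea concrete, here is a small sketch of the loop described above. It is my own toy illustration, not Howard's code. The trading rule (`momentum_pnl`), the candidate lookbacks, and the window lengths are all assumptions chosen to keep the example short. The part that matters is the structure: optimize on the in-sample window, record the result on the following out-of-sample window, then slide forward by the out-of-sample length.

```python
def momentum_pnl(prices, lookback, start, end):
    """Toy rule (hypothetical): hold long for one bar whenever the price is
    above its level `lookback` bars earlier. Only trades that open and close
    inside [start, end) are counted."""
    pnl = 0.0
    for t in range(max(start, lookback), end - 1):
        if prices[t] > prices[t - lookback]:
            pnl += prices[t + 1] - prices[t]
    return pnl


def walk_forward(prices, lookbacks, in_len, out_len):
    """Return one (split_index, best_lookback, out_of_sample_pnl) per step."""
    steps = []
    start = 0
    while start + in_len + out_len <= len(prices):
        split = start + in_len   # in-sample is [start, split)
        stop = split + out_len   # out-of-sample is [split, stop)

        # Search the in-sample period and keep the best alternative.
        best = max(lookbacks,
                   key=lambda lb: momentum_pnl(prices, lb, start, split))

        # Test that single choice on data the search never saw.
        steps.append((split, best, momentum_pnl(prices, best, split, stop)))

        # Slide forward so the out-of-sample periods do not overlap.
        start += out_len
    return steps


if __name__ == "__main__":
    series = [float(i) for i in range(60)]  # made-up, steadily rising data
    for split, best, oos in walk_forward(series, [2, 5], in_len=20, out_len=10):
        print(split, best, oos)
```

Concatenating the out-of-sample results from every step gives the combined out-of-sample record that the quote calls the baseline. In real use, `in_len + out_len` would have to be found by experimentation, as Howard notes.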
As a trading system developer, I can assure you that there is a lot of wisdom in Howard’s post.
Howard has also recently released a paper titled “Assessing Trading System Health,” which is available here.