
Backtesting & Data Mining

Introduction

In this article we'll look at two related practices that are widely used by traders, called Backtesting and Data Mining. These are techniques that are powerful and valuable if we use them correctly, but traders often misuse them. Therefore, we'll also explore two common pitfalls of these techniques, known as the multiple hypothesis problem and overfitting, and how to overcome these pitfalls.

Backtesting

Backtesting is just the process of using historical data to test the performance of some trading strategy. Backtesting generally starts with a strategy that we would like to test, for instance buying GBP/USD when it crosses above the 20-day moving average and selling when it crosses below that average. Now we could test that strategy by watching what the market does going forward, but that would take a long time. This is why we use historical data that's already available.
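
As a rough illustration, the 20-day moving average strategy could be backtested along the following lines. This is only a sketch under assumptions that are not in the text above: prices are a plain list of daily closes, "selling" simply means exiting the position (no short selling), and trading costs are ignored. The function names are made up for this example.

```python
from typing import List, Optional


def moving_average(prices: List[float], window: int) -> List[Optional[float]]:
    """Simple moving average; None until `window` prices are available."""
    out: List[Optional[float]] = []
    running = 0.0
    for i, price in enumerate(prices):
        running += price
        if i >= window:
            running -= prices[i - window]  # drop the price that left the window
        out.append(running / window if i >= window - 1 else None)
    return out


def backtest_crossover(prices: List[float], window: int = 20) -> float:
    """Total return of holding the asset only while price is above its average.

    The position held over day t -> t+1 is decided from data up to day t,
    so the backtest never uses information from the future.
    """
    ma = moving_average(prices, window)
    equity = 1.0
    for t in range(len(prices) - 1):
        if ma[t] is not None and prices[t] > ma[t]:
            equity *= prices[t + 1] / prices[t]
    return equity - 1.0
```

Calling backtest_crossover(closes, window=20) on a list of closes returns the total return over that list, where 0.10 would mean a 10% gain.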

"But wait, wait!" I hear you say. "Couldn't you cheat, or at least be biased, because you already know what happened in the past?" That's definitely a concern, so a valid backtest will be one in which we aren't familiar with the historical data. We can accomplish this by choosing random time periods or by choosing many different time periods in which to conduct the test.

Now I can hear another group of you saying, "But all that historical data just sitting there waiting to be analyzed is tempting, isn't it? Would it be so wrong for us to examine that historical data first, to analyze it and see if we can find patterns hidden within it?" This argument is also valid, but it leads us into an area fraught with danger … the world of Data Mining.

Data Mining

Data Mining involves searching through data in order to locate patterns and find possible correlations between variables. In the example above involving the 20-day moving average strategy, we just came up with that particular indicator out of the blue, but suppose we had no idea what kind of strategy we wanted to test? That's when data mining comes in handy. We could search through our historical data on GBP/USD to see how the price behaved after it crossed many different moving averages. We could check price movements against many other types of indicators as well and see which ones correspond to large price movements.

The subject of data mining can be controversial because, as I discussed above, it seems a bit like cheating or "looking ahead" in the data. Is data mining a valid scientific technique? On the one hand the scientific method says that we're supposed to make a hypothesis first and then test it against our data, but on the other hand it seems appropriate to do some "exploration" of the data first in order to suggest a hypothesis. So which is right? We can look at the steps in the Scientific Method for a clue to the source of the confusion. The process generally looks like this:

Observation (data) >>> Hypothesis >>> Prediction >>> Experiment (data)

Notice that we deal with data during both the Observation and Experiment stages. So both views are right. We must use data in order to create a sensible hypothesis, but we also test that hypothesis using data. The trick is simply to make sure that the two sets of data are not the same! We must never test our hypothesis using the same set of data that we used to suggest our hypothesis. In other words, if you use data mining in order to come up with strategy ideas, make sure you use a different set of data to backtest those ideas.

Now we'll turn our attention to the main pitfalls of using data mining and backtesting incorrectly. The general problem is known as "over-optimization" and I prefer to break that problem down into two distinct types. These are the multiple hypothesis problem and overfitting. In a sense they are opposite ways of making the same error. The multiple hypothesis problem involves trying many simple hypotheses, while overfitting involves the creation of one very complex hypothesis.

The Multiple Hypothesis Problem

To see how this problem arises, let's go back to our example where we backtested the 20-day moving average strategy. Let's suppose that we backtest the strategy against ten years of historical market data and, lo and behold, guess what? The results are not very encouraging. However, being rough and tumble traders as we are, we decide not to give up so easily. What about a ten-day moving average? That might work out a little better, so let's backtest it! We run another backtest and we find that the results still aren't stellar, but they're a bit better than the 20-day results. We decide to explore a little and run similar tests with 5-day and 30-day moving averages. Finally it occurs to us that we could actually just test every single moving average up to some point and see how they all perform. So we test the 2-day, 3-day, 4-day, and so on, all the way up to the 50-day moving average.

Now certainly some of these averages will perform poorly and others will perform fairly well, but there will have to be one of them which is the absolute best. For instance we may find that the 32-day moving average turned out to be the best performer during this particular ten-year period. Does this mean that there is something special about the 32-day average and that we should be confident that it will perform well in the future? Unfortunately many traders assume this to be the case, and they just stop their analysis at this point, thinking that they've discovered something profound. They have fallen into the "Multiple Hypothesis Problem" pitfall.

The problem is that there is nothing at all unusual or significant about the fact that some average turned out to be the best. After all, we tested almost fifty of them against the same data, so we'd expect to find a few good performers, just by chance. It doesn't mean there's anything special about the particular moving average that "won" in this case. The problem arises because we tested multiple hypotheses until we found one that worked, instead of choosing a single hypothesis and testing it.
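
To put a rough number on "just by chance", suppose, purely as an illustrative assumption, that each of the 49 moving averages (2-day through 50-day) had an independent 5% chance of looking impressive on data where none of them has any real edge. The chance that at least one of them looks impressive would then be:

```latex
P(\text{at least one false winner}) = 1 - (1 - 0.05)^{49} \approx 0.92
```

Both the 5% figure and the independence are assumptions made only for this example. Moving average strategies tested on the same data are correlated with one another, so the true figure would differ, but the point stands: with this many tries, finding a "winner" is the expected outcome, not a discovery.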

Here's a good classic analogy. We could come up with a single hypothesis such as "Scott is great at flipping heads on a coin." From that, we could create a prediction that says, "If the hypothesis is true, Scott will be able to flip 10 heads in a row." Then we can perform a simple experiment to test that hypothesis. If I can flip 10 heads in a row it actually doesn't prove the hypothesis. However, if I can't accomplish this feat it definitely disproves the hypothesis. As we do repeated experiments which fail to disprove the hypothesis, our confidence in its truth grows.

That's the right way to do it. However, what if we had come up with 1,000 hypotheses instead of just the one about me being a good coin flipper? We could make the same hypothesis about 1,000 different people … me, Ed, Cindy, Bill, Sam, etc. OK, now let's test our multiple hypotheses. We ask all 1,000 people to flip a coin. There will probably be about 500 who flip heads. Everyone else can go home. Now we ask those 500 people to flip again, and this time about 250 will flip heads. On the third flip about 125 people flip heads, on the fourth about 63 people are left, and on the fifth flip there are about 32. These 32 people are all pretty amazing, aren't they? They've all flipped five heads in a row! If we flip five more times and eliminate half the people each time on average, we'll end up with 16, then 8, then 4, then 2 and finally one person left who has flipped ten heads in a row. It's Bill! Bill is a "fantabulous" flipper of coins! Or is he?
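
The arithmetic of the contest is easy to check. The sketch below assumes fair coins and no skill at all; the function names are invented for this example. The exact expectations are 500, 250, 125, 62.5, 31.25 and so on, ending at 1000/1024, or just under one person, after ten flips.

```python
import random


def expected_survivors(people: int, flips: int) -> float:
    """Expected number of people who flip heads on every one of `flips` flips."""
    return people * 0.5 ** flips


def prob_someone_succeeds(people: int, flips: int) -> float:
    """Chance that at least one of `people` flips `flips` heads in a row."""
    p_single = 0.5 ** flips
    return 1.0 - (1.0 - p_single) ** people


def simulate_contest(people: int, flips: int, rng: random.Random) -> int:
    """Run one elimination contest and return how many people are left."""
    remaining = people
    for _ in range(flips):
        # Each remaining person flips once; only heads stay in the contest.
        remaining = sum(1 for _ in range(remaining) if rng.random() < 0.5)
    return remaining


for k in range(1, 11):
    print(k, expected_survivors(1000, k))
print(prob_someone_succeeds(1000, 10))
print(simulate_contest(1000, 10, random.Random()))
```

With 1,000 unskilled flippers there is roughly a 62% chance that at least one of them flips ten heads in a row, so a "Bill" turning up tells us very little about Bill.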

Well, we really don't know, and that's the point. Bill may have won our contest out of pure chance, or he may very well be the best flipper of heads this side of the Andromeda galaxy. By the same token, we don't know if the 32-day moving average from our example above just performed well in our test by pure chance, or if there is really something special about it. All we've done so far is to find a hypothesis, namely that the 32-day moving average strategy is profitable (or that Bill is a great coin flipper). We haven't actually tested that hypothesis yet.

So now that we understand that we haven't really discovered anything significant yet about the 32-day moving average or about Bill's ability to flip coins, the natural question to ask is: what should we do next? As I mentioned above, many traders never realize that there is a next step required at all. Well, in the case of Bill you'd probably ask, "Aha, but can he flip ten heads in a row again?" In the case of the 32-day moving average, we'd want to test it again, but certainly not against the same data sample that we used to choose that hypothesis. We would choose another ten-year period and see if the strategy worked just as well. We could continue to do this experiment as many times as we wanted until our supply of new ten-year periods ran out. We refer to this as "out of sample testing", and it's the way to avoid this pitfall. There are various methods of such testing, one of which is "cross validation", but we won't get into that much detail here.
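
Here is a minimal sketch of that workflow. It runs on made-up random-walk prices rather than real market data, so by construction no moving average has a genuine edge. The helper names, the 50/50 split, and the long-only, cost-free strategy are all assumptions made for this example.

```python
import random
from typing import List, Sequence, Tuple


def ma_strategy_return(prices: Sequence[float], window: int) -> float:
    """Total return of holding only while the close is above its moving average."""
    equity = 1.0
    for t in range(window - 1, len(prices) - 1):
        average = sum(prices[t - window + 1 : t + 1]) / window
        if prices[t] > average:
            equity *= prices[t + 1] / prices[t]
    return equity - 1.0


def random_walk(n: int, rng: random.Random,
                start: float = 100.0, vol: float = 0.01) -> List[float]:
    """Synthetic closes with no built-in edge (hypothetical data)."""
    prices = [start]
    for _ in range(n - 1):
        prices.append(prices[-1] * (1.0 + rng.gauss(0.0, vol)))
    return prices


def pick_best_window(prices: Sequence[float], windows: Sequence[int]) -> int:
    """Data mining step: keep whichever window did best on these prices."""
    return max(windows, key=lambda w: ma_strategy_return(prices, w))


def in_and_out_of_sample(prices: Sequence[float],
                         windows: Sequence[int]) -> Tuple[int, float, float]:
    """Choose the window on the first half, then test it on the unseen second half."""
    half = len(prices) // 2
    in_sample, out_of_sample = prices[:half], prices[half:]
    best = pick_best_window(in_sample, windows)
    return (best,
            ma_strategy_return(in_sample, best),
            ma_strategy_return(out_of_sample, best))


prices = random_walk(2000, random.Random(0))
best, in_ret, out_ret = in_and_out_of_sample(prices, range(2, 51))
print(f"best window in sample: {best}")
print(f"in-sample return:      {in_ret:+.1%}")
print(f"out-of-sample return:  {out_ret:+.1%}")
```

The in-sample figure is the best of 49 tries, so it is flattering by construction. Only the out-of-sample figure says anything about whether the chosen average is special.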

Overfitting

Overfitting is really a kind of reversal of the above problem. In the multiple hypothesis example above, we looked at many simple hypotheses and picked the one that performed best in the past. In overfitting we first look at the past and then construct a single complex hypothesis that fits well with what happened. For example, if I look at the USD/JPY rate over the past 10 days, I might see that the daily closes did this:

up, up, down, up, up, down, down, down, up.

Got it? See the pattern? Yeah, neither do I actually. But if I wanted to use this data to suggest a hypothesis, I might come up with …

My amazing hypothesis:

If the closing price goes up twice in a row then down for one day, or if it goes down for three days in a row we should buy,

but if the closing price goes up three days in a row we should sell,

but if it goes up three days in a row and then down three days in a row we should buy.

Huh? Sounds like a whacky hypothesis, right? But if we had used this strategy over the past 10 days, we would have been right on every single trade we made! The "overfitter" uses backtesting and data mining differently than the "multiple hypothesis makers" do. The "overfitter" doesn't come up with 400 different strategies to backtest. No way! The "overfitter" uses data mining tools to figure out just one strategy, no matter how complex, that would have had the best performance over the backtesting period. Will it work in the future?
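
One way to see how easily a complex rule can "explain" the past is to build one mechanically. The sketch below is not the hypothesis above; it is a hypothetical lookup-table rule that simply memorizes which move followed each run of recent moves in the nine-move sequence from the example.

```python
from typing import Dict, List, Tuple

# The nine daily moves from the USD/JPY example.
MOVES = ["up", "up", "down", "up", "up", "down", "down", "down", "up"]

Rule = Dict[Tuple[str, ...], str]


def fit_lookup_rule(moves: List[str], history: int) -> Rule:
    """Memorize which move followed each run of `history` moves."""
    rule: Rule = {}
    for i in range(history, len(moves)):
        # If the same run appears twice, the later sighting overwrites the earlier.
        rule[tuple(moves[i - history : i])] = moves[i]
    return rule


def hit_rate(rule: Rule, moves: List[str], history: int) -> float:
    """Fraction of moves the rule calls correctly; an unseen run counts as a miss."""
    calls = range(history, len(moves))
    hits = sum(1 for i in calls
               if rule.get(tuple(moves[i - history : i])) == moves[i])
    return hits / len(calls)


simple_rule = fit_lookup_rule(MOVES, history=1)
complex_rule = fit_lookup_rule(MOVES, history=4)
print(hit_rate(simple_rule, MOVES, 1))
print(hit_rate(complex_rule, MOVES, 4))
```

On the data it was built from, the four-move rule calls every move correctly while the one-move rule gets only half of them right. The "perfect" rule has merely memorized five specific patterns, which says nothing about how it would do on days it has not seen.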

Not likely, but we could always keep tweaking the model and testing the strategy in different samples (out of sample testing again) to see if our performance improves. When we stop getting performance improvements and the only thing that's rising is the complexity of our model, then we know we've crossed the line into overfitting.

Conclusion

So in summary, we've seen that data mining is a way to use our historical price data to suggest a workable trading strategy, but that we have to be aware of the pitfalls of the multiple hypothesis problem and overfitting. The way to make sure that we don't fall prey to these pitfalls is to backtest our strategy using a different dataset than the one we used during our data mining exploration. We commonly refer to this as "out of sample testing".

Scott Percival

October 2006
