Certainly one of the main concerns of traders who develop automated strategies is achieving accurate simulations that give a realistic picture of a strategy's past performance. The whole premise of automated trading rests on building systems that have worked extensively in the past, under widely varied market conditions, and which therefore have the best possible chance of surviving in the future. It is consequently extremely important to know how reliable our data is and whether or not we can trust our simulations to represent past market conditions accurately. In today's post I want to discuss what "reliable data" actually means, what we should look for, and the current state of the MetaQuotes data sets when it comes to this concept of "reliability".
In general trading there is only one definition of reliable data: data extracted from a central exchange which records every variation (down to each transaction) that happened across a given period. In stocks and futures, obtaining reliable data is a very simple process; there are a number of central exchanges and you simply buy the data you want from whichever exchange you are interested in. If you want US stock data for the past 10 years, you can easily purchase extremely detailed New York Stock Exchange data from any of several data providers.
When it comes to forex trading we face a totally different beast because there is no central exchange. Since there is no "original data" as in the case of stocks and futures, it is impossible to tell, just by looking at the feeds themselves, which one is best. In practice, the best data feed for you would be the historical feed that most closely matches the liquidity providers of your current broker, but the information about where a data set comes from, which providers it aggregates and which liquidity providers your broker uses is usually kept private and very hard to obtain.
Because of these facts we have to adopt a new definition of reliable data, one centered on the technical soundness of the data set itself. What we want, at a minimum, is a data set without mismatches between time periods or odd occurrences such as 1-pip daily bars. The 4-digit MetaQuotes data set is completely unreliable since it contains many of these errors (which I showed in an Asirikuy video and highlighted in an earlier blog post), which is why we must ONLY use the 5-digit MetaQuotes history to run simulations on the MetaTrader 4 platform; in other words, only 5-digit brokers can be used for reliable backtesting.
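For those who like to verify this sort of thing themselves, below is a minimal Python sketch of the kind of check I am referring to. It assumes a daily-bar CSV exported from the MetaTrader History Center (date, time, open, high, low, close, volume columns) for a 5-digit EUR/USD feed; the file name, pip size and gap threshold are merely illustrative assumptions, not part of any official tool.

```python
# A minimal sketch, assuming a History Center CSV export with columns
# date,time,open,high,low,close,volume and 5-digit EUR/USD quotes.
import csv
from datetime import datetime, timedelta

PIP = 0.0001                  # assumed pip size for a 5-digit EUR/USD feed
MAX_GAP = timedelta(days=4)   # allow normal weekends/holidays between daily bars

def scan_daily_history(path):
    """Flag suspiciously flat bars and large holes in a daily-bar CSV export."""
    previous_date = None
    with open(path, newline="") as f:
        for row in csv.reader(f):
            bar_date = datetime.strptime(row[0], "%Y.%m.%d")
            high, low = float(row[3]), float(row[4])
            # "1 pip daily bar": a day whose whole range is at most one pip
            if high - low <= PIP:
                print(f"{row[0]}: suspicious bar, range = {high - low:.5f}")
            # mismatch between time periods: a hole larger than a normal weekend
            if previous_date and bar_date - previous_date > MAX_GAP:
                print(f"{row[0]}: gap of {(bar_date - previous_date).days} days")
            previous_date = bar_date

scan_daily_history("EURUSD1440.csv")   # hypothetical file name
```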
However, beyond these very simple technical checks of the general soundness of the data used, there is no criterion that tells us whether one data set is more or less valid than another. Of course, MetaQuotes data will differ from the 1-minute data of other providers (such as Oanda or Dukascopy), but such differences are expected simply because of the previously mentioned lack of a central exchange.
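If you want to quantify just how different two feeds really are, a simple comparison like the sketch below can help. It assumes two 1-minute CSV exports in the same column layout (for example the MetaQuotes history and a Dukascopy export); the file names are hypothetical and the result is simply the average absolute close difference, in pips, over the timestamps both feeds share.

```python
# A minimal sketch, assuming two 1-minute CSV exports in the same
# date,time,open,high,low,close,volume layout from different providers.
import csv

PIP = 0.0001  # assumed pip size for a 5-digit pair

def load_closes(path):
    """Map 'date time' -> close price from a 1-minute CSV export."""
    closes = {}
    with open(path, newline="") as f:
        for row in csv.reader(f):
            closes[f"{row[0]} {row[1]}"] = float(row[5])
    return closes

def compare_feeds(path_a, path_b):
    """Average absolute close difference, in pips, over shared timestamps."""
    a, b = load_closes(path_a), load_closes(path_b)
    common = a.keys() & b.keys()
    if not common:
        return None
    return sum(abs(a[t] - b[t]) for t in common) / (len(common) * PIP)

# hypothetical file names
print(compare_feeds("EURUSD1_metaquotes.csv", "EURUSD1_dukascopy.csv"))
```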
The best thing we can do to develop systems that are not overly dependent on the "fine", and possibly highly variable, details of historical data is to use strategies that only trade on the one-hour chart or above and that do not depend strongly on spread widening or execution variables for accurate simulation. A system that trades on lower time frames (below one hour) with small TP or SL values will not only give inaccurate simulations due to one-minute interpolation problems, but any LIVE results obtained on one broker will probably be very hard to reproduce on another because of the differences between their data feeds.
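A rough way to see why small targets are dangerous is to measure how often a single 1-minute bar's range already exceeds the TP/SL, since in those bars the interpolated intrabar path, not the real market, decides whether the target or the stop was hit first. The sketch below assumes the same CSV layout as before and a hypothetical 10-pip target; both values are illustrative assumptions.

```python
# A minimal sketch, assuming a 1-minute CSV export in the usual
# date,time,open,high,low,close,volume layout and a hypothetical 10-pip target.
import csv

PIP = 0.0001          # assumed pip size for a 5-digit pair
TARGET_PIPS = 10      # hypothetical TP/SL size to test

def interpolation_risk(path):
    """Fraction of 1-minute bars whose range alone exceeds the TP/SL.

    When this fraction is high, the interpolated intrabar path decides
    whether the target or the stop is hit first, so simulations of such
    a small-target system cannot be trusted.
    """
    bars = exceeding = 0
    with open(path, newline="") as f:
        for row in csv.reader(f):
            bars += 1
            if float(row[3]) - float(row[4]) > TARGET_PIPS * PIP:
                exceeding += 1
    return exceeding / bars if bars else 0.0

print(interpolation_risk("EURUSD1.csv"))   # hypothetical file name
```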
In the end, the accuracy of simulations in the forex market is still a topic of debate, but preferring one data set over another on the grounds of "reliability" makes little sense once the basic technical soundness of each set has been verified. If you have two equally sound historical data sets, both are bound to be equally valid and merely different because of the natural feed differences between forex brokers. Our weapons against this dependency include the use of longer time frames and wide profit and stop-loss targets, but even such an approach is bound to retain a certain degree of broker dependency, something we are currently examining within Asirikuy.
What is your opinion? Do you think that one broker's data set may be more reliable than another's? Do you believe there are ways to run accurate simulations that eliminate the broker dependency problem? How do you deal with data set quality issues? Please leave a comment! :o)
If you would like to learn more about accurate simulations and how to build systems that have a high chance of being live/back-testing consistent, please consider buying my ebook on automated trading or joining Asirikuy to receive all ebook purchase benefits, weekly updates, access to the live accounts I am running with several expert advisors, and a start on the road towards long-term success in the forex market using automated trading systems. I hope you enjoyed the article!