- Posted by admin
- On September 27, 2021
- AIFMD, Big Data, Data management, Risk, Risk management, Risk managers, Risk Reporting, UCITS
No, not the middling-to-good (with a few very good tracks) album from The Police in ’83. What I would like to discuss is the impact of timing on the analysis of time-series data. And by timing I am not talking about when the data is actually put to use, but rather when it is collected in the first place.
The collection of data is at the heart of time-series analysis but, given the number of highly sophisticated models at large in the analysis space, what is remarkable is the lack of comparable rigour at the other end of the information chain. The most prevalent data problem is missing data. Holidays, for example, do not occur in all countries on the same day. The most common responses to this issue are (a) to remove that day from the time series or (b), for the more “sophisticated”, to linearly interpolate the missing data. Such approaches are orders of magnitude less sophisticated than the models to which they are applied. To date, little attention has been paid to this area in the financial space, which, given the amount of questionable data and the sums involved, is surprising.
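The two common responses can be sketched in a few lines of pandas. This is a minimal illustration with made-up prices and a made-up holiday, not data from any real market:

```python
import pandas as pd

# A week of made-up daily closes, with a market holiday on the Wednesday.
closes = pd.Series(
    [100.0, 101.0, None, 103.0, 102.5],
    index=pd.date_range("2021-09-20", periods=5, freq="B"),
)

dropped = closes.dropna()                           # (a) remove the day entirely
interpolated = closes.interpolate(method="linear")  # (b) linearly interpolate

# The interpolated holiday close is just the midpoint of its neighbours.
print(interpolated.loc["2021-09-22"])  # 102.0
```

Both options are trivial to apply, which is precisely the point: the effort spent here is nowhere near that spent on the models downstream.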
From a risk analysis point of view, however, missing data is a less significant issue. So long as the more “interesting” days are included in the dataset, and generally they are, the choice of method used to recover missing data does not materially affect a given time-series-based risk metric. Significance here is measured against the normal error bars around the estimate of the risk metric in the first place. What I would like to consider instead is the impact of timing, or the lack thereof, in the analysis of time-series data. As a simple example, consider daily “closing” data. From a European (Dublin) perspective, Asian markets close between 7am and 9am, the Middle East around lunchtime, Frankfurt, Paris and Milan at 4pm, the UK at 5pm, and the US between 8pm and 10pm, depending on which exchange you are disposed to use. All of this constitutes “closing data” for a given calendar day. Given the global interconnectedness of markets, this can lead to some rather unintuitive outcomes.
Using a simple metric (not a risk metric) such as correlation, we see that the long-term correlation between European and US equity indices, using the closing data described above, is above 70%. This is in line with expectations. The correlation between US indices and a Japanese index, however, is below 20%. The reason is simple. When there is a crash in the US markets, we know that Japanese markets are tanking too. This is clear from the Nikkei 225 futures, which trade 24 hours a day on GLOBEX. But that crash happened 12 hours after the Japanese market closed, and so is not seen in that day’s data. To see the more real (less unreal) relationship you need to advance the Japanese data by one day. If this is done, the correlation between the two markets rises to 50%. I say “less unreal” because, whilst there is some insight in the fact that the one-day-advanced data has a higher correlation with US markets, implying, normally correctly, that Asian markets follow the US rather than vice versa, there is still a significant effective time lag between the two markets. The effect holds true for all Asian markets except China. Irrespective of lags or advances in the data, the correlation between Chinese and US markets is always lower than 20%, effectively in the noise. This demonstrates what most people probably suspected already: from a financial market perspective, China is sui generis.
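The calendar-day versus advanced-day effect is easy to reproduce on simulated data. In this sketch the lead-lag coefficient and the volatilities are invented for illustration, not estimates from real indices; the Japanese close is modelled as responding to the *previous* US close, since Tokyo has already closed before New York opens:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 2500  # roughly ten years of daily returns

# Hypothetical returns: the US moves, Japan reacts the next calendar day.
us = pd.Series(rng.normal(0.0, 0.010, n))
jp = 0.6 * us.shift(1) + rng.normal(0.0, 0.008, n)

same_day = us.corr(jp)           # naive calendar-day alignment
advanced = us.corr(jp.shift(-1)) # Japanese data advanced by one day

print(f"same-day corr: {same_day:.2f}, advanced corr: {advanced:.2f}")
```

The same-day correlation sits in the noise while the advanced correlation is substantial, even though the two series are driven by exactly the same shocks.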
This can have a non-trivial impact on less useless risk metrics such as VaR. A portfolio with 50% US and 50% Japanese exposure has a median VaR that is 20% higher using the one-day-advanced data than using unaltered closing data. This is because there is a diversification element in the closing data that does not, in reality, exist. Due to these synchronicity effects, global portfolios are actually running significantly more risk than is being estimated by a VaR based on closing data. The obvious fix is to use synchronous data, i.e. take the price of all assets at the same time. At a hedge fund where I used to work we used 3pm, since both US and European markets were open at that time, and we interpolated Asian prices from futures. In reality, however, this is not a simple task to carry out: many OTC products are not priced properly intra-day and, most likely of more relevance, intra-day price data costs significantly more than closing data.
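The VaR effect can be sketched on the same kind of simulated data. Everything here is an illustrative assumption (the 99% historical VaR, the 50/50 weights, the invented lead-lag parameters), not the portfolio from the text:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)
n = 2500

# Invented lead-lag structure: Japan follows the previous US close.
us = pd.Series(rng.normal(0.0, 0.010, n))
jp = 0.6 * us.shift(1) + rng.normal(0.0, 0.008, n)

def hist_var(returns, level=0.99):
    """One-day historical VaR: the loss exceeded (1 - level) of the time."""
    r = returns.dropna()
    return -np.percentile(r, 100 * (1 - level))

naive  = 0.5 * us + 0.5 * jp            # calendar-day closing data
synced = 0.5 * us + 0.5 * jp.shift(-1)  # Japanese data advanced one day

print(f"naive VaR: {hist_var(naive):.4f}, advanced VaR: {hist_var(synced):.4f}")
```

The naive alignment makes the two legs look independent, so their shocks appear to cancel; advancing the Japanese data restores the common driver, and the VaR rises accordingly.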
At the end of the day, one of the roles of the risk manager is to understand the impact of synchronicity, and the lack thereof, on the portfolio for which they have responsibility. The 20% mentioned above does seem, at first, a significant difference, but given the number of bells, whistles and assumptions that can go into a VaR estimate, it is not always obvious that a 20% difference in VaR estimates is in fact significant. I always like to think that risk analysis is more of a science than an art. As above, sometimes artistry is required.