Pairs Trading in Practice

Part 1 – Methodologies

It is perhaps a little premature for a deep dive into the Gemini Pairs Trading strategy which trades on our Systematic Algotrading platform.  At this stage all one can say for sure is that the strategy has made a pretty decent start – up around 17% from October 2018.  The strategy does trade multiple times intraday, so the record in terms of completed trades – numbering over 580 – is appreciable (the web site gives a complete list of live trades).  And despite the turmoil through the end of last year the Sharpe Ratio has ranged consistently around 2.5.

One of the theoretical advantages of pairs trading is, of course, that the coupling of long and short positions in a relative value trade is supposed to provide a hedge against market downdrafts, such as we saw in Q4 2018.  In that sense pairs trading is the quintessential hedge fund strategy, embodying the central concept on which the entire edifice of hedge fund strategies is premised.
In practice, however, things often don’t work out as they should. In this thread I want to spend a little time reviewing why that is and to offer some thoughts based on my own experience of working with statistical arbitrage strategies over many years.

Methodology

There is no “secret recipe” for pairs trading:  the standard methodologies are as well known as the strategy concept.  But there are some important practical considerations that I would like to delve into in this post.  Before doing that, let me quickly review the tried and tested approaches used by statistical arbitrageurs.

The Ratio Model is one of the standard pair trading models described in literature. It is based in ratio of instrument prices, moving average and standard deviation. In other words, it is based on Bollinger Bands indicator.

  • we trade pair of stocks A, B, having price series A(t)B(t)
  • we need to calculate ratio time series R(t) = A(t) / B(t)
  • we apply a moving average of type T with period Pm on R(t) to get time series M(t)
  • Next we apply the standard deviation with period Ps on R(t) to get time series S(t)
  • now we can create Z-score series Z(t) as Z(t) = (R(t) – M(t)) / S(t), this time series can give us z-score to signal trading decision directly (in reality we have two Z-scores: Z-scoreask and Z-scorebid as they are calculated using different prices, but for the sake of simplicity let’s now pretend we don’t pay bid-ask spread and we have just one Z-score)

Another common way to visualize  this approach is to think in terms of bands around the moving average M(t):

  • upper entry band Un(t) = M(t) + S(t) * En
  • lower entry band Ln(t) = M(t) – S(t) * En
  • upper exit band Ux(t) = M(t) + S(t) * Ex
  • lower exit band Lx(t) = M(t) – S(t) * Ex

These bands are actually the same bands as in Bollinger Bands indicator and we can use crossing of R(t) and bands as trade signals.

  • We open short pair position, if the Z-score Z(t) >= En (equivalent to R(t) >= Un(t))
  • We open long pair position if the Z-score Z(t) <= -En (equivalent to R(t) <= Ln(t))

In the Regression, Residual or Cointegration approach we construct a linear regression between A(t)B(t) using OLS, where A(t) = β * B(t) + α + R(t)

Because we use a moving window of period P (we calculate new regression each day), we actually get new series β(t)α(t)R(t), where β(t)α(t) are series of regression coefficients and R(t) are residuals (prediction errors)

  • We look at the residuals series  R(t) = A(t) – (β(t) * B(t) + α(t))
  • We next calculate the standard deviation of the residuals R(t), which we designate S(t)
  • Now we can create Z-score series Z(t) as Z(t) = R(t) / S(t) – the time series that is used to generate trade signals, just as in the Ratio model.

The Kalman Filter model provides superior estimates of the current hedge ratio compared to the Regression method.  For a detailed explanation of the techniques, see the following posts (the post on ETF trading contains complete Matlab code).

 

 

Finally,  the rather complex Copula methodology models the joint and margin distributions of the returns process in each stock as described in the following post

Developing Long/Short ETF Strategies

Recently I have been working on the problem of how to construct large portfolios of cointegrated securities.  My focus has been on ETFs rather that stocks, although in principle the methodology applies equally well to either, of course.

My preference for ETFs is due primarily to the fact that  it is easier to achieve a wide diversification in the portfolio with a more limited number of securities: trading just a handful of ETFs one can easily gain exposure, not only to the US equity market, but also international equity markets, currencies, real estate, metals and commodities. Survivorship bias, shorting restrictions  and security-specific risk are also less of an issue with ETFs than with stocks (although these problems are not too difficult to handle).

On the downside, with few exceptions ETFs tend to have much shorter histories than equities or commodities.  One also has to pay close attention to the issue of liquidity. That said, I managed to assemble a universe of 85 ETF products with histories from 2006 that have sufficient liquidity collectively to easily absorb an investment of several hundreds of  millions of dollars, at minimum.

The Cardinality Problem

The basic methodology for constructing a long/short portfolio using cointegration is covered in an earlier post.   But problems arise when trying to extend the universe of underlying securities.  There are two challenges that need to be overcome.

Magic Cube.112

The first issue is that, other than the simple regression approach, more advanced techniques such as the Johansen test are unable to handle data sets comprising more than about a dozen securities. The second issue is that the number of possible combinations of cointegrated securities quickly becomes unmanageable as the size of the universe grows.  In this case, even taking a subset of just six securities from the ETF universe gives rise to a total of over 437 million possible combinations (85! / (79! * 6!).  An exhaustive test of all the possible combinations of a larger portfolio of, say, 20 ETFs, would entail examining around 1.4E+19 possibilities.

Given the scale of the computational problem, how to proceed? One approach to addressing the cardinality issue is sparse canonical correlation analysis, as described in Identifying Small Mean Reverting Portfolios,  d’Aspremont (2008). The essence of the idea is something like this. Suppose you find that, in a smaller, computable universe consisting of just two securities, a portfolio comprising, say, SPY and QQQ was  found to be cointegrated.  Then, when extending consideration to portfolios of three securities, instead of examining every possible combination, you might instead restrict your search to only those portfolios which contain SPY and QQQ. Having fixed the first two selections, you are left with only 83 possible combinations of three securities to consider.  This process is repeated as you move from portfolios comprising 3 securities to 4, 5, 6, … etc.

Other approaches to the cardinality problem are  possible.  In their 2014 paper Sparse, mean reverting portfolio selection using simulated annealing,  the Hungarian researchers Norbert Fogarasi and Janos Levendovszky consider a new optimization approach based on simulated annealing.  I have developed my own, hybrid approach to portfolio construction that makes use of similar analytical methodologies. Does it work?

A Cointegrated Long/Short ETF Basket

Below are summarized the out-of-sample results for a portfolio comprising 21 cointegrated ETFs over the period from 2010 to 2015.  The basket has broad exposure (long and short) to US and international equities, real estate, currencies and interest rates, as well as exposure in banking, oil and gas and other  specific sectors.

The portfolio was constructed using daily data from 2006 – 2009, and cointegration vectors were re-computed annually using data up to the end of the prior year.  I followed my usual practice of using daily data comprising “closing” prices around 12pm, i.e. in the middle of the trading session, in preference to prices at the 4pm market close.  Although liquidity at that time is often lower than at the close, volatility also tends to be muted and one has a period of perhaps as much at two hours to try to achieve the arrival price. I find this to be a more reliable assumption that the usual alternative.

Fig 2   Fig 1 The risk-adjusted performance of the strategy is consistently outstanding throughout the out-of-sample period from 2010.  After a slowdown in 2014, strategy performance in the first quarter of 2015 has again accelerated to the level achieved in earlier years (i.e. with a Sharpe ratio above 4).

Another useful test procedure is to compare the strategy performance with that of a portfolio constructed using standard mean-variance optimization (using the same ETF universe, of course).  The test indicates that a portfolio constructed using the traditional Markowitz approach produces a similar annual return, but with 2.5x the annual volatility (i.e. a Sharpe ratio of only 1.6).  What is impressive about this result is that the comparison one is making is between the out-of-sample performance of the strategy vs. the in-sample performance of a portfolio constructed using all of the available data.

Having demonstrated the validity of the methodology,  at least to my own satisfaction, the next step is to deploy the strategy and test it in a live environment.  This is now under way, using execution algos that are designed to minimize the implementation shortfall (i.e to minimize any difference between the theoretical and live performance of the strategy).  So far the implementation appears to be working very well.

Once a track record has been built and audited, the really hard work begins:  raising investment capital!

Cointegration Breakdown

The Low Power of Cointegration Tests

One of the perennial difficulties in developing statistical arbitrage strategies is the lack of reliable methods of estimating a stationary portfolio comprising two or more securities. In a prior post (below) I discussed at some length one of the primary reasons for this, i.e. the lower power of cointegration tests. In this post I want to explore the issue in more depth, looking at the standard Johansen test Procedure to estimate cointegrating vectors.

Johansen Test for Cointegration

Start with some weekly data for an ETF triplet analyzed in Ernie Chan’s book:

After downloading the weekly close prices for the three ETFs we divide the data into 14 years of in-sample data and 1 year out of sample:

We next apply the Johansen test, using code kindly provided by Amanda Gerrish:

We find evidence of up to three cointegrating vectors at the 95% confidence level:

 

Let’s take a look at the vector coefficients (laid out in rows, in Amanda’s function):

In-Sample vs. Out-of-Sample testing

We now calculate the in-sample and out-of-sample portfolio values using the first cointegrating vector:

The portfolio does indeed appear to be stationary, in-sample, and this is confirmed by the unit root test, which rejects the null hypothesis of a unit root:

Unfortunately (and this is typically the case) the same is not true for the out of sample period:

More Data Doesn’t Help

The problem with the nonstationarity of the out-of-sample estimated portfolio values is not mitigated by adding more in-sample data points and re-estimating the cointegrating vector(s):

We continue to add more in-sample data points, reducing the size of the out-of-sample dataset correspondingly. But none of the tests for any of the out-of-sample datasets is able to reject the null hypothesis of a unit root in the portfolio price process:

 

 

The Challenge of Cointegration Testing in Real Time

In our toy problem we know the out-of-sample prices of the constituent ETFs, and can therefore test the stationarity of the portfolio process out of sample. In a real world application, that discovery could only be made in real time, when the unknown, future ETFs prices are formed. In that scenario, all the researcher has to go on are the results of in-sample cointegration analysis, which demonstrate that the first cointegrating vector consistently yields a portfolio price process that is very likely stationary in sample (with high probability).

The researcher might understandably be persuaded, wrongly, that the same is likely to hold true in future. Only when the assumed cointegration relationship falls apart in real time will the researcher then discover that it’s not true, incurring significant losses in the process, assuming the research has been translated into some kind of trading strategy.

A great many analysts have been down exactly this path, learning this important lesson the hard way. Nor do additional “safety checks” such as, for example, also requiring high levels of correlation between the constituent processes add much value. They might offer the researcher comfort that a “belt and braces” approach is more likely to succeed, but in my experience it is not the case: the problem of non-stationarity in the out of sample price process persists.

Conclusion:  Why Cointegration Breaks Down

We have seen how a portfolio of ETFs consistently estimated to be cointegrated in-sample, turns out to be non-stationary when tested out-of-sample.  This goes to the issue of the low power of cointegration test, and their inability to estimate cointegrating vectors with sufficient accuracy.  Analysts relying on standard tests such as the Johansen procedure to design their statistical arbitrage strategies are likely to be disappointed by the regularity with which their strategies break down in live trading.

 

Successful Statistical Arbitrage

 

I tend not to get involved in Q&A with readers of my blog, or with investors.  I am at a point in my life where I spend my time mostly doing what I want to do, rather than what other people would like me to do.  And since I enjoy doing research and trading, I try to maximize the amount of time I spend on those activities.

As a business strategy, I wouldn’t necessarily recommend this approach.  It’s just something I evolved while learning to play chess: since I had no-one to teach me, I had to learn everything for myself and this involved studying for many, many hours alone.

By contrast, several of the best money managers are also excellent communicators – take Roy Niederhoffer, or Ernie Chan, for example. Having regular, informed communication with your investors is, as smarter managers have realized, a means of building trust and investor loyalty – important factors that come into play during periods when your strategy is underperforming. Not only that, but since communication is two-way, an analyst/manager can learn much from his exchanges with his clients.  Knowing how others perceive you – and your competitors – for example, is very useful information.  So, too, is information about your competitors’ research ideas, investment strategies and fund performance, which can often be gleaned from discussions with investors.  There are plenty of reasons to prefer a policy of regular, open communication.

As a case in point, I was surprised to learn from  comments on another research blog that readers drew the conclusion from my previous posts that pursuing the cointegration or Kalman Filter approach to statistical arbitrage was a waste of time.  Apparently, my remark to the effect that researchers often failed to pay attention to the net PnL per share in evaluating stat. arb. trading strategies was taken by some to mean that any apparent profitability would always be subsumed within the bid-offer spread.  That was not my intention.  What I intended to convey was that in some instances, this would be the case  – some, but not all.

To illustrate the point, below are the out-of-sample results from a research study applying the Kalman Filter approach for four equity pairs using 5-minute data.  For competitive reasons I am unable to identify the specific stocks in each pair, which result from an exhaustive analysis of over 30,000 pairs, but I can say that they are liquid large-cap equities traded in large volume on the US exchanges.  The performance numbers are net of transaction costs and are based on the assumption of a 5-minute delay in execution: meaning, a trading signal received at time t is assumed to be executed at time t+5 minutes.  This allows sufficient time to leg into each trade passively, in most cases avoiding the bid-offer spread.  The net PnL per share is above 1.5c per share for each pair.

Fig 0 While the performance of none of the pairs is spectacular, a combined portfolio has quite attractive characteristics, which include 81% winning months since Jan 2012, a CAGR of over 27% and Information Ratio of 2.29, measured on monthly returns (2.74 based on daily returns).

Fig 2

Fig 3

Finally, I am currently implementing trading of a number of stock portfolios based on static cointegration relationships that have out-of-sample information ratios of between 3 and 4, using daily data.

 

 

ETF Pairs Trading with the Kalman Filter

I was asked by a reader if I could illustrate the application of the Kalman Filter technique described in my previous post with an example. Let’s take the ETF pair AGG IEF, using daily data from Jan 2006 to Feb 2015 to estimate the model.  As you can see from the chart in Fig. 1, the pair have been highly correlated over the last several years.

Fig 1Fig 1.  AGG and IEF Daily Prices 2006-2015

We now estimate the beta-relationship between the ETF pair with the Kalman Filter, using the Matlab code given below, and plot the estimated vs actual prices of the first ETF, AGG in Fig 2.  There are one or two outliers that you might want to take a look at, but mostly the fit looks very good. Fig 2

 Fig 2 – Actual vs Fitted Prices of AGG

Now lets take a look at Kalman Filter estimates of beta.  As you can see in Fig 3, it wanders around a lot!  Very difficult to handle using some kind of static beta estimate. Fig 3

Fig 3 – Kalman Filter Beta Estimates

  Finally, we compute the raw and standardized alphas, being the differences between the observed and fitted prices , i.e. Alpha(t) = AGG(t) – b(t)* IEF(t) and kfAlpha(t) = (Alpha(t) – mean(Alpha(t)) / std(Alpha(t)   I have plotted the kfAlpha estimates over the last year in Fig 4.   Fig 4

Fig 4 – Standardized Alpha Estimates

  The last step is to decide how to trade this relationship.  You might, for example, trade the portfolio in proportion to the standardized deviation (i.e. the  size of kfAlpha(t)).  Alternatively, you might set a threshold level, say +/- 1 Sd, and trade the portfolio when  kfAlpha(t) exceeds this the threshold.   In the Matlab code below I use the particle swarm method  to maximize the likelihood.  I have found this to be more reliable than other methods.

Continue reading “ETF Pairs Trading with the Kalman Filter”

Statistical Arbitrage Using the Kalman Filter

One of the challenges with the cointegration approach to statistical arbitrage which I discussed in my previous post, is that cointegration relationships are seldom static: they change quite frequently and often break down completely.  Back in 2009 I began experimenting with a more dynamic approach to pairs trading, based on the Kalman Filter.

 

In its simplest form, we  model the relationship between a pair of securities in the following way:

beta(t) = beta(t-1) + w     beta(t), the unobserved state variable, that follows a random walk

Y(t) = beta(t)X(t) + v      The observed processes of stock prices Y(t) and X(t)

where:

w ~ N(0,Q) meaning w is gaussian noise with zero mean and variance Q

v ~ N(0,R) meaning v is gaussian noise with variance R

So this is just like the usual pairs relationship Y = beta * X + v, where the typical approach is to estimate beta using least squares regression, or some kind of rolling regression (to try to take account of the fact that beta may change over time).  In this traditional framework, beta is static, or slowly changing.

In the Kalman framework, beta is itself a random process that evolves continuously over time, as a random walk.  Because it is random and contaminated by noise we cannot observe beta directly, but must infer its (changing) value from the observable stock prices X and Y. (Note: in what follows I shall use X and Y to refer to stock prices.  But you could also use log prices, or returns).

Unknown to me at that time,  several other researchers were thinking along the same lines and later published their research.  One such example is Statistical Arbitrage and High-Frequency Data with an Application to Eurostoxx 50 Equities,  Rudy, Dunis, Giorgioni and Laws, 2010.  Another closely related study is  Performance Analysis of Pairs Trading Strategy Utilizing High Frequency Data with an Application to KOSPI 100 Equities, Kim, 2011.  Both research studies follow a very similar path, rejecting beta estimation using rolling regression or exponential smoothing in favor of the Kalman approach and applying a Ornstein-Uhlenbeck model to estimate the half-life of mean reversion of the pairs portfolios.  The studies report very high out-of-sample information ratios that in some cases exceed 3.

I have already made the point that such unusually high performance is typically the result of ignoring the fact that the net PnL per share may lie within the region of the average bid-offer spread, making implementation highly problematic.  In this post I want to dwell on another critical issue that is particular to the Kalman approach: the signal:noise ratio, Q/R, which expresses the ratio of the variance of the beta process to that of the price process.  (Curiously, both papers make the same mistake of labelling Q and R as standard deviations. In fact, they are variances).

Beta, being a random process, obviously contains some noise:  but the hope is that it is less noisy than the price process.  The idea is that the relationship between two stocks is more stable – less volatile – than the stock processes themselves.  On its face, that assumption appears reasonable, from an empirical standpoint.  The question is:  how stable is the beta process, relative to the price process? If the variance in the beta process is  low relative to the price process,  we can determine beta quite accurately over time and so obtain accurate estimates of the true price Y(t), based on X(t).  Then, if we observe a big enough departure in the quoted price Y(t) from the true price at time t, we have a potential trade.

In other words, we are interested in:

alpha(t) = Y(t) – Y*(t) = Y(t) – beta(t) X(t)

where Y(t) and X(t) are the observed stock prices and beta(t) is the estimated value of beta at time t.

As usual, we would standardize the alpha using an estimate of the alpha standard deviation, which is sqrt(R).  (Alternatively, you can estimate the standard deviation of the alpha directly, using a lookback period based on the alpha half-life).

If the standardized alpha is large enough, the model suggests that the price Y(t) is quoted significantly in excess of the true value.  Hence we would short stock Y and buy stock X.  (In this context, where X and Y represent raw prices, you would hold an equal and opposite number of shares in Y and X.  If X and Y represented returns, you would hold equal and opposite market value in each stock).

The success of such a strategy depends critically on the quality of our estimates of alpha, which in turn rest on the accuracy of our estimates of beta. This depends on the noisiness of the beta process, i.e. its variance, Q.  If the beta process is very noisy, i.e. if Q is large, our estimates of alpha are going to be too noisy to be useful as the basis for a reversion strategy.

So, the key question I want to address in this post is: in order for the Kalman approach to be effective in modeling a pairs relationship, what would be an acceptable range for the beta process variance Q ?  (It is often said that what matters in the Kalman framework is not the variance Q, per se, but rather the signal:noise ratio Q/R.  It turns out that this is not strictly true, as we shall see).

To get a handle on the problem, I have taken the following approach:

(i) Simulate a stock process X(t) as a geometric brownian motion process with specified drift and volatility (I used 0%,  5% and 10% for the annual drift, and 10%,  30% and 60% for the corresponding annual volatility).

(ii) simulate a beta(t) process as a random walk with variance Q in the range from 1E-10 to 1E-1.

(iii) Generate the true price process Y(t) = beta(t)* X(t)

(iv) Simulate an observed price process Yobs(t), by adding random noise with variance R to Y(t), with R in the range 1E-6 to 1.0

(v) Calculate the true, known alpha(t) = Y(t) – Yobs(t)

(vi) Fit the Kalman Filter model to the simulated processes and estimate beta(t)  and Yest(t). Hence produce estimates kfalpha(t)  = Yobs(t) – Yest(t) and compare these with the known, true alpha(t).

The charts in Fig. 1 below illustrate the procedure for a stock process X(t) with annual drift of 10%, annual volatility 40%, beta process variance Q of 8.65E-9 and price process variance R of 5.62E-2 (Q/R ratio of 1.54E-7).

Fig 1

Fig. 1 True and Estimated Beta and Alpha Using the Kalman Filter

As you can see, the Kalman Filter does  a very good job of updating its beta estimate to track the underlying, true beta (which, in this experiment, is known). As the noise ratio Q/R is small, the Kalman Filter estimates of the process alpha, kfalpha(t), correspond closely to the true alpha(t), which again are known to us in this experimental setting.  You can examine the relationship between the true alpha(t) and the Kalman Filter estimates kfalpha(t) is the chart in the upmost left quadrant of the figure.  The correlation between the two is around 89%.  With a level of accuracy this good for our alpha estimates, the pair of simulated stocks would make an ideal candidate for a pairs trading strategy.

Of course, the outcome is highly dependent on the values we assume for Q and R (and also to some degree on the assumptions made about the drift and volatility of the price process X(t)).

The next stage of the analysis is therefore to generate a large number of simulated price and beta observations and examine the impact of different levels of Q and R, the variances of the beta and price process.  The results are summarized in the table in Fig 2 below.

Fig 2

 Fig 2. Correlation between true alpha(t) and kfalpha(t) for values of Q and R

As anticipated, the correlation between the true alpha(t) and the estimates produced by the Kalman Filter is very high when the signal:noise ratio is small, i.e. of the order of 1E-6, or less.  Average correlations begin to tail off very quickly when Q/R exceeds this level, falling to as low as 30% when the noise ratio exceeds 1E-3.  With a Q/R ratio of 1E-2 or higher, the alpha estimates become too noisy to be useful.

I find it rather fortuitous, even implausible, that in their study Rudy, et al, feel able to assume a noise ratio of 3E-7 for all of the stock pairs in their study, which just happens to be in the sweet spot for alpha estimation.  From my own research, a much larger value in the region of 1E-3 to 1E-5 is  more typical. Furthermore, the noise ratio varies significantly from pair to pair, and over time.  Indeed, I would go so far as to recommend applying a noise ratio filter to the strategy, meaning that trading signals are ignored when the noise ratio exceeds some specified level.

The take-away is this:  the Kalman Filter approach can be applied very successfully in developing statistical arbitrage strategies, but only for processes where the noise ratio is not too large.  One suggestion is to use a filter rule to supress trade signals generated at times when the noise ratio is too large, and/or to increase allocations to pairs in which the noise ratio is relatively low.

 

 

 

Developing Statistical Arbitrage Strategies Using Cointegration

In his latest book (Algorithmic Trading: Winning Strategies and their Rationale, Wiley, 2013) Ernie Chan does an excellent job of setting out the procedures for developing statistical arbitrage strategies using cointegration.  In such mean-reverting strategies, long positions are taken in under-performing stocks and short positions in stocks that have recently outperformed.

I will leave a detailed description of the procedure to Ernie (see pp 47 – 60), which in essence involves:

(i) estimating a cointegrating relationship between two or more stocks, using the Johansen procedure

(ii) computing the half-life of mean reversion of the cointegrated process, based on an Ornstein-Uhlenbeck  representation, using this as a basis for deciding the amount of recent historical data to be used for estimation in (iii)

(iii) Taking a position proportionate to the Z-score of the market value of the cointegrated portfolio (subtracting the recent mean and dividing by the recent standard deviation, where “recent” is defined with reference to the half-life of mean reversion)

Countless researchers have followed this well worn track, many of them reporting excellent results.  In this post I would like to discuss a few of many considerations  in the procedure and variations in its implementation.  We will follow Ernie’s example, using daily data for the EWF-EWG-ITG triplet of ETFs from April 2006 – April 2012. The analysis runs as follows (I am using an adapted version of the Matlab code provided with Ernie’s book):

Johansen test We reject the null hypothesis of fewer then three cointegrating relationships at the 95% level. The eigenvalues and eigenvectors are as follows:

Eigenvalues The eignevectors are sorted by the size of their eigenvalues, so we pick the first of them, which is expected to have the shortest half-life of mean reversion, and create a portfolio based on the eigenvector weights (-1.046, 0.76, 0.2233).  From there, it requires a simple linear regression to estimate the half-life of mean reversion:

Halflife From which we estimate the half-life of mean reversion to be 23 days.  This estimate gets used during the final, stage 3, of the process, when we choose a look-back period for estimating the running mean and standard deviation of the cointegrated portfolio.  The position in each stock (numUnits) is sized according to the standardized deviation from the mean (i.e. the greater the deviation the larger the allocation). Apply Ci The results appear very promising, with an annual APR of 12.6% and Sharpe ratio of 1.4:   Returns EWA-EWC-IGE

Ernie is at pains to point out that, in this and other examples in the book, he pays no attention to transaction costs, nor to the out-of-sample performance of the strategies he evaluates, which is fair enough.

The great majority of the academic studies that examine the cointegration approach to statistical arbitrage for a variety of investment universes do take account of transaction costs.  For the most part such studies report very impressive returns and Sharpe ratios that frequently exceed 3.  Furthermore, unlike Ernie’s example which is entirely in-sample, these studies typically report consistent out-of-sample performance results also.

But the single, most common failing of such studies is that they fail to consider the per share performance of the strategy.  If the net P&L per share is less than the average bid-offer spread of the securities in the investment portfolio, the theoretical performance of the strategy is unlikely to survive the transition to implementation.  It is not at all hard to achieve a theoretical Sharpe ratio of 3 or higher, if you are prepared to ignore the fact that the net P&L per share is lower than the average bid-offer spread.  In practice, however, any such profits are likely to be whittled away to zero in trading frictions – the costs incurred in entering, adjusting and exiting positions across multiple symbols in the portfolio.

Put another way, you would want to see a P&L per share of at least 1c, after transaction costs, before contemplating implementation of the strategy.  In the case of the EWA-EWC-IGC portfolio the P&L per share is around 3.5 cents.  Even after allowing, say, commissions of 0.5 cents per share and a bid-offer spread of 1c per share on both entry and exit, there remains a profit of around 2 cents per share – more than enough to meet this threshold test.

Let’s address the second concern regarding out-of-sample testing.   We’ll introduce a parameter to allow us to select the number of in-sample days, re-estimate the model parameters using only the in-sample data, and test the performance out of sample.  With a in-sample size of 1,000 days, for instance, we find that we can no longer reject the null hypothesis of fewer than 3 cointegrating relationships and the weights for the best linear portfolio differ significantly from those estimated using the entire data set.

Johansen 2

Repeating the regression analysis using the eigenvector weights of the maximum eigenvalue vector (-1.4308, 0.6558, 0.5806), we now estimate the half-life to be only 14 days.  The out-of-sample APR of the strategy over the remaining 500 days drops to around 5.15%, with a considerably less impressive Sharpe ratio of only 1.09.

osPerfOut-of-sample cumulative returns

One way to improve the strategy performance is to relax the assumption of strict proportionality between the portfolio holdings and the standardized deviation in the market value of the cointegrated portfolio.  Instead, we now require  the standardized deviation of the portfolio market value to exceed some chosen threshold level before we open a position (and we close any open positions when the deviation falls below the threshold).  If we choose a threshold level of 1, (i.e. we require the market value of the portfolio to deviate 1 standard deviation from its mean before opening a position), the out-of-sample performance improves considerably:

osPerf 2

The out-of-sample APR is now over 7%, with a Sharpe ratio of 1.45.

The strict proportionality requirement, while logical,  is rather unusual:  in practice, it is much more common to apply a threshold, as I have done here.  This addresses the need to ensure an adequate P&L per share, which will typically increase with higher thresholds.  A countervailing concern, however, is that as the threshold is increased the number of trades will decline, making the results less reliable statistically.  Balancing the two considerations, a threshold of around 1-2 standard deviations is a popular and sensible choice.

Of course, introducing thresholds opens up a new set of possibilities:  just because you decide to enter based on a 2x SD trigger level doesn’t mean that you have to exit a position at the same level.  You might consider the outcome of entering at 2x SD, while exiting at 1x SD, 0x SD, or even -2x SD.  The possible nuances are endless.

Unfortunately, the inconsistency in the estimates of the cointegrating relationships over different data samples is very common.  In fact, from my own research, it is often the case that cointegrating relationships break down entirely out-of-sample, just as do correlations.  A recent study by Matthew Clegg of over 860,000 pairs confirms this finding (On the Persistence of Cointegration in Pais Trading, 2014) that cointegration is not a persistent property.

I shall examine one approach to  addressing the shortcomings  of the cointegration methodology  in a future post.

 

Matlab code (adapted from Ernie Chan’s book):

Continue reading “Developing Statistical Arbitrage Strategies Using Cointegration”

The Correlation Signal

The use of correlations is widespread in investment management theory and practice, from the construction of portfolios to the design of hedge trades to statistical arbitrage strategies.

A common difficulty encountered in all of these applications is the variation in correlation: assets that at one time appear to be suitably uncorrelated for hedging purposes, may become much more highly correlated at other times, such as periods of market stress. Conversely, stocks that appear suitable for pairs trading due to the high correlation in their prices or returns, may de-couple at a later time, causing significant losses.

The instability in the level of correlation is further aggravated by the empirical finding that the volatility in correlation is itself time-dependent:  at times the correlations between assets may appear to fluctuate smoothly within a tight range; at other times we might see several fluctuations in the sign of the correlation  coefficient over the course of a few days.

One tool I have found useful in this context is a concept I refer to as the correlation signal, defined at the average correlation divided by the standard deviation of the correlation coefficient.  The chart below illustrates a typical pattern for a pair of Oil and Gas industry stocks.  The blue line is the average daily correlation between the stocks, measured at 5-minute intervals.  The red line is the correlation signal – the average daily correlation divided by the standard deviation in the intra-day correlation.  The stochastic nature of both the correlation coefficient and the correlation signal is quite evident.  Note that the correlation signal, unlike the coefficient, is not constrained within the limits of +/- 1.  At times when the variation in correlation is low the signal an easily exceed those limits by as much as an order of magnitude.

CorrSig Plot

In later posts I will illustrate the usefulness of the correlation signal in portfolio construction and statistical arbitrage.  For now, let me just say that it is a measure of the strength of the correlation as a signal, relative to the noise of random variation in the correlation process.   It can be used to identify situations in which a relationship – whether a positive or negative correlation – appears to be stable or unstable, and therefore viable as a basis for inference, or not.