Modeling Asset Volatility

I am planning a series of posts on the subject of asset volatility and option pricing and thought I would begin with a survey of some of the central ideas. The attached presentation on Modeling Asset Volatility sets out the foundation for a number of key concepts and the basis for the research to follow.

Perhaps the most important feature of volatility is that it is stochastic rather than constant, as envisioned in the Black-Scholes framework. The presentation addresses this issue by identifying some of the chief stylized facts about volatility processes and how they can be modeled. Certain characteristics of volatility are well known to most analysts, such as its tendency to "cluster" in periods of higher and lower volatility. However, there are many other typical features that are less often rehearsed, and these too are examined in the presentation.

Long Memory
For example, while it is true that GARCH models do a fine job of modeling the clustering effect, they typically fail to capture one of the most important features of volatility processes – long-term serial autocorrelation.  In the typical GARCH model autocorrelations die away approximately exponentially, and historical events are seen to have little influence on the behavior of the process very far into the future.  In volatility processes that is typically not the case, however:  autocorrelations die away very slowly and historical events may continue to affect the process many weeks, months or even years ahead.
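A minimal sketch of this contrast in Python is shown below. It compares the empirical autocorrelation of squared returns with the approximately geometric decay implied by a fitted GARCH(1,1) model; the statsmodels and arch packages are assumed to be available, and the daily return series is whatever history you care to supply.

```python
# Sketch: compare the slow decay of autocorrelation in squared returns with the
# (approximately) geometric decay implied by a GARCH(1,1) fit.
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import acf
from arch import arch_model

def acf_comparison(returns: pd.Series, nlags: int = 100) -> pd.DataFrame:
    # Empirical autocorrelation of squared returns (a proxy for the volatility process)
    emp_acf = acf(returns ** 2, nlags=nlags, fft=True)

    # Fit a GARCH(1,1); returns are rescaled to percentages for numerical stability
    res = arch_model(100 * returns, vol="GARCH", p=1, q=1).fit(disp="off")
    alpha, beta = res.params["alpha[1]"], res.params["beta[1]"]

    # For a GARCH(1,1), autocorrelations of squared returns decay roughly
    # geometrically at rate (alpha + beta): rho(k) ~ rho(1) * (alpha + beta)**(k-1)
    garch_acf = np.concatenate(([1.0], emp_acf[1] * (alpha + beta) ** np.arange(nlags)))

    return pd.DataFrame({"empirical": emp_acf, "garch_implied": garch_acf})
```

In a long-memory volatility series the empirical column typically remains well above the geometric curve at large lags, which is precisely the failure of the GARCH specification described above.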

[Chart: Volatility Direction Prediction Accuracy]

There are two immediate and very important consequences of this feature.  The first is that volatility processes will tend to trend over long periods – a characteristic of Black Noise or Fractionally Integrated processes, in contrast to the White Noise behavior that typically characterizes asset return processes.  Secondly, and again in contrast with asset return processes, volatility processes are inherently predictable, being conditioned to a significant degree on past behavior.  The presentation considers fractional integration frameworks as a basis for modeling and forecasting volatility.
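The mechanics of fractional integration are easy to illustrate. The sketch below (a generic illustration, not code from the presentation) computes the binomial weights of the fractional differencing operator (1 − L)^d that underlies ARFIMA- and FIGARCH-type models; the weights decay hyperbolically rather than exponentially, which is what produces long memory.

```python
# Sketch: fractional differencing, the building block of ARFIMA / FIGARCH models.
import numpy as np

def frac_diff_weights(d: float, n: int) -> np.ndarray:
    """Weights of the binomial expansion of (1 - L)^d, with w[0] = 1."""
    w = np.zeros(n)
    w[0] = 1.0
    for k in range(1, n):
        w[k] = -w[k - 1] * (d - k + 1) / k
    return w

def frac_diff(x: np.ndarray, d: float) -> np.ndarray:
    """Apply (1 - L)^d to a series, truncating the expansion at len(x) lags."""
    w = frac_diff_weights(d, len(x))
    return np.array([np.dot(w[: t + 1], x[t::-1]) for t in range(len(x))])

# The hyperbolic decay of the weights is easy to verify:
print(frac_diff_weights(d=0.4, n=10))
```

For 0 < d < 0.5 the resulting process is stationary but long-memory: shocks decay, but far more slowly than in any finite-order GARCH or ARMA representation.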

Mean Reversion vs. Momentum
A puzzling feature of much of the literature on volatility is that it tends to stress the mean-reverting behavior of volatility processes.  This appears to contradict the finding that volatility behaves as a reinforcing process, whose long-term serial autocorrelations create a tendency to trend.  This leads to one of the most important findings about asset processes in general, and volatility processes in particular: that they are simultaneously trending and mean-reverting.  One way to understand this is to think of volatility, not as a single process, but as the superposition of two processes:  a long-term process in the mean, which tends to reinforce and trend, around which there operates a second, transient process that has a tendency to produce short-term spikes in volatility that decay very quickly.  In other words, a transient, mean-reverting process interlinked with a momentum process in the mean.  The presentation discusses two-factor modeling concepts along these lines, about which I will have more to say later.
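A toy simulation makes the idea concrete. The sketch below superimposes a highly persistent component of log-volatility on a fast mean-reverting component with occasional jumps; all parameter values are purely illustrative and are not taken from the presentation.

```python
# Sketch: log-volatility as the superposition of a highly persistent component
# (which reinforces and trends) and a fast mean-reverting transient component
# (which produces short-lived spikes). All parameters are illustrative.
import numpy as np

def simulate_two_factor_vol(n: int = 2500, seed: int = 42) -> np.ndarray:
    rng = np.random.default_rng(seed)
    long_run = np.zeros(n)
    transient = np.zeros(n)
    for t in range(1, n):
        # Slow-moving, highly persistent component
        long_run[t] = 0.995 * long_run[t - 1] + 0.02 * rng.standard_normal()
        # Fast mean-reverting component with occasional jumps (volatility spikes)
        jump = 0.5 * rng.standard_normal() if rng.random() < 0.02 else 0.0
        transient[t] = 0.80 * transient[t - 1] + 0.05 * rng.standard_normal() + jump
    # Annualized volatility around a 15% base level
    return 0.15 * np.exp(long_run + transient)

vol_path = simulate_two_factor_vol()
```

The autocorrelation of the simulated path inherits the slow decay of the persistent factor, while the spikes generated by the transient factor die away within a few periods, which is qualitatively the behavior described above.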


Cointegration
One of the most striking developments in econometrics over the last thirty years, cointegration is now a weapon of choice routinely used by quantitative analysts to address research issues ranging from statistical arbitrage to portfolio construction and asset allocation.  Back in the late 1990s I and a handful of other researchers realized that volatility processes exhibited very powerful cointegration tendencies that could be harnessed to create long-short volatility strategies, mirroring the approach much beloved by equity hedge fund managers.  In fact, this modeling technique provided the basis for the Caissa Capital volatility fund, which I founded in 2002.  The presentation examines characteristics of multivariate volatility processes and some of the ideas that have been proposed to model them, such as FIGARCH (fractionally-integrated GARCH).

Dispersion Dynamics
Finally, one topic that is not considered in the presentation, but on which I have spent much research effort in recent years, is the behavior of cross-sectional volatility processes, which I like to term dispersion.  It turns out that, like its univariate cousin, dispersion displays certain characteristics that in principle make it highly forecastable.  Given an appropriate model of dispersion dynamics, the question then becomes how to monetize efficiently the insight that such a model offers.  Again, I will have much more to say on this subject in future posts.

Correlation Cointegration

In a previous post I looked at ways of modeling the relationship between the CBOE VIX Index and the Year 1 and Year 2 CBOE Correlation Indices:

http://jonathankinlay.com/2017/08/modeling-volatility-correlation/

 

The question was put to me whether the VIX and correlation indices might be cointegrated.

Let’s begin by looking at the pattern of correlation between the three indices:

[Charts: correlation between the VIX and the Year 1 and Year 2 Correlation Indices]

If you recall from my previous post, we were able to fit a linear regression model with the Year 1 and Year 2 Correlation Indices that accounts for around 50% of the variation in the VIX index.  While the model certainly has its shortcomings, as explained in the post, it will serve the purpose of demonstrating that the three series are cointegrated.  The standard Dickey-Fuller test rejects the null hypothesis of a unit root in the residuals of the linear model, confirming that the three series are cointegrated of order 1.


[Unit root test of the regression residuals]
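For readers who want to replicate the check, a minimal sketch in Python using statsmodels is shown below. The column names ("VIX", "COR1Y", "COR2Y") are assumptions for illustration, and strictly speaking the residual-based test should use Engle-Granger critical values (e.g. statsmodels' coint function) rather than the standard Dickey-Fuller table.

```python
# Sketch of the two-step check described above: regress the VIX on the Year 1 and
# Year 2 Correlation Indices, then test the regression residuals for a unit root.
# Column names ("VIX", "COR1Y", "COR2Y") are assumptions for illustration.
import pandas as pd
import statsmodels.api as sm
from statsmodels.tsa.stattools import adfuller

def residual_unit_root_check(df: pd.DataFrame) -> dict:
    X = sm.add_constant(df[["COR1Y", "COR2Y"]])
    ols = sm.OLS(df["VIX"], X).fit()
    adf_stat, pvalue, *_ = adfuller(ols.resid)
    return {"r_squared": ols.rsquared, "adf_stat": adf_stat, "adf_pvalue": pvalue}
```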

 

Vector Autoregression

We can attempt to take the modeling a little further by fitting a VAR model.  We begin by splitting the data into an in-sample period from Jan 2007 to Dec 2015 and an out-of-sample test period from Jan 2016  to Aug 2017.  We then fit a vector autoregression model to the in-sample data:

[VAR model estimates]
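A minimal sketch of the corresponding workflow in Python, using the VAR implementation in statsmodels, is given below; the column layout, lag selection and split dates are assumptions that follow the description in the text.

```python
# Sketch: fit a VAR to the in-sample period (Jan 2007 - Dec 2015) and produce a
# dynamic forecast over the out-of-sample period (Jan 2016 - Aug 2017).
# df is assumed to be a DataFrame of the three series with a DatetimeIndex.
import pandas as pd
from statsmodels.tsa.api import VAR

def fit_and_forecast(df: pd.DataFrame):
    in_sample = df.loc["2007-01-01":"2015-12-31"]
    out_sample = df.loc["2016-01-01":"2017-08-31"]

    results = VAR(in_sample).fit(maxlags=10, ic="aic")   # lag order chosen by AIC

    forecast = results.forecast(in_sample.values[-results.k_ar:], steps=len(out_sample))
    forecast = pd.DataFrame(forecast, index=out_sample.index, columns=df.columns)
    return results, forecast, out_sample
```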

When we examine how the model performs on the out-of-sample data, we find that it fails to pick up on much of the variation in the series – the forecasts are fairly flat and provide quite poor predictions of the trends in the three series over the period from 2016-2017:

[Out-of-sample forecasts of the VIX and Correlation Indices]

Conclusion

The VIX and Correlation Indices are not only highly correlated, but also cointegrated, in the sense that a linear combination of the series is stationary.

One can fit a weakly stationary VAR process model to the three series, but the fit is quite poor and forecasts from the model don’t appear to add much value.  It is conceivable that a more comprehensive model involving longer lags would improve forecasting performance.

 

 

Pairs Trading – Part 2: Practical Considerations

Pairs Trading = Numbers Game

One of the first things you quickly come to understand in equity pairs trading is how important it is to spread your risk.  The reason is obvious: stocks are subject to a multitude of risk factors – amongst them earnings shocks and corporate actions – that can blow up an otherwise profitable pairs trade.  Instead of the pair re-converging, they continue to diverge until you are stopped out of the position.  There is not much you can do about this, because equities are inherently risky.  Some arbitrageurs prefer trading ETF pairs for precisely this reason.  But risk and reward are two sides of the same coin:  risks tend to be lower in ETF pairs trades, but so, too, are the rewards.  Another factor to consider is that there are many more opportunities to be found amongst the vast number of stock combinations than in the much smaller universe of ETFs.  So equities remain the asset class of choice for the great majority of arbitrageurs.

So, because of the risk in trading equities, it is vitally important to spread the risk amongst a large number of pairs.  That way, when one of your pairs trades inevitably blows up for one reason or another, the capital allocation is low enough not to cause irreparable damage to the overall portfolio.  Nor are you over-reliant on one or two star performers that may cease to contribute if, for example, one of the stock pairs is subject to a merger or takeover.

Does that mean that pairs trading is accessible only to managers with deep enough pockets to allocate broadly in the investment universe?  Yes and no.  On the one hand, of course, you need sufficient capital to allocate a meaningful sum to each of your pairs.  But pairs trading is highly efficient in its use of capital:  margin requirements are greatly reduced by the much lower risk of a dollar-neutral portfolio.  So your capital goes further than it would in a long-only strategy, for example.

How many pair combinations would you need to research to build an investment portfolio of the required size?  The answer might shock you:  millions.  Or  even tens of millions.  In the case of the Gemini Pairs strategy, for example, the universe comprises around 10m stock pairs and 200,000 ETF combinations.

It turns out to be much more challenging to find reliable stock pairs to trade than one might imagine, for reasons I am about to discuss.  So what tends to discourage investors from exploring pairs trading as an investment strategy is not because the strategy is inherently hard to understand; nor because the methods are unknown; nor because it requires vast amounts of investment capital to be viable.  It is that the research effort required to build a successful statistical arbitrage strategy is beyond the capability of the great majority of investors.

Before you become too discouraged, I will just say that there are at least two solutions to this challenge I can offer, which I will discuss later.

Methodology Isn’t a Decider

I have traded pairs successfully using all of the techniques described in the first part of the post (i.e. Ratio, Regression, Kalman and Copula methods).  Equally, I have seen a great many failed pairs strategies produced by using every available technique.  There is no silver bullet.  One often finds that a pair that performs poorly using the ratio method produces decent returns when a regression or Kalman Filter model is applied.  From experience, there is no pattern that allows you to discern which technique, if any, is going to work.  You have to be prepared to try all of them, at least in back-test.

Correlation is Not the Answer

In a typical description of pairs trading the first order of business is often to look for highly correlated pairs to trade.  While this makes sense as a starting point, it can never provide a complete answer.  The reason is well known:  correlations are unstable, and can often arise from random chance rather than as a result of a real connection between two stock processes.  The concept of spurious correlation is most easily grasped with an example:

Of course, no rational person believes that there is a causal connection between cheese consumption and death by bedsheet entanglement – it is a spurious correlation that has arisen due to the random fluctuations in the two time series.  And because the correlation is spurious, the apparent relationship is likely to break down in future.

We can provide a slightly more realistic illustration as follows.  Let us suppose we have two correlated stocks, one with an annual drift (i.e. trend) of 5% and annual volatility of 25%, the other with an annual drift of 20% and annual volatility of 50%.  We assume that returns from the two processes follow a Normal distribution, with a true correlation of 0.3.  Let's assume that we sample the returns for the two stocks over 90 days to estimate the correlation, simulating the real-world situation in which the true correlation is unknown.  Unlike in the real-world scenario, we can sample the 90-day returns many times (100,000 in this experiment) and look at the range of correlation estimates we observe:

We find that, over the 100,000 repeated experiments the average correlation estimate is very close indeed to the true correlation.  However, in the real-world situation we only have a single observation, based on the returns from the two stock processes over the prior 90 days.  If we are very lucky, we might happen to pick a period in which the processes correlate at a level close to the true value of 0.3.  But as the experiment shows, we might be unlucky enough to see an estimate as high as 0.64, or as low as zero!
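The experiment is straightforward to reproduce. The sketch below follows the parameters given in the text (true correlation 0.3, 90-day samples, 100,000 repetitions, and the drift and volatility figures quoted above); the exact simulation code behind the original chart is not shown here, so treat this as an equivalent illustration.

```python
# Sketch of the sampling experiment: two return series with true correlation 0.3,
# sampled over 90 days and repeated 100,000 times, to show how widely the
# estimated correlation coefficient varies around its true value.
import numpy as np

def correlation_sampling_experiment(n_days=90, n_trials=100_000, rho=0.3, seed=1):
    rng = np.random.default_rng(seed)
    # Daily drift and volatility (5%/25% and 20%/50% annualized, 252 trading days)
    mu = np.array([0.05, 0.20]) / 252
    sigma = np.array([0.25, 0.50]) / np.sqrt(252)
    cov = np.array([[sigma[0] ** 2, rho * sigma[0] * sigma[1]],
                    [rho * sigma[0] * sigma[1], sigma[1] ** 2]])

    estimates = np.empty(n_trials)
    for i in range(n_trials):
        returns = rng.multivariate_normal(mu, cov, size=n_days)
        estimates[i] = np.corrcoef(returns[:, 0], returns[:, 1])[0, 1]
    return estimates

est = correlation_sampling_experiment()
print(est.mean(), est.min(), est.max())   # mean close to 0.3, but a very wide range
```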

So when we look at historical data and use estimates of the correlation coefficient to gauge the strength of the relationship between two stocks, we are at the mercy of random variation in the sampling process, one that could suggest a much stronger (or weaker) connection than is actually the case.

One is on firmer ground in selecting pairs of stocks in the same sector, for example oil or gold-mining stocks, because we are able to identify causal factors that should provide a basis for a reliable correlation, such as the price of oil or gold.  This is indeed one of the “screens” that statistical arbitrageurs often use to select pairs for analysis.  But there are many examples of stocks that “ought” to be correlated but which nonetheless break down and drift apart.  This can happen for many reasons:  changes in the capital structure of one of the companies; a major product launch;  regulatory action; or corporate actions such as mergers and takeovers.

The bottom line is that correlation, while important, is not by itself a sufficiently reliable measure to provide a basis for pair selection.

Cointegration: the Drunk and His Dog

Suppose you see two drunks (i.e., two random walks) wandering around. The drunks don’t know each other (they’re independent), so there’s no meaningful relationship between their paths.

But suppose instead you have a drunk walking with his dog. This time there is a connection. What's the nature of this connection? Notice that although each path individually is still an unpredictable random walk, given the location of either the drunk or the dog, we have a pretty good idea of where the other is; that is, the distance between the two is fairly predictable. (For example, if the dog wanders too far away from his owner, he'll tend to move in his direction to avoid losing him, so the two stay close together despite a tendency to wander around on their own.) We describe this relationship by saying that the drunk and his dog form a cointegrating pair.

In more technical terms, if we have two non-stationary time series X and Y that become stationary when differenced (these are called integrated of order one series, or I(1) series; random walks are one example) such that some linear combination of X and Y is stationary (aka, I(0)), then we say that X and Y are cointegrated. In other words, while neither X nor Y alone hovers around a constant value, some combination of them does, so we can think of cointegration as describing a particular kind of long-run equilibrium relationship. (The definition of cointegration can be extended to multiple time series, with higher orders of integration.)

Other examples of cointegrated pairs:

  • Income and consumption: as income increases/decreases, so too does consumption.
  • Size of police force and amount of criminal activity
  • A book and its movie adaptation: while the book and the movie may differ in small details, the overall plot will remain the same.
  • Number of patients entering or leaving a hospital

So why do we care about cointegration? Someone else can probably give more econometric applications, but in quantitative finance, cointegration forms the basis of the pairs trading strategy: suppose we have two cointegrated stocks X and Y, with the particular (for concreteness) cointegrating relationship X – 2Y = Z, where Z is a stationary series of zero mean. For example, X could be McDonald’s, Y could be Burger King, and the cointegration relationship would mean that X tends to be priced twice as high as Y, so that when X is more than twice the price of Y, we expect X to move down or Y to move up in the near future (and analogously, if X is less than twice the price of Y, we expect X to move up or Y to move down). This suggests the following trading strategy: if X – 2Y > d, for some positive threshold d, then we should sell X and buy Y (since we expect X to decrease in price and Y to increase), and similarly, if X – 2Y < -d, then we should buy X and sell Y.
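A minimal sketch of that threshold rule, using the X – 2Y = Z example from the paragraph above, might look as follows; the hedge ratio and threshold are taken from the text, and the function produces target positions only, ignoring sizing, costs and execution.

```python
# Sketch of the threshold rule for the spread Z = X - 2Y described above:
# short X / long Y when Z > d, long X / short Y when Z < -d, flat otherwise.
import pandas as pd

def spread_signal(x: pd.Series, y: pd.Series, hedge_ratio: float = 2.0,
                  d: float = 1.0) -> pd.DataFrame:
    z = x - hedge_ratio * y
    pos_x = pd.Series(0.0, index=z.index)
    pos_x[z > d] = -1.0           # spread too wide: sell X, buy Y
    pos_x[z < -d] = 1.0           # spread too narrow: buy X, sell Y
    pos_y = -hedge_ratio * pos_x  # share ratio follows the cointegrating relation
    return pd.DataFrame({"position_X": pos_x, "position_Y": pos_y})
```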

So how do you detect cointegration? There are several different methods, but the simplest is probably the Engle-Granger test, which works roughly as follows (a code sketch of these steps follows the list):

  • Check that Xt and Yt are both I(1).
  • Estimate the cointegrating relationship Yt = aXt + et by ordinary least squares.
  • Check that the cointegrating residuals et are stationary (say, by using a so-called unit root test, e.g., the Dickey-Fuller test).
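A minimal sketch of these three steps in Python, using statsmodels (which also provides the Engle-Granger test directly via coint, with the appropriate critical values):

```python
# Sketch of the three Engle-Granger steps listed above, using statsmodels.
import pandas as pd
from statsmodels.tsa.stattools import adfuller, coint

def engle_granger_steps(x: pd.Series, y: pd.Series, alpha: float = 0.05) -> dict:
    # Step 1: both series should be I(1) - non-stationary in levels,
    # stationary after first differencing.
    def is_i1(s: pd.Series) -> bool:
        return adfuller(s)[1] > alpha and adfuller(s.diff().dropna())[1] < alpha

    # Steps 2 and 3: coint() runs the OLS regression of y on x and applies a
    # unit-root test to the residuals, using the Engle-Granger critical values.
    t_stat, pvalue, crit_values = coint(y, x)

    return {"x_is_I1": is_i1(x), "y_is_I1": is_i1(y),
            "eg_t_stat": t_stat, "eg_pvalue": pvalue,
            "cointegrated": pvalue < alpha}
```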

Also, something else that should perhaps be mentioned is the relationship between cointegration and error-correction mechanisms: suppose we have two cointegrated series Xt, Yt with autoregressive representations

Xt = aXt−1 + bYt−1 + ut
Yt = cXt−1 + dYt−1 + vt

By the Granger representation theorem (which is actually a bit more general than this), we then have

ΔXt = α1(Yt−1 − βXt−1) + ut
ΔYt = α2(Yt−1 − βXt−1) + vt

where Yt−1 − βXt−1 ∼ I(0) is the cointegrating relationship. Regarding Yt−1 − βXt−1 as the extent of disequilibrium from the long-run relationship, and the αi as the speed (and direction) at which the time series correct themselves from this disequilibrium, we can see that this formalizes the way cointegrated variables adjust to match their long-run equilibrium.

So, just to summarize a bit, cointegration is an equilibrium relationship between time series that individually aren't in equilibrium (you can kind of contrast this with (Pearson) correlation, which describes a linear relationship), and it's useful because it allows us to incorporate both short-term dynamics (deviations from equilibrium) and long-run expectations, i.e. corrections to equilibrium.  (My thanks to Edwin Chen for this entertaining explanation.)

Cointegration is Not the Answer

So a typical workflow for researching possible pairs trades might be to examine a large number of pairs in a sector of interest, select those that meet some correlation threshold (e.g. 90%), test those pairs for cointegration and select those that appear to be cointegrated.  The problem is:  it doesn't work!  The pairs thrown up by this process are likely to work for a while, but many (even the majority) will break down at some point, typically soon after you begin live trading.  The reason is that all of the major statistical tests for cointegration have relatively low power, and pairs that are apparently cointegrated break down suddenly, with consequent losses for the trader.  The post on cointegration breakdown, reproduced later in this article, delves into the subject in some detail.

 

Other Practical “Gotchas”

Apart from correlations/cointegration breakdowns there is a long list of things that can go wrong with a pairs trade that the practitioner needs to take account of, for instance:

  • A stock may become difficult or expensive to short
  • The overall backtest performance stats for a pair may look great, but the P&L per share is too small to overcome trading costs and other frictions.
  • Corporate actions (mergers, takeovers) and earnings can blow up one side of an otherwise profitable pair.
  • It is possible to enter the first leg passively and then cross the spread to trade the other leg once the first leg is filled.  But this trade expression is challenging to test.  If paying the spread on both legs would jeopardize the profitability of the strategy, it is probably better to reject the pair.

What Works

From my experience, the testing phase of the process of building a statistical arbitrage strategy is absolutely critical.  By this I mean that, after screening for correlation and cointegration, and back-testing all of the possible types of model, it is essential to conduct an extensive simulation test over a period of several weeks before adding a new pair to the production system.  Testing is important for any algorithmic strategy, of course, but it is an integral part of the selection process where pairs trading is concerned.  You should expect 60% to 80% of your candidates to fail in simulated trading, even after they have been carefully selected and thoroughly back-tested.  The good news is that those pairs that pass the final stage of testing usually are successful in a production setting.

Implementation

Putting all of this information together, it should be apparent that the major challenge in pairs trading lies not so much in understanding the methodologies and techniques, but in executing the research process on an industrial scale, sufficient to collate and analyze tens of millions of pairs. This is beyond the reach of most retail investors, and indeed, many small trading firms:  I once worked with a trading firm for over a year on a similar research project, but in the end it proved to be beyond the capabilities of even their highly competent development team.

So does this mean that, for the average quantitative strategist or investor, statistical arbitrage must remain an investment concept of purely theoretical interest?  Actually, no.  Firstly, for the investor, there are plenty of investment products available that they can access via hedge fund structures (or even our algotrading platform, as I have previously mentioned).

For those interested in building stat arb strategies there is an excellent resource that collates all of the data and analysis on tens of millions of stock pairs, enabling the researcher to identify promising pairs, test their level of cointegration, backtest strategies using different methodologies and even put selected pairs strategies into production (see example below).

Those interested should contact me for more information.

 

Developing Long/Short ETF Strategies

Recently I have been working on the problem of how to construct large portfolios of cointegrated securities.  My focus has been on ETFs rather than stocks, although in principle the methodology applies equally well to either, of course.

My preference for ETFs is due primarily to the fact that  it is easier to achieve a wide diversification in the portfolio with a more limited number of securities: trading just a handful of ETFs one can easily gain exposure, not only to the US equity market, but also international equity markets, currencies, real estate, metals and commodities. Survivorship bias, shorting restrictions  and security-specific risk are also less of an issue with ETFs than with stocks (although these problems are not too difficult to handle).

On the downside, with few exceptions ETFs tend to have much shorter histories than equities or commodities.  One also has to pay close attention to the issue of liquidity. That said, I managed to assemble a universe of 85 ETF products with histories from 2006 that collectively have sufficient liquidity to absorb an investment of several hundred million dollars, at minimum.

The Cardinality Problem

The basic methodology for constructing a long/short portfolio using cointegration is covered in an earlier post.   But problems arise when trying to extend the universe of underlying securities.  There are two challenges that need to be overcome.


The first issue is that, other than the simple regression approach, more advanced techniques such as the Johansen test are unable to handle data sets comprising more than about a dozen securities. The second issue is that the number of possible combinations of cointegrated securities quickly becomes unmanageable as the size of the universe grows.  In this case, even taking a subset of just six securities from the ETF universe gives rise to a total of over 437 million possible combinations (85! / (79! x 6!)).  An exhaustive test of all the possible combinations of a larger portfolio of, say, 20 ETFs would entail examining around 1.4E+19 possibilities.

Given the scale of the computational problem, how to proceed? One approach to addressing the cardinality issue is sparse canonical correlation analysis, as described in Identifying Small Mean Reverting Portfolios,  d’Aspremont (2008). The essence of the idea is something like this. Suppose you find that, in a smaller, computable universe consisting of just two securities, a portfolio comprising, say, SPY and QQQ was  found to be cointegrated.  Then, when extending consideration to portfolios of three securities, instead of examining every possible combination, you might instead restrict your search to only those portfolios which contain SPY and QQQ. Having fixed the first two selections, you are left with only 83 possible combinations of three securities to consider.  This process is repeated as you move from portfolios comprising 3 securities to 4, 5, 6, … etc.
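The greedy search idea can be sketched very simply in code. The example below illustrates the search strategy only, not d'Aspremont's sparse canonical correlation method itself: it keeps the best k-asset basket fixed and scores each candidate addition by a Johansen test on the enlarged basket, with the trace statistic (relative to its 95% critical value) as an assumed selection criterion.

```python
# Sketch of the greedy extension idea: keep the best k-asset basket fixed and
# score each candidate (k+1)-th ETF by a Johansen test on the enlarged basket.
import pandas as pd
from statsmodels.tsa.vector_ar.vecm import coint_johansen

def extend_basket(prices: pd.DataFrame, basket: list):
    """Return the candidate ticker whose addition gives the largest Johansen
    trace statistic relative to its 95% critical value (null of rank 0)."""
    best_ticker, best_score = None, float("-inf")
    for ticker in prices.columns.difference(basket):
        data = prices[basket + [ticker]].dropna()
        result = coint_johansen(data, det_order=0, k_ar_diff=1)
        score = result.lr1[0] - result.cvt[0, 1]   # trace stat minus 95% critical value
        if score > best_score:
            best_ticker, best_score = ticker, score
    return best_ticker, best_score

# e.g. starting from a cointegrated pair and growing the basket one ETF at a time:
# basket = ["SPY", "QQQ"]
# next_etf, score = extend_basket(etf_prices, basket)
```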

Other approaches to the cardinality problem are  possible.  In their 2014 paper Sparse, mean reverting portfolio selection using simulated annealing,  the Hungarian researchers Norbert Fogarasi and Janos Levendovszky consider a new optimization approach based on simulated annealing.  I have developed my own, hybrid approach to portfolio construction that makes use of similar analytical methodologies. Does it work?

A Cointegrated Long/Short ETF Basket

Below are summarized the out-of-sample results for a portfolio comprising 21 cointegrated ETFs over the period from 2010 to 2015.  The basket has broad exposure (long and short) to US and international equities, real estate, currencies and interest rates, as well as exposure in banking, oil and gas and other  specific sectors.

The portfolio was constructed using daily data from 2006 – 2009, and cointegration vectors were re-computed annually using data up to the end of the prior year.  I followed my usual practice of using daily data comprising "closing" prices around 12pm, i.e. in the middle of the trading session, in preference to prices at the 4pm market close.  Although liquidity at that time is often lower than at the close, volatility also tends to be muted and one has a period of perhaps as much as two hours to try to achieve the arrival price. I find this to be a more reliable assumption than the usual alternative.

[Figures 1 and 2: strategy performance]
The risk-adjusted performance of the strategy is consistently outstanding throughout the out-of-sample period from 2010.  After a slowdown in 2014, strategy performance in the first quarter of 2015 has again accelerated to the level achieved in earlier years (i.e. with a Sharpe ratio above 4).

Another useful test procedure is to compare the strategy performance with that of a portfolio constructed using standard mean-variance optimization (using the same ETF universe, of course).  The test indicates that a portfolio constructed using the traditional Markowitz approach produces a similar annual return, but with 2.5x the annual volatility (i.e. a Sharpe ratio of only 1.6).  What is impressive about this result is that the comparison one is making is between the out-of-sample performance of the strategy vs. the in-sample performance of a portfolio constructed using all of the available data.

Having demonstrated the validity of the methodology, at least to my own satisfaction, the next step is to deploy the strategy and test it in a live environment.  This is now under way, using execution algos that are designed to minimize the implementation shortfall (i.e. to minimize any difference between the theoretical and live performance of the strategy).  So far the implementation appears to be working very well.

Once a track record has been built and audited, the really hard work begins:  raising investment capital!

Cointegration Breakdown

The Low Power of Cointegration Tests

One of the perennial difficulties in developing statistical arbitrage strategies is the lack of reliable methods of estimating a stationary portfolio comprising two or more securities. In a prior post (below) I discussed at some length one of the primary reasons for this, i.e. the low power of cointegration tests. In this post I want to explore the issue in more depth, looking at the standard Johansen test procedure to estimate cointegrating vectors.

Johansen Test for Cointegration

Start with some weekly data for an ETF triplet analyzed in Ernie Chan’s book:

After downloading the weekly close prices for the three ETFs we divide the data into 14 years of in-sample data and 1 year out of sample:

We next apply the Johansen test, using code kindly provided by Amanda Gerrish:
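The original analysis used Matlab code provided by Amanda Gerrish, which is not reproduced here. As a stand-in, here is a minimal equivalent sketch in Python using statsmodels' coint_johansen; the in-sample/out-of-sample split follows the 14-year/1-year description above, and the weekly price DataFrame is assumed to hold the three ETFs.

```python
# Sketch (Python stand-in for the Matlab routine referenced above): split weekly
# prices into in-sample and out-of-sample periods and run the Johansen test on
# the in-sample data.
import pandas as pd
from statsmodels.tsa.vector_ar.vecm import coint_johansen

def johansen_in_sample(prices: pd.DataFrame, oos_weeks: int = 52):
    in_sample, out_sample = prices.iloc[:-oos_weeks], prices.iloc[-oos_weeks:]
    result = coint_johansen(in_sample, det_order=0, k_ar_diff=1)
    # Compare trace statistics with their 95% critical values for r = 0, 1, 2
    for r, (stat, cv) in enumerate(zip(result.lr1, result.cvt[:, 1])):
        print(f"rank <= {r}: trace = {stat:.2f}, 95% critical value = {cv:.2f}")
    return result, in_sample, out_sample
```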

We find evidence of up to three cointegrating vectors at the 95% confidence level:

 

Let’s take a look at the vector coefficients (laid out in rows, in Amanda’s function):

In-Sample vs. Out-of-Sample Testing

We now calculate the in-sample and out-of-sample portfolio values using the first cointegrating vector:

The portfolio does indeed appear to be stationary, in-sample, and this is confirmed by the unit root test, which rejects the null hypothesis of a unit root:

Unfortunately (and this is typically the case) the same is not true for the out-of-sample period:
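Continuing the sketch above, the portfolio values and the two unit-root tests can be reproduced along the following lines (again, a minimal Python stand-in rather than the original Matlab code):

```python
# Sketch: form the portfolio from the first cointegrating vector estimated
# in-sample, then run the ADF test separately on the in-sample and
# out-of-sample portfolio values.
import pandas as pd
from statsmodels.tsa.stattools import adfuller

def portfolio_stationarity(result, in_sample: pd.DataFrame, out_sample: pd.DataFrame):
    weights = result.evec[:, 0]              # first cointegrating vector
    port_in = in_sample.values @ weights
    port_out = out_sample.values @ weights

    for label, series in [("in-sample", port_in), ("out-of-sample", port_out)]:
        stat, pvalue, *_ = adfuller(series)
        print(f"{label}: ADF = {stat:.2f}, p-value = {pvalue:.3f}")
```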

More Data Doesn’t Help

The problem with the nonstationarity of the out-of-sample estimated portfolio values is not mitigated by adding more in-sample data points and re-estimating the cointegrating vector(s):

We continue to add more in-sample data points, reducing the size of the out-of-sample dataset correspondingly. But none of the tests for any of the out-of-sample datasets is able to reject the null hypothesis of a unit root in the portfolio price process:
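An expanding-window version of the same test, in the spirit of the procedure described above, might look like this (the window sizes and step count are arbitrary choices):

```python
# Sketch: expand the in-sample window in steps, re-estimate the first
# cointegrating vector each time, and re-test the out-of-sample portfolio.
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import adfuller
from statsmodels.tsa.vector_ar.vecm import coint_johansen

def expanding_window_test(prices: pd.DataFrame, start_frac: float = 0.6, steps: int = 8):
    n = len(prices)
    for cutoff in np.linspace(int(start_frac * n), n - 26, steps, dtype=int):
        in_s, out_s = prices.iloc[:cutoff], prices.iloc[cutoff:]
        weights = coint_johansen(in_s, det_order=0, k_ar_diff=1).evec[:, 0]
        pvalue = adfuller(out_s.values @ weights)[1]
        print(f"in-sample = {cutoff} obs, out-of-sample ADF p-value = {pvalue:.3f}")
```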

 

 

The Challenge of Cointegration Testing in Real Time

In our toy problem we know the out-of-sample prices of the constituent ETFs, and can therefore test the stationarity of the portfolio process out of sample. In a real-world application, that discovery could only be made in real time, when the unknown, future ETF prices are formed. In that scenario, all the researcher has to go on are the results of in-sample cointegration analysis, which demonstrate that the first cointegrating vector consistently yields a portfolio price process that appears, with high probability, to be stationary in sample.

The researcher might understandably be persuaded, wrongly, that the same is likely to hold true in future. Only when the assumed cointegration relationship falls apart in real time will the researcher then discover that it’s not true, incurring significant losses in the process, assuming the research has been translated into some kind of trading strategy.

A great many analysts have been down exactly this path, learning this important lesson the hard way. Nor do additional "safety checks", such as requiring high levels of correlation between the constituent processes, add much value. They might offer the researcher comfort that a "belt and braces" approach is more likely to succeed, but in my experience it is not the case: the problem of non-stationarity in the out-of-sample price process persists.

Conclusion:  Why Cointegration Breaks Down

We have seen how a portfolio of ETFs consistently estimated to be cointegrated in-sample turns out to be non-stationary when tested out-of-sample.  This goes to the issue of the low power of cointegration tests, and their inability to estimate cointegrating vectors with sufficient accuracy.  Analysts relying on standard tests such as the Johansen procedure to design their statistical arbitrage strategies are likely to be disappointed by the regularity with which their strategies break down in live trading.

 

Successful Statistical Arbitrage

 

I tend not to get involved in Q&A with readers of my blog, or with investors.  I am at a point in my life where I spend my time mostly doing what I want to do, rather than what other people would like me to do.  And since I enjoy doing research and trading, I try to maximize the amount of time I spend on those activities.

As a business strategy, I wouldn’t necessarily recommend this approach.  It’s just something I evolved while learning to play chess: since I had no-one to teach me, I had to learn everything for myself and this involved studying for many, many hours alone.

By contrast, several of the best money managers are also excellent communicators – take Roy Niederhoffer, or Ernie Chan, for example. Having regular, informed communication with your investors is, as smarter managers have realized, a means of building trust and investor loyalty – important factors that come into play during periods when your strategy is underperforming. Not only that, but since communication is two-way, an analyst/manager can learn much from his exchanges with his clients.  Knowing how others perceive you – and your competitors – for example, is very useful information.  So, too, is information about your competitors’ research ideas, investment strategies and fund performance, which can often be gleaned from discussions with investors.  There are plenty of reasons to prefer a policy of regular, open communication.

As a case in point, I was surprised to learn from  comments on another research blog that readers drew the conclusion from my previous posts that pursuing the cointegration or Kalman Filter approach to statistical arbitrage was a waste of time.  Apparently, my remark to the effect that researchers often failed to pay attention to the net PnL per share in evaluating stat. arb. trading strategies was taken by some to mean that any apparent profitability would always be subsumed within the bid-offer spread.  That was not my intention.  What I intended to convey was that in some instances, this would be the case  – some, but not all.

To illustrate the point, below are the out-of-sample results from a research study applying the Kalman Filter approach for four equity pairs using 5-minute data.  For competitive reasons I am unable to identify the specific stocks in each pair, which result from an exhaustive analysis of over 30,000 pairs, but I can say that they are liquid large-cap equities traded in large volume on the US exchanges.  The performance numbers are net of transaction costs and are based on the assumption of a 5-minute delay in execution: meaning, a trading signal received at time t is assumed to be executed at time t+5 minutes.  This allows sufficient time to leg into each trade passively, in most cases avoiding the bid-offer spread.  The net PnL is above 1.5c per share for each pair.

[Out-of-sample results for the four pairs]
While the performance of none of the pairs is spectacular, a combined portfolio has quite attractive characteristics, which include 81% winning months since Jan 2012, a CAGR of over 27% and an Information Ratio of 2.29, measured on monthly returns (2.74 based on daily returns).

[Combined portfolio performance charts]

Finally, I am currently implementing trading of a number of stock portfolios based on static cointegration relationships that have out-of-sample information ratios of between 3 and 4, using daily data.

 

 

Developing Statistical Arbitrage Strategies Using Cointegration

In his latest book (Algorithmic Trading: Winning Strategies and their Rationale, Wiley, 2013) Ernie Chan does an excellent job of setting out the procedures for developing statistical arbitrage strategies using cointegration.  In such mean-reverting strategies, long positions are taken in under-performing stocks and short positions in stocks that have recently outperformed.

I will leave a detailed description of the procedure to Ernie (see pp 47 – 60), which in essence involves the following steps (a code sketch of steps (ii) and (iii) appears after the list):

(i) estimating a cointegrating relationship between two or more stocks, using the Johansen procedure

(ii) computing the half-life of mean reversion of the cointegrated process, based on an Ornstein-Uhlenbeck  representation, using this as a basis for deciding the amount of recent historical data to be used for estimation in (iii)

(iii) taking a position proportionate to the Z-score of the market value of the cointegrated portfolio (subtracting the recent mean and dividing by the recent standard deviation, where "recent" is defined with reference to the half-life of mean reversion)
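A minimal sketch of steps (ii) and (iii), assuming the cointegrated portfolio's market value is already available as a series (this is an illustration of the calculations, not the Matlab code from Ernie's book):

```python
# Sketch of steps (ii) and (iii): estimate the half-life of mean reversion via an
# Ornstein-Uhlenbeck style regression, then size positions by the z-score of the
# portfolio value over a half-life look-back window.
import numpy as np
import pandas as pd
import statsmodels.api as sm

def half_life(port_value: pd.Series) -> float:
    """Regress daily changes on the lagged level: dy_t = theta * y_{t-1} + c + e_t."""
    lagged = port_value.shift(1).dropna()
    delta = port_value.diff().dropna()
    theta = sm.OLS(delta, sm.add_constant(lagged)).fit().params.iloc[-1]
    return -np.log(2) / theta        # theta should be negative for mean reversion

def zscore_units(port_value: pd.Series, lookback: int) -> pd.Series:
    """Units of the portfolio to hold: minus the z-score over the look-back window."""
    mean = port_value.rolling(lookback).mean()
    std = port_value.rolling(lookback).std()
    return -(port_value - mean) / std
```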

Countless researchers have followed this well-worn track, many of them reporting excellent results.  In this post I would like to discuss a few of the many considerations in the procedure and variations in its implementation.  We will follow Ernie's example, using daily data for the EWA-EWC-IGE triplet of ETFs from April 2006 – April 2012. The analysis runs as follows (I am using an adapted version of the Matlab code provided with Ernie's book):

[Johansen test results]
We reject the null hypothesis of fewer than three cointegrating relationships at the 95% level. The eigenvalues and eigenvectors are as follows:

[Eigenvalues and eigenvectors]
The eigenvectors are sorted by the size of their eigenvalues, so we pick the first of them, which is expected to have the shortest half-life of mean reversion, and create a portfolio based on the eigenvector weights (-1.046, 0.76, 0.2233).  From there, it requires a simple linear regression to estimate the half-life of mean reversion:

[Half-life regression]
From this we estimate the half-life of mean reversion to be 23 days.  This estimate is used during the final stage (iii) of the process, when we choose a look-back period for estimating the running mean and standard deviation of the cointegrated portfolio.  The position in each stock (numUnits) is sized according to the standardized deviation from the mean (i.e. the greater the deviation, the larger the allocation). The results appear very promising, with an annual APR of 12.6% and a Sharpe ratio of 1.4:
[Cumulative returns: EWA-EWC-IGE]

Ernie is at pains to point out that, in this and other examples in the book, he pays no attention to transaction costs, nor to the out-of-sample performance of the strategies he evaluates, which is fair enough.

The great majority of the academic studies that examine the cointegration approach to statistical arbitrage for a variety of investment universes do take account of transaction costs.  For the most part such studies report very impressive returns and Sharpe ratios that frequently exceed 3.  Furthermore, unlike Ernie’s example which is entirely in-sample, these studies typically report consistent out-of-sample performance results also.

But the single, most common failing of such studies is that they fail to consider the per share performance of the strategy.  If the net P&L per share is less than the average bid-offer spread of the securities in the investment portfolio, the theoretical performance of the strategy is unlikely to survive the transition to implementation.  It is not at all hard to achieve a theoretical Sharpe ratio of 3 or higher, if you are prepared to ignore the fact that the net P&L per share is lower than the average bid-offer spread.  In practice, however, any such profits are likely to be whittled away to zero in trading frictions – the costs incurred in entering, adjusting and exiting positions across multiple symbols in the portfolio.

Put another way, you would want to see a P&L per share of at least 1c, after transaction costs, before contemplating implementation of the strategy.  In the case of the EWA-EWC-IGE portfolio the P&L per share is around 3.5 cents.  Even after allowing, say, commissions of 0.5 cents per share and a bid-offer spread of 1c per share on both entry and exit, there remains a profit of around 2 cents per share – more than enough to meet this threshold test.

Let's address the second concern regarding out-of-sample testing.   We'll introduce a parameter to allow us to select the number of in-sample days, re-estimate the model parameters using only the in-sample data, and test the performance out of sample.  With an in-sample size of 1,000 days, for instance, we find that we can no longer reject the null hypothesis of fewer than 3 cointegrating relationships, and the weights for the best linear portfolio differ significantly from those estimated using the entire data set.

[Johansen test results, in-sample data only]

Repeating the regression analysis using the eigenvector weights of the maximum eigenvalue vector (-1.4308, 0.6558, 0.5806), we now estimate the half-life to be only 14 days.  The out-of-sample APR of the strategy over the remaining 500 days drops to around 5.15%, with a considerably less impressive Sharpe ratio of only 1.09.

[Out-of-sample cumulative returns]

One way to improve the strategy performance is to relax the assumption of strict proportionality between the portfolio holdings and the standardized deviation in the market value of the cointegrated portfolio.  Instead, we now require the standardized deviation of the portfolio market value to exceed some chosen threshold level before we open a position (and we close any open positions when the deviation falls below the threshold).  If we choose a threshold level of 1 (i.e. we require the market value of the portfolio to deviate 1 standard deviation from its mean before opening a position), the out-of-sample performance improves considerably:

[Out-of-sample cumulative returns, with a 1-SD entry threshold]

The out-of-sample APR is now over 7%, with a Sharpe ratio of 1.45.

The strict proportionality requirement, while logical,  is rather unusual:  in practice, it is much more common to apply a threshold, as I have done here.  This addresses the need to ensure an adequate P&L per share, which will typically increase with higher thresholds.  A countervailing concern, however, is that as the threshold is increased the number of trades will decline, making the results less reliable statistically.  Balancing the two considerations, a threshold of around 1-2 standard deviations is a popular and sensible choice.

Of course, introducing thresholds opens up a new set of possibilities:  just because you decide to enter based on a 2x SD trigger level doesn’t mean that you have to exit a position at the same level.  You might consider the outcome of entering at 2x SD, while exiting at 1x SD, 0x SD, or even -2x SD.  The possible nuances are endless.
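A minimal sketch of this kind of asymmetric entry/exit logic on the z-score of the portfolio value is shown below; the entry and exit levels are illustrative, and the function produces a target number of portfolio units only.

```python
# Sketch: threshold-based entry and exit on the z-score of the portfolio value,
# with separate entry and exit levels (e.g. enter at 2 standard deviations,
# exit when the z-score reverts inside a tighter band).
import pandas as pd

def threshold_positions(zscore: pd.Series, entry: float = 2.0,
                        exit_level: float = 0.5) -> pd.Series:
    """Target units of the cointegrated portfolio: short when the z-score exceeds
    +entry, long when it falls below -entry, flat once it reverts inside
    +/- exit_level. Positions are carried in between."""
    positions, current = [], 0.0
    for z in zscore:
        if current == 0.0:
            if z > entry:
                current = -1.0
            elif z < -entry:
                current = 1.0
        elif abs(z) < exit_level:
            current = 0.0
        positions.append(current)
    return pd.Series(positions, index=zscore.index)
```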

Unfortunately, the inconsistency in the estimates of the cointegrating relationships over different data samples is very common.  In fact, from my own research, it is often the case that cointegrating relationships break down entirely out-of-sample, just as correlations do.  A recent study by Matthew Clegg of over 860,000 pairs (On the Persistence of Cointegration in Pairs Trading, 2014) confirms that cointegration is not a persistent property.

I shall examine one approach to  addressing the shortcomings  of the cointegration methodology  in a future post.

 

The Matlab code (adapted from Ernie Chan's book) is available in the original post.


The Correlation Signal

The use of correlations is widespread in investment management theory and practice, from the construction of portfolios to the design of hedge trades to statistical arbitrage strategies.

A common difficulty encountered in all of these applications is the variation in correlation: assets that at one time appear to be suitably uncorrelated for hedging purposes, may become much more highly correlated at other times, such as periods of market stress. Conversely, stocks that appear suitable for pairs trading due to the high correlation in their prices or returns, may de-couple at a later time, causing significant losses.

The instability in the level of correlation is further aggravated by the empirical finding that the volatility in correlation is itself time-dependent:  at times the correlations between assets may appear to fluctuate smoothly within a tight range; at other times we might see several fluctuations in the sign of the correlation  coefficient over the course of a few days.

One tool I have found useful in this context is a concept I refer to as the correlation signal, defined as the average correlation divided by the standard deviation of the correlation coefficient.  The chart below illustrates a typical pattern for a pair of Oil and Gas industry stocks.  The blue line is the average daily correlation between the stocks, measured at 5-minute intervals.  The red line is the correlation signal – the average daily correlation divided by the standard deviation in the intra-day correlation.  The stochastic nature of both the correlation coefficient and the correlation signal is quite evident.  Note that the correlation signal, unlike the coefficient, is not constrained within the limits of +/- 1.  At times when the variation in correlation is low the signal can easily exceed those limits by as much as an order of magnitude.

[Chart: average daily correlation and correlation signal for a pair of Oil and Gas stocks]
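One reasonable way to compute the signal from 5-minute bars is sketched below; the rolling window used for the intraday correlation estimates is an assumption, as is the exact definition, which is not spelled out in full above.

```python
# Sketch: the "correlation signal" - the average daily correlation between two
# stocks divided by the standard deviation of the intraday correlation estimates.
import pandas as pd

def correlation_signal(ret_a: pd.Series, ret_b: pd.Series, window: int = 12) -> pd.DataFrame:
    """ret_a, ret_b: 5-minute returns with a DatetimeIndex.
    window: number of 5-minute bars per rolling correlation estimate (12 = one hour)."""
    rolling_corr = ret_a.rolling(window).corr(ret_b)
    daily = rolling_corr.groupby(rolling_corr.index.date)
    return pd.DataFrame({"avg_correlation": daily.mean(),
                         "correlation_signal": daily.mean() / daily.std()})
```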

In later posts I will illustrate the usefulness of the correlation signal in portfolio construction and statistical arbitrage.  For now, let me just say that it is a measure of the strength of the correlation as a signal, relative to the noise of random variation in the correlation process.   It can be used to identify situations in which a relationship – whether a positive or negative correlation – appears to be stable or unstable, and therefore viable as a basis for inference, or not.