Measuring Toxic Flow for Trading & Risk Management

A common theme of microstructure modeling is that trade flow is often predictive of market direction.  One concept in particular that has gained traction is flow toxicity, i.e. flow where resting orders tend to be filled more quickly than expected, while aggressive orders rarely get filled at all, due to the participation of informed traders trading against uninformed traders.  The fundamental insight from microstructure research is that the order arrival process is informative of subsequent price moves in general and toxic flow in particular.  This in turn has led researchers to try to measure the probability of informed trading (PIN).  One recent attempt to model flow toxicity, the Volume-Synchronized Probability of Informed Trading (VPIN) metric, seeks to estimate PIN based on volume imbalance and trade intensity.  A major advantage of this approach is that it does not require the estimation of unobservable parameters; additionally, updating VPIN in trade time rather than clock time improves its predictive power.  VPIN has potential applications both in high frequency trading strategies and in risk management, since highly toxic flow is likely to lead to the withdrawal of liquidity providers, setting up the conditions for a "flash-crash" type of market breakdown.

The procedure for estimating VPIN is as follows.  We begin by grouping sequential trades into equal volume buckets of size V.  If the last trade needed to complete a bucket is for a size greater than needed, the excess size is carried over to the next bucket.  We then classify the volume within each bucket into two groups: Buys (V_t^B) and Sells (V_t^S), with V = V_t^B + V_t^S.
The Volume-Synchronized Probability of Informed Trading is then derived as:

VPIN = Σ_{τ=1}^{n} |V_τ^B − V_τ^S| / (n V)

Typically one might choose to estimate VPIN using a moving average over n buckets, with n being in the range of 50 to 100.
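
To make the procedure concrete, here is a minimal Python sketch of the bucketing and VPIN calculation described above.  It assumes trades have already been classified as buyer- or seller-initiated (in practice this would be done with the tick rule or bulk volume classification); the function and parameter names are illustrative.

```python
import numpy as np
import pandas as pd

def vpin(volumes, signs, bucket_size, n_buckets=50):
    """Sketch of the VPIN procedure described above.

    volumes     : array of trade sizes, in time order
    signs       : +1 for buyer-initiated, -1 for seller-initiated trades
                  (classification, e.g. by the tick rule, is assumed done upstream)
    bucket_size : V, the fixed volume per bucket
    n_buckets   : n, the window for the moving average
    """
    buy_vol, sell_vol = [], []
    cur_buy = cur_sell = 0.0
    for v, s in zip(volumes, signs):
        remaining = float(v)
        while remaining > 0:
            space = bucket_size - (cur_buy + cur_sell)
            take = min(remaining, space)           # excess size rolls into the next bucket
            if s > 0:
                cur_buy += take
            else:
                cur_sell += take
            remaining -= take
            if cur_buy + cur_sell >= bucket_size:  # bucket complete
                buy_vol.append(cur_buy)
                sell_vol.append(cur_sell)
                cur_buy = cur_sell = 0.0
    buy_vol, sell_vol = np.array(buy_vol), np.array(sell_vol)
    imbalance = np.abs(buy_vol - sell_vol) / bucket_size   # |V_B - V_S| / V per bucket
    return pd.Series(imbalance).rolling(n_buckets).mean()  # VPIN, in bucket (trade) time
```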

Another related statistic of interest is the single-period signed VPIN, (V_t^B − V_t^S)/V. This takes a value between -1 and +1, depending on the proportion of buying to selling during a single period t.


Fig 1. Single-Period Signed VPIN for the ES Futures Contract

It turns out that quote revisions condition strongly on the signed VPIN. For example, in tests of the ES futures contract, we found that the change in the midprice from one volume bucket to the next was highly correlated with the prior bucket's signed VPIN, with a coefficient of 0.5.  In other words, market participants offering liquidity will adjust their quotes in a way that directly reflects the direction and intensity of toxic flow, which is perhaps hardly surprising.

Of greater interest is the finding that there is a small but statistically significant dependency of price changes, as measured from the first buy (sell) trade price to the last sell (buy) trade price, on the prior period's signed VPIN.  The correlation is positive, meaning that strongly toxic flow in one direction has a tendency to push prices in the same direction during the subsequent period. Moreover, the single-period signed VPIN turns out to be somewhat predictable, since its autocorrelations are statistically significant at two or more lags.  A simple linear ARMA(2,1) model produces an R-square of around 7%, which is small, but statistically significant.

A more useful model, however, can be constructed by introducing the idea of Markov states and allowing the regression model to assume different parameter values (and error variances) in each state.  In the Markov-state framework, the system transitions from one state to another with conditional probabilities that are estimated in the model.


An example of such a model  for the signed VPIN in ES is shown below. Note that the model R-square is over 27%, around 4x larger than for a standard linear ARMA model.

We can describe the regime-switching model in the following terms.  In the regime 1 state  the model has two significant autoregressive terms and one significant moving average term (ARMA(2,1)).  The AR1 term is large and positive, suggesting that trends in VPIN tend to be reinforced from one period to the next. In other words, this is a momentum state. In the regime 2 state the AR2 term is not significant and the AR1 term is large and negative, suggesting that changes in VPIN in one period tend to be reversed in the following period, i.e. this is a mean-reversion state.

The state transition probabilities indicate that the system is in mean-reversion mode for the majority of the time, around 2 periods out of 3.  During these periods, excessive flow in one direction during one period tends to be corrected in the ensuing period.  But in the less frequently occurring state 1, excess flow in one direction tends to produce even more flow in the same direction in the following period.  This first state, then, may be regarded as the regime characterized by toxic flow.

Markov State Regime-Switching Model

Markov Transition Probabilities

                    P(.|1)       P(.|2)
P(1|.)             0.54916      0.27782
P(2|.)             0.45084      0.7221

Regime 1:

                              Estimate  Std. Err.   t Ratio  p-Value
AR1                            1.35502    0.02657    50.998        0
AR2                           -0.33687    0.02354   -14.311        0
MA1                            0.83662    0.01679    49.828        0
Error Variance^(1/2)           0.36294     0.0058

Regime 2:

                              Estimate  Std. Err.   t Ratio  p-Value
AR1                           -0.68268    0.08479    -8.051        0
AR2                            0.00548    0.01854     0.296    0.767
MA1                           -0.70513    0.08436    -8.359        0
Error Variance^(1/2)           0.42281     0.0016

Log Likelihood = -33390.6

Schwarz Criterion = -33445.7

Hannan-Quinn Criterion = -33414.6

Akaike Criterion = -33400.6

Sum of Squares = 8955.38

R-Squared =  0.2753

R-Bar-Squared =  0.2752

Residual SD =  0.3847

Residual Skewness = -0.0194

Residual Kurtosis =  2.5332

Jarque-Bera Test = 553.472     {0}

Box-Pierce (residuals):         Q(9) = 13.9395 {0.124}

Box-Pierce (squared residuals): Q(12) = 743.161     {0}

 

A Simple Trading Strategy

One way to try to monetize the predictability of the VPIN model is to use the forecasts to take directional positions in the ES contract.  In this simple simulation we assume that we enter a long (short) position at the first buy (sell) price if the forecast VPIN exceeds a threshold value of 0.1 (falls below -0.1).  The simulation assumes that we exit the position at the end of the current volume bucket, at the last sell (buy) trade price in the bucket.
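
For illustration, the simulation logic can be sketched as follows, assuming a DataFrame of completed volume buckets containing the forecast signed VPIN and the first/last buy and sell trade prices for each bucket.  The column names, threshold and ES point value are assumptions for the sketch, not taken from the original test.

```python
import numpy as np
import pandas as pd

def simulate_vpin_strategy(buckets: pd.DataFrame, threshold=0.1, point_value=50.0):
    """Per-bucket P&L for the simple rule described above (one contract)."""
    f = buckets["vpin_fcast"].to_numpy()           # forecast signed VPIN for the bucket
    first_buy = buckets["first_buy"].to_numpy()    # entry price for longs
    last_sell = buckets["last_sell"].to_numpy()    # exit price for longs
    first_sell = buckets["first_sell"].to_numpy()  # entry price for shorts
    last_buy = buckets["last_buy"].to_numpy()      # exit price for shorts
    pnl = np.where(f > threshold, (last_sell - first_buy) * point_value,
          np.where(f < -threshold, (first_sell - last_buy) * point_value, 0.0))
    return pd.Series(pnl, index=buckets.index)

# cumulative_pl = simulate_vpin_strategy(buckets).cumsum()
```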

This simple strategy made 1024 trades over a 5-day period from 8/8 to 8/14, 90% of which were profitable, for a total of $7,675 – i.e. around ½ tick per trade.

The simulation is, of course, unrealistically simplistic, but it does give an indication of the prospects for a more realistic version of the strategy in which, for example, we might rest an order on one side of the book, depending on our VPIN forecast.


Figure 2 – Cumulative Trade PL

References

Easley, D., Lopez de Prado, M. and O'Hara, M. (2011), "Flow Toxicity and Volatility in a High Frequency World", Johnson School Research Paper Series No. 09-2011.

Easley, D. and O'Hara, M. (1987), "Price, Trade Size, and Information in Securities Markets", Journal of Financial Economics, 19.

Easley, D. and O'Hara, M. (1992a), "Adverse Selection and Large Trade Volume: The Implications for Market Efficiency", Journal of Financial and Quantitative Analysis, 27(2), June, 185-208.

Easley, D. and O'Hara, M. (1992b), "Time and the Process of Security Price Adjustment", Journal of Finance, 47, 576-605.

 

Forecasting Financial Markets – Part 1: Time Series Analysis

The presentation in this post covers a number of important topics in forecasting, including:

  • Stationary processes and random walks
  • Unit roots and autocorrelation
  • ARMA models
  • Seasonality
  • Model testing
  • Forecasting
  • Dickey-Fuller and Phillips-Perron tests for unit roots

Also included are a number of detailed worked examples, including:

  1. ARMA Modeling
  2. Box Jenkins methodology
  3. Modeling the US Wholesale Price Index
  4. Pesaran & Timmermann study of excess equity returns
  5. Purchasing Power Parity

 

Forecasting 2011 - Time Series

 

Master’s in High Frequency Finance

I have been discussing with some potential academic partners the concept for a new graduate program in High Frequency Finance.  The idea is to take the concept of the Computational Finance program developed in the 1990s and update it to meet the needs of students in the 2010s.

The program will offer a thorough grounding in the modeling concepts, trading strategies and risk management procedures currently in use by leading investment banks, proprietary trading firms and hedge funds in US and international financial markets.  Students will also learn the necessary programming and systems design skills to enable them to make an effective contribution as quantitative analysts, traders, risk managers and developers.

I would be interested in feedback and suggestions as to the proposed content of the program.

A Practical Application of Regime Switching Models to Pairs Trading

In the previous post I outlined some of the available techniques used for modeling market states.  The following is an illustration of how these techniques can be applied in practice.    You can download this post in pdf format here.


The chart below shows the daily compounded returns for a single pair in an ETF statistical arbitrage strategy, back-tested over a 1-year period from April 2010 to March 2011.

The idea is to examine the characteristics of the returns process and assess its predictability.

Fig 1. Daily Compounded Returns for the ETF Pair

The initial impression given by the analytics plots of daily returns, shown in Fig 2 below, is that the process may be somewhat predictable, given what appears to be a significant lag-1 term in the autocorrelation spectrum.  We also see evidence of the customary non-Gaussian "fat-tailed" distribution in the error process.

Fig 2. Analytics Plots of Daily Returns

An initial attempt to fit a standard autoregressive moving average ARMA(1,1) model yields disappointing results, with an unadjusted model R-squared of only 7% (see model output in Appendix 1).

However, by fitting a 2-state Markov model we are able to explain as much as 65% of the variation in the returns process (see Appendix II).
The model estimates the Markov transition probabilities as follows.

                    P(.|1)       P(.|2)
P(1|.)             0.93920      0.69781
P(2|.)            0.060802      0.30219

In other words, the process spends most of the time in State 1, switching to State 2 around once a month, as illustrated in Fig 3 below.

Fig 3. Estimated Market States Over Time
In the first state, the  pairs model produces an expected daily return of around 65bp, with a standard deviation of similar magnitude.  In this state, the process also exhibits very significant auto-regressive and moving average features.

Regime 1:

                              Estimate  Std. Err.   t Ratio  p-Value
Intercept                      0.00648     0.0009       7.2        0
AR1                            0.92569    0.01897    48.797        0
MA1                           -0.96264    0.02111   -45.601        0
Error Variance^(1/2)           0.00666     0.0007

In the second state, the pairs model produces lower average returns, with much greater variability, while the autoregressive and moving average terms are poorly determined.

Regime 2:

                              Estimate  Std. Err.   t Ratio  p-Value
Intercept                      0.03554    0.04778     0.744    0.459
AR1                            0.79349    0.06418    12.364        0
MA1                           -0.76904    0.51601     -1.49    0.139
Error Variance^(1/2)           0.01819     0.0031

CONCLUSION
The analysis in Appendix II suggests that the residual process is stable and Gaussian.  In other words, the two-state Markov model is able to account for the non-Normality of the returns process and extract the salient autoregressive and moving average features in a way that makes economic sense.

How is this information useful?  Potentially in two ways:

(i)     If the market state can be forecast successfully, we can use that information to increase our capital allocation during periods when the process is predicted to be in State 1, and reduce the allocation at times when it is in State 2.

(ii)    By examining the timing of the Markov states and considering different features of the market during the contrasting periods, we might be able to identify additional explanatory factors that could be used to further enhance the trading model.


Regime-Switching & Market State Modeling

The Excel workbook referred to in this post can be downloaded here.

Market state models are amongst the most useful analytical techniques for developing alpha-signal generators.  That term covers a great deal of ground, with ideas drawn from statistics, econometrics, physics and bioinformatics.  The purpose of this short note is to provide an introduction to some of the key ideas and to suggest ways in which they might usefully be applied in the context of researching and developing trading systems.

Although they come from different origins, the concepts presented here share common foundational principles:

  1. Markets operate in different states that may be characterized by various measures (volatility, correlation, microstructure, etc);
  2. Alpha signals can be generated more effectively by developing models that are adapted to take account of different market regimes;
  3. Alpha signals may be combined together effectively by taking account of the various states that a market may be in.

Market state models have shown great promise in a variety of applications within the field of applied econometrics in finance, not only for price and market direction forecasting, but also for basis trading, index arbitrage, statistical arbitrage, portfolio construction, capital allocation and risk management.

REGIME SWITCHING MODELS

These are econometric models which seek to use statistical techniques to characterize market states in terms of different estimates of the parameters of some underlying linear model.  This is accompanied by a transition matrix which estimates the probability of moving from one state to another.

To illustrate this approach I have constructed a simple example, given in the accompanying Excel workbook.  In this model the market operates as follows:

Y_t = a_0(S) + a_1(S) Y_{t-1} + b_1(S) e_{t-1} + e_t

Where

Yt is a variable of interest (e.g. the return in an asset over the next period t) 

e_t is an error process with constant variance σ²

S is the market state, with two regimes (S=1 or S=2) 

a0 is the drift in the asset process 

a1 is an autoregressive term, by which the return in the current period is dependent on the prior period return 

b1 is a moving average term, which smoothes the error process 

This is one of the simplest possible structures, which in more general form can include multiple states, and independent regressors X_i as explanatory variables (such as book pressure, order flow, etc):

Y_t = a_0(S) + Σ_i a_i(S) Y_{t-i} + Σ_j c_j(S) X_{j,t} + Σ_k b_k(S) e_{t-k} + e_t

 


The form of the error process et may also be dependent on the market state.  It may simply be that, as in this example, the standard deviation of the error process changes from state to state.  But the changes can also be much more complex:  for instance, the error process may be non-Gaussian, or it may follow a formulation from the GARCH framework.

In this example the state parameters are as follows:

              Reg 1      Reg 2
σ              0.01       0.02
a0            0.005     -0.015
a1             0.40       0.70
b1             0.10       0.20

What this means is that, in the first state the market tends to trend upwards with relatively low volatility.  In the second state, not only is market volatility much higher, but also the trend is 3x as large in the negative direction.

I have specified the following state transition matrix:

              Reg 1      Reg 2
Reg 1          0.85       0.15
Reg 2          0.90       0.10

This is interpreted as follows:  if the market is in State 1, it will tend to remain in that state 85% of the time, transitioning to State 2 15% of the time.  Once in State 2, the market tends to revert to State 1 very quickly, with 90% probability.  So the system is in State 1 most of the time, trending slowly upwards with low volatility and occasionally flipping into an aggressively downward trending phase with much higher volatility.

The Generate sheet in the Excel workbook shows how observations are generated from this process, from which we select a single instance of 3,000 observations, shown in sheet named Sample.
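
For readers who prefer code to spreadsheets, the same data-generating process can be replicated in a few lines of Python.  This is a minimal sketch using the parameter values and transition matrix given above; the state indices (0 and 1 for States 1 and 2), function name and random seed are illustrative.

```python
import numpy as np

# Parameter values and transition matrix from the example above
# (index 0 = State 1, index 1 = State 2)
a0 = np.array([0.005, -0.015])   # drift
a1 = np.array([0.40, 0.70])      # autoregressive coefficient
b1 = np.array([0.10, 0.20])      # moving average coefficient
sigma = np.array([0.01, 0.02])   # error standard deviation
P = np.array([[0.85, 0.15],      # row = current state, column = next state
              [0.90, 0.10]])

def simulate(n=3000, seed=42):
    """Generate n observations from the two-state Markov-switching ARMA(1,1) process."""
    rng = np.random.default_rng(seed)
    y, e = np.zeros(n), np.zeros(n)
    s = np.zeros(n, dtype=int)
    for t in range(1, n):
        s[t] = rng.choice(2, p=P[s[t - 1]])   # draw next state
        e[t] = rng.normal(0.0, sigma[s[t]])   # state-dependent error
        y[t] = a0[s[t]] + a1[s[t]] * y[t - 1] + b1[s[t]] * e[t - 1] + e[t]
    return y, s

returns, states = simulate()
print("Proportion of time in State 1:", (states == 0).mean())
```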

The sample looks like this:

 

Market state 
 
 

 As anticipated, the market is in State 1 most of the time, occasionally flipping into State 2 for brief periods.

Market state 

 It is well-known that in financial markets we are typically dealing with highly non-Gaussian distributions.  Non-Normality can arise for a number of reasons, including changes in regimes, as illustrated here.  It is worth noting that, even though in this example the process in either market state follows a Gaussian distribution, the combined process is distinctly non-Gaussian in form, having (extremely) fat tails, as shown by the QQ-plot below.
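
The same diagnosis can be reproduced directly from the simulated series; a quick check, assuming the `returns` array produced by the earlier sketch:

```python
import matplotlib.pyplot as plt
from scipy import stats

print("Excess kurtosis:", stats.kurtosis(returns))   # > 0 indicates fat tails
stats.probplot(returns, dist="norm", plot=plt)       # QQ-plot against the Normal
plt.show()
```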

 

 Market state

If we attempt to fit a standard ARMA model to the process, the outcome is very disappointing in terms of the model's poor explanatory power (R-squared of only 0.5%) and lack of fit in the squared residuals:

 

 

ARIMA(1,0,1)

         Estimate  Std. Err.   t Ratio  p-Value

Intercept                      0.00037    0.00032     1.164    0.244

AR1                            0.57261     0.1697     3.374    0.001

MA1                           -0.63292    0.16163    -3.916        0

Error Variance^(1/2)           0.02015     0.0004    ——   ——

                       Log Likelihood = 7451.96

                    Schwarz Criterion = 7435.95

               Hannan-Quinn Criterion = 7443.64

                     Akaike Criterion = 7447.96

                       Sum of Squares =  1.2172

                            R-Squared =  0.0054

                        R-Bar-Squared =  0.0044

                          Residual SD =  0.0202

                    Residual Skewness = -2.1345

                    Residual Kurtosis =  5.7279

                     Jarque-Bera Test = 3206.15     {0}

Box-Pierce (residuals):         Q(48) = 59.9785 {0.115}

Box-Pierce (squared residuals): Q(50) = 78.2253 {0.007}

              Durbin Watson Statistic = 2.01392

                    KPSS test of I(0) =  0.2001    {<1} *

                 Lo’s RS test of I(0) =  1.2259  {<0.5} *

Nyblom-Hansen Stability Test:  NH(4)  =  0.5275    {<1}

MA form is 1 + a_1 L +…+ a_q L^q.

Covariance matrix from robust formula.

* KPSS, RS bandwidth = 0.

Parzen HAC kernel with Newey-West plug-in bandwidth.

 

 

However, if we keep the same simple form of ARMA(1,1) model, but allow for the possibility of a two-state Markov process, the picture alters dramatically:  now the model is able to account for 98% of the variation in the process, as shown below.

 

Notice that we have succeeded in estimating the correct underlying transition probabilities, and how the ARMA model parameters change from regime to regime much as they should (small positive drift in one regime, large negative drift in the second, etc).

 

Markov Transition Probabilities

                    P(.|1)       P(.|2)

P(1|.)            0.080265      0.14613

P(2|.)             0.91973      0.85387

 

                              Estimate  Std. Err.   t Ratio  p-Value

Logistic, t(1,1)              -2.43875     0.1821    ——   ——

Logistic, t(1,2)              -1.76531     0.0558    ——   ——

Non-switching parameters shown as Regime 1.

 

Regime 1:

Intercept                     -0.05615    0.00315   -17.826        0

AR1                            0.70864    0.16008     4.427        0

MA1                           -0.67382    0.16787    -4.014        0

Error Variance^(1/2)           0.00244     0.0001    ——   ——

 

Regime 2:

Intercept                      0.00838     2e-005   419.246        0

AR1                            0.26716    0.08347     3.201    0.001

MA1                           -0.26592    0.08339    -3.189    0.001

 

                       Log Likelihood = 12593.3

                    Schwarz Criterion = 12557.2

               Hannan-Quinn Criterion = 12574.5

                     Akaike Criterion = 12584.3

                       Sum of Squares =  0.0178

                            R-Squared =  0.9854

                        R-Bar-Squared =  0.9854

                          Residual SD =  0.002

                    Residual Skewness = -0.0483

                    Residual Kurtosis = 13.8765

                     Jarque-Bera Test = 14778.5     {0}

Box-Pierce (residuals):         Q(48) = 379.511     {0}

Box-Pierce (squared residuals): Q(50) = 36.8248 {0.917}

              Durbin Watson Statistic = 1.50589

                    KPSS test of I(0) =  0.2332    {<1} *

                 Lo’s RS test of I(0) =  2.1352 {<0.005} *

Nyblom-Hansen Stability Test:  NH(9)  =  0.8396    {<1}

MA form is 1 + a_1 L +…+ a_q L^q.

Covariance matrix from robust formula.

* KPSS, RS bandwidth = 0.

Parzen HAC kernel with Newey-West plug-in bandwidth.


There are a variety of types of regime switching mechanisms we can use in state models:

  • Hamiltonian – the simplest, where the process mean and variance vary from state to state
  • Markovian – the approach used here, with a state transition matrix
  • Explained Switching – where the process changes state as a result of the influence of some underlying variable (such as interest rate volatility, for example)
  • Smooth Transition – comparable to explained Markov switching, but without an explicitly probabilistic interpretation

 

 

This example is both rather simplistic and pathological at the same time: the states are well-separated, by design, whereas for real processes they tend to be much harder to distinguish.  A difficulty of this methodology is that the models can be very difficult to estimate.  The likelihood function tends to be very flat and there are a great many local maxima that give similar fit, but with widely varying model forms and parameter estimates.  That said, this is a very rich class of models with a great many potential applications.
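
As a concrete illustration of the estimation step, a two-state Markov-switching autoregression can be fitted in Python with statsmodels.  Note that statsmodels estimates switching AR coefficients and variances but not switching MA terms, so this is only an approximation to the ARMA(1,1) specification used above; the input series is assumed to be the simulated sample generated earlier.

```python
import statsmodels.api as sm

# `returns` is the simulated series from the earlier sketch (or any observed series)
mod = sm.tsa.MarkovAutoregression(
    returns, k_regimes=2, order=1,
    switching_ar=True, switching_variance=True, trend="c",
)
res = mod.fit(search_reps=20)   # random restarts help with the flat likelihood surface
print(res.summary())                          # includes the transition probabilities
print(res.expected_durations)                 # average time spent in each regime
probs = res.smoothed_marginal_probabilities   # P(regime | full sample) per observation
```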

Volatility Forecasting in Emerging Markets

The great majority of empirical studies have focused on asset markets in the US and other developed economies.  The purpose of this research is to determine to what extent the findings of other researchers in relation to the characteristics of asset volatility in developed economies apply also to emerging markets.  The important characteristics observed in asset volatility that we wish to identify and examine in emerging markets include clustering (the tendency for periodic regimes of high or low volatility), long memory, asymmetry, and correlation with the underlying returns process.  The extent to which such behaviors are present in emerging markets will serve to confirm or refute the conjecture that they are universal and not just the product of some factors specific to the intensely scrutinized and widely traded developed markets.

The ten emerging markets we consider comprise equity markets in Australia, Hong Kong, Indonesia, Malaysia, New Zealand, Philippines, Singapore, South Korea, Sri Lanka and Taiwan, focusing on the major market indices for those markets.  After analyzing the characteristics of index volatility for these indices, the research goes on to develop single- and two-factor REGARCH models in the form proposed by Alizadeh, Brandt and Diebold (2002).

Cluster Analysis of Volatility
Processes for Ten Emerging Market Indices

The research confirms the presence of a number of typical characteristics of volatility processes for emerging markets that have previously been identified in empirical research conducted in developed markets.  These characteristics include volatility clustering, long memory, and asymmetry.   There appears to be strong evidence of a region-wide regime shift in volatility processes during the Asian crises in 1997, and a less prevalent regime shift in September 2001. We find evidence from multivariate analysis that the sample separates into two distinct groups:  a lower volatility group comprising the Australian and New Zealand indices and a higher volatility group comprising the majority of the other indices.


Models developed within the single- and two-factor REGARCH framework of Alizadeh, Brandt and Diebold (2002) provide a good fit for many of the volatility series and in many cases have performance characteristics that compare favorably with other classes of models with high R-squares, low MAPE and direction prediction accuracy of 70% or more.   On the debit side, many of the models demonstrate considerable variation in explanatory power over time, often associated with regime shifts or major market events, and this is typically accompanied by some model parameter drift and/or instability.

Single equation ARFIMA-GARCH models appear to be a robust and reliable framework for modeling asset volatility processes, as they are capable of capturing both the short- and long-memory effects in the volatility processes, as well as GARCH effects in the kurtosis process.   The available procedures for estimating the degree of fractional integration in the volatility processes produce estimates that appear to vary widely for processes which include both short- and long- memory effects, but the overall conclusion is that long memory effects are at least as important as they are for volatility processes in developed markets.  Simple extensions to the single-equation models, which include regressor lags of related volatility series, add significant explanatory power to the models and suggest the existence of Granger-causality relationships between processes.

Extending the modeling procedures into the realm of models which incorporate systems of equations provides evidence of two-way Granger causality between certain of the volatility processes and suggests that they are fractionally cointegrated, a finding shared with parallel studies of volatility processes in developed markets.

Download paper here.

Resources for Quantitative Analysts

Two of the smartest econometricians I know are Prof. Stephen Taylor of Lancaster University, and Prof. James Davidson of Exeter University.

I recall spending many profitable hours in the 1980s with Stephen's book Modelling Financial Time Series, which I am pleased to see has now been reprinted in a second edition.  For a long time this was the best available book on the topic and it remains a classic. It has been surpassed by very few books, one being Stephen's later work Asset Price Dynamics, Volatility and Prediction.  This is a superb exposition, one that will repay close study.

James Davidson is one of the smartest minds in econometrics. Not only is his research of the highest caliber, he has somehow managed (in his spare time!) to develop one of the most advanced econometrics packages available.  Based on Jurgen Doornik’s Ox programming system, the Time Series Modelling package covers almost every conceivable model type, including regression models, ARIMA, ARFIMA and other single equation models, systems of equations, panel data models, GARCH and other heteroscedastic models and regime switching models, accompanied by very comprehensive statistical testing capabilities.  Furthermore, TSM is very well documented and despite being arguably the most advanced system of its kind it is inexpensive relative to alternatives.  James’s research output is voluminous and often highly complex.  His book, Econometric Theory, is an excellent guide to the state of the art, but not for the novice (or the faint hearted!).

Those looking for a kinder, gentler introduction to econometrics would do well to acquire a copy of Prof. Chris Brooks’s Introductory Econometrics for Finance. This covers most of the key ideas, from regression, through ARMA, GARCH, panel data models, cointegration, regime switching and volatility modeling.  Not only is the coverage comprehensive, Chris’s explanation of the concepts is delightfully clear and illustrated with interesting case studies which he analyzes using the EViews econometrics package.    Although not as advanced as TSM, EViews has everything that most quantitative analysts are likely to require in a modeling system and is very well suited to Chris’s teaching style.  Chris’s research output is enormous and covers a great many topics of interest to financial market analysts, in the same lucid style.

Understanding Stock Price Range Forecasts

Stock Price Range Forecasts

Range forecasts are produced by estimating the parameters of a Geometric Brownian Motion process from historical data and using the model to project a large number of sample paths for the stock price over the coming month and year.

For example, this is a range forecast for Netflix, Inc. (NFLX) as at 7/27/2018 when the price of the stock stood at $355.21:

$NFLX

As you can see, the great majority of the simulated price paths trend upwards.  This is typical for most stocks on account of their upward drift, a tendency to move higher over time.  The statistical table below the chart tells you that in 50% of cases the ending stock price 1 month from the date of forecast was in the range $352.15 to $402.49. Similarly, around 50% of the time the price of the stock in one year's time was found to be in the range $565.01 to $896.69.  Notice that the end points of the one-year range far exceed the end points of the one-month range forecast – again this is a feature of the upward drift in stocks.

If you want much greater certainty about the outcome, you should look at the 95% ranges.  So, for NFLX, the one month 95% range was projected to be $310.06 to $457.13.  Here, only 1 in 20 of the simulated price paths produced one month forecasts that were higher than $457.13, or lower than $310.06.

Notice that the spread of the one month and one year 95% ranges is much larger than that of the corresponding 50% ranges.  This demonstrates the fundamental tradeoff between "accuracy" (the spread of the range) and "certainty" (the probability of the outcome being within the projected range).  If you want greater certainty of the outcome, you have to allow for a broader span of possibilities, i.e. a wider range.


Uses of Range Forecasts

Most stock analysts tend to produce single price “targets”, rather than a range – these are known as “point forecasts” by econometricians.  So what’s the thinking behind range forecasts?

Range forecasts are arguably more useful than simple point forecasts.  Point forecasts make no guarantee as to the likelihood of the projected price – the only thing we know for sure about such forecasts is that they will be wrong!  Is the forecast target price optimistic or pessimistic?  We have no way to tell.

With range forecasts the situation is very different.  We can talk about the likelihood of a stock being within a specified range at a certain point in time.  If we want to provide a pessimistic forecast for the price in NFLX in one month’s time, for example, we could quote the value $352.15, the lower end of the 50% range forecast.  If we wanted to provide a very pessimistic forecast, one that is very likely to be exceeded, we could quote the bottom of the 95% range: $310.06.

The range also tells us about the future growth prospects for the firm.  So, for example, with NFLX, based on past performance, it is highly likely that the stock price will grow at a rate of more than 2.4% and, optimistically, might increase by almost 3x in the coming year (see the growth rates calculated for the 95% range values).

One specific use of range forecasts is in options trading.  If a trader is bullish on NFLX, instead of buying the stock, he might instead choose to sell one-month put options with a strike price below $352 (the lower end of the 50% one-month range).  If the trader wanted to be more conservative, he might look for put options struck at around $310, the bottom of the 95% range.  A more complex strategy might be to buy calls struck near the top of the 50% range, and sell more calls struck near the top of the 95% range (the theory being that the stock is quite likely to exceed the top of the 50% one-month range, but much less likely to reach the high end of the 95% range).

Limitations of Range Forecasts

Range forecasts are produced by using historical data to estimate the parameters of a particular type of mathematical model, known as a Geometric Brownian Motion process.  For those who are interested in the mechanics of how the forecasts are produced, I have summarized the relevant background theory below.

While there are grounds for challenging the use of such models in this context, it has to be acknowledged that the GBM process is one of the most successful mathematical models in finance today.  The problem lies not so much in the model as in one of the key assumptions underpinning the approach: specifically, that the characteristics of the stock process will remain as they are today (and as they have been in the historical past).  This assumption is manifestly untenable when applied to many stocks: a company that was a high-growth $100M start-up is unlikely to demonstrate the same rate of growth ten years later, as a $10Bn enterprise.  A company like Amazon that started out as an online book seller has fundamentally different characteristics today, as an online retail empire.  In such cases, forecasts about the future stock price – whether point or range forecasts – based on outdated historical information are likely to be wrong, sometimes wildly so.

Having said that, there are a great many companies that have evolved to a point of relative stability over a period of perhaps several decades: for example, a company like Caterpillar Inc. (CAT).  In such cases the parameters of the GBM process underpinning the stock price are unlikely to fluctuate widely in the short term, so range forecasts are consequently more likely to be useful.

Other factors to consider are quarterly earnings reports, which can influence stock prices considerably in the short term, and corporate actions (mergers, takeovers, etc) that can change the long term characteristics of a firm and its stock price process in a fundamental way.  In these situations any forecast methodology is likely to be unreliable, at least for a while, until the event has passed.  It's best to avoid taking positions based on projections from historical data at times like this.

Review of Background Theory

 

The stock price S_t is assumed to follow a Geometric Brownian Motion:

dS_t = μ S_t dt + σ S_t dW_t

which has the solution

S_t = S_0 exp[(μ − σ²/2) t + σ W_t]

so that log returns over an interval Δt are independent and normally distributed with mean (μ − σ²/2)Δt and variance σ²Δt.  The drift μ and volatility σ are estimated from the mean and standard deviation of historical daily log returns, and the fitted process is then used to simulate a large number of future price paths, from which the quoted percentile ranges are obtained.
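
A minimal sketch of how such a range forecast might be produced, assuming a history of daily closing prices.  The function name, parameters and the use of terminal-price simulation (rather than full sample paths) are illustrative choices, not a description of the production methodology.

```python
import numpy as np

def range_forecast(prices, horizon_days=21, n_paths=100_000, levels=(0.50, 0.95), seed=0):
    """Monte Carlo range forecast from a GBM fitted to a history of daily closes."""
    log_ret = np.diff(np.log(prices))
    drift = log_ret.mean()            # estimates (mu - sigma^2/2) per day
    vol = log_ret.std(ddof=1)         # estimates sigma per day
    z = np.random.default_rng(seed).standard_normal(n_paths)
    terminal = prices[-1] * np.exp(drift * horizon_days + vol * np.sqrt(horizon_days) * z)
    ranges = {}
    for lvl in levels:
        lo, hi = np.percentile(terminal, [50 * (1 - lvl), 50 * (1 + lvl)])
        ranges[lvl] = (lo, hi)        # e.g. the central 50% and 95% price ranges
    return ranges

# ranges = range_forecast(np.asarray(historical_closes), horizon_days=21)
```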

Identifying Drivers of Trading Strategy Performance

Building a winning strategy, like the one in the e-Mini S&P 500 futures described here, is only half the challenge: it remains for the strategy architect to gain an understanding of the sources of strategy alpha, and risk.  This means identifying the factors that drive strategy performance and, ideally, building a model so that their relative importance can be evaluated.  A more advanced step is the construction of a meta-model that will predict strategy performance and provide recommendations as to whether the strategy should be traded over the upcoming period.

Strategy Performance – Case Study

Let’s take a look at how this works in practice.  Our case study makes use of the following daytrading strategy in e-Mini futures.

Fig1

The overall performance of the strategy is quite good.  Average monthly PNL over the period from April to Oct 2015 is almost $8,000 per contract, after fees, with a standard deviation of only $5,500. That equates to an annual Sharpe Ratio in the region of 5.0.  On a decent execution platform the strategy should scale to around 10-15 contracts, with an annual PNL of around $1.0 to $1.5 million.

Looking into the performance more closely we find that the win rate (56%) and profit factor (1.43) are typical for a profitable strategy of medium frequency, trading around 20 times per session (in this case from 9:30AM to 4PM EST).

fig2

Another attractive feature of the strategy risk profile is the Maximum Adverse Excursion (MAE), the drawdown experienced in individual trades (rather than the realized drawdown). In the chart below we see that the MAE increases steadily, without major outliers, to a maximum of only around $1,000 per contract.

Fig3

One concern is that the average trade PL is rather small – $20, just over 1.5 ticks. Strategies that enter and exit with limit orders and have a small average trade are generally highly dependent on the fill rate – i.e. the proportion of limit orders that are filled.  If the fill rate is too low, the strategy will be left with too many missed trades on entry or exit, or both.  This is likely to damage strategy performance, perhaps to a significant degree – see, for example, my post on High Frequency Trading Strategies.

The fill rate is dependent on the number of limit orders posted at the extreme high or low of the bar, known as the extreme hit rate.  In this case the strategy has been designed specifically to operate at an extreme hit rate of only around 10%, which means that, on average, only around one trade in ten occurs at the high or low of the bar.  Consequently, the strategy is not highly fill-rate dependent and should execute satisfactorily even on a retail platform like Tradestation or Interactive Brokers.

Drivers of Strategy Performance

So far so good.  But before we put the strategy into production, let’s try to understand some of the key factors that determine its performance.  Hopefully that way we will be better placed to judge how profitable the strategy is likely to be as market conditions evolve.

In fact, we have already identified one potential key performance driver: the extreme hit rate (required fill rate) and determined that it is not a major concern in this case. However, in cases where the extreme hit rate rises to perhaps 20%, or more, the fill ratio is likely to become a major factor in determining the success of the strategy.  It would be highly inadvisable to attempt implementation of such a strategy on a retail platform.


What other factors might affect strategy performance?  The correct approach here is to apply the scientific method:  develop some theories about the drivers of performance and see if we can find evidence to support them.

For this case study we might conjecture that, since the strategy enters and exits using limit orders, it should exhibit characteristics of a mean reversion strategy, which will tend to do better when the market moves sideways and rather worse in a strongly trending market.

Another hypothesis is that, in common with most day-trading and high frequency strategies, this strategy will produce better results during periods of higher market volatility.  Empirically, HFT firms have always produced higher profits during volatile market conditions  – 2008 was a banner year for many of them, for example.  In broad terms, times when the market is whipsawing around create additional opportunities for strategies that seek to exploit temporary mis-pricings.  We shall attempt to qualify this general understanding shortly.  For now let’s try to gather some evidence that might support the hypotheses we have formulated.

I am going to take a very simple approach to this, using linear regression analysis.  It’s possible to do much more sophisticated analysis using nonlinear methods, including machine learning techniques. In our regression model the dependent variable will be the daily strategy returns.  In the first iteration, let’s use measures of market returns, trading volume and market volatility as the independent variables.
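
A bare-bones sketch of such a regression using statsmodels is shown below; the DataFrame layout and column names are assumptions for illustration, not the actual dataset used in the case study.  The results of the actual analysis appear in the figure that follows.

```python
import pandas as pd
import statsmodels.api as sm

# df is assumed to have one row per trading day with columns:
#   'strategy_ret'  - daily strategy return (the dependent variable)
#   'mkt_ret'       - S&P 500 / e-mini return
#   'volume'        - e-mini traded volume
#   'volatility'    - a daily volatility measure (e.g. realized vol or VIX)
def performance_regression(df: pd.DataFrame, lag: int = 0):
    X = df[["mkt_ret", "volume", "volatility"]].shift(lag)
    data = pd.concat([df["strategy_ret"], X], axis=1).dropna()
    model = sm.OLS(data["strategy_ret"], sm.add_constant(data[X.columns])).fit()
    return model

# print(performance_regression(df).summary())         # contemporaneous factors
# print(performance_regression(df, lag=1).summary())  # prior day's factors (second model)
```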

Fig4

The first surprise is the size of the (adjusted) R-squared: at 28%, this far exceeds the typical 5% to 10% level achieved in most such regression models when applied to trading systems.  In other words, this model does a very good job of accounting for a large proportion of the variation in strategy returns.

Note that the returns in the underlying S&P 500 index play no part (the coefficient is not statistically significant). We might expect this: ours is a trading strategy that is not specifically designed to be directional and has approximately equivalent performance characteristics on both the long and short side, as you can see from the performance report.

Now for the next surprise: the sign of the volatility coefficient.  Our ex-ante hypothesis is that the strategy would benefit from higher levels of market volatility.  In fact, the reverse appears to be true (due to the  negative coefficient).  How can this be?  On further reflection, the reason why most HFT strategies tend to benefit from higher market volatility is that they are momentum strategies.  A momentum strategy typically enters and exits using market orders and hence requires  a major market move to overcome the drag of the bid-offer spread (assuming it calls the market direction correctly!).  This strategy, by contrast, is a mean-reversion strategy, since entry/exits are effected using limit orders.  The strategy wants the S&P500 index to revert to the mean – a large move that continues in the same direction is going to hurt, not help, this strategy.

Note, by contrast, that the coefficient for the volume factor is positive and statistically significant.  Again this makes sense:  as anyone who has traded the e-mini futures overnight can tell you, the market tends to make major moves when volume is light – simply because it is easier to push around.  Conversely, during a heavy trading day there is likely to be significant opposition to a move in any direction.  In other words, the market is more likely to trade sideways on days when trading volume is high, and this is beneficial for our strategy.

The final surprise, and perhaps the greatest of all, is that the strategy alpha appears to be negative (and statistically significant)!  How can this be?  What the regression analysis appears to be telling us is that the strategy's performance is largely determined by two underlying factors, volume and volatility.

Let’s dig into this a little more deeply with another regression, this time relating the current day’s strategy return to the prior day’s volume, volatility and market return.

Fig5

In this regression model the strategy alpha is effectively zero and statistically insignificant, as is the case for lagged volume.  The strategy returns relate inversely to the prior day’s market return, which again appears to make sense for a mean reversion strategy:  our model anticipates that, in the mean, the market will reverse the prior day’s gain or loss.  The coefficient for the lagged volatility factor is once again negative and statistically significant.  This, too, makes sense:  volatility tends to be highly autocorrelated, so if the strategy performance is dependent on market volatility during the current session, it is likely to show dependency on volatility in the prior day’s session also.

So, in summary, we can provisionally conclude that:

This strategy has no market-directional predictive power: rather, it is a pure mean-reversion strategy that looks to make money by betting on a reversal of the prior session's market direction.  It will do better during periods when trading volume is high, and when market volatility is low.

Conclusion

Now that we have some understanding of where the strategy performance comes from, where do we go from here?  The next steps might include some, or all, of the following:

(i) A more sophisticated econometric model bringing in additional lags of the explanatory variables and allowing for interaction effects between them.

(ii) Introducing additional exogenous variables that may have predictive power. Depending on the nature of the strategy, likely candidates might include related equity indices and futures contracts.

(iii) Constructing a predictive model and meta-strategy that would enable us to assess the likely future performance of the strategy, and which could then be used to determine position size.  Machine learning techniques can often be helpful in this context.

I will give an example of the latter approach in my next post.

Developing Statistical Arbitrage Strategies Using Cointegration

In his latest book (Algorithmic Trading: Winning Strategies and their Rationale, Wiley, 2013) Ernie Chan does an excellent job of setting out the procedures for developing statistical arbitrage strategies using cointegration.  In such mean-reverting strategies, long positions are taken in under-performing stocks and short positions in stocks that have recently outperformed.

I will leave a detailed description of the procedure to Ernie (see pp 47 – 60), which in essence involves:

(i) estimating a cointegrating relationship between two or more stocks, using the Johansen procedure

(ii) computing the half-life of mean reversion of the cointegrated process, based on an Ornstein-Uhlenbeck  representation, using this as a basis for deciding the amount of recent historical data to be used for estimation in (iii)

(iii) Taking a position proportionate to the Z-score of the market value of the cointegrated portfolio (subtracting the recent mean and dividing by the recent standard deviation, where “recent” is defined with reference to the half-life of mean reversion)
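
For readers who want to experiment outside Matlab, the three steps above can be sketched in Python roughly as follows.  This is a simplified illustration, not Ernie's code: the helper name and the z_entry threshold parameter (which anticipates the threshold variant discussed later in this post) are assumptions.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.tsa.vector_ar.vecm import coint_johansen

def stat_arb_signals(prices: pd.DataFrame, z_entry=0.0):
    """Sketch of steps (i)-(iii) for a small basket of ETFs (e.g. EWA, EWC, IGE)."""
    # (i) Johansen test; use the eigenvector associated with the largest eigenvalue
    jres = coint_johansen(prices, det_order=0, k_ar_diff=1)
    w = jres.evec[:, np.argmax(jres.eig)]      # cointegrating weights
    port = prices.to_numpy() @ w               # market value of the cointegrated portfolio

    # (ii) half-life of mean reversion from an OLS fit of dy_t on y_{t-1}
    beta = sm.OLS(np.diff(port), sm.add_constant(port[:-1])).fit().params[1]
    half_life = -np.log(2) / beta
    lookback = max(int(round(half_life)), 1)

    # (iii) z-score of the portfolio value over the lookback window
    s = pd.Series(port, index=prices.index)
    z = (s - s.rolling(lookback).mean()) / s.rolling(lookback).std()

    # Position proportional to -z (buy when below the mean, sell when above);
    # with z_entry > 0 a position is only opened beyond the threshold.
    units = -z.where(z.abs() >= z_entry, 0.0)
    return w, half_life, units
```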

Countless researchers have followed this well-worn track, many of them reporting excellent results.  In this post I would like to discuss a few of the many considerations in the procedure and variations in its implementation.  We will follow Ernie's example, using daily data for the EWA-EWC-IGE triplet of ETFs from April 2006 – April 2012. The analysis runs as follows (I am using an adapted version of the Matlab code provided with Ernie's book):

Johansen test

We reject the null hypothesis of fewer than three cointegrating relationships at the 95% level. The eigenvalues and eigenvectors are as follows:

Eigenvalues

The eigenvectors are sorted by the size of their eigenvalues, so we pick the first of them, which is expected to have the shortest half-life of mean reversion, and create a portfolio based on the eigenvector weights (-1.046, 0.76, 0.2233).  From there, it requires a simple linear regression to estimate the half-life of mean reversion:

Halflife

From this we estimate the half-life of mean reversion to be 23 days.  This estimate gets used during the final stage (stage 3) of the process, when we choose a look-back period for estimating the running mean and standard deviation of the cointegrated portfolio.  The position in each stock (numUnits) is sized according to the standardized deviation from the mean (i.e. the greater the deviation, the larger the allocation).

The results appear very promising, with an annual APR of 12.6% and a Sharpe ratio of 1.4:

Returns EWA-EWC-IGE

Ernie is at pains to point out that, in this and other examples in the book, he pays no attention to transaction costs, nor to the out-of-sample performance of the strategies he evaluates, which is fair enough.

The great majority of the academic studies that examine the cointegration approach to statistical arbitrage for a variety of investment universes do take account of transaction costs.  For the most part such studies report very impressive returns and Sharpe ratios that frequently exceed 3.  Furthermore, unlike Ernie’s example which is entirely in-sample, these studies typically report consistent out-of-sample performance results also.

But the single, most common failing of such studies is that they fail to consider the per share performance of the strategy.  If the net P&L per share is less than the average bid-offer spread of the securities in the investment portfolio, the theoretical performance of the strategy is unlikely to survive the transition to implementation.  It is not at all hard to achieve a theoretical Sharpe ratio of 3 or higher, if you are prepared to ignore the fact that the net P&L per share is lower than the average bid-offer spread.  In practice, however, any such profits are likely to be whittled away to zero in trading frictions – the costs incurred in entering, adjusting and exiting positions across multiple symbols in the portfolio.

Put another way, you would want to see a P&L per share of at least 1c, after transaction costs, before contemplating implementation of the strategy.  In the case of the EWA-EWC-IGE portfolio the P&L per share is around 3.5 cents.  Even after allowing, say, commissions of 0.5 cents per share and a bid-offer spread of 1c per share on both entry and exit, there remains a profit of around 2 cents per share – more than enough to meet this threshold test.

Let’s address the second concern regarding out-of-sample testing.  We’ll introduce a parameter to allow us to select the number of in-sample days, re-estimate the model parameters using only the in-sample data, and test the performance out of sample.  With an in-sample size of 1,000 days, for instance, we find that we can no longer reject the null hypothesis of fewer than three cointegrating relationships and the weights for the best linear portfolio differ significantly from those estimated using the entire data set.

Johansen 2

Repeating the regression analysis using the eigenvector weights of the maximum eigenvalue vector (-1.4308, 0.6558, 0.5806), we now estimate the half-life to be only 14 days.  The out-of-sample APR of the strategy over the remaining 500 days drops to around 5.15%, with a considerably less impressive Sharpe ratio of only 1.09.

osPerf

Out-of-sample cumulative returns

One way to improve the strategy performance is to relax the assumption of strict proportionality between the portfolio holdings and the standardized deviation in the market value of the cointegrated portfolio.  Instead, we now require  the standardized deviation of the portfolio market value to exceed some chosen threshold level before we open a position (and we close any open positions when the deviation falls below the threshold).  If we choose a threshold level of 1, (i.e. we require the market value of the portfolio to deviate 1 standard deviation from its mean before opening a position), the out-of-sample performance improves considerably:

osPerf 2

The out-of-sample APR is now over 7%, with a Sharpe ratio of 1.45.

The strict proportionality requirement, while logical,  is rather unusual:  in practice, it is much more common to apply a threshold, as I have done here.  This addresses the need to ensure an adequate P&L per share, which will typically increase with higher thresholds.  A countervailing concern, however, is that as the threshold is increased the number of trades will decline, making the results less reliable statistically.  Balancing the two considerations, a threshold of around 1-2 standard deviations is a popular and sensible choice.

Of course, introducing thresholds opens up a new set of possibilities:  just because you decide to enter based on a 2x SD trigger level doesn’t mean that you have to exit a position at the same level.  You might consider the outcome of entering at 2x SD, while exiting at 1x SD, 0x SD, or even -2x SD.  The possible nuances are endless.

Unfortunately, the inconsistency in the estimates of the cointegrating relationships over different data samples is very common.  In fact, from my own research, it is often the case that cointegrating relationships break down entirely out of sample, just as correlations do.  A recent study by Matthew Clegg of over 860,000 pairs ("On the Persistence of Cointegration in Pairs Trading", 2014) confirms the finding that cointegration is not a persistent property.

I shall examine one approach to  addressing the shortcomings  of the cointegration methodology  in a future post.

 

Matlab code (adapted from Ernie Chan’s book):

Continue reading “Developing Statistical Arbitrage Strategies Using Cointegration”