Regime-Switching & Market State Modeling

The Excel workbook referred to in this post can be downloaded here.

Market state models are among the most useful analytical techniques for developing alpha-signal generators.  That term covers a great deal of ground, with ideas drawn from statistics, econometrics, physics and bioinformatics.  The purpose of this short note is to provide an introduction to some of the key ideas and suggest ways in which they might usefully be applied in the context of researching and developing trading systems.

Although they come from different origins, the concepts presented here share common foundational principles:

1. Markets operate in different states that may be characterized by various measures (volatility, correlation, microstructure, etc.);
2. Alpha signals can be generated more effectively by developing models that are adapted to take account of different market regimes;
3. Alpha signals may be combined effectively by taking account of the various states that a market may be in.

Market state models have shown great promise in a variety of applications within the field of applied econometrics in finance, not only for price and market direction forecasting, but also basis trading, index arbitrage, statistical arbitrage, portfolio construction, capital allocation and risk management.

REGIME SWITCHING MODELS

These are econometric models which seek to use statistical techniques to characterize market states in terms of different estimates of the parameters of some underlying linear model.  This is accompanied by a transition matrix which estimates the probability of moving from one state to another.

To illustrate this approach I have constructed a simple example, given in the accompanying Excel workbook.  In this model the market operates as follows:
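
A form consistent with the parameter definitions that follow, written here as a reconstruction under the assumption that every coefficient depends on the current state S_t and that the errors are Gaussian within each state (the subscripted notation is an assumption, not necessarily the workbook's):

```latex
Y_t = a_0(S_t) + a_1(S_t)\, Y_{t-1} + e_t + b_1(S_t)\, e_{t-1},
\qquad e_t \sim N\!\left(0,\; s^2(S_t)\right), \qquad S_t \in \{1, 2\}
```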

Where

Yt is a variable of interest (e.g. the return in an asset over the next period t)

et is an error process with constant variance s^2

S is the market state, with two regimes (S=1 or S=2)

a0 is the drift in the asset process

a1 is an autoregressive term, by which the return in the current period is dependent on the prior period return

b1 is a moving average term, which smoothes the error process

This is one of the simplest possible structures, which in more general form can include multiple states, and independent regressors X_i as explanatory variables (such as book pressure, order flow, etc):
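
As a sketch of that more general form, assuming K states and state-dependent coefficients on the explanatory variables (the symbols K and c_i are introduced here purely for illustration):

```latex
Y_t = a_0(S_t) + a_1(S_t)\, Y_{t-1} + \sum_{i} c_i(S_t)\, X_{i,t} + e_t + b_1(S_t)\, e_{t-1},
\qquad S_t \in \{1, \dots, K\}
```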

The form of the error process et may also be dependent on the market state.  It may simply be that, as in this example, the standard deviation of the error process changes from state to state.  But the changes can also be much more complex:  for instance, the error process may be non-Gaussian, or it may follow a formulation from the GARCH framework.

In this example the state parameters are as follows:

        Reg1      Reg2
s       0.01      0.02
a0      0.005    -0.015
a1      0.40      0.70
b1      0.10      0.20

What this means is that, in the first state the market tends to trend upwards with relatively low volatility.  In the second state, not only is market volatility much higher, but also the trend is 3x as large in the negative direction.

I have specified the following state transition matrix:

        Reg1      Reg2
Reg1    0.85      0.15
Reg2    0.90      0.10

This is interpreted as follows:  if the market is in State 1, it will tend to remain in that state 85% of the time, transitioning to State 2 15% of the time.  Once in State 2, the market tends to revert to State 1 very quickly, with 90% probability.  So the system is in State 1 most of the time, trending slowly upwards with low volatility and occasionally flipping into an aggressively downward trending phase with much higher volatility.
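
The claim that the system spends most of its time in State 1 can be checked directly from the transition matrix.  The following is a minimal sketch that computes the long-run (stationary) state probabilities and the expected length of a stay in each state; the variable names are illustrative.

```python
import numpy as np

# Transition matrix from the text: rows are the current state, columns the next state.
P = np.array([[0.85, 0.15],
              [0.90, 0.10]])

# The stationary distribution pi solves pi @ P = pi with the entries summing to 1.
# Stack the transposed balance equations with the normalisation constraint
# and solve the (consistent) system by least squares.
A = np.vstack([P.T - np.eye(2), np.ones(2)])
b = np.array([0.0, 0.0, 1.0])
pi, *_ = np.linalg.lstsq(A, b, rcond=None)

# Expected number of consecutive periods spent in each state: 1 / (1 - p_ii).
expected_duration = 1.0 / (1.0 - np.diag(P))

print("stationary distribution:", pi.round(4))
print("expected duration:", expected_duration.round(2))
```

With these numbers the chain spends roughly 86% of its time in State 1, and a visit to State 2 lasts little more than one period on average.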

The Generate sheet in the Excel workbook shows how observations are generated from this process, from which we select a single instance of 3,000 observations, shown in the sheet named Sample.
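
For readers who prefer code to spreadsheets, the same kind of sample can be generated in a few lines.  This is a sketch only: it assumes Gaussian errors, a start in State 1 and zero initial values, and it is not a transcription of the workbook's formulas.  The function name and seed are illustrative.

```python
import numpy as np

def simulate_switching_arma(n=3000, seed=0):
    """Simulate a two-state Markov-switching ARMA(1,1) with the parameters from the text."""
    rng = np.random.default_rng(seed)
    s  = np.array([0.01, 0.02])      # error standard deviation per regime
    a0 = np.array([0.005, -0.015])   # drift per regime
    a1 = np.array([0.40, 0.70])      # autoregressive coefficient per regime
    b1 = np.array([0.10, 0.20])      # moving-average coefficient per regime
    P  = np.array([[0.85, 0.15],     # transition matrix (row = current state)
                   [0.90, 0.10]])

    state = np.zeros(n, dtype=int)   # 0 = State 1, 1 = State 2
    y = np.zeros(n)
    e_prev, y_prev, current = 0.0, 0.0, 0   # assumed starting values
    for t in range(n):
        current = rng.choice(2, p=P[current])   # draw the state for this period
        e = rng.normal(0.0, s[current])
        y[t] = a0[current] + a1[current] * y_prev + e + b1[current] * e_prev
        state[t] = current
        e_prev, y_prev = e, y[t]
    return y, state

y, state = simulate_switching_arma()
print("share of time in State 1:", round(float((state == 0).mean()), 3))
```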

The sample looks like this:


As anticipated, the market is in State 1 most of the time, occasionally flipping into State 2 for brief periods.

It is well known that in financial markets we are typically dealing with highly non-Gaussian distributions.  Non-Normality can arise for a number of reasons, including changes in regimes, as illustrated here.  It is worth noting that, even though in this example the process in either market state follows a Gaussian distribution, the combined process is distinctly non-Gaussian in form, having (extremely) fat tails, as shown by the QQ-plot below.
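
The fat tails can be seen without any simulation.  As a rough illustration, the sketch below treats the process as a simple mixture of two Gaussians weighted by the long-run state probabilities, using the drift terms as the regime means and ignoring the ARMA dynamics (both simplifying assumptions), and computes the skewness and kurtosis of the mixture analytically.

```python
import numpy as np

# Long-run regime weights implied by the transition matrix in the text (6/7 and 1/7).
w  = np.array([0.90, 0.15]) / 1.05
mu = np.array([0.005, -0.015])   # regime means (assumption: the drift terms only)
sd = np.array([0.01, 0.02])      # regime standard deviations

# Central moments of a Gaussian mixture, built from each component's own moments.
mean = np.sum(w * mu)
d = mu - mean
m2 = np.sum(w * (sd**2 + d**2))
m3 = np.sum(w * (3 * sd**2 * d + d**3))
m4 = np.sum(w * (3 * sd**4 + 6 * sd**2 * d**2 + d**4))

skewness = m3 / m2**1.5
kurtosis = m4 / m2**2            # equals 3 for a single Gaussian

print(f"skewness = {skewness:.2f}, kurtosis = {kurtosis:.2f}")
```

Even in this stripped-down version the mixture is negatively skewed with kurtosis well above the Gaussian value of 3, although each component is exactly Normal.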

If we attempt to fit a standard ARMA model to the process, the outcome is very disappointing in terms of the model’s poor explanatory power (R-squared of 0.5%) and lack of fit in the squared residuals:

ARIMA(1,0,1)

Estimate  Std. Err.   t Ratio  p-Value

Intercept                      0.00037    0.00032     1.164    0.244

AR1                            0.57261     0.1697     3.374    0.001

MA1                           -0.63292    0.16163    -3.916        0

Error Variance^(1/2)           0.02015     0.0004    ——   ——

Log Likelihood = 7451.96

Schwarz Criterion = 7435.95

Hannan-Quinn Criterion = 7443.64

Akaike Criterion = 7447.96

Sum of Squares =  1.2172

R-Squared =  0.0054

R-Bar-Squared =  0.0044

Residual SD =  0.0202

Residual Skewness = -2.1345

Residual Kurtosis =  5.7279

Jarque-Bera Test = 3206.15     {0}

Box-Pierce (residuals):         Q(48) = 59.9785 {0.115}

Box-Pierce (squared residuals): Q(50) = 78.2253 {0.007}

Durbin Watson Statistic = 2.01392

KPSS test of I(0) =  0.2001    {<1} *

Lo’s RS test of I(0) =  1.2259  {<0.5} *

Nyblom-Hansen Stability Test:  NH(4)  =  0.5275    {<1}

MA form is 1 + a_1 L +…+ a_q L^q.

Covariance matrix from robust formula.

* KPSS, RS bandwidth = 0.

Parzen HAC kernel with Newey-West plug-in bandwidth.

However, if we keep the same simple form of ARMA(1,1) model, but allow for the possibility of a two-state Markov process, the picture alters dramatically:  now the model is able to account for 98% of the variation in the process, as shown below.

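Notice that we have succeeded in estimating the correct underlying transition probabilities, and that the ARMA model parameters change from regime to regime much as they should (small positive drift in one regime, large negative drift in the other, etc).  Note that the estimation labels the regimes in the opposite order to the specification above:  the estimated Regime 1, with its large negative intercept, corresponds to State 2 of the generating process, so that, for example, the estimated P(2|2) of 0.854 matches the specified 0.85 probability of remaining in State 1.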

Markov Transition Probabilities

P(.|1)       P(.|2)

P(1|.)            0.080265      0.14613

P(2|.)             0.91973      0.85387

Estimate  Std. Err.   t Ratio  p-Value

Logistic, t(1,1)              -2.43875     0.1821    ——   ——

Logistic, t(1,2)              -1.76531     0.0558    ——   ——

Non-switching parameters shown as Regime 1.

Regime 1:

Intercept                     -0.05615    0.00315   -17.826        0

AR1                            0.70864    0.16008     4.427        0

MA1                           -0.67382    0.16787    -4.014        0

Error Variance^(1/2)           0.00244     0.0001    ——   ——

Regime 2:

Intercept                      0.00838     2e-005   419.246        0

AR1                            0.26716    0.08347     3.201    0.001

MA1                           -0.26592    0.08339    -3.189    0.001

Log Likelihood = 12593.3

Schwarz Criterion = 12557.2

Hannan-Quinn Criterion = 12574.5

Akaike Criterion = 12584.3

Sum of Squares =  0.0178

R-Squared =  0.9854

R-Bar-Squared =  0.9854

Residual SD =  0.002

Residual Skewness = -0.0483

Residual Kurtosis = 13.8765

Jarque-Bera Test = 14778.5     {0}

Box-Pierce (residuals):         Q(48) = 379.511     {0}

Box-Pierce (squared residuals): Q(50) = 36.8248 {0.917}

Durbin Watson Statistic = 1.50589

KPSS test of I(0) =  0.2332    {<1} *

Lo’s RS test of I(0) =  2.1352 {<0.005} *

Nyblom-Hansen Stability Test:  NH(9)  =  0.8396    {<1}

MA form is 1 + a_1 L +…+ a_q L^q.

Covariance matrix from robust formula.

* KPSS, RS bandwidth = 0.

Parzen HAC kernel with Newey-West plug-in bandwidth.
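
The logistic parameters reported in the output can be reconciled with the transition probability table.  A minimal check, assuming the probabilities in the first row are obtained by applying the standard logistic function to t(1,1) and t(1,2), which the numbers appear to confirm:

```python
import math

def logistic(x):
    """Standard logistic function 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

p_1_given_1 = logistic(-2.43875)   # compare with the reported P(1|1) = 0.080265
p_1_given_2 = logistic(-1.76531)   # compare with the reported P(1|2) = 0.14613

print(round(p_1_given_1, 5), round(p_1_given_2, 5))
# Each column sums to one, which gives the second row: P(2|1) and P(2|2).
print(round(1 - p_1_given_1, 5), round(1 - p_1_given_2, 5))
```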

There are a variety of types of regime switching mechanisms we can use in state models:

Hamiltonian – the simplest, where the process mean and variance vary from state to state

Markovian – the approach used here, with state transition matrix

Explained Switching – where the process changes state as a result of the influence of some underlying variable (such as interest rate volatility, for example)

Smooth Transition – comparable to explained Markov switching, but without an explicitly probabilistic interpretation.

This example is both rather simplistic and pathological at the same time:  the states are well-separated, by design, whereas for real processes they tend to be much harder to distinguish.  A difficulty of this methodology is that the models can be very difficult to estimate.  The likelihood function tends to be very flat and there are a great many local maxima that give similar fit, but with widely varying model forms and parameter estimates.  That said, this is a very rich class of models with a great many potential applications.
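
To make the estimation problem concrete, the likelihood that such an estimator maximises can be evaluated with the Hamilton filter.  The sketch below is a simplified version, assuming a switching mean and variance only (no ARMA terms) and Gaussian errors; the function name and the simulated data are illustrative and are not taken from the workbook or from the package that produced the output above.

```python
import numpy as np

def hamilton_loglik(y, mu, sigma, P):
    """Log-likelihood of a Markov-switching mean/variance model via the Hamilton filter.

    y     : observations
    mu    : per-state means
    sigma : per-state standard deviations
    P     : transition matrix, P[i, j] = Prob(next state = j | current state = i)
    """
    mu, sigma, P = map(np.asarray, (mu, sigma, P))
    k = len(mu)
    # Initialise with the stationary distribution of the chain.
    A = np.vstack([P.T - np.eye(k), np.ones(k)])
    b = np.append(np.zeros(k), 1.0)
    prob = np.linalg.lstsq(A, b, rcond=None)[0]

    loglik = 0.0
    filtered = np.empty((len(y), k))
    for t, obs in enumerate(y):
        predicted = prob @ P                      # Prob(S_t | data up to t-1)
        dens = np.exp(-0.5 * ((obs - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))
        joint = predicted * dens
        lik_t = joint.sum()                       # density of y_t given the past
        loglik += np.log(lik_t)
        prob = joint / lik_t                      # Prob(S_t | data up to t)
        filtered[t] = prob
    return loglik, filtered

# Illustrative data: switching mean and variance with the parameters from the text.
rng = np.random.default_rng(1)
P_true = np.array([[0.85, 0.15], [0.90, 0.10]])
states = [0]
for _ in range(1999):
    states.append(rng.choice(2, p=P_true[states[-1]]))
states = np.array(states)
mu_true, sigma_true = np.array([0.005, -0.015]), np.array([0.01, 0.02])
y = rng.normal(mu_true[states], sigma_true[states])

ll_true, filtered = hamilton_loglik(y, mu_true, sigma_true, P_true)
# Collapsing both states to the same mean and variance gives a single-regime benchmark.
ll_single, _ = hamilton_loglik(y, [y.mean()] * 2, [y.std()] * 2, P_true)
print(round(ll_true, 1), round(ll_single, 1))
```

An optimiser would search over mu, sigma and P to maximise this function; evaluating it from several different starting points is one practical way to see the multiple local maxima described above.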

Long Memory and Regime Shifts in Asset Volatility

This post covers quite a wide range of concepts in volatility modeling relating to long memory and regime shifts and is based on an article that was published in Wilmott magazine and republished in The Best of Wilmott Vol 1 in 2005.  A copy of the article can be downloaded here.

One of the defining characteristics of volatility processes in general (not just financial assets) is the tendency for the serial autocorrelations to decline very slowly.  This effect is illustrated quite clearly in the chart below, which maps the autocorrelations in the volatility processes of several financial assets.

Volatility Autocorrelations

Thus we can say that events in the volatility process for IBM, for instance, continue to exert influence on the process almost two years later.
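
A chart of this kind is straightforward to reproduce.  The sketch below computes sample autocorrelations out to 500 lags (roughly two years of daily data) using absolute returns as a volatility proxy, which is an assumption made here for simplicity.  The random returns are a placeholder for real data and will not themselves show long memory.

```python
import numpy as np

def sample_acf(x, max_lag):
    """Sample autocorrelations of x at lags 1..max_lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    denom = np.dot(x, x)
    return np.array([np.dot(x[:-k], x[k:]) / denom for k in range(1, max_lag + 1)])

# Hypothetical input: replace with actual daily returns for the asset of interest.
rng = np.random.default_rng(0)
returns = rng.normal(0.0, 0.01, size=2500)

vol_proxy = np.abs(returns)            # absolute returns as a simple volatility proxy
acf = sample_acf(vol_proxy, max_lag=500)
band = 1.96 / np.sqrt(len(vol_proxy))  # approximate 95% band under no autocorrelation
print("lags outside the band:", int(np.sum(np.abs(acf) > band)))
```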

This feature is one that is typical of a black noise process – not some kind of rap music variant, but rather:

“a process with a 1/f^β spectrum, where β > 2 (Manfred Schroeder, “Fractals, Chaos, Power Laws”). Used in modeling various environmental processes. Is said to be a characteristic of “natural and unnatural catastrophes like floods, droughts, bear markets, and various outrageous outages, such as those of electrical power.” Further, “because of their black spectra, such disasters often come in clusters.”” [Wikipedia].

Because of these autocorrelations, black noise processes tend to reinforce or trend, and hence (to some degree) may be forecastable.  This contrasts with a white noise process, such as an asset return process, which has a uniform power spectrum, insignificant serial autocorrelations and no discernible trending behavior:

White Noise Power Spectrum

An econometrician might describe this situation by saying that a black noise process is fractionally integrated of order d, where d = H - 0.5, H being the Hurst Exponent.  A way to appreciate the difference in the behavior of a black noise process vs. a white process is by comparing two fractionally integrated random walks generated using the same set of quasi random numbers by Feder’s (1988) algorithm (see p 32 of the presentation on Modeling Asset Volatility).

Fractal Random Walk - White Noise

Fractal Random Walk - Black Noise Process

As you can see, both random walks follow a similar pattern, but the black noise random walk is much smoother, and the downward trend is more clearly discernible.  You can play around with the Feder algorithm, which is coded in the accompanying Excel Workbook on Volatility and Nonlinear Dynamics.  Changing the Hurst Exponent parameter H in the worksheet will rerun the algorithm and illustrate a fractal random walk for a black noise (H > 0.5), white noise (H=0.5) and mean-reverting, pink noise (H<0.5) process.
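
A similar experiment can be run outside Excel.  Note that the sketch below is not Feder's algorithm:  it substitutes a different and simpler technique, filtering Gaussian shocks through a truncated expansion of (1 - L)^(-d), and it assumes the usual relation d = H - 0.5.  The function names are illustrative.

```python
import numpy as np

def frac_integration_weights(d, n):
    """First n coefficients of the power-series expansion of (1 - L)^(-d)."""
    w = np.empty(n)
    w[0] = 1.0
    for k in range(1, n):
        w[k] = w[k - 1] * (k - 1 + d) / k
    return w

def fractional_walk(H, n=1000, seed=0):
    """Cumulated fractionally integrated noise with d = H - 0.5 (truncated filter)."""
    rng = np.random.default_rng(seed)
    shocks = rng.standard_normal(n)          # same seed gives the same shocks for every H
    w = frac_integration_weights(H - 0.5, n)
    noise = np.convolve(shocks, w)[:n]       # apply the filter to the shocks
    return np.cumsum(noise)

walk_white = fractional_walk(H=0.5)   # d = 0: an ordinary random walk
walk_black = fractional_walk(H=0.9)   # d = 0.4: persistent increments
walk_pink  = fractional_walk(H=0.3)   # d = -0.2: anti-persistent increments
```

Because all three paths are driven by the same shocks, plotting them together isolates the effect of H in the same way as the two charts above.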

One way of modeling the kind of behavior demonstrated by volatility processes is by using long memory models such as ARFIMA and FIGARCH (see pp 47-62 of the Modeling Asset Volatility presentation for a discussion and comparison of various long memory models).  The article reviews research into long memory behavior and various techniques for estimating long memory models and the coefficient of fractional integration d for a process.

But long memory is not the only possible cause of long term serial correlation.  The same effect can result from structural breaks in the process, which can produce spurious autocorrelations.  The article goes on to review some of the statistical procedures that have been developed to detect regime shifts, due to Bai (1997), Bai and Perron (1998) and the Iterative Cumulative Sums of Squares methodology due to Aggarwal, Inclan and Leal (1999).  The article illustrates how the ICSS technique accurately identifies two changes of regimes in a synthetic GBM process.
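
The building block of the cumulative sums of squares approach is easy to sketch.  The code below locates a single variance break using the centred cumulative sum of squares statistic and compares it with 1.358, the asymptotic 5% critical value usually quoted for this statistic.  It assumes a zero-mean series and stops at one break; the full iterative procedure, which re-applies the test to sub-segments, is not reproduced here.  The function name and synthetic data are illustrative.

```python
import numpy as np

def cusum_of_squares_break(x):
    """Locate the most likely single variance break in a zero-mean series.

    Returns (k, stat): k is the number of observations before the candidate break,
    stat is sqrt(T / 2) * max|D_k|, where D_k = C_k / C_T - k / T and C_k is the
    cumulative sum of squares up to observation k.
    """
    x = np.asarray(x, dtype=float)
    T = len(x)
    C = np.cumsum(x ** 2)
    k = np.arange(1, T + 1)
    D = C / C[-1] - k / T
    idx = int(np.argmax(np.abs(D)))
    return idx + 1, np.sqrt(T / 2.0) * np.abs(D[idx])

# Synthetic example: the standard deviation triples after observation 500.
rng = np.random.default_rng(42)
series = np.concatenate([rng.normal(0, 1.0, 500), rng.normal(0, 3.0, 500)])
k_hat, stat = cusum_of_squares_break(series)
print(k_hat, round(float(stat), 2), bool(stat > 1.358))
```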

In general, I have found the ICSS test to be a simple and highly informative means of gaining insight about a process representing an individual asset, or indeed an entire market.  For example, ICSS detects regime shifts in the process for IBM around 1984 (the time of the introduction of the IBM PC), the automotive industry in the early 1980s (Chrysler bailout), the banking sector in the late 1980s (Latin American debt crisis), Asian sector indices in Q3 1997, the S&P 500 index in April 2000 and just about every market imaginable during the 2008 credit crisis.  By splitting a series into pre- and post-regime shift sub-series and examining each segment for long memory effects, one can determine the cause of autocorrelations in the process.  In some cases, Asian equity indices being one example, long memory effects disappear from the series, indicating that spurious autocorrelations were induced by a major regime shift during the 1997 Asian crisis. In most cases, however, long memory effects persist.

Excel Workbook on Volatility and Nonlinear Dynamics

There are several other topics from chaos theory and nonlinear dynamics covered in the workbook, including:

More on these issues in due course.

Modeling Asset Volatility

I am planning a series of posts on the subject of asset volatility and option pricing and thought I would begin with a survey of some of the central ideas. The attached presentation on Modeling Asset Volatility sets out the foundation for a number of key concepts and the basis for the research to follow.

Perhaps the most important feature of volatility is that it is stochastic rather than constant, as envisioned in the Black-Scholes framework.  The presentation addresses this issue by identifying some of the chief stylized facts about volatility processes and how they can be modelled.  Certain characteristics of volatility are well known to most analysts, such as its tendency to “cluster” in periods of higher and lower volatility.  However, there are many other typical features that are less often rehearsed and these too are examined in the presentation.

Long Memory
For example, while it is true that GARCH models do a fine job of modeling the clustering effect, they typically fail to capture one of the most important features of volatility processes – long term serial autocorrelation.  In the typical GARCH model autocorrelations die away approximately exponentially, and historical events are seen to have little influence on the behaviour of the process very far into the future.  In volatility processes that is typically not the case, however:  autocorrelations die away very slowly and historical events may continue to affect the process many weeks, months or even years ahead.
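
The difference between the two decay rates is easy to quantify.  The sketch below compares the geometric decay of the squared-return autocorrelations in a GARCH(1,1) model with the hyperbolic decay of a fractionally integrated noise process.  The persistence, first-lag autocorrelation and d values are illustrative assumptions, not estimates from any data set.

```python
import math

def garch_acf(k, persistence=0.95, rho1=0.3):
    """Autocorrelation of squared returns under GARCH(1,1): geometric decay from lag 1."""
    return rho1 * persistence ** (k - 1)

def long_memory_acf(k, d=0.4):
    """Autocorrelation at lag k of fractionally integrated noise, ARFIMA(0, d, 0)."""
    # rho(k) = Gamma(1 - d) * Gamma(k + d) / (Gamma(d) * Gamma(k + 1 - d)),
    # evaluated in logs for numerical stability at long lags.
    log_rho = (math.lgamma(1 - d) - math.lgamma(d)
               + math.lgamma(k + d) - math.lgamma(k + 1 - d))
    return math.exp(log_rho)

for lag in (1, 10, 50, 100, 250):
    print(lag, round(garch_acf(lag), 4), round(long_memory_acf(lag), 4))
```

At a lag of 250 trading days the GARCH autocorrelation is effectively zero, while the long memory autocorrelation is still material.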

Volatility Direction Prediction Accuracy

There are two immediate and very important consequences of this feature.  The first is that volatility processes will tend to trend over long periods – a characteristic of Black Noise or Fractionally Integrated processes, compared to the White Noise behavior that typically characterizes asset return processes.  Secondly, and again in contrast with asset return processes, volatility processes are inherently predictable, being conditioned to a significant degree on past behavior.  The presentation considers the fractional integration framework as a basis for modeling and forecasting volatility.

Mean Reversion vs. Momentum
A puzzling feature of much of the literature on volatility is that it tends to stress the mean-reverting behavior of volatility processes.  This appears to contradict the finding that volatility behaves as a reinforcing process, whose long-term serial autocorrelations create a tendency to trend.  This leads to one of the most important findings about asset processes in general, and volatility processes in particular:  that asset processes are simultaneously trending and mean-reverting.  One way to understand this is to think of volatility, not as a single process, but as the superposition of two processes:  a long term process in the mean, which tends to reinforce and trend, around which there operates a second, transient process that has a tendency to produce short term spikes in volatility that decay very quickly.  In other words, a transient, mean-reverting process inter-linked with a momentum process in the mean.  The presentation discusses two-factor modeling concepts along these lines, about which I will have more to say later.
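
The superposition idea can be illustrated with a deliberately crude stand-in:  the sum of two independent AR(1) factors, one highly persistent and one transient.  This is an assumption made for illustration only (an AR(1) factor does not have true long memory, and it is not the two-factor model discussed in the presentation), but it shows how a fast-decaying component and a slow-moving component coexist in a single autocorrelation function.

```python
import numpy as np

def two_factor_acf(lags, phi_long=0.995, phi_short=0.5, var_long=1.0, var_short=1.0):
    """Autocorrelation of the sum of two independent AR(1) factors.

    phi_long, phi_short : AR(1) coefficients of the persistent and transient factors
    var_long, var_short : unconditional variances of the two factors
    """
    lags = np.asarray(lags, dtype=float)
    total = var_long + var_short
    return (var_long * phi_long ** lags + var_short * phi_short ** lags) / total

lags = np.array([1, 5, 20, 60, 250])
print(np.round(two_factor_acf(lags), 3))
```

The autocorrelation falls quickly over the first few lags as the transient factor dies out, then declines only slowly thereafter as the persistent factor dominates.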

Cointegration
One of the most striking developments in econometrics over the last thirty years, cointegration is now a principal weapon of choice routinely used by quantitative analysts to address research issues ranging from statistical arbitrage to portfolio construction and asset allocation.  Back in the late 1990s I and a handful of other researchers realized that volatility processes exhibited very powerful cointegration tendencies that could be harnessed to create long-short volatility strategies, mirroring the approach much beloved by equity hedge fund managers.  In fact, this modeling technique provided the basis for the Caissa Capital volatility fund, which I founded in 2002.  The presentation examines characteristics of multivariate volatility processes and some of the ideas that have been proposed to model them, such as FIGARCH (fractionally-integrated GARCH).

Dispersion Dynamics
Finally, one topic that is not considered in the presentation, but on which I have spent much research effort in recent years, is the behavior of cross-sectional volatility processes, which I like to term dispersion.  It turns out that, like its univariate cousin, dispersion displays certain characteristics that in principle make it highly forecastable.  Given an appropriate model of dispersion dynamics, the question then becomes how to monetize efficiently the insight that such a model offers.  Again, I will have much more to say on this subject in future.
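
For concreteness, one simple way to measure dispersion is the equal-weighted cross-sectional standard deviation of asset returns on each date.  That definition is an assumption made here for illustration; weighted or index-relative measures are equally possible.  The data in the sketch are random placeholders.

```python
import numpy as np

def dispersion(returns):
    """Cross-sectional standard deviation of returns at each date.

    returns : 2-D array, rows = dates, columns = assets.
    """
    returns = np.asarray(returns, dtype=float)
    return returns.std(axis=1, ddof=1)

# Hypothetical data: 250 days of returns on 50 assets.
rng = np.random.default_rng(0)
panel = rng.normal(0.0, 0.01, size=(250, 50))
disp = dispersion(panel)
print(disp.shape, round(float(disp.mean()), 4))
```

The resulting series is itself a univariate process, so it can be examined for slowly decaying autocorrelations in the same way as single-asset volatility.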