Measuring Toxic Flow for Trading & Risk Management
A common theme of microstructure modeling is that trade flow is often predictive of market direction. One concept in particular that has gained traction is flow toxicity: flow in which resting orders tend to be filled more quickly than expected, while aggressive orders rarely get filled at all, because informed traders are trading against uninformed traders. The fundamental insight from microstructure research is that the order arrival process is informative of subsequent price moves in general, and of toxic flow in particular. This in turn has led researchers to try to measure the probability of informed trading (PIN). One recent attempt to model flow toxicity, the Volume-Synchronized Probability of Informed Trading (VPIN) metric, seeks to estimate PIN based on volume imbalance and trade intensity. A major advantage of this approach is that it does not require the estimation of unobservable parameters; in addition, updating VPIN in trade time rather than clock time improves its predictive power. VPIN has potential applications both in high frequency trading strategies and in risk management, since highly toxic flow is likely to lead to the withdrawal of liquidity providers, setting up the conditions for a "flash crash" type of market breakdown.
The procedure for estimating VPIN is as follows. We begin by grouping sequential trades into equal volume buckets of size V. If the last trade needed to complete a bucket is larger than required, the excess volume is carried over to the next bucket. We then classify the volume within each bucket into two groups: buys, V(t)B, and sells, V(t)S, with V = V(t)B + V(t)S.
The Volume-Synchronized Probability of Informed Trading is then derived as the average absolute volume imbalance across buckets:

VPIN = Σ |V(τ)B − V(τ)S| / (nV), where the sum runs over the last n buckets.

Typically one might estimate VPIN using a moving average over n buckets, with n in the range of 50 to 100.
Another related statistic of interest is the single-period signed VPIN, (V(t)B − V(t)S) / V. This takes a value between -1 and +1, depending on the proportion of buying to selling during a single period t.
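The bucketing and imbalance calculation described above can be sketched in a few lines of code. This is an illustrative implementation only: the toy trade sizes and signs are made up, and the trade classification (+1 buy, -1 sell) is assumed to have been done upstream, e.g. by a tick rule.

```python
import numpy as np

def volume_buckets(volumes, signs, bucket_size):
    """Group a sequence of classified trades into equal-volume buckets and
    return the single-period signed VPIN, (V_B - V_S) / V, for each bucket.
    `signs` is +1 for buyer-initiated, -1 for seller-initiated trades."""
    buy = sell = filled = 0.0
    signed_vpin = []
    for v, s in zip(volumes, signs):
        while v > 0:
            take = min(v, bucket_size - filled)  # split trades that overflow a bucket
            if s > 0:
                buy += take
            else:
                sell += take
            filled += take
            v -= take
            if filled == bucket_size:            # bucket complete: record imbalance
                signed_vpin.append((buy - sell) / bucket_size)
                buy = sell = filled = 0.0
    return np.array(signed_vpin)

# toy example: six trades, bucket size 100; the second trade is split
# across two buckets
vols  = [60, 70, 30, 90, 40, 10]
signs = [+1, -1, +1, +1, -1, -1]
print(volume_buckets(vols, signs, 100))  # → [ 0.2  0.4  0. ]
```

The n-bucket VPIN estimate is then just the moving average of the absolute values of this series.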
Fig 1. Single-Period Signed VPIN for the ES Futures Contract
It turns out that quote revisions condition strongly on the signed VPIN. For example, in tests of the ES futures contract, we found that the change in the midprice from one volume bucket to the next was highly correlated with the prior bucket's signed VPIN, with a coefficient of 0.5. In other words, market participants offering liquidity will adjust their quotes in a way that directly reflects the direction and intensity of toxic flow, which is perhaps hardly surprising.
Of greater interest is the finding that there is a small but statistically significant dependency of price changes, as measured from the first buy (sell) trade price to the last sell (buy) trade price, on the prior period's signed VPIN. The correlation is positive, meaning that strongly toxic flow in one direction tends to push prices in the same direction during the subsequent period. Moreover, the single-period signed VPIN turns out to be somewhat predictable, since its autocorrelations are statistically significant at two or more lags. A simple linear ARMA(2,1) model produces an R-square of around 7%, which is small but statistically significant.
A more useful model, however, can be constructed by introducing the idea of Markov states and allowing the regression model to assume different parameter values (and error variances) in each state. In the Markov-state framework, the system transitions from one state to another with conditional probabilities that are estimated within the model.
An example of such a model for the signed VPIN in ES is shown below. Note that the model R-square is over 27%, around 4x larger than for a standard linear ARMA model.
We can describe the regime-switching model in the following terms. In the regime 1 state the model has two significant autoregressive terms and one significant moving average term (ARMA(2,1)). The AR1 term is large and positive, suggesting that trends in VPIN tend to be reinforced from one period to the next. In other words, this is a momentum state. In the regime 2 state the AR2 term is not significant and the AR1 term is large and negative, suggesting that changes in VPIN in one period tend to be reversed in the following period, i.e. this is a mean-reversion state.
The state transition probabilities indicate that the system is in mean-reversion mode for the majority of the time, around 2 periods out of 3. During these periods, excessive flow in one direction during one period tends to be corrected in the ensuing period. But in the less frequently occurring state 1, excess flow in one direction tends to produce even more flow in the same direction in the following period. This first state, then, may be regarded as the regime characterized by toxic flow.
Markov State Regime-Switching Model
Markov Transition Probabilities
P(.|1) P(.|2)
P(1|.) 0.54916 0.27782
P(2|.) 0.45084 0.7221
Regime 1:
                      Coefficient   Std Error   t-Stat    p-Value
AR1                   1.35502       0.02657     50.998    0
AR2                   -0.33687      0.02354     -14.311   0
MA1                   0.83662       0.01679     49.828    0
Error Variance^(1/2)  0.36294       0.0058
Regime 2:
AR1                   -0.68268      0.08479     -8.051    0
AR2                   0.00548       0.01854     0.296     0.767
MA1                   -0.70513      0.08436     -8.359    0
Error Variance^(1/2)  0.42281       0.0016
Log Likelihood = -33390.6
Schwarz Criterion = -33445.7
Hannan-Quinn Criterion = -33414.6
Akaike Criterion = -33400.6
Sum of Squares = 8955.38
R-Squared = 0.2753
R-Bar-Squared = 0.2752
Residual SD = 0.3847
Residual Skewness = -0.0194
Residual Kurtosis = 2.5332
Jarque-Bera Test = 553.472 {0}
Box-Pierce (residuals): Q(9) = 13.9395 {0.124}
Box-Pierce (squared residuals): Q(12) = 743.161 {0}
A Simple Trading Strategy
One way to try to monetize the predictability of the VPIN model is to use the forecasts to take directional positions in the ES contract. In this simple simulation we assume that we enter a long (short) position at the first buy (sell) price if the forecast VPIN exceeds some threshold value of 0.1 (-0.1). The simulation assumes that we exit the position at the end of the current volume bucket, at the last sell (buy) trade price in the bucket.
This simple strategy made 1024 trades over a 5-day period from 8/8 to 8/14, 90% of which were profitable, for a total of $7,675 – i.e. around ½ tick per trade.
The simulation is, of course, unrealistically simplistic, but it does give an indication of the prospects for a more realistic version of the strategy in which, for example, we might rest an order on one side of the book, depending on our VPIN forecast.
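The threshold rule can be sketched as follows. Everything here is hypothetical: the forecast values and bucket prices are made-up inputs, and the $50 point value is the standard ES contract multiplier.

```python
import numpy as np

# Minimal sketch of the threshold rule described above. `forecast` is the
# predicted signed VPIN for each bucket; `entry` / `exit_` are the first
# and last trade prices in the bucket, per the simulation assumptions.
def simulate(forecast, entry, exit_, threshold=0.1, point_value=50.0):
    pnl = []
    for f, p_in, p_out in zip(forecast, entry, exit_):
        if f > threshold:        # forecast toxic buying: go long
            pnl.append((p_out - p_in) * point_value)
        elif f < -threshold:     # forecast toxic selling: go short
            pnl.append((p_in - p_out) * point_value)
    return np.array(pnl)

# hypothetical inputs: four buckets, one of which generates no trade
pnl = simulate(
    forecast=[0.3, -0.05, -0.2, 0.15],
    entry=[1200.00, 1200.25, 1200.50, 1200.00],
    exit_=[1200.25, 1200.00, 1200.25, 1200.25],
)
print(pnl, pnl.sum())  # → [12.5 12.5 12.5] 37.5
```

Each winning trade here captures one 0.25-point ES tick, worth $12.50, in line with the roughly half-tick average profit reported above.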
Figure 2 – Cumulative Trade PL
References
Easley, D., López de Prado, M., and O'Hara, M. (2011), "Flow Toxicity and Volatility in a High Frequency World", Johnson School Research Paper Series No. 09-2011.
Easley, D. and O'Hara, M. (1987), "Price, Trade Size, and Information in Securities Markets", Journal of Financial Economics, 19.
Easley, D. and O'Hara, M. (1992a), "Adverse Selection and Large Trade Volume: The Implications for Market Efficiency", Journal of Financial and Quantitative Analysis, 27(2), June, 185-208.
Easley, D. and O'Hara, M. (1992b), "Time and the Process of Security Price Adjustment", Journal of Finance, 47, 576-605.
Generalized Regression
Linear regression is one of the most useful tools in the financial engineer's tool-kit, but it suffers from a rather restrictive set of assumptions that limit its applicability in areas of research characterized by highly non-linear or correlated variables. The latter problem, referred to as collinearity (or multicollinearity), arises very frequently in financial research because asset processes are often somewhat (or even highly) correlated. In a collinear system one can test for the overall significance of the regression relationship, but one is unable to distinguish which of the explanatory variables is individually significant. Furthermore, the estimates of the model parameters, the weights applied to each explanatory variable, tend to be unstable, with inflated standard errors.
Over time, many attempts have been made to address this issue, one well-known example being ridge regression. More recent attempts include lasso, elastic net and what I term generalized regression, which appear to offer significant advantages vs traditional regression techniques in situations where the variables are correlated.
In this note, I examine a variety of these techniques and attempt to illustrate and compare their effectiveness.
The Mathematica notebook is also available here.
Hiring High Frequency Quant/Traders
I am hiring in Chicago for exceptional HF Quant/Traders in Equities, F/X, Futures & Fixed Income. Remuneration for these roles, which will be dependent on qualifications and experience, will be in line with the highest market levels.
Role
Working closely with team members including developers, traders and quantitative researchers, the central focus of the role will be to research and develop high frequency trading strategies in equities, fixed income, foreign exchange and related commodities markets.
Responsibilities
The analyst will have responsibility for taking an idea from initial conception through research, testing and implementation. The work will entail:
- Formulation of mathematical and econometric models for market microstructure
- Data collation, normalization and analysis
- Model prototyping and programming
- Strategy development, simulation, back-testing and implementation
- Execution strategy & algorithms
Qualifications & Experience
- Minimum 5 years in quantitative research with a leading proprietary trading firm, hedge fund, or investment bank
- In-depth knowledge of Equities, F/X and/or futures markets, products and operational infrastructure
- High frequency data management & data mining techniques
- Microstructure modeling
- High frequency econometrics (cointegration, VAR, error correction models, GARCH, panel data models, etc.)
- Machine learning, signal processing, state space modeling and pattern recognition
- Trade execution and algorithmic trading
- PhD in Physics/Math/Engineering, Finance/Economics/Statistics
- Expert programming skills in Java, Matlab/Mathematica essential
- Must be US Citizen or Permanent Resident
Send your resume to: jkinlay at systematic-strategies.com.
No recruiters please.
Alpha Spectral Analysis
One of the questions of interest is the optimal sampling frequency to use for extracting the alpha signal from an alpha generation function. We can use Fourier transforms to help identify the cyclical behavior of the strategy alpha and hence determine the best time-frames for sampling and trading. Typically, these spectral analysis techniques will highlight several different cycle lengths where the alpha signal is strongest.
The spectral density of the combined alpha signals across twelve pairs of stocks is shown in Fig. 1 below. It is clear that the strongest signals occur in the shorter frequencies with cycles of up to several hundred seconds. Focusing on the density within
this time frame, we can identify in Fig. 2 several frequency cycles where the alpha signal appears strongest. These are around 50, 80, 160, 190, and 230 seconds. The cycle with the strongest signal appears to be around 228 secs, as illustrated in Fig. 3. The signals at cycles of 54 & 80 (Fig. 4), and 158 & 185/195 (Fig. 5) secs appear to be of approximately equal strength.
There is some variation in the individual patterns of the power spectra for each pair, but the findings are broadly comparable, and indicate that strategies should be designed for sampling frequencies at around these time intervals.
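The spectral analysis above can be sketched with a discrete Fourier transform. This is an illustrative example, not the actual alpha data: the series below is simulated with an embedded 228-second cycle (the strongest cycle reported above) at one sample per second, and the dominant period is then recovered from the periodogram.

```python
import numpy as np

# Simulated "alpha" series: a 228-second cycle plus noise, sampled once
# per second. We recover the cycle length from the power spectrum.
t = np.arange(4096)
alpha = (np.sin(2 * np.pi * t / 228)
         + 0.5 * np.random.default_rng(1).normal(size=t.size))

freqs = np.fft.rfftfreq(t.size, d=1.0)       # d = 1 second between samples
power = np.abs(np.fft.rfft(alpha)) ** 2      # periodogram

peak = np.argmax(power[1:]) + 1              # skip the zero-frequency term
print(f"dominant cycle ~ {1 / freqs[peak]:.0f} seconds")
```

In practice one would average periodograms across the twelve pairs (or use a smoothed estimator such as Welch's method) before reading off the candidate cycle lengths.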
Fig. 1 Alpha Power Spectrum
Fig.2
PRINCIPAL COMPONENTS ANALYSIS OF ALPHA POWER SPECTRUM
If we look at the correlation surface of the power spectra of the twelve pairs some clear patterns emerge (see Fig 6):
Focusing on the off-diagonal elements, it is clear that the power spectrum of each pair is perfectly correlated with the power spectrum of its conjugate. So, for instance the power spectrum of the Stock1-Stock3 pair is exactly correlated with the spectrum for its converse, Stock3-Stock1.
But it is also clear that there are many other significant correlations between non-conjugate pairs. For example, the correlation between the power spectra for Stock1-Stock2 vs Stock2-Stock3 is 0.72, while the correlation of the power spectra of Stock1-Stock2 and Stock2-Stock4 is 0.69.
We can further analyze the alpha power spectrum using PCA to expose the underlying factor structure. As shown in Fig. 7, the first two principal components account for around 87% of the variance in the alpha power spectrum, and the first four components account for over 98% of the total variation.
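The PCA step can be sketched as follows. The data here are simulated, not the actual spectra: each row stands in for one pair's power spectrum, built from a strong shared spectral factor plus noise, so that a small number of components dominate, as in the results reported above.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical stand-in for the 12 pairs' power spectra: rows are pairs,
# columns are frequency bins. A shared factor with pair-specific scaling
# produces the low-dimensional structure the text describes.
rng = np.random.default_rng(7)
common = rng.normal(size=256)                          # shared spectral factor
spectra = np.array([common * rng.uniform(0.5, 1.5)
                    + 0.1 * rng.normal(size=256) for _ in range(12)])

pca = PCA()
pca.fit(spectra)
explained = np.cumsum(pca.explained_variance_ratio_)
print(f"first 2 PCs explain {explained[1]:.0%}, first 4 explain {explained[3]:.0%}")
print(pca.components_[0][:5])                          # PC-1 loadings (first few bins)
```

The component loadings (`pca.components_`) are the analogue of the pair loadings discussed below, and indicate which pairs drive each principal component.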

Fig. 7
Stock3 dominates PC-1 with loadings of 0.52 for Stock3-Stock4, 0.64 for Stock3-Stock2, 0.29 for Stock1-Stock3 and 0.26 for Stock4-Stock3. Stock3 is also highly influential in PC-2 with loadings of -0.64 for Stock3-Stock4 and 0.67 for Stock3-Stock2 and again in PC-3 with a loading of -0.60 for Stock3-Stock1. Stock4 plays a major role in the makeup of PC-3, with the highest loading of 0.74 for Stock4-Stock2.
Fig. 8 PCA Analysis of Power Spectra
Market Microstructure Models for High Frequency Trading Strategies
This note summarizes some of the key research in the field of market microstructure and considers some of the models proposed by the researchers. Many of the ideas presented here have become widely adopted by high frequency trading firms and incorporated into their trading systems.
Forecasting Financial Markets – Part 1: Time Series Analysis
The presentation in this post covers a number of important topics in forecasting, including:
- Stationary processes and random walks
- Unit roots and autocorrelation
- ARMA models
- Seasonality
- Model testing
- Forecasting
- Dickey-Fuller and Phillips-Perron tests for unit roots
Also included are a number of detailed worked examples, including:
- ARMA Modeling
- Box Jenkins methodology
- Modeling the US Wholesale Price Index
- Pesaran & Timmermann study of excess equity returns
- Purchasing Power Parity
Forecasting 2011 - Time Series
Long/Short Stock Trading Strategy
The Long-Short Stock Trader strategy uses a quantitative model to generate market orders, for both entries and exits. The model looks for divergences between a stock's price and its current volatility, closing the position when the price-volatility gap closes. The strategy is designed to achieve a better return on risk than the S&P 500 index, with risk management focused on achieving lower drawdown and volatility than the index.
The model trades only Large Cap stocks, with high liquidity and without scalability problems. Thanks to the high liquidity, market orders are filled without market impact and at the best market prices.
For more information and back-test results go here.
The Long-Short Trader is the first strategy launched on the Systematic Algotrading Platform under our new Strategy Manager Program.
Performance Summary
Monthly Returns
Value of $100,000 Portfolio
Master’s in High Frequency Finance
I have been discussing with some potential academic partners the concept for a new graduate program in High Frequency Finance. The idea is to take the concept of the Computational Finance program developed in the 1990s and update it to meet the needs of students in the 2010s.
The program will offer a thorough grounding in the modeling concepts, trading strategies and risk management procedures currently in use by leading investment banks, proprietary trading firms and hedge funds in US and international financial markets. Students will also learn the necessary programming and systems design skills to enable them to make an effective contribution as quantitative analysts, traders, risk managers and developers.
I would be interested in feedback and suggestions as to the proposed content of the program.
A Practical Application of Regime Switching Models to Pairs Trading
In the previous post I outlined some of the available techniques used for modeling market states. The following is an illustration of how these techniques can be applied in practice. You can download this post in pdf format here.
The chart below shows the daily compounded returns for a single pair in an ETF statistical arbitrage strategy, back-tested over a 1-year period from April 2010 to March 2011.
The idea is to examine the characteristics of the returns process and assess its predictability.
The initial impression given by the analytics plots of daily returns, shown in Fig 2 below, is that the process may be somewhat predictable, given what appears to be a significant 1-order lag in the autocorrelation spectrum. We also see evidence of the customary non-Gaussian "fat-tailed" distribution in the error process.
An initial attempt to fit a standard autoregressive moving average ARMA(1,0,1) model yields disappointing results, with an unadjusted model R-squared of only 7% (see model output in Appendix 1).
However, by fitting a 2-state Markov model we are able to explain as much as 65% of the variation in the returns process (see Appendix II).
The model estimates Markov Transition Probabilities as follows.
P(.|1) P(.|2)
P(1|.) 0.93920 0.69781
P(2|.) 0.060802 0.30219
In other words, the process spends most of the time in State 1, switching to State 2 around once a month, as illustrated in Fig 3 below.
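The expected time spent in each regime follows directly from the transition matrix: if p is the probability of remaining in a state from one period to the next, the expected duration of a visit to that state is 1 / (1 - p) periods.

```python
# Expected regime durations implied by the transition probabilities above.
p_stay_1 = 0.93920   # P(1|1): probability of remaining in state 1
p_stay_2 = 0.30219   # P(2|2): probability of remaining in state 2

print(f"state 1: {1 / (1 - p_stay_1):.1f} days")   # → 16.4 trading days
print(f"state 2: {1 / (1 - p_stay_2):.1f} days")   # → 1.4 trading days
```

A typical stay in State 1 of around 16 trading days, punctuated by brief one- to two-day visits to State 2, is consistent with the roughly once-a-month switching visible in Fig 3.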

In the first state, the pairs model produces an expected daily return of around 65bp, with a standard deviation of similar magnitude. In this state, the process also exhibits very significant auto-regressive and moving average features.
Regime 1:
                      Coefficient   Std Error   t-Stat    p-Value
Intercept             0.00648       0.0009      7.2       0
AR1                   0.92569       0.01897     48.797    0
MA1                   -0.96264      0.02111     -45.601   0
Error Variance^(1/2)  0.00666       0.0007
In the second state, the pairs model produces lower average returns, and with much greater variability, while the autoregressive and moving average terms are poorly determined.
Regime 2:
                      Coefficient   Std Error   t-Stat    p-Value
Intercept             0.03554       0.04778     0.744     0.459
AR1                   0.79349       0.06418     12.364    0
MA1                   -0.76904      0.51601     -1.49     0.139
Error Variance^(1/2)  0.01819       0.0031
CONCLUSION
The analysis in Appendix II suggests that the residual process is stable and Gaussian. In other words, the two-state Markov model is able to account for the non-Normality of the returns process and extract the salient autoregressive and moving average features in a way that makes economic sense.
How is this information useful? Potentially in two ways:
(i) If the market state can be forecast successfully, we can use that information to increase our capital allocation during periods when the process is predicted to be in State 1, and reduce the allocation at times when it is in State 2.
(ii) By examining the timing of the Markov states and considering different features of the market during the contrasting periods, we might be able to identify additional explanatory factors that could be used to further enhance the trading model.