Can Machine Learning Techniques Be Used To Predict Market Direction? The 1,000,000 Model Test.

During the 1990s the advent of Neural Networks unleashed a torrent of research on their applications in financial markets, accompanied by some rather extravagant claims about their predictive abilities.  Sadly, much of the research proved to be sub-standard and the results illusory, following which the topic was largely relegated to the bleachers, at least in the field of financial market research.

With the advent of new machine learning techniques such as Random Forests, Support Vector Machines and Nearest Neighbor Classification, there has been a resurgence of interest in non-linear modeling techniques and a flood of new research, a fair amount of it supportive of their potential for forecasting financial markets.  Once again, however, doubts about the quality of some of the research bring the results into question.


Against this background, my co-researcher Dan Rico and I set out to address the question of whether these new techniques really do have predictive power, more specifically the ability to forecast market direction.  Using some excellent MATLAB toolboxes and a new software package, an Excel Add-in called 11Ants that makes large-scale testing of multiple models a snap, we examined over 1,000,000 models and model-ensembles, covering just about every available non-linear technique.  The data set for our study comprised daily prices for a selection of US equity securities, together with a large selection of technical indicators for which some other researchers have claimed explanatory power.

In-Sample Equity Curve for Best Performing Nonlinear Model

The answer provided by our research was, without exception, in the negative: not one of the models tested showed any significant ability to predict the direction of any of the securities in our data set.  Furthermore, our study found that the best-performing models favored raw price data over technical indicator variables, suggesting that the latter have little explanatory power.

As with Neural Networks, the principal difficulty with non-linear techniques appears to be curve-fitting and a failure to generalize:  while it is very easy to find models that provide an excellent fit to in-sample data, the forecasting performance out-of-sample is often very poor.
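
This failure to generalize is easy to reproduce.  The sketch below (in Python, using synthetic noise rather than our actual equity data set) fits a one-nearest-neighbor classifier to a series that is unpredictable by construction: the in-sample hit rate is essentially perfect, while out-of-sample accuracy collapses to a coin-flip:

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier

rng = np.random.default_rng(42)
returns = rng.standard_normal(2000)  # i.i.d. noise: no predictability by construction

# Features: the previous 5 "returns"; target: the sign of the next one
X = np.lib.stride_tricks.sliding_window_view(returns[:-1], 5)
y = (returns[5:] > 0).astype(int)

split = 1500
model = KNeighborsClassifier(n_neighbors=1).fit(X[:split], y[:split])

in_sample = model.score(X[:split], y[:split])    # near-perfect "fit"
out_sample = model.score(X[split:], y[split:])   # no better than chance
```

The gap between the two scores is the curve-fitting effect in miniature: the model has memorized noise, not learned structure.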

Out-of-Sample Equity Curve for Best Performing Nonlinear Model

Some caveats about our own research apply.  First and foremost, it is of course impossible to prove a hypothesis in the negative.  Secondly, it is plausible that some markets are less efficient than others:  some studies have claimed success in developing predictive models due to the (relative) inefficiency of the F/X and futures markets, for example.  Thirdly, the choice of sample period may be criticized:  it could be that the models were over-conditioned on a too-lengthy in-sample data set, which in one case ran from 1993 to 2008, with just two years (2009-2010) of out-of-sample data.  The choice of sample was deliberate, however:  had we omitted the 2008 period from the “learning” data set, it would be very easy to criticize the study for failing to allow the algorithms to learn about the exceptional behavior of the markets during that turbulent year.

Despite these limitations, our research casts doubt on the findings of some less-extensive studies, which may be the result of sample-selection bias.  One characteristic of the most credible studies finding evidence in favor of market predictability, such as those by Pesaran and Timmermann (see paper for citations), is that the models they employ tend to incorporate independent explanatory variables, such as yield spreads, which do appear to have real explanatory power.  The findings of our study suggest that, absent such explanatory factors, the ability to predict markets using sophisticated non-linear techniques applied to price data alone may prove to be as illusory as it was in the 1990s.

 

ONE MILLION MODELS

Systematic Futures Trading

In its proprietary trading, Systematic Strategies' primary focus is on equity and volatility strategies, both low and high frequency. In futures, the emphasis is on high frequency trading, although we also run one or two lower frequency strategies that have higher capacity, such as the Futures WealthBuilder. The version of WealthBuilder running on the Collective 2 site has performed very well in 2017, with net returns of 30% and a Sharpe Ratio of 3.4:

Futures C2 oct 2017

 

In the high frequency space, our focus is on strategies with very high Sharpe Ratios and low drawdowns. We trade a range of futures products, including equity, fixed income, metals and energy markets. Despite the current low levels of market volatility, these strategies have performed well in 2017:

HFT Futures Oct 2017 (NFA)

Building high frequency strategies with double-digit Sharpe Ratios requires a synergy of computational capability and modeling know-how. The microstructure of futures markets is, of course, substantially different to that of equity or forex markets and the components of the model that include microstructure effects vary widely from one product to another. There can be substantial variations too in the way that time is handled in the model – whether as discrete or continuous “wall time”, in trade time, or some other measure. But some of the simple technical indicators we use – moving averages, for example – are common to many models across different products and markets. Machine learning plays a role in most of our trading strategies, including high frequency.

Here are some relevant blog posts that you may find interesting:

http://jonathankinlay.com/2016/04/high-frequency-trading-equities-vs-futures/

 

http://jonathankinlay.com/2015/05/designing-scalable-futures-strategy/

 

http://jonathankinlay.com/2014/10/day-trading-system-in-vix-futures/

A Winer Process

No doubt many of you sharp-eyed readers will have spotted a spelling error, thinking I intended to refer to one of these:

Fig 1

 

But, in fact, I really did have in mind something more like this:

 

wine pour

 

We are following an example from the recently published Mathematica Beyond Mathematics by Jose Sanchez Leon, an up-to-date text that describes many of the latest features in Mathematica, illustrated with interesting applications. Sanchez Leon shows how Mathematica’s machine learning capabilities can be applied to the craft of wine-making.


We begin by loading a curated Wolfram dataset comprising measurements of the physical properties and quality of wines:

Fig 2

A Machine Learning Prediction Model for Wine Quality

We’re going to apply Mathematica’s built-in machine learning algorithms to train a predictor of wine quality, using the training dataset. Mathematica determines that the most effective machine learning technique in this case is Random Forest and after a few seconds produces the predictor function:

Fig 3
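
For readers without Mathematica, the same idea can be sketched in Python with scikit-learn's RandomForestRegressor.  The data below are a synthetic stand-in for the Wolfram wine dataset, using two of the physical properties discussed in this post with an invented quality relationship, so the numbers are purely illustrative:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 1000
# Synthetic stand-in for the wine data: two physical properties,
# with an invented linear quality relationship plus noise
pH = rng.uniform(2.9, 3.9, n)
alcohol = rng.uniform(8.0, 14.0, n)
quality = 3.0 + 0.3 * alcohol + 0.5 * pH + rng.normal(0.0, 0.3, n)

X = np.column_stack([pH, alcohol])
predictor = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, quality)

# Predict the quality of an "unknown" wine from its physical properties
estimate = predictor.predict([[3.4, 12.5]])[0]
```

This is the same train-then-predict workflow that Mathematica's Predict function automates, including its selection of Random Forest as the underlying method.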

 

Mathematica automatically selects what it considers to be the best performing model from several available machine learning algorithms:

machine learning methods

Let’s take a look at how well the predictor performs on the test dataset of 1,298 wines:

Fig 4

We can use the predictor function to predict the quality of an unknown wine, based on its physical properties:

Fig 5

Next we create a function to predict the quality of an unknown wine as a function of just two of its characteristics, its pH and alcohol level.  The analysis suggests that the quality of our unknown wine could be improved by increasing both its pH and alcohol content:

Fig 6

Applications and Examples

This simple toy example illustrates how straightforward it is to deploy machine learning techniques in Mathematica.  Machine Learning and Neural Networks became a major focus for Wolfram Research in version 10, and the software’s capabilities have been significantly enhanced in version 11, with several applications such as text and sentiment analysis that have direct relevance to trading system development:

Fig 7

For other detailed examples see:

http://jonathankinlay.com/2016/08/machine-learning-model-spy/

http://jonathankinlay.com/2016/11/trading-market-sentiment/

 

http://jonathankinlay.com/2016/08/dynamic-time-warping/


Correlation Copulas

Continuing a previous post, in which we modeled the relationship in the levels of the VIX Index and the Year 1 and Year 2 CBOE Correlation Indices, we next turn our attention to modeling changes in the VIX index.

In case you missed it, the post can be found here:

http://jonathankinlay.com/2017/08/correlation-cointegration/

We saw previously that the levels of the three indices are all highly correlated, and we were able to successfully account for approximately half the variation in the VIX index using either linear regression models or non-linear machine-learning models that incorporated the two correlation indices.  It turns out that the log-returns processes are also highly correlated:

Fig1 Fig2

A Linear Model of VIX Returns

We can create a simple linear regression model that relates log-returns in the VIX index to contemporaneous log-returns in the two correlation indices, as follows.  The derived model accounts for just under 40% of the variation in VIX index returns, with each correlation index contributing approximately one half of the total VIX return.
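
A minimal version of this regression can be sketched in Python.  The series below are synthetic stand-ins, calibrated so that each correlation index contributes roughly half of the VIX return and the model explains a little under 40% of the variance, as in the fitted model; the actual data come from the CBOE:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1000
# Synthetic stand-ins for daily log-returns of the two correlation
# indices (the actual series are downloaded from the CBOE web site)
corr1 = rng.normal(0.0, 0.020, n)
corr2 = 0.7 * corr1 + rng.normal(0.0, 0.015, n)   # the two indices co-move
vix = 0.5 * corr1 + 0.5 * corr2 + rng.normal(0.0, 0.025, n)

# OLS regression of VIX log-returns on the two correlation-index returns
X = np.column_stack([np.ones(n), corr1, corr2])
beta, *_ = np.linalg.lstsq(X, vix, rcond=None)

resid = vix - X @ beta
r_squared = 1.0 - resid.var() / vix.var()
```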


Fig3

Non-Linear Model of VIX Returns

Although the linear model is highly statistically significant, we see clear evidence of lack of fit in the model residuals, which indicates non-linearities present in the relationship.  So next we use a nearest-neighbor algorithm, a machine learning technique that allows us to model non-linear components of the relationship.  The residual plot from the nearest neighbor model clearly shows that it does a better job of capturing these nonlinearities, with a lower standard deviation in the model residuals compared to the linear regression model:

Fig4
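
The improvement is straightforward to demonstrate.  In the sketch below (synthetic data with an invented interaction term, standing in for the actual index returns), a k-nearest-neighbor regression produces a smaller residual standard deviation than the linear benchmark, precisely because it can bend around the non-linear component:

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor

rng = np.random.default_rng(2)
n = 1500
x = rng.normal(0.0, 0.02, (n, 2))
# Linear terms plus an interaction that a linear model cannot capture
y = 0.5 * x[:, 0] + 0.5 * x[:, 1] + 8.0 * x[:, 0] * x[:, 1] + rng.normal(0.0, 0.005, n)

# Linear benchmark
X = np.column_stack([np.ones(n), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
linear_resid_std = (y - X @ beta).std()

# Nearest-neighbor model picks up the non-linear component
knn = KNeighborsRegressor(n_neighbors=15).fit(x, y)
knn_resid_std = (y - knn.predict(x)).std()
```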

Correlation Copulas

Another approach entails the use of copulas to model the inter-dependency between the volatility and correlation indices.  For a fairly detailed exposition on copulas, see the following blog posts:

http://jonathankinlay.com/2017/01/copulas-risk-management/

 

http://jonathankinlay.com/2017/03/pairs-trading-copulas/

We begin by taking a smaller sample comprising around three years of daily returns in the indices.  This minimizes the impact of any long-term nonstationarity in the processes and enables us to fit marginal distributions relatively easily.  First, let’s look at the correlations in our sample data:

Fig5

We next proceed to fit marginal distributions to the VIX and Correlation Index processes.  It turns out that the VIX process is well represented by a Logistic distribution, while the two Correlation Index returns processes are better represented by a Student-T density.  In all three cases there is little evidence of lack of fit, either in the body or tails of the estimated probability density functions:

Fig6 Fig7 Fig8
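
The analogous fits can be sketched in Python with scipy.stats.  The samples below are synthetic, drawn from the same families the analysis identifies (logistic for VIX returns, Student-t for the correlation indices), with invented scale and degrees-of-freedom parameters:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
# Synthetic stand-ins for the daily log-return series
vix_ret = stats.logistic.rvs(loc=0.0, scale=0.04, size=750, random_state=rng)
corr_ret = stats.t.rvs(4, loc=0.0, scale=0.02, size=750, random_state=rng)

# Maximum-likelihood fits of the candidate marginal densities
loc_l, scale_l = stats.logistic.fit(vix_ret)
df_t, loc_t, scale_t = stats.t.fit(corr_ret)

# Kolmogorov-Smirnov tests as a check for lack of fit
ks_logistic = stats.kstest(vix_ret, "logistic", args=(loc_l, scale_l))
ks_t = stats.kstest(corr_ret, "t", args=(df_t, loc_t, scale_t))
```

A large KS p-value is consistent with an adequate fit, in the body and the tails alike.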

The final step is to fit a copula to model the joint density between the indices.  To keep it simple I have chosen to carry out the analysis for the combination of the VIX index with only the first of the correlation indices, although in principle there is no reason why a copula could not be estimated for all three indices.  The fitted model is a multinormal Gaussian copula with a correlation coefficient of 0.69.  Of course, other copulas are feasible (Clayton, Gumbel, etc.), but the Gaussian model appears to provide an adequate fit to the empirical copula, with approximate symmetry in the left and right tails.

Fig9
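
The copula fit itself reduces to two steps: map each return series to uniforms through its fitted marginal CDF, then estimate the correlation of the resulting normal scores.  The Python sketch below uses a synthetic joint sample, with the true copula correlation set to 0.69 to match the value reported above:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 750
# Synthetic joint sample: Gaussian dependence (rho = 0.69) pushed
# through logistic and Student-t marginals with invented scales
z = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.69], [0.69, 1.0]], size=n)
vix_ret = stats.logistic.ppf(stats.norm.cdf(z[:, 0]), scale=0.04)
corr_ret = stats.t.ppf(stats.norm.cdf(z[:, 1]), 4, scale=0.02)

# Step 1: map each series to (0, 1) through its fitted marginal CDF
u1 = stats.logistic.cdf(vix_ret, *stats.logistic.fit(vix_ret))
u2 = stats.t.cdf(corr_ret, *stats.t.fit(corr_ret))

# Step 2: the Gaussian copula parameter is the correlation of the normal scores
rho = np.corrcoef(stats.norm.ppf(u1), stats.norm.ppf(u2))[0, 1]
```

The recovered rho should sit close to the 0.69 used to generate the sample.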


Modeling Volatility and Correlation

In a previous blog post I mentioned the VVIX/VIX Ratio, which is measured as the ratio of the CBOE VVIX Index to the VIX Index. The former measures the volatility of the VIX, or the volatility of volatility.

http://jonathankinlay.com/2017/07/market-stress-test-signals-danger-ahead/

A follow-up article in ZeroHedge shortly afterwards pointed out that the VVIX/VIX ratio had reached record highs, prompting Goldman Sachs analyst Ian Wright to comment that this could signal the ending of the current low-volatility regime:

vvix to vix 2_0


A LinkedIn reader pointed out that individual stock volatility was currently quite high, and that when selling index volatility one is effectively selling stock correlations, which had now reached historically low levels. I concurred:

What’s driving the low vol regime is the exceptionally low level of cross-sectional correlations. And, as correlations tighten, index vol will rise. Worse, we are likely to see a feedback loop: higher vol leading to higher correlations, further accelerating the rise in index vol. So there is a second-order, Gamma effect going on. We see this in the very high levels of the VVIX index, which shot up to 130 last week. The all-time high in the VVIX prior to Aug 2015 was around 120. The intra-day high in Aug 2015 reached 225. I’m guessing it will get back up there at some point, possibly this year.


As there appears to be some interest in the subject I decided to add a further blog post looking a little further into the relationship between volatility and correlation.  To gain some additional insight we are going to make use of the CBOE implied correlation indices.  The CBOE web site explains:

Using SPX options prices, together with the prices of options on the 50 largest stocks in the S&P 500 Index, the CBOE S&P 500 Implied Correlation Indexes offer insight into the relative cost of SPX options compared to the price of options on individual stocks that comprise the S&P 500.

  • CBOE calculates and disseminates two indexes tied to two different maturities, usually one year and two years out. The index values are published every 15 seconds throughout the trading day.
  • Both are measures of the expected average correlation of price returns of S&P 500 Index components, implied through SPX option prices and prices of single-stock options on the 50 largest components of the SPX.
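
The mechanics behind an average implied correlation can be illustrated with the standard dispersion-trading identity, which writes index variance in terms of component variances and a single correlation parameter.  The function below inverts that identity for rho; it is a simplified sketch, not the CBOE's exact published methodology:

```python
import numpy as np

def implied_correlation(index_vol, weights, stock_vols):
    """Average implied correlation from index and single-stock option vols.

    Inverts sigma_I^2 = sum_i w_i^2 s_i^2 + 2 rho sum_{i<j} w_i w_j s_i s_j
    for rho (the standard dispersion-trading approximation; the CBOE's
    methodology differs in its details).
    """
    w, s = np.asarray(weights), np.asarray(stock_vols)
    own = np.sum(w**2 * s**2)
    cross = (np.sum(w * s) ** 2 - own) / 2.0
    return (index_vol**2 - own) / (2.0 * cross)

# Two equally-weighted stocks, each at 30% vol: an index vol of about 26%
# corresponds to an average implied correlation of one half
rho = implied_correlation(np.sqrt(0.0675), [0.5, 0.5], [0.3, 0.3])
```

When index options are rich relative to single-stock options, this rho comes out high, which is exactly the signal the dispersion trade described below exploits.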

Dispersion Trading

One application is dispersion trading, which the CBOE site does a good job of summarizing:

The CBOE S&P 500 Implied Correlation Indexes may be used to provide trading signals for a strategy known as volatility dispersion (or correlation) trading. For example, a long volatility dispersion trade is characterized by selling at-the-money index option straddles and purchasing at-the-money straddles in options on index components. One interpretation of this strategy is that when implied correlation is high, index option premiums are rich relative to single-stock options. Therefore, it may be profitable to sell the rich index options and buy the relatively inexpensive equity options.

The VIX Index and the Implied Correlation Indices

Again, the CBOE web site is worth quoting:

The CBOE S&P 500 Implied Correlation Indexes measure changes in the relative premium between index options and single-stock options. A single stock’s volatility level is driven by factors that are different from what drives the volatility of an Index (which is a basket of stocks). The implied volatility of a single-stock option simply reflects the market’s expectation of the future volatility of that stock’s price returns. Similarly, the implied volatility of an index option reflects the market’s expectation of the future volatility of that index’s price returns. However, index volatility is driven by a combination of two factors: the individual volatilities of index components and the correlation of index component price returns.

Let’s dig into this analytically.  We first download and plot the daily data for the VIX and Correlation Indices from the CBOE web site, from which it is evident that all three series are highly correlated:

Fig1

An inspection reveals significant correlations between the VIX index and the two implied correlation indices, which are themselves highly correlated.  The S&P 500 Index is, of course, negatively correlated with all three indices:

Fig8

Modeling Volatility-Correlation

The response surface that describes the relationship between the VIX index and the two implied correlation indices is locally very irregular, but the slope of the surface is generally positive, as we would expect, since the level of VIX correlates positively with that of the two correlation indices.

Fig2

The most straightforward approach is to use a simple linear regression specification to model the VIX level as a function of the two correlation indices.  We create a VIX Model Surface object using this specification with the Mathematica Predict function:

Fig3

The linear model does quite a good job of capturing the positive gradient of the response surface, and in fact has a considerable amount of explanatory power, accounting for a little under half the variance in the level of the VIX index:

Fig 4

However, there are limitations.  To begin with, the assumption of independence between the explanatory variables, the correlation indices, clearly does not hold.  In cases such as this, where explanatory variables are multicollinear, we are unable to draw inferences about the explanatory power of individual regressors, even though the model as a whole may be highly statistically significant, as here.

Secondly, a linear regression model is not going to capture non-linearities in the volatility-correlation relationship that are evident in the surface plot.  This is confirmed by a comparison plot, which shows that the regression model underestimates the VIX level for both low and high values of the index:

Fig5

We can achieve a better outcome using a machine learning algorithm such as nearest neighbor, which is able to account for non-linearities in the response surface:

Fig6

The comparison plot shows a much closer correspondence between actual and predicted values of the VIX index,  even though there is evidence of some remaining heteroscedasticity in the model residuals:

Fig7

Conclusion

A useful way to think about index volatility is as a two dimensional process, with time-series volatility measured on one dimension and dispersion (cross-sectional volatility, the inverse of correlation) measured on the second.  The two factors are correlated and, as we have shown here, interact in a complicated, non-linear way.

The low levels of index volatility we have seen in recent months result, not from low levels of volatility in component stocks, but from the historically low levels of correlation (high levels of dispersion) in the underlying stock returns processes. As correlations begin to revert to historical averages, the impact will be felt in an upsurge in index volatility, compounded by the non-linear interaction between the two factors.

 

Volatility ETF Trader – June 2017: +15.3%

The Volatility ETF Trader product is an algorithmic strategy that trades several VIX ETFs using statistical and machine learning algorithms.

We offer a version of the strategy on the Collective 2 site (see here for details) that the user can subscribe to for a very modest fee of only $149 per month.

The risk-adjusted performance of the Collective 2 version of the strategy is unlikely to prove as good as the product we offer in our Systematic Strategies Fund, which trades a much wider range of algorithmic strategies.  There are other important differences too:  the Fund’s Systematic Volatility Strategy makes no use of leverage and only trades intra-day, exiting all positions by market close.  So it has a more conservative risk profile, suitable for longer term investment.

The Volatility ETF Trader on Collective 2, on the other hand, is a highly leveraged, tactical strategy that trades positions overnight and holds them for periods of several days.  As a consequence, the Collective 2 strategy is far more risky and is likely to experience significant drawdowns.  Those caveats aside, the strategy returns have been outstanding:  +48.9% for 2017 YTD and a total of +107.8% from inception in July 2016.

You can find full details of the strategy, including a listing of all of the trades, on the Collective 2 site.

Subscribers can sign up for a free, seven day trial and thereafter they can choose to trade the strategy automatically in their own brokerage account.

 

VIX ETF Strategy June 2017

Futures WealthBuilder – June 2017: +4.4%

The Futures WealthBuilder product is an algorithmic CTA strategy that trades several highly liquid futures contracts using machine learning algorithms.  More details about the strategy are given in this blog post.

We offer a version of the strategy on the Collective 2 site (see here for details) that the user can subscribe to for a very modest fee of only $149 per month.  The Collective 2 version of the strategy is unlikely to perform as well as the product we offer in our Systematic Strategies Fund, which trades a much wider range of futures products.  But the strategy is off to an excellent start, making +4.4% in June and is now up 6.7% since inception in May.  In June the strategy made profitable trades in US Bonds, Euro F/X and VIX futures, and the last seven trades in a row have been winners.

You can find full details of the strategy, including a listing of all of the trades, on the Collective 2 site.

Subscribers can sign up for a free, seven day trial and thereafter they can choose to trade the strategy automatically in their own brokerage account, using the Collective 2 API.

Futures WealthBuilder June 2017

Futures WealthBuilder

We are launching a new product, the Futures WealthBuilder,  a CTA system that trades futures contracts in several highly liquid financial and commodity markets, including SP500 EMinis, Euros, VIX, Gold, US Bonds, 10-year and five-year notes, Corn, Natural Gas and Crude Oil.  Each  component strategy uses a variety of machine learning algorithms to detect trends, seasonal effects and mean-reversion.  We develop several different types of model for each market, and deploy them according to their suitability for current market conditions.

Performance of the strategy (net of fees) since 2013 is detailed in the charts and tables below.  Notable features include a Sharpe Ratio of just over 2, an annual rate of return of 190% on an account size of $50,000, and a maximum drawdown of around 8% over the last three years.  It is worth mentioning, too, that the strategy produces approximately equal rates of return on both long and short trades, with an overall profit factor above 2.

 

Fig1

 

Fig2

 

 

Fig3

Fig4

 

Fig5

Low Correlation

Despite a high level of correlation between several of the underlying markets, the correlations between the component strategies of Futures WealthBuilder are, in the majority of cases, negligibly small (with a few exceptions, such as the high correlation between the 10-year and 5-year note strategies).  This accounts for the relatively high level of return in relation to portfolio risk, as measured by the Sharpe Ratio.  We offer strategies in both products chiefly as a means of providing additional liquidity, rather than for their diversification benefit.

Fig 6

Strategy Robustness

Strategy robustness is a key consideration in the design stage.  We use Monte Carlo simulation to evaluate scenarios not seen in historical price data in order to ensure consistent performance across the widest possible range of market conditions.  Our methodology introduces random fluctuations to historical prices, increasing or decreasing them by as much as 30%.  We allow similar random fluctuations in the values of strategy parameters, to ensure that our models perform consistently without being overly sensitive to the specific parameter values we have specified.  Finally, we allow the start date of each sub-system to vary randomly by up to a year.

The effect of these variations is to produce a wide range of outcomes in terms of strategy performance.  We focus on the 5% worst outcomes, ranked by profitability, and select only those strategies whose performance is acceptable under these adverse scenarios.  In this way we reduce the risk of overfitting the models while providing more realistic expectations of model performance going forward.  This procedure also has the effect of reducing portfolio tail risk, and the maximum peak-to-valley drawdown likely to be produced by the strategy in future.

GC Daily Stress Test
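
The procedure can be sketched as follows.  The strategy and prices here are toy stand-ins (a simple moving-average crossover on a synthetic path, with independent per-price shocks as one simplified reading of the perturbation scheme), not our production models; the point is the resampling logic of shocking prices by up to 30%, re-running the system many times, and ranking on the 5th-percentile outcome:

```python
import numpy as np

rng = np.random.default_rng(5)

def moving_average_pnl(prices, fast=10, slow=40):
    """P&L of a toy moving-average crossover system (illustration only)."""
    f = np.convolve(prices, np.ones(fast) / fast, mode="valid")
    s = np.convolve(prices, np.ones(slow) / slow, mode="valid")
    pos = np.sign(f[-len(s):] - s)           # +1 long, -1 short
    rets = np.diff(prices[-len(s):])
    return float(np.sum(pos[:-1] * rets))

# Synthetic "historical" price path standing in for a real contract
hist = 100.0 + np.cumsum(rng.normal(0.05, 1.0, 1500))

# Stress test: shock every price by up to +/-30% and re-run the system
outcomes = [moving_average_pnl(hist * rng.uniform(0.7, 1.3, size=hist.shape))
            for _ in range(500)]

# Rank by profitability and focus on the 5% worst outcomes
worst_5pct = float(np.percentile(outcomes, 5))
```

A candidate strategy is retained only if `worst_5pct` remains acceptable, which penalizes models whose historical performance depends on the exact price path they were fitted to.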

Futures WealthBuilder on Collective 2

We will be running a variant of the Futures WealthBuilder strategy on the Collective 2 site, using a subset of the strategy models in several futures markets (see this page for details).  Subscribers will be able to link and auto-trade the strategy in their own account, assuming they make use of one of the approved brokerages, which include Interactive Brokers, MB Trading and several others.

Obviously the performance is unlikely to be as good as the complete strategy, since several component sub-strategies will not be traded on Collective 2.  However, this does give the subscriber the option to trial the strategy in simulation before plunging in with real money.

Fig7