Developing Trading Strategies With Synthetic Data

One of the main criticisms levelled at systematic trading over the last few years is that the over-use of historical market data has tended to produce curve-fitted strategies that perform poorly out of sample in a live trading environment. This is indeed a valid criticism – given enough attempts one is bound to arrive eventually at a strategy that performs well in backtest, even on a holdout data sample. But that by no means guarantees that the strategy will continue to perform well going forward.

The solution to the problem has been clear for some time: what is required is a method of producing synthetic market data that can be used to build a strategy and test it under a wide variety of simulated market conditions. A strategy built in this way is more likely to survive the challenge of live trading than one that has been developed using only a single historical data path.

The problem, however, has been in implementation. Up until now all the attempts to produce credible synthetic price data have failed, for one reason or another, as I described in an earlier post:

I have been able to devise a completely new algorithm for generating artificial price series that meet all of the key requirements, as follows:

  • Computational simplicity & efficiency. Important if we are looking to mass-produce synthetic series for a large number of assets, for a variety of different applications. Some deep learning methods would struggle to meet this requirement, even supposing that transfer learning is possible.
  • The ability to produce price series that are internally consistent (i.e. High > Low, etc.) in every case.
  • Should be able to produce a range of synthetic series that vary widely in their correspondence to the original price series. In some cases we want synthetic price series that are highly correlated to the original; in other cases we might want to test our investment portfolio or risk control systems under extreme conditions never before seen in the market.
  • The distribution of returns in the synthetic series should closely match the historical series, being non-Gaussian and with “fat-tails”.
  • The ability to incorporate long memory effects in the sequence of returns.
  • The ability to model GARCH effects in the returns process.
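As a rough illustration of how one might verify that a generated series meets these requirements, here is a minimal sketch (my own, assuming a hypothetical OHLC DataFrame of synthetic prices) that checks internal consistency, fat tails and volatility clustering. It is not the generation algorithm itself, just a sanity check of its output.

```python
import numpy as np
import pandas as pd
from scipy.stats import kurtosis

def validate_synthetic(ohlc: pd.DataFrame) -> dict:
    """Basic sanity checks on a synthetic OHLC price series.

    ohlc is assumed to have columns 'Open', 'High', 'Low', 'Close'.
    """
    # Internal consistency: High must be the ceiling and Low the floor of each bar
    consistent = bool(
        (ohlc["High"] >= ohlc[["Open", "Close", "Low"]].max(axis=1)).all()
        and (ohlc["Low"] <= ohlc[["Open", "Close", "High"]].min(axis=1)).all()
    )

    rets = np.log(ohlc["Close"]).diff().dropna()

    # Fat tails: excess kurtosis well above zero (a Gaussian would give ~0)
    excess_kurt = kurtosis(rets, fisher=True)

    # Volatility clustering (GARCH-like behaviour): positive autocorrelation
    # in absolute returns at lag 1
    vol_cluster = rets.abs().autocorr(lag=1)

    return {
        "ohlc_consistent": consistent,
        "excess_kurtosis": float(excess_kurt),
        "abs_return_autocorr_lag1": float(vol_cluster),
    }
```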

This means that we are now in a position to develop trading strategies without any direct reference to the underlying market data. Consequently we can then use all of the real market data for out-of-sample back-testing.

Developing a Trading Strategy for the S&P 500 Index Using Synthetic Market Data

To illustrate the procedure I am going to use daily synthetic price data for the S&P 500 Index over the period from Jan 1999 to July 2022. Details of the characteristics of the synthetic series are given in the post referred to above.


Because we want to create a trading strategy that will perform under market conditions close to those currently prevailing, I will downsample the synthetic series to include only those that correlate quite closely with the real price data, i.e. with a minimum correlation of 0.75.

Why do this? Surely if we want to make a strategy as robust as possible we should use all of the synthetic data series for model development?

The reason is that I believe that some of the more extreme adverse scenarios generated by the algorithm may occur quite rarely, perhaps once in every few decades. However, I am principally interested in a strategy that I can apply under current market conditions and I am prepared to take my chances that the worst-case scenarios are unlikely to come about any time soon. This is a major design decision, one that you may disagree with. Of course, one could make use of every available synthetic data series in the development of the trading model, and by doing so would likely produce a model that is more robust. But the training could take longer and the performance during normal market conditions may not be as good.
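A minimal sketch of the downsampling step might look like the following, assuming the synthetic series are held as a list of price series. The 0.75 threshold is the one quoted above; correlation is measured here on daily returns, though one could equally use price levels.

```python
import pandas as pd

def select_correlated_series(real_close: pd.Series,
                             synthetic_closes: list[pd.Series],
                             min_corr: float = 0.75) -> list[pd.Series]:
    """Keep only the synthetic series whose daily returns correlate with the
    real series at or above min_corr."""
    real_rets = real_close.pct_change().dropna()
    selected = []
    for synth in synthetic_closes:
        synth_rets = synth.pct_change().dropna()
        # Align on common dates before measuring correlation
        joined = pd.concat([real_rets, synth_rets], axis=1, join="inner").dropna()
        if joined.iloc[:, 0].corr(joined.iloc[:, 1]) >= min_corr:
            selected.append(synth)
    return selected
```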

Having generated the price series, the process I am going to follow is to use genetic programming to develop trading strategies that will be evaluated on all of the synthetic data series simultaneously. I will then use the performance of the aggregate portfolio, i.e. the outcome of all of the trades generated by the strategy when applied to all of the synthetic series, to assess the overall performance. In order to be considered, candidate strategies have to perform well under all of the different market scenarios, or at least the great majority of them. This ensures that the strategy is likely to prove more robust across different types of market conditions, rather than on just the single type of market scenario observed in the real historical series.
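The genetic programming itself is handled by the Builder software described below, so the sketch that follows is only meant to convey the shape of the fitness evaluation: a candidate strategy is scored on the aggregate portfolio, but rejected outright if it fails on too many of the individual synthetic scenarios. The 80% pass threshold is my own illustrative choice.

```python
import numpy as np

def aggregate_fitness(per_series_pnl: list[np.ndarray],
                      min_pass_fraction: float = 0.8) -> float:
    """Score a candidate strategy evaluated on several synthetic series.

    per_series_pnl: one array of trade P&L per synthetic series.
    Returns -inf if the strategy fails on too many scenarios, otherwise
    the net profit of the aggregate portfolio (all trades pooled).
    """
    profitable = [pnl.sum() > 0 for pnl in per_series_pnl]
    if np.mean(profitable) < min_pass_fraction:
        return -np.inf  # rejected: not robust across enough scenarios
    # Aggregate portfolio = sum of all trades across all series
    return float(sum(pnl.sum() for pnl in per_series_pnl))
```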

As usual in these cases I will reserve a portion (10%) of each data series for testing each strategy, and a further 10% sample for out-of-sample validation. This isn’t strictly necessary: since the real data series has not been used directly in the development of the trading system, we can later test the strategy on all of the historical data and regard this as an out-of-sample backtest.

To implement the procedure I am going to use Mike Bryant’s excellent Adaptrade Builder software.

This is an exemplar of outstanding software engineering and provides a broad range of features for generating trading strategies of every kind. One feature of Builder that is particularly useful in this context is its ability to construct strategies and test them on up to 20 data series concurrently. This enables us to develop a strategy using all of the synthetic data series simultaneously, showing the performance of the strategy on each individual data series as well as on the aggregate portfolio.

After evolving strategies for 50 generations we arrive at the following outcome:

The equity curve for the aggregate portfolio is shown in blue, while the equity curves for the strategy applied to individual synthetic data series are shown towards the bottom of the chart. Of course, the performance of the aggregate portfolio appears much superior to any of the individual strategies, because it is effectively the arithmetic sum of the individual equity curves. And just because the aggregate portfolio appears to perform well both in-sample and out-of-sample, that doesn’t imply that the strategy works equally well for every individual market scenario. In some scenarios it performs better than in others, as can be observed from the individual equity curves.

But, in any case, our objective here is not to create a stock portfolio strategy, but rather to trade a single asset – the S&P 500 Index. The role of the aggregate portfolio is simply to suggest that we may have found a strategy that is sufficiently robust to work well across a variety of market conditions, as represented by the various synthetic price series.

Builder generates code for the strategies it evolves in a number of different languages and in this case we take the EasyLanguage code for the fittest strategy #77 and apply it to a daily chart for the S&P 500 Index – i.e. the real data series – in Tradestation, with the following results:

The strategy appears to work well “out-of-the-box”, i.e. without any further refinement. So our quest for a robust strategy appears to have been quite successful, given that none of the 23-year span of real market data on which the strategy was tested was used in the development process.

We can take the process a little further, however, by “optimizing” the strategy. Traditionally this would mean finding the optimal set of parameters that produces the highest net profit on the test data. But this would be curve fitting in the worst possible sense, and is not at all what I am suggesting.

Instead we use a procedure known as Walk Forward Optimization (WFO), as described in this post:

The goal of WFO is not to curve-fit the best parameters, which would entirely defeat the object of using synthetic data. Instead, its purpose is to test the robustness of the strategy. We accomplish this by using a sequence of overlapping in-sample and out-of-sample periods to evaluate how well the strategy stands up, assuming the parameters are optimized on in-sample periods of varying size and start date and tested on similarly varying out-of-sample periods. A strategy that fails a cluster of such tests is unlikely to prove robust in live trading. A strategy that passes a test cluster at least demonstrates some capability to perform well in different market regimes.
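For readers who want to see the mechanics, the sketch below generates the kind of rolling in-sample/out-of-sample windows a WFO run uses; a cluster test simply repeats this for several combinations of OOS percentage and number of runs. The exact windowing scheme used by the optimization software may differ.

```python
def walk_forward_windows(n_obs: int, n_runs: int, oos_frac: float = 0.2):
    """Yield (in_sample, out_of_sample) slice pairs for a rolling walk-forward test.

    Each run optimizes on an in-sample window and tests on the out-of-sample
    segment that immediately follows; the window then rolls forward by one
    OOS segment so that the OOS periods tile the back end of the history.
    """
    window = n_obs / (1 - oos_frac + n_runs * oos_frac)  # total IS+OOS length per run
    oos_len = int(round(window * oos_frac))
    is_len = int(round(window)) - oos_len
    for i in range(n_runs):
        start = i * oos_len
        yield (slice(start, start + is_len),
               slice(start + is_len, start + is_len + oos_len))

# Usage sketch: for a 1,000-bar history, 4 runs with 20% OOS per run
# for is_idx, oos_idx in walk_forward_windows(1000, n_runs=4, oos_frac=0.2):
#     ...optimize on data[is_idx], evaluate on data[oos_idx]...
```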

To some extent we might regard such a test as unnecessary, given that the strategy has already been observed to perform well under several different market conditions, encapsulated in the different synthetic price series, in addition to the real historical price series. Nonetheless, we conduct a WFO cluster test to further evaluate the robustness of the strategy.

As the goal of the procedure is not to maximize the theoretical profitability of the strategy, but rather to evaluate its robustness, we select a criterion other than net profit as the factor to optimize. Specifically, we select the sum of the areas of the strategy drawdowns as the quantity to minimize (by maximizing the inverse of the sum of drawdown areas, which amounts to the same thing). This requires a little explanation.

If we look at the strategy drawdown periods of the equity curve, we observe several periods (highlighted in red) in which the strategy was underwater:

The area of each drawdown represents the length and magnitude of the drawdown and our goal here is to minimize the sum of these areas, so that we reduce both the total duration and severity of strategy drawdowns.
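Computing the quantity to be minimized is straightforward; a minimal sketch, assuming the equity curve is supplied as a NumPy array:

```python
import numpy as np

def sum_drawdown_area(equity: np.ndarray) -> float:
    """Sum of drawdown 'areas': the depth of the drawdown on each bar, summed
    over all bars, so that both the duration and the magnitude of underwater
    periods are penalized."""
    running_peak = np.maximum.accumulate(equity)
    underwater = running_peak - equity          # zero whenever a new high is made
    return float(underwater.sum())
```

The optimizer then maximizes the inverse of this sum, as described above.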

In each WFO test we use a different % of OOS data and a different number of runs, assessing the performance of the strategy on a battery of different criteria.


These criteria not only include overall profitability, but also factors such as parameter stability, profit consistency in each test, the ratio of in-sample to out-of-sample profits, etc. In other words, this WFO cluster analysis is not about profit maximization, but robustness evaluation, as assessed by these several different metrics. And in this case the strategy passes every test with flying colors:

Other than validating the robustness of the strategy’s performance, the overall effect of the procedure is to slightly improve the equity curve by diminishing the magnitude and duration of the drawdown periods:

Conclusion

We have shown how, by using synthetic price series, we can build a robust trading strategy that performs well under a variety of different market conditions, including on previously “unseen” historical market data. Further analysis using cluster WFO tests strengthens the assessment of the strategy’s robustness.

Capitalizing on the Coming Market Crash

Long-Only Equity Investors

Recently I have been discussing possible areas of collaboration with an RIA contact on LinkedIn, who also happens to be very familiar with the hedge fund world.  He outlined the case of a high net worth investor in equities (long only), who wanted to remain invested, but was becoming increasingly concerned about the prospects for a significant market downturn, or even a market crash, similar to those of 2000 or 2008.

I am guessing he is not alone: hardly a day goes by without the publication of yet another article sounding a warning about stretched equity valuations and the dangerously elevated level of the market.

The question put to me was, what could be done to reduce the risk in the investor’s portfolio?

Typically, conservative investors would have simply moved more of their investment portfolio into fixed income securities, but with yields at such low levels this is hardly an attractive option today. Besides, many see the bond market as representing an even more extreme bubble than equities currently.


Hedging Strategies

The problem with traditional hedging mechanisms such as put options, for example, is that they are relatively expensive and can easily reduce annual returns from the overall portfolio by several hundred basis points.  Even at the current low level of volatility the performance drag is noticeable, since the potential upside in the equity portfolio is also lower than it has been for some time.  A further consideration is that many investors are not mandated – or are simply reluctant – to move beyond traditional equity investing into complex ETF products or derivatives.

An equity long/short hedge fund product is one possible solution, but many equity investors are reluctant to consider shorting stocks under any circumstances, even for hedging purposes. And while a short hedge may provide some downside protection it is unlikely to fully safeguard the investor in a crash scenario.  Furthermore, the cost of a hedge fund investment is typically greater than for a long-only product, entailing the payment of a performance fee in addition to management fees that are often higher than for standard investment products.

The Ideal Investment Strategy

Given this background, we can say that the ideal investment strategy is one that:

  • Invests long-only in equities
  • Is inexpensive to implement (reasonable management fees; no performance fees)
  • Does not require shorting stocks, or expensive hedging mechanisms such as options
  • Makes acceptable returns during both bull and bear markets
  • Is likely to produce positive returns in a market crash scenario

A typical buy-and-hold approach is likely to meet only the first three requirements, although an argument could be made that a judicious choice of defensive stocks might enable the investment portfolio to generate returns at an “acceptable” level during a downturn (without being prescriptive as to what the precise meaning of that term may be).  But no buy-and-hold strategy could ever be expected to prosper during times of severe market stress.  A more sophisticated approach is required.

Market Timing

Market timing is regarded as a “holy grail” by some quantitative strategists.  The idea, simply, is to increase or reduce risk exposure according to the prospects for the overall market.  For a very long time the concept has been dismissed as impossible, by definition, given that markets are mostly efficient.  But analysts have persisted in the attempt to develop market timing techniques, motivated by the enormous benefits that a viable market timing strategy would bring.  And gradually, over time, evidence has accumulated that the market can be timed successfully and profitably.  The rate of progress has accelerated in the last decade, driven by considerable advances in computing power, the development of machine learning algorithms, and the application of artificial intelligence to investment finance.

I have written several articles on the subject of market timing that the reader might be interested to review (see below).  In this article, however, I want to focus firstly on the work of another investment strategist, Blair Hull.

http://jonathankinlay.com/2014/07/how-to-bulletproof-your-portfolio/

 

http://jonathankinlay.com/2014/07/enhancing-mutual-fund-returns-with-market-timing/

The Hull Tactical Fund

Blair Hull rose to prominence in the 1980’s and 1990’s as the founder of the highly successful quantitative option market making firm, the Hull Trading Company which at one time moved nearly a quarter of the entire daily market volume on some markets, and executed over 7% of the index options traded in the US. The firm was sold to Goldman Sachs at the peak of the equity market in 1999, for a staggering $531 million.

Blair used the capital to establish the Hull family office, Hull Investments, and in 2013 founded an RIA, Hull Tactical Asset Allocation LLC.   The firm’s investment thesis is firmly grounded in the theory of market timing, as described in the paper “A Practitioner’s Defense of Return Predictability”,  authored by Blair Hull and Xiao Qiao, in which the issues and opportunities of market timing and return predictability are explored.

In 2015 the firm launched The Hull Tactical Fund (NYSE Arca: HTUS), an actively managed ETF that uses a quantitative trading model to take long and short positions in ETFs that seek to track the performance of the S&P 500, as well as leveraged ETFs or inverse ETFs that seek to deliver multiples, or the inverse, of the performance of the S&P 500.  The goal is to achieve long-term growth from investments in the U.S. equity and Treasury markets, independent of market direction.

How well has the Hull Tactical strategy performed? Since the fund takes the form of an ETF, its performance is a matter of public record and is published on the firm’s web site.  I reproduce the results here, which compare the performance of the HTUS ETF relative to the SPDR S&P 500 ETF (NYSE Arca: SPY):

 

Hull1

 

Hull3

 

Although the HTUS ETF has underperformed the benchmark SPY ETF since launching in 2015, it has produced a higher rate of return on a risk-adjusted basis, with a Sharpe ratio of 1.17 vs only 0.77 for SPY, as well as a lower drawdown (-3.94% vs. -13.01%).  This means that for the same “risk budget” as required to buy and hold SPY (i.e. an annual volatility of 13.23%), the investor could have achieved a total return of around 36% by using margin funds to leverage his investment in HTUS by a factor of 2.8x.
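The leverage arithmetic behind that statement can be checked in a couple of lines; the implied HTUS volatility is my own inference from the quoted figures, and margin financing costs and compounding effects are ignored.

```python
spy_vol = 0.1323          # annual volatility of SPY over the period (from the text)
leverage = 2.8            # leverage factor quoted in the text
htus_return = 0.1317      # total return of HTUS since launch (from the text)

# Implied HTUS volatility for the two risk budgets to match (my inference):
implied_htus_vol = spy_vol / leverage        # ~4.7% annually

# Levered total return, ignoring financing costs:
levered_return = leverage * htus_return      # ~36.9%, consistent with "around 36%"
print(implied_htus_vol, levered_return)
```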

How does the Hull Tactical team achieve these results?  While the detailed specifics are proprietary, we know from the background description that market timing (and machine learning concepts) are central to the strategy and this is confirmed by the dynamic level of the fund’s equity exposure over time:


Hull2

 

A Long-Only, Crash-Resistant Equity Strategy

A couple of years ago I and my colleagues carried out an investigation of long-only equity strategies as part of a research project.  Our primary focus was on index replication, but in the course of our research we came up with a methodology for developing long-only strategies that are highly crash-resistant.

The performance of our Long-Only Market Timing strategy is summarized below and compared with the performance of the HTUS ETF and benchmark SPY ETF (all results are net of fees).  Over the period from inception of the HTUS ETF, our LOMT strategy produced a higher total return than HTUS (22.43% vs. 13.17%), higher CAGR (10.07% vs. 6.04%), higher risk adjusted returns (Sharpe Ratio 1.34 vs 1.21) and larger annual alpha (6.20% vs 4.25%).  In broad terms, over this period the LOMT strategy produced approximately the same overall return as the benchmark SPY ETF, but with a little over half the annual volatility.

 

Fig4

 

Fig5

Application of Artificial Intelligence to Market Timing

Like the HTUS ETF, our LOMT strategy operates with very low fees, comparable to an ETF product rather than a hedge fund (1% management fee, no performance fees).  Again, like the HTUS ETF, our LOMT product makes no use of leverage.  However, unlike HTUS it avoids complicated (and expensive) inverse or leveraged ETF products and instead invests only in two assets – the SPY ETF and 91-day US Treasury Bills.  In other words, the LOMT strategy is a pure market timing strategy, moving capital between the SPY ETF and Treasury Bills depending on its forecast of future market performance.  These forecasts are derived from machine learning algorithms that are specifically tuned to minimize the downside risk in the investment portfolio.  This not only makes strategy returns less volatile, but also ensures that the strategy is very robust to market downturns.

In fact, even better than that,  not only does the LOMT strategy tend to avoid large losses during periods of market stress, it is capable of capitalizing on the opportunities that more volatile market conditions offer.  Looking at the compounded returns (net of fees) over the period from 1994 (the inception of the SPY ETF) we see that the LOMT strategy produces almost double the total profit of the SPY ETF, despite several years in which it underperforms the benchmark.  The reason is clear from the charts:  during the periods 2000-2002 and again in 2008, when the market crashed and returns in the SPY ETF were substantially negative, the LOMT strategy managed to produce positive returns.  In fact, the banking crisis of 2008 provided an exceptional opportunity for the LOMT strategy, which in that year managed to produce a return nearing +40% at a time when the SPY ETF fell by almost the same amount!

 

Fig6

 

Fig7

 

Long Volatility Strategies

I recall having a conversation with Nassim Taleb, of Black Swan fame, about his Empirica fund around the time of its launch in the early 2000’s.  He explained that his analysis had shown that volatility was often underpriced due to an under-estimation of tail risk, which the fund would seek to exploit by purchasing cheap out-of-the-money options.  My response was that this struck me as a great idea for an insurance product, but not a hedge fund – his investors, I explained, were going to hate seeing month after month of negative returns and would flee the fund.  By the time the big event occurred there wouldn’t be sufficient AUM remaining to make up the shortfall.  And so it proved.

A similar problem arises from most long-volatility strategies, whether constructed using options, futures or volatility ETFs:  the combination of premium decay and/or negative carry typically produces continuing losses that are very difficult for the investor to endure.

Conclusion

What investors have been seeking is a strategy that can yield positive returns during normal market conditions while at the same time offering protection against the kind of market gyrations that typically decimate several years of returns from investment portfolios, such as we saw after the market crashes in 2000 and 2008.  With the new breed of long-only strategies now being developed using machine learning algorithms, it appears that investors finally have an opportunity to get what they always wanted, at a reasonable price.

And just in time, if the prognostications of the doom-mongers turn out to be correct.

Contact Hull Tactical

Contact Systematic Strategies

Crash-Proof Investing

As markets continue to make new highs against a backdrop of ever diminishing participation and trading volume, investors have legitimate reasons for being concerned about prospects for the remainder of 2016 and beyond, even without consideration of the myriad of economic and geopolitical risks that now confront the US and global economies. Against that backdrop, remaining fully invested is a test of nerves for those whose instinct is that they may be picking up pennies in front of an oncoming steamroller.  On the other hand, there is a sense of frustration in cashing out, only to watch markets surge another several hundred points to new highs.

In this article I am going to outline some steps investors can take to adjust their investment portfolios to suit current market conditions in a way that allows them to remain fully invested, while safeguarding against downside risk.  In what follows I will be using our own Strategic Volatility Strategy, which invests in volatility ETFs such as the iPath S&P 500 VIX ST Futures ETN (NYSEArca:VXX) and the VelocityShares Daily Inverse VIX ST ETN (NYSEArca:XIV), as an illustrative example, although the principles are no less valid for portfolios comprising other ETFs or equities.


Risk and Volatility

Risk may be defined as the uncertainty of outcome and the most common way of assessing it in the context of investment theory is by means of the standard deviation of returns.  One difficulty here is that one may never ascertain the true rate of volatility – the second moment – of a returns process; one can only estimate it.  Hence, while one can be certain what the closing price of a stock was at yesterday’s market close, one cannot say what the volatility of the stock was over the preceding week – it cannot be observed the way that a stock price can, only estimated.  The most common estimator of asset volatility is, of course, the sample standard deviation.  But there are many others that are arguably superior:  Log-Range, Parkinson, Garman-Klass to name but a few (a starting point for those interested in such theoretical matters is a research paper entitled Estimating Historical Volatility, Brandt & Kinlay, 2005).
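For reference, the two range-based estimators mentioned can be implemented in a few lines each; a sketch assuming daily OHLC data and 252 trading days per year:

```python
import numpy as np
import pandas as pd

def parkinson_vol(ohlc: pd.DataFrame, periods_per_year: int = 252) -> float:
    """Parkinson (1980) range-based volatility estimator, annualized."""
    hl = np.log(ohlc["High"] / ohlc["Low"])
    var = (hl ** 2).mean() / (4.0 * np.log(2.0))
    return float(np.sqrt(var * periods_per_year))

def garman_klass_vol(ohlc: pd.DataFrame, periods_per_year: int = 252) -> float:
    """Garman-Klass (1980) estimator using open, high, low and close, annualized."""
    hl = np.log(ohlc["High"] / ohlc["Low"])
    co = np.log(ohlc["Close"] / ohlc["Open"])
    var = (0.5 * hl ** 2 - (2.0 * np.log(2.0) - 1.0) * co ** 2).mean()
    return float(np.sqrt(var * periods_per_year))
```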

Leaving questions of estimation to one side, one issue with using standard deviation as a measure of risk is that it treats upside and downside risk equally – the “risk” that you might double your money in an investment is regarded no differently than the risk that you might see your investment capital cut in half.  This is not, of course, how investors tend to look at things: they typically allocate a far higher cost to downside risk, compared to upside risk.

One way to address the issue is by using a measure of risk known as the semi-deviation.  This is estimated in exactly the same way as the standard deviation, except that it is applied only to negative returns.  In other words, it seeks to isolate the downside risk alone.

This leads directly to a measure of performance known as the Sortino Ratio.  Like the more traditional Sharpe Ratio, the Sortino Ratio is a measure of risk-adjusted performance – the average return produced by an investment per unit of risk.  But, whereas the Sharpe Ratio uses the standard deviation as the measure of risk, for the Sortino Ratio we use the semi-deviation. In other words, we are measuring the expected return per unit of downside risk.

There may be a great deal of variation in the upside returns of a strategy that would penalize the risk-adjusted returns, as measured by its Sharpe Ratio. But using the Sortino Ratio, we ignore the upside volatility entirely and focus exclusively on the volatility of negative returns (technically, the returns falling below a given threshold, such as the risk-free rate; here we are using zero as our benchmark).  This is, arguably, closer to the way most investors tend to think about their investment risk and return preferences.
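A minimal sketch of the calculation, using zero as the threshold as above. Note that there are several conventions for the downside deviation; this one averages the squared below-threshold shortfalls over the full sample.

```python
import numpy as np

def sortino_ratio(returns: np.ndarray, threshold: float = 0.0,
                  periods_per_year: int = 252) -> float:
    """Annualized Sortino ratio: mean excess return over the threshold,
    divided by the semi-deviation (downside deviation below the threshold)."""
    excess = returns - threshold
    downside = np.minimum(excess, 0.0)
    semi_dev = np.sqrt(np.mean(downside ** 2))   # semi-deviation
    if semi_dev == 0:
        return float("inf")
    return float(excess.mean() / semi_dev * np.sqrt(periods_per_year))
```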

In a scenario where, as an investor, you are particularly concerned about downside risk, it makes sense to measure and manage that risk directly.  It follows that, rather than aiming to maximize the Sharpe Ratio of your investment portfolio, you might do better to focus on the Sortino Ratio.

 

Factor Risk and Correlation Risk

Another type of market risk that is often present in an investment portfolio is correlation risk.  This is the risk that your investment portfolio correlates to some other asset or investment index.  Such risks are often occluded – hidden from view – only to emerge when least wanted.  For example, it might be supposed that a “dollar-neutral” portfolio, i.e. a portfolio comprising equity long and short positions of equal dollar value, might be uncorrelated with the broad equity market indices.  It might well be.  On the other hand, the portfolio might become correlated with such indices during times of market turbulence; or it might correlate positively with some sector indices and negatively with others; or with market volatility, as measured by the CBOE VIX index, for instance.

Where such dependencies are included by design, they are not a problem;  but when they are unintended and latent in the investment portfolio, they often create difficulties.  The key here is to test for such dependencies against a variety of risk factors that are likely to be of concern.  These might include currency and interest rate risk factors, for example;  sector indices; or commodity risk factors such as oil or gold (in a situation where, for example, you are investing in a portfolio of mining stocks).  Once an unwanted correlation is identified, the next step is to adjust the portfolio holdings to try to eliminate it.  Typically, this can only be done on average, meaning that, while there is no correlation bias over the long term, there may be periods of positive, negative, or alternating correlation over shorter time horizons.  Either way, it’s important to know.

Using the Strategic Volatility Strategy as an example, we aim to maximize the Sortino Ratio, subject also to maintaining very low levels of correlation to the principal risk factors of concern to us, the S&P 500 and VIX indices. Our aim is to create a portfolio that is broadly impervious to changes in the level of the overall market, or in the level of market volatility.

 

One method of quantifying such dependencies is with linear regression analysis.  By way of illustration, in the table below are shown the results of regressing the daily returns from the Strategic Volatility Strategy against the returns in the VIX and S&P 500 indices.  Both factor coefficients are statistically indistinguishable from zero, i.e. there is no significant (linear) dependency.  However, the constant coefficient, referred to as the strategy alpha, is both positive and statistically significant.  In simple terms, the strategy produces a return that is consistently positive, on average, and which is not dependent on changes in the level of the broad market, or its volatility.  By contrast, for example, a commonplace volatility strategy that entails capturing the VIX futures roll would show a negative correlation to the VIX index and a positive dependency on the S&P500 index.

Regression
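A regression of this kind is easy to reproduce for any strategy; a sketch using statsmodels, with hypothetical return series for the strategy and the two risk factors:

```python
import pandas as pd
import statsmodels.api as sm

def factor_regression(strategy_rets: pd.Series, factor_rets: pd.DataFrame):
    """Regress daily strategy returns on factor returns (e.g. S&P 500 and VIX).

    The pattern described in the text is a statistically significant constant
    (the strategy alpha) with factor betas indistinguishable from zero."""
    X = sm.add_constant(factor_rets)
    return sm.OLS(strategy_rets, X, missing="drop").fit()

# Usage sketch (hypothetical series names):
# results = factor_regression(strat_rets, pd.DataFrame({"SPX": spx_rets, "VIX": vix_rets}))
# print(results.summary())
```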

 

Tail Risk

Ever since the publication of Nassim Taleb’s “The Black Swan”, investors have taken a much greater interest in the risk of extreme events.  If the bursting of the tech bubble in 2000 was not painful enough, investors surely appear to have learned the lesson thoroughly after the financial crisis of 2008.  But even if investors understand the concept, the question remains: what can one do about it?

The place to start is by looking at the fundamental characteristics of the portfolio returns.  Here we are not so much concerned with risk, as measured by the second moment, the standard deviation. Instead, we now want to consider the third and fourth moments of the distribution, the skewness and kurtosis.

Comparing the two distributions below, we can see that the distribution on the left, with negative skew, has nonzero probability associated with events in the extreme left of the distribution, which in this context, we would associate with negative returns.  The distribution on the right, with positive skew, is likewise “heavy-tailed”; but in this case the tail “risk” is associated with large, positive returns.  That’s the kind of risk most investors can live with.

 

skewness

 

Source: Wikipedia

 

 

A more direct measure of tail risk is kurtosis, a measure of “heavy-tailedness” indicating a propensity for extreme events to occur.  Again, the shape of the distribution matters:  a heavy tail in the right hand portion of the distribution is fine;  a heavy tail on the left (indicating the likelihood of large, negative returns) is a no-no.
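Both moments are easy to estimate directly from a returns series; a minimal sketch:

```python
import numpy as np
from scipy.stats import skew, kurtosis

def tail_profile(returns: np.ndarray) -> dict:
    """Third and fourth moments of a returns series.

    Positive skew with a heavy right tail is the 'good' kind of tail risk
    discussed above; negative skew is the kind to avoid."""
    return {
        "skewness": float(skew(returns)),
        "excess_kurtosis": float(kurtosis(returns, fisher=True)),  # Gaussian = 0
    }
```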

Let’s take a look at the distribution of returns for the Strategic Volatility Strategy.  As you can see, the distribution is very positively skewed, with a very heavy right hand tail.  In other words, the strategy has a tendency to produce extremely positive returns. That’s the kind of tail risk investors prefer.

SVS

 

Another way to evaluate tail risk is to examine directly the performance of the strategy during extreme market conditions, when the market makes a major move up or down. Since we are using a volatility strategy as an example, let’s take a look at how it performs on days when the VIX index moves up or down by more than 5%.  As you can see from the chart below, by and large the strategy returns on such days tend to be positive and, furthermore, occasionally the strategy produces exceptionally high returns.

 

Convexity

 

The property of producing higher returns to the upside and lower losses to the downside (or, in this case, a tendency to produce positive returns in major market moves in either direction) is known as positive convexity.
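One simple way to quantify convexity, which is not necessarily the measure used here but illustrates the idea, is to fit strategy returns as a quadratic function of market returns and look at the sign of the squared term:

```python
import numpy as np

def convexity_coefficient(strategy_rets: np.ndarray, market_rets: np.ndarray) -> float:
    """Fit strategy returns as a quadratic function of market returns.

    A positive coefficient on the squared term indicates positive convexity:
    the strategy does relatively better in large market moves in either direction."""
    X = np.column_stack([np.ones_like(market_rets), market_rets, market_rets ** 2])
    coeffs, *_ = np.linalg.lstsq(X, strategy_rets, rcond=None)
    return float(coeffs[2])   # the quadratic (convexity) term
```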

 

Positive convexity, more typically found in fixed income portfolios, is a highly desirable feature, of course.  How can it be achieved?  Those familiar with options will recognize the convexity feature as being similar to the concept of option Gamma and indeed, one way to produce such a payoff is by adding options to the investment mix:  put options to give positive convexity to the downside, call options to provide positive convexity to the upside (or using a combination of both, i.e. a straddle).

 

In this case we achieve positive convexity, not by incorporating options, but through a judicious choice of leveraged ETFs, both equity and volatility, for example, the ProShares UltraPro S&P500 ETF (NYSEArca:UPRO) and the ProShares Ultra VIX Short-Term Futures ETF (NYSEArca:UVXY).

 

Putting It All Together

While we have talked through the various concepts in creating a risk-protected portfolio one-at-a-time, in practice we use nonlinear optimization techniques to construct a portfolio that incorporates all of the desired characteristics simultaneously. This can be a lengthy and tedious procedure, involving lots of trial and error.  And it cannot be emphasized enough how important the choice of the investment universe is from the outset.  In this case, for instance, it would likely be pointless to target an overall positively convex portfolio without including one or more leveraged ETFs in the investment mix.
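To give a flavor of what such an optimization looks like, here is a heavily simplified sketch using scipy: long-only weights are chosen to maximize a Sortino-like ratio subject to a cap on the absolute correlation with each risk factor. The actual procedure we use is more elaborate, so treat this only as an outline of the approach.

```python
import numpy as np
from scipy.optimize import minimize

def build_portfolio(asset_rets: np.ndarray,     # shape (T, N): daily returns of candidate ETFs
                    factor_rets: np.ndarray,    # shape (T, K): e.g. S&P 500 and VIX returns
                    max_abs_corr: float = 0.10) -> np.ndarray:
    """Long-only weights maximizing a Sortino-like ratio, subject to low
    correlation with each risk factor."""
    n_assets = asset_rets.shape[1]

    def neg_sortino(w):
        r = asset_rets @ w
        downside = np.minimum(r, 0.0)
        semi_dev = np.sqrt(np.mean(downside ** 2)) + 1e-12
        return -r.mean() / semi_dev

    def corr_slack(w):
        # Must be >= 0 for every factor: |corr(portfolio, factor)| <= max_abs_corr
        r = asset_rets @ w
        corrs = np.array([np.corrcoef(r, factor_rets[:, k])[0, 1]
                          for k in range(factor_rets.shape[1])])
        return max_abs_corr - np.abs(corrs)

    cons = [{"type": "eq", "fun": lambda w: w.sum() - 1.0},
            {"type": "ineq", "fun": corr_slack}]
    w0 = np.full(n_assets, 1.0 / n_assets)
    result = minimize(neg_sortino, w0, bounds=[(0.0, 1.0)] * n_assets,
                      constraints=cons, method="SLSQP")
    return result.x
```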

Let’s see how it turned out in the case of the Strategic Volatility Strategy.

 

SVS Perf

 

 

Note that, while the portfolio Information Ratio is moderate (just above 3), the Sortino Ratio is consistently very high, averaging in excess of 7.  In large part that is due to the exceptionally low downside risk, which at 1.36% is less than half the standard deviation (which is itself quite low at 3.3%).  It is no surprise that the maximum drawdown over the period from 2012 amounts to less than 1%.

A critic might argue that a CAGR of only 10% is rather modest, especially since market conditions have generally been so benign.  I would answer that criticism in two ways.  Firstly, this is an investment that has the risk characteristics of a low-duration government bond; and yet it produces a yield many times that of a typical bond in the current low interest rate environment.

Secondly, I would point out that these results are based on use of standard 2:1 Reg-T leverage. In practice it is entirely feasible to increase the leverage up to 4:1, which would produce a CAGR of around 20%.  Investors can choose where on the spectrum of risk-return they wish to locate the portfolio and the strategy leverage can be adjusted accordingly.

 

Conclusion

The current investment environment, characterized by low yields and growing downside risk, poses difficult challenges for investors.  A way to address these concerns is to focus on metrics of downside risk in the construction of the investment portfolio, aiming for high Sortino Ratios, low correlation with market risk factors, and positive skewness and convexity in the portfolio returns process.

Such desirable characteristics can be achieved with modern portfolio construction techniques, provided the investment universe is chosen carefully; it need not include anything more exotic than a collection of commonplace ETF products.

More on Strategy Robustness

Commentators have made the point that a high % win rate is not enough.

Yes, you obviously want to pay attention to other performance metrics also, such as profit factor. In fact, there is no reason why you shouldn’t consider an objective function that explicitly combines various desirable performance measures, for example:

net profit * % win rate * profit factor
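Computed from a list of trade P&Ls, that objective looks like this (a purely illustrative sketch):

```python
import numpy as np

def composite_objective(trade_pnl: np.ndarray) -> float:
    """Composite fitness: net profit * % win rate * profit factor."""
    net_profit = trade_pnl.sum()
    win_rate = (trade_pnl > 0).mean()
    gross_profit = trade_pnl[trade_pnl > 0].sum()
    gross_loss = -trade_pnl[trade_pnl < 0].sum()
    profit_factor = gross_profit / gross_loss if gross_loss > 0 else np.inf
    return float(net_profit * win_rate * profit_factor)
```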

Another approach is to build the model using a data set spanning a different period. I did this with WFC using data from 1990, rather than 1970. Not only was the performance from 1990-2014 better, so too was the performance during the OOS period 1970-1989.  Profit factor was 2.49 and %Win rate was 70% across the 44 year period from 1970.  For the period from 1990, the performance metrics increase to 3.04 and 73%, respectively.


So in this case, it appears, a more robust strategy resulted from using less data, rather than more.  At first this appears counterintuitive. But it’s quite possible for a strategy to be over-conditioned on behavior that is no longer relevant to the market today. Eliminating such conditioning can sometimes enable strategies to emerge that have greater longevity.

WFC from 1970-2014 (1990 data)

Performance

Optimizing Strategy Robustness

Below is the equity curve for an equity strategy I developed recently, implemented in WFC.  The results appear outstanding:  no losing years in over 20 years, profit factor of 2.76 and average win rate of 75%.  Out-of-sample results (double blind) for 2013 and 2014:  net returns of 27% and 16% YTD.

WFC from 1993-2014

 

So far so good. However, if we take a step back through the earlier out of sample period, from 1970, the picture is rather less rosy:

 

WFC from 1970-2014

 

Now, at this point, some of you will be saying:  nothing to see here – it’s obviously just curve fitting.  To which I would respond that I have seen successful strategies, including several hedge fund products, with far shorter and less impressive back-tests than the initial 20-year history I showed above.


That said, would you be willing to take the risk of trading a strategy such as this one?  I would not:  at the back of my mind would always be the concern that the market might easily revert to the conditions that applied during the 1970s and 1980’s.  I expect many investors would share that concern.

But to the point of this post:  most strategies are designed around the criterion of maximizing net profit.  Occasionally you might come across someone who has considered risk, perhaps in the form of drawdown, or Sharpe ratio.  But, in general, it’s all about optimizing performance.

Suppose that, instead of maximizing performance, your objective was to maximize the robustness of the strategy.  What criteria would you use?

In my own research, I have used a great many different objective functions, often multi-dimensional.  Correlation to the perfect equity curve, net profit / max drawdown and Sortino ratio are just a few examples.  But if I had to guess, I would say that the criteria that tends to produce the most robust strategies and reliable out of sample performance is the maximization of the win rate, subject to a minimum number of trades.
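To make those objectives concrete, here are minimal sketches of two of them. "Perfect" is interpreted here as a straight-line equity curve, which is one common convention, and the minimum trade count is an arbitrary illustrative value.

```python
import numpy as np

def robustness_objective(trade_pnl: np.ndarray, min_trades: int = 100) -> float:
    """Maximize the % win rate, subject to a minimum number of trades."""
    if len(trade_pnl) < min_trades:
        return -np.inf
    return float((trade_pnl > 0).mean())

def straight_line_fit(equity: np.ndarray) -> float:
    """Correlation of the equity curve to a 'perfect' (straight-line) equity curve,
    another of the objective functions mentioned above."""
    ideal = np.linspace(equity[0], equity[-1], len(equity))
    return float(np.corrcoef(equity, ideal)[0, 1])
```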

I am not aware of a great deal of theory on this topic. I would be interested to learn of other readers’ experience.

 

How to Spot a Fake

One of the issues that comes up regularly is how, as an investor or other interested party, one can protect oneself from unscrupulous scam artists posing as professional traders or money managers. This is a particular problem on web sites featuring trader forums, where individuals with unverified track records claiming stellar trading histories use their purported trading “prowess” to try to impress and intimidate other participants, usually impressionable newbies. The purpose of this post is to provide some guidance to help investors, traders and other fellow travelers sort the wheat from the chaff. We’ll be doing some forensic analysis on the track record for a strategy in NG futures that one such character recently posted in one of these forums, as a classic example of the kind of fakery I am describing.

One thing you should understand about scam artists operating on forums, is that they don’t work alone: usually they have a bunch of groupies who will shill for them at every opportunity and who will try to shout down any investigative questioning. Don’t be deterred. These know-it-alls are usually just ignorant dupes, who understand no more about trading than the scam artist. They may just as easily be fellow-scam artists themselves.

THE FIRST BIG RED FLAG: UNWILLINGNESS TO PRODUCE A TRACK RECORD
Anyone claiming to be a CTA or professional money manager (or whose shills claim he is one) has to have a track record that is freely available in the public domain. So how does a scam artist overcome a challenge to produce it? He will claim that he “can’t advertise”, or make some other, similar excuse. Don’t accept that at face value. Ask him to PM it to you. If he won’t, there’s already a high probability he’s a con artist.

THE SECOND BIG RED FLAG: CURVE FITTING
Let’s say our suspect meets the challenge and produces a track record. Ideally this will be an audited P&L statement, but let’s assume for the purposes of this discussion that he produces something along the lines of the Performance Reports produced by a product like Tradestation or MultiCharts, i.e. we are dealing with a simulated back-test.

If your suspect produces a back-test, you can be pretty sure it’s going to look good – otherwise he wouldn’t produce it. The task now is to dig into those reports to spot the red flags that give clues as to whether it might be fake.
Now of course any trading system is going to make assumptions – about fill rates, slippage, commissions, capacity etc. All that is fine, as long as the assumptions are clearly stated. You might want to challenge any or all of the assumptions, and the trader may disagree with you about some or all of them. That’s perfectly ok – it’s an honest, open discussion about a set of investment assumptions that have been revealed at the outset.

But here is what is NOT ok: any opacity about which data was used to build the trading model and which data was used to test it. The former, the in-sample (IS) data set, used to construct the model, must be entirely separate and distinct from the out-of-sample (OOS) data set. It is trivially easy using a tool like Tradestation to produce a trading system that shows stellar results in-sample, but which will immediately crash and burn when it is used in live trading. This is known as curve-fitting. And it’s by far the most common method by which scam artists try to dupe investors.

In order to demonstrate the robustness of the system prior to risking real money, a genuine trader will test his system OOS and show you the results. What you are looking for ideally is congruity between the IS and OOS results. Now by congruity, I don’t mean that they should be identical. Far from it – markets evolve and strategy performance will vary over time. But what you are hoping is that the key performance metrics in the OOS and IS periods, such as annual returns, Sharpe ratio, PNL per contract, profit ratio and win rate, will be comparable. At the very least, you would like to be able to identify some portion of the IS data set for which the strategy performance characteristics are similar to those in the OOS period.

Any – I mean ANY – ambiguity or lack of clarity about which data was used to build the model and which was used for OOS testing is a HUGE red flag. Chances are, your scam artist is already trying to fudge the issue that he curve-fitted the system.
This was the case in the recent forum post we are using as a test case. The trader made no attempt whatsoever to clarify which data was used for model development and which for testing. Immediately, I was suspicious and began looking for other evidence of curve fitting. It didn’t take me long to find it.

THE THIRD BIG RED FLAG: THE EQUITY CURVE
The first item I turned to in the performance reports was the equity curve and I immediately spotted two rather large clues that I was dealing with a fake.

The first clue was the large sign on the chart labelled “live start date”. What does this mean? This is a back-test, so all of the results are theoretical, including those after the supposed “live start date” sometime in 2013. What the faker is trying to do is imply that the part of the equity curve shown after that date indicates actual performance results. He doesn’t actually claim this, so he has plausible deniability if you call him on it (“I said it was just a back test”). But he hopes that you won’t, and that, by default, you’ll accept these results are real. But they aren’t.

The second clue of fakery is much more important: the equity curve itself. When someone shows you an equity curve like the one reported by this trader, rising in a straight line from the lower left to upper right quadrants, you can be 99% confident that you are dealing with a fake.
You see, in finance there are almost never any straight lines. They are as rare as unicorns. Especially when it comes to strategy performance. The only time you will EVER see an equity curve like this is when you are looking at the equity curve of (i) a high frequency market making trading system or (ii) a fake, produced by curve fitting a strategy to the ENTIRE data set.
And this strategy was not high frequency – as we shall see, it operated on 15 minute bars, holding positions overnight.

EC Chart

THE FOURTH BIG RED FLAG: GOD’s EQUITY CURVE
I said that straight line equity curves were extremely rare. In fact, even God’s equity curve isn’t often a straight line. What does that mean?

Suppose you had a strategy that could predict with 100% accuracy whether the market would go up or down over the next bar (whether you are using daily bars, or 15 minute bars, as in our example). The system would buy (or hold) when the market was forecast to rise, and sell when the market was predicted to fall. What would the performance of such a perfect system look like? Pretty stellar, obviously. And most people would guess that the system’s equity curve would be a straight line, or maybe even exponential in shape. In fact that’s typically not the case. God’s equity curve will be sloped and kinked, just like any other equity curve. And if your suspect’s equity curve is real, it should show some commonality with God’s equity curve, by which I mean it should show changes in slope and level that reflect those seen in the perfect equity curve.
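Constructing God's Equity Curve for any market takes only a few lines. The sketch below assumes "sell" means reversing to a short position on down bars (taking it to mean going flat on down bars instead is an equally valid reading).

```python
import numpy as np

def gods_equity_curve(close: np.ndarray) -> np.ndarray:
    """Equity curve of a perfect-foresight strategy: long one unit into every
    up bar, short one unit into every down bar, so every bar's move is
    captured as a gain."""
    bar_moves = np.diff(close)
    per_bar_pnl = np.abs(bar_moves)      # the direction is always called correctly
    return np.concatenate([[0.0], np.cumsum(per_bar_pnl)])
```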

What does God’s Equity Curve look like in NG futures?

Gods EC

As you can see it’s not straight. In fact it’s concave. So a REAL equity curve should have similar characteristics, like this one, for example:

NG EC

As you can see, the equity curve of the real trading system tracks God’s Equity Curve, albeit at a much lower level. It’s concave, with an upswing during the final few months of trading, just like God’s. That’s a good sign that the strategy back-test is very likely genuine (which it is – I produced it).

Why is God’s Equity Curve the shape it is? The answer will vary from market to market. In the case of NG, the suggestion is that the market is becoming more efficient: simple trading strategies based on technical indicators work less well than they did five years ago. We have seen something very similar in F/X markets. During the 1970’s and 1980’s when Soros was active in the field, simple strategies like moving average crossovers made great returns, but these entirely dissipated in the 1990’s, with the advent of widely available computing power.

THE FIFTH BIG RED FLAG: THE SHILL SHOUTDOWN
When I posted my analysis, which clearly indicated fakery by this well known forum participant, I was immediately flamed by one of his supporters who shouted something to the effect that (i) everyone knows that the downward slope of God’s Equity Curve was caused by volatility and (ii) the star trader, unlike God, or me, knows about position sizing.

This attempt at misdirection in the face of awkward facts is a classic sign of fakery. What distinguishes the shill post is:

(i) Immediacy – clearly no attempt has been made to evaluate the argument or analysis. The shill simply attempts to drown out the critic with a lot of noise, as quickly as possible.

(ii) Plausibility – shills will throw around terms that lend plausibility to their objection, but which after a moment’s reflection are entirely irrelevant or, as in this case, detrimental to their own cause.

(iii) Invective – the more intemperate the post, the more likely the shill is simply trying to provide cover for the faker.

So let’s take a moment to dispose of the plausible sounding objections posted by the shill in this example.
I am going to take it as read that everyone understands that trading profitability is positively correlated with volatility. There is a huge amount of empirical research supporting that finding, but to keep it simple we can appeal to one of the cornerstones of modern finance: risk and return. The higher the volatility, i.e. the greater the risk, the greater the return traders and investors in the markets will require on their capital. This is a principle of modern financial theory that even a graduate of the Scranton college of fine art should be expected to appreciate.

So what’s the story with NG volatility? You can see the time series of NG volatility in the chart below. One feature stands out above all others: the upward slope of the curve. NG volatility has RISEN over the sample period from 2008 to 2014. Consequently, returns from trading NG futures should also have RISEN rather than fallen. One thing we can say for sure, whatever caused the concave shape in God’s Equity Curve in NG futures, it was NOT volatility!

NG Volatility

Turning to the shill’s next, plausible sounding, but dubious “explanation”, position sizing: this really is completely irrelevant. Because, as we shall see from an examination of the performance report, the track record was created by trading a constant one-lot! So this was just an attempt to sound “sophisticated” by someone trying to misdirect the reader away from the increasingly obvious evidence of fakery.

THE SIXTH BIG RED FLAG: LOW DRAWDOWNS AND OVERNIGHT GAP RISK
One of the highly unusual features of our faker’s equity curve is it’s exceptional smoothness. Low volatility in the equity curve is, in and of itself, an indicator the track record results from curve fitting. But we can get even more insight by digging into the performance report, shown below.

Perf 1
Perf 2

As you can see from the second page of the report, the strategy holds positions for an average of 57 15-minute bars, equivalent to slightly over 14 hours. So this is a low frequency strategy that takes overnight risk. Now, as any trader will know, overnight gap risk in a product like NG can be very significant and is likely to produce much larger drawdowns over a 5 year period than the $8,470 reported here.

The only other possible explanation is that the strategy is traded continuously through both day and night sessions. But this is not only itself improbable, it gives rise to another implausibility: liquidity in the overnight session is so poor that the strategy is unlikely to be able to trade more than 1-2 contracts, at most. This would be of little value to a CTA, or its customers, whatever the star trader’s protestations that his “clients are happy”.

There is no plausible way to resolve the disconnection between the low drawdown, overnight gap risk and market illiquidity. The most plausible explanation: the back-test is a curve fitting exercise.

THE SEVENTH AND FINAL BIG RED FLAG: INCONSISTENCY BETWEEN PERFORMANCE METRICS
As any experienced strategy developer knows, you can get some of the things you want, but you can never achieve all of them. Amongst the desirable features to be maximized are
• Profit factor
• Average PNL per contract
• Percentage win rate

There is a trade-off between the features. A high PNL per contract typically means you are trading less frequently, with longer hold periods, and consequently the percentage win rate tends to be lower. Alternatively, you can increase the win rate, at the cost of lowering the average PNL per contract and/or the profit factor. And so on.

This strategy purports to have it all: a high average PNL per contract resulting from low frequency trading, coupled with a percentage win rate of over 50% and a healthy profit factor. A win rate of much over 40% is highly unusual for a momentum strategy entering and exiting with market or stop orders – and it’s almost inconceivable for a strategy with a PNL per contract and profit factor as large as suggested here.

CONCLUSION
This back-test fails the sniff test on so many levels, I would rate the chance of it being real as less than 1 in 1000.
The final, conclusive proof of fakery is that the “star trader” responsible for producing the report was unable and/or unwilling to attempt to answer even a single one of the criticisms.

So, be warned. If you see forum members banding about track records like this one, you can be sure that they and their strategies are likely to be fake, and not to be trusted.