Wednesday, January 11, 2012

Model Building / Data Mining

Yesterday, I outlined the findings of a model that allocates to the S&P 500 when the VIX is below 20 and to cash when it is above 20. This post expands on that one to build a model that outperforms the S&P 500, with less volatility, over the 1993-2011 time frame. The post is less about how great the model is (that is to be determined) than about how easy it is to use simple data mining techniques to build models that look GREAT on historical data (i.e. buyer beware of all these new funds / models coming out).

The Model

When analyzing the original model, we saw that the S&P 500 actually performed quite well on average (albeit with huge swings at times) at both low (VIX below 17.5) and high (VIX above 25) levels. Thus, a simple model would allocate to stocks when the VIX is below 17.5 or above 25 and to cash when it is not. But let's see if we can data mine (er, improve on) that further. When stocks do poorly, bonds (especially government bonds) tend to do very well (i.e. they become negatively correlated). Using Fidelity's Government Income Fund 'FGOVX' (I am not vouching for this fund; it was simply the first I found with daily returns going back to 1993), we compare the returns of government bonds within each volatility "bucket" to what we found for the S&P 500.

As can be seen above, in each of the 17.5 to 25 VIX "buckets", government bonds outperformed (on average). This is just what we were looking for.
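For readers who want to reproduce the bucket comparison, here is a minimal Python sketch of the idea. It is not the spreadsheet behind the chart: the function names, the three bucket labels, and the choice to put a VIX of exactly 17.5 or 25 in the middle bucket are all my assumptions, and the numbers in the usage lines are made up for illustration. It also assumes each VIX reading is one you would have known before the return it is paired with (for example, the prior day's close).

```python
from collections import defaultdict


def bucket_label(vix, low=17.5, high=25.0):
    """Map a VIX level to 'low', 'mid', or 'high' using the two thresholds.

    Assumption: readings exactly at a threshold fall in the 'mid' bucket.
    """
    if vix < low:
        return "low"
    if vix > high:
        return "high"
    return "mid"


def mean_return_by_bucket(vix_levels, stock_returns, bond_returns):
    """Average stock and bond return per VIX bucket.

    Returns a dict: bucket -> (mean stock return, mean bond return).
    """
    # Each accumulator holds [sum of stock returns, sum of bond returns, count].
    sums = defaultdict(lambda: [0.0, 0.0, 0])
    for vix, stock, bond in zip(vix_levels, stock_returns, bond_returns):
        acc = sums[bucket_label(vix)]
        acc[0] += stock
        acc[1] += bond
        acc[2] += 1
    return {label: (acc[0] / acc[2], acc[1] / acc[2]) for label, acc in sums.items()}


# Made-up inputs, purely to show the call shape (not real market data).
example = mean_return_by_bucket(
    [15.0, 20.0, 30.0, 22.0],
    [0.01, -0.02, 0.03, 0.00],
    [0.00, 0.01, -0.01, 0.03],
)
```

Finer buckets (as in the chart) would only require replacing bucket_label with a function that returns more labels.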

To the model's results... With 'daily rebalancing' we allocate on a daily basis to the S&P 500 (ETF SPY) when the VIX is below 17.5 or above 25 and to government bonds (fund FGOVX) when it is not ('monthly' simply applies the same rule on a monthly basis). Relative to SPY's 7.7% annualized return since 1993, this rotation strategy earned excess returns of almost 3% annualized with daily rebalancing and 1.7% annualized with monthly rebalancing (excluding transaction costs), with volatility reduced by around 3% over that time frame.
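The rotation rule itself is only a few lines of code. The sketch below is one way it could be written, not the exact calculation behind the figures above. The function names are hypothetical, 252 trading days per year is my assumption, transaction costs are ignored, and the signal is assumed to be a VIX reading known before each return period.

```python
import math


def rotation_returns(vix_signal, spy_returns, bond_returns, low=17.5, high=25.0):
    """Per-period strategy returns: SPY when VIX < low or VIX > high, else bonds."""
    strategy = []
    for vix, spy, bond in zip(vix_signal, spy_returns, bond_returns):
        in_stocks = vix < low or vix > high
        strategy.append(spy if in_stocks else bond)
    return strategy


def annualized_return(returns, periods_per_year=252):
    """Geometric annualized return from simple per-period returns."""
    growth = 1.0
    for r in returns:
        growth *= 1.0 + r
    return growth ** (periods_per_year / len(returns)) - 1.0


def annualized_volatility(returns, periods_per_year=252):
    """Sample standard deviation of per-period returns, scaled to a year."""
    mean = sum(returns) / len(returns)
    variance = sum((r - mean) ** 2 for r in returns) / (len(returns) - 1)
    return math.sqrt(variance * periods_per_year)


# Made-up inputs, purely to show the call shape (not real market data).
daily = rotation_returns(
    [15.0, 20.0, 30.0, 22.0],
    [0.01, -0.02, 0.03, 0.00],
    [0.00, 0.01, -0.01, 0.03],
)
```

For the monthly version, the same functions could be fed month-end VIX readings and monthly returns, with periods_per_year=12.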

Model Results (January 1993 - December 2011)

So is this legit model building or data mining? I believe it may be both (if it is a potentially legit model, it needs further testing across markets and time frames), but in no way am I confident this performance can be replicated going forward.