Here is the latest and most exhaustive installment in our ongoing attempt to ruin the fun for those who think contrarian factor timing is easy and for those who think current factor valuations are extreme, in spite of all of the evidence to the contrary for both. [Footnote 1: The main example of the other side of the debate is Rob Arnott and co-authors in a series of online white papers, with this being the latest.]
When you write a philippic on something, you generally hope you have put the matter to rest. [Footnote 2: Though, historically, philippics were generally a plural.] But, my philippic was admittedly designed to be lighter on data (not bereft, just lighter) and heavier on argument. Our latest paper provides a lot of both. Still, much of the basic intuition remains the same.
I’d sum up the major points in the debate this way (admittedly much more bluntly than in the new, more academic paper, which covers a subset of the below):
• Initial results (over the last 1-2 years) using book-to-price as a valuation measure to calculate the “value spreads” of factors showed that some well-known factors were pretty expensive and some pretty cheap versus history (though none, despite the occasional rhetoric, even close to the extremes we saw in the Tech Bubble). [Footnote 3: Even before the current debate I was highlighting this spread for the value factor. At that point the value factor looked about historically average, not expensive or cheap. I stressed that it was impressive that the spread was about historically average when the factor in question, value, was well known and perhaps even “crowded” in the sense of attracting a lot of money (if not crowded in the tautological sense of being over-priced). This was an early indication that perhaps being popular and well-known didn’t always have to mean being expensive.] When researchers (not just us at AQR) moved beyond just book-to-price, examining multiple factors on multiple measures of valuation, all the current readings got less extreme (closer to historical norms). It seems the initial book-to-price readings were an outlier.
• An appeal to time factors often includes an appeal to “common sense” and the idea that “price matters” as if anyone is disagreeing. Of course, price matters. Still, for instance, those truths don’t make timing the overall stock market based on historical CAPE an easy exercise. Similarly, they don’t make factor timing based on the value spreads we established in 1999 a walk in the park. Market timing (predicting and trading on these predictions for the overall stock market) is instructive for factor timing, a similar exercise. Adding value from market timing is very hard even though the aggregate market price (e.g., the CAPE) matters and varies a lot over time. In our latest paper, we again show that factor timing is likely even harder than market timing. First, the long-short factors in question have higher turnover than the market, making long-run predictability, the possible savior of market timing, a much dodgier proposition for factor timing. Second, unlike timing the market, timing factors using just valuation must contend with contrarian factor timing being already implicitly (but strongly) present in the value factor itself. Despite the rhetoric calling factor timing simple common sense, it isn’t at all obvious that one should value-time factors, at least not to any significant extent (and if one is saying to do just a tiny bit, then let’s not argue about the insignificant!), and certainly not while they’re currently within historical bounds as they are today. [Footnote 4: In other papers we refer to timing as an investing sin and still recommend (or at least are ok with) sinning a little.] In short, contrarian factor timing is likely harder than timing the stock market and is already being captured (in a more efficient manner) by those already allocating to the value factor, raising the hurdle for contrarian factor timing considerably.
• In multiple online white papers, Arnott and co-authors present evidence in support of contrarian factor timing based on a plethora of mostly inapplicable, exaggerated, and poorly designed tests that also flout research norms. For (non-exhaustive) example, they use long-horizon regressions for factors with too much turnover to make them applicable (among other things, please, please, please stop making 5-year forecasts for momentum!). They also have apples-to-oranges comparisons, with the most egregious being a comparison of contrarian factor timing based on a composite of valuation indicators, using up to date prices, to a simple book-to-price value factor using lagged prices (and declare the correlations lower than reality based on this poor comparison). They show a lot of graphs with end points excitedly marked (e.g., look where valuation got to before the deluge!) that are both too anecdotal and simply repeating that value is a good strategy (something still not in dispute) not that it should dominate the other factors (as it does if you add in too much contrarian timing). Mixing poorly designed research with ever increasing rhetoric, while living in your own universe and not explicitly referencing or dealing with relevant other work (that repeatedly points out much of the above and more), is just not how it’s supposed to be done. It’s ok to disagree and even to be wrong (e.g., looking back at our original paper using data in the Tech Bubble, I think we gave an overly optimistic impression of timing in general as the current conditions got us so worked up! Although, it certainly worked out in that scenario). It’s not ok to vary your techniques solely to achieve a certain outcome and repeatedly not address your critics.
• Yet, with all that said, there is still much to agree on and real value found in some of the messages in the papers by Arnott and co-authors. If you choose your factors based on only past performance, even long-term past performance, that can be a recipe for poor results in a world where data mining is a problem. If you choose your factors based on recent (say the last three to five years) good performance, that’s worse and a recipe for very poor results. These lessons are timeless, and anyone making these points is doing the lord’s investing work. I have always argued that chasing strong recent (again, say, the last five years – see peeve #3) performance is bad and would echo that for factors. Still, I don’t think the opposite, contrarian factor timing, is so good. This seeming contradiction is in fact reconciled by the power of diversification and the unnecessary pain induced by the lack of it. Remember, despite being the timing cynics, we have consistently found mild (again, mild!) positive power to contrarian factor selection. That is expected as, again, value is a good strategy, and this is just an attenuated (not the most efficient) form of value. Mild positive power doesn’t produce a great timing strategy, particularly when value is one of the factors being timed and presumably already present in the portfolio (again, you’re already getting contrarian factor timing from the regular old untimed value factor). But, at least such timing is in the right direction. One who does the opposite, perhaps by chasing five-year performance, however, gives up diversification (not to mention over-trading), while pursuing a mildly negative strategy instead of merely an insufficiently positive one. That’s now a real problem!
• Taking things further, it’s misguided to choose current factor exposures based solely, or even mostly, on price. Yes, price is very important. It’s why the value factor is so important. If that’s the only factor you like and believe in, then more power to you. Those who believe in other factors but only like them when they’re cheap are not really multi-factor investors. Instead, they are value investors who dabble. That’s fine if that’s truly all they believe in, but it is likely deeply suboptimal if they really believe in multiple factors. Those, like Arnott et al., who argue for timing all the factors based on their value spreads are, in my view, implicitly just arguing for the primacy of the value factor. [Footnote 5: Otherwise, you must deal with the fact that when another factor, say low beta, is expensive, it’s near a tautology that the value factor is also particularly high beta. If you care about both factors long term, I’m not sure why one is more a concern.] They are free to make this argument but should do so with more candor, better research techniques and research protocol, and fewer histrionic pronouncements that are marketing not research.
• Here’s one aspect that is indeed “common sense” in my view, and a topic on which I’d guess there is broad agreement. If we see valuations (e.g., factor value spreads) that are epically different than the past, then it’s time to have a new conversation (again, note we are not seeing that today). For example, we saw this in 1999-2000. The value factor (and the low beta factor actually) was super cheap (very wide value spreads that blew away past values). Even then there’s a question of what to do with that information. Timing in real time during the Tech Bubble wasn’t easy, but it was at least interesting. What we all implicitly worry about today is the opposite: factor value spreads that get extremely low, presumably as the result of too many investors trying to exploit them. If that ever happens it won’t be “88th percentile expensive” (i.e., high but not exceptionally so). It will be more like “150% of the prior 100th percentile before this event” expensive. [Footnote 6: Interestingly, what really forced your hand at a time like 1999-2000 was how opposite two factors like value and momentum were. While normally negatively correlated, they were so negatively correlated near the peak of the Tech Bubble you couldn’t plausibly tilt towards both. Thus, you had to choose between them (or do nothing).] Once again, this is not where we are today and wasn’t at any point in the recent debate. Though you’d be forgiven for thinking we were at this point based on some of the breathless warnings.
• Despite our findings that popular style factors are not particularly expensive as a group (and even if they were, historical evidence cautions against aggressively using this as a timing signal), we note that bad things can happen even to not-so-expensive factors. This is not an all-clear sign. In fact, our oft-repeated desire to diversify across good factors goes hand-in-glove with how difficult they can be to predict in the short- to medium-term. Also, of course, we too see the growing investor demand for factors and the abundant supply by providers, and this partly motivated our own renewed research into value spreads (and other crowding indicators) we started a few years ago (again, prior to this debate heating up). We were actually surprised to see results as benign as we did, leading me to write that Arnott and co-authors’ scaremongering was akin to shouting fire in a surprisingly uncrowded factor theater. The cheek with which that was said was intentional, but the (pleasant) surprise was genuine. Factors still might disappoint or even crash in 2017, but it won’t be because of incredibly rich factor spreads. This is both because they aren’t incredibly rich but also, importantly, because factor valuation is not a great predictor of factor crashes. In that same piece, I also mused that short-term crashes (which are different from lower-than-normal long-term performance) in markets or factors are not as linked to valuation as one might guess (this ex ante includes me, as years ago, I’d have guessed that the link was tighter). However, despite the benign prices, I still think that today’s increased popularity of factor investing makes a short, sharp reversal more likely than otherwise. A near necessary condition for such a correction is a commonality in strategy across some class of investors. [Footnote 7: This is fun to think about for a second. These strategies are zero sum so there’s not really a literal sense of “more people” doing it at one time as there’s always the other side of the bet. But, when more people try to do it they can bid up the price (making the value spread richer, again something we don’t see going on in any big way now). Even if they fail to make spreads richer, suddenly there are many people explicitly following the strategy and perhaps trading against another side that is implicit (“implicit” meaning the other side of the factor bet isn’t a systematic decision by a class of investors intentionally short the factor you’re long). Since the other side is implicit there is little risk of it acting in concert, but there is risk that the explicit side does. In fact, when people say “lots of people are long a factor” they, again, generally don’t mean it literally. Instead, they mean either the factor is expensive, or lots of people on one side are doing it explicitly in concert. These events are essentially runs. Strategies unique to just you are effectively immune to broad-based runs (except by fantastic coincidence) while popular strategies are at least somewhat susceptible.]
• All else equal, having access to the factors we discuss, at the historically reasonable factor spreads we document, would in fact be even better if they were known only to us and thus less subject to these runs. But, sadly, we’re not collectively being offered that deal. This is quite common for good factors. For instance, the general equity risk premium is certainly not a secret and is, famously, subject to “runs” or crashes over the very short term. That doesn’t mean you don’t allocate to it. It means you try to be sure you can weather such events and allocate accordingly. Ignoring the possibility of short, sharp reversals and taking so large a bet you won’t survive such an event would obviously be folly. On the other hand, avoiding reasonably priced strategies that we think we understand and are historically efficacious would be a different, but no less real, kind of folly. This is especially true in what’s still, in our view, a low expected return world for many traditional assets.
• Separately from our current paper, I repeat my call for a recant of one specific claim in Arnott, Beck, Kalesnik, and West (2016) (ABKW) that goes somewhat past the disagreements above. It’s their assertion that the main factors being discussed are, to various but serious extents, the result of the research community’s clueless mistaking of long-term factor richening for true “structural alpha.” As I go on about here (going on about something is what you do in a philippic), the observation that value changes can falsely look like repeatable performance is potentially important (to their credit, I think the analysis they carry out, if done the right way, might have a much broader role than it has historically). It has some bite over relatively short periods and at the peak or trough of very extreme events (again, think about the Tech Bubble of 1999-2000). But, ABKW make the clear claim, arguably their central claim, that a vast array of researchers has made this same error over very long periods, for relatively high turnover factors (this matters!), and when the end points were not nearly so extreme. All three of those things argue against their strong assertions about long-term reality. In fact, what’s still really bizarre is this was all in ABKW. They looked at this long-term impact two ways — one a good faith attempt to do it right, and one simply flat out provably wrong. They glossed over their own attempt to do it right and highlighted the flatly silly and extreme results from the provably wrong method. I can’t do this topic justice here (enjoy the links above!), but I renew my request for them to repudiate their extreme claim that many, even most, other researchers have mistaken long-term richening for “structural alpha.”
OK, that’s it, a mini-philippic. Let’s just call it Phil. If you like politer academic discourse, and, you know, some actual data analyzed, I once again direct you to the new paper.