Because machine learning has succeeded at so many tasks, it may seem a foregone conclusion that it will also dominate financial tasks like stock picking. But this conclusion is by no means obvious and does not appear to be supported by research, at least not yet.
What makes finance different? We highlight four features that set it apart from the domains where machine learning has made significant advances.
Low Signal-to-Noise Ratios
Perhaps the most important difference is the signal-to-noise ratio, which summarizes how much predictability exists within a system. Take, for example, cat image recognition. A person looking at a thousand photos will correctly identify those that contain cats with a success rate of almost 100%. That indicates that this setting is a high signal-to-noise environment. The signal (the cat image) dominates sources of noise in the photo (blur, background images, and so forth). Machine learning thrives in such environments.
Contrast this with financial markets, which are extremely noisy. The best stock or investment portfolio in the world will, on any given day, quarter, or year, experience wild swings in performance due to unanticipated news. That low signal-to-noise ratio is constantly reinforced by simple economic forces of profit maximization and competition. If traders have information that reliably predicts a future rise in prices (a strong signal), they don’t sit passively on it. They start trading. The very act of exploiting their predictive information pushes up prices and thereby sucks some of the predictability out of the market. With the predictability already priced in, the only things that move markets are unanticipated news and shocks: noise.
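To make the contrast concrete, the following minimal sketch simulates an outcome equal to a known signal plus noise and measures the share of variance the signal explains. The function names and the standard deviations are illustrative assumptions, not estimates from market data.

```python
import random


def population_r2(signal_std, noise_std):
    """Share of outcome variance explained by a perfectly known signal."""
    return signal_std**2 / (signal_std**2 + noise_std**2)


def sample_r2(signal_std, noise_std, n_obs, seed=0):
    """In-sample R^2 when the true signal itself is used as the forecast."""
    rng = random.Random(seed)
    signal = [rng.gauss(0.0, signal_std) for _ in range(n_obs)]
    outcome = [s + rng.gauss(0.0, noise_std) for s in signal]
    mean_outcome = sum(outcome) / n_obs
    residual_ss = sum((y - s) ** 2 for y, s in zip(outcome, signal))
    total_ss = sum((y - mean_outcome) ** 2 for y in outcome)
    return 1.0 - residual_ss / total_ss


# Hypothetical settings: signal dominates noise vs. noise dominates signal.
high_snr = sample_r2(signal_std=1.0, noise_std=0.1, n_obs=5000)
low_snr = sample_r2(signal_std=0.1, noise_std=1.0, n_obs=5000)
print(f"high signal-to-noise R^2: {high_snr:.3f}")
print(f"low signal-to-noise R^2:  {low_snr:.3f}")
```

Under these assumed settings the population values are about 0.99 and 0.01. Even a forecaster who knows the signal exactly explains only around 1% of the variation in the noisy case.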
Evolving Markets
The machine learning challenges posed by low signal-to-noise ratios are compounded by the dynamic character of markets. If a researcher identifies a new signal that captures a particular form of asset mispricing useful for predicting prices, then as the signal becomes more widely known, more traders act on it, and prices correct more quickly. The market eventually absorbs that information, and the data-generating process changes. Likewise, technological innovations can alter the structure of the economy and reshape the way humans interact with markets. While the frontiers of machine learning have developed some tools that may help with such adaptive phenomena, these phenomena highlight that finance is more complex than many other domains of machine learning research (cats don’t begin morphing into dogs once the algorithm becomes good at cat recognition).
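As a toy illustration of a changing data-generating process, the sketch below simulates a signal whose true predictive slope decays as it is traded away, then estimates the slope separately on the early and late halves of the sample. The decay rule, the half-life, and all names are assumptions chosen for illustration, not a model of any actual market.

```python
import random


def ols_slope(x, y):
    """Slope from a univariate least-squares regression of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var = sum((a - mean_x) ** 2 for a in x)
    return cov / var


def simulate_decaying_signal(n_obs=4000, initial_beta=1.0, half_life=500, seed=0):
    """Return (signal, returns); the true slope halves every `half_life` periods."""
    rng = random.Random(seed)
    signal, returns = [], []
    for t in range(n_obs):
        beta_t = initial_beta * 0.5 ** (t / half_life)  # assumed decay rule
        x = rng.gauss(0.0, 1.0)
        signal.append(x)
        returns.append(beta_t * x + rng.gauss(0.0, 1.0))
    return signal, returns


signal, returns = simulate_decaying_signal()
half = len(signal) // 2
early_slope = ols_slope(signal[:half], returns[:half])
late_slope = ols_slope(signal[half:], returns[half:])
print(f"estimated slope, early sample: {early_slope:.3f}")
print(f"estimated slope, late sample:  {late_slope:.3f}")
```

A model fitted on the early sample would overstate how useful the signal is later on, which is the sense in which the target keeps moving.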
Small Data
Another key difference in finance (and economics generally) is that research does not typically take place in a “big data” environment, though big data methods can still be useful. Statistical analysis of finance, like macroeconomics more broadly, is fundamentally a time series discipline. In the example of return prediction, we can always conjure bigger and better sets of predictor data. However, we are limited by the number of observations on the outcome variable we are targeting: stock returns. New data on stock returns are generated only by the passage of time. Without sufficient data, we can’t reliably estimate complicated models.
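One way to see why a short return history limits model complexity is to search many candidate predictors that are pure noise by construction and record the best in-sample fit. The sketch below does this for a short and a long history; the sample sizes and the number of candidates are hypothetical.

```python
import random


def squared_correlation(x, y):
    """Squared sample correlation, i.e. the R^2 of a one-predictor regression."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov**2 / (var_x * var_y)


def best_spurious_r2(n_obs, n_predictors, seed=0):
    """Largest in-sample R^2 among pure-noise predictors of a pure-noise target."""
    rng = random.Random(seed)
    target = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
    best = 0.0
    for _ in range(n_predictors):
        predictor = [rng.gauss(0.0, 1.0) for _ in range(n_obs)]
        best = max(best, squared_correlation(predictor, target))
    return best


# 60 observations is, for example, five years of monthly returns.
short_history = best_spurious_r2(n_obs=60, n_predictors=200)
long_history = best_spurious_r2(n_obs=2000, n_predictors=200)
print(f"best spurious R^2, 60 observations:   {short_history:.3f}")
print(f"best spurious R^2, 2000 observations: {long_history:.3f}")
```

Adding candidate predictors is cheap, but the number of return observations is fixed by the calendar, so with a short history the search tends to find a predictor that fits noise.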
Need for Interpretability
Some machine learning models are proverbial black boxes. Yet the ability to understand the inner workings of one’s model is a useful feature in asset management. Asset managers have the fiduciary duty of understanding and communicating the risks in their clients’ portfolios, which leads them to place special emphasis on model interpretability.
Machine learning need not be an opaque black box. Structural approaches can simultaneously make efficient use of the data to enhance discovery while also providing interpretation and intuition. There are many interesting potential research avenues for drawing more meaningful and intuitive conclusions from financial machine learning models.