"It's tough to make predictions, especially about the future"

12 Aug 2011 by Jim Fickett.

One frequently reads that such and such conditions were always, or almost always, followed by recessions in the past. One difficulty with such rules is that they usually come out of studies based on revised data, available only long after those past recessions. When one tries to apply the rule in the present, however, only preliminary data, perhaps much less reliable, is available.

Many people are predicting a recession. In fact, it is so fashionable to predict a recession right now that people are tripping over each other trying to find new evidence, so they can get credit for adding their very own prediction to the pile.

It would be very nice, of course, to have the ability to predict the starts and ends of recessions. The Fed would love to time their interventions better. The rest of us, who, unlike the Fed, cannot manipulate the markets, would be able to invest more successfully.

In checking up on the (lack of) evidence behind various predictions, I finally came across some serious studies. One of these is a very nice overview of the economics literature in this area by James Hamilton, known to many of us as one of the authors of Econbrowser. The paper is called Calling recessions in real time. Now calling in real time is a worthwhile goal, but it is not the same as predicting. Hamilton does discuss prediction, but he is skeptical that real prediction is possible. In fact he concludes that even real-time calling may be too hard: the algorithm he uses to make a regularly updated call on his web site uses all data up to and including the current quarter in order to decide whether the previous quarter was or was not part of a recession.

Hamilton has been thinking about such things for at least two decades and, in addition to the formal arguments, one gets a strong sense reading the paper of an old hand who has seen many enthusiastic rookies come and go, and has seen many promising projects turn out less than successfully. He gives examples of predictive algorithms that showed remarkable accuracy when tested on historical data, were thought to be very promising, and then failed badly when their authors made public predictions.

On a technical level, one of the main things that goes wrong with prediction algorithms in economics is that the data now in historical data tables, which most people use in trying to discern important patterns and design algorithms, is not the data that was available at the time one would have made a prediction. In other words, data gets revised, often heavily, and it is much easier to make simulated, retrospective predictions using these corrected data, than it would have been to make predictions using the preliminary data that was actually available at the time.

The Philadelphia Fed has a Real-time data research center with an archive of what the data actually looked like for each quarter in the past, allowing those who are serious about really understanding the forecasting problem to test their methods in a realistic way. In their introduction to the data, the Philadelphia Fed notes, to take one example, “The most well-known study that compared results based on real-time data with later data was Diebold and Rudebusch (1991), who showed that the index of leading indicators does a much worse job of predicting future movements of output in real time than it does after the data are revised.”

Hamilton gives a good example related to the recession prediction problem. In Q1 of 2002, when the 2001 recession was already over, the GDP data showed annualized growth in the four quarters of 2001 of 1.3%, 0.3%, -1.4% and 0.3%. In other words, the data as available in Q1 of 2002 showed that growth had been positive in three of the four quarters of 2001. By the fourth quarter of 2002, on the other hand, the GDP data had been revised to show that growth had been negative in most of 2001. Thus even correctly interpreting history (let alone predicting it) had to wait some time after the fact.
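To make the revision problem concrete, here is a toy sketch in Python. It applies the naive "two consecutive quarters of negative growth" rule to the 2001 figures as they appeared in Q1 of 2002. The "revised" series is invented for illustration; the actual revisions simply turned most of 2001 negative.

```python
def naive_recession_call(growth):
    """True if any two consecutive quarters show negative growth."""
    return any(a < 0 and b < 0 for a, b in zip(growth, growth[1:]))

# Annualized quarterly growth for 2001 as reported in Q1 2002 (Hamilton's example).
vintage_2002q1 = [1.3, 0.3, -1.4, 0.3]
# Hypothetical later revision, negative in most of 2001.
revised_later = [-0.5, 0.6, -1.1, -0.3]

print(naive_recession_call(vintage_2002q1))  # False: the rule misses the recession
print(naive_recession_call(revised_later))   # True: visible only after revision
```

The same rule, applied to two vintages of the same year, gives opposite answers; a backtest run on today's tables would credit the rule with a catch it could never have made in real time.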

What led me (by a circuitous route) to Hamilton's paper was this statement in a Bloomberg article:

Growth in the second quarter slowed to a pace that has typically been followed by a contraction within a year. …

Gross domestic product, adjusted for inflation, cooled to a 1.6 percent rate in the second quarter from a year earlier. About 70 percent of the time when the pace has fallen below 2 percent, a slump has followed within a year, according to data since World War II in an April study by Jeremy Nalewaik, a Fed board staff economist.

The study in question is Forecasting Recessions Using Stall Speeds. This is quite an interesting paper, perhaps offering one way around the difficulty of prediction. Nalewaik proposes that in fact economic data separate quite naturally into three groups. Rather than just expansion or contraction, these are expansion, pre-contraction stall, and contraction. He then attempts to call pre-contraction stall periods in real time, thus giving some advance warning of recessions. I will write more about this paper in another post, but now want just to emphasize the point about the difficulty of making real-time predictions with preliminary data.

Here is Nalewaik's graph, using present-day data, illustrating the point that the Bloomberg piece was trying to make:


This shows the year-over-year growth in GDP, with shaded bars indicating recessions and a horizontal line at the 2% threshold. One can see that, most of the time, an annual growth rate of less than 2% occurs either in or very near a recession. Note, however, that in many periods well away from recessions growth is not far above 2%, and the preliminary data may have been even less clear-cut. As for the present, the current estimate of GDP growth from Q2 of 2010 to Q2 of 2011 is 1.6%, but this could easily be revised.
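As a minimal sketch of how the 2% screen works: compute year-over-year growth from quarterly GDP levels and flag readings below the threshold. The GDP levels below are made up for illustration; only the 2% cutoff comes from the study.

```python
def yoy_growth(levels):
    """Year-over-year percent growth from quarterly GDP levels (4-quarter lag)."""
    return [(levels[i] / levels[i - 4] - 1) * 100 for i in range(4, len(levels))]

def below_stall_speed(levels, threshold=2.0):
    """Flag each quarter whose year-over-year growth is below the threshold."""
    return [g < threshold for g in yoy_growth(levels)]

gdp = [100.0, 100.8, 101.5, 102.1, 102.5, 103.0, 103.2, 103.4]  # hypothetical levels

print([round(g, 1) for g in yoy_growth(gdp)])  # [2.5, 2.2, 1.7, 1.3]
print(below_stall_speed(gdp))                  # [False, False, True, True]
```

Note that this screen, like the rules above, is only as good as the vintage of the levels fed into it; run on preliminary estimates, the last two flags could easily flip after revision.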

Nalewaik is very much aware of the difference between retrospective historical studies and real-time prediction. Unlike many of the reporters who try to base a strong prediction on his work, he is careful to point out that the algorithms he develops would have sometimes made confusing or incorrect predictions in the past, using the data available at the time. One algorithm he presents uses data on GDI, unemployment, the yield curve, and housing starts, and is probably the best attempt so far to make a call on whether we are in a pre-recession stall period. Here is a graph showing, at each point in time, a probability, calculated using data available at the time, of being in either a stall period or a recession period. As with Hamilton's method, this algorithm uses all data through a given quarter to make a call about the previous quarter.

The dashed line includes data from the yield curve while the solid line excludes it. In either case one gets fairly serious false positives in 1993 and 2003. The point is that even this algorithm, much more sophisticated than the simple rule expounded by Bloomberg, makes mistakes when it is restricted to contemporaneous data.
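For readers curious how such a probability series is produced at all, here is a toy three-regime filter in the spirit of a Markov-switching model. To be clear, this is not Nalewaik's model: his uses several indicators and estimated parameters, whereas every regime mean, standard deviation, and transition probability below is invented for illustration.

```python
import math

REGIMES = ["expansion", "stall", "contraction"]

# Hypothetical transition matrix: stalls tend to precede contractions.
TRANS = [[0.95, 0.04, 0.01],
         [0.10, 0.70, 0.20],
         [0.10, 0.10, 0.80]]

# Hypothetical mean and std dev of quarterly growth in each regime.
MEAN = [3.0, 1.0, -1.5]
SD = [1.5, 1.0, 1.5]

def normal_pdf(x, mu, sd):
    """Gaussian density, used as the likelihood of observed growth in a regime."""
    return math.exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))

def filter_regimes(growth, prior=(0.8, 0.1, 0.1)):
    """P(regime | data so far) after each quarter, using only past and current data."""
    probs = list(prior)
    history = []
    for g in growth:
        # Predict: push last quarter's probabilities through the transition matrix.
        pred = [sum(probs[i] * TRANS[i][j] for i in range(3)) for j in range(3)]
        # Update: reweight by the likelihood of the observed growth, then normalize.
        upd = [pred[j] * normal_pdf(g, MEAN[j], SD[j]) for j in range(3)]
        total = sum(upd)
        probs = [u / total for u in upd]
        history.append(dict(zip(REGIMES, probs)))
    return history

# Growth slowing toward zero: the filter shifts weight from expansion to stall
# and then to contraction, using only data available through each quarter.
for quarter in filter_regimes([3.2, 2.8, 1.2, 0.8, -0.5]):
    print({k: round(v, 2) for k, v in quarter.items()})
```

The key property, shared with the real algorithms discussed above, is that the probability at each quarter is computed only from data through that quarter; the false positives in 1993 and 2003 are the price of that discipline.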


The quote in the title is usually attributed to Yogi Berra (perhaps incorrectly). A good one, anyway, whoever said it.