The Beginner’s Guide To Understanding Descriptive and Predictive Stats

by Neil Weinberg
October 27, 2016

Baseball statistics are designed to answer questions. Some questions are simple, such as “who reached base most often in 2016?” while others are more complicated, like “who was the best base runner in the American League?” Statistics allow us to gather up data points from individual events and summarize them in ways that are easy to understand.

Different statistics answer different questions and therefore have different uses. You can’t figure out who the best hitter is solely by looking at his batting average. Batting average tells you something, but batting average itself answers a very specific and limited question. Over the years, we’ve attempted to expand the statistics we use to better capture the game we love. Instead of batting average, we moved to OBP, then OPS, then wOBA, and so on. The key is to decide on your question and then find which available statistic(s) best answers that question.

One issue that comes up regularly is whether a statistic is predictive of the future or merely descriptive. You’ve likely heard that FIP is a better predictor of future ERA than current ERA. For this reason, many people believe that FIP was designed as a predictive statistic. As I discuss here, that is not accurate, but the perception persists because FIP is useful for prediction.

Part of the confusion over FIP specifically is the presentation, but a broader issue is that many people think of stats as either descriptive of past performance or predictive of future performance. The thought is that statistics have to be one or the other, but that perception is also incorrect. The two concepts are related, but not in the way people often believe.

All statistics describe a particular thing. ERA describes the number of earned runs allowed per nine innings. FIP summarizes a player’s strikeouts, walks, HBP, HR, and balls in play.
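To make those two definitions concrete, here is a minimal Python sketch of both calculations. It assumes the commonly published FIP weights (13 for home runs, 3 for walks and hit batters, 2 for strikeouts). The function names, the `fip_constant` default of 3.10, and the sample pitcher line are illustrative assumptions, not values from this article; the real constant changes by season and league.

```python
def era(earned_runs: float, innings_pitched: float) -> float:
    """Earned runs allowed per nine innings."""
    return 9.0 * earned_runs / innings_pitched


def fip(hr: int, bb: int, hbp: int, k: int,
        innings_pitched: float, fip_constant: float = 3.10) -> float:
    """FIP from home runs, walks, hit batters, and strikeouts.

    Assumption: fip_constant is a placeholder. The actual constant is
    set each season so that league-average FIP matches league-average ERA.
    innings_pitched must be a true decimal (200 1/3 innings is 200.333,
    not the box-score shorthand 200.1).
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / innings_pitched + fip_constant


# Hypothetical pitcher line, made up for illustration only.
example_era = era(earned_runs=80, innings_pitched=200.0)
example_fip = fip(hr=20, bb=50, hbp=5, k=200, innings_pitched=200.0)
print(f"ERA: {example_era:.2f}  FIP: {example_fip:.2f}")
```

Note that earned runs never enter the FIP calculation. Both numbers summarize the same innings, but they are built from different events.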
Both stats are intended to tell you how well a pitcher performed over a given period of time. Determining which better answers that question is trickier and depends on what you value in a statistic. There isn’t a “right” answer when it comes to whether ERA or FIP better describes pitcher performance.

However, empirically we can show that looking at a pitcher’s current FIP will allow you to predict their future FIP and ERA better than if you started with their current ERA. For that reason, we know FIP is more predictive of the future than ERA.

In other words, it’s not a matter of descriptive or predictive; it’s a matter of descriptive only or both descriptive and predictive. And predictive is also a continuum rather than an on-off switch. FIP predicts the future pretty well, but cFIP at Baseball Prospectus predicts the future even better.

This isn’t to say that all statistics are equally descriptive of the thing they are trying to measure. OPS attempts to measure overall hitting performance but it’s not as successful as wOBA, for example. There is a difference between what a statistic describes and what it is trying to tell you.

If we use that more rigorous definition of descriptive (i.e. what it’s trying to tell you), RE24 is a great example. RE24 does a good job telling you how much a player influenced the run expectancy in a given season. It is very descriptive, more so than using RBI for the same concept. But RE24 is not very predictive at all because it won’t do a good job predicting future offensive performance. wOBA is very descriptive of the batter’s individual actions, but less so if you’re asking about his influence on actual run scoring. However, wOBA is more predictive of future individual performance and future influence on run scoring than RE24.

Keep in mind, none of these stats are projections. Projections have no descriptive qualities but are designed to predict the future.
So when you see wOBA via Steamer, that is their best estimate of the player’s talent level and not a summary of performance.

Statistics attempt to summarize what happened. Some look back better than others, and some predict the future better than others. It’s important to know the qualities of each statistic before you use it. When looking at numbers, ask yourself 1) whether a stat provides the best available summary of what happened, 2) whether it answers your question most effectively, and 3) how much it tells you about the future.

The belief that a given stat is future-oriented or past-oriented is a false dichotomy. All stats are past-oriented, but some have qualities that make them more useful for prediction.
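One way to put a number on "how much does it tell you about the future" is to correlate a stat from one season with the outcome you care about in the next. The Python sketch below shows that method on synthetic data only. The simulation assumes each pitcher has a fixed talent level and that FIP measures it with less noise than ERA does, so the result is built into the assumptions and is not evidence for the claim. The names `pearson_r` and `simulate_seasons` and every parameter value are hypothetical. With real data you would replace the simulated rows with actual consecutive-season pairs.

```python
import random
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def simulate_seasons(n_pitchers=2000, seed=0):
    """Synthetic two-season sample. All numbers are made-up assumptions."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_pitchers):
        talent = rng.gauss(4.00, 0.50)  # assumed true run-prevention level
        rows.append({
            "era_1": talent + rng.gauss(0, 0.80),  # assumed noisier measure
            "fip_1": talent + rng.gauss(0, 0.30),  # assumed less noisy measure
            "era_2": talent + rng.gauss(0, 0.80),  # next season's ERA
        })
    return rows


rows = simulate_seasons()
era_2 = [row["era_2"] for row in rows]
r_era = pearson_r([row["era_1"] for row in rows], era_2)
r_fip = pearson_r([row["fip_1"] for row in rows], era_2)
print(f"year-1 ERA vs year-2 ERA: r = {r_era:.2f}")
print(f"year-1 FIP vs year-2 ERA: r = {r_fip:.2f}")
```

The same comparison works for any pair of stats, such as wOBA versus RE24: whichever year-one number correlates more strongly with the year-two outcome sits further along the predictive continuum for that question.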