The Beginner’s Guide To Single-Season BABIP

August 10, 2015

Batting Average on Balls in Play (BABIP) is one of the most commonly cited statistics in sabermetric analysis, and its role in mainstream coverage of the sport is growing as well. BABIP is a measure of how often “balls in play,” or non-home run batted balls, fall for hits. It’s an easy statistic to understand, but it’s not always the easiest statistic to use properly.
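To make the definition concrete, here is a minimal sketch of the commonly used BABIP formula, (H - HR) / (AB - K - HR + SF). The function name and the season line passed to it are made up for illustration and do not describe any real player.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting average on balls in play: (H - HR) / (AB - K - HR + SF)."""
    # Balls in play exclude strikeouts and home runs; sacrifice flies are
    # added back because they are batted balls that do not count as at-bats.
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play")
    return (hits - home_runs) / balls_in_play


# Hypothetical season line, invented for illustration only.
example = babip(hits=160, home_runs=25, at_bats=550, strikeouts=120, sac_flies=5)
print(f"{example:.3f}")  # 135 non-HR hits on 410 balls in play -> 0.329
```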

The problem occurs when people focus too heavily on one of the three main drivers of BABIP, which are player quality, defense, and luck. Most of the discussion surrounding BABIP is on the amount of luck that is involved. For some people, BABIP is simply a measure of how lucky or unlucky a player is getting over a period of time. But in reality, that is only part of the equation. Certain hitters consistently produce higher BABIP than others, and the presence of a good defense behind a pitcher can absolutely suppress their BABIP even before we consider the role of luck in the process.

The issue, I would argue, is that we often talk about BABIP-luck as it relates to small samples. If a hitter has a .450 BABIP over 85 PA, that is almost certainly not a reflection of their talent. It’s simply random variation at work. Perhaps they are a very good hitter, but there is too much noise in 85 PA (even fewer of those are balls in play) for that BABIP to be reflective of performance or true talent. It’s not so much that the various tenets of BABIP are hard to understand; it’s that it’s not always clear to everyone how far you can stretch different aspects of them to explain a given situation. BABIP has a luck dimension, but BABIP isn’t only about luck.
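A rough way to see how noisy a small sample is: treat every ball in play as an independent coin flip with a fixed hit probability and look at the binomial standard error. That independence model, the function name, and the figure of 60 balls in play out of 85 PA are simplifying assumptions for illustration, not numbers from any real player.

```python
import math


def babip_standard_error(true_babip, balls_in_play):
    """Standard deviation of observed BABIP under a simple binomial model."""
    return math.sqrt(true_babip * (1 - true_babip) / balls_in_play)


# Assumed: roughly 60 of the 85 PA end with a ball in play.
small_sample = babip_standard_error(0.300, 60)
# Assumed: roughly 400 balls in play over a full season.
full_season = babip_standard_error(0.300, 400)

print(f"60 balls in play:  +/- {small_sample:.3f}")   # about 0.059
print(f"400 balls in play: +/- {full_season:.3f}")    # about 0.023
```

Under these assumptions a single standard deviation is nearly 60 points of BABIP in the small sample, which is why swings well away from .300 can show up without any change in talent.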

Every statistic is a measure of a player’s true performance plus random variation. A true talent .340 wOBA hitter won’t hit .340 wOBA every single season because outcomes over 500 or 600 PA will vary around that true talent mark. The same is true for BABIP, so the most extreme BABIP numbers are a product of the highest (or lowest) BABIP talent levels and random variation. Talent levels are a product of player skill for hitters and player skill plus defense for pitchers.

For example, the highest and lowest active career BABIP for hitters are .358 (Mike Trout) and .253 (Jeff Mathis) using a minimum of 2,000 PA. For pitchers with at least 600 innings pitched, the range is .333 (Manny Parra) to .248 (Chris Young). In general, we can suggest that these marks set up the limits on a player’s BABIP skill. Of course, they aren’t the true limits, but they function as a useful guide.

When it comes to hitters, a .390 or higher BABIP over a season of 500 or more plate appearances is possible, but very rare. There have been 70 such seasons in MLB history and 26 seasons of a BABIP of .400 or more. In other words, if a hitter is producing a .390 BABIP over the first few months of a season, the best bet is that it will not continue, but it is not out of the question that a hitter could maintain that level for 500 or so PA. Keep in mind that, for practical reasons, I’ve only counted the single-season numbers rather than every possible 500 PA stretch of a .390 BABIP. It is possible, but rare.

On the other end, BABIPs of .235 or lower are relatively uncommon over a full season as well. There have been 173 seasons of 500 or more PA and a BABIP of .235 or lower, with about 25 seasons of .215 or below. This is slightly less concrete because while a BABIP ceiling is a reasonable thing, a BABIP floor is conditional on the fact that many people are capable of running lower BABIP than .200, but they don’t get to hang around in the show for that many PA. Generally speaking, if you’re dealing with an MLB hitter, single seasons will fall into the .235 to .390 range. If the hitter is outside those bounds, it is a virtual guarantee that the number will move closer to average going forward. In other words, a .440 BABIP is the product of good fortune no matter how good the hitter is.

For pitchers, a .340 BABIP over 150 innings or more is very rare. There have been just 43 such seasons in MLB history and just nine above .350. If you see a pitcher with a BABIP up in this range, it may continue for a full season, but it is very uncommon. Again, keep in mind that if the pitcher is not an MLB-caliber player, BABIP doesn’t really work. If I were in the show, I would certainly have a BABIP much, much higher than .350.

On the other end, only 69 pitchers have posted a .230 BABIP or lower in 150-plus innings and only 18 have been at .220 or below. This creates a reasonable range between .230 and .350. Anything outside those bounds is almost assuredly going to come back to the pack unless you are dealing with someone who is not at all an MLB-caliber pitcher.
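The single-season ranges described above can be written down as a small lookup, which is handy as a first sanity check on a stat line. This is only a sketch: the cutoffs are the article’s rough guides (500 or more PA for hitters, 150 or more innings for pitchers), and the names `TYPICAL_RANGES` and `describe_babip` are hypothetical.

```python
# Rough single-season guides taken from the text, not hard limits.
TYPICAL_RANGES = {
    "hitter": (0.235, 0.390),   # seasons of 500 or more PA
    "pitcher": (0.230, 0.350),  # seasons of 150 or more innings
}


def describe_babip(babip, role):
    """Say where a full-season BABIP sits relative to the typical range."""
    low, high = TYPICAL_RANGES[role]
    if babip < low:
        return "below typical range"
    if babip > high:
        return "above typical range"
    return "within typical range"


print(describe_babip(0.440, "hitter"))   # above typical range
print(describe_babip(0.290, "pitcher"))  # within typical range
```

A value outside the range is a signal to expect movement back toward average, not a verdict on the player.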

League average BABIP is around .300, but it’s important to note the reasonable single-season expectations. It’s not crazy for a player to have a higher BABIP over a smaller sample of plate appearances, but it’s almost impossible to produce a BABIP at those very high or low levels for very long. The best hitters can maintain a .340 or .350 BABIP over a long stretch of time, but there is virtually no way to push the limits much higher over a career. For pitchers, the best you can hope for is about .250 over the long haul, and even that is pretty rare. A .270 long-term mark is a more typical BABIP floor.

With all of that being said, it’s important to note a few things. A player’s career BABIP is almost always going to tell you more than their in-season BABIP. Having a career .305 BABIP and an in-season .340 BABIP likely means the hitter is having good luck. However, the mere presence of a .340 BABIP for any batter does not mean the batter is lucky. Miguel Cabrera or Mike Trout having a .340 BABIP is not luck, but a reflection of their ability. BABIP isn’t a measure of luck; BABIP over a small sample relative to expected (projected or career) BABIP may be a measure of luck. It’s also important to remember the impact of defense on the equation.

A pitcher might be a true talent .305 BABIP pitcher, but if they have great defense behind them, it may be .290 for that season. That isn’t luck, but it also doesn’t have much to do with them. The main difference between hitters and pitchers, aside from defense, is that batters display a wider range of BABIP skill than pitchers (about 100 points versus about 50 points).

This is not to say that talent doesn’t change. Hitters and pitchers get better and worse, but the issue we are talking about here is whether you can pick it up in the numbers over just one season. Let’s say a true talent .300 BABIP hitter becomes a .320 true talent BABIP hitter over the offseason due to a change in their swing. Let’s say we somehow know this for a fact. The problem, however, is that seeing a .320 BABIP for the first month of the season does not provide enough evidence that the hitter is better because BABIP varies a lot in small samples. A .280 true talent BABIP hitter could produce a .320 mark over a month without any problem. It is a stat with a lot of random variation involved, so you need larger samples to discern true performance.
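The claim that a .280 true talent hitter can post a .320 month is easy to check with an exact binomial tail. The sketch below assumes about 75 balls in play in a month and treats each one as an independent trial with a fixed hit probability; both are simplifications, and the function name is hypothetical.

```python
import math


def prob_babip_at_least(true_babip, balls_in_play, observed_babip):
    """P(observed BABIP >= threshold) for a hitter with the given true BABIP."""
    # Fewest hits needed to reach the threshold; the tiny offset guards
    # against floating-point noise when the product is a whole number.
    min_hits = math.ceil(observed_babip * balls_in_play - 1e-9)
    return sum(
        math.comb(balls_in_play, k)
        * true_babip**k
        * (1 - true_babip) ** (balls_in_play - k)
        for k in range(min_hits, balls_in_play + 1)
    )


# Assumed: roughly 75 balls in play over one month.
for talent in (0.280, 0.300, 0.320):
    p = prob_babip_at_least(talent, 75, 0.320)
    print(f"true {talent:.3f} hitter reaches .320 with probability {p:.2f}")
```

Under these assumptions the three talent levels all reach .320 often enough that one month of data cannot tell them apart, which is the point of the paragraph above.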

So the lesson is relatively simple. It is always better to bet on a player’s BABIP track record than their in-season performance going forward. Luck (or random variation) is often the main driver of the different results, but keep in mind that defense can play a role (mainly for pitchers because they have the same defenders all year), and that talent level can change. A BABIP above .400 for a player is almost definitely driven by luck, and so is a BABIP above .370. It doesn’t matter if you think the hitter is good or not; it’s just not very likely that a hitter is a .370 BABIP hitter. The same is true at the bottom end.

Understanding these basic constraints and rules will help you perform better evaluations of players. BABIP is partly about luck, and understanding how to interpret it is very important.
