BaseRuns

August 8, 2016

BaseRuns is a formula designed to estimate how many runs a team would be expected to score (or allow) given their underlying offensive (or defensive) performance. In other words, BaseRuns is a context-neutral run estimator used to evaluate teams. The output of BaseRuns is an expected team run total, which we present on the site on a per game basis.

While wins and losses are what matter in the standings, knowing how many runs a team has scored and allowed often provides a better snapshot of how well a team has played over a given period of time. Beyond that, knowing the underlying offensive and defensive performance on a per plate appearance basis is often even more informative than simple runs scored and runs allowed.

We could use wOBA and wOBA allowed, but we use BaseRuns because research (example) has shown that run scoring is not entirely linear. That is, while linear weights-based metrics such as wOBA and wRC+ (and their runs created counterparts wRAA, Batting Runs, and wRC) do a good job measuring individual performance, the evidence suggests that at the team level, a linear conception of offense can break down at the extremes. For example, the difference in team run scoring between .320 and .340 wOBA teams might not be the same as the difference between .340 and .360 wOBA teams.

This page explains our BaseRuns calculation, why we use BaseRuns, and how to use them yourself.

Calculation:

The ultimate product of the BaseRuns calculation is either a total expected number of runs scored (or allowed) or that same number converted to runs per game. We publish BaseRuns in terms of R/G, but all you have to do is multiply those R/G values by the team’s number of games to arrive at the total expected runs.

BaseRuns uses familiar inputs and the math is simple, but there are quite a few steps. We are going to calculate four separate terms and then we are going to combine those terms using a simple formula. After that, we are going to make an adjustment based on the league to make sure everything is to scale. Keep in mind that you calculate a team’s BaseRuns for offense and defense separately.

BaseRuns Terms:

A = H + BB + HBP – (0.5*IBB) – HR

B = 1.1*[1.4*TB – 0.6*H – 3*HR + 0.1*(BB + HBP – IBB) + 0.9*(SB – CS – GDP)]

C = PA – BB – SF – SH – HBP – H + CS + GDP

D = HR

BaseRuns Formula:

Raw BaseRuns = [(A*B) / (B + C)] + D

BaseRuns League Adjustment:

BaseRuns League Adjustment = [Specific League Runs Scored or Allowed] / [Specific League Raw BaseRuns]

BaseRuns:

BaseRuns = Raw BaseRuns*BaseRuns League Adjustment
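As a rough illustration, the four terms and the raw BaseRuns formula above can be sketched in Python. The `TeamLine` class, the function names, and the sample stat line are all hypothetical and made up for this example; they are not real team data or the site's actual code.

```python
from dataclasses import dataclass


@dataclass
class TeamLine:
    """Counting stats for one side (offense or defense) of one team.

    Hypothetical container for this sketch only.
    """
    PA: int
    H: int
    TB: int
    HR: int
    BB: int
    IBB: int
    HBP: int
    SB: int
    CS: int
    GDP: int
    SF: int
    SH: int


def base_runs_terms(t):
    """Return the A, B, C, D terms exactly as written in the formulas above."""
    a = t.H + t.BB + t.HBP - 0.5 * t.IBB - t.HR            # base runners
    b = 1.1 * (1.4 * t.TB - 0.6 * t.H - 3 * t.HR
               + 0.1 * (t.BB + t.HBP - t.IBB)
               + 0.9 * (t.SB - t.CS - t.GDP))              # runner advancement
    c = t.PA - t.BB - t.SF - t.SH - t.HBP - t.H + t.CS + t.GDP  # outs
    d = t.HR                                               # automatic runs
    return a, b, c, d


def raw_base_runs(t):
    """Raw BaseRuns = (A*B) / (B + C) + D, before the league adjustment."""
    a, b, c, d = base_runs_terms(t)
    return a * b / (b + c) + d


# Made-up season line, for illustration only.
sample = TeamLine(PA=6200, H=1400, TB=2300, HR=200, BB=500, IBB=30,
                  HBP=60, SB=80, CS=30, GDP=120, SF=40, SH=30)
print(base_runs_terms(sample))
print(round(raw_base_runs(sample), 1))
```

Run the same functions on a team's pitching and defensive totals to get raw BaseRuns allowed.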

Okay, so that’s a lot of information to process. Let’s walk through it a bit. The idea behind BaseRuns is that we want to think about a team’s performance in terms of base runners, base runner advancement, outs, and automatic runs. The A term represents base runners, the B term represents runner advancement, the C term represents outs, and the D term is home runs. A, C, and D are very straightforward given that base runners, outs, and homers are just raw numbers you can count.

The B term is where things are less clear, given that we’re trying to model the average number of bases advanced given the underlying numbers. For this reason, BaseRuns might under-value teams that are really good at going first to third or teams with pitchers who are extremely good at holding runners so they don’t get good jumps on batted balls, for instance.

To date, this B term is the best we have based on empirical work, but it could definitely be refined further in the future. The BaseRuns formula then takes those terms and combines them, essentially figuring out how many runs the team is expected to score given the number of runners and how they advance. There’s nothing magic about this formula; it’s simply the one that seems to mirror the run scoring process best. This is also something that could be improved in the future.

The league adjustment is pretty simple. You take all of the runs scored (or allowed) by teams in your specific league (AL or NL) and divide by the sum of their raw BaseRuns. Then you multiply the adjustment by each team’s raw BaseRuns to get their BaseRuns. This adjustment just cleans things up so that runs scored and allowed come out even. You can divide by games played next, if you would like to have it on a per game basis. Remember to do this separately for runs scored and runs allowed for each team.
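The league adjustment described above can be sketched as follows. The function names, team labels, and run totals are hypothetical placeholders rather than real league data, and the sketch assumes you already have each team's raw BaseRuns.

```python
def league_adjusted_base_runs(raw_by_team, actual_runs_by_team):
    """Scale raw BaseRuns so the league total matches actual league runs.

    Both arguments map team name -> season total for teams in ONE league
    (AL or NL). Call it once for runs scored and once for runs allowed.
    """
    adjustment = sum(actual_runs_by_team.values()) / sum(raw_by_team.values())
    return {team: raw * adjustment for team, raw in raw_by_team.items()}


def per_game(base_runs, games):
    """Convert a season BaseRuns total to a per game rate."""
    return base_runs / games


# Made-up three-team "league", for illustration only.
raw = {"Team A": 700.0, "Team B": 750.0, "Team C": 650.0}
actual = {"Team A": 720, "Team B": 740, "Team C": 682}

adjusted = league_adjusted_base_runs(raw, actual)
for team, runs in adjusted.items():
    print(team, round(runs, 1), round(per_game(runs, 162), 2))
```

After the adjustment, the league's BaseRuns sum to the league's actual runs, while each team's share of that total still comes from its own raw BaseRuns.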

Why BaseRuns:

We frequently find ourselves asking the question, “how well has this team played?” The default reaction is to simply look at their record to determine how many wins and losses they have. However, wins and losses are an extremely blunt tool for making this kind of evaluation. Winning 7-1 communicates something different than winning 9-8. Winning 2-1 and losing 10-3 is different than winning 3-2 and losing 4-3.

For this reason, we often look at a team’s runs scored and allowed. This is often called “run differential,” and is considered by many to be a better reflection of underlying performance than W-L record. You don’t get special credit for winning by a lot of runs, but margin of victory is a better indicator of team performance than the binary W/L.

We can actually go further than this, and dive into the individual actions that contribute to run scoring. For example, if your team has an inning that goes 1B-1B-HR-K-K-FO, you end with three runs. If your inning is HR-1B-1B-K-K-FO, you wind up scoring one run. You might only care about the run scoring outcome, but again, the events themselves are a better reflection of how the team performed than the order in which they happened.
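To make the two innings above concrete, here is a toy scorer. It is a deliberately simplified sketch: it assumes every single moves each runner up exactly one base, and it only understands the event codes used in the example (1B, HR, K, FO). The function name is hypothetical.

```python
def runs_in_inning(events):
    """Count runs for a sequence of events under toy advancement rules.

    Assumption (not from the article): a single advances every runner
    exactly one base; K and FO are outs with no runner movement.
    """
    bases = [False, False, False]  # first, second, third
    runs = 0
    outs = 0
    for event in events:
        if outs == 3:
            break
        if event == "1B":
            runs += int(bases[2])                 # runner on third scores
            bases = [True, bases[0], bases[1]]    # everyone moves up one
        elif event == "HR":
            runs += 1 + sum(bases)                # batter plus all runners
            bases = [False, False, False]
        elif event in ("K", "FO"):
            outs += 1
        else:
            raise ValueError(f"unsupported event: {event}")
    return runs


inning_one = ["1B", "1B", "HR", "K", "K", "FO"]
inning_two = ["HR", "1B", "1B", "K", "K", "FO"]
print(runs_in_inning(inning_one), runs_in_inning(inning_two))
```

Both innings contain the same six events, yet the first produces three runs and the second produces one. That gap is the ordering effect that an event-based estimator such as BaseRuns sets aside.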

If you aren’t moved by this concept, W-L record is for you. If you like the idea of run differential, Pythagorean Record is for you, as it’s a W-L estimation based on run differential. If you like the idea of measuring the underlying events, that’s where BaseRuns comes in. (For a little more on these three approaches, click here.)

If we were only talking about individual players, you could look at wOBA or wRC+ (or their cumulative counterparts) and be done. However, research has found that run scoring as a function of these underlying events is not completely linear and that really good offenses typically score more runs than you would expect given their standard wOBA. BaseRuns addresses this problem.

BaseRuns provides you with an expected number of runs scored or allowed for a team based on the club’s 1B, 2B, 3B, HR, BB, IBB, HBP, SB, CS, and GDP. Like most statistics available at FanGraphs or elsewhere, you could make further refinements based on other specifications.

How to Use BaseRuns:

BaseRuns is straightforward to interpret. We present BaseRuns in RS/G and RA/G, but you may also see it presented in RS and RA totals. These are expected runs scored and allowed numbers based on the team’s performance on a per PA level.

Typically, we use BaseRuns as a tool for indicating whether a team’s record reflects that underlying performance. As you’ll see on the site, we take the BaseRuns numbers and calculate an expected winning percentage. If a team’s actual record is better than their BaseRuns record, it often means that they are timing their events in a beneficial way. If their actual record is worse than their BaseRuns record, it often means that they are not timing their events well.
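This page does not say which formula the site uses to turn BaseRuns into an expected winning percentage. As a stand-in, the sketch below assumes the classic Pythagorean expectation with an exponent of 2; both the formula choice and the exponent are assumptions of this example, and the sample numbers are made up.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and allowed.

    Assumption: classic Pythagorean form RS^x / (RS^x + RA^x) with x = 2.
    The site's actual method may differ.
    """
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def record_gap(actual_wins, actual_losses, base_runs_rs, base_runs_ra):
    """Actual winning percentage minus the BaseRuns-based expectation."""
    actual_pct = actual_wins / (actual_wins + actual_losses)
    return actual_pct - pythagorean_win_pct(base_runs_rs, base_runs_ra)


# Hypothetical team: 4.8 BaseRuns RS/G, 4.2 BaseRuns RA/G, 60-50 record.
expected = pythagorean_win_pct(4.8, 4.2)
gap = record_gap(60, 50, 4.8, 4.2)
print(round(expected, 3), round(gap, 3))
```

A positive gap means the team has won more often than its underlying events suggest, and a negative gap means the opposite.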

You might hear this referred to as “luck,” but it is better to think of it as “sequencing” because that removes any kind of value judgment. BaseRuns isn’t saying that a particular team is getting lucky when it outperforms its BaseRuns record; it’s saying that teams who have those underlying stats typically do not win that often. It’s up to the reader to decide if the team is fortunate or doing something that BaseRuns isn’t measuring.

The clearest example of BaseRuns’ blind spots is good base running that occurs while the ball is in play. If a team is very good at going first to third (or preventing the same), BaseRuns won’t pick up on that. Modeling base runner advancement is difficult, and there are definitely places future analysts could look to improve the model.

Keep in mind that BaseRuns is not a projection system. BaseRuns is a run estimator given a set of inputs. The BaseRuns presented on the site are calculations based on the team’s current season performance. If you wanted to estimate how many runs a team would score based on a set of player forecasts, you have the ability to do that with the BaseRuns formula, but the part of our site called “BaseRuns” is telling you the expected runs for a team based on their year-to-date numbers, not their rest of season projections.

Context:

The great thing about BaseRuns is that it looks just like runs scored and allowed. You don’t have to learn any new baselines or averages. A good BaseRuns R/G is the same as a good actual R/G.

Things To Remember:

BaseRuns is used at FanGraphs to evaluate teams.
BaseRuns is based on a team’s year-to-date offensive/defensive numbers, providing an “expected runs” scored/allowed estimate.
BaseRuns can help you understand which teams are scoring/allowing more/fewer runs than you might expect given their raw offensive/defensive numbers.
BaseRuns is presented as RS/G and RA/G at FanGraphs. We also publish an expected W-L record based on those numbers, and the difference between a team’s actual W-L and that expected W-L.

Links To Further Reading:

The Exponential Nature of Offense – The Hardball Times

BaseRuns Wiki – Tom Tango

Team Record, Pythagorean Record, and BaseRuns – FanGraphs

Expected Run Differentials 2.0 – FanGraphs

Base Runs – Buckeyes and Sabermetrics
