Overview

March 12, 2015

For every run scored, a run is allowed. That’s cemented in the rules of the game and the laws of physics. To score a run, someone else has to let a run score. While most of that allowance belongs to the pitcher, there are eight other players on the field with him at any given moment and they deserve some of the blame or credit for how often a run does or doesn’t score.

Unfortunately, for most of baseball history, we’ve had exceptionally lousy practices for measuring defense. Errors and fielding percentage seem to make sense at first but if you peel back the onion at all, they just don’t get the job done. Defense is a meaningful slice of run prevention and we care about measuring run prevention, ergo, we need to do a decent job of measuring defense. How do we go about doing that?

Recently, we’ve settled on defensive metrics measured in “runs saved.” The two most popular ones are Defensive Runs Saved (DRS) and Ultimate Zone Rating (UZR), but there are other efforts to capture the same idea. How many runs does a player prevent while playing the field? That fundamental question sets up everything we’re doing with respect to defensive statistics. There may be other ways to measure skill, and skill certainly predicts the future better than past results, but if we want to measure how well we believe a player has performed, we want to measure the runs he has prevented.

Defensive metrics are not perfect. They suffer from sample size issues, measurement errors, and data availability flaws. In their defense, however, hitting and pitching stats have many of the same types of flaws; we’re just less forgiving when they show up in defensive metrics.

This page will offer an overview of why we need defensive metrics and how they work. Click around to the other pages for specific information on individual metrics.

*****

Why Errors and Fielding Percentage Come Up Short

For decades, the basic measurements of defensive performance were errors and fielding percentage. We used to judge players based on how well they avoided making errors on balls they fielded. If you didn’t kick the ball or let a throw sail wide, you were considered a quality defender. Certainly, those who were able to watch enough games developed a sense about which players were better defenders, but from a statistical standpoint, all we had were errors, assists, and putouts.

These statistics aren’t very useful, however. You certainly want to avoid errors because in order for something to be called an error you have, by definition, failed to convert a batted ball into an out. Yet there are two key problems with errors. First, they are determined by official scorers who don’t always make the right decision. Human error isn’t a problem, per se, but you’ve all seen enough scoring decisions to be skeptical about the quality of their decision making.

More important, however, is that errors are only a subset of misplays. Even if official scorers applied the rule book definition perfectly and uniformly, we would still be ignoring a huge portion of bad defensive plays. Think back to a moment when you watched a player get a horrible jump on an easy ball. Think about the time an infielder took too long to get the ball out of their glove. Picture an easy pop fly falling four feet from the second baseman. None of those are errors even though they are relatively easy plays.

Measuring defense using (Assists + Put Outs) / (Assists + Put Outs + Errors) ignores a huge slice of defense. If a player fails to get to an easy ball, there is no penalty. That alone should be reason enough for you to want something better.
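To see the blind spot in numbers, here is a minimal Python sketch of the fielding percentage formula above. The two third basemen and their totals are invented for illustration and do not describe real players.

```python
def fielding_percentage(putouts: int, assists: int, errors: int) -> float:
    """(Put Outs + Assists) / (Put Outs + Assists + Errors).

    Balls the fielder never reached do not appear anywhere in this formula.
    """
    chances = putouts + assists + errors
    if chances == 0:
        raise ValueError("no chances recorded")
    return (putouts + assists) / chances


# Hypothetical fielders, each with 200 batted balls hit into his zone.
# The rangy fielder reaches 190 of them and makes 10 errors (180 outs).
# The limited fielder reaches only 150 and makes 4 errors (146 outs).
rangy = fielding_percentage(putouts=60, assists=120, errors=10)
limited = fielding_percentage(putouts=50, assists=96, errors=4)

print(f"rangy:   {rangy:.3f} fielding percentage, 180 outs")
print(f"limited: {limited:.3f} fielding percentage, 146 outs")
```

In this made-up case the limited fielder posts the better fielding percentage even though he recorded 34 fewer outs on the same 200 balls.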

Turning Batted Balls Into Outs

So if errors and fielding percentage fail to provide the entire picture of defense, what exactly should we be using? It’s not that errors lack importance, it’s that making an error isn’t the only way to screw up.

Instead of fielding percentage, the next step forward is something like defensive efficiency or Revised Zone Rating (RZR). Both statistics strive to tell you a similar truth: how often a fielder turns batted balls into outs. In other words, we don’t care whether you make an error or if you don’t get to the ball. We care if you made an out or if you didn’t. The distinction between an error and a play not made is arbitrary. If 200 batted balls were hit to the third baseman’s zone during a given period of time, do we care if he made 20 errors and failed to get to 20 balls or if he made 10 errors and failed to get to 30? For the most part, we do not (unless they were horrible throwing errors leading to multiple extra bases). In both cases, he turned 80% of batted balls into outs.
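The third baseman arithmetic can be checked with a few lines of Python. This is a simplified sketch of the idea, not the published formula for defensive efficiency or RZR; the function name `out_rate` is made up for the example.

```python
def out_rate(balls_in_zone: int, errors: int, not_reached: int) -> float:
    """Share of balls in the fielder's zone that became outs.

    Errors and balls never reached are treated identically: both are non-outs.
    """
    outs = balls_in_zone - errors - not_reached
    return outs / balls_in_zone


# The two scenarios from the text: same 200 balls, different mix of failures.
scenario_a = out_rate(200, errors=20, not_reached=20)
scenario_b = out_rate(200, errors=10, not_reached=30)

print(f"{scenario_a:.0%} vs {scenario_b:.0%}")
```

Both scenarios leave 160 outs on 200 balls, so the rate is the same no matter how the 40 failures are labeled.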

This is a much better way to measure defense because it captures every play rather than just the subset of plays in which the fielder came in contact with the baseball. However, this type of metric has its limitations because it does not control for the difficulty or importance of the play.

Not All Batted Balls Are Alike

This is another simple truth about which everyone can agree. A rocket off the bat of Miguel Cabrera and a routine grounder from Seth Smith are very different batted balls. We want a defensive metric that includes all batted balls, but we also recognize that even moving in that direction doesn’t take us far enough.

Turning 80% of balls in your defensive zone into outs is great, but if a large portion of those are easy plays, that’s much less impressive than if they were more difficult.

Difficulty

A screaming line drive up the left center field gap and a routine fly ball to center field are both in the center fielder’s defensive zone. One of those balls is much easier to field than the other, so it stands to reason that more talented defenders would get to the more difficult play more often.

So we want a defensive statistic that does something to control for how challenging that particular play was to make. You should get more credit for a tough play than for an easy play. Typically, the modern defensive statistics (UZR, DRS, etc.) measure this variable by determining how often that play is made by the entire league.

For example, if a certain play is made 40% of the time, then if the fielder makes the play, he gets credit for 0.6 times the run value of that play (we’ll get to this in a second). Because the average fielder should make that play 40% of the time, by making the play you get credit for the difference.
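As a rough sketch of that bookkeeping in Python: the credit for a made play is one minus the league out rate. The text only describes the made case; charging a missed play with the league out rate as a debit is an assumption added here so that an average fielder nets out to zero. The function name `play_credit` is hypothetical.

```python
def play_credit(league_out_rate: float, made: bool) -> float:
    """Plays above (or below) average for a single batted ball.

    league_out_rate: share of similar balls the league turns into outs.
    A made play earns (1 - league_out_rate).
    A missed play is debited league_out_rate (assumption, see lead-in).
    """
    if not 0.0 <= league_out_rate <= 1.0:
        raise ValueError("league_out_rate must be between 0 and 1")
    outcome = 1.0 if made else 0.0
    return outcome - league_out_rate


made_credit = play_credit(0.40, made=True)     # the 40% play from the text
missed_credit = play_credit(0.40, made=False)  # assumed debit for a miss

print(f"made: {made_credit:+.2f}, missed: {missed_credit:+.2f}")
```

Under that assumption, a fielder who converts this kind of ball exactly 40% of the time averages out to zero credit, which is what "average" should mean.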

The advanced defensive stats all include these percentages based on multiple years of data. So if that screaming line drive is caught 30% of the time by center fielders, we’re basing that on all similar line drives over the last six seasons, for example. Humans have to code where the ball was hit, the approximate elevation, etc., but the algorithm is the one analyzing the data. The human being doesn’t say “that was a 40/60 play,” they say, “that ball was hit to X at about Y speed” and the computer compares it to all other similarly coded plays. There can be measurement error in defensive stats, but we aren’t talking about a random person guessing at the difficulty of the play.

Run Value

Using all plays and controlling for their difficulty is important, but we also want to consider how valuable it was to make that play. For example, imagine a hard hit ground ball deep in the hole at short. Maybe one out of every fifteen shortstops is able to turn it into an out. Call it a 7% chance the play gets made. If you make that play, you will get a lot of credit because it was very difficult, but if you hadn’t made the play, how much would it have cost your team?

In addition to difficulty, we also want to add in the average run value of the batted ball in some way. On that tough ground ball, if the play isn’t made then it’s almost always a clean single. It wouldn’t go for extra bases. No one is scoring from first. The play is hard to make, but the cost of failing to execute is lower than if we’re talking about a ball up the gap. So we want to multiply the difficulty by the value of making the play. You can read all about the specifics of this at our UZR Primer, but the concept should be pretty clear. You want the difficulty times the value of making the play.
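Here is a small Python sketch of "difficulty times value." It is not the actual UZR or DRS calculation. The run values (0.75 runs for turning a likely single into an out, 1.05 for a ball up the gap) are placeholder numbers chosen only to illustrate the idea, and the `Play` class and `runs_saved` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Play:
    league_out_rate: float  # share of similar balls the league turns into outs
    run_value: float        # assumed run gap between an out and the typical hit
    made: bool              # did this fielder record the out?


def runs_saved(play: Play) -> float:
    """(outcome - league out rate) * run value for one batted ball."""
    outcome = 1.0 if play.made else 0.0
    return (outcome - play.league_out_rate) * play.run_value


# Grounder deep in the hole: made about 7% of the time, a single if missed.
hole_grounder = Play(league_out_rate=0.07, run_value=0.75, made=True)
# Liner up the gap: made about 30% of the time, extra bases if missed.
gap_liner = Play(league_out_rate=0.30, run_value=1.05, made=True)

for name, play in [("hole grounder", hole_grounder), ("gap liner", gap_liner)]:
    print(f"{name}: {runs_saved(play):+.3f} runs")
```

With these placeholder values, the catch in the gap comes out slightly ahead of the play in the hole even though it is the easier play, because failing to make it costs more.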

So What Do We Have?

Instead of errors and fielding percentage, we want a defensive statistic that considers all batted balls and not just the times the fielder touches the ball. We want a stat that measures the difficulty and the value of each play. That’s exactly what our advanced defensive metrics do. There’s nothing subjective or subversive going on here.

We’re taking some very fundamental questions about every play and we’re using multiple years of data to answer them. Limiting the number of errors you make is good, all else equal, but if you can get to 10% more batted balls than someone else while making a couple more errors, you’re almost certainly the better defensive player.

Limitations

It’s important to note that this does not mean that defensive stats are perfect. We’re relying on imperfect data. The video scouts can’t perfectly determine the location, velocity, and angle of every batted ball from watching the game tape. They do a very nice job, but there is measurement error. There will always be measurement error.

Additionally, sample size is an important consideration. There is a pretty small number of difficult batted balls hit to every fielder each year. If you luck into a few good plays or miss a few because you happened to be working with a bum ankle, your rating can swing quickly. That’s not a flaw of measurement, it’s a fact of baseball. You don’t get 700 chances to make dazzling plays each season. Even if we could get our measurements from an omniscient baseball deity, we couldn’t do anything about sample size.

The same thing is true with offensive statistics. If a hitter goes 20 for 40 (a .500 batting average), you don’t say that his batting average is wrong. He got those 20 hits. What you might say is that 40 at bats is too small a sample to tell us very much about this hitter even if it is an accurate reflection of those 40 at bats.
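One rough way to put a number on "too small a sample" is the binomial standard error of a batting average. The sketch below assumes every at bat is an independent trial with the same chance of a hit, which is a simplification; the 300 for 600 comparison line is invented for contrast.

```python
import math


def batting_average_standard_error(hits: int, at_bats: int) -> float:
    """Binomial standard error sqrt(p * (1 - p) / n) of an observed average.

    Assumes independent at bats with a constant hit probability.
    """
    if at_bats <= 0:
        raise ValueError("at_bats must be positive")
    avg = hits / at_bats
    return math.sqrt(avg * (1.0 - avg) / at_bats)


small_sample = batting_average_standard_error(20, 40)    # the 20 for 40 hitter
large_sample = batting_average_standard_error(300, 600)  # same average, 15x the at bats

print(f"40 at bats:  .500 +/- {small_sample:.3f}")
print(f"600 at bats: .500 +/- {large_sample:.3f}")
```

The .500 average is an accurate record in both cases, but the uncertainty around it is roughly four times wider over 40 at bats than over 600.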

Defensive metrics work the same way with respect to sample size. The metric isn’t wrong just because the output looks too large or small (although it could be wrong), but it might not be a very good reflection of what will happen in the future or how talented the fielder was for the previous few games.

You don’t need to take the precise measurements as gospel and I wouldn’t recommend it. But you should appreciate what these numbers are trying to tell you. These stats are answering the questions you want to have answered. There are all sorts of ways we might improve the measurements included in these stats and the ways in which we use them to determine talent and performance, but the fundamental logic is exactly what you want.

You want to know something about every ball hit to a fielder’s zone. You want to know how often that play gets made and if the fielder made it. And you want to know how valuable that play is on average. You don’t care about errors and put outs. You care about outs and things that aren’t outs. Don’t you?
