Defensive Metrics, Their Flaws, and the Language of Writers

August 22, 2014

If you spent time hanging around the comments section of Dave’s Alex Gordon piece, lurked in the shadows of his conversation with Jeff Passan on Twitter, or are one of those people who searches Twitter for the word “FanGraphs,” you probably saw a decent amount of skepticism about single-season defensive metrics this week. People tossed around words like “flawed” and “absurd.”

The interesting part of the debate, for me at least, was that there was skepticism from both sides. The sabermetric elite dove into an esoteric debate about how best to incorporate defense into WAR, while less analytically minded fans used Gordon passing Mike Trout in WAR as kindling for their “WAR is silly” crusade.

Dave’s piece does a nice job covering exactly what it means to say Alex Gordon leads position players in WAR, but the fact that Dave had to write that piece in the first place speaks to a problem we often run into when using advanced metrics. It’s a communication problem. Dave addresses it, but I’d like to expand on it here because it’s vitally important.

Let’s divide the population into two groups, Saber and Non-Saber. Of course it’s more nuanced than that, but what we’re really talking about is people with a knowledge of advanced statistics and people who don’t have that expertise. The thing about writers is that they develop their own language, a shorthand, that they use when communicating with other writers. Think about your own areas of expertise and you’ll certainly relate.

For example, when social scientists talk with each other, they often speak in a complicated series of citations in which they just rattle off a bunch of last names. Since they’ve all read the same series of books and articles, they don’t need to expand on anything. Saying, “but Author X argues this” is sufficient to move a conversation forward.

We get caught doing this too when using advanced metrics. When a FanGraphs author writes an article, any article, and shows that Player A has a WAR of 4.5 and Player B has a WAR of 3.9, the Saber population knows many things without reading another word. First, they know the author is using FanGraphs WAR. Second, they know that there are other versions of WAR and that those WARs are different. Third, they know that a 4.5 WAR and a 3.9 WAR aren’t different enough to be 100% sure that Player A has been more valuable. The Saber reader knows all of these things without the author doing anything beyond mentioning two WAR values.

The Non-Saber reader understands this quite differently. They see the author making an assertion that Player A is 0.6 WAR better than Player B. And that’s where it ends.
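To make the two readings concrete, here is a minimal Python sketch. The function name `compare_war` and the margin of 1.0 WAR are hypothetical, chosen only for illustration; this is not an official FanGraphs error estimate, just a stand-in for "the gap is smaller than the uncertainty."

```python
def compare_war(war_a: float, war_b: float, margin: float) -> str:
    """Describe the gap between two WAR values, given an assumed uncertainty margin."""
    gap = war_a - war_b
    # If the gap is no larger than the assumed margin, decline to pick a winner.
    if abs(gap) <= margin:
        return "too close to call"
    return "A ahead" if gap > 0 else "B ahead"


# Hypothetical margin, for illustration only; not a published FanGraphs figure.
ASSUMED_MARGIN = 1.0

# The Saber reading of the example above: 4.5 vs. 3.9 with the assumed margin.
print(compare_war(4.5, 3.9, ASSUMED_MARGIN))

# The Non-Saber reading amounts to treating the margin as zero.
print(compare_war(4.5, 3.9, 0.0))
```

With the margin set to zero, any gap at all names a winner, which is how the numbers look to someone who hasn't been told about the uncertainty.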

This is by no means an argument that Non-Saber fans are stupid; it’s just that they aren’t familiar with the shortcuts. When a writer on FanGraphs mentions a player’s single-season defensive performance, they’re building all of our assumptions into the presentation of a basic fact.

Saying Alex Gordon has a 22.4 UZR this year does not mean Alex Gordon is definitely 22 runs better than the average left fielder this year. Saber fans know this. It’s built into the phrase “UZR.” And because it’s built into the phrase, the average FanGraphs writer doesn’t take two lines to explain the limitations of UZR. When someone includes a list of the players with the highest WAR, they don’t take a paragraph to explain how the top six or seven players are all pretty much the same.

This leads to communication problems. You’ve probably heard saber-critics complain that Saber writers use WAR as the be-all and end-all statistic. It’s a common complaint. What’s funny about that is that I’ve never heard a Saber-minded writer make that case. Not once. Saber writers use WAR as a tool, and the audience they regularly write for knows exactly what that tool does and doesn’t tell you. I didn’t see any Saber writers saying Mike Trout was the MVP last year because his WAR was much higher than anyone else’s. Mike Trout was the MVP because he played better than everyone else. WAR is just a tool that approximates how much better he was.

But the thing is, we never say it like that because then everything would be horribly redundant. Could you imagine if every post on FanGraphs that used WAR had a paragraph explaining WAR? That would be madness because most regular readers of FanGraphs know all about WAR and what its uses are. It’s a fundamental challenge of writing for an informed audience while also trying to draw in fans who are coming into the conversation with a clean slate. How much should you explain each stat each time you use it?

That’s why FanGraphs has the Library and this blog. If you see something you don’t understand, we have tools to help you put the pieces together. The problem is getting from Point A to Point B. When someone comes to FanGraphs, looks at our WAR leaderboard, and finds Alex Gordon ahead of Mike Trout, they immediately become skeptical. But they’re in the middle of an unknown unknown. They don’t realize that they are lacking critical information, like which statistics might be subject to larger variation and how precise WAR is trying to be.

Alex Gordon has been one of the most valuable players in baseball so far this year. You can’t construct a valid argument where he ranks much lower than 15 or so among position players. If you’re using WAR as a guide, it works perfectly. If you use WAR as a precise indicator of what FanGraphs considers truth, it doesn’t.

It’s really all about the dialect. If I’m debating a player with another writer here and cite the player’s UZR, we both know exactly how much to trust that value and neither one of us has to say anything about how accurate or precise it truly is. When a newish reader hears this conversation or reads an article, it isn’t always clear.

I’m not sure I have a solution to the problem. I’m actually not certain it’s a problem per se. It’s a fact of life of which we should all be aware. This isn’t about defensive metrics or WAR. It’s about understanding patterns of communication, and that’s super important when dealing with baseball statistics that are reasonably new. FanGraphs doesn’t trust WAR or UZR more than we should; we just don’t constantly bring up the concerns because most of the people we talk to on a day-to-day basis are up to speed. This can lead to problems, but that’s why the Library is here. There’s a gap between the language of writers and fans, and this is hopefully where that gap is bridged.
