The Beginner’s Guide to Using Statistics Properly

by Neil Weinberg
September 15, 2014

We’ve spilled a great deal of virtual ink and audible podcasting words on the nature of Wins Above Replacement (WAR) and defensive metrics recently. Jeff Passan of Yahoo! Sports and many who responded to his critique of the current WAR calculation dug into the relative merits of the metric itself and how well we’ve estimated it to date. That’s a great conversation to have and Dave has done the heavy lifting on behalf of FanGraphs in that regard. I’d like to pivot and discuss a very important point about the use of statistics in baseball: Everything has flaws.

Every single statistic is wrong. Your eyes are wrong. It is all wrong. Nothing we have will provide you with perfect information or even truly accurate information with respect to the underlying variables about which you care. You don’t get to choose between flawed and not flawed statistics; you get to choose between useful and not useful statistics. More importantly, statistics become useful based on your awareness of the proper way to wield them.

Retrospective or Prospective Questions

The order in which you move through this thought process is up to you, but let’s start with the nature of the question itself. You have to decide whether you care about determining how a player performed in the past or how good he’s going to be today, tomorrow, or five years from now. Those are different questions, but we often treat them as if they are the same thing.

Want to know who the best hitter in baseball was over the last year? Sort by wRC+ for that time period and you have your answer, right? Not exactly, but we’ll get there in a second. For now, let’s assume that wRC+ over the last year is the true reflection of who was the best hitter. But if you want to know who the best hitter in the game is, sorting by current wRC+ won’t do the trick.

There’s no perfect way to answer the prospective question here. You want to use some type of projection that estimates that player’s current talent level based on their performance over multiple years, weighted by how recent it is. That’s a very simple way to define projection. But it also gets more complicated because forecasting future performance comes with uncertainty. This means that even if we have a good projection system, we’re going to be uncertain about the precise talent level of each player. Is Trout a true talent 160 wRC+ or 170? 150? We’re not sure. We’re estimating based on an unknown data generating process.
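As a rough illustration of that simple definition, here is a minimal sketch of a recency-weighted estimate. The seasons, the 5/4/3 weights, and the function name are all assumptions made up for this example; real projection systems do considerably more than this.

```python
# Hypothetical past seasons for one hitter: (wRC+, plate appearances),
# ordered most recent first. These numbers are invented for illustration.
seasons = [(165, 650), (150, 600), (170, 620)]

# Assumed recency weights: the most recent season counts the most.
# These weights are an illustrative choice, not those of any real system.
RECENCY_WEIGHTS = [5, 4, 3]


def estimate_talent(seasons, weights):
    """Average past wRC+, weighting each season by recency and playing time."""
    season_weights = [w * pa for w, (_, pa) in zip(weights, seasons)]
    weighted_total = sum(sw * wrc for sw, (wrc, _) in zip(season_weights, seasons))
    return weighted_total / sum(season_weights)


projection = estimate_talent(seasons, RECENCY_WEIGHTS)
print(f"Estimated true-talent wRC+: {projection:.0f}")
```

The output is a single number, which is exactly the problem described above: the estimate hides how uncertain we are about the talent level behind it.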

We have many statistics that capture what happened in the past and we often use those statistics to inform what we think will happen in the future. In the MVP debate, we care only about the season in question, so statistics that describe what already happened in that season are exactly what we want. If we want to decide who we should sign in free agency or offer on the trade market, we want to incorporate additional information and attempt to estimate how well players will perform.

Those are two different questions and they require two different strategies. We seem to appreciate that predicting the future includes uncertainty, but describing the past includes uncertainty as well.

Is that what really happened?

A moment ago, I wrote that a player’s past wRC+ isn’t an exact representation of how well that player performed. On the face of it, my claim sounds strange. The batter accumulated those hits and walks and outs, didn’t they? Of course they did. But that actually neglects the question at hand.

The question we’re asking is “who was the best hitter in baseball?” In reality, wRC+ only tells you who had the highest wRC+. That particular statistic is the best estimate we have of offensive performance right now, but it isn’t a measure of truth. A scorching line drive is a single and a dribble that dies in the grass is also a single. Those aren’t the same thing in anyone’s eyes except those of the official scorer.

In reality, a decent number of the outcomes we observe are conditional on many factors outside of the control of the players to whom we attribute those outcomes. We do our best to control for those factors, but we miss plenty. Our park factors could be more nuanced. We could control for quality of competition. We could measure performance based on exit velocity and trajectory rather than singles, doubles, and fly outs.

We don’t do those things for a variety of reasons. Some of them are impossible with the available data and some are really hard to get right. We don’t know if the player with the highest wRC+ was actually the best hitter, but it’s the best we can do right now.

People often complain about the uncertainty and flaws of defensive metrics, but offensive stats have many of the same flaws. You just don’t notice them as much because there’s more data to wash away some of the concerns. BABIP is a great example of this.

Hitters can influence their BABIP more than pitchers, but there is still a lot of noise amid that signal. If you get ten seeing-eye singles you probably didn’t hit as well as a guy who hit ten rockets to the left fielder on one hop. We know this intuitively, but we don’t conceptualize it the same way. There is lots of uncertainty in all of our statistics. We’re just used to the offensive uncertainty and mentally regress performance much more easily than we do with new defensive numbers.

Proper Usage

The key to this entire endeavor is having a clear sense of the question you want to ask and the best way to go about answering it. Think about it this way: do you actually care who leads the league in batting average? Really think about it. You don’t. You may care which hitter is the best at getting on base or providing offensive value, but you don’t actually care about hits divided by at bats.

Batting average is supposed to be a tool that tells you how well a player has performed as a hitter. Just like wOBA or wRC+ is supposed to be a tool for the same thing. Leading the league in any of those doesn’t make you the best hitter or the hitter who had the best season, it makes you the person who led the league in a category.

You have questions about the game and we have tools that go about answering those questions to the best of our abilities. You can’t get caught up in the raw output of the stats because they don’t tell you anything if you don’t know how to interpret them.

Think about WAR. WAR is imperfectly calculated because calculating it perfectly is impossible. We care about trying to uncover who is the all-round most valuable player. WAR is a tool that allows us to work toward an answer. WAR gives us approximations of player value that we can use to separate groups of players. Typically, 6 WAR or higher gets you into the MVP conversation.

WAR doesn’t tell you who the MVP was; WAR helps you filter out players who definitely aren’t in the MVP conversation. In the same way, wOBA helps you determine roughly how good a hitter has been. A .400 wOBA and a .405 wOBA aren’t different enough to tell you anything. And they especially aren’t different enough to tell you anything over the span of 40 plate appearances.
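To get a feel for how noisy 40 plate appearances are, here is a small simulation sketch. It makes a simplifying assumption: instead of wOBA, it tracks a plain reached-base-or-not outcome with an assumed true rate of .350, and it treats every plate appearance as independent. The constants and function name are hypothetical.

```python
import random
import statistics

random.seed(44)  # fixed seed so the sketch is repeatable

TRUE_ON_BASE_RATE = 0.350  # assumed identical true talent in every sample
PLATE_APPEARANCES = 40
TRIALS = 10_000


def observed_rate(true_rate, plate_appearances):
    """Simulate one stretch of plate appearances and return the observed rate."""
    times_on_base = sum(random.random() < true_rate for _ in range(plate_appearances))
    return times_on_base / plate_appearances


# Every sample comes from the same underlying hitter; only luck differs.
samples = [observed_rate(TRUE_ON_BASE_RATE, PLATE_APPEARANCES) for _ in range(TRIALS)]
average = statistics.mean(samples)
spread = statistics.stdev(samples)

print(f"Average observed rate: {average:.3f}")
print(f"Standard deviation across samples: {spread:.3f}")
```

Under these assumptions the spread between identical hitters is many times larger than a five-point gap, which is why a gap that small over 40 plate appearances carries essentially no information.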

To that end, you need to know the limitations of every statistic you use. We hear a lot about WAR’s flaws. You know what other stats are flawed? Literally all of them. Every single one. You know what else is flawed? The eye test. That’s true if you’re a scout or a casual fan (although the scouts are usually better!) and it’s true if you watch fifteen games or 150.

In statistics, there’s a pretty common axiom: All models are wrong, but some models are useful. A toy airplane doesn’t do you any good if you want to learn about the way a jet burns fuel, but it’s very useful if you want to get a sense of the relative size of the wings to the propeller. The same is true with baseball stats. WAR is great if you want to get a sense of a player’s overall contribution, but it can’t tell you anything about the competition that player faced (yet, at least), for example.

On-base percentage is great at telling you the frequency with which a player reached base, but it doesn’t tell you anything about extra base power. And no retrospective statistic can tell you what’s going to happen in the future, either.
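A quick sketch makes the on-base percentage point concrete. The two stat lines below are invented, and the class and variable names are hypothetical; the formulas are the standard ones for OBP and isolated power (ISO).

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """A made-up stat line; none of these numbers belong to a real player."""
    at_bats: int
    singles: int
    doubles: int
    triples: int
    home_runs: int
    walks: int
    hit_by_pitch: int
    sacrifice_flies: int

    @property
    def hits(self) -> int:
        return self.singles + self.doubles + self.triples + self.home_runs

    def obp(self) -> float:
        # On-base percentage: times reached base divided by chances.
        reached = self.hits + self.walks + self.hit_by_pitch
        chances = self.at_bats + self.walks + self.hit_by_pitch + self.sacrifice_flies
        return reached / chances

    def iso(self) -> float:
        # Isolated power: extra bases per at bat (slugging minus batting average).
        extra_bases = self.doubles + 2 * self.triples + 3 * self.home_runs
        return extra_bases / self.at_bats


# Both hypothetical hitters have 150 hits and 50 walks in 500 at bats.
slap_hitter = BattingLine(500, 140, 8, 2, 0, 50, 0, 0)
slugger = BattingLine(500, 80, 35, 0, 35, 50, 0, 0)

for name, line in [("Slap hitter", slap_hitter), ("Slugger", slugger)]:
    print(f"{name}: OBP {line.obp():.3f}, ISO {line.iso():.3f}")
```

The two lines share an identical OBP while their power output is nowhere near the same, so OBP alone cannot separate them.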

We’ve been working toward building better and better models, but we’re not anywhere close to the truth. The flaws in WAR aren’t reasons not to use WAR. They are reasons to use WAR only for what WAR can tell you. The same is true for average and wOBA and strikeout percentage.

Everything we have to evaluate baseball is flawed in some way. The only way around this is to understand the flaws and properly account for them by installing measures of uncertainty in everything you do. You’ve known forever that a .301 average and a .299 average are the same thing. You can’t tell the hitters apart and you certainly couldn’t tell me which one is a better hitter from only those two pieces of data.
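To put a rough number on that intuition, here is a minimal sketch that treats each at bat as an independent trial with the same chance of a hit, which is a simplifying assumption. The 500 at bats and the function name are chosen purely for illustration.

```python
import math


def batting_average_standard_error(average, at_bats):
    """Binomial standard error, treating each at bat as an independent trial."""
    return math.sqrt(average * (1 - average) / at_bats)


AT_BATS = 500  # assumed full-season sample, chosen only for illustration

gap = 0.301 - 0.299
noise = batting_average_standard_error(0.300, AT_BATS)

print(f"Gap between the two averages:  {gap:.3f}")
print(f"Standard error of one average: {noise:.3f}")
print(f"Gap as a share of the noise:   {gap / noise:.2f}")
```

Under these assumptions the two-point gap is a small fraction of the sampling noise in either average, so the numbers alone give no basis for ranking the two hitters.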

This carries into everything else. Defensive metrics aren’t perfect yet. The way we incorporate defense into WAR likely isn’t perfect, either. But it’s the best we’ve been able to do so far and the absence of perfection does not mean the absence of utility. A stat with flaws is better than nothing at all as long as you are aware that the flaws exist.

We’re working toward better measures, but nothing is perfect and everything requires caution. This isn’t just about WAR. It’s about everything we do.

Questions, thoughts, comments? Comment below!






Neil Weinberg is the Site Educator at FanGraphs and can be found writing enthusiastically about the Detroit Tigers at New English D. Follow and interact with him on Twitter @NeilWeinberg44.

1 Comment
Gavin
8 years ago

Bravo! The only thing I would add is that statistics as a field is not meant to get you a hard concrete number. The point of statistical analysis (and good analysis at that) is to get as narrow a range as possible around a particular statistic.

For example, a 6.0 WAR should come with a range, although that’s too annoying in everyday life. People like clean. But a more accurate way of looking at it would be to say 6.0 WAR, +/- .7 WAR, meaning 5.3-6.7 WAR. You would know with 95% certainty that what you’re looking for is in that range. And you could get a sense specifically for how sure you would be about that particular statistic, much more so than if it was a 6.0 WAR, +/- 2.7 WAR…
