Regression toward the Mean

by Piper Slowinski
February 16, 2010

In conversations about baseball statistics, the word “regression” is used quite often, but it carries two distinct meanings, and it’s important to separate them. Colloquially, “regress” is often used to mean movement backwards. The dictionary definition is something like “returning to a former or less developed state.” You will absolutely hear people use the word this way to describe baseball players. If a good player gets worse, they can be said to have regressed. That is, their talent has declined.

However, this is not usually what we mean when we are talking about baseball statistics, so it’s important to be precise with your terminology. We are typically talking about the statistical concept known as “regression to/toward the mean.” Regression toward the mean (RTM for clarity in this article) is the concept that any given sample of data from a larger population (think April stats) may not be perfectly in line with the underlying average (think true talent/career stats), but that going forward you would expect the next sample to be closer to the underlying average than the first sample. Observations tend to cluster around the average value, even if the previous value is unusual.

Let’s use a concrete example. Imagine you have a player with a career OBP of .350. Over the last few seasons it’s been .340, .360, .340, .360, and .350. Let’s assume the league’s run environment has stayed the same and the player is around 28, so there is no particular reason to expect his talent level to change or for his OBP to spike due to a clear external factor. He is, as best as we can tell, a true talent .350 OBP hitter.

But now let’s imagine we observe his next 100 PA in which he posts a .300 OBP. What should we think about his next 500-600 PA based on the information we have? In other words, do those 100 PA at .300 OBP alter the way we think about the player and by how much?

Any sample of PA contains potentially useful information. Maybe he’s hurt, maybe he’s aging poorly, maybe the league learned to exploit a weakness. Maybe his true talent has changed. But when we are asked to assess this player, the previous five seasons carry a lot of weight. We don’t just forget about them because our player had a bad April. So to forecast his future performance, we need to consider RTM. It’s more likely that he will perform close to his career average (or some weighted version of it) than close to the sample of plate appearances immediately preceding the question.
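One simple way to picture "some weighted version" of the career average is to blend the old and new samples in proportion to their plate appearances. The sketch below is only an illustration, not a method FanGraphs prescribes: the function name `regressed_obp` is hypothetical, and the 3,000 PA figure assumes roughly 600 PA in each of the five prior seasons, which the example above does not actually state.

```python
def regressed_obp(prior_obp, prior_pa, sample_obp, sample_pa):
    """Blend a long-run OBP with a new sample, weighting each by its plate appearances."""
    total_pa = prior_pa + sample_pa
    return (prior_obp * prior_pa + sample_obp * sample_pa) / total_pa


# Assumption: five prior seasons at about 600 PA each, i.e. 3000 PA at a .350 OBP.
estimate = regressed_obp(prior_obp=0.350, prior_pa=3000,
                         sample_obp=0.300, sample_pa=100)
print(f"{estimate:.3f}")  # prints 0.348
```

Under these assumptions the 100 cold plate appearances move the estimate only from .350 to about .348. Real projection systems weight recent seasons more heavily and adjust for age, so treat this as the bare idea rather than a forecast.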

RTM is neither positive nor negative. It’s a push toward average. If our player had posted a .400 OBP, the exact same properties would apply. To put it another way, any one small sample is less informative than a much larger sample, even if the larger sample is slightly older. So when a player gets off to a hot or cold start, we want to factor in RTM.

Keep in mind there is no “correct” way to account for RTM in baseball. It’s a conceptual framework, and like most conceptual frameworks there are exceptions. Players’ underlying true talent does change from time to time based on a variety of factors. If a pitcher learns a new pitch, their history is still useful, but it’s much less useful than it is for a pitcher who is using their same arsenal.

The idea behind using RTM in baseball is that we can’t directly measure true talent; we can only infer it from observing outcomes on the field. Baseball has a lot of randomness that makes individual observations fluctuate around the player’s true talent. Picture a line drive being caught by a leaping defender and a weak grounder finding a hole. Because we can’t measure true talent directly, we can’t say for sure when it changes and when we are simply observing a set of data points that are different from that talent level for unrelated reasons.

In other words, because of the randomness (factors unrelated to the talent of the player we care about) involved in generating baseball outcomes, it takes a long time for the statistics we create to tell us exactly how good a player is. This means that any one section of data might not be a clear reflection of the underlying average. So going forward, we expect the data to look more like the overall numbers rather than a single, recent sample. We must regress any new data toward the mean.
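To get a rough sense of how much a sample can stray from the underlying average, one can treat each plate appearance as an independent on-base/out trial with a fixed probability. That is a simplifying assumption (real plate appearances are not identical or independent), and the helper name `obp_standard_error` is hypothetical. The sketch computes the standard binomial spread of observed OBP around a .350 true talent for a few sample sizes.

```python
import math


def obp_standard_error(true_obp, pa):
    """Standard deviation of observed OBP over `pa` independent plate appearances."""
    return math.sqrt(true_obp * (1 - true_obp) / pa)


# Sample sizes chosen for illustration: a month, a season, several seasons.
for pa in (100, 600, 3000):
    se = obp_standard_error(0.350, pa)
    print(f"{pa:>5} PA: about +/- {se:.3f}")
```

Under this model a 100 PA sample has a spread of roughly .048, so a .300 OBP from a true .350 hitter is only about one standard deviation away from expectation. At 3,000 PA the spread shrinks to under .010, which is why the longer record is the better guide.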

As I noted, this is not a formulaic rule. Sometimes a player’s talent level changes. But RTM is accounting for the fact that you can observe outcomes that are not in line with a player’s true talent simply due to randomness and that going forward true talent is a better predictor. Think of it this way:

Outcomes = Talent + Randomness

We can only observe outcomes, but we care about talent. We want to sort out randomness by getting the randomness to cancel itself out over a long period of time. Randomness is most likely to confuse you in short samples, so that’s why we use larger samples (i.e. regression toward the mean) to inform our opinions.
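The relationship above can be explored with a small simulation. The sketch below is a toy model under stated assumptions: every simulated hitter has the same fixed .350 true OBP, each plate appearance is an independent trial, and the names `TRUE_OBP` and `simulate_obp` are made up for this illustration. It picks out the hitters who happened to start at .300 or worse over 100 PA and checks what the same hitters did over their next 500 PA.

```python
import random

TRUE_OBP = 0.350          # assumed identical true talent for every simulated hitter
rng = random.Random(0)    # fixed seed so the run is repeatable


def simulate_obp(pa):
    """Observed OBP over `pa` plate appearances: fixed talent plus random variation."""
    times_on_base = sum(rng.random() < TRUE_OBP for _ in range(pa))
    return times_on_base / pa


cold_first, cold_next = [], []
for _ in range(2000):
    first_100 = simulate_obp(100)
    next_500 = simulate_obp(500)
    if first_100 <= 0.300:            # keep only the hitters with a cold start
        cold_first.append(first_100)
        cold_next.append(next_500)

mean_first = sum(cold_first) / len(cold_first)
mean_next = sum(cold_next) / len(cold_next)
print(f"cold starters: {len(cold_first)}")
print(f"mean OBP, first 100 PA: {mean_first:.3f}")
print(f"mean OBP, next 500 PA:  {mean_next:.3f}")
```

Because talent never changes in this toy model, the cold start is pure randomness, and the group's next 500 PA should average close to .350 rather than close to the cold-start figure. Real players differ from this model precisely where talent does change, which is the exception the article describes.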

Links for Further Reading:

Regression to the Mean – Wikipedia

But I Regress – Hardball Times

What’s Past is Prologue – Hardball Times

Estimating Hitter Platoon Skills – FanGraphs






Piper was the editor-in-chief of DRaysBay and the keeper of the FanGraphs Library.


Updated: Monday, May 16, 2022 11:07 PM ET