​
​
        The Beginner’s Guide to Sample Size

        by Neil Weinberg
        April 3, 2015

        A baseball season is the amalgamation of a lot of little events. Each pitch fits into a plate appearance which fits into an inning which fits into a game which fits into a series which fits into a season. That’s a lot of little data points flowing into an overall end result. We care a lot about which players will have good seasons and careers. It matters to us that we can distinguish between good players and bad players, but doing so requires that we understand which chunks of data are meaningful and which aren’t.

        Enter sample size. You’ve heard this phrase plenty over the last few years when talking about baseball statistics, and it’s usually a conversation ender rather than a conversation starter. Someone cites a stat and then another person says it doesn’t matter because the sample size is too small. What does that mean, and how should we properly think about sample size in baseball?

        Each little moment in baseball is essentially random. Not random in the sense that all outcomes are equally likely, but random in the sense that the most likely outcome doesn’t happen every time. If the best hitter in baseball faced the worst pitcher 100 times, he would very likely strike out a couple of times and hit into a double play or two. He wouldn’t always hit a home run even if it was Coors Field and the pitcher was throwing meatballs. Think about the home run derby. MLB players can’t simply hit home runs on demand even when the pitcher is trying to help.

        When dealing with pitches flying 90+ miles per hour and split second movements, a whole bunch of randomness gets thrown into the pot. This means that any one plate appearance might have a funky result. You know this. One time Don Kelly took Yu Darvish deep.

        So of course, we know that a single plate appearance isn’t a convincing amount of data. Even the least sabermetrically minded person agrees with that concept. That single plate appearance is a valid data point, but it’s not enough information to inform your opinion very fully. Instead, you need more and more data points until you have enough for them to “stabilize.” Remember that word, because we’re going to define it in a very specific way in a moment.

        Essentially, we want to make sure we have enough observations that the random noise gets cancelled out. Don Kelly hit a home run against Yu Darvish one time, but how many Kelly versus Darvish at bats do we need before we can accurately assess their abilities? It’s more than one for sure, but the actual number you need depends on the skill you’re trying to analyze.

        Some skills are more “stable” than others. For example, strikeout rate stabilizes in fewer than 100 PA while BABIP for a pitcher can take three years. The difference is the nature of the skill and the number of factors that influence the outcome of the play. With respect to strikeout rate, we’re only talking about the batter’s and pitcher’s ability to make or allow contact (or let strikes go by). When you’re talking about BABIP, you’re adding in quality of contact, direction, weather, defensive ability, luck, etc. That means there’s more room for noise, and things with less noise in the actual data generating process stabilize more quickly.

        So let’s go back to this idea of stabilization. Conceptually, it’s an ironclad idea. You want to know how many data points it takes for the current information to provide an accurate assessment of the player in question. There’s no one point at which something stabilizes. Things become stable over time at a given speed. So after five PA, you know more about a hitter’s walk rate than after one PA, but you don’t know as much as you do after 150 PA or 600 PA. A statistic doesn’t stabilize, it becomes more stable.
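
To put rough numbers on "more stable," here is a minimal sketch that treats each PA as an independent coin flip and computes the binomial standard error of an observed walk rate. The 8% true walk rate is an assumed value for a hypothetical hitter, not a figure from this article.

```python
import math


def rate_standard_error(true_rate: float, n_pa: int) -> float:
    """Binomial standard error of an observed rate after n_pa plate appearances."""
    return math.sqrt(true_rate * (1.0 - true_rate) / n_pa)


# Assumption: a hypothetical hitter whose true walk rate is 8%.
TRUE_WALK_RATE = 0.08

for n_pa in (1, 5, 150, 600):
    se = rate_standard_error(TRUE_WALK_RATE, n_pa)
    print(f"{n_pa:>3} PA: standard error of observed walk rate = {se:.3f}")
```

The error shrinks with the square root of the sample, so going from 150 PA to 600 PA only cuts it in half. There is no cliff where the number suddenly becomes trustworthy.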

        In baseball, we lean on some work by Russell Carleton (aka Pizza Cutter), who looked to see how many PA you need for a given statistic to reach the point where the correlation between that sample and another sample of the same size is 0.7 (i.e. R^2 of .49). That’s the colloquial definition of stabilize. So the rates you see on this page reflect that.
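
The idea of correlating one sample against another sample of the same size can be sketched with a small simulation. This is not Carleton's actual method or data. It is a toy model in which every simulated player has a fixed true rate and every PA is an independent draw. The league mean (20%), the spread in talent (4 percentage points), and the number of players are all assumptions chosen for illustration.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def split_sample_correlation(n_pa, n_players=1000, mean_rate=0.20,
                             talent_sd=0.04, seed=0):
    """Correlation between two independent n_pa-sized samples per player.

    mean_rate and talent_sd are assumed values for a hypothetical stat.
    """
    rng = random.Random(seed)
    first, second = [], []
    for _ in range(n_players):
        # Each player's true rate, clipped so it stays a valid probability.
        talent = min(max(rng.gauss(mean_rate, talent_sd), 0.01), 0.99)
        first.append(sum(rng.random() < talent for _ in range(n_pa)) / n_pa)
        second.append(sum(rng.random() < talent for _ in range(n_pa)) / n_pa)
    return pearson(first, second)


for n_pa in (25, 100, 400):
    r = split_sample_correlation(n_pa)
    print(f"{n_pa:>3} PA per sample: r = {r:.2f}")
```

In this toy setup the correlation climbs steadily as the samples get bigger, and the 0.7 mark is just one point along that climb.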

        But the key is that 100 PA is better than 50 PA no matter the statistic, but 50 PA is more useful for plate discipline stuff than it is for batted ball stuff. The rates are different, but it’s always better to have more data.
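
One way to see both halves of that claim at once is a simple model in which the correlation between two samples of n PA equals n / (n + M), where M is the ratio of per-PA noise variance to the variance in true talent. That formula follows from assuming independent PA and fixed talent, which is an assumption of this sketch and not something the article states. The two M values below are hypothetical, picked only to contrast a "fast" stat with a "slow" one.

```python
def reliability(n_pa: float, m: float) -> float:
    """Expected correlation between two n_pa-sized samples under the simple model."""
    return n_pa / (n_pa + m)


def pa_needed(target_r: float, m: float) -> float:
    """Sample size at which reliability(n, m) reaches target_r."""
    return target_r * m / (1.0 - target_r)


# Hypothetical noise-to-talent ratios (assumptions, not measured values).
FAST_STAT_M = 60    # few moving parts, e.g. a plate-discipline-like stat
SLOW_STAT_M = 800   # many moving parts, e.g. a batted-ball-like stat

for n_pa in (50, 100):
    print(f"{n_pa:>3} PA: fast stat r = {reliability(n_pa, FAST_STAT_M):.2f}, "
          f"slow stat r = {reliability(n_pa, SLOW_STAT_M):.2f}")

print(f"PA to reach r = 0.7: fast = {pa_needed(0.7, FAST_STAT_M):.0f}, "
      f"slow = {pa_needed(0.7, SLOW_STAT_M):.0f}")
```

More PA always raises the correlation for either stat, but at any given sample size the stat with less noise is further along.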

        For practical purposes, you really want to know the difference between a sample that’s meaningful and one that isn’t. There isn’t a point at which it becomes useful data all of a sudden, but there are quantities that are clearly one or the other. This is going to be important when the season starts next week.

        Every April, at least one previously bad hitter has an awesome month. They have a .380 wOBA over three weeks and lots of people rush to suggest they are a breakout candidate who did something during the offseason to improve. It’s important to note that this may be true or it may not be true. All we know is that they hit .380 wOBA over three weeks, let’s call it 85 PA.

        Are those 85 PA enough to lead us to totally change our opinion about this bad hitter to the point where we now think they are fundamentally different in the box? Using our sample size rules of thumb, the answer is no. A bad hitter can easily have a .380 wOBA over 85 PA without actually being a different hitter, just due to random chance. A couple of lucky bounces and a well-timed cluster of hits, and his numbers look great even if he’s no different than he was before.

        Those 85 PA give you some idea that he might be improving, but they are not sufficient to change your mind completely. A true .380 wOBA hitter should hit .380 over more stretches than a .310 wOBA hitter, but a .310 wOBA hitter can hit .380 over a month no problem.
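
Here is a rough simulation of that claim. It builds a hypothetical hitter whose per-PA outcome probabilities work out to about a .311 wOBA, then counts how often an 85 PA stretch comes in at .380 or better. The event weights are round numbers in the general neighborhood of published wOBA weights (which change every season), and the outcome probabilities are invented for illustration, so treat everything in the table as an assumption.

```python
import random

# Assumed per-PA (weight, probability) pairs for a hypothetical hitter.
# Neither the weights nor the probabilities are official values.
EVENTS = {
    "out": (0.00, 0.685),
    "BB": (0.69, 0.080),
    "1B": (0.88, 0.160),
    "2B": (1.25, 0.045),
    "3B": (1.58, 0.004),
    "HR": (2.00, 0.026),
}

WEIGHTS = [w for w, _ in EVENTS.values()]
PROBS = [p for _, p in EVENTS.values()]
TRUE_WOBA = sum(w * p for w, p in EVENTS.values())  # roughly .311


def share_of_hot_stretches(n_pa=85, threshold=0.380, trials=20000, seed=1):
    """Fraction of simulated n_pa stretches with wOBA at or above threshold."""
    rng = random.Random(seed)
    hot = 0
    for _ in range(trials):
        total = sum(rng.choices(WEIGHTS, weights=PROBS, k=n_pa))
        if total / n_pa >= threshold:
            hot += 1
    return hot / trials


print(f"true wOBA: {TRUE_WOBA:.3f}")
print(f"share of 85 PA stretches at .380 or better: {share_of_hot_stretches():.3f}")
```

Under these assumptions a normal approximation puts the share on the order of one stretch in ten. With hundreds of hitters in the league, a few of them posting a month like that is exactly what you would expect with no change in talent at all.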

        Think of it this way. A true talent .300 hitter might go 3-10 over a stretch or they might go 6-10. That wouldn’t be strange at all and you wouldn’t change your mind about a hitter over 10 PA. The same is true for 50 or 100 with most stats. It seems meaningful early in the year when you don’t have other fresh data, but it’s not.
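
The 10 at-bat example can be checked directly with the binomial distribution, assuming each at bat is an independent trial with a 30% chance of a hit. That independence assumption is a simplification made for this sketch.

```python
from math import comb


def prob_exact_hits(n_ab: int, hits: int, p: float) -> float:
    """Probability of exactly `hits` hits in n_ab independent at bats."""
    return comb(n_ab, hits) * p ** hits * (1 - p) ** (n_ab - hits)


def prob_at_least(n_ab: int, hits: int, p: float) -> float:
    """Probability of `hits` or more hits in n_ab independent at bats."""
    return sum(prob_exact_hits(n_ab, k, p) for k in range(hits, n_ab + 1))


P_HIT = 0.300  # true-talent .300 hitter

print(f"exactly 3-for-10:    {prob_exact_hits(10, 3, P_HIT):.4f}")
print(f"6-for-10 or better:  {prob_at_least(10, 6, P_HIT):.4f}")
```

Going exactly 3-for-10 happens only about 27% of the time, and 6-for-10 or better happens a little under 5% of the time. Split a 500 at-bat season into 50 separate 10 at-bat chunks and you would expect a couple of those hot chunks from a hitter who never changed.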

        This isn’t to say that streaky hitters don’t exist or that the “hot hand” is a fallacy. That’s a separate issue. This is an argument, backed by extensive data, that a collection of 40 PA is not more meaningful than the 500 that came before.
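
As a back-of-the-envelope illustration of why 40 PA should not outweigh 500, here is a sketch that simply pools the two samples by plate appearances. The .310 and .420 wOBA figures are made up, and the pooling assumes the hitter's talent has not changed and that every PA counts equally. Real projection systems do more than this, including weighting recent data and regressing toward the mean.

```python
def pooled_rate(prior_rate: float, prior_pa: int,
                recent_rate: float, recent_pa: int) -> float:
    """PA-weighted average of two samples, treating every PA equally."""
    total_pa = prior_pa + recent_pa
    return (prior_rate * prior_pa + recent_rate * recent_pa) / total_pa


# Hypothetical hitter: .310 wOBA over the previous 500 PA, .420 over the last 40.
estimate = pooled_rate(0.310, 500, 0.420, 40)
print(f"pooled wOBA over 540 PA: {estimate:.3f}")
```

Even a scorching 40 PA run moves the pooled number by less than ten points, because it makes up only about 7% of the combined sample.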

        This is tricky to internalize because when a player has success, you want to find a reason other than randomness because randomness is not an easy thing for the human mind to handle. But in many cases, it’s the right answer. Each player’s set of outcomes is drawn from a probability distribution around their true talent level. Sometimes those talent levels change, but you need a lot of evidence to believe that is the case. It’s very possible that a good hitter has a bad set of results over a short stretch just based on random stuff happening.

        So try not to make too much of the April results. You can definitely look at the underlying performance, but don’t make too much of the end product. A player might be hitting the ball harder this April, and that might be a sign of a new swing, but just the fact that they have a few more hits than normal doesn’t mean that’s going to continue.






        Neil Weinberg is the Site Educator at FanGraphs and can be found writing enthusiastically about the Detroit Tigers at New English D. Follow and interact with him on Twitter @NeilWeinberg44.

        Updated: Friday, May 16, 2025 11:36 PM ET