        PitchingBot Pitch Modeling Primer

        by Cameron Grove
        March 10, 2023

        Introduction

        The wealth of Statcast data that has become available to the public has led to a plethora of new ways to analyze players. We can now create statistics that are more indicative of a player’s approach and process rather than purely relying on outcomes.

        Expected statistics such as xBA and xwOBA have entered mainstream discourse, with fans understanding that these can sometimes be more useful than the results on the field. A hitter may barrel the ball for a deep fly out, but that barrel indicates that better results are likely to come in the future. Similarly, a pitcher can throw a great pitch that gets hit for a home run, but that great pitch may be indicative of future success.

        xBA takes a batted ball’s exit velocity and launch angle and uses a model to produce the probability of a hit. Pitch quality grades go one step further back: they use the characteristics of individual pitches to assess pitcher quality, with no reference to any outcome after the ball is thrown.

        A number of independently derived pitch quality metrics have recently cropped up in public analysis. I developed this model over the past couple of years under the name PitchingBot, and used to host these grades on my website.

        Why Model Pitch Quality?

        Before going into too much detail, I should explain why these statistics are useful. Because they remove all references to outcomes, these grades are much more stable than other pitching statistics, which makes it possible to judge player quality more quickly and over smaller samples. An analyst assessing a pitcher must pay attention to a wide range of features, including velocity, movement, spin rate, release height and extension, spin/movement axis deviation, and location, among others. It can be tough to combine all of this information in your head, but these models can weigh everything appropriately and distill it into a single number representing overall quality.

        There’s a reason that most, if not all, major league organizations have their own pitch quality models. These models’ outputs are driving pitch usage changes, and analytically driven teams are some of the best at improving the pitch quality of players they acquire. There are also applications beyond measuring the quality of major league pitchers. Pitch quality models can help to develop minor league players and measure their ability without their needing to face big league hitters. They can also inform decisions on where pitches should be thrown in different scenarios to produce desired outcomes.

        These models use a completely different data source than most statistics that measure pitcher ability. There’s no point in mixing SIERA and xFIP together to produce a new statistic because they use very similar information. Mixing pitch quality grades with existing ERA estimators may provide a projection that is better than either statistic could produce independently because the grades provide new information.

        How is Pitch Quality Measured?

        This section is a detailed overview of how the pitcher grades are produced — feel free to skip ahead if you don’t want to see how the sausage is made.

        The core of the pitcher grading model is a set of smaller sub-models, each predicting the likelihood of an individual event. The flowchart below shows how these sub-models can be joined together to get a full set of predicted outcomes. The reason for using so many sub-models is that limiting the scope of each one lets it make much better predictions than a more general model that must divide its attention among many tasks.

        Pitches are categorized into fastballs, breaking balls, and offspeed pitches, each type with its own set of prediction models.

        A benefit of splitting the models up in this way is that it makes the grades less of a “black box.” By digging into the predictions, it is possible to see why a player gets good grades; the models may think they will get lots of swings and misses, or generate weak contact and groundballs.

        The input variables used by the models are:

        • Contextual variables
          • Pitcher handedness
          • Batter handedness & strike zone height
          • Count (balls & strikes)
        • Stuff variables
          • Velocity
          • Spin rate
          • Horizontal and vertical movement
          • Release point and extension
          • Spin efficiency (estimated) and spin/movement axis deviation
          • Difference in velocity and movement to the pitcher’s primary fastball
        • Location variables
          • Pitch height and horizontal position at the plate
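One item in the list above, the difference in velocity and movement relative to the pitcher's primary fastball, is a derived feature rather than a raw measurement. The following is a minimal sketch of how such a feature could be computed. The `Pitch` class, the field names, the units, and the sample numbers are all hypothetical and chosen for illustration; the source does not say how PitchingBot defines these variables.

```python
from dataclasses import dataclass


@dataclass
class Pitch:
    """One pitch's stuff measurements (names and units are assumptions)."""
    velocity_mph: float
    horizontal_break_in: float
    vertical_break_in: float


def fastball_differentials(pitch: Pitch, primary_fastball: Pitch) -> dict[str, float]:
    """Return the pitch's velocity and movement relative to the primary fastball."""
    return {
        "velo_diff": pitch.velocity_mph - primary_fastball.velocity_mph,
        "hb_diff": pitch.horizontal_break_in - primary_fastball.horizontal_break_in,
        "vb_diff": pitch.vertical_break_in - primary_fastball.vertical_break_in,
    }


# Made-up numbers for illustration only, not real Statcast measurements.
fastball = Pitch(velocity_mph=98.0, horizontal_break_in=-6.0, vertical_break_in=16.0)
slider = Pitch(velocity_mph=91.0, horizontal_break_in=4.0, vertical_break_in=2.0)
print(fastball_differentials(slider, fastball))
```

Expressing a secondary pitch relative to the fastball lets a model see how much separation a pitcher creates between offerings, not just each pitch's raw shape.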

        The models are built with XGBoost, a gradient-boosting library that trains an ensemble of decision trees. A full account of how it works, and of the specific parameter choices, is beyond the scope of this primer. Overfitting can be a problem with this method, but careful training can avoid it. These models should be robust enough not to produce bizarre outcomes like the probabilities seen on last season’s Apple TV+ broadcasts.

        A Walkthrough of One Pitch

        To understand how these predictions work to produce pitch values, here’s an example of a single pitch. On September 24, 2022, Jacob deGrom struck out Sean Murphy with a well-placed 2-2 slider. The models see the pitch location, speed, spin, movement, release point, etc. along with the context (a right-on-right 2-2 breaking ball), and produce the following predictions:

        Pitch Prediction: deGrom vs Murphy 9/24/2022
        Event                                  Outcome
        xSwing%                                82%
        xWhiff% (assuming a swing)             46%
        xFoul% (assuming contact)              44%
        xCalled Strike% (assuming no swing)    34%

        This gives the following likelihoods for different events:

        Pitch Prediction: deGrom vs Murphy 9/24/2022
        Event               Outcome
        Swinging Strike%    38%
        Called Strike%      6%
        Ball%               12%
        Foul Ball%          19%
        Ball in Play%       25%
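The step from the conditional predictions to these event likelihoods is a chain of multiplications. The sketch below reproduces the table from the four sub-model outputs quoted earlier. The function and key names are hypothetical, and it assumes every pitch that is taken and not called a strike counts as a ball, which matches the five outcomes listed here.

```python
def pitch_outcome_probabilities(x_swing, x_whiff, x_foul, x_called_strike):
    """Chain conditional sub-model predictions into five exclusive outcomes.

    x_whiff is conditional on a swing, x_foul on contact,
    and x_called_strike on a take (no swing).
    """
    take = 1.0 - x_swing
    contact = x_swing * (1.0 - x_whiff)
    return {
        "swinging_strike": x_swing * x_whiff,
        "called_strike": take * x_called_strike,
        "ball": take * (1.0 - x_called_strike),
        "foul": contact * x_foul,
        "ball_in_play": contact * (1.0 - x_foul),
    }


# Sub-model outputs for the deGrom slider, taken from the first table.
probs = pitch_outcome_probabilities(0.82, 0.46, 0.44, 0.34)
for event, p in probs.items():
    print(f"{event}: {p:.0%}")
```

Because each branch splits its parent probability in two, the five outcomes always sum to 100%.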

        For the ball in play outcomes, the model produces the following likelihoods of different types of ball in play:

        Pitch Prediction: deGrom vs Murphy 9/24/2022
        Batted Ball Type   EV < 90 mph   90-95 mph   95-100 mph   100-105 mph   EV >= 105 mph
        Groundball         47%           6%          5%           3%            2%
        Line Drive         14%           3%          3%           2%            1%
        Flyball            12%           2%          1%           <1%           <1%

        If put into play, this ball is unlikely to be hard hit (only about a 17% probability of an exit velocity of 95 mph or higher), and it is most likely to be a groundball.
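The 17% figure can be checked by summing the table cells at 95 mph and above. This is a small sketch under two assumptions: "hard hit" means an exit velocity of at least 95 mph, and the cells shown as "<1%" are entered as 0, so the result is a lower bound. The variable names are hypothetical.

```python
# Ball-in-play probabilities (percent, given the ball is put in play),
# copied from the table above.
# Columns are exit-velocity bins: <90, 90-95, 95-100, 100-105, >=105 mph.
# Entries listed as "<1%" are entered as 0, so totals are lower bounds.
BIP_TABLE = {
    "Groundball": [47, 6, 5, 3, 2],
    "Line Drive": [14, 3, 3, 2, 1],
    "Flyball": [12, 2, 1, 0, 0],
}

HARD_HIT_BINS = slice(2, None)  # the three bins at 95 mph and above

hard_hit = sum(sum(row[HARD_HIT_BINS]) for row in BIP_TABLE.values())
type_totals = {bb_type: sum(row) for bb_type, row in BIP_TABLE.items()}
most_likely = max(type_totals, key=type_totals.get)

print(f"hard-hit probability: at least {hard_hit}%")
print(f"most likely batted-ball type: {most_likely} ({type_totals[most_likely]}%)")
```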

        Now that we know the range of possible outcomes, we can weight each outcome by its run value to get an Expected Run Value (xRV) for the pitch. Strikes are better than balls, and weak groundballs are better than hard-hit fly balls. xRV is normalized so that the average from any count is always zero. The table below shows how the different outcome probabilities contribute to the overall xRV:

        Pitch Prediction: deGrom vs Murphy 9/24/2022
        Event                                       Context-neutral run value of event   Contribution to xRV (run value x probability)
        Swinging Strike                             -0.235                               -0.088
        Called Strike                               -0.235                               -0.015
        Ball                                        0.127                                0.015
        Foul Ball                                   0                                    0
        Ball in Play (sum of all different types)   ~0                                   ~0
        Total xRV                                                                        -0.088

        So this pitch was predicted to save the Mets 0.088 runs relative to the average pitch in that situation. These models are applied to every pitch thrown in major league games.
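The xRV total is a probability-weighted sum of run values. The sketch below redoes that arithmetic from the rounded percentages and run values in the tables above. The names are hypothetical, and the ball-in-play run value, listed only as "~0", is entered as exactly 0.0.

```python
# (probability, context-neutral run value) per outcome, from the tables above.
# The ball-in-play run value is listed as "~0", entered here as 0.0.
OUTCOMES = {
    "swinging_strike": (0.38, -0.235),
    "called_strike": (0.06, -0.235),
    "ball": (0.12, 0.127),
    "foul": (0.19, 0.0),
    "ball_in_play": (0.25, 0.0),
}

# Expected run value: sum of probability x run value over all outcomes.
xrv = sum(prob * run_value for prob, run_value in OUTCOMES.values())
print(f"xRV = {xrv:.3f}")
```

Individual contributions computed this way can differ from the published rows in the last digit, because the inputs here are rounded percentages, but the total agrees at three decimal places.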

        Grading on the 20-80 Scale

        The previous section showed how to get the xRV for a single pitch, but how does that get turned into a statistic to measure pitcher quality? One option is to average all the xRVs and produce a metric on the runs scale. However, this could be tough to interpret, especially since “per pitch” is not a commonly used rate in baseball statistics.

        The 20-80 scouting scale is one that many baseball fans are familiar with. If you aren’t, you can read a primer about it here.

        Pitcher xRV values are converted to grades on the 20-80 scale, and grades for individual pitch types are shown on the same scale.

        The average major league player is a 50 on the scale, with each step of 10 up or down representing one standard deviation in ability. Sixty is above average, 70 is excellent, and 80 represents one of the top players in the majors at that skill. The graph below shows that the distribution of pitcher grades follows a bell curve:

        In addition to the pitcher grades, I have also provided PitchingBot ERA. This puts the expected run value onto the ERA scale, which is more familiar to most fans.
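A scale centered at 50 with 10 points per standard deviation is a rescaled z-score. The following is a minimal sketch of that conversion, not the exact PitchingBot procedure: the source does not specify the reference population, any weighting by pitch count, or whether grades are capped at 20 and 80. The function name and sample values are hypothetical.

```python
from statistics import mean, pstdev


def to_20_80(xrv_by_pitcher):
    """Map per-pitcher average xRV to grades centered at 50 with 10 points per SD.

    Lower xRV is better for a pitcher, so the z-score sign is flipped.
    """
    values = list(xrv_by_pitcher.values())
    mu, sigma = mean(values), pstdev(values)
    return {
        name: 50 + 10 * (mu - xrv) / sigma
        for name, xrv in xrv_by_pitcher.items()
    }


# Made-up per-pitch xRV averages for three hypothetical pitchers.
sample = {"Pitcher A": -0.010, "Pitcher B": 0.000, "Pitcher C": 0.010}
grades = to_20_80(sample)
print({name: round(g) for name, g in grades.items()})
```

The sign flip matters: a pitcher whose pitches carry a lower expected run value than average ends up above 50.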

        Stuff & Location

        The pitcher grades discussed so far use models that include both stuff and location features of pitches. I have also produced grades that only use stuff or location variables independently.

        The stuff models include everything except where the ball actually goes: pitch velocity, spin, movement, release point, etc. For the location-only models, no stuff variables are included, only the generic pitch type (fastball, breaking, offspeed), the context (balls, strikes, handedness), and the location of the pitch.

        For the stuff grades, only models that predict swing events are used (whiffs, foul balls, balls in play). Otherwise, the models would be trying to predict zone rate without any location cues.

        Stuff quality is much more stable than location quality within and between seasons, so stuff can be useful for analyzing small samples when other statistics aren’t appropriate. The graph below shows how these grades stabilize more quickly than other pitching statistics, especially the stuff grade. See this article for more details on how the stability of a statistic can be measured.

        Example of Usage

        To see why these grades may be useful in the analysis of a pitcher, let’s look at Logan Webb. Webb started his major league career by using a four-seam fastball as his primary pitch. In 2021, he switched to using a sinker instead, with much better results. Here are the stuff grades for those two pitches from 2019-22, along with Webb’s PitchingBot ERA.

        Logan Webb Four-Seamer vs. Sinker
                  Four-seamer            Sinker
        Season    Stuff Grade   Usage    Stuff Grade   Usage    PitchingBot ERA   ERA
        2019      41            44%      54            13%      5.28              5.22
        2020      38            34%      65            15%      5.14              5.47
        2021      42            10%      55            38%      3.69              3.03
        2022      34            3%       51            33%      3.28              2.90
        The pitch grading model already knew that Webb’s sinker was much better than his four-seam fastball. When he changed his pitch mix, it produced a marked change in pitch quality and early evidence that he would break out in 2021.
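One rough way to see the effect of the pitch-mix change is to average the two stuff grades weighted by usage. This is an illustrative calculation only, not how PitchingBot combines pitches into an overall grade: it covers just the two fastballs in the table and ignores the rest of Webb's arsenal. The names are hypothetical.

```python
# season: [(stuff grade, usage share)] for four-seamer then sinker, from the table above.
WEBB = {
    2019: [(41, 0.44), (54, 0.13)],
    2020: [(38, 0.34), (65, 0.15)],
    2021: [(42, 0.10), (55, 0.38)],
    2022: [(34, 0.03), (51, 0.33)],
}


def usage_weighted_grade(pitches):
    """Average stuff grade of the listed pitches, weighted by usage share."""
    total_usage = sum(usage for _, usage in pitches)
    return sum(grade * usage for grade, usage in pitches) / total_usage


for season, pitches in WEBB.items():
    print(season, round(usage_weighted_grade(pitches), 1))
```

The combined fastball grade rises once the sinker takes over most of the usage in 2021, even though neither pitch's own grade improved that season.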

        Limitations

        There are some caveats that come with the analysis of these grades because the models cannot account for everything that a pitcher does. Notable blind spots include the effects of command, deception, sequencing, and some arsenal effects.

        Command is unaccounted for because the model doesn’t know where a pitcher was aiming to throw the ball. Pitchers with good command have improved catcher framing numbers and can exploit hitters’ weaknesses more effectively. Command artist Kyle Hendricks was a notable model over-performer from 2015-2020.

        Similarly, some pitchers have deceptive deliveries or effective methods of using different pitches to keep hitters off balance. Ace starters such as Clayton Kershaw, Max Scherzer, and Corbin Burnes have reliably outperformed their expected pitch quality over a large sample.

        All ball-in-play predictions are treated independently of spray angle. If a pitcher such as Framber Valdez is able to pitch to the benefit of the defensive positioning behind him, the models do not account for it.

        And despite these grades being more stable than regular statistics, they can still be vulnerable to small sample size effects. This should be kept in mind when assessing a pitcher after only a few appearances.

        Summary

        Statcast data is ushering in a new era of model-based sabermetrics. FanGraphs will now present pitch quality grades that are a stable and results-independent measure of pitcher ability along with its traditional pitch values.

        FanGraphs will also have access to all of the predictions underlying these grades, allowing writers to explore new analysis opportunities, which I’m excited to read in the future.

        Cameron Grove worked as an independent baseball analyst before joining the Cleveland Guardians front office. His blog can be found here.



