Interpreting Playoff Odds and Projected Standings
As you might have noticed, our playoff odds and projected standings are now up and running for the 2015 season. If you’re a regular FanGraphs reader, or intend to be this year, you’re going to see a decent amount about the various numerical expectations we post on the site. While these odds and standings are a lot of fun and a great tool for taking stock of the league, it’s also pretty easy to misunderstand them or use them improperly.
Before I run through the proper way to read the odds and standings, I want to provide a brief overview of how we arrive at the numbers you see on the site.
Our player projections are based on the FanGraphs Depth Charts, which are generated by giving equal weight to Steamer and ZiPS (two projection systems) and then manually estimating playing time. Then, based on the depth charts, we simulate the season 10,000 times and report the results as playoff odds and projected standings. We also host a Season to Date model and a Coin Flip model, which project the season based on the current year’s stats (instead of projections) or a 50/50 chance at winning each game, respectively.
While the actual calculations and projections that go into the system are sophisticated, there’s nothing that’s too difficult to understand conceptually. We take the projections, we make a guess about playing time, and then we predict a bunch of seasons and show you the results.
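To make the "predict a bunch of seasons" step concrete, here is a minimal sketch of that kind of simulation. It is not the site’s actual code: the four-team division, the win rates in `strengths`, the 54-games-per-pairing schedule, the use of the log5 formula for single-game odds, and the random tiebreak are all assumptions chosen to keep the example small.

```python
import random

def log5(a, b):
    """Chance that a team with true win rate a beats a team with true win rate b."""
    return (a - a * b) / (a + b - 2 * a * b)

def simulate_division(strengths, n_sims=10_000, games_per_pair=54, seed=0):
    """Return the share of simulated seasons in which each team finishes first."""
    rng = random.Random(seed)
    teams = list(strengths)
    titles = {team: 0 for team in teams}
    for _season in range(n_sims):
        wins = {team: 0 for team in teams}
        # Play every pairing games_per_pair times, one random draw per game.
        for i, team_a in enumerate(teams):
            for team_b in teams[i + 1:]:
                p = log5(strengths[team_a], strengths[team_b])
                for _game in range(games_per_pair):
                    if rng.random() < p:
                        wins[team_a] += 1
                    else:
                        wins[team_b] += 1
        # Ties for first are broken at random, a stand-in for a real tiebreaker.
        best = max(wins.values())
        leaders = [team for team in teams if wins[team] == best]
        titles[rng.choice(leaders)] += 1
    return {team: titles[team] / n_sims for team in teams}

# Hypothetical projected win rates, not real projections.
strengths = {"A": 0.56, "B": 0.52, "C": 0.48, "D": 0.44}
odds = simulate_division(strengths, n_sims=2_000)
for team, share in odds.items():
    print(f"Team {team} finished first in {share:.1%} of simulated seasons")
```

Setting every win rate to 0.500 turns this into something like the Coin Flip model: every game is a 50/50 proposition, and the only thing separating the teams is luck.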
However, if you’re not a statistician by trade, there are a few places you might get tripped up along the way. There are a couple of simple considerations right up front. First, the forecast is only as good as the inputs, so anywhere that Steamer and ZiPS are flawed, the model will be as well. Second, our playing time forecasts are based on the best assessment of the person on our staff who handles that particular division. As far as I know, none of them are actual wizards with the ability to see into the future.
It’s easy to say Troy Tulowitzki will miss some time, but it’s much harder to determine which star players will happen to have a serious injury this year. If Mike Trout misses three months of the year, that’s going to affect the Angels’ postseason odds, but right now, we’re expecting him to get most of the reps in center field. If that changes, we’ll have to update the odds.
Finally, the system doesn’t know about roster moves. Even though there’s a good chance Cole Hamels will be traded around the deadline, we’re currently putting all of his innings in the Phillies’ column. At some point, that will change and the odds will change with it.
The odds you’re looking at are based on the rosters and playing time as defined by the current depth charts. If those change, things change in the odds department.
Now that we have an idea about what the system doesn’t do, let’s talk about what it does do. When you look at the playoff odds, say for the Red Sox, you’ll notice they have 28% in the ALDS column, meaning the current system puts them at 28% to win the ALDS and advance to the ALCS.
Specifically, that means that in 28% of the 10,000 simulations, the Red Sox made it to the ALCS or further. The 28% you see on the page is not a “true” measure of how likely it is; it’s an indication of how often it occurred in our simulation. Now, given that we ran 10,000 trials, there’s not going to be a ton of noise, but if you ran another 10,000 and another, you’d get slightly different marks, usually within a percentage point of each other. Maybe 27.5%. Maybe 28.6%.
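For a rough sense of how much a simulated percentage can wobble from run to run, you can treat each simulated season as an independent yes/no trial and use the usual binomial standard error. This is only a sketch of the simulation noise, under that independence assumption; it says nothing about errors in the projections or playing time estimates themselves. The function name is made up for the example.

```python
import math

def simulation_standard_error(p, n_sims):
    """Standard error of a share observed across n_sims independent simulations."""
    return math.sqrt(p * (1 - p) / n_sims)

se = simulation_standard_error(0.28, 10_000)
low, high = 0.28 - 2 * se, 0.28 + 2 * se
print(f"standard error: {se:.4f}")                   # standard error: 0.0045
print(f"rough 95% range: {low:.3f} to {high:.3f}")   # rough 95% range: 0.271 to 0.289
```

In other words, with 10,000 trials, the run-to-run noise on a 28% figure is a bit under half a percentage point.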
It’s also important to note what that 28% really means. To make it a little easier, let’s look at the Dodgers’ World Series odds, which are 18.2%. In other words, in 18.2% of our 10,000 runs, the Dodgers won the title. They won in more of the simulations than any other team, which to most people means they are the favorite this year.
And that is true, but it also doesn’t mean that FanGraphs is saying the Dodgers are going to win the World Series. Our model is saying they are the single most likely team to win the Series, but in 81.8% of the simulations, someone else won.
In fact, let’s pretend that the Dodgers could steal Trout and Tulowitzki and Felix Hernandez. Let’s say that after those additions, they somehow had a 70% chance to win the World Series. That would be an unfathomable number for March, but go with it. That would still mean that in 30% of the simulations, someone else won the World Series. The Dodgers would be the most likely, but not a sure bet.
As my college econometrics professor once said, “30% is not like walking outside and getting hit by an asteroid. 30% happens all the time.” I would add, it happens 30% of the time, but that’s just me being pedantic.
Specifically, what you need to remember is that statistical odds like these are based on what happens in the long run over repeated samples. If we could play this season over and over, we’re pretty confident the Dodgers would win the most World Series titles, but in any one season, the results could be all over the place.
Let’s draw a comparison to a single player. Let’s say your player has a .350 OBP this year, last year, and every year. Everyone believes he is a .350 OBP hitter. Now zoom in on one plate appearance. The odds that he reaches base are 35%, but he’s only going to reach base or not reach base. Even if we expand to four PA, there are only five outcomes: .000, .250, .500, .750, and 1.000. There’s no way he can have a .350 OBP over four PA, but we know he’s a .350 OBP hitter.
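If you want to see how likely each of those four-PA outcomes is, a short binomial calculation does the job. This sketch assumes every plate appearance is an independent 35% shot at reaching base, which is a simplification; the function name is just for illustration.

```python
from math import comb

def obp_distribution(true_obp, plate_appearances):
    """Map each possible observed OBP to its probability, assuming independent PA."""
    n = plate_appearances
    return {
        k / n: comb(n, k) * true_obp**k * (1 - true_obp) ** (n - k)
        for k in range(n + 1)  # k = times on base
    }

dist = obp_distribution(0.350, 4)
for obp, prob in dist.items():
    print(f"{obp:.3f} OBP: {prob:.1%}")
# 0.000 OBP: 17.9%
# 0.250 OBP: 38.4%
# 0.500 OBP: 31.1%
# 0.750 OBP: 11.1%
# 1.000 OBP: 1.5%
```

The most common four-PA line for a true .350 hitter is a .250 OBP, and he goes hitless on base nearly 18% of the time.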
The same logic works for the full season, but instead of needing 600 PA to balance everything out, we need more like a couple thousand games and we only get 162. As a result, weird stuff is going to happen. That’s okay. The odds and the standings are estimates based on the quality of the players, the schedule, and the probability of winning games along the way. Due to the nature of the game, they will be wrong pretty often at the micro level, but they should be mostly right at the macro level. But that can be hard to see when you can only ever view a single season of results.
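To put a number on how much "weird stuff" fits into 162 games, here is a back-of-the-envelope sketch. It assumes a team wins every game independently with the same probability, which real schedules and rosters don’t quite satisfy, so treat it as a rough guide rather than the model’s own estimate of uncertainty. The function name is hypothetical.

```python
import math

def win_total_spread(true_win_rate, games=162):
    """Expected wins and binomial standard deviation over a season."""
    expected = games * true_win_rate
    sd = math.sqrt(games * true_win_rate * (1 - true_win_rate))
    return expected, sd

expected, sd = win_total_spread(0.500)
print(f"expected wins: {expected:.0f}, standard deviation: {sd:.1f}")
# expected wins: 81, standard deviation: 6.4
```

Under those assumptions, a team with true .500 talent lands six or more wins away from 81 fairly often on luck alone, before any projection error enters the picture.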
If you have questions, we’re here to answer them. Take a look over at our Playoff Odds and see if you think they make sense or they’re garbage. And if you think they’re wrong, try to figure out where your expectations differ from the model’s.
Neil Weinberg is the Site Educator at FanGraphs and can be found writing enthusiastically about the Detroit Tigers at New English D. Follow and interact with him on Twitter @NeilWeinberg44.
“Then based on the depth charts, we simulate the season 10,000 times…”
What method do you use to simulate the season?
It basically works like this, but using the average strength of each team rather than the individual lineups: http://www.fangraphs.com/blogs/fangraphs-game-odds/