Bayesian approach to analyzing goalies, Part 4: Regression to the Mean, Luongo vs Schneider, Thomas vs Rask, Hiller vs Fasth

Last time, in Part 3 of this series, we saw how our Bayesian estimate of Luongo's true ESSV% (even strength save percentage) changes over time, i.e. as we get more information and use more data.  Links to all posts in this series are at the end of this post.


In this article, we'll show results for more goalies than just Luongo. I haven’t really heard much analysis of the goaltending situation in Vancouver (joke). I’ve felt burdened by this obvious need that the hockey world has (joke), so I thought I’d start with a little statistical analysis of Roberto Luongo and Cory Schneider (not meant to be a joke, though you may disagree after reading this). Really, I just want to use them as an example to help illustrate this approach to analyzing goalies.


One of the things we noticed in Part 3 is that there is some regression to the mean going on.  Luongo's Bayesian estimate was always between his observed ESSV% and the league average ESSV%.  Let's start with a figure that shows that this "regression to the mean" happens for all goalies. 
 

In this figure, each gray dot corresponds to a goalie.  The horizontal position of the gray dot is based on the goalie's observed ESSV%, and the vertical position of the gray dot corresponds to our estimate of their true ESSV%.  The blue dots correspond to observed ESSV% and the red line is league average.

The key observation is that for each goalie, their gray dot is between their blue dot and the red line. This means for every goalie in the league, our estimate is between their observed ESSV% and the league average.  Above average goalies (right side) are automagically pulled down towards league average, and below average goalies (left side) are automagically pulled up towards league average. 

Notice that some goalies are closer to the league average than others.  We have sized the dots by shots faced.  Notice that the small dots are typically closer to league average.  We saw this before in Part 3.  When the number of shots is small, our estimate tends to be close to league average.  When the number of shots is large, our estimate tends to be closer to observed ESSV%. The extent to which our estimate is pulled towards the mean is automagically determined by the model.
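The pull towards league average can be sketched with a simple Beta-binomial update.  This is only an illustration and not the model used in this series: the prior here is fixed by hand, and PRIOR_MEAN and PRIOR_STRENGTH are hypothetical values, whereas the actual model determines the prior parameters from the data.

```python
# Hypothetical prior, chosen only for illustration (not the fitted values).
PRIOR_MEAN = 0.921       # assumed league-average ESSV%
PRIOR_STRENGTH = 2000.0  # assumed prior weight, in "pseudo-shots"


def shrunk_estimate(observed_sv, shots, prior_mean=PRIOR_MEAN,
                    prior_strength=PRIOR_STRENGTH):
    """Posterior mean of a save rate under a fixed Beta prior."""
    saves = observed_sv * shots
    prior_saves = prior_mean * prior_strength
    return (prior_saves + saves) / (prior_strength + shots)


def data_weight(shots, prior_strength=PRIOR_STRENGTH):
    """Fraction of the estimate that comes from the goalie's own shots."""
    return shots / (shots + prior_strength)


# Same observed ESSV% (.935), increasing numbers of shots faced.
for shots in (100, 500, 2000, 5000):
    est = shrunk_estimate(0.935, shots)
    print(f"{shots:5d} shots: weight on data {data_weight(shots):.2f}, "
          f"estimate {est:.4f}")
```

In this sketch the estimate is a weighted average of the observed ESSV% and the prior mean, with weight shots / (shots + PRIOR_STRENGTH) on the observed value.  That is why small samples stay near league average and large samples move towards the observed ESSV%.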

One thing we didn't previously stress is that this method is pretty useful for comparing goalies who have faced different quantities of shots.  It is most useful when those goalies played on the same team so that we can assume they faced roughly the same quality of shots.  We'll adjust for quality of shots in a future series, which will be better for comparing goalies on different teams.

As an example, let's start with Luongo and Schneider, who both played for VAN during the seasons we used for this series (2008-09 thru 2012-13).  Luongo and Schneider had roughly the same even strength save percentage (.933 and .931) during the seasons that we are using.  Since Luongo has faced about 3,300 more shots than Schneider, we are more sure of Luongo’s ability.  But how sure are we?

Let's look at the same figure as above, with Luongo and Schneider highlighted:
Luongo's observed save percentage is greater than Schneider's, and since Schneider faced fewer shots, his Bayesian estimates get automagically pulled more strongly towards the league average.  The result is that our estimates for Luongo (.930) and Schneider (.926) are farther apart (difference of .004) than their observed ESSV% (.933 and .931, difference of .002).


Let's look at the curves for Luongo and Schneider as well:
The blue is Luongo, the red is Schneider, and the gray lines are the rest of the league's goalies.  Luongo's curve is to the right of Schneider's, indicating our estimate for Luongo is higher.  His curve is more narrow and has a higher peak, indicating that we are more sure of his estimate than we are of Schneider's.  In other words, Luongo's estimate is more precise.  Here's a table summarizing the results:
Goalie          True ESSV%    Err  Observed ESSV%  Shots
Roberto Luongo        .930   .003            .933   5239
Cory Schneider        .926   .004            .931   1932


The Err column indicates that the error bound for Luongo's estimate is smaller than that of Schneider, as we would expect.   
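To see why more shots give a tighter estimate, here is a minimal sketch that computes the mean and standard deviation of a Beta posterior.  It rests on two assumptions that are not from this series: that Err behaves like a posterior standard deviation (we haven't said exactly how it is computed), and that the prior is a fixed Beta with the hypothetical PRIOR_MEAN and PRIOR_STRENGTH below.

```python
import math

# Hypothetical prior, chosen only for illustration (not the fitted values).
PRIOR_MEAN = 0.921       # assumed league-average ESSV%
PRIOR_STRENGTH = 2000.0  # assumed prior weight, in "pseudo-shots"


def beta_posterior(observed_sv, shots):
    """Return (a, b) of the Beta posterior under the fixed prior above."""
    saves = observed_sv * shots
    goals = shots - saves
    a = PRIOR_MEAN * PRIOR_STRENGTH + saves
    b = (1.0 - PRIOR_MEAN) * PRIOR_STRENGTH + goals
    return a, b


def beta_mean_sd(a, b):
    """Mean and standard deviation of a Beta(a, b) distribution."""
    total = a + b
    mean = a / total
    sd = math.sqrt(a * b / (total ** 2 * (total + 1.0)))
    return mean, sd


# Observed ESSV% and shots are taken from the table above.
for name, sv, shots in (("Luongo", 0.933, 5239), ("Schneider", 0.931, 1932)):
    mean, sd = beta_mean_sd(*beta_posterior(sv, shots))
    print(f"{name:10s} estimate {mean:.3f}, sd {sd:.3f}")
```

With these illustrative values the output lands close to the table, but that should not be read as evidence that they match the prior the model actually fitted.  The point is only that the spread shrinks roughly like one over the square root of the shots plus the prior weight, so Luongo's estimate is tighter than Schneider's.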


In this case of Luongo and Schneider, though the gap between the two goalies changes, the order of the two goalies stays the same:  Luongo has both a higher observed ESSV% and a higher Bayesian ESSV%.  In other words, using both methods we would conclude that Luongo is the better goalie.  But as we saw in the OTT and BUF write-ups in the 2013-14 Hockey Prospectus book, this isn't always the case.  Often, the order of your goalies will switch.  Let's give an example here, and highlight Tim Thomas and Tuukka Rask:

In this case, Rask has a higher observed ESSV% (.9356) than Thomas (.9346), but Thomas (.9303) has a higher Bayesian ESSV% than Rask (.9294). In the figure, Rask is further to the right, but Thomas is higher.  Here's a table for Thomas and Rask:


Goalie          True ESSV%    Err  Observed ESSV%  Shots
Tim Thomas           .9303  .0032           .9346   4723
Tuukka Rask          .9294  .0038           .9356   2824

The gap between them isn't huge in either case, but the order did switch.  See the OTT and BUF team pages in the 2013-14 Hockey Prospectus book for more extreme examples.

What about the goalie situation in ANA?  Here are the results for Hiller and Fasth:
  
Goalie          True ESSV%    Err  Observed ESSV%  Shots
Jonas Hiller         .9259  .0030           .9278   5234
Viktor Fasth         .9226  .0053           .9272    508


Hiller and Fasth have almost the same observed ESSV%, with a slight edge to Hiller.  But Fasth has faced far fewer shots, so our estimate of Fasth's true ESSV% is pretty close to the league average.  The gap between Hiller and Fasth according to true ESSV% (.0033) is larger than the gap in observed ESSV% (.0006).  Also, Hiller's estimate is more precise (smaller Err).  This analysis suggests that ANA shouldn't overreact to Fasth's strong performance last season and do something like trade Hiller.  (Well, that's the conclusion if we are focusing only on performance and are ignoring contract status and cap hit.)

These kinds of conclusions are not new to the analytics community.  You can eyeball their ESSV% and notice they are almost the same, and you can notice the huge difference in the number of shots that these two goalies have faced (about 5000 vs 500).  Analysts familiar with the idea of "regression to the mean" would expect Fasth's ESSV% to regress.

But what is new is that we now have a way to quantify these qualitative ideas.  We also have error bounds on our estimates.  We'll continue with examples like this next time.  We'll also use our results to answer the question "What is the probability that Goalie A has a higher true ESSV% than Goalie B?" 
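As a rough preview of that last question, one simple way to approximate such a probability is to draw many samples from each goalie's posterior and count how often Goalie A's draw is higher.  The sketch below is not the method used in this series: it assumes independent Beta posteriors built from a fixed, hypothetical prior (PRIOR_MEAN and PRIOR_STRENGTH are made-up values), and prob_higher is an illustrative helper name.

```python
import random

# Hypothetical prior, chosen only for illustration (not the fitted values).
PRIOR_MEAN = 0.921       # assumed league-average ESSV%
PRIOR_STRENGTH = 2000.0  # assumed prior weight, in "pseudo-shots"


def beta_posterior(observed_sv, shots):
    """Return (a, b) of the Beta posterior under the fixed prior above."""
    saves = observed_sv * shots
    goals = shots - saves
    a = PRIOR_MEAN * PRIOR_STRENGTH + saves
    b = (1.0 - PRIOR_MEAN) * PRIOR_STRENGTH + goals
    return a, b


def prob_higher(post_a, post_b, draws=50_000, seed=1):
    """Monte Carlo estimate of P(true rate of A > true rate of B)."""
    rng = random.Random(seed)
    wins = sum(
        rng.betavariate(*post_a) > rng.betavariate(*post_b)
        for _ in range(draws)
    )
    return wins / draws


# Observed ESSV% and shots are taken from the Luongo/Schneider table.
luongo = beta_posterior(0.933, 5239)
schneider = beta_posterior(0.931, 1932)
p = prob_higher(luongo, schneider)
print(f"P(Luongo > Schneider) is roughly {p:.2f}")
```

The answer depends on the assumed prior, so the number this prints should be treated as an illustration of the idea rather than a result.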


Links to other parts:
Part 1 - Introduction
Part 2 - An example using only 10 shots
Part 3 - Updating estimates with more information
Part 4 - Regression to the mean, Luongo vs Schneider, Thomas vs Rask, Hiller vs Fasth

7 comments:

  1. This is some awesome stuff Brian! Keep it up.

  2. Good stuff. Two questions:
    1. What do you mean by 'true'?

    2. What prior are you using and why did you choose it?

  3. Thanks.

    1. By true SV% I mean our estimate of what a goalie's save percentage will be after a TON of shots. This assumes things like the goalie's ability, health, and defense remain roughly the same.

    2. I chose a Beta prior, but I didn't choose the parameters for the prior. The parameters were determined by the model at the same time as all of the save percentages were determined.

  4. Also, I don't claim it is the best prior. It would likely be better to use more information and give each goalie their own prior.

  5. Hi Brian
    I really enjoyed reading your work and it has got me wondering if this method could translate to other areas and other sports. Could you please recommend a good place to start for someone looking to apply this to their own work? I am most familiar with R so any R packages I should look up? Also any good introductory tutorials on the techniques?
    Much appreciated.

  6. Thanks Oliver! As far as Bayes and other sports, I know Jim Albert has some books on baseball, stats, Bayes, and R. See his website http://bayes.bgsu.edu/. That might be kind of what you are looking for. In one of those resources he does something similar to this for batting average (AB instead of Shots, Hits instead of Goals).

    The only package I used was BRugs, which calls either OpenBUGS or WinBUGS.

    Hope that helps!


