Welcome to the second edition of Advanced Stats 102, where we go beyond CorsiRel and discuss some of the more sophisticated metrics that people are using to analyze players, along with their strengths, weaknesses, and where to find them.
Today, we’ll be covering Goals Above Replacement (GAR). The same idea has been used in baseball in various forms, usually called Wins Above Replacement (WAR), for about a decade. It is just now making its way into the public sphere of hockey analytics, thanks to Dawson Sprigings (@DTMAboutHeart).
In concept, GAR is a one-size-fits-all number that encapsulates how valuable an individual player is in terms of on-ice play, relative to a ‘replacement level’ player. A replacement level player is one of a caliber that is readily available and can be acquired and played at a moment’s notice. Think of the players who shuttle through waivers every year, or your emergency callups from the AHL. An example would be a player like Byron Froese - a good player in the AHL who becomes very limited at the NHL level.
The first and most obvious question is: how is GAR calculated? For the full details, I would refer you to Sprigings’ write-up on this. He covers it well, and in a non-technical fashion, so I highly recommend you read that first. This piece will not include the mathematical details.
The short version is that player value is decomposed into six categories: even strength offense, even strength defense, power play offense, penalty drawing, penalty taking, and faceoffs. Value is calculated within each of these categories, usually by a regression-based technique. The drivers of value change depending on the category. Within even strength offense, the important drivers of value are scoring and shot generation (accounting for shot quality, competition, and teammates). For even strength defense, it’s all about shot suppression, again accounting for shot quality, competition, and teammates. Power play offense is largely about power play production, and the remaining three have obvious drivers.
The six category values are then summed to get a total value for the player.
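To make the "sum of six components" idea concrete, here is a minimal sketch. The class name, field names, and numbers are all made up for illustration; they are not Sprigings’ actual code or any real player’s values, and the real work (estimating each component) is not shown.

```python
from dataclasses import dataclass, fields


@dataclass
class GarComponents:
    """Hypothetical container for the six GAR categories, in goals."""
    ev_offense: float
    ev_defense: float
    pp_offense: float
    penalties_drawn: float
    penalties_taken: float
    faceoffs: float

    def total(self) -> float:
        # Total GAR is simply the sum of the six category values.
        return sum(getattr(self, f.name) for f in fields(self))


# Invented numbers for an imaginary player, not real data.
example = GarComponents(
    ev_offense=6.0,
    ev_defense=1.5,
    pp_offense=2.0,
    penalties_drawn=0.5,
    penalties_taken=-1.0,
    faceoffs=0.25,
)
example_total = example.total()
print(example_total)
```

Keeping the components separate rather than storing only the total is what lets you see where a player’s value comes from.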
What are the strengths of GAR?
For one, it’s the only publicly available stat that attempts to account for basically every facet of play. It also gives a digestible, understandable interpretation of a player’s value, especially for comparing players to one another. It’s an excellent starting point to get an idea of the value of a player, made more useful by the fact that you can look at the individual components to see where they shine and where they falter. It’s also very useful for looking at trends - maybe certain skills take longer to develop than others, and have a different aging curve.
What are the weaknesses of GAR?
While I think the stat is tremendously useful, there are a few things you have to keep in mind when using it. For one, GAR values are estimates - because the model uses regression techniques in some places, there is inherent uncertainty in the values it outputs. Those error bars are hidden from view - we don’t really get to see them, and as a result, you have to be careful not to draw conclusions from GAR values that are relatively close to one another.
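As a rough illustration of why close values shouldn’t be over-read, here is a sketch that compares two estimates given their standard errors. Since the real error bars aren’t published, the standard errors below are pure assumptions, as are the function name and the simplification that the two errors are independent and roughly normal.

```python
import math


def clearly_different(gar_a, se_a, gar_b, se_b, z=1.96):
    """Return True if the gap between two estimates exceeds z standard errors.

    Assumes independent, roughly normal errors; se_a and se_b are
    hypothetical, because GAR's actual uncertainty is not public.
    """
    se_diff = math.sqrt(se_a ** 2 + se_b ** 2)
    return abs(gar_a - gar_b) > z * se_diff


# Invented values: a one-goal gap versus a ten-goal gap, same assumed error.
close_pair = clearly_different(10.0, 2.0, 9.0, 2.0)
wide_pair = clearly_different(15.0, 2.0, 5.0, 2.0)
print(close_pair, wide_pair)
```

Under these assumed error bars, the one-goal gap is indistinguishable from noise while the ten-goal gap is not.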
Along these same lines, sometimes it spits out counterintuitive values, and it’s hard to see exactly why. The complexity of the model means there is no longer an easy mapping from the things we consider ‘inputs’ to player value (points, possession ability, etc.) to the GAR output. They’re obviously correlated, but there are now contextual factors (teammates, competition, score usage) included that make the path from input to output harder to trace. In that way, the model is perhaps more opaque than one would like. However, this is no different from the widely accepted WAR stats used in baseball. You can break them down into their core components, but it takes a fair bit of effort.
When I asked Sprigings what he thought the biggest weakness of the stat is, he mentioned a more conceptual issue, noting that the stat straddles the line between being a measure of ‘true talent’ and a measure of ‘the value a player provided’. Parts of the even strength offense and defense are more a measure of ‘true talent’, but the rest tends to be a measure of what happened. Sprigings brought up an example where assists per 60 minutes are used as one of the inputs to assess even strength offense. That is a measure of what happened. However, Sprigings feels it would make more sense to use something like expected assists, which is a more apt measure of talent.
GAR, in my opinion, also struggles a little bit in divvying up credit between teammates. Sprigings uses robust mathematical techniques to try to separate the effects of teammates, but it is a non-trivial problem, and even the most robust method may struggle if players spend all of their time on ice together and have very little time apart. This can be more pronounced in situations where one of the players doesn’t have a lot of historical data to go off of (for example, rookies).
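To see why inseparable teammates are a problem for any regression-style approach, consider a toy least-squares setup with one on-ice indicator per player. This is a generic illustration, not Sprigings’ method, and the function name and shift data are invented. If two players are always on the ice together, their indicator columns are identical and the system has no unique solution for their individual effects.

```python
def gram_determinant(shifts):
    """Determinant of X^T X for a two-player on-ice indicator matrix.

    shifts is a list of (a_on_ice, b_on_ice) pairs of 0/1 flags.
    A determinant of zero means the two players' individual effects
    cannot be separated by ordinary least squares.
    """
    aa = sum(a * a for a, b in shifts)
    bb = sum(b * b for a, b in shifts)
    ab = sum(a * b for a, b in shifts)
    return aa * bb - ab * ab


# Hypothetical shift logs for two linemates.
always_together = [(1, 1)] * 10
some_time_apart = [(1, 1)] * 8 + [(1, 0), (0, 1)]

print(gram_determinant(always_together))
print(gram_determinant(some_time_apart))
```

Even a couple of shifts apart makes the determinant non-zero, which is why time apart matters so much for splitting credit; with only a little of it, the estimates exist but stay noisy.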
Whoa, that’s a lot of drawbacks. Why should we use it if there’s so much wrong with it?
Well, the weaknesses aren’t things that are wrong with the stat. They’re just things that have to be kept in mind when interpreting it. It doesn’t mean GAR loses all its value as a tool to understand which players are good and bad. When you think about it, every stat has similar caveats. Many relative possession stats don’t even adjust for teammates, and if they do, it’s done in a very rudimentary way. And any WOWY analysis not done by a computer is inherently incomplete - no human can really understand the complex web of player interaction across groups without drastically simplifying it.
Where GAR stands out from the pack is that it makes a real attempt to adjust for the contextual factors impacting a player. How often do you see someone dismiss a player for playing against easy competition, or getting favourable usage? By using GAR, we can adjust for those factors in a quantitative way.
At the very least, GAR is a valuable starting point for getting an idea of a player’s worth, one that can be refined and studied further. At its best, it is far more than that - it’s a concrete expression of how much a player is helping or hurting a team, and it’s certainly more robust than a pithy look at a player’s HERO chart with a quick Twitter quip. It’s not perfect, but it’s a strong improvement over many of the stats we currently have.
Where can I find this stat?
Sean Tierney of The Athletic has made a Tableau dashboard for this, which you can view here. You can sort by basically anything you want there, so it’s arguably the most usable form of this data. There’s also this Google Doc that contains all the information for last year.
Thank you to Dawson Sprigings for his assistance with this article.