Stop Score-Adjusting Tiny Samples

A few years ago the most commonly-used possession metric was Corsi or Fenwick close. Corsi and Fenwick are slightly different methods of measuring shot ratios, while "close" meant that we were only including shot attempts while the score was tied or the teams were only one goal apart (with one exception: in the 3rd period, only score-tied shot attempts were included).

The idea behind these "close" stats was that teams will typically alter their strategies when the score is further apart, so by only including data from when games were at their most competitive we were getting a better idea of the true talent level of teams.

There was a problem with this approach though: by only including score-close situations, we were throwing out an awful lot of data.
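To make the "close" rule concrete, here's a minimal Python sketch of the filter described above. The function names and the tuple layout are my own, chosen for illustration, and the handful of shot attempts is invented rather than taken from a real game. I'm also assuming overtime is treated like the 3rd period, which changes nothing in practice since overtime is always tied.

```python
def is_close(period, score_diff):
    """Return True if a shot attempt counts toward a "close" stat.

    period: 1, 2, 3 (4 or higher is treated like the 3rd period; assumption).
    score_diff: our goals minus the opponent's goals when the attempt was taken.
    """
    if period >= 3:
        # 3rd period exception: only score-tied attempts are kept.
        return score_diff == 0
    # 1st and 2nd periods: tied or within one goal.
    return abs(score_diff) <= 1


def close_corsi_pct(attempts):
    """Share (in percent) of kept shot attempts that were ours.

    attempts: iterable of (period, score_diff, is_ours) tuples.
    Returns None if no attempt survives the filter.
    """
    kept = [a for a in attempts if is_close(a[0], a[1])]
    if not kept:
        return None
    ours = sum(1 for a in kept if a[2])
    return 100.0 * ours / len(kept)


# Invented toy data: (period, score_diff, is_ours)
attempts = [
    (1, 0, True),    # kept: tied
    (1, 1, False),   # kept: one-goal game in the 1st
    (2, 2, True),    # dropped: two-goal game
    (3, 1, False),   # dropped: 3rd period and not tied
    (3, 0, True),    # kept: tied in the 3rd
]

kept_count = sum(1 for a in attempts if is_close(a[0], a[1]))
print(f"kept {kept_count} of {len(attempts)} attempts")
print(f"close Corsi: {close_corsi_pct(attempts):.1f}%")
```

In this toy example only three of the five attempts survive the filter, which is the data-loss problem in miniature.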

Eric Tulsky, currently employed by the Carolina Hurricanes but at the time a writer for fellow SBN blog Broad Street Hockey, came up with the idea of doing a "score adjustment" so that we were working with a larger set of data, thus increasing the quality of our stats. Eric did a good job of explaining what he did, so instead of just repeating what he said I'd urge you to go read the post if you're not familiar with the idea of score-adjustment.

It turns out that score-adjusting possession statistics makes them useful earlier in the season than our old measurements did. Late last year Micah Blake McCurdy did some further exploration of the concept and was able to confirm and expand upon Tulsky's findings.

As a result of Eric and Micah's work, score-adjusted numbers are being used by a growing number of people, and in general I think that's a good thing. But one thing that I'm seeing more and more, and that I find quite annoying, is people using score-adjusted metrics to describe possession in individual games (or even individual periods). I think that's misguided. Why? Well, let's look at score-adjusted numbers in a bit more detail.

WHY DO WE USE SCORE-ADJUSTED NUMBERS?

We use score-adjusted numbers because they are a better predictor of future performance than non-score-adjusted numbers, which means that they're a more accurate indicator of the true talent level of teams. However, it still takes time for them to reach a useful level. Here's one of the graphs that Micah created:

It takes about 15-20 games for score-adjusted Corsi (SAC) to begin to level out. After just a few games, it has a minimal relationship to future SAC.

In short, you can't actually make a very good guess about how a team will perform in the future based on one game worth of SAC. This means that we can't use a single game (or period) worth of SAC as a predictor of future performance. It's too random over such a short span of time.

The other reason we use SAC, as mentioned above, is to improve upon score-close measures by including more data. But score-adjusting a single game doesn't include more data than just using unadjusted Corsi; it uses exactly the same amount.

There's just no reason to use SAC for a single game. Why would you try to predict future performance based on one game?

SO WHY DO PEOPLE USE SCORE-ADJUSTMENTS IN SMALL SAMPLES?

The argument is that score-adjusted Corsi is a more accurate descriptor of what happened during a game than standard Corsi is, since it accounts for the fact that teams play differently depending on the score.

There's just one problem: it doesn't actually do that. Not over just one game.

To explain why that is, we'll need to back up for a second. Score-adjusted Corsi takes into account the average shot ratios that teams have in various game states: what percentage of the shot attempts an average NHL team takes when it's up by 2 goals, and so forth.
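Here's a rough Python sketch of what an adjustment like that can look like. It uses one simple weighting scheme (each shot attempt is scaled by 0.5 divided by the league-average share for that score state), which is not necessarily the exact formula Tulsky or McCurdy used. The league-average shares in `LEAGUE_AVG_SHARE` are placeholder values I made up for illustration, not real NHL numbers, and the function name and input layout are hypothetical too.

```python
# Hypothetical league-average share of shot attempts taken by a team in each
# score state (its goals minus the opponent's). Placeholder values only.
LEAGUE_AVG_SHARE = {-2: 0.56, -1: 0.53, 0: 0.50, 1: 0.47, 2: 0.44}


def clamp_state(score_diff):
    """Lump leads or deficits of 2 or more goals into the +2 / -2 buckets."""
    return max(-2, min(2, score_diff))


def score_adjusted_corsi_pct(counts):
    """Score-adjusted Corsi percentage under the simple weighting above.

    counts maps score_diff -> (attempts_for, attempts_against).
    Returns None if there are no attempts at all.
    """
    weighted_for = 0.0
    weighted_against = 0.0
    for diff, (cf, ca) in counts.items():
        share = LEAGUE_AVG_SHARE[clamp_state(diff)]
        # Attempts that are "easy" to get in this state count for less,
        # attempts that are "hard" to get count for more.
        weighted_for += cf * 0.5 / share
        weighted_against += ca * 0.5 / (1.0 - share)
    total = weighted_for + weighted_against
    if total == 0:
        return None
    return 100.0 * weighted_for / total


# Invented example: a team that matches the placeholder league averages,
# taking 50 of 100 attempts while tied and 44 of 100 while up by 2.
counts = {0: (50, 50), 2: (44, 56)}

raw_for = sum(cf for cf, _ in counts.values())
raw_total = sum(cf + ca for cf, ca in counts.values())
raw_pct = 100.0 * raw_for / raw_total

print(f"raw Corsi: {raw_pct:.1f}%")
print(f"score-adjusted Corsi: {score_adjusted_corsi_pct(counts):.1f}%")
```

With these made-up inputs the raw figure is 47% and the adjusted figure is 50%, because the team behaves exactly the way the placeholder averages say an average team does in each score state.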

Individual teams may play a bit differently in these situations, but we expect that over the course of many games, teams across the league will face a variety of opponents and these effects will average out.

But that is absolutely untrue in a single game. Not all teams play to score effects in the same way, and applying a league average number to a situation in which there is no reason to believe a league average result will take place is misleading.

For example, the difference between score-tied Corsi and up-by-2 Corsi for the LA Kings last year was just 1.4%. But for the Winnipeg Jets, the difference was 9.8%! The two teams respond to score-effects in very different ways. So if I'm trying to describe what "really happened", I can't assume that a game against the LA Kings and a game against the Winnipeg Jets will result in the same degree of score-effects.

We have pretty good evidence that it won't. That's why it's misguided at best and dishonest at worst to use league average numbers to adjust an individual game. In the long run things often even out. But in the short run they don't, which makes it a bad idea to apply long-run thinking to short-run situations.
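A bit of back-of-the-envelope arithmetic shows how far off a league-average correction can be. The 1.4 and 9.8 point gaps below are the Kings and Jets figures quoted above. Everything else is an assumption for the sake of illustration: the 6.0 point league-average drop is a made-up placeholder, the 50% score-tied level is arbitrary, and treating the adjustment as a simple additive correction is a simplification of how score-adjustment actually works.

```python
# Hypothetical league-average drop in Corsi (percentage points) when a team
# goes from tied to up by 2. Placeholder value, not a real NHL figure.
ASSUMED_LEAGUE_DROP = 6.0

# Gaps between score-tied and up-by-2 Corsi quoted in the article.
TEAM_DROP = {"LA Kings": 1.4, "Winnipeg Jets": 9.8}


def adjustment_error(team_drop, league_drop=ASSUMED_LEAGUE_DROP, tied_level=50.0):
    """Error (in points) from correcting an up-by-2 stretch with the league drop.

    tied_level is the team's assumed Corsi when tied (arbitrary baseline).
    """
    observed = tied_level - team_drop   # what the team actually does when up by 2
    adjusted = observed + league_drop   # league-average correction added back
    return adjusted - tied_level        # positive: flattered, negative: shortchanged


for team, drop in TEAM_DROP.items():
    print(f"{team}: adjusted figure is off by {adjustment_error(drop):+.1f} points")
```

Under these assumptions the correction flatters the Kings-like team by about 4.6 points and shortchanges the Jets-like team by about 3.8. Over many games against many opponents, errors like these can wash out; in one game there's nothing for them to wash out against.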

One way to try to account for this problem would be to do a score-adjustment based only on the numbers of the teams involved in an individual game. That approach would be better than the current one, but it's still critically flawed. In an individual game (or period), we can't expect that a team will employ the same tactics that they do on average over a large number of games.

A good coach modifies his approach based on his current lineup and who the opponent is. Players may also respond differently to different opponents; I'd probably be more cautious protecting a lead if I were facing the Penguins, with Crosby and Malkin, than if I were up against the Sabres, with Ennis and Moulson. Again, don't apply long-run thinking to short-run problems.

So please, if you care about presenting an accurate picture, don't use score-adjusted numbers in tiny samples. The results they give are worse, not better. If you want to describe what happened in a single game or period, just talk about what the Corsi ratio was, and keep score-effects in mind.
