“Bryz-warmup” by Arnold C. Licensed under Public Domain via Commons.
This is the fifth part of a five-part series. Check out Part 1, Part 2, Part 3 and Part 4 here. You can view the series both at Hockey-Graphs.com and APHockey.net.
To quickly recap what I’ve covered in the first four parts of this series, I have updated the work that’s been done on Pythagorean Expectations in hockey, and am looking to find out whether teams that have the best lead-protecting players are able to outperform those expectations consistently.
The first step is to figure out how to assess a player’s ability to protect leads. To do this, for every season, I isolated every player’s Corsi Against/60, Scoring Chances Against/60, Expected Goals Against/60 (courtesy of War-On-Ice) and Goals Against/60 when up a goal at even strength. I then found a team’s lead-protecting ability for the year in question by weighting those statistics for each player by the amount of ice time they wound up playing that year. For players who didn’t meet a certain threshold, I gave them what I felt was a decent approximation of replacement-level ability. For example, here was the expected lead-protecting performance of the 2014-2015 Anaheim Ducks in each of those categories.
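The weighting step described above can be sketched roughly as follows. This is only an illustration with made-up numbers, not the Ducks' actual figures and not necessarily the exact method used in this series. The field names, the 100-minute threshold and the replacement-level rate of 60.0 are all assumptions chosen for the example; the post does not state the real values.

```python
from dataclasses import dataclass


@dataclass
class PlayerLine:
    name: str
    sample_minutes: float  # minutes behind the player's rate stat (assumed threshold basis)
    rate_against: float    # any per-60 rate, e.g. Corsi Against/60 when up a goal
    season_minutes: float  # ice time in the season being evaluated (the weight)


def team_rate_against(players, min_sample_minutes=100.0, replacement_rate=60.0):
    """Ice-time-weighted team rate, using a replacement-level rate
    for players whose sample falls below the threshold.

    Both default values are hypothetical placeholders.
    """
    total_minutes = sum(p.season_minutes for p in players)
    if total_minutes <= 0:
        raise ValueError("players must have positive total ice time")

    weighted_sum = 0.0
    for p in players:
        # Fall back to replacement level when the sample is too small to trust.
        if p.sample_minutes >= min_sample_minutes:
            rate = p.rate_against
        else:
            rate = replacement_rate
        weighted_sum += rate * p.season_minutes
    return weighted_sum / total_minutes


# Made-up roster: one established player, one with too small a sample.
roster = [
    PlayerLine("Player A", sample_minutes=200.0, rate_against=50.0, season_minutes=600.0),
    PlayerLine("Player B", sample_minutes=20.0, rate_against=80.0, season_minutes=400.0),
]
print(team_rate_against(roster))
```

In this toy roster, Player B's rate of 80.0 is ignored in favour of the assumed replacement rate, so the team figure comes out to (50 × 600 + 60 × 400) / 1000 = 54.0. The same function can be run once per category (Corsi, scoring chances, expected goals, goals).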
This is the second part of a five-part series. Check out Part 1, Part 3, Part 4 and Part 5 here. You can view the series both at Hockey-Graphs.com and APHockey.net.
In Part 1, I looked at some of the theory behind Pythagorean Expectations and their origin in baseball. You can find the original formula copied below.
WPct = W/(W+L) = Runs^2/(Runs^2 + Runs Against^2)
The idea behind the formula is that it is a skill to be able to score runs and to be able to prevent them. What isn’t a skill, however — according to the theory — is when one scores or allows those runs. Teams over the course of weeks or months may appear to be able to score runs when they’re most necessary, to squeak out one-run wins, but as much as it looks like a pattern, it is most often simple variance. If you don’t fully buy into that idea, or you don’t really understand what I mean by variance, read this and then come back. Everything should be a lot clearer.
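To make the formula concrete, here is a minimal sketch of the baseball version in code. The function name and the sample run totals are hypothetical, and the exponent is left as a parameter only because 2 is the value in the original formula quoted above; nothing here reflects the adjustments made for hockey later in the series.

```python
def pythagorean_wpct(runs_for, runs_against, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed."""
    if runs_for < 0 or runs_against < 0:
        raise ValueError("run totals cannot be negative")
    scored = runs_for ** exponent
    allowed = runs_against ** exponent
    if scored + allowed == 0:
        raise ValueError("at least one run total must be positive")
    return scored / (scored + allowed)


# Made-up season: 800 runs scored, 700 allowed, 162 games.
expected = pythagorean_wpct(800, 700)
print(round(expected, 3))     # expected winning percentage
print(round(expected * 162))  # expected wins over a full season
```

With these made-up totals the expected winning percentage is about .566, or roughly 92 wins over 162 games. A team with that run differential that actually won 98 or 85 games would, under the theory, mostly be showing variance in when its runs were scored.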
When applying Pythagorean Expectations to hockey, there are a couple of factors that complicate the matter. First of all, the goal/run scoring environment is very different. Hockey is a much lower scoring sport. That means a hockey team is more likely to win, say, 10 one-goal games in a row than a baseball team is to win 10 one-run games in a row. The lower the total scoring, the closer the average score and the more variance is involved. Second, not all games are worth the same number of points. In baseball, you either win or lose, so you use run differential to figure out a winning percentage. But winning percentage doesn’t really work as a statistic in hockey, since you can lose in overtime and still get essentially half a win while your opponent gets a full win.
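One common way around the second problem is to measure points percentage instead of winning percentage. The sketch below assumes the standard NHL standings system (two points for any win, one for an overtime or shootout loss, none for a regulation loss); the function names and the sample record are made up, and this is not necessarily the target statistic used in the rest of this series.

```python
def points_pct(wins, regulation_losses, overtime_losses):
    """Share of available standings points a team earned.

    Assumes 2 points per win and 1 per overtime/shootout loss.
    """
    games = wins + regulation_losses + overtime_losses
    if games <= 0:
        raise ValueError("team must have played at least one game")
    points = 2 * wins + overtime_losses
    return points / (2 * games)


def win_pct(wins, regulation_losses, overtime_losses):
    """Plain winning percentage, treating every loss the same."""
    games = wins + regulation_losses + overtime_losses
    if games <= 0:
        raise ValueError("team must have played at least one game")
    return wins / games


# Made-up record: 50 wins, 25 regulation losses, 7 overtime losses.
print(round(win_pct(50, 25, 7), 3))
print(round(points_pct(50, 25, 7), 3))
```

The gap between the two numbers is the credit the team gets for its overtime losses, which a baseball-style winning percentage has no way to express.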
This is the first part of a five-part series. Check out Part 2, Part 3, Part 4 and Part 5 here. You can view the series both at Hockey-Graphs.com and APHockey.net.
The 2015-2016 NHL season is almost here, and our sport has come upon a new phase, arguably the third, in its analytics progression. The first stage was about broad ideas and testing; I’ll call it the Discovery Phase. It involved public minds brainstorming large-scale ideas about the conventional truisms of the game, looking to prove and disprove that which many had taken for granted. It gave us ideas like the undervaluing of small players and terms like Corsi and PDO. It was revolutionary but not yet a revolution. The second phase was the Recognition Phase, which was kicked off by the Summer of Analytics. Teams began to buy into public work as worthy of investment and began to question their own practices. Now, as we saw in baseball, a third phase is emerging: one in which much of the public is willing to accept the initially controversial public ideas, but in which analysts are pushing back on generalities in situations that are often team- and player-dependent. We are now in a phase where analysts take a magnifying glass to every claim being made. For example, there is no more argument about whether or not Corsi is relevant or important, at least not among those in positions of influence. The question is in what cases it works best and, maybe more importantly, where and why it fails. Because it does, after all. There are players whose finishing abilities, defensive prowess, special teams impact and leadership mean that the value Corsi presents is significantly off base. And it’s important in a billion-dollar industry to figure out how to account for that. The same can be said for any of the metrics that came out of the Discovery Phase or that continue to be developed today.
The point of all this is that we’ve reached a stage where you no longer dismiss the exceptions; you dig into them. There is a lot in the world that can be explained by simple variance, but the game of hockey is far too complicated to write off everything that doesn’t fit a successful model as variance.