Since the last post was getting a little long, I decided to hold off on releasing the full Pythagorean results. Linked you will find a table of every team since the lost season, sorted by the difference between its adjusted point total and its Pythagorean expectation. Essentially, the teams that have the highest numbers in the right-most column are likely to have been the most fortunate, and those at the bottom were possibly unlucky. If you look at the 2014-2015 results below, you will see which teams should be a little bit worried about their chances, and those which may be ready for a rebound. Tomorrow, I will address what the point of this whole study was, and we’ll look at some more data.
Joe Haggerty posted a revealing piece last week about what went wrong with the Boston Bruins last season. According to Brad Marchand on the record, as well as a variety of sources off it, the B’s were a divided team, with only part of the group truly on board with the team’s push toward the playoffs. Others, he claims, without naming names, didn’t seem particularly bothered to miss out.
It’s a very interesting case, and unfortunately it’s impossible to treat it as an actual case study, since there’s nothing scientific about the way in which players may discuss their perception of the causality involved in failure, and certainly everything a front office says about the moves it makes has to be taken with a grain of salt. That said, the eye-rolling that some in the analytics community may engage in with regard to this story is a mistake. Dressing room chemistry is important. Leadership is important. And this story does contain some important lessons. Let’s take a closer look.
“In the past years, we were family, but for some reason this past year we were definitely a little bit divided, and had different cliques. It could’ve been because we had a lot of guys coming up in different times from Providence; they felt a lot more together, and it seemed like the older guys didn’t do a good job at integrating other guys.”
There’s no reason to distrust Marchand on this point. I completely believe that the team was divided, and it’s quite possible that guys coming and going from the AHL played a part. Every team deals with those comings and goings to different degrees, but a lack of leadership and communication could play a part in those guys not being well integrated. Just from personal experience, I can tell you that playing on a team where you feel well-liked, or a part of the group, is a lot easier – and often leads to a better performance – than when you feel ostracized or the unity just isn’t there.
In Part 1, I looked at some of the theory behind Pythagorean Expectations and their origin in baseball. You can find the original formula copied below.
WPct = W/(W+L) = Runs Scored^2/(Runs Scored^2 + Runs Against^2)
The idea behind the formula is that it is a skill to be able to score runs and to be able to prevent them. What isn’t a skill, however — according to the theory — is when one scores or allows those runs. Teams over the course of weeks or months may appear to be able to score runs when they’re most necessary, to squeak out one-run wins, but as much as it looks like a pattern, it is most often simple variance. If you don’t fully buy into that idea, or you don’t really understand what I mean by variance, read this and then come back. Everything should be a lot clearer.
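The formula above is easy to sketch in a few lines of Python. This is purely an illustration; the function name and the sample run totals are my own, not from any real team:

```python
def pythagorean_pct(runs_for: float, runs_against: float, exponent: float = 2.0) -> float:
    """Expected winning percentage from runs scored and allowed.

    exponent=2 is the classic form; fitted exponents vary by sport and era.
    """
    rf, ra = runs_for ** exponent, runs_against ** exponent
    return rf / (rf + ra)

# Hypothetical team: 750 runs scored, 650 allowed
print(round(pythagorean_pct(750, 650), 3))  # → 0.571
```

A team that outscores its opponents 750 to 650 "should" win about 57% of its games; a record much better or worse than that is, by this theory, mostly timing luck.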
When applying Pythagorean Expectations to hockey, there are a couple of factors that complicate the matter. First of all, the goal/run scoring environment is very different. Hockey is a much lower scoring sport. That means that a team is more likely to win, say, 10 one-goal games in a row than in baseball. The lower the total goals, the closer the average scores, the more variance involved. Second, not all games are worth the same number of points. In baseball, you either win or lose, so you use run differential to figure out a winning percentage. But winning percentage doesn’t really work as a statistic in hockey since you can lose in overtime and get essentially half a win, while your opponent gets a full win.
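Because an overtime loss still earns a point, hockey records are usually summarized as points percentage rather than winning percentage. A minimal sketch of that conversion (the team totals below are invented for illustration):

```python
def points_pct(wins: int, otl: int, games: int) -> float:
    """NHL points percentage: 2 points per win, 1 per OT/shootout loss,
    divided by the 2 points available per game."""
    return (2 * wins + otl) / (2 * games)

# Hypothetical 82-game season: 45 wins, 10 overtime losses, 27 regulation losses
print(round(points_pct(45, 10, 82), 4))  # → 0.6098
```

This is the quantity a hockey Pythagorean expectation has to target, rather than the simple W/(W+L) that works in baseball.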
The 2015-2016 NHL season is almost here, and our sport has come upon a new phase — arguably the third — in its analytics progression. The first stage was about broad ideas and testing; I’ll call it the Discovery Phase. It involved public minds brainstorming large-scale ideas about the conventional truisms of the game, looking to prove and disprove that which many had taken for granted. It lent us ideas like the undervaluing of small players and terms like Corsi and PDO. It was revolutionary but not yet a revolution. The second phase was the Recognition Phase, kicked off by the Summer of Analytics, when teams began to buy into public work as worthy of investment and began to question their own practices.

Now, as we saw in baseball, a third phase is emerging: one in which much of the public is willing to accept the initially controversial public ideas, but in which analysts are pushing back on generalities in situations that are often team- and player-dependent. We are now in a phase where analysts take a magnifying glass to every claim being made. For example, there is no more argument about whether Corsi is relevant or important — at least not among those in positions of influence. The question is in what cases it works best, and maybe more importantly, where and why it fails. Because it does, after all. There are players whose finishing ability, defensive prowess, special teams impact and leadership mean that the value Corsi suggests is significantly off base. And in a billion-dollar industry, it’s important to figure out how to account for that. The same can be said for any of the metrics that came out of the Discovery Phase or that continue to be developed today.
The point of all this is that we’re at a stage where you no longer dismiss the exceptions; you dig into them. There is a lot in the world that can be explained by simple variance, but the game of hockey is far too complicated to write off everything that doesn’t fit a successful model as mere noise.
JP of Japers Rink had an interesting piece a while back about the idea of increasing pace of play. He explored the topic of whether a team should ever attempt to push the play or slow it down in order to give it the best chance of winning against a particular opponent.
Event rates are important because a 55% Corsi For Percentage is very different for a team that averages 110 Corsi events per game (for and against) compared to one that averages 90. The 2005-2006 Detroit Red Wings are an example of the former, the 2013-2014 New Jersey Devils of the latter. A team with a higher event rate with a positive shot attempt differential will end up on average with a better goal differential and likely a better record than one with a lower rate but the same differential.
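The arithmetic behind that comparison is simple to sketch (the helper below is my own; converting an attempt differential into goals would require an assumed conversion rate, which I leave out):

```python
def attempt_diff_per_game(cf_pct: float, events_per_game: float) -> float:
    """Shot-attempt differential per game implied by a Corsi For %
    and a total (for + against) event rate."""
    return events_per_game * (2 * cf_pct - 1)

# Same 55% share of attempts, two very different paces
print(round(attempt_diff_per_game(0.55, 110), 2))  # high-event team → 11.0
print(round(attempt_diff_per_game(0.55, 90), 2))   # low-event team  → 9.0
```

At identical possession shares, the high-event team comes out two attempts per game ahead of the low-event one, which compounds over an 82-game season.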
The big question the piece raised for me, however, was whether pace of play can have an effect on shooting percentage. After all, we know that the score can affect shooting percentage based on the change in a team’s tactics and mindset. Is there a shooting-related reason why high event hockey might not be preferable?
Jack Han wrote a cool piece the other day about the shootout and game theory. He had a number of different ideas, but I want to address one in particular.
I believe his point was as follows.
“As a shooter in the shootout, if you are unpredictable, the goalie won’t know what is coming and will play you straight up. If, however, you have one prominent move and a lesser-used secondary option, the goalie is likely to know that and cheat, allowing you to score more often on your secondary option, which overall will increase your effectiveness.”
I want to look at this point within the unrealistic context of an NHL goalie having complete information on the shooter’s true shootout talent (i.e., his base rate) and the percentage of the time he uses his primary move relative to his secondary one.
So let’s say you’re a league average shootout performer with two moves (let’s say a backhand deke and a backhand-forehand deke). When the goalie plays reactionary, you score on 33% of your shots. You can, however, decide to adjust this rate by leading the goalie into guessing by using your primary move significantly more than your secondary move. The goalie, as I mentioned above, knows how much you use each move, just not in which cases you will use which.
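One way to formalize that trade-off under the complete-information assumption above. Every probability in this sketch is invented purely for illustration, not a measured NHL rate, and the function name is mine:

```python
def shootout_rate(f_primary: float, cheat: bool,
                  p_primary=(0.35, 0.28),    # success vs (straight-up, cheating) goalie
                  p_secondary=(0.30, 0.55)) -> float:
    """Expected scoring rate when the shooter uses his primary move
    with frequency f_primary. A cheating goalie suppresses the primary
    move but is beaten far more often by the secondary one."""
    i = 1 if cheat else 0
    return f_primary * p_primary[i] + (1 - f_primary) * p_secondary[i]

# Unpredictable shooter (50/50 mix) faces a goalie playing straight up
print(round(shootout_rate(0.5, cheat=False), 3))  # → 0.325
# Heavy primary use baits the goalie into cheating
print(round(shootout_rate(0.8, cheat=True), 3))   # → 0.334
```

With these made-up numbers, leaning on the primary move and letting the goalie cheat nudges the overall rate above the unpredictable baseline, which is exactly the effect the argument describes; with different assumed probabilities the comparison could flip.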
I don’t have a ton of time to blog at the moment with finals coming to an end, but I just wanted to throw this up quickly with Ray Shero becoming the New Jersey Devils’ new General Manager and the questions about his seemingly poor draft record. Corey Pronman wrote a nice piece a while back about why Shero’s record in particular is underrated, but I wanted to briefly examine a few more general reasons why I would be wary of relying too heavily on such a history, or lack of history, of success.
1. Small Sample Size.
One of the central themes with regards to analytics in hockey is that we’re trying to maximize sample size in order to get the most accurate possible view of a player or team’s talent. This is no different with regards to drafting. The fact is, a GM can only draft on average seven players per season, meaning that over the course of, say, a five-year tenure, that’s only 35 picks. Some may get hurt, some might lose their love for the game, some might develop better than others simply as a result of random variation. It’s very difficult to isolate real success based on 35 or so picks – which is one of the big reasons why drafting appears so random in studies of just about every sport.
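To put a rough number on how noisy 35 picks are, here is a normal-approximation sketch. The 15% "hit rate" is an assumption chosen purely for illustration, not a measured league-wide figure:

```python
import math

def hit_rate_interval(p: float = 0.15, n: int = 35, z: float = 1.96):
    """Approximate 95% interval for the observed hit rate of a GM
    whose true per-pick hit rate is p, over n picks."""
    se = math.sqrt(p * (1 - p) / n)
    return (max(0.0, p - z * se), min(1.0, p + z * se))

lo, hi = hit_rate_interval()
print(round(lo, 3), round(hi, 3))  # → 0.032 0.268
```

In other words, two GMs with an identical true 15% hit rate could plausibly finish a five-year tenure looking like a 3% drafter and a 27% drafter through variance alone.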
Kyle Dubas had the following quote in Elliotte Friedman’s great 30 thoughts columns this week:
“Here’s the way I look at it,” he said. “Right now, we aren’t good enough to be picky about smaller players. We need as many elite players as we can. If we get into playoffs and are too small, or overwhelmed, it’s easier to trade small for size than draft for size and trade for skill.” (bolding my own)
The quote struck me as interesting because it takes a fundamentally different angle on the size debate than the one I personally ascribe to, and I wonder whether it is simply a matter of semantics, or whether there is actually more to this.
My sense was always that size is not easier to trade for than skill – assuming we mean top-six size and not grinder size – but that the reason you want to draft for skill is simply that skilled players have a higher success rate than big players who don’t score as much. You prefer guys who can score over guys with size because once you accumulate enough of them, you can overpay for the big players who have succeeded, and not bear the risk that they may be busts.
Thank you to all who attended the DC Hockey Analytics Conference (#DCHAC) and watched via livestream. While there were some technical difficulties that prevented us from recording all of the presentations, we did manage to salvage a good portion of them. Here they are, as well as all of the slides.
Arik Parnass – Opening Comments & Introduction to Analytics
Slides: Intro to Analytics
When Peter Thiel, co-founder of PayPal and Palantir and the first outside investor in Facebook, conducts interviews, he always asks one very difficult question.
“What important truth do very few people agree with you on?”
I’ll wait while you struggle to find an answer that suits you individually… no, go ahead… okay, maybe table that for later. While straightforward, it’s an incredibly difficult question, both because most of the knowledge we accumulate – particularly when it comes to conventional education – is widely agreed upon, and because in an interview setting, answering it inherently involves voicing an opinion that the interviewer doesn’t share. It takes courage, and courage is something that Thiel feels is lacking.