Predictive vs Descriptive Stats
The latest sports data/analytics piece for KCSN from contributor Joseph Hefner
There’s been a discussion this week on Twitter about the Vikings’ 10-2 record, and how 9 of those wins have been one-score wins. This week the Vikings face the 5-7 Lions, whom they’ve already beaten, and the Lions are 2.5-point favorites in that game. Why would the Vikings be underdogs against a team with half their wins? A team that they’ve already beaten?
This week’s article is going to be about descriptive vs predictive stats. Analytics show that one-score wins are inherently unstable, and usually depend on late-game luck. Descriptively, the Vikings have won 10 games, and no one should take anything away from them for how they’ve done it. Ugly wins count just as much as pretty wins do.
Predictively, though, one-score games are much less meaningful, and that’s reflected in the Vegas odds for this game. Detroit gets around 1.5 points for playing at home, but that still leaves them as 1-point favorites on a neutral field. Vegas clearly isn’t viewing the Vikings as a typical 10-win team facing a typical 5-win team.
Descriptively, the Vikings have won 10 games, and that’s going to matter going forward. They are real wins that will give them a playoff berth, and probably a high seed, which will make it more likely that they move on, as they’ll face lower seeds in the playoffs.
But when we’re looking at how we think the Vikings will fare in games moving forward, if we’re trying to predict the future, those one-score games should each be counted as around half a win. Nine one-score wins at half a win apiece is 4.5, plus the one win that wasn’t a one-score game, which puts them at 5 to 6 wins instead of 10. If we do that, those Vegas odds make perfect sense, since it’s essentially two 5-7 teams facing each other, and the home team is favored.
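For anyone who wants to see that arithmetic spelled out, here is a minimal sketch in Python. The function name and the 0.5 weight are just my shorthand for the rule of thumb above, not an official metric, and it only discounts wins, the same way the paragraph does.

```python
def adjusted_wins(total_wins, one_score_wins, one_score_weight=0.5):
    """Discount one-score wins to a fraction of a win (rule of thumb, not a real model)."""
    if one_score_wins > total_wins:
        raise ValueError("one_score_wins cannot exceed total_wins")
    # Wins by more than one score keep their full value.
    decisive_wins = total_wins - one_score_wins
    return decisive_wins + one_score_weight * one_score_wins


# The Vikings at 10-2 with 9 one-score wins.
print(adjusted_wins(total_wins=10, one_score_wins=9))  # 5.5
```

Changing `one_score_weight` is an easy way to see how sensitive the conclusion is to the "half a win" assumption.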
Not all one-score games are actually coin flips, of course. We have better metrics than this. Check out this 538 article if you want to read more. But simply treating one-score games as being worth half a win gets us 80% of the way there with 20% of the effort.
In addition, simple stats with obvious flaws are actually very useful, because they’re easy to reproduce, and you know exactly what the flaw is. If you know that there were a couple of games that weren’t actually close, you can mentally modify the model, and add those games as true wins (or losses). More complicated models might account for more, but their flaws are harder to spot and adjust for.
I’m not really here to talk about the Vikings, though. I’m here to talk about stable and unstable stats. Stable stats are stats that are more predictive, or sticky. Unstable stats are more descriptive. One-score wins are just one example of an unstable stat.
In today’s NFL, the rules heavily favor the offenses over the defenses. An effect of that is that offenses often produce at close to the same rate no matter what defense they’re facing, while defensive efficiency fluctuates heavily based on which offenses they’re facing. That means offensive performance is more predictive than defensive performance.
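Here is a rough sketch of what that kind of manual adjustment could look like in code. Everything in it is an assumption for illustration: the game labels and margins are made up (they are not real results), the 8-point cutoff is one common way to define a one-score game, and `overrides` is a hypothetical way to hand-correct games you watched and know weren’t really close.

```python
def predictive_wins(games, one_score_margin=8, overrides=None):
    """Sum win credit over (label, margin) pairs; margin = points for minus points against.

    One-score games count as 0.5 either way, unless overridden by hand.
    """
    overrides = overrides or {}
    total = 0.0
    for label, margin in games:
        if label in overrides:
            total += overrides[label]      # manual correction for this game
        elif abs(margin) <= one_score_margin:
            total += 0.5                   # treat close games as coin flips
        elif margin > 0:
            total += 1.0                   # comfortable win
        # comfortable losses add nothing
    return total


# Made-up margins, purely illustrative.
games = [("Week 1", 16), ("Week 2", -17), ("Week 3", 4), ("Week 4", 3), ("Week 5", 7)]

print(predictive_wins(games))                              # 2.5
print(predictive_wins(games, overrides={"Week 5": 1.0}))   # 3.0
```

The point of keeping it this simple is that the flaw stays visible: every adjustment is one line you chose on purpose.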
Here’s a graphic showing the EPA/play for every offense from 2012 to 2021 in the first 8 games of the season vs the last 8 games (not including the 17th game in 2021). Notice how the dots follow an upward line from bottom left to top right? That shows a level of consistency of the offenses. Good offenses stayed good. Bad offenses stayed bad.
Note the label in the bottom right corner with the “R2 = 0.27”. That’s the r-squared (R2) value of early season vs late season offensive EPA. I don’t want to get into the weeds here, but the R2 is used here to essentially show how predictive early season efficiency is of late season efficiency. An R2 of 0.27 is very high for something as volatile as football.
Here’s the same graphic for defensive EPA/play. Note how the dots are much more evenly spread across this whole graphic. They don’t follow a nice upward slope. Early season defensive performance is MUCH less predictive than early season offense. The defensive R2 is 0.1 vs the offensive R2 of 0.27. That’s a really big difference.
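If you want to run this kind of check yourself, here is a minimal sketch of an R2 calculation. I’m assuming the R2 on these graphics is the squared correlation from a simple straight-line fit, which is the usual meaning for a scatter plot like this, but the graphics themselves don’t say how they were computed. The numbers in the example are made up just to show the call; they are not real EPA/play values.

```python
def r_squared(xs, ys):
    """Squared Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two lists of the same length, with at least 2 values")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # How the two series move together, and how much each varies on its own.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("R2 is undefined when one series never changes")
    return (cov * cov) / (var_x * var_y)


# Hypothetical first-half vs second-half EPA/play for five teams (made-up values).
first_half = [0.10, -0.05, 0.02, 0.15, -0.10]
second_half = [0.06, -0.02, -0.04, 0.09, -0.01]
print(r_squared(first_half, second_half))
```

An R2 close to 1 means the first half of the season tells you almost everything about the second half, and an R2 close to 0 means it tells you almost nothing.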
When analysts on Twitter say “defense doesn’t matter”, what they’re really trying to say is “Past defensive performance isn’t predictive of future defensive performance, and in fact is largely driven by the offense the defense faces each week”, but that doesn’t fit well in a tweet. Also, it’s a very forgettable line, even if it’s more nuanced and accurate. No one forgets #DefenseDoesn’tMatter.
San Francisco’s vaunted 2019 defense wasn’t enough to stop Patrick Mahomes and the Chiefs from rallying late from a 10-point deficit to win Super Bowl LIV. The rules just favor offenses too much. The effect of that favor is that offense is much more predictable than defense. Trusting a defense in today’s NFL is just fool’s gold.
It’s hard to have a nuanced discussion on Twitter. It lends itself to brevity and pithiness. “Defense doesn’t matter” is a nice, easy, memorable slogan that fits better with limited characters. Just remember that analysts are typically looking at the predictiveness of a stat when they say things like this.
The same goes for “RBs don’t matter” (when the top RB goes down, the replacement is usually about as efficient) and “Sacks are a QB stat” (Seattle had the worst offensive line basically every year Wilson was there, but now that it’s Geno, it’s great? With two rookies at tackle?).
Slogans like these are just shorthand for saying where the signal is going forward. There’s an argument behind the slogans, but you have to read through whole Twitter threads and articles to see it, because tweets are just too short, and we’re a people who like pithy slogans.
Want to read more? Here’s a whole “Nerd to human translator” with explanations for these slogans, and links to articles that show the data behind these ideas and statistics.