Showing posts with label coaching. Show all posts

Simulating the Saints-Falcons Endgame

I was asked yesterday about the end of regulation of the Saints-Falcons game. With about a minute and a half remaining, NO was down by 4 but had a 1st & goal at the 1. With 2 timeouts left, should ATL have allowed the touchdown intentionally?

I previously examined intentional touchdown scenarios, but only considered situations when the offense was within 3 points. In this case NO needed a TD, which--needless to say--makes a big difference. Yet, because NO was on the 1, perhaps the go-ahead score was so likely that ATL would be better off down 3 with the ball than up 4 backed-up against their goal line.

This is a really, really hard analysis. There's a lot of what-ifs: What if NO scores on 1st down anyway? What if they don't score on 1st but on 2nd down? On 3rd down? On 4th down? Or what if they throw the ball? What if they stop the clock somehow, or commit a penalty? How likely is a turnover on each successive down? You can see that the situation quickly becomes an almost intractable problem without excessive assumptions.

That's where the WOPR comes in. The WOPR is the new game simulation model created this past off-season, designed and calibrated specifically for in-game analytics. It simulates a game from any starting point, play by play, yard by yard, and second by second. Play outcomes are randomly drawn from empirical distributions of actual plays that occurred in similar circumstances.
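To make the idea concrete, here is a minimal sketch of drawing play outcomes at random from a pool of observed results and repeating the series many times. It is not the WOPR itself: the outcome pool, the function names, and the four-down goal-line framing are all assumptions made up for illustration.

```python
import random

# Hypothetical pool of goal-line play results (placeholder, NOT real data).
# A real model would draw from actual plays in similar situations.
GOAL_LINE_OUTCOMES = ["td"] * 5 + ["no_gain"] * 4 + ["turnover"]


def simulate_series(outcomes, rng, downs=4):
    """Play up to `downs` snaps, drawing each result from the pool."""
    for _ in range(downs):
        result = rng.choice(outcomes)
        if result in ("td", "turnover"):
            return result
    return "stop"  # defense held on all downs


def estimate_td_rate(outcomes, trials=5000, seed=0):
    """Share of simulated series that end in a touchdown."""
    rng = random.Random(seed)
    tds = sum(simulate_series(outcomes, rng) == "td" for _ in range(trials))
    return tds / trials


print(estimate_td_rate(GOAL_LINE_OUTCOMES))
```

The point of the approach is that every what-if (score on 2nd down, turnover on 3rd, and so on) is covered by repetition rather than by hand-built assumptions.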

If you're not familiar with how simulation models work, you're probably wondering, "So what? Dude, I can put my Madden on auto-play and do the same thing. Who cares who wins a dumb make-believe game?"

Analyzing Replay Challenges

The new WP model allows some nifty new applications. One of the more notable improvements is the consideration of timeouts. That, together with enhanced accuracy and precision, allows us to analyze replay challenge decisions. Here at AFA, we've tinkered with replay analysis before, and we've estimated the implicit value of a timeout based on how and when coaches challenge plays. But without a way to directly measure the value of a timeout, the analysis was only an exercise.

Most challenges are now replay assistant challenges--the automatic reviews for all scores and turnovers, plus particular plays inside two minutes of each half. Still, there are plenty of opportunities for coaches to challenge a call each week.

The cost of a challenge is two-fold. First, the coach (probably) loses one of his two challenges for the game. (He can recover one if he wins both challenges in a game.) Second, an unsuccessful challenge results in a charged timeout. The value of the first cost would be very hard to estimate, but thankfully the event that a coach runs out of challenges AND needs to use a third is exceptionally rare. I can't find even a single example since the automatic replay rules went into effect.

So I'm going to set that consideration aside for now. In the future, I may try to put a value on it, particularly if a coach had already used one challenge. But even then it would be very small and would diminish to zero as the game progresses toward its final 2 minutes. In any case, all the coaches' challenges from this week were first challenges, and none represented the final team timeout, so we're in safe waters for now.

Every replay situation is unique. We can't quantify the probability that a particular play will be overturned statistically, but we can determine the breakeven probability of success for a challenge to be worthwhile for any situation. If a coach believes the chance of overturning the call is above the breakeven level, he should challenge. Below the breakeven level, he should hold onto his red flag.
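A rough sketch of the breakeven calculation, assuming the only cost of a failed challenge is the lost timeout (the lost-challenge cost is set aside, as above). The function name and the three WP inputs in the example are hypothetical placeholders, not model output.

```python
def breakeven_challenge_probability(wp_no_challenge, wp_if_overturned, wp_if_upheld):
    """Smallest chance of an overturn that makes a challenge worthwhile.

    wp_no_challenge  -- WP if the call stands and the timeout is kept
    wp_if_overturned -- WP if the challenge succeeds
    wp_if_upheld     -- WP if the challenge fails (call stands, timeout lost)

    Solves p * wp_if_overturned + (1 - p) * wp_if_upheld = wp_no_challenge.
    """
    gain = wp_if_overturned - wp_if_upheld
    if gain <= 0:
        raise ValueError("an overturn must improve WP for a challenge to make sense")
    return (wp_no_challenge - wp_if_upheld) / gain


# Hypothetical numbers: the call stands at 0.50 WP, losing the timeout
# drops that to 0.47, and an overturn would lift WP to 0.60.
p = breakeven_challenge_probability(0.50, 0.60, 0.47)
print(round(p, 3))
```

With these placeholder inputs the coach would need to believe the call gets overturned a bit less than one time in four.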

When Coaches Use Timeouts

As I continue to work on the next generation WP model, I'm looking hard at how timeouts are used. Here are 2 charts that capture about as much information as can be squeezed into a graphic.

The charts need some explanation. They plot how many timeouts a team has left during the second half based on time and score. Each facet represents a score difference. For example the top left plot is for when the team with the ball is down by 21 points. Each facet's horizontal axis represents game minutes remaining, from 30 to 0. The vertical axis is the average number of timeouts left. So as the half expires, teams obviously have fewer timeouts remaining.

The first chart shows the defense's number of timeouts left throughout the second half based on the offense's current lead. I realize that's a little confusing, but I always think of game state from the perspective of the offense. For example, the green facet titled "-7" is for a defense that's leading by 7. You can notice that defenses ahead naturally use fewer timeouts than those that trail, as indicated by comparison to the "7" facet in blue.

The Value of a Timeout - Part 2

In the first part of this article, I made a rough first approximation of the value of a timeout. Using a selected subsample of 2nd half situations, it appeared that a timeout's value was on the order of magnitude of .05 Win Probability (WP). In other words, if a team with 3 timeouts had a .70 WP, another identical team in the same situation but with only 2 timeouts would have about a .65 WP.

In this part, I'll apply a more rigorous analysis and get a better approximation. We'll also be able to repeat the methodology and build a generalized model of timeout values for any combination of score, time, and field position.

Methodology

For my purposes here, I used a logit regression. (Do not try to build a general WP model using logit regression. It won't work. The sport is too complex to capture the interactions properly.) Logit regression is suitable in this exercise because we're only going to look at regions of the game with fairly linear WP curves. I'm also only interested in the coefficient of the timeout variables, the relative values of timeout states, and not the full prediction of the model.

I specified the model with winning {0,1} as the outcome variable, and with yard line, score difference, time remaining, and timeouts for the offense and defense as predictors. The sample was restricted to 1st downs in the 3rd quarter near midfield, with the offense ahead by 0 to 7 points.
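As a hedged illustration of how a timeout coefficient in a logit model translates into WP, the sketch below back-solves the coefficient implied by the rough Part 1 figure (.70 WP with 3 timeouts versus .65 with 2). The helper names are assumptions, and the resulting coefficient is not the fitted value from the regression described here.

```python
import math


def logit(p):
    """Log-odds of a probability."""
    return math.log(p / (1.0 - p))


def inv_logit(z):
    """Probability from log-odds."""
    return 1.0 / (1.0 + math.exp(-z))


def wp_with_one_fewer_timeout(wp, timeout_coef):
    """Shift WP by one timeout's worth of log-odds (all else held equal)."""
    return inv_logit(logit(wp) - timeout_coef)


# Coefficient implied by the rough first approximation: .70 -> .65
implied_coef = logit(0.70) - logit(0.65)
print(round(implied_coef, 3))
print(round(wp_with_one_fewer_timeout(0.70, implied_coef), 2))
```

Because the logit scale is nonlinear in probability, the same coefficient implies a smaller WP change at extreme WP values, which is one reason the analysis is restricted to fairly linear regions of the game.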

Results

The Value of a Timeout - A First Approximation

During the NFC Championship Game the other day, we saw a familiar situation. Down by 4 with 14 minutes left in the game, the Seahawks were confronted with a decision. It was 4th and 7 on the SF 37. Should they go for it, punt, or even try a long FG to maybe make it a 1-point game? Pete Carroll ended up making what was the right decision according to the numbers, but not before calling a timeout to think it over.

As I noted in my game commentary, if you need to call a timeout to think over your options, the situation is probably not far from the point of indifference where the options are nearly equal in value. And timeouts have significant value, particularly in situations like this example--late in the game and trailing by less than a TD--because you'll very likely need to stop the clock in the end-game, either to get the ball back or during a final offensive drive. Would Carroll have been better off making a quick but sub-optimum choice, rather than making the optimum choice but burning a timeout along the way?

Here's another common situation. A team trails by one score in the third quarter. It's 3rd and 1 near midfield and the play clock is near zero. Instead of taking the delay of game penalty and facing a 3rd and 6, the head coach or QB calls a timeout. Was that the best choice, or would the team be better off facing 3rd and 6 but keeping all of its timeouts?

Both questions hinge on the value of a timeout, which has been something of a white whale of mine for a while. Knowing the value of a timeout would help coaches make better game management decisions, including clock management and replay challenges.

In this article, I'll estimate the value of a timeout by looking at how often teams win based on how many timeouts they have remaining. It's an exceptionally complex problem, so I'll simplify things by looking at a cross section of game situations--3rd quarter, one-score lead, first down near midfield. First, I'll walk through a relatively crude but common-sense analysis, then I'll report the results of a more sophisticated method and see how both approaches compare.

Payton Was Right to Decline

At least according to Expected Points, he was.

Here's the situation: At the beginning of the 3rd quarter against CAR, NO had a 1st and 10 at their own 16-yard line. They threw for a 7-yard gain, setting up a 2nd and 3 from their 23. But CAR was flagged for defensive holding, which would have given NO 5 yards and an automatic first down at their 21. NO head coach Sean Payton declined the penalty to the bafflement of many including the tv announcers.

The game did not hinge on this decision by any stretch. But it's worth taking a look at. The EP model is probably the right tool for the job in this case because it gives a much finer level of precision to down/distance/yd-line situations than the WP model or other approaches.

Using the handy-dandy WP calculator tool (which as a bonus is also an EP calculator), here are the relevant numbers:

Seahawks Stumble, Should Have Allowed TD

In one of the most anticipated games of the week, the San Francisco 49ers took over down 17-16 to the Seattle Seahawks with 6:20 remaining. After a huge Frank Gore 51-yard run, the Niners lined up for a 1st-and-Goal from the 7-yard line with 2:39 remaining. Seattle had no timeouts remaining. Should the Seahawks have tried to intentionally allow the Niners to score a touchdown? Let's look at Brian's graph for this situation in his intentional TD study:

Is the Revolution Over? Did We Win?

"The Revolution Was Televised. The fourth down revolution is over. Going for it won."

Is Mike right? Did going for it really win? Mike makes the case and cites several promising examples of unconventional 4th down decisions from one Sunday afternoon earlier this season:

"-The Lions going for it on 4th-and-goal from the two-yard line, early in their win over the Cowboys.
-The Dolphins going for it on 4th-and-1 from the Patriots' 38-yard line, in the second quarter.
-The Patriots going for it on 4th-and-4 from the Dolphins' 34-yard line, while leading by three points in the fourth quarter.
-The Bengals going for it on 4th-and-inches from the 1-yard line, while leading 14-0 against the Jets.
-The Broncos scoring a 4th-and-goal touchdown to tie the game at 21 against the Redskins, in the third quarter.
-The Packers converting a 4th-and-3 from their own 42-yard line, setting up a touchdown to increase their lead to 31-17."

I think Mike is right to point out some very interesting cases where coaches are making some notable decisions, but the revolution is far from complete. I would suggest that an avalanche is the better analogy than revolution. One day there may be an avalanche of aggressive 4th down decisions, but right now we're only seeing a few rocks trickle down the mountainside. It's not that there haven't been bold examples of enlightenment. It's just that there are so many opportunities that coaches have spurned.

When the Defense Should Decline a Penalty After a Loss Part 2 (2nd Downs)

I recently looked at when it made sense for the defense to decline a 10-yard holding penalty following a 1st down play for no gain or a loss. It turned out that defenses should generally prefer to decline after a loss of 3 or more yards.

First downs are easier to analyze because they almost always begin with 10 yards to go. Unfortunately, 2nd downs aren't so cooperative. It's amazing how thin the data gets sliced up. Most downs aren't losses, even fewer have holding penalties, and rarely are they declined. Still, there are enough cases for a solid analysis using 1st-down conversion probability as the bottom line.

Put simply, a defense would prefer to decline a penalty on a 2nd down play whenever the resulting 3rd down situation leads to a conversion less often than the 2nd down plus the 10 yards.

The chart below plots conversion probability for 2nd and 3rd down situations. The red line illustrates the conversion probability of 3rd down and X to go situations. For example, 3rd down and 7 situations are converted about 40% of the time.

The green line illustrates 2nd down situations, but slightly differently. It plots conversion probabilities for 2nd down and X plus 10 yards. For example, 2nd and 13 (i.e. 3 + 10 yds) situations are converted 45% of the time. The black line is the smoothed line fitted to the 3rd down conversion rates. I plotted things this way because it's the actual comparison we're interested in, given a gain of zero yards.
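The decision rule can be written down in a couple of lines. This is only a sketch: the function name is made up, and the example call simply plugs in the two rates quoted above (40% and 45%) to exercise the comparison, even though they come from different to-go distances.

```python
def defense_prefers_decline(third_down_rate, second_down_plus_ten_rate):
    """True when declining leaves the offense less likely to convert.

    third_down_rate           -- conversion rate of the 3rd-and-X that
                                 results from declining the penalty
    second_down_plus_ten_rate -- conversion rate of the 2nd-and-(X + 10)
                                 that results from accepting it
    """
    return third_down_rate < second_down_plus_ten_rate


# Illustrative only: 0.40 (3rd and 7) vs 0.45 (2nd and 13) as quoted above.
print(defense_prefers_decline(0.40, 0.45))
```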

On the Effect of Coaching

There are exceptions, like that guy on the left, but my hunch is that NFL coaches are mostly interchangeable.

I think at the NFL level, all coaches employ the same best practices. There is no secret sauce that one coach has over another in terms of instruction, motivation, strategy, etc. This is because of the highly mobile, fluid market for coaches and the large size of their staffs. There are very strong constraints on deviation from league norms in any dimension.

Also, from statistical analysis, we can measure the variance in team performance attributable to randomness (sample error due to a short 16-game season) and player impacts (the addition or subtraction of a player's impact on team production, player interaction effects). There is very little variance left that can be attributed to other causes, including coaching. In other words NFL outcomes are overwhelmingly driven by player talent and luck, and there's not much room left for coaching to make a big impact.

We can observe this intuitively, as the very same coaches can have wildly different records from year to year. How much effect can they really have?

When Should the Defense Decline a Penalty After a Loss? Part 1

Let's say there's a sack or other tackle that results in a several-yard loss. And to compound the offense's woes a flag for holding is thrown, potentially setting up a 1st and 20 situation. Should the defense accept or decline the penalty and force a 2nd and X? We can evaluate this question in a few ways. We'll use a simple method and a more complex method to find out when a defense should normally decline a penalty on first down.

Before you read on, what do you think the break-even yardage is? What do you think most coaches think it is?

Examining the Value of Coaches' Challenges

Kevin Meers is the Co-President of the Harvard Sports Analysis Collective. He is a senior majoring in economics with a statistics minor, and has spent the past two years or so as an analytics intern in the NFL. He is currently writing his thesis on game theory in the NFL, and probably puts too much thought into how the perfect fantasy football league would be structured.

The coach’s challenge is an important yet poorly understood part of the NFL. We know challenges are an asset, but past that, we do not have a good understanding of what makes a good challenge or if coaches are actually skilled at challenging plays. This post takes a step towards better understanding those questions by examining the value of the possible game states that stem from challenged plays.

To value challenges, we must understand how challenges change the game’s current state. When a play is challenged, the current game state must transition into one of two new game states: one where the challenged play is reversed, the other where it is upheld. These potential game states are the key to valuing challenges.

Let’s look at a concrete example from last season. With two minutes and two seconds left in the fourth quarter in their week ten matchup, Atlanta had first and goal on New Orleans’ ten-yard line. Matt Ryan completed a pass to Harry Douglas, who was ruled down at the Saints’ one-yard line… only Douglas appeared to fumble as he went to the ground, with the Saints recovering the ball for a potential touchback. When New Orleans challenged the ruling on the field, the game could have transitioned into two possible game states: Atlanta’s ball with second and goal on the one, or New Orleans’ ball with first and ten on their own 20 yard line. If the Saints lost the challenge, they would have a Win Probability (WP) of 0.28, but if they won, their WP would jump to 0.88. This potential WP added, which I refer to as “leverage,” is key to valuing challenges. Mathematically, I define leverage as:

Should You Bench Your Fumbling Running Back?

Sam Waters is the Managing Editor of the Harvard Sports Analysis Collective. He is a senior economics major with a minor in psychology. Sam has spent the past eight months as an analytics intern for an NFL team. When he is not busy sounding cryptic, he is daydreaming about how awesome geospatial NFL data would be. He used to be a Jets fan, but everyone has their limits.

When the Pittsburgh Steelers traveled to Cleveland in week 12 of last season, Rashard Mendenhall was the Steelers’ starting running back. Well, he was at first. Mendenhall fumbled on his second carry of the game, and Head Coach Mike Tomlin benched him immediately. On came backups Isaac Redman, Jonathan Dwyer, and Chris Rainey, who all fumbled and joined Mendenhall on the sidelines in quick succession. Out of untainted running backs to sub in, Tomlin looped back around to Mendenhall, who put the ball on the ground again. Mendenhall, of course, went right back to the bench, ceding his snaps to Dwyer and Rainey for the rest of the game. This was one of the more prolific fumble-benching sprees in NFL history, but we see tamer versions of this scenario all the time. Just look back to last season. David Wilson fumbled and Tom Coughlin actually made him cry. Ryan Mathews fumbled away his job to Jackie Battle. Tears and Jackie Battle - does any mistake deserve these consequences?

Rex Ryan Runs Out of Challenges

Early in the 4th quarter of Sunday's BUF-NYJ game, the Jets led by 8. BUF QB EJ Manuel apparently lost a fumble, but the officials ruled him down. Had NYJ recovered, they would have had the lead and a 1st and 10 in BUF territory, only about 5 yards from realistic FG (attempt) range, with about 13 minutes left to play. The only problem was that NYJ head coach Rex Ryan had no more challenges, having burned them both just minutes prior to the fumble that wasn't.

Let's examine the leverage of each challenge using the Win Probability (WP) model. 

When to Intentionally Allow a TD When Tied

Super Bowl 32 was a memorable one. A tight game featuring John Elway and Brett Favre would have made a memorable regular season game, but as a Super Bowl it was spectacular. To me the most interesting thing about the game was how the winning score happened. It was allowed intentionally.

The game was tied at 24. The Broncos began a drive with 3:27 left to play. After a big Elway pass and several Terrell Davis runs, Denver put Green Bay in the Field Goal Choke Hold. Eventually, Denver fought its way to a 1st and goal from the Green Bay 8. A holding call on Shannon Sharpe moved Denver back to 1st and goal from the 18. Another Davis run set up 2nd and goal from the 1 with just 1:47 to play. Rather than allow Denver to run down the clock any further, head coach Mike Holmgren elected to allow the TD on the next play to give his offense a better chance to respond with a TD of their own.

In the wake of my previous five-part analysis of intentionally allowing a TD, I learned what the Internet jargon tl;dr stands for. I promise to make this one shorter. Previously, I looked at situations in which an offense that's trailing by 1 or 2 points could run out the clock before kicking a field goal to win. In many cases, depending on the time, score, field position, and number of timeouts remaining, it makes sense for the defense to allow a TD rather than try to force a stop and a FG attempt.

This time I'll examine similar situations where the score is tied. The considerations are a little different than when the defense has a 1 or 2 point lead. A tie score means that the defense can't be relatively assured of a win in the event of a miss. And given a successful FG to break a tie, a FG in response only re-ties the game.

New Overtime 4th Down Decisions When Down 3 Points

Your opponent kicked a FG on the first possession of overtime, and now your team needs a TD to win or a FG to continue the game. Your offense has driven down to the opponent's 10-yard line, but the drive has stalled. It's 4th down and 3. Should you go for the risky conversion and ultimately a TD for the win, or should you attempt a FG knowing you'd be at a disadvantage giving the ball to the opponent in sudden death?

The new NFL OT rules are unique in a lot of ways, and by unique I mean convoluted and contrived. There are basically three possible game states:

1. The first drive in which no score leads to Sudden Death, a TD wins, or a FG spawns the second state...
2. A possible second possession in which the offense is down by 3 points. It must score a TD to win or a FG to continue into SD.
3. Lastly, traditional SD itself.

The three game states are successively easier to model. The first possession must consider all the possibilities of the following two states. The second state must only consider itself and the possibility of SD. The second possession is also slightly easier to model because there is no punt option. An offense trailing by 3 points simply must score or lose.

New Feature: Time Calculator

Someone tell Norv Turner there are no timeouts in press conferences.

I created a new tool to estimate the time at which a trailing defense (or soon-to-be-trailing defense) can get the ball back if they force a stop. The results are based on the time at the first down snap of a series and the number of timeouts remaining for the defense. You can adjust the expected duration of each play and the time consumed between the previous whistle and the next snap when the game clock is not stopped. The defaults are 6 and 39 seconds respectively. The calculator assumes there will be no stoppages due to reasons other than timeouts and the two-minute warning, such as incomplete passes, runs out of bounds, or penalties.

One additional feature is that you can check a box called "Save Timeout." This will indicate that the team on defense would prefer to allow the clock to wind down to the two minute warning rather than stop the clock with a timeout. For example, if the defense has one timeout left and the second down play ended at 2:10, the defense can elect to save the timeout for its offense in exchange for running down the 10 seconds to 2:00. This is, in effect, a trade-off between the 10 seconds of game clock and having a timeout available for an offensive drive.

It's very difficult to quantify the value of the timeout on offense. It's intuitively very valuable because an offense can use the middle of the field, which otherwise allows the defense to guard the sidelines.

Try this: Enter 2:24 remaining with 3 timeouts. Leave the 6 sec and 39 sec defaults for play and inter-play durations. Click calculate with the Save Timeout option unchecked and checked (with the 12-second default cutoff value). With 'Save Timeout' checked, you get the ball back with 1:54 and retain a timeout for your offense. Without the option checked, you get the ball back with 2:00 on the clock and no timeouts, with the 2-minute warning essentially going to waste.
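For readers who want to see the clock logic spelled out, here is a minimal sketch that reproduces the example above. It is not the calculator's actual code. It assumes the offense runs four downs with no first down, that possession changes at the end of the 4th-down play, and that 'Save Timeout' lets the clock run whenever the time to the two-minute warning is strictly less than the cutoff. That last rule is a guess chosen because it matches the 2:24 example.

```python
TWO_MINUTE_WARNING = 120  # seconds


def time_ball_back(clock, timeouts, play_sec=6, between_sec=39,
                   save_timeout=False, cutoff_sec=12):
    """Return (seconds left, defensive timeouts left) at change of possession.

    `clock` is the game clock in seconds at the 1st-down snap.
    """
    warning_pending = clock > TWO_MINUTE_WARNING
    for down in (1, 2, 3, 4):
        clock = max(clock - play_sec, 0)
        if down == 4 or clock == 0:
            break
        if warning_pending and clock <= TWO_MINUTE_WARNING:
            # The play spanned the two-minute warning, which stops the clock.
            warning_pending = False
        elif (save_timeout and warning_pending
              and clock - TWO_MINUTE_WARNING < cutoff_sec):
            # Assumed rule: inside the cutoff, run the clock down to the
            # warning instead of spending a timeout.
            clock = TWO_MINUTE_WARNING
            warning_pending = False
        elif timeouts > 0:
            timeouts -= 1  # stop the clock with a timeout
        elif warning_pending and clock - between_sec <= TWO_MINUTE_WARNING:
            # No timeouts left: the clock runs until the warning stops it.
            clock = TWO_MINUTE_WARNING
            warning_pending = False
        else:
            clock = max(clock - between_sec, 0)
    return clock, timeouts


def mmss(seconds):
    """Format seconds as m:ss."""
    return f"{seconds // 60}:{seconds % 60:02d}"


for save in (False, True):
    left, tos = time_ball_back(144, 3, save_timeout=save)
    print(save, mmss(left), tos)
```

Like the calculator, the sketch ignores stoppages from incomplete passes, runs out of bounds, and penalties.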

This option usually only makes a difference when the defense begins the series with all three timeouts remaining. It also may be smart depending on when a team can expect the change of possession to occur. The defense does not want change of possession to occur on a play that spans the two minute warning because that combines two potential clock stoppages into a single stoppage.

The workings of the NFL game clock are far more complex than they might seem. That's why I forced myself to build the calculator and think through all the considerations. The algorithm behind the calculator is basically a by-product of the one I used to create the chart below, which underpinned my analysis of when a defense should prefer to intentionally allow a TD.

Fourth Downs in the New Overtime: First Possession

This may have been the most difficult, challenging analysis I've done. No joke. The new OT format is more complex than it seems. There are three distinct 'game states' in which a team can find itself:

1. The initial drive of the first possession (A TD wins, a turnover or punt triggers Sudden Death (SD), and a FG triggers State 2.)
2. The team down by 3 now has one possession to match the FG (triggering SD) or score a TD to win.
3. Sudden Death

The possibilities are illustrated in the event tree below, along with some back-of-the-napkin transition probabilities I made back when the new rules were first proposed. (State 1 is "1st Poss". State 2 is the branch under "2nd Poss" that follows a FG in the 1st Poss. Sudden death is self-explanatory and occurs after a no-score in the 1st Poss or after a FG is matched in the 2nd Poss.)
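The three states chain together in a simple way, sketched below. Everything here is an assumption for illustration: the function and parameter names are invented, the probabilities in the example are placeholders rather than the transition probabilities from the event tree, and ties and field position are ignored. The sketch also assumes that after a no-score first possession the other team has the ball first in SD, and that after a matched FG the first team does.

```python
def first_possession_win_prob(p_td1, p_fg1, p_td2, p_fg2, sd_wp_with_ball):
    """Win probability for the team with the first OT possession.

    p_td1, p_fg1    -- chance the first possession ends in a TD / FG
    p_td2, p_fg2    -- chance the team down 3 answers with a TD / FG
    sd_wp_with_ball -- win probability of the team holding the ball
                       when Sudden Death begins
    """
    p_no_score1 = 1.0 - p_td1 - p_fg1
    p_fail2 = 1.0 - p_td2 - p_fg2  # down 3 and no score: game over

    # State 2: opponent fails (we win) or matches the FG (SD, our ball).
    wp_after_fg = p_fail2 + p_fg2 * sd_wp_with_ball
    # State 3 directly: SD begins with the opponent holding the ball.
    wp_after_no_score = 1.0 - sd_wp_with_ball

    return p_td1 + p_fg1 * wp_after_fg + p_no_score1 * wp_after_no_score


# Placeholder probabilities, for illustration only.
print(round(first_possession_win_prob(0.2, 0.2, 0.3, 0.3, 0.6), 3))
```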

One Play Remaining before the Half on the Goal Line

IND QB Andrew Luck spiked the ball with 1 sec remaining in the 2nd quarter, bringing up a 3rd and goal from the 1-yd line. Without a moment of hesitation, acting head coach Bruce Arians ran in the FG unit for the chip-shot. The FG was good and IND took a 13-6 lead over BUF into the locker room. Was this the smart call?

Let's set aside the score and look at the general case. It's a special situation because there is no subsequent kickoff. Instead of being worth 2.7 Expected Points (EP), a FG is worth a full 3 EP. And a TD would be worth a full 7 EP instead of 6.7. The offense would take the full value of the score.

The expected value of each choice is straightforward. It's just the probability of success * the value of the score. In the case of the FG it would be:
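The post's own figures are not part of this excerpt. As a hedged sketch of the calculation just described, the snippet below computes the expected value of each choice and the TD success rate at which going for it breaks even. The 98% FG success rate is a placeholder assumption, not a number from the article, and the function names are invented.

```python
def expected_points(p_success, points):
    """Probability of success times the value of the score."""
    return p_success * points


def breakeven_td_probability(p_fg, fg_points=3.0, td_points=7.0):
    """TD success rate at which going for it equals kicking in expected points.

    Uses the full 3 and 7 point values because no kickoff follows the play.
    """
    return expected_points(p_fg, fg_points) / td_points


p_fg = 0.98  # hypothetical chip-shot success rate
print(round(expected_points(p_fg, 3.0), 2))
print(round(breakeven_td_probability(p_fg), 2))
```

Under that placeholder, the offense would need to score from the 1 roughly 42% of the time for the two choices to be equal in expected points.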

How Much Did Jim Schwartz's Attempted Challenge Cost the Lions?

Had Schwartz not thrown the challenge flag on Forsett's run, the play would have been reviewed and certainly overturned. That would mean a 3rd and 2 for HOU on their own 27 with 6:40 or so in the 3rd quarter. Down by 10 points, that means a 0.18 Win Probability (WP) for HOU.

But because Schwartz threw the challenge flag on a play that would have been otherwise reviewed automatically, he received an unsportsmanlike penalty. The result was that the play was not reviewed by rule, and Forsett's TD stood. That gave DET a touchback up by 3, giving HOU a 0.35 WP.

That's a cost of 0.17 WP. It essentially doubled HOU's chances of winning at that point.

A Week of Weak Decisions

There were a plethora of interesting coaching decisions this week, especially in the early games on Sunday. Avid readers cringe at these conservative calls so let's look at a few of them a little closer.

Cleveland Browns: Punted on 4th-and-1 on the Indy 41-yard line down by four with a little over six minutes remaining.

On what was likely the worst decision of the day, the Browns cut their expected win probability almost in half by punting. On 4th-and-1 from the opponent's 41, you should almost always go for it. You are expected to score +1.58 points by going for it (74% league-wide conversion rate) versus -0.04 points by punting.

Reid's "Gutsy" Call Goes Unnoticed


Down six to the Steelers on the road, Michael Vick and the Eagles took the field near the start of the fourth quarter. After three short gains, the Eagles faced a 4th-and-1 from their own 30-yard line. In today's NFL, this is an easy decision for coaches: Teams punt the ball over 90% of the time. Since there was still significant time remaining in a one-score game, it's safe to assume that the Eagles should have performed similarly to a score- and time-agnostic situation (although obviously the score and clock factored into Reid's decision).

Reid was feeling some extra confidence from his tufty moustache, so he decided to go for it. I mentioned it was an easy decision for coaches to punt, but is it the right decision?

Marvin Lewis Kicks Down by 4 with 3 Minutes to Play

Jason Lisk uses the Fourthdownulator to analyze Marvin Lewis' curious decision to kick down by 4 with 3 minutes to play:

Marvin Lewis cost his team dearly. About 13.5% chance of winning. That may not sound like a lot to you, but it’s huge. He almost cut it in half. You may be shocked to see that the chances of winning weren't that much better being down by 1-2 instead of 4-5, but that’s because coaches are conservative and play for field goals. When they need touchdowns, they act more optimally. Well, unless they have three minutes left.

WP Forfeited

It's 4th and 2 from the opponent's 42. The score is tied at 17. There's 1 minute and 17 seconds left on the clock. What would you do? If you're Ken Whisenhunt, you send in Dave Zastudil to punt.

One thing I've learned about human nature is that to help convince someone of something, I should frame the issue in terms of fearing a potential loss. That's usually a stronger motivation than expecting a potential gain. For example, I wouldn't suggest to a coach that he could improve his chances of winning by going for it on 4th down more often. Instead, I'd tell him he's forfeiting a significant chance of winning by not going for it. Nobody likes forfeiting stuff. My wife suggested using the phrase "leaving points on the table." Coach Whisenhunt forfeited a 13% chance of winning that game.


By far, the most common question I get from reporters is whether teams are going for it more often. My answer is almost always "it's complicated, but I think so." The difficulty in measuring 4th down aggressiveness is that it's so dependent on the situational variables. You can't just count how often teams go for it rather than kicking. To-go distance, score, and time all weigh heavily on the decisions, and there are just too many possible combinations to compare rates from year to year or even decade to decade.

If we constrain the analysis to certain parameters--inside opponent territory and when the score difference is within reason, for example--we'd get an incomplete picture. We've learned over the last few years that there can be many situations outside traditional 'go-for-it' limits in which it can be beneficial to go for the conversion rather than kick or punt. And each situation can have a drastically different magnitude of effect on a team's chances of winning. Also, why would we reward a coach if he goes for it on 4th and whatever when he's down by 5 with a minute left to play? Coaches always do that.

Here's my stab at the problem. With every 4th down situation in which it would usually make sense to go for it but a coach decides to kick, he forfeits some amount of win probability. We can total all the WP forfeited to measure the degree to which teams are erring on the side of conservatism.
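In code, the tally might look like the sketch below. The record layout, the function name, and the sample decisions are all hypothetical. The one substantive choice is that only kicks made when going for it had the higher WP count toward the total, and conversions attempted are never penalized.

```python
def wp_forfeited(decisions):
    """Total WP given up by kicking when going for it had the higher WP.

    Each decision is a dict with:
      wp_go       -- WP of going for it
      wp_kick     -- WP of the punt or FG attempt
      went_for_it -- what the coach actually chose
    """
    total = 0.0
    for d in decisions:
        if not d["went_for_it"] and d["wp_go"] > d["wp_kick"]:
            total += d["wp_go"] - d["wp_kick"]
    return total


# Made-up sample: one costly punt, one correct kick, one correct go.
sample = [
    {"wp_go": 0.55, "wp_kick": 0.42, "went_for_it": False},
    {"wp_go": 0.30, "wp_kick": 0.33, "went_for_it": False},
    {"wp_go": 0.60, "wp_kick": 0.50, "went_for_it": True},
]
print(round(wp_forfeited(sample), 2))
```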

QB Sneak vs. RB Dive

In the NO-ATL game Sunday, ATL went for a risky 4th down and 1 conversion attempt in OT with just inches to go. They elected for a RB dive play rather than a QB sneak. (By dive play, I just mean a straight RB handoff directly between the tackles.) But all '4th and 1' situations are not equal--from 1.5 yards down to an inch to go.

QB sneaks seem more successful on inches-to-go situations than RB dives. We'd like to know if the data back this up. Unfortunately, the play descriptions don't note how long the 'and 1' is, whether it's a long yard or just inches. We'd expect to see more QB sneaks on the shorter distances and more RB dives on the longer distances, which would bias the numbers because longer to-go distances would naturally be tougher to convert. Still, we may be able to draw some inferences.

The table below lists the success rates for 3rd and 4th down runs with 1 yard to go. It breaks out plays by QBs, RBs, and FBs. QB scrambles on pass plays have been removed. Kneel downs and spikes are also removed. Plays inside the 10 yd line are removed due to field compression effects.

The Secret to the Falcons' Success?

The Wall Street Journal's Michael Salfino wrote an article last week about the Falcons' aggressive tendencies on fourth down this season. In opponent territory, with 3 yards or less to go and more than 4 minutes remaining, the Falcons went for the conversion on 72% of their opportunities, the most in the league. They converted on 85% of those attempts. A review of the play-by-play indicated that Atlanta would have scored 21 points according to league-average field goal rates, but following their successful conversions the Falcons actually went on to score 51 points--a net of 30 additional points. And in 3 games, the points scored off of 4th down conversions made the difference in the win. Could this be the explanation for the Falcons' dramatic over-performance this season?

In this post, I'll attempt a few things. First, I'll discuss how we should properly measure the impact of 4th down decisions and measure the wisdom of the decisions themselves. Then I'll look specifically at the Falcons' 4th down situations and measure their impact on Atlanta's fortunes in terms of EPA and WPA. Lastly, I'll look at how all the teams did in terms of 4th down decisions.

Run-Pass Imbalance In the Red Zone--1st Downs

Just before halftime in last year's Super Bowl, on first and goal from the one, Kurt Warner threw the ball directly into the arms of James Harrison who rumbled 100 yards for a touchdown. With so little time left in the half, passing was the obvious call, but that play highlights the dangers of passing so close to the goal line.

Game theory tells us that when payoffs for strategies are unequal, the strategy with the higher payoff should be chosen more often. We've seen that between the 20 yard lines payoffs for passes are consistently higher than for runs on 1st down, but inside the 20 running becomes more lucrative. Now let's take a look at the red zone in more detail, where the stakes get higher and the field gets shorter. On 1st downs in the red zone, should offenses run or pass more often, or do they already strike the right balance?

Mike McCarthy: Hero

The Packers' season came to an abrupt end against the Cardinals, but head coach Mike McCarthy's unorthodox tactics were not the cause. McCarthy engineered an improbable comeback that sent the game into overtime.

Following a Packers touchdown that brought the score to within 14 points in the third quarter, McCarthy called for an onside kick. Onside kicks are surprisingly successful when unexpected, averaging a 60% successful recovery rate. In this instance, it worked, and the Packers were on their way to another touchdown drive. And just as important, it kept the ball out of the hands of the potent Cardinals offense.

Andy Reid Is No Longer My Hero

Note: This post was taken from a series of comments on the recent Games of Week post. Thanks to all who contributed to the discussion.

About a month ago Eagles head coach Andy Reid was my hero of the week for his daring, and smart, onside kick to open the game against the Redskins. Although it failed and gave up 7 points, he got them back by going for it on 4th down inside field goal range, succeeding, and getting the touchdown.

This past week, however, Reid is again the goat. With 3 minutes left in a tie game against the Broncos, the Eagles faced a 4th and 1 from their own 49. The Eagles punted. This was a big mistake, and you don't even need fancy math or some win probability model to prove it. Worse, Reid made the exact same call a year ago in the infamous tie against the Bengals.

If you have a 4th and 1 at about the 50 and you go for it, the values of the two possible outcomes are exactly equal. In other words, either I have a 1st down at the 50, or my opponent has a 1st down at the 50. It's perfectly symmetrical. Now consider what punting means.

Desperation Graphed

[Note: I can't remember if I already posted this or not. I did this back over the summer and saved it for the season. But I don't think I ever clicked 'publish.' ]

How desperate do NFL coaches have to be before they start going for it on 4th down?

Recently I've shown that, as a rule, teams should be going for it on 4th down far more often than they currently do. But what is 'currently do?' How far do coaches let themselves get backed up against the wall before they start actually doing what's best for their chances of winning?

2009 Team-Specific Run-Pass Balance

Recently I've been looking at run-pass balance on first downs based on a principle of game theory. When strategy mixes are optimized, the two strategies will ideally produce equal payoffs. If they aren't equal, then the better strategy should be selected until the opponent responds with his own counter-strategy. Results suggested that, in the NFL overall, the gains by passing on first down exceed those by running. In turn, this suggests that offenses should pass more often than they currently do.

However, every team has its own relative ability between passing and running. You can't just tell the 2009 Raiders to start passing more often. Their running game may actually be superior to their passing game in terms of expected payoffs, so while most teams should be passing more frequently, it's possible a minority of teams should be running more often.

JaMarcus Russell: Concorde of the NFL

With JaMarcus Russell’s recent benching, there’s been a lot of talk about when it’s time for a team to cut its losses on a failed quarterback. I don’t have hard numbers at my fingertips, but I’d be fairly certain that if a QB isn’t playing above average football or there hasn’t been steady improvement, by the end of his second year, it’s time to move on. [Edit: Here's a good look at that very question at PFR.] There’s no question teams tend to stick with struggling QBs well beyond their expiration date, even when better alternatives exist. The real question is, why?

Let’s say you’re an out-of-town Bills fan, and before the season began you were understandably optimistic about the team’s prospects. You bought prime tickets to the January 3rd game hosting the Colts, including parking and a hotel room. Altogether the bill comes to $400. In August, this feels like a great deal.

As the season wears on, it becomes clear the Bills aren’t contenders. The coach is fired, and the upcoming Colts game is not looking promising, as the Colts appear likely to be playing for home field advantage in the playoffs. Everything points toward a humiliating blowout. What’s worse, as the game approaches the weather isn’t looking good. Bills fans are always the hardy type, but the forecast is beyond bad: snow, wind, freezing rain, and bitter cold. You’re not exactly excited about the prospect of going to the game.

(Much) More on 1st and 10 Run-Pass Balance

In a recent article I presented evidence that offenses should generally pass more often on first down. Accounting for both the potential gains and the potential risks of each type of play, passing tends to lead to a greater net point advantage than does running. The analysis was based on a concept known as Expected Points, which measures the average point advantage an offense can expect given a down-distance-field position situation.

Expected Points incorporates all the various factors such as turnovers, yardage gains, sacks, incompletions, first down conversions, scoring, and so on. But I thought it would be helpful to dig deeper to investigate how and why passing appears more advantageous. In this article, I'll present a series of graphs comparing running and passing on first down, each one looking at a different facet of the game.

Offenses Run Too Often On 1st Down

NFL offenses generally run too often on 1st down. Accounting for the relative gains of each play type, and accounting for the risks of turnovers, offenses should pass more. There is currently an imbalance, where teams are too often running directly into defenses that are expecting runs.

Game theory tells us that when there are two strategy options, like run and pass, the expected payoffs for both options should be equal. You really don't need game theory to intuitively understand this. If one option yields a better payoff, then it should be chosen until the opponent responds with a strategy change of his own. Eventually, as the opponent responds, the payoffs for the two options equalize. The point at which the strategy mix equalizes payoffs is known as the minimax, sometimes called the Nash equilibrium. The resulting strategy mix, or run-pass balance in this case, produces the best overall, long-run payoff.

When there are two strategy options and one of them yields a much higher payoff, it tells us two things. In this case, passing is more lucrative than running on 1st down, and this tells us: 1) offenses should be passing more often, and 2) for now, defenses should continue to be more biased toward stopping the run.

Last Thoughts on the 4th and 2

At the risk of being accused of milking this thing, here are a few final thoughts on the topic. I watched the Football Night in America segment and a few things struck me.

Let me say I have a lot of respect for Coach Dungy. I also think that convincing someone skeptical is difficult, and we shouldn't expect someone to immediately come around. In general though, coaches (and former players/analysts) benefit from a perception that football is some unknowable mystery, and that they are the only priests that can divine the true answers.

From Dungy's perspective, he's enjoyed a long career by doing it the conventional way. But the reason his way worked was because everyone else played the same way. To change his mind now might mean admitting, "I did it wrong all these years." That's a tough hurdle to overcome.

Kevin Kelley of Pulaski Academy

A reader posted this link in a recent comment. It's an interview with Kevin Kelley, head coach of the Pulaski Academy high school football team in Little Rock, Arkansas. He always goes for it on 4th down and 75% of his kickoffs are onside kicks. The team has enjoyed unprecedented success, including winning the state championship last year. I've mentioned him before, but he spells out his thinking very clearly here. Gregg Easterbrook has been writing about Coach Kelley and Pulaski Academy since 2007. Kelley was also featured in a Sports Illustrated article earlier this season.

The video is embedded below, but you may need to give the video a few seconds to load. Here is the direct link.

Onside Kicks

With 4 minutes left in the first quarter of last week's Cardinals-Seahawks game, Arizona's Neil Rackers booted a short but high 'pooch' kick that was quickly recovered by the kicking team. The kick recovery was worth a very considerable +0.12 WP. The Cardinals went on to score a touchdown, taking a 14-0 lead. How smart are onside gambles like this?

Onside kicks in the NFL are successful 26% of the time. It’s true, but it’s also very misleading. Onside kick success rates are very dependent on whether the receiving team is expecting one.

Irrational Play Calling

As if you need any more evidence of how irrational many coaches can be when facing a 4th down, here’s some more.

In ‘no man’s land,’ the region of the field from the opponent’s 30-35 yd line, punts don’t buy you much and field goals are just above 50/50 propositions. Going for the conversion occurs fairly frequently, particularly on 4th and short. This is the confluence of 4th down decision making where all 3 options are reasonable choices.

But once a coach has decided that going for it is not worth the risk, he can then choose between attempting a FG and punting. Neither of these options is affected in any way by distance to go. Only field position matters. A field goal is just as rewarding and just as risky on 4th and 1 from the 30 as it would be on 4th and 15 from the 30. Same goes for punts. Distance to go should only affect conversion attempts. It would be irrational to base a decision between a FG and punt based on something that only matters when attempting a conversion. But that doesn’t stop NFL coaches from doing exactly that, a fact first noticed by commenter 'Jim A' last season.

Marvin Lewis Gets Bold In Overtime

The Wall Street Journal's 'Numbers Guy' Carl Bialik asked me the other night about the Bengals' 4th and 11 conversion attempt late in overtime against the Browns. With just over a minute left in OT, the Bengals faced a 4th and 11 from the Cleveland 41. Quarterback Carson Palmer was able to scramble for 15 yards and a first down, and Cincinnati went on to kick a field goal to pull out the win. At the risk of sounding like a broken record on the topic of 4th down decisions, here's a summary of what I told Carl.

Raheem Morris Is A Really Optimistic Guy

Trailing by 6 points with 4:30 left in the game, the Buccaneers faced a 4th and goal from Washington’s 4-yard line. The Bucs kicked the FG to make the score 16-13 and went on to lose. Columnist Gary Shelton of the St. Petersburg Times wants to know why head coach Raheem Morris didn’t go for the touchdown. That makes at least two of us.

I’ll spare everyone the math, but all things being equal the better decision would have been to go for it. Kicking the field goal gave the Bucs a 0.19 Win Probability (WP). Attempting the TD would net a 0.29 WP on balance. Morris’ decision basically cut his chances of winning by a third.  Sure, the particular "flow" and match-ups of the game are factors, but those considerations are usually overblown. Besides, if the game is close enough for it to matter, then the two teams are probably fairly equal, at least for that day.

But that’s not the point of this post. The more interesting thing is the glimpse inside the mind of an NFL coach. Here’s what Morris said when asked about the decision:

Zorn Is My Hero 2

Jim Zorn may have his hands full managing the Redskins, but I'll give him credit for his courage. Last week, I made the case that his two daring 4th down decisions on the final drive were the right calls. This week, before the sun had even set on the day his team couldn't beat the lowly Lions, he was excoriated for two more controversial decisions.  In this post, I'll examine if he made the right calls in Detroit.

Jim Zorn on 4th Down

Zorn is my hero today. On the Redskins’ final drive of their game against the Rams yesterday, head coach Jim Zorn went for it on 4th down not once, but twice. The network commentators were shocked, and the local media coverage has been decidedly critical. Were they good decisions?

The first decision “felt” right to me. Up by 2 points with 3:47 left in the 4th quarter, Washington faced a 4th and 1 at St. Louis’ 20 yd line. A FG attempt from there is an 84% proposition. A kickoff with a 5-point lead gives the Redskins a 0.76 Win Probability (WP). A missed FG attempt gives the ball to the Rams at the 27 and leaves the Redskins with a 0.56 WP. The net WP for the FG attempt is:
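The net figure is just a probability-weighted average of the two outcomes. Here's a minimal sketch of that arithmetic using only the numbers quoted above (the variable names are mine, purely for illustration):

```python
# Expected WP of a field goal attempt: weight the WP after a make and the
# WP after a miss by how likely each outcome is.
p_make = 0.84    # FG success rate from the 20, as quoted above
wp_make = 0.76   # WP after a made FG (kicking off with a 5-point lead)
wp_miss = 0.56   # WP after a miss (Rams take over at the 27)

wp_fg_attempt = p_make * wp_make + (1 - p_make) * wp_miss
print(round(wp_fg_attempt, 3))  # 0.728
```

So the kick is worth roughly 0.73 WP on balance.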

Worst 4th Down Decision of 2008

Last November, the Eagles and Bengals were both desperately trying not to win. And they both succeeded, as their game was the first to end in a tie in several years. It's hard to forget that game thanks to Donovan McNabb's comment that he was preparing for a second overtime period.

McNabb's comment aside, the game was remarkable in that it featured not one, but two of the most timid 4th down decisions in the 2008 season. In both cases, had the offense gone for the first down, it would have significantly improved its chances of winning. Note I'm not saying simply that a successful conversion would have helped the team win. I am saying that on balance, considering the chance of a failed conversion, the far wiser decision would have been to go for it.

With the game tied 13-13 and 1:56 left in the 4th quarter, the Eagles offense faced a 4th and 1 from its own 49-yard line. A punt would have made their WP 0.31. That's lower than you might think at first because handing the ball to the Bengals with two minutes on the clock guaranteed the Eagles would not have enough time to respond to a successful Cincinnati scoring drive.

A successful conversion would have given Philadelphia a tremendous advantage. With a 1st down and the ball at midfield, they would only need a few more yards to get into field goal range for the win. Conversion attempts on 4th and 1s are converted about 74% of the time. All things considered, had the Eagles lined up to go for it, their 'expected' WP would have been 0.60. That's a difference of 0.29 compared to the punt--essentially doubling their chance of winning. In terms of costing a team in its likelihood of winning, this was the single worst 4th down decision of the 2008 season.

The Eagles may not have possessed the NFL's best power running game last year, and that 4th and 1 may have been a "long" 1 yard. But to make a decisive difference, the particular details of the situation would have had to be so overwhelmingly disadvantageous that it's hard to believe.

It's not as though the Bengals defensive line was an impenetrable brick wall, and the Eagles did successfully convert 3rd and 1s 60% of the time in 2008, a task not much different than 4th and 1. Further, we can solve for the break-even conversion success rate. In this case, the Eagles would have needed to convert just 5% of the time for the attempt to be worthwhile.
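For anyone who wants to see how a break-even rate falls out, here's a rough sketch. The punt WP (0.31) and the 74% conversion rate come from the text above. The WP after a successful conversion (0.71) and after a failed one (0.29) are NOT given in this post; they are assumed values I picked so that the expected WP comes out near the quoted 0.60. Treat them as placeholders, not the model's actual outputs.

```python
def expected_wp(p_convert, wp_success, wp_fail):
    """Probability-weighted WP of going for it."""
    return p_convert * wp_success + (1 - p_convert) * wp_fail

def break_even_rate(wp_kick, wp_success, wp_fail):
    """Conversion rate at which going for it is worth exactly the kick."""
    return (wp_kick - wp_fail) / (wp_success - wp_fail)

wp_punt = 0.31      # from the text
p_convert = 0.74    # league-wide 4th-and-1 rate, from the text
wp_success = 0.71   # ASSUMED for illustration
wp_fail = 0.29      # ASSUMED for illustration

go_wp = expected_wp(p_convert, wp_success, wp_fail)
needed = break_even_rate(wp_punt, wp_success, wp_fail)
print(round(go_wp, 2))   # 0.6
print(round(needed, 3))  # 0.048
```

With those assumed inputs the break-even rate lands at about 5%, in line with the figure above. The reason it's so low is that a failed attempt leaves the Eagles only slightly worse off than the punt does.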

I know what you might be thinking. Wouldn't a failed attempt at the 50 give the Bengals the identical situation that a successful attempt would give the Eagles? True, but don't forget the alternative: punting gives the Bengals the upper hand anyway.

Fortunately for the Eagles, the reason they even had the opportunity to consider a 4th down and 1 was thanks to the 13th worst 4th down decision of 2008. Cincinnati punted on the previous drive when a successful 4th down conversion would have given them a firm upper hand.

Coaches talk a lot about "momentum" when it comes to 4th down decisions. A failed 4th down attempt deflates a team and encourages the opponent. Although teams might feel that way, it's not clear at all this makes much difference in terms of who wins. We're talking about professional athletes with plenty of experience at many levels of play.

Besides, think of it this way: Imagine you're a Bengals defender, elated you made a stop on 3rd down while trotting triumphantly off the field. Then you realize the Eagles are lining up for an easy 4th and 1. Chances are they'll convert, and now you're lining up for a whole new 1st and 10. How's the momentum now?

Decision Theory in Football

In Decision Theory, there are generally two kinds of analysis. Descriptive analysis is what people actually do, and prescriptive analysis is what people should do. Rarely are the two things the same. For example, when I use the win probability model to evaluate 4th down decisions, I'm doing prescriptive analysis. Trying to explain whatever the heck coaches are actually doing would be descriptive analysis.

To be fair, coaches are not computers. They are subject to all the imperfections of human decision making. In this post, I'll examine some of the ways that coaches may be making decisions, including minimax, minimax-regret, prospect theory, and expected utility. I'll also discuss the potential for how much of a difference a pure prescriptive analysis can make when applied in real games.

NFL Orthodoxy

NFL football has evolved as an extremely conservative game. By that I mean that coaches adhere to the wisdom passed down from previous generations and are reluctant to deviate from the established orthodoxy. In the real world, away from sports, this approach usually makes sense. Unlike sports, the world is not bounded by sidelines, end zones, and 15-minute quarters. It is highly uncertain and far less predictable than we'd like to think. It makes sense to adhere to what is known to work rather than try to engineer an optimized outcome in a highly uncertain environment.

But in football, we have the stats. We know the probabilities. And we know the possible consequences. 'Conservative,' as I defined it, is therefore often not the best approach. I think the reason that so many coaches adhere to the same orthodoxy, whether in terms of playbooks or 4th down doctrine, is because they aren't conscious of the level of certainty available to them.

Minimax

One of the more conservative approaches is the minimax criterion. Minimax says pick the option that assures you the highest minimum utility. Let's say you have the choice between going on a picnic and going bowling. You'd really rather go on the picnic, but it might rain. Your payoff matrix would look like this:

Payoff Matrix

            No Rain    Rain
Picnic         4         0
Bowling        1         1


If it doesn't rain, the picnic pays off, but if it rains you've lost the afternoon. Bowling is not as much fun as the picnic, but it wouldn't matter if it rains. Minimax says go bowling because 1 is its minimum payoff while 0 is the minimum payoff for the picnic.

Minimax-Regret

Another decision method is known as the minimax-regret criterion. This method seeks to minimize potential regrets. Imagine coming out of the bowling alley and being greeted by a sunny blue sky. 'Darn. Should have gone on the picnic.' In this case, if you go bowling and it doesn't rain, you've gained 1 unit of utility but lost out on 4 units, for a net regret of 3. If you go on the picnic and it does rain, you've gained 0 utility but lost out on 1 unit, for a net regret of 1. If you want to minimize your regret, you'd choose the picnic.

Notice that I haven't mentioned the weather forecast yet. These methods are best relied upon when there is a very high level of uncertainty in the "states of nature" that will determine the payoffs.

Now consider a football example. Say a coach has three plays that make sense for a given situation, and the opposing defense can call one of three kinds of defenses. An example payoff matrix might look something like this:

Hypothetical Football Payoff Matrix

            Def X    Def Y    Def Z
Play A        -4        4       12
Play B        -2        3        8
Play C         3        2        1


Note that this is not game theory. We're not looking for a Nash equilibrium. The offensive coordinator is thinking of the defense as a "state of nature." It's something he has no control over and is difficult to predict.

In this case, both Plays A and B have the possibility of negative payoffs. Play C guarantees at least a payoff of 1, and therefore would be the minimax decision.
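If you want to check the minimax pick yourself, here's a small sketch. The payoffs are the ones in the hypothetical matrix above; the dictionary layout and variable names are just my own choices.

```python
# Payoffs from the hypothetical matrix: payoffs[play][defense].
payoffs = {
    "Play A": {"Def X": -4, "Def Y": 4, "Def Z": 12},
    "Play B": {"Def X": -2, "Def Y": 3, "Def Z": 8},
    "Play C": {"Def X": 3, "Def Y": 2, "Def Z": 1},
}

# Minimax as used here: find each play's worst-case payoff,
# then pick the play whose worst case is the highest.
worst_case = {play: min(row.values()) for play, row in payoffs.items()}
minimax_choice = max(worst_case, key=worst_case.get)

print(worst_case)      # {'Play A': -4, 'Play B': -2, 'Play C': 1}
print(minimax_choice)  # Play C
```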

The regret method says something different. Assume the defense had called Def X. The best payoff possible given Def X would be 3 with Play C, so had we called Play C there would be no regret. But had we called Play B, we would have earned a -2 payoff, which equates to a regret of -5. In other words, we could have had 3, but we got -2. And had we called Play A, we would have earned a -4, which is a regret of -7.

If we repeat the regret calculation for each possible defense, we get a whole new regret matrix:

Regret Matrix

            Def X    Def Y    Def Z
Play A        -7        0        0
Play B        -5       -1       -4
Play C         0       -2      -11


Given this regret matrix, the minimax-regret criterion would look for the choice that assures us of the best worst-case scenario. For Play A, the worst regret is -7. For Play B, it is -5. And for Play C, it's -11. Therefore, we'd pick Play B because it is the least costly in terms of maximum possible regret.
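The regret calculation can be sketched the same way. This follows the sign convention used above, where regret is the payoff you got minus the best payoff available against that defense, so regrets are zero or negative. The payoffs are from the hypothetical matrix; the names are mine.

```python
payoffs = {
    "Play A": {"Def X": -4, "Def Y": 4, "Def Z": 12},
    "Play B": {"Def X": -2, "Def Y": 3, "Def Z": 8},
    "Play C": {"Def X": 3, "Def Y": 2, "Def Z": 1},
}
defenses = ["Def X", "Def Y", "Def Z"]

# Best payoff available against each defense.
best_vs = {d: max(payoffs[p][d] for p in payoffs) for d in defenses}

# Regret = what we got minus what we could have had (zero or negative).
regret = {p: {d: payoffs[p][d] - best_vs[d] for d in defenses} for p in payoffs}

# Minimax-regret: pick the play whose worst regret is the least bad.
worst_regret = {p: min(row.values()) for p, row in regret.items()}
regret_choice = max(worst_regret, key=worst_regret.get)

print(worst_regret)   # {'Play A': -7, 'Play B': -5, 'Play C': -11}
print(regret_choice)  # Play B
```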

Of course, coaches or anyone else would never actually draw up a matrix and do the math to make a decision. But just like in the picnic-bowling example, our brains are attempting poor analog versions of these kinds of decision criteria, and emotions play a large role.

Expected Utility

What if we reduce the uncertainty in the defense? We can't predict exactly which one we'll see, but we can estimate the probabilities that we can expect each defense. The expected utility of a choice is the weighted average of the possible payoffs. For simplicity, say each defense is equally likely with a 1 in 3 chance. Now we can estimate the expected utility for each play choice. In the example above, the expected utility for Play A is (1/3)(-4) + (1/3)(4) + (1/3)(12) = 4. The expected utility for Play B is 3, and for Play C it's 2. The expected utility method therefore says Play A is the best choice.
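Here's the same weighted-average calculation as a short sketch. The equal 1-in-3 probabilities are the simplifying assumption from the paragraph above, and I use exact fractions only to avoid rounding noise; change the probabilities and the ranking can change with them.

```python
from fractions import Fraction

payoffs = {
    "Play A": {"Def X": -4, "Def Y": 4, "Def Z": 12},
    "Play B": {"Def X": -2, "Def Y": 3, "Def Z": 8},
    "Play C": {"Def X": 3, "Def Y": 2, "Def Z": 1},
}

# Assumed for simplicity: each defense is equally likely.
prob = {"Def X": Fraction(1, 3), "Def Y": Fraction(1, 3), "Def Z": Fraction(1, 3)}

# Expected utility = sum over defenses of P(defense) * payoff.
expected_utility = {
    play: sum(prob[d] * row[d] for d in row) for play, row in payoffs.items()
}
best_play = max(expected_utility, key=expected_utility.get)

print({p: float(v) for p, v in expected_utility.items()})
# {'Play A': 4.0, 'Play B': 3.0, 'Play C': 2.0}
print(best_play)  # Play A
```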

The three methods each call for a different decision. Each method is logical and consistent in its own way, but there is only one truly correct method in football, only one prescriptive analysis. Remember, in football we can know the probabilities and the payoffs, or at least have a solid league-wide baseline for them. The expected utility method is the only correct method.

The math behind expected utility analysis couldn't be any easier. It's 5th grade arithmetic. The challenge is knowing the utility function. Yards, and even points, don't equate to utility. A 7-yard gain is usually good, but it's relatively useless on 3rd and 8. And a 3-point field goal doesn't help late in the 4th quarter when down by 7.

Fortunately, there is win probability (WP). WP is the one and only correct utility function for any game, including football. Winning is all that matters, whether by 1 point or 100 points. WP is also perfectly linear, which is essential to valid expected utility analysis. A 0.40 WP is exactly twice as good as a 0.20 WP, and 0.80 WP is twice as good as 0.40 WP.

Prospect Theory

But even if coaches were to somehow use expected WP analysis when making decisions (say by using 'quick reference' cards like they sometimes do for 2-point conversion decisions), it's likely they still wouldn't be very rational.

Prospect theory says that people fear losses more than they value equivalent gains. Humans evolved with a tendency to try to avoid loss. We're usually more upset with ourselves when we misplace a $20 bill than we are happy when one falls out of the laundry. This tendency has been borne out time and time again in clinical experiments and other studies.

In football, this means that decisions are warped because coaches would fear a loss in WP more than an equivalent gain in WP. The chart below illustrates this concept. According to prospect theory, the "joy" from a 0.05 gain in WP is less than the "pain" from a 0.05 loss in WP.


This asymmetry would affect tactical decisions in many ways, but the most obvious may be 4th down doctrine. Say a team finds itself in a situation where punting would result in a 0.50 WP, but the expected utility analysis says going for the conversion would result in a net 0.55 WP. If the goal is to win the game, the correct decision in this case is to go for it. Period.

The analysis isn't so straightforward for the coach (even if he could do all the math on the spot). Say the failed conversion results in a 0.45 WP and the successful conversion results in a 0.65 WP. A 50% chance at successful 4th down conversion therefore results in a net 0.55 WP.

But the coach sees the 0.45 WP as a possible loss of 0.05 WP, and he sees the 0.65 as a gain of 0.15 WP. Because he fears the loss far more than he values the potential gain, even one 3 times as large, he'll prefer the sure-thing option and punt.
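To make the asymmetry concrete, here's a toy sketch. It assumes the simplest possible loss-averse valuation: gains count at face value and losses are multiplied by a loss-aversion factor. That functional form and the factor values are my own illustrative assumptions, not measurements of any coach. The WP numbers are the ones from the example above.

```python
def loss_averse_value(outcomes, reference_wp, loss_aversion):
    """Felt value of a gamble, measured from a reference WP.

    outcomes: list of (probability, resulting WP) pairs.
    Gains count at face value; losses are scaled up by loss_aversion.
    """
    total = 0.0
    for prob, wp in outcomes:
        change = wp - reference_wp
        total += prob * (change if change >= 0 else loss_aversion * change)
    return total

punt_wp = 0.50                           # the sure-thing option
go_for_it = [(0.5, 0.65), (0.5, 0.45)]   # 50% convert, 50% fail

decisions = {}
for factor in (1.0, 2.0, 4.0):           # hypothetical loss-aversion factors
    value = loss_averse_value(go_for_it, punt_wp, factor)
    decisions[factor] = "go for it" if value > 0 else "punt"
    print(factor, round(value, 3), decisions[factor])
```

With a factor of 1.0 this is plain expected utility and the coach goes for it. Under this simple form, he only punts once losses loom more than 3 times larger than gains, since the potential gain here is 3 times the size of the potential loss.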

Further, it's possible to actually measure the risk aversion of coaches by comparing the WP advantages in situations where they went for the conversion to the WP advantages in situations where they chose to forgo the conversion attempt.

An Advantage

The coach who can resist this human tendency and make decisions based purely on expected utility will have an advantage. Just how big an advantage, no one can ever know. Actually, that's not true--I'll tell you right now. Just by following a pure expected utility analysis on 4th down, a coach would win an average of an extra 1.4 games per year.

I calculated this based on a play-by-play database from the past 9 seasons. For each 4th down in which a team kicked either a FG attempt or punt, I calculated the difference between going for it and kicking. Wherever the difference was positive, I summed the increase in WP for going for it. The grand total for nearly 2400 games was +203.1 WP, which equates to an increase of 0.17 WP for every game. But since there are always two teams competing in every game, this means that we need to halve that, which gives 0.086. The bottom line is that a pure expected utility approach to 4th down decisions would increase a team's chances of winning a game from 0.50 WP to about 0.59 WP. This is equivalent to an extra 1.4 wins per season (0.086*16).

That's a bold claim, I realize. But if you trust my WP model, which is really nothing more than a smoothed empirical observation of how often teams actually won in given game situations in real NFL games, then the claim is not so bold. It's not a perfect model, but the errors are unbiased, meaning it overestimates as much as it underestimates.

Still, if a coach only followed the expected utility recommendations when the WP for going for it was greater than 0.05 more than the WP for kicking, his team would still benefit by an extra 0.8 wins per season. That's nothing to sneeze at in a 16-game season.
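The tally itself is simple enough to sketch. The record layout, field names, and sample rows below are all hypothetical, made up to show the mechanics; none of them come from the actual play-by-play database. The optional threshold mirrors the 0.05 cutoff mentioned above.

```python
def forfeited_wp(decisions, threshold=0.0):
    """Total WP given up on 4th downs where the team kicked even though
    going for it was better by more than `threshold`."""
    total = 0.0
    for d in decisions:
        if d["choice"] in ("punt", "fg"):      # only count kicks
            edge = d["wp_go"] - d["wp_kick"]   # advantage of going for it
            if edge > threshold:
                total += edge
    return total

# HYPOTHETICAL sample rows, for illustration only.
sample = [
    {"choice": "punt", "wp_go": 0.58, "wp_kick": 0.40},
    {"choice": "fg",   "wp_go": 0.30, "wp_kick": 0.22},
    {"choice": "punt", "wp_go": 0.48, "wp_kick": 0.50},  # kicking was right
    {"choice": "go",   "wp_go": 0.55, "wp_kick": 0.50},  # not a kick
    {"choice": "fg",   "wp_go": 0.53, "wp_kick": 0.50},  # small edge
]

all_edges = forfeited_wp(sample)
big_edges = forfeited_wp(sample, threshold=0.05)
print(round(all_edges, 2))  # 0.29
print(round(big_edges, 2))  # 0.26

# Converting a per-team, per-game edge into wins over a 16-game season.
extra_wins = 0.086 * 16
print(round(extra_wins, 1))  # 1.4
```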

Are NFL Coaches Too Timid?

Risk is at the heart of football strategy. Aggressive, risky gameplans should result in boom-or-bust high-variance outcomes, sometimes scoring lots of points but sometimes scoring very few. Conservative gameplans result in relatively consistent low-variance outcomes. Teams would more likely score close to their average score.

In this post, I’ll look at what high and low variance strategies would look like in terms of point totals and how they affect each team’s chances of winning. I’ll also compare the theoretical strategies to the actual distributions in the NFL. We'll see why NFL coaches should be more aggressive when they're the underdog.

High Variance Strategy in Basketball

Some time ago, I came across an article posted by basketball researcher Dean Oliver that analyzed high and low variance strategies for the NBA. Oliver calculated the win probability of each opponent according to the mean and standard deviation (SD) of each team’s scoring tendencies. SD represents the degree of variance. The more aggressive and riskier the strategy, the higher the SD will be. For example, a basketball team that shoots lots of 3-pointers would have a high variance.

The key to accurately modeling basketball is realizing that each team’s score is correlated with that of its opponent. The pace of a basketball game ties each team’s score together, and there is a high level of covariance. When one team scores a high number of points, the other team will tend to score more too. Game scores are interdependent.

In Football

Recently the Smart Football blog illustrated the advantage of high variance strategies for underdogs. A high variance strategy increases an underdog’s chances of winning but comes with the cost of also increasing its chances of being blown out.

In the NFL as a whole, visiting teams average about 19 points with a SD of 10 points while home teams average about 23 points with a SD of 10 points. But unlike basketball, football opponent scores are negatively correlated. This makes intuitive sense because the better one team does, the worse the other should do. If one team gets lots of first downs and doesn’t commit turnovers, its opponent will usually start drives with poor field position, and vice versa. The covariance between NFL opponent scores is -1.9 points-squared.

If NFL scores were normally distributed, this is what the typical score distribution would look like. The visitor scores are in red and the home scores are in blue.


We can calculate each team's chances of winning by summing all the probabilities with these distributions and factoring in the covariance using Dean Oliver’s method. This estimates that the home team wins 56.5% of the time, which happens to be exactly the actual NFL home field advantage.

Disclaimer

There’s one problem. NFL scores are not normally distributed, primarily because of the sport’s unique scoring, which typically comes in chunks of 3 or 7. Here is what the actual distribution of scores looks like.


The good news is, if we group the scores into bins of 7 points, we get a quasi-normal distribution. (Technically, it may be more of a gamma or Poisson distribution.) I’m going to stick with normal distributions to simplify the math and to better illustrate the concepts I want to convey.
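The binning step is simple enough to sketch. The helper `bin_scores` and the sample scores below are made up for illustration and are not real NFL data. The point is only to show how grouping by 7 smooths out the spikes at common totals.

```python
from collections import Counter

WIDTH = 7  # bin width in points, matching the text


def bin_scores(scores, width=WIDTH):
    """Count scores per bin; each key is the lowest score in its bin."""
    counts = Counter((s // width) * width for s in scores)
    return dict(sorted(counts.items()))


# Made-up sample of team scores, for illustration only (not real NFL data).
sample_scores = [3, 10, 13, 17, 17, 20, 21, 24, 24, 27, 31, 38]
for low, count in bin_scores(sample_scores).items():
    print(f"{low:2d}-{low + WIDTH - 1:2d}: {'#' * count}")
```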



Demonstration

Here’s why underdogs should play aggressive and risky gameplans. Take an example where one team is a 7-point favorite over its underdog opponent. Say the favorite would average 24 points and the underdog would average 17 points. With a SD of 10 points for each team, the underdog upsets the favorite 31.5% of the time. The favorite’s scoring distribution is blue and the underdog’s is red.


But if the underdog plays a more aggressive high-variance strategy, increasing its SD to 15 points, it would upset the favorite 35.3% of the time.


Note that I haven’t increased the underdog’s average score in any way, just its variance. The increase in its chance of winning results from more of its probability mass moving to the right of the favorite’s mean score of 24. In fact, the higher the variance, the wider the probability mass will be spread. Consequently, more mass will be to the right side of the favorite’s average score. But more mass will also be to the left, meaning there is a higher risk of an embarrassing blowout.
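Both sides of that trade-off show up in a quick Monte Carlo sketch. A few assumptions of mine, not from the original analysis: the two scores are drawn as independent normals (ignoring the small negative covariance, so the rates will differ a little from the figures above), fractional and negative scores are allowed, and a "blowout" is arbitrarily defined as losing by 21 or more.

```python
import random


def simulate(fav_mean, fav_sd, dog_mean, dog_sd, n_games=100_000,
             blowout_margin=21, seed=0):
    """Simulate independent normal scores; return (upset rate, blowout rate)."""
    rng = random.Random(seed)
    upsets = blowouts = 0
    for _ in range(n_games):
        fav = rng.gauss(fav_mean, fav_sd)
        dog = rng.gauss(dog_mean, dog_sd)
        if dog > fav:
            upsets += 1
        elif fav - dog >= blowout_margin:
            blowouts += 1
    return upsets / n_games, blowouts / n_games


conventional = simulate(24, 10, 17, 10)  # underdog SD of 10
aggressive = simulate(24, 10, 17, 15)    # underdog SD of 15
print("SD 10: upset %.3f, blowout loss %.3f" % conventional)
print("SD 15: upset %.3f, blowout loss %.3f" % aggressive)
```

Raising the underdog's SD should increase both numbers: more upsets, but also more lopsided losses.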

Even if employing a high-variance strategy is suboptimal in terms of points, it can still help an underdog. In other words, even if an aggressive gameplan results in an overall reduction in average points scored, it often still results in a better chance of winning.

The next graph plots the scoring distributions of just such a scenario. Like before, the favorite’s average score is 24 with a SD of 10. But this time the underdog’s average is reduced from 17 to 16. The increase in variance still results in a slightly better chance of winning despite its overall reduction in average points scored. In this case, it's 33.2% for the underdog.
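How much scoring can an underdog afford to give up? Under the same normal approximation there is a simple break-even rule: the upset probability depends only on the ratio of the scoring gap to the SD of the margin, so holding that ratio fixed gives the lowest average at which the riskier plan still matches the conservative one. This is a sketch under my own assumptions (zero covariance by default, hypothetical function names), not part of the original analysis.

```python
from math import sqrt
from statistics import NormalDist


def upset_probability(fav_mean, fav_sd, dog_mean, dog_sd, cov=0.0):
    """P(underdog outscores favorite) under jointly normal scores."""
    margin_sd = sqrt(fav_sd ** 2 + dog_sd ** 2 - 2.0 * cov)
    return NormalDist().cdf((dog_mean - fav_mean) / margin_sd)


def break_even_mean(fav_mean, fav_sd, dog_mean, dog_sd, new_dog_sd, cov=0.0):
    """Underdog average at which the riskier plan only matches the old odds."""
    old_sd = sqrt(fav_sd ** 2 + dog_sd ** 2 - 2.0 * cov)
    new_sd = sqrt(fav_sd ** 2 + new_dog_sd ** 2 - 2.0 * cov)
    # Equal upset odds means equal z-scores:
    # (fav_mean - new_mean) / new_sd == (fav_mean - dog_mean) / old_sd
    return fav_mean - (fav_mean - dog_mean) * new_sd / old_sd


floor = break_even_mean(24, 10, 17, 10, new_dog_sd=15)
print(f"Break-even underdog average: {floor:.2f}")
```

With these inputs the break-even average comes out near 15.1 points, so under this approximation an aggressive plan that drops the underdog from 17 to 16 still comes out ahead.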


What about the favorite? Should it increase its variance in response to an aggressive underdog? No. Ideally it should play as consistently as possible. The lower the variance, the better for the favorite. The next example shows a favorite playing a low-variance game with an average of 24 points and a SD of 5 points. The underdog is playing conventionally with a 17-point average and a 10-point SD. The result is an increase in the favorite’s chances of winning from 68.5% in the original example (the complement of the underdog’s 31.5%) to 73.0%.


And if the underdog plays an aggressive high-variance game, the low-variance strategy is still better for the favorite. In this case the favorite still improves its chances of winning from 64.7% to 67.8%.
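To see the pattern across all the combinations at once, here is a small grid under the normal approximation. The assumptions are mine: zero covariance, a hypothetical function name, and a favorite SD of 15 added purely for contrast, so the values will not match the percentages above exactly.

```python
from math import sqrt
from statistics import NormalDist


def favorite_win_probability(fav_sd, dog_sd, fav_mean=24, dog_mean=17, cov=0.0):
    """P(favorite outscores underdog) under jointly normal scores."""
    margin_sd = sqrt(fav_sd ** 2 + dog_sd ** 2 - 2.0 * cov)
    return NormalDist().cdf((fav_mean - dog_mean) / margin_sd)


# Favorite SDs of 5, 10, 15 against underdog SDs of 10 and 15.
grid = {
    (fav_sd, dog_sd): favorite_win_probability(fav_sd, dog_sd)
    for fav_sd in (5, 10, 15)
    for dog_sd in (10, 15)
}
for (fav_sd, dog_sd), p in sorted(grid.items()):
    print(f"favorite SD {fav_sd:2d}, underdog SD {dog_sd:2d}: {p:.3f}")
```

Whatever the underdog does, the favorite's win probability falls as its own SD rises, because any extra spread in the margin pushes the result back toward a coin flip.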


In Practice

So what does any of this mean in the real world? Simply put, to win more often underdogs should employ a high-variance strategy from the beginning of the game. They shouldn’t wait until the 4th quarter, when they have become desperate. Go for it on 4th and short, run trick plays, throw deep, and blitz more often. Roll the dice from the get-go.

The real question is, what is the optimum level of risk? I’m not sure, but I do know NFL coaches are operating far from it.

Looking at games from the ’02 through ’06 seasons (a total of 1,280 games), underdogs do not increase their variance. For example, for games in which the point spread is between 6 and 7.5 points, the underdog’s SD is 9.8 points, slightly less than the overall league average. Ideally, it should be higher. The favorite’s SD is 10.4 points when ideally it should be lower.

The table below lists the SDs of points scored for the favorite and underdog according to the most common point spreads.

Spread       Favorite SD   Underdog SD
0 - 1.5          9.6          10.5
2 - 3.5          9.8           9.4
6 - 7.5         10.4           9.8
10 - 11.5       10.5           8.7



If anything, there appear to be slight trends in exactly the wrong directions. The bigger the spread, the smaller the underdog’s variance and the bigger the favorite’s variance. It appears underdogs may get less aggressive while favorites may get more aggressive.

Conclusions

This is more evidence coaches do not coach to maximize their team’s chances of winning. My theory is coaches are delaying elimination until the latest point in the game—that is, trying to “stay in the game” for as long as possible. Underdog coaches minimize risk all game long hoping for a miracle along the way. They seem to be reducing the chances of being blown out, but this is not consistent with giving their team the best chance to win.

But if you think about it, this kind of approach might be good for the NFL as a whole. It keeps games entertaining as long as possible, and keeps viewers tuned in.

Coaches of favored teams could be accused of the same crime. They might be playing with too much variance. But there is certainly a limit to just how consistent a team can be, no matter how hard it tries. There will always be random variation in team performance. I suspect a SD of 10 points may be near that limit, and that coaches of both favorites and underdogs simply play the least risky game they can consistent with accepted conventions.