NFL Preseason Rankings Methodology

Our NFL preseason rankings (and associated team ratings) are driven by objective data we've found to be predictive. Here's how we make them.

May 15, 2024 - by Jason Lisk

Jalen Hurts has been a big factor for the Eagles (Photo by Stephen Lew/Icon Sportswire)

This post describes our methodology and process for creating NFL preseason rankings for all 32 teams.

We’re data people, so as one should expect, our NFL preseason rankings are primarily driven by stats and modeling and not less objective methods like film study or media scouting reports.

(That’s not to broadly denigrate more subjective methods of analysis. But when it comes to preseason NFL rankings, many narratives exist that aren’t supported by hard data.)

Before we dive into the details of our approach, let’s first cover a few basics.

What Our NFL Preseason Rankings Represent

It’s important to know that our preseason rankings simply represent the rank order of preseason predictive ratings that we generate for every NFL team.

So the first step in our process is to calculate preseason team ratings.

Predictive Rating Definition

In simple terms, an NFL team’s predictive rating is a number that represents the margin of victory we expect when that team plays a “perfectly average” opponent on a neutral field.

This rating can be a positive or negative number; the higher the rating, the better the team. A rating of 0.0 indicates a perfectly average team.

How Ratings Translate To Predictions

Because our NFL predictive ratings are measured in points, the difference in rating between any two teams indicates the projected winner and margin of victory in a neutral-site game between them.

For example, our system would expect Kansas City, which has a 2024 preseason rating of +6.0, to beat an average NFL team by about six points on a neutral field.

It would expect Kansas City to beat Carolina, which has a -8.1 rating, by about 14 points on a neutral field. And Carolina would be expected to lose to an average team by about eight points.
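The arithmetic behind these examples is just a subtraction, and can be sketched in a few lines of Python. The function name `predicted_margin` is an illustrative assumption, not a piece of our actual system; the ratings are the 2024 preseason values quoted above.

```python
def predicted_margin(rating_a: float, rating_b: float = 0.0) -> float:
    """Projected neutral-field margin for team A over team B, in points.

    A rating of 0.0 stands in for a "perfectly average" opponent.
    """
    return rating_a - rating_b


kansas_city = 6.0   # 2024 preseason rating quoted in the text
carolina = -8.1     # 2024 preseason rating quoted in the text

print(round(predicted_margin(kansas_city), 1))            # Kansas City vs. an average team
print(round(predicted_margin(kansas_city, carolina), 1))  # Kansas City vs. Carolina
print(round(predicted_margin(carolina), 1))               # Carolina vs. an average team
```

A negative result simply means the first team is the projected loser.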

Ratings Are More Precise Than Rankings

Understanding the nature of a predictive rating is helpful, because it is more precise than a ranking.

For example, Cowboys fans may feel slighted to see their team at No. 9 in our 2024 preseason rankings. However, Dallas’ predictive rating is 3.3, only 0.9 points lower than No. 5 Cincinnati’s rating.

So yes, if you put a gun to our head and told us to rank order every team, we’d say Cincinnati is going to be a better team than Dallas over the course of this season. But the difference is fairly small.

Now take the Pittsburgh Steelers, who sit at No. 13 in our 2024 preseason rankings. That’s four spots behind the Cowboys, just as the Cowboys are four spots behind the Bengals, but the gap is 2.0 rating points. The No. 9 Cowboys are closer to the No. 5 Bengals than to the No. 13 Steelers.

So there’s a clear tier there, with a sizable gap between the No. 9 and No. 13 spots.

In short, don’t place too much stock in a team’s ranking. Ratings tell the more refined story.
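To make the rank-versus-rating point concrete, here is a small sketch comparing the two four-spot gaps discussed above. Only Dallas' 3.3 rating is quoted directly; the Cincinnati and Pittsburgh values are implied by the 0.9-point and 2.0-point gaps mentioned in the text, and the helper name is an assumption for illustration.

```python
# 2024 preseason ratings. Dallas is quoted in the text; the other two
# are implied by the rating gaps described above.
ratings = {
    "Cincinnati": 4.2,  # No. 5: Dallas plus the 0.9-point gap
    "Dallas": 3.3,      # No. 9
    "Pittsburgh": 1.3,  # No. 13: Dallas minus the 2.0-point gap
}


def rating_gap(better: str, worse: str) -> float:
    """Difference in rating points between two teams."""
    return round(ratings[better] - ratings[worse], 1)


# Both pairs are four ranking spots apart, but the rating gaps differ.
print(rating_gap("Cincinnati", "Dallas"))
print(rating_gap("Dallas", "Pittsburgh"))
```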

When and Why We Make NFL Preseason Ratings

Once the season starts, our predictive ratings go on autopilot. As game results from NFL Week 1 and beyond come in, our system automatically adjusts team ratings (and the resulting rankings) each morning with the results of the previous day’s games.

Teams that win by more than the ratings had predicted see their ratings increase. Teams that suffer worse than expected losses see their ratings drop. Software code controls all of the adjustments and no manual intervention is required.
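The direction of these automatic adjustments can be sketched as follows. This post does not spell out the actual update formula, so the linear rule and the `step` size below are assumptions chosen only to show the idea; the sketch also ignores home-field advantage.

```python
def update_rating(rating: float, opponent_rating: float,
                  actual_margin: float, step: float = 0.1) -> float:
    """Nudge a rating toward what the latest result implies.

    `step` is a hypothetical learning rate, not a value from our system.
    """
    expected_margin = rating - opponent_rating   # what the ratings predicted
    surprise = actual_margin - expected_margin   # positive = better than expected
    return rating + step * surprise


# A +3.0 team expected to beat an average team by 3 wins by 10 instead.
print(round(update_rating(3.0, 0.0, actual_margin=10.0), 2))
# The same team loses by 7, a worse-than-expected result.
print(round(update_rating(3.0, 0.0, actual_margin=-7.0), 2))
```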

Generating preseason ratings, however, involves a more labor-intensive process that we go through before every new season starts. In short, we are trying to pre-calibrate our NFL predictive ratings system. We want to give it a smarter starting point than simply having every team start the season with a 0.0 rating.

Put another way, our preseason ratings are our first prediction of what we think every NFL team’s predictive rating will be at the end of the upcoming season. And we need to make that prediction before any regular season games are played.

Despite being a substantial challenge from a data perspective, our approach to this process is still mostly data-driven and objective. However, there are some judgment calls incorporated, which we’ll explain below.

A Brief History of Our NFL Preseason Ratings

Before we get into the details, it helps to give a brief history of how and why our current preseason ratings process evolved:

In the way old days (early 2000s), every team would start the season with a 0.0 rating, and we’d put a note on the site not to trust our ratings until Week 5 or Week 6. Before then, with such a tiny sample size of games, big surprises or lopsided results could produce some really funky ratings.
In the semi old days (mid to late 2000s), we started having each team begin the season with its end of season rating from the prior year. Until Week 5, the impact of the prior year rating would gradually decay to zero, and by Week 6, we’d only consider current season results. Better, but still not the best.
Starting in 2011, we implemented the general framework we still use today. We looked at years of historical data and built a customized model to generate NFL preseason ratings. This approach is completely divorced from our automated in-season ratings updates.

Why we took that final step is simple. Generating preseason team ratings using a customized model significantly improved the in-season game predictions made by our NFL ratings — and not only in early season games, where one would logically expect to see the biggest improvement.

In fact, still giving the preseason ratings some weight even at the very end of the season improved our prediction performance over the final weeks, too.
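The difference between the older "decay to zero" approach and the current one, where the preseason rating keeps some weight all season, can be sketched as a weighting schedule. The linear decay shape and the `floor` value of 0.15 are assumptions for illustration only; the real weights are not given in this post.

```python
def prior_weight(week: int, decay_weeks: int = 5, floor: float = 0.0) -> float:
    """Weight on the pre-season prior entering a given week (1-indexed).

    With floor=0.0 the prior fades out completely by Week 6, as in the
    "semi old days" approach. A positive floor keeps some prior weight
    through the end of the season.
    """
    linear = 1.0 - (week - 1) / decay_weeks
    return max(floor, linear)


def blended_rating(prior: float, in_season: float, week: int,
                   floor: float = 0.0) -> float:
    """Mix the prior rating with a rating built from current-season games."""
    w = prior_weight(week, floor=floor)
    return w * prior + (1.0 - w) * in_season


for week in (1, 3, 6, 17):
    print(week, prior_weight(week), prior_weight(week, floor=0.15))
```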

When We Make NFL Preseason Ratings

During every offseason, we first put in work to improve our preseason ratings methodology. We investigate new potential data sources and refit our preseason ratings model using an additional year of data.

After implementing any offseason refinements to our process and model, we then gather the necessary data and generate our preseason ratings for the upcoming season. In the past, we typically completed this process a week or two before the NFL regular season started. However, since 2021 we’ve been getting an earlier jump on it, and doing our research in April and May.

We then do our initial preseason ratings release in mid-May, right after the NFL releases the full schedule for the upcoming season, and make refinements as warranted based on developments that still occur before the NFL regular season starts.

How We Make NFL Preseason Ratings

Now, let’s get to the meat. By analyzing years of NFL data and using more than a decade of recent season results, we’ve identified a short list of descriptive factors that have correlated strongly with end-of-season team power ratings.

We use multiple regression models to determine each factor’s weight in our preseason ratings. Each model uses a slightly different set of predictive factors, and we weight the contribution of the models a bit differently depending on some team characteristics. For example, one model might be more accurate when projecting teams with new quarterbacks, while another might be better for teams with the same quarterback returning.

As a result, the relative importance of each factor is based on its demonstrated level of predictive power, for teams of a certain type.
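A minimal sketch of that idea: two linear models with different factor weights, blended differently depending on whether a team has a new quarterback. Every number below (coefficients, blend weights, factor values) is hypothetical and chosen only to make the mechanics visible.

```python
from dataclasses import dataclass


@dataclass
class LinearModel:
    intercept: float
    coefs: dict  # factor name -> regression weight

    def predict(self, factors: dict) -> float:
        return self.intercept + sum(
            weight * factors.get(name, 0.0) for name, weight in self.coefs.items()
        )


def ensemble_rating(factors: dict, new_qb: bool,
                    model_new_qb: LinearModel,
                    model_returning_qb: LinearModel) -> float:
    """Blend two models, leaning on the one suited to the team's QB situation."""
    w_new = 0.7 if new_qb else 0.3  # hypothetical blend weights
    return (w_new * model_new_qb.predict(factors)
            + (1.0 - w_new) * model_returning_qb.predict(factors))


# Hypothetical models: last season matters less when the QB is new.
model_new = LinearModel(0.0, {"last_season": 0.4})
model_returning = LinearModel(0.0, {"last_season": 0.6})
team = {"last_season": 5.0}

print(round(ensemble_rating(team, True, model_new, model_returning), 2))
print(round(ensemble_rating(team, False, model_new, model_returning), 2))
```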

The statistical bar for giving a specific metric the nod as having predictive power is high, and most stats don’t make it in. But here are some of the factors we do incorporate, each of which we briefly explain below:

Last Season Performance*
Recent Franchise Performance*
Offensive Player Ages*
Betting Market Data*
Quarterback
Luck
Coach
NFL Draft Data

In our 2024 preseason rating post, the factors marked with an asterisk (*) above are rolled up into one “BASELINE” category, to make the values easier to understand. Because recent power ratings and age matter more for some types of teams than for others, breaking the factors down at that granular level results in some counterintuitive-looking numbers. However, we’ll briefly describe each of them separately here.

(By the way, all of these stat factors are based on past “regular season plus postseason” numbers. We don’t incorporate any game results or stats from NFL preseason games into the model.)

Last Season’s Performance

How good a team was in the most recent season — as measured by end-of-season predictive rating and not win-loss record — is the single best objective measure of how good that team will be in the upcoming season.

The year-to-year correlation coefficient for our predictive rating is +0.48. In non-stat geek terms: no other non-betting-market factor we’ve investigated comes close to being as important as this one.

That said, other factors do contribute to the final 2024 preseason rating of each NFL team.

Recent Franchise Performance

This factor measures how good a team has been in recent history, not including the previous season.

Since each NFL regular season only includes 17 games, capturing more games further in the past still has some relevance. A team may have had an injury-filled season or just an unlucky year, and looking to the deeper past can distinguish that team from another that has been consistently bad for several years.

So what happened two or three years earlier still has some relationship to this year’s preseason rating. The impact on preseason ratings isn’t nearly as strong as the most recent season, since a lot of the info conveyed by the older ratings is already conveyed by the rating from last season. But it still has some predictive power.

And this factor matters a little more for teams with unstable quarterback situations, where the QB is new or simply not elite. In those scenarios, the QB can’t be counted on to carry the team, and having a strong supporting cast is relatively more important.

(For the statisticians in attendance, the correlation between final predictive ratings in a given year and those from two seasons earlier is +0.38. The correlation with ratings from three seasons earlier is still +0.23.)
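For readers who want to run this kind of check on their own ratings data, here is a sketch of how a lagged correlation could be computed. The data layout (`{season: {team: rating}}`) and function names are assumptions, and the toy numbers are made up purely to exercise the code; they are not our ratings.

```python
from math import sqrt


def pearson(xs: list, ys: list) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def lagged_correlation(ratings_by_season: dict, lag: int) -> float:
    """Correlate each team's final rating with its rating `lag` seasons earlier."""
    earlier_vals, later_vals = [], []
    for season, ratings in ratings_by_season.items():
        earlier = ratings_by_season.get(season - lag)
        if earlier is None:
            continue  # no season that far back in the data
        for team, rating in ratings.items():
            if team in earlier:
                earlier_vals.append(earlier[team])
                later_vals.append(rating)
    return pearson(earlier_vals, later_vals)


# Made-up toy data, for illustration only.
toy = {
    2021: {"A": 4.0, "B": -2.0, "C": 1.0},
    2022: {"A": 2.5, "B": -3.0, "C": 2.0},
    2023: {"A": 3.0, "B": 0.5, "C": -1.0},
}
print(round(lagged_correlation(toy, lag=1), 2))
print(round(lagged_correlation(toy, lag=2), 2))
```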

Player Ages

This probably won’t come as a huge shock to many of you, but — young teams tend to improve a bit more year-over-year than old teams do.

There are a couple minor wrinkles to this. Age matters more on teams with a new quarterback. Super young teams do a bit worse than expected. But the general idea is, having a youthful team, especially on offense, is a good thing.

Betting Market Data

We’ve always used betting market data to some extent in our preseason NFL ratings. In the past, we’d run season projections with our initial ratings and then compare those to season win totals, see where we were way off, and then dig in to see if we thought we should make an adjustment towards the market.

Since 2021 we’ve been taking the logical next step and incorporating information from those win totals directly into the model. This is similar to the approach we take with our game predictions. Essentially, the win total information provides part of a baseline projection, with the other part coming from a mix of our past predictive rating data, adjusted for team aging patterns.

The QB, Coach, Luck, and Draft factors start with that baseline, and shift our ratings up or down a bit. But by using the market data directly in the baseline, we avoid being too “off market” in cases where there are factors that we haven’t accounted for, but the market has.
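The structure described here (a baseline that blends market and historical information, plus a handful of adjustments) can be sketched as below. The 50/50 blend weight, the idea of a pre-converted "market-implied rating," and all sample values are assumptions for illustration; the actual model weights are fit by regression and are not published in this post.

```python
def baseline_rating(market_implied_rating: float, history_rating: float,
                    market_weight: float = 0.5) -> float:
    """Blend a rating implied by win totals with an age-adjusted historical rating.

    `market_weight` is a hypothetical value, not our fitted weight.
    """
    return (market_weight * market_implied_rating
            + (1.0 - market_weight) * history_rating)


def preseason_rating(baseline: float, qb: float = 0.0, coach: float = 0.0,
                     luck: float = 0.0, draft: float = 0.0) -> float:
    """Start from the baseline and shift it by each adjustment factor."""
    return baseline + qb + coach + luck + draft


base = baseline_rating(market_implied_rating=4.0, history_rating=2.0)
print(round(preseason_rating(base, qb=0.5, coach=-0.3), 2))
```

Because the market sits inside the baseline, the adjustments only need to move a rating "a bit" rather than carry the whole projection.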

Quarterback

Our quarterback factor accounts for both how good a team’s quarterback is and whether that’s an improvement or a decline from past seasons. (Note that this was a change in 2020, versus earlier seasons. Prior to 2020, the QB component only reflected how a team’s quarterback contribution was expected to change compared to past seasons, and did not factor in the absolute level of quarterback performance.)

To create this factor, we first identify the likely starting quarterback (or quarterbacks, plural, if there isn’t a clear top starter) for each team, and make a projection of their performance based on a weighted average of their recent stats. For rookies, we use a simple model based on their draft position and college passer efficiency rating to project their NFL performance. For players who have suffered injuries or missed time for other reasons, we regress their raw stats toward their career averages and/or their rookie projections.
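Two of those steps, the weighted average of recent seasons and the regression toward a career average after missed time, can be sketched like this. The season weights, the games-played shrinkage rule, and the sample efficiency numbers are all hypothetical; the rookie model is not shown.

```python
def weighted_recent(stats: list, weights: tuple = (0.5, 0.3, 0.2)) -> float:
    """Weighted average of per-season QB stats, most recent season first.

    If a player has fewer seasons than weights, only the available
    seasons are used and the weights are renormalized.
    """
    pairs = list(zip(stats, weights))
    total_weight = sum(w for _, w in pairs)
    return sum(s * w for s, w in pairs) / total_weight


def regress_to_anchor(raw: float, anchor: float, games_played: int,
                      full_season: int = 17) -> float:
    """Shrink a raw stat toward an anchor (e.g. career average) when games were missed."""
    share = min(games_played, full_season) / full_season
    return share * raw + (1.0 - share) * anchor


# Hypothetical efficiency numbers for a veteran's last three seasons.
print(round(weighted_recent([100.0, 90.0, 80.0]), 1))
# A QB who played only 8 games is pulled partway toward his career average.
print(round(regress_to_anchor(raw=105.0, anchor=90.0, games_played=8), 1))
```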

This is the core of the QB component. You might wonder if this gives teams “extra credit” for returning a good quarterback. After all, isn’t that player’s good performance already captured in the Last Season Performance component?

Well, it turns out that when we include projected QB performance in the model, the weight of the Last Season Performance component decreases. Essentially, part of the “credit” for the previous power rating is now being assigned to the quarterback. Consider a case of two good teams with identical ratings, one of which is driven by great QB play, and the other by great defensive play covering for a mediocre quarterback. In our model, the team with the star quarterback would have a higher predicted rating the following season, all else being equal.

The model is essentially saying that quarterback performance is more consistent than most other aspects of team performance.

Also, since 2021 we’ve incorporated information about the quarterback’s age. Essentially, quarterbacks improve more season-to-season when they’re young, so having a 26-year-old quarterback gives a bit of a boost compared to a 30-year-old quarterback. Though there’s a limit to this, and extremely young rookie quarterbacks have not fared so well.

Luck

This component reflects the expected change in team rating due to (surprise!) luck-related factors. Several stat categories are highly impacted by luck, or not very reproducible for other reasons.

For example, it’s become fairly common knowledge in recent years — at least among those who closely follow statistical football analysis* — that turnovers in the NFL have a large component of randomness associated with them.

Some years, when defensive backs tip passes and running backs drop balls on the turf, a team just happens to have a lot of lucky bounces go its way. Other years, the opposite happens. These types of things will never be highly controllable or predictable.

As a result, turnover luck tends to regress toward average from one season to the next. Less commonly known, however, is that there are also some defensive stats that act in much the same way (third down conversion rate being one of them).
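Regression toward average can be sketched with a single "persistence" parameter: the share of a team's deviation from league average that is expected to carry over. The value of 0.2 below is a placeholder assumption, not a figure from our research.

```python
def regress_toward_mean(observed: float, league_mean: float = 0.0,
                        persistence: float = 0.2) -> float:
    """Expected next-season value of a luck-heavy stat.

    persistence=0 means pure luck (expect league average next year);
    persistence=1 means the stat repeats fully. 0.2 is hypothetical.
    """
    return league_mean + persistence * (observed - league_mean)


# A team that finished +15 in turnover margin is projected much closer to zero.
print(round(regress_toward_mean(15.0), 1))
# The same logic pulls an unlucky -10 team back up toward average.
print(round(regress_toward_mean(-10.0), 1))
```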

*Note: At least one former NFL head coach vehemently disagrees. In fact, he got into a heated lunch discussion about it several years back with our 20-year-old intern, who didn’t even make his high school football team but politely refused to back down from his data-driven position. It was quite amusing. Hope all is well, coach Mangini.

Coach

This factor relates to recent changes in coaching, as well as to situations where we think the market and our models may not fully account for how important a coach is to a given team. It is not a measure of how good or bad a coach is.

In other words, you might have thought we were crazy last year when we gave Houston, with DeMeco Ryans, a better coach adjustment than Tennessee, with veteran Mike Vrabel. Vrabel had been a good coach in Tennessee, while Ryans was a new head coach in Houston, taking over a team that had been among the worst in the NFL for two straight years.

The metric largely measures projected changes in coaching impact. For example, historically, teams that perform very poorly in a given year tend to improve the next year under a new, first-year NFL coach. Hence the positive coaching adjustment we had in 2023 for the Texans, with Ryans taking over.

However, most first-year NFL coaches, especially those who didn’t have a prior head coaching gig in college or the CFL, get a negative coaching adjustment. Our research shows that unless a team was quite bad the previous year, it usually sees a decline during the inaugural year of a new head coach. That’s why you see a difference in how the coaching adjustment is applied to teams with first-year coaches. Seattle and Tennessee, two teams with new head coaches that have been solid in recent history, get more of a negative coaching adjustment than Washington, a team that finished dead last in our power ratings last season.

For second-year coaches, the opposite is true. Overall, second-year coaches tend to show an improvement in team performance, and you can see that relationship reflected in our 2024 ratings.

However, the second-year head coach effect is again not uniform. In this case, it’s the coaches that delivered at least a respectable, and maybe even an improved first season that see the biggest increases in rating contribution.

Doug Pederson, for example, won a Super Bowl in his second season in Philadelphia, after the team showed positive improvement in his first year as coach. And the same goes for Bruce Arians in 2020 with Tampa Bay. That’s why you see positive coaching adjustments for teams that improved last year with rookie head coaches, like Denver, Houston, and Indianapolis.

Draft

This factor accounts for a team’s past draft picks maturing into valuable, contributing players.

We’ve looked at general draft value from a lot of different angles, in a lot of different models, and it always ends up shaking out basically the same — having good draft classes from a few years ago is a better predictor of success than having a good draft in the most recent couple of drafts.

We think there are two main reasons for this.

First, players from the most recent draft simply don’t play all that much, usually. Sure, there are exceptions. But when looking at broad measurements of general draft pick accumulation, the exceptions don’t seem to move the needle.

Second, teams with high draft picks in the most recent draft tend to be, well, bad teams. That’s why they were picking so high. Whatever factors led to them being bad in the most recent season don’t usually just disappear the next season. So this downward pull probably counteracts, to some extent, any upward benefit of the most recent draft picks.

A couple seasons down the road, the draft picks are playing more, the negative factors have been improved on some, and the strong draft class starts to be a positive indicator.

Step 2: Review & Refine The Initial Results

After our model generates its 100% data-driven NFL preseason ratings, we then run a series of season projections (simulating the regular season and the playoffs 10,000 times) and compare the distribution of outcomes in our simulations to the betting markets.
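A stripped-down version of this kind of simulation is sketched below. It only simulates one team's regular-season win total, and it assumes game margins are normally distributed around the rating difference with a standard deviation of 13.5 points. That figure, the absence of home-field advantage, ties, and playoffs, and all function names are simplifying assumptions, not a description of our simulator.

```python
import random
from statistics import NormalDist

MARGIN_SD = 13.5  # assumed spread of actual margins around the predicted margin


def win_probability(rating: float, opp_rating: float, sd: float = MARGIN_SD) -> float:
    """Chance the final margin is above zero on a neutral field."""
    return 1.0 - NormalDist(mu=rating - opp_rating, sigma=sd).cdf(0.0)


def simulate_win_totals(rating: float, opponent_ratings: list,
                        n_sims: int = 10_000, seed: int = 0) -> list:
    """Simulate the schedule n_sims times and return each season's win count."""
    rng = random.Random(seed)
    probs = [win_probability(rating, opp) for opp in opponent_ratings]
    return [sum(rng.random() < p for p in probs) for _ in range(n_sims)]


def prob_over(win_totals: list, line: float) -> float:
    """Share of simulated seasons that finish above a win total line."""
    return sum(w > line for w in win_totals) / len(win_totals)


# A +6.0 team against 17 hypothetical average opponents.
totals = simulate_win_totals(6.0, [0.0] * 17)
print(round(sum(totals) / len(totals), 1))  # average simulated wins
print(round(prob_over(totals, 10.5), 2))    # share of seasons over 10.5 wins
```

The simulated distribution, rather than a single projected win number, is what gets compared against market win totals.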

If our assessment of a specific team seems way out of whack in comparison to the market, we’ll investigate more. Primarily, we’re looking to identify some factor not taken into account by our model (e.g. a major injury, notable personnel changes, or a unique coaching situation) that is likely to impact the expected performance level of a team.

In some of those cases, we end up adjusting a team’s rating to be closer to the market. As a result, this final part of the process does inject some subjective judgment calls into our process.

In the past, some of these final subjective adjustments have been rather large. However, in 2021 we started using betting market data directly in the model. As a result, we no longer have such big discrepancies between our projections and the market, so the size and number of these manual adjustments have decreased.

Why Adjust NFL Ratings Manually?

It typically takes a good amount of convincing for us to incorporate some level of subjectivity into a prediction process.

With only 32 teams and a 17-game NFL season, though, there’s a lot of uncertainty to deal with. As we mentioned earlier, there’s a very high statistical bar to reach in order to anoint a particular stat as predictive of future team performance, and very few stats pass the test.

That’s a good thing. One of the biggest challenges of predictive modeling is filtering out the signal from the noise, and “false positives” based on small sample sizes can ruin the future accuracy of a model.

At the same time, the factors that do make it into the model aren’t perfect, and we try to keep that in mind. There will be cases where our stat factors don’t tell the full story. When we’re aware of that, our best solution for the foreseeable future may be to make manual adjustments, especially when those adjustments bring us more in line with the opinion of the betting markets.

Side Note: We Still Take Stands…

As a final point, it’s important to remember that predicting how good an NFL team will be before the season starts is one area where the betting market has proven to be a good predictor overall, as one would expect. But that doesn’t mean it’s perfect.

And while our methodology has its blind spots, it is rooted in a level of statistical rigor that goes significantly beyond what most other rankings makers apply. So while we do make some subjective final adjustments, even in most outlier cases we don’t often adjust our numbers to exactly match the market.

And those stands tend to work out, on average. From 2011 to 2021, when our preseason projected win distribution indicated there was at least 10% “return on investment” value on betting either the Over or the Under on an NFL team win total, our preseason predictions’ implied picks were up +14.5 units. Roughly one unit of profit per year isn’t going to make you rich, but we view it more as an indicator that we’re doing something right.
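For reference, here is one common way to compute the expected return on a one-unit bet from a projected probability and American odds. This is a sketch of the standard expected-value calculation, offered as an assumption about how a 10% value threshold could be checked; the -110 price and the probabilities are example inputs, not our picks.

```python
def payout_per_unit(american_odds: int) -> float:
    """Profit on a winning one-unit bet at the given American odds."""
    if american_odds < 0:
        return 100 / abs(american_odds)
    return american_odds / 100


def expected_roi(win_prob: float, american_odds: int, push_prob: float = 0.0) -> float:
    """Expected profit per unit staked; a push returns the stake."""
    lose_prob = 1.0 - win_prob - push_prob
    return win_prob * payout_per_unit(american_odds) - lose_prob


def has_value(win_prob: float, american_odds: int, threshold: float = 0.10) -> bool:
    """True when the projected return clears the value threshold."""
    return expected_roi(win_prob, american_odds) >= threshold


# If simulations put the Over at 60% and the price is -110:
print(round(expected_roi(0.60, -110), 3))
print(has_value(0.60, -110), has_value(0.55, -110))
```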

We also listed Staff Picks in 2022 and 2023, and our preseason NFL win total picks went 6-0, covering the total by an average of 2.0 wins. Those picks were based on a variety of research factors along with our model predictions.

Conclusion

There are many different ways to make preseason rankings for the NFL. The approaches can vary greatly, from media power rankings to “expert” analysis, from building complex statistical models to making inferences from futures odds in the betting markets.

And speaking frankly, there’s plenty of crap out there. But there’s also no Holy Grail (yet).

Within ten seconds of looking over our preseason NFL rankings, you’ll probably find several rankings you disagree with, or that differ from what most other “experts” or ranking systems think. That’s to be expected.

When the dust settles at the end of the season, our NFL preseason ratings, and the various projections we generate using them, will almost certainly be way off for a few teams. As happens every year, some teams simply defy expectations thanks to surprise breakout performances, while other teams are impacted by injuries, suspensions and other unanticipated events.

Nonetheless, the primary goal of our preseason analysis is to provide a baseline rating for each team (a “prior” in statistical terms) that makes our system better at predicting regular season NFL games. We’re most concerned about the overall accuracy of the system: that is, how good it is at predicting where every NFL team’s predictive rating will end up at the end of the upcoming season.

For that purpose, we’ve settled on a mostly data-driven (but still subjectively adjusted) approach to preseason team ratings. And so far, this approach has delivered very good results.
