- Mar 16, 2020

Making the Combine Matter | Rocket Science

As football analytics evolves, we swing back and forth on the value of the Combine. Think of how hard we fall for the Trent Richardsons and Kevin Whites, while the Arian Fosters and Antonio Browns are ruled out as poor athletes. This isn't limited to armchair scouts and fantasy owners; NFL decision makers do the same thing. We love the Combine, yet we have no idea what it's worth.

I am here to make it matter.

The tables below display what the Combine drills are worth in relation to the fantasy points a player scores throughout his career up to age 27. This research includes all players since 2003.

These tables show correlation, the measure of the relationship between two variables. A perfect relationship match would measure 1.000 or -1.000, and a totally random relationship with no correlation would measure 0.000.
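For anyone who wants to check a number like this on their own data, here is a minimal sketch of the Pearson correlation coefficient in plain Python. The function name `pearson_r` and the sample numbers are made up for illustration; the article does not say which tool or correlation variant produced its tables, so treat Pearson as an assumption.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Co-movement of the two variables around their means
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / sqrt(var_x * var_y)

# Made-up numbers, for illustration only (not real player data)
forty_times = [4.30, 4.40, 4.50, 4.60]
ppg = [10.0, 8.0, 6.0, 4.0]  # drops in lockstep as times get slower
print(pearson_r(forty_times, ppg))
```

Because a lower 40 time is better, a drill that truly predicted production would show a strongly negative value here, which is why the sign matters as much as the size.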

At the outset, it's very clear why there is a struggle to appropriately appreciate the combine. Most individual drills have poor correlation. Superficially, we can feel this phenomenon when we look at a list of all the receivers that ran sub-4.32 forty yard dashes.

Tyreek Hill saves the group (you'll see his pro day time listed as high as 4.34), but it's largely a crop of underperformers. Whiffing on these guys has left a bitter taste in our mouths, and they truly influence the total correlative value of the 40 yard dash. However, taking a look at the next fastest group gives us the key to measuring the value of the forty, and the rest of the combine.

This grouping has the best PPG and highest success rate of any 40 yard dash cohort. It doesn't make sense to throw out the drill or make gross generalizations about its value simply because the "Top 40 Yard Dash Performers" in the sub-4.32 range didn't produce as fantasy assets. This is obvious, but the application of that understanding takes some work.

Threshold Modeling

Instead of using a 1:1 methodology, where Jacoby Ford's 4.28 is more valuable than DeSean Jackson's 4.35, we can use thresholds to put players into categories. This is not a novel idea, but for most, it's only been a theoretical exercise.

By using the Average NFL PPG as our outcome measure, we can see the real value of a combine drill. As the table below shows, a sub-4.36 WR scores 7.42 PPG on average. This is 71% more points than the average WR who ran between 4.49 and 4.58. So we simply apply a proportional bonus to each player according to whatever 40 yard dash "bin" they fall under. Here's what those "bins" look like:

This process raises the correlative value of the 40 yard dash all the way to 0.181, greater than any of the raw values previously studied (for WRs).
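The proportional bonus described above can be sketched in a few lines. Only two figures here come from the article: the 7.42 PPG average for sub-4.36 receivers and the 71% gap over the 4.49-4.58 group, which implies a baseline of roughly 7.42 / 1.71 PPG. The other bin edges, their averages, and the names `BINS` and `forty_modifier` are placeholders invented for this sketch, not the actual bins used in the model.

```python
# Implied average PPG for the 4.49-4.58 group (7.42 is 71% more than this)
BASELINE_PPG = 7.42 / 1.71

# (low, high, average PPG) per bin; low is inclusive, high is exclusive.
# Rows marked "placeholder" are made up and are NOT from the article.
BINS = [
    (0.00, 4.36, 7.42),          # sub-4.36, from the article
    (4.36, 4.49, 5.50),          # placeholder
    (4.49, 4.59, BASELINE_PPG),  # the 4.49-4.58 baseline group
    (4.59, 9.99, 3.50),          # placeholder
]

def forty_modifier(forty_time, bins=BINS, baseline=BASELINE_PPG):
    """Return the bin's average PPG as a multiple of the baseline bin."""
    for low, high, avg_ppg in bins:
        if low <= forty_time < high:
            return avg_ppg / baseline
    raise ValueError(f"no bin covers a 40 time of {forty_time}")

# Jacoby Ford's 4.28 and DeSean Jackson's 4.35 land in the same bin,
# so both earn the same bonus.
print(forty_modifier(4.28), forty_modifier(4.35))
```

The design point is that every player inside a bin gets the same modifier, so the small differences within a bin no longer move the score.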

I created these bins through manual analysis. You'll notice the bins are not equal categories like 4.30-4.40 and 4.40-4.50. They are also not equal by player count, where we might take the 50 fastest players and compare them with the next 50 fastest. In this astronomer's opinion, it is best to let the data speak for itself. Instead of binning into groups of 0.10 seconds, an arbitrary delineation based on personal bias and human nature, I create a bin where there appears to be a significant difference in average PPG. After all, time itself is a theoretical construct. Letting the outcome measure determine the bin size gives a better picture of the relationship the historical data is actually showing us. This type of custom binning is viable if we have a large sample size and the willingness to adapt to new data.
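The bins in this article were drawn by hand, but the check behind them is easy to sketch: pick candidate cut points, then compare the player count and average PPG on each side. The helper below is an assumed illustration, not the author's tooling; `bin_averages` is a hypothetical name and the sample players are made up.

```python
from bisect import bisect_right

def bin_averages(players, edges):
    """Count and average PPG for each bin defined by ascending cut points.

    players: iterable of (forty_time, ppg) pairs.
    edges:   ascending cut points; a time equal to an edge goes to the slower bin.
    Returns one (count, mean_ppg) tuple per bin, fastest bin first.
    The mean is None for a bin with no players.
    """
    totals = [[0, 0.0] for _ in range(len(edges) + 1)]
    for forty_time, ppg in players:
        idx = bisect_right(edges, forty_time)  # index of the bin this time falls in
        totals[idx][0] += 1
        totals[idx][1] += ppg
    return [(n, total / n if n else None) for n, total in totals]

# Made-up sample, for illustration only
players = [(4.30, 6.0), (4.35, 8.0), (4.40, 5.0), (4.45, 7.0), (4.55, 4.0)]
for count, mean_ppg in bin_averages(players, edges=[4.36, 4.49]):
    print(count, mean_ppg)
```

Moving an edge and rerunning shows whether the averages on either side actually separate. Keeping the count next to the mean guards against drawing a bin around a handful of players.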

The Combine Model

By applying this method to every drill/measurement for each position and then reducing down to the most relevant drills (aiming for a p-value below 0.05), I have created a combine model with a very relevant correlation ranging from 0.398 to 0.415 depending on position. Again, this is a major upgrade from the correlation of the best individual measurements and not too far from draft capital's -0.494 correlation.

The Wide Receiver Combine Model uses the scores from six measurements: 40 Yard Dash, 3 Cone, Launch Score (jumping power), Hand Size, Arm Length, and BMI. While Speed (40 Yard Dash), Power (Launch Score), and Agility (3 Cone) are often emphasized in combine models, Arm Length is one of the most important measurements for wide receivers. The Calvin Johnsons, AJ Greens, Mike Evanses, Brandon Marshalls, and Dez Bryants of the world show the value of having long arms, but even smaller receivers like Antonio Brown, Brandin Cooks, or Tyreek Hill score points in this category for crossing the 30" Arm Length threshold. NFL evaluators recognize that "you can't teach speed"; in the same way, they recognize that you can't teach length. In this game of inches, the ability to reach a bit further than the player covering you can come down to the raw combination of Arm Length, Hand Size, and jumping ability (Launch Score).

Below are the bins for the measurements and the respective modifier score that a player will earn for their performance in each category. It's worth noting that some modifiers, like 3 Cone and BMI, give smaller bonuses. They remain significant enough to include, giving additional flavor to the model, but the relative success added by scoring well is smaller. There are some great 3 Cone standouts like Julio Jones, Cooper Kupp, and Odell Beckham, who ran sub-6.7 times, but there are plenty of winners who ran slow times, so much so that the overall correlation of the 3 Cone drill is essentially random (0.022). In many models these measurements can be overweighted and you'll miss some valuable players (think of the DK Metcalf 3 Cone drama).

After each trait is binned and given the appropriate multiplier, a multiple linear regression is applied. Here is the current weighting for the Wide Receiver Combine Score:

The total composite of the model produces the Combine Score. We can see the full process, using rookie WR Denzel Mims as an example.

Here are the top WR combines from the model.

The results of the WR Combine model reach a 0.415 correlation and an R-squared of 0.178.
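The regression step (binned modifiers in, career PPG out) can be sketched with ordinary least squares in plain Python. This is a sketch under stated assumptions, not the model itself: the function names `fit_ols` and `r_squared` are hypothetical, the two modifier columns and their values are made up, and the targets are built from known coefficients so the fit is easy to check.

```python
def fit_ols(rows, targets):
    """Ordinary least squares with an intercept, via the normal equations.

    rows:    list of feature lists (one modifier score per measurement).
    targets: list of outcomes (for example, career PPG).
    Returns [intercept, coef_1, ..., coef_k].
    """
    X = [[1.0] + list(row) for row in rows]  # prepend the intercept column
    k = len(X[0])
    # Augmented system [X^T X | X^T y]
    A = []
    for i in range(k):
        row = [sum(x[i] * x[j] for x in X) for j in range(k)]
        row.append(sum(x[i] * y for x, y in zip(X, targets)))
        A.append(row)
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("features are collinear; cannot solve")
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [v / p for v in A[col]]
        for r in range(k):
            if r != col:
                f = A[r][col]
                A[r] = [rv - f * cv for rv, cv in zip(A[r], A[col])]
    return [A[i][k] for i in range(k)]

def r_squared(rows, targets, coefs):
    """Share of the variance in targets explained by the fitted model."""
    preds = [coefs[0] + sum(c * v for c, v in zip(coefs[1:], row)) for row in rows]
    mean_y = sum(targets) / len(targets)
    ss_res = sum((y - p) ** 2 for y, p in zip(targets, preds))
    ss_tot = sum((y - mean_y) ** 2 for y in targets)
    return 1 - ss_res / ss_tot

# Made-up modifier scores (two measurements per player), illustration only
rows = [[1.0, 1.0], [1.5, 1.0], [1.0, 1.2], [1.7, 1.3], [1.2, 0.9]]
targets = [1 + 2 * a + 3 * b for a, b in rows]  # synthetic outcome, no noise
coefs = fit_ols(rows, targets)
print(coefs, r_squared(rows, targets, coefs))
```

With real, noisy data the fitted coefficients play the role of the weights, and the R-squared will sit far below 1, in line with the 0.178 reported above.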

We can compare this model against another composite athleticism measurement: SPARQ. The graph below shows that SPARQ is essentially worthless in terms of fantasy output. Although the sample size I've studied is smaller, Hayden Winks of Rotoworld came to the same conclusion regarding SPARQ's value. As a solution, he created his own Adjusted SPARQ score, which has an R-squared that ranges from 0.03 to 0.10.

Most importantly, we want to quantify how the combine can inform our perspective throughout the pre-draft process. Below is a comparison of my Pre-Draft Model before adding the combine (left) and after adding the combine (right). We see a nice increase in R-squared value, and Pre-Draft correlation settles at 0.571 - a quality improvement over Draft Capital's 0.494 (in absolute value). The use of the combine especially raises the hit rate of the top prospects (right side of each chart). As determined by another multiple linear regression, the final WR Pre-Draft model actually uses a heavier weight for Combine Metrics (62%) than for College Production (38%). Turns out the Combine does, in fact, matter.

The final Pre-Draft Model produces some great names once combine metrics are taken into consideration. Yellow names have produced an 80th percentile fantasy career (>11.0 PPG) and orange names have produced at least one Top 5 fantasy season.

The Combine Model provides value for the other positions as well. With correlations around 0.40 and R-squared values close to 0.17, the Combine Model adds an important lens to the pre-draft evaluation process.

RB Combine Model

Using the same methodology with the running backs gives us four drills/measurements of importance: Arm Length, Weight Adjusted Speed, 40 Yard Dash, and Weight Adjusted 3 Cone. By using both Weight Adjusted Speed and 40 Time, I am able to give credit for both horizontal power and pure speed. Arm Length was another surprisingly relevant feature, but there are a ton of backs who use their long arms to distance themselves from defenders with an effective stiff arm. Arian Foster, Derrick Henry, Adrian Peterson, Alvin Kamara, Aaron Jones, and Dalvin Cook are just some of the studs who show up in the >32 inch Arm Length cohort. Weight Adjusted 3 Cone is a great asset in adding value, giving bumps to players like Le'Veon Bell, Christian McCaffrey, and David Johnson. Because many running backs do not run the 3 Cone, the Combine Model adjusts the weighting appropriately.
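The article doesn't spell out how the weighting is adjusted when a back skips the 3 Cone. One plausible approach, shown here purely as an assumption, is to drop the missing drill and rescale the remaining weights so they still sum to one. The weights, the dictionary keys, and the name `combine_score` are all hypothetical.

```python
# Hypothetical weights for illustration; NOT the model's actual weighting
RB_WEIGHTS = {
    "arm_length": 0.3,
    "weight_adj_speed": 0.3,
    "forty": 0.2,
    "weight_adj_3cone": 0.2,
}

def combine_score(modifiers, weights=RB_WEIGHTS):
    """Weighted average of modifier scores, skipping drills the player did not run.

    modifiers: dict of measurement name -> modifier score, or None if missing.
    Dividing by the weight of the available drills rescales them to sum to one.
    """
    available = {name: score for name, score in modifiers.items() if score is not None}
    total_weight = sum(weights[name] for name in available)
    if total_weight == 0:
        raise ValueError("no weighted measurements available")
    return sum(weights[name] * score for name, score in available.items()) / total_weight

# A back who skipped the 3 Cone (made-up modifier values)
print(combine_score({
    "arm_length": 1.0,
    "weight_adj_speed": 1.0,
    "forty": 2.0,
    "weight_adj_3cone": None,
}))
```

Under this scheme a player is neither rewarded nor punished for skipping a drill; his score simply rests on the measurements he did post.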

Tight End Combine Model

Tight Ends use almost every aspect of the combine drills. The raw agility times are more useful for tight ends than for the other positions. Height Adjusted Speed Score factors in weight, height, and 40 Time and is the most valuable predictor in the model. Big name standouts in the model include George Kittle, Evan Engram, Noah Fant, Mike Gesicki, and Vernon Davis (don't worry, Gronk is identified as elite by my TE Production Model).

The Combine is one of the most fun highlights of the offseason. Now, it's one of the most useful components. The combine matters and fantasy football is Rocket Science. Get going with your own research and check out the complete Combine Score Database in our Stat Suite.