How the NFL model works
A practical, calibrated player-prop model built for serious bettors who want to understand exactly what they're using. No black box, no marketing fluff. Below is every decision, including the ones that didn't work.
Four base models
The NFL model predicts four markets independently. Each is a weighted blend of three rolling averages (full season, last 5, last 3) tuned per position via grid search on the 2023 validation set, tested on 2024.
| Market | Pos | w_season | w_L5 | w_L3 | Test MAE | Win% | Notes |
|---|---|---|---|---|---|---|---|
| Receiving | WR | 0.90 | 0.00 | 0.10 | 24.18 yds | 48.0% | Season avg dominates · implied_team_total r=+0.006 noise |
| Receiving | TE | 0.90 | 0.10 | 0.00 | 17.41 yds | 34.3% | Hardest market — overconfident +5-12pp · raise STRONG to 15+yd |
| Receiving | RB | 0.80 | 0.00 | 0.20 | 12.63 yds | 52.6% | game_total r=0.16 meaningful for checkdown usage |
| Rushing | RB | 0.80 | 0.00 | 0.20 | 23.20 yds | 51.7% | Spread correlates but adjustment hurts — season avg captures script |
| Rushing | QB | 0.90 | 0.00 | 0.10 | 14.77 yds | 46.9% | Selective use only |
| Passing | QB | 0.60 | 0.25 | 0.15 | 67.51 yds | 49.9% | Only market where L5 (r=0.389) > season_avg (r=0.379) |
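As a rough illustration of the blend described above, the sketch below combines the three rolling averages using the weights from the table. The function names and the `game_log` input (a plain list of per-game yardage, oldest first) are assumptions made for this example, not the production code.

```python
# Blend weights (w_season, w_L5, w_L3) copied from the table above.
WEIGHTS = {
    ("receiving", "WR"): (0.90, 0.00, 0.10),
    ("receiving", "TE"): (0.90, 0.10, 0.00),
    ("receiving", "RB"): (0.80, 0.00, 0.20),
    ("rushing", "RB"): (0.80, 0.00, 0.20),
    ("rushing", "QB"): (0.90, 0.00, 0.10),
    ("passing", "QB"): (0.60, 0.25, 0.15),
}


def mean_last(values, n):
    """Mean of the last n values (or of all values if fewer than n exist)."""
    window = values[-n:]
    return sum(window) / len(window)


def blended_projection(market, position, game_log):
    """Weighted blend of season, last-5 and last-3 averages.

    game_log is a hypothetical list of the player's yardage per game this
    season, oldest first.
    """
    if not game_log:
        raise ValueError("game_log must contain at least one game")
    w_season, w_l5, w_l3 = WEIGHTS[(market, position)]
    season_avg = sum(game_log) / len(game_log)
    return (
        w_season * season_avg
        + w_l5 * mean_last(game_log, 5)
        + w_l3 * mean_last(game_log, 3)
    )


# Example: a QB with seven games of passing yardage.
qb_log = [200, 250, 300, 220, 260, 240, 280]
projection = blended_projection("passing", "QB", qb_log)
```

With this log the season average is 250 and both the last-5 and last-3 averages are 260, so the passing blend works out to 0.60 × 250 + 0.25 × 260 + 0.15 × 260 = 254 yards.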
The TD model
Anytime Touchdown is modelled as a Poisson with player-specific rate λ shrunk toward a position-based prior using k=8 pseudo-observations. The original priors (QB 42%, RB1 52%) were dramatically overstated — actual hit rates are 17.7% and 37.5%. We recalibrated:
| Position | Old prior | Actual | New prior |
|---|---|---|---|
| QB | 42% | 17.7% | 18% |
| RB1 | 52% | 37.5% | 40% |
| WR1 | 35% | 29.7% | 28% |
| TE1 | 28% | 20.1% | 22% |
Game total enters multiplicatively: λ × 1.28 when total ≥ 50, × 1.10 when ≥ 46, × 0.90 when < 42. Hit rates spread from 21.6% to 29.4% empirically across these buckets — a real 36% relative swing.
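The shrinkage and game-total steps can be sketched as follows. Several details here are assumptions for illustration: the prior hit rate is converted to a Poisson rate via p = 1 − exp(−λ), the shrinkage treats the prior as k extra games at the prior rate, and totals between 42 and 46 are left unadjusted because the text gives no multiplier for that band.

```python
import math

K_PSEUDO_GAMES = 8  # shrinkage strength from the text

# Recalibrated anytime-TD priors: probability of at least one TD per game.
PRIOR_HIT_RATE = {"QB": 0.18, "RB1": 0.40, "WR1": 0.28, "TE1": 0.22}


def rate_from_hit_rate(p):
    """Poisson rate whose P(at least one TD) equals p, from p = 1 - exp(-lam)."""
    return -math.log(1.0 - p)


def shrunk_rate(position, tds_scored, games_played, k=K_PSEUDO_GAMES):
    """Blend the observed TD rate with the prior as if k prior games were seen.

    This weighted-average form is an assumed reading of "k pseudo-observations".
    """
    prior_rate = rate_from_hit_rate(PRIOR_HIT_RATE[position])
    return (tds_scored + k * prior_rate) / (games_played + k)


def game_total_multiplier(total):
    """Multiplier on lambda by game-total bucket, as listed in the text."""
    if total >= 50:
        return 1.28
    if total >= 46:
        return 1.10
    if total < 42:
        return 0.90
    return 1.0  # 42 <= total < 46: assumed neutral, the text does not say


def anytime_td_probability(position, tds_scored, games_played, game_total):
    """P(at least one TD) under a Poisson with the adjusted rate."""
    lam = shrunk_rate(position, tds_scored, games_played)
    lam *= game_total_multiplier(game_total)
    return 1.0 - math.exp(-lam)
```

With no games played and a neutral total, the output falls back to the position prior (for example 40% for an RB1), and it moves toward the player's own rate as games accumulate.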
The 9 situation edge dimensions
Each pick gets a combined edge score (0–6) summarising how favourable the situation is. Score ≥ 3 unlocks our highest conviction tier. Score < 2 picks are skipped unless ev_score > 0.75.
| Factor | How it's applied | Evidence |
|---|---|---|
| Wind ≥ 15 mph | Passing/Receiving × 0.84-0.88 | QB bias drops -16yd; severe (20+) drops -27yd |
| Dome | Receiving × 1.04 | Indoor WR OVER 47% vs outdoor 42% |
| Short week | No adjustment (insufficient n) | TNF sample only 179 QB games — wait for more data |
| Game total ≥ 50 | Passing × 1.06, TD λ × 1.28 | TD hit rate: 21.6% (low) → 29.4% (50+) |
| Spread > 7 (dog) | Passing × 1.08 | Forced to throw — books underadjust |
| Defence quintile | No multiplier — used as filter | Worst-def QB bias +25.7yd, OVER rate 54% |
| Weeks 1-3 | Edge thresholds × 1.30 | Higher prior pull, larger errors — be selective |
| Home/Away | Note venue effects | CAR/CHI/SF/GB/LA dock away QBs -16 to -29yd |
| Combined score | ≥ 3 = highest tier | Skip ev_score<0.45 unless combined≥3 |
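One way to read the skip rules above is as a small gate on the pair (combined score, ev_score). The sketch below is an assumed combination of the two stated rules (the ev_score floor of 0.45 and the score-below-2 rule). The function names are hypothetical.

```python
def highest_tier_unlocked(combined_score):
    """A combined score of 3 or more unlocks the highest conviction tier."""
    return combined_score >= 3


def should_skip(combined_score, ev_score):
    """Return True when a pick fails the situation/EV gate.

    Assumed combination of the two rules stated in the text:
      * skip ev_score < 0.45 unless combined score >= 3
      * skip combined score < 2 unless ev_score > 0.75
    """
    if combined_score >= 3:
        return False  # exempt from the ev_score floor, and not below 2
    if ev_score < 0.45:
        return True
    if combined_score < 2 and ev_score <= 0.75:
        return True
    return False
```

For example, a pick with score 1 and ev_score 0.70 is skipped, while the same score with ev_score 0.80 passes.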
Interaction edges (stacked situations)
Single-factor analysis found defence quintile is the strongest signal. When we crossed it with other situations, the lift grew — and one finding overrides our default rule.
| Stack | OVER rate | Lift | n | Decision |
|---|---|---|---|---|
| TE × dome × worst defence | 59.3% | +13.4pp | 241 | TE exception — +2 to situation score |
| RB × worst defence × home | 57.9% | +12.5pp | 582 | +1 to situation score |
| RB × worst defence × low total | 58.5% | +13.0pp | 496 | +1 to situation score |
| QB × elite defence × windy | 22.9% | -27.3pp | 48 | UNDER hint (small n) |
| WR × elite defence × away | 34.3% | -9.6pp | 831 | Avoid OVER, prefer UNDER |
The TE finding is the most important: even though TE has the worst overall win rate (34%), TE players in domes facing the worst defences win at 59%. This is the one situation where we DO publish TE picks.
Regression-to-mean signals
Two counterintuitive findings: alpha receivers (target share ≥ 28%) and workhorse RBs (carries ≥ 18/game) UNDER-perform their season averages. Mechanism: their seasonal mean is inflated by ceiling outliers they can't replicate weekly.
| Player Type | OVER rate | Lift | n |
|---|---|---|---|
| Alpha receivers (TS ≥ 28%) | 37.0% | -6.7pp | 722 |
| Alpha × high team total (24+) | 32.9% | -11.0pp | 240 |
| Workhorse RBs (≥ 18 carries/g) | 32.3% | -11.1pp | 341 |
| RB carries rising L3 by +3 | 51.9% | +8.5pp | 472 |
| RB carries falling L3 by -3 | 36.7% | -6.7pp | 316 |
| Elite OL × worst defence (RB) | 60.9% | +15.5pp | 325 |
Practical effect: we apply a -1 to combined edge score on OVER picks for alphas/workhorses (and the inverse +1 for UNDERs). Carry-trend and OL × defence add direct bonuses to the situation score.
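The alpha/workhorse adjustment can be sketched as a signed offset to the combined edge score. The function name and argument names are hypothetical, and target share is assumed to be passed as a fraction (0.28 for 28%). The carry-trend and OL × defence bonuses are left out because the text does not state their size.

```python
ALPHA_TARGET_SHARE = 0.28   # alpha receiver threshold from the text
WORKHORSE_CARRIES = 18      # workhorse RB threshold (carries per game)


def regression_adjustment(side, target_share=None, carries_per_game=None):
    """Offset to the combined edge score for alphas and workhorses.

    Returns -1 for OVER picks and +1 for UNDER picks when the player
    qualifies as an alpha receiver or workhorse RB, otherwise 0.
    """
    if side not in ("OVER", "UNDER"):
        raise ValueError("side must be 'OVER' or 'UNDER'")
    is_alpha = target_share is not None and target_share >= ALPHA_TARGET_SHARE
    is_workhorse = (
        carries_per_game is not None and carries_per_game >= WORKHORSE_CARRIES
    )
    if not (is_alpha or is_workhorse):
        return 0
    return -1 if side == "OVER" else 1
```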
Triple-stack signals (3 factors aligned)
We scanned 3-way interactions for compound win rates ≥ 58% or ≤ 42% at n ≥ 100. The strongest signals are large enough that we believe books underprice them — these unlock the highest conviction tier.
| Stack | OVER rate | Lift | n |
|---|---|---|---|
| RB × worst defence × home × pickem spread | 66.7% | +21.2pp | 171 |
| QB × worst defence × home × low total | 64.7% | +14.5pp | 102 |
| TE × worst defence × home × dog spread | 63.9% | +17.9pp | 166 |
| TE × worst defence × home × dome | 63.1% | +17.2pp | 141 |
| RB × worst defence × away × dog spread | 60.5% | +15.1pp | 238 |
| QB × elite defence × pickem spread | 29.8% | -20.4pp | 104 |
| RB × elite defence × home × dome | 25.7% | -19.8pp | 191 |
| TE × elite defence × home × low total | 27.5% | -18.5pp | 204 |
Things we tried and rejected
- Implied team total as receiving predictor — r=+0.006. Zero predictive power for WR receiving yards. Removed from model.
- Spread adjustment for rushing — increased MAE from 23.20 to 23.73. Player season averages already capture run/pass mix. Removed.
- k=20 shrinkage for TDs — over-shrunk to inflated priors. Settled on k=8 after Brier-score sweep on validation.
- Combined L5 + L3 for receiving — both correlate weaker than season avg (r=0.455 vs r=0.34 / r=0.31). Recency penalised player consistency.
Held-out validation (the honest test)
All the bonuses above were discovered AND tuned on 2020-2024 data. That's the textbook setup for overfitting. So we ran the test that matters: re-derive every threshold using 2020-2023 only, then test on the held-out 2024 season. Every rule kept its sign, and 6 of 11 got stronger on test.
| Rule | Train lift (2020-23) | Test lift (2024) | Verdict |
|---|---|---|---|
| RB × worst defence | +11.0pp | +14.1pp | STRONGER |
| RB × elite defence | -11.1pp | -13.2pp | STRONGER |
| QB × worst defence | +11.0pp | +0.5pp | decayed |
| QB × elite defence | -12.9pp | -8.2pp | held |
| WR × worst defence | +6.2pp | +5.9pp | held |
| WR × elite defence | -8.2pp | -5.7pp | held |
| RB × worst × home × pickem | +21.1pp | +21.7pp | STRONGER |
| TE × dome × worst defence | +12.7pp | +14.6pp | STRONGER |
| Alpha WR (TS ≥ 28%) | -6.2pp | -8.6pp | STRONGER |
| Workhorse RB (carries ≥ 18) | -14.0pp | -4.2pp | decayed |
| RB carries rising L3 by +3 | +7.3pp | +14.9pp | STRONGER |
11 of 11 rules directionally held on 2024. Two decayed in magnitude (QB × worst defence and workhorse RB) but kept their sign. The most aggressive triple-stack signal (RB × worst × home × pickem) held its +21pp lift almost exactly. We also cap defence-related bonuses at ±3 to prevent any single fact (e.g. "bad defence") from being scored through multiple lenses.
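The verdict column can be reproduced from the train and test lifts with a simple rule. The sketch below is illustrative: the lifts are copied from the table, but the 0.5 ratio used to separate "held" from "decayed" is a hypothetical threshold that happens to match the published verdicts, not a documented one.

```python
# (train lift 2020-23, test lift 2024) in percentage points, from the table.
RULES = {
    "RB x worst defence": (11.0, 14.1),
    "RB x elite defence": (-11.1, -13.2),
    "QB x worst defence": (11.0, 0.5),
    "QB x elite defence": (-12.9, -8.2),
    "WR x worst defence": (6.2, 5.9),
    "WR x elite defence": (-8.2, -5.7),
    "RB x worst x home x pickem": (21.1, 21.7),
    "TE x dome x worst defence": (12.7, 14.6),
    "Alpha WR (TS >= 28%)": (-6.2, -8.6),
    "Workhorse RB (carries >= 18)": (-14.0, -4.2),
    "RB carries rising L3 by +3": (7.3, 14.9),
}

DECAY_RATIO = 0.5  # hypothetical cut-off between "held" and "decayed"


def verdict(train_lift, test_lift):
    """Classify a rule by comparing held-out lift with training lift."""
    if train_lift * test_lift <= 0:
        return "flipped"  # sign did not hold on the test season
    ratio = abs(test_lift) / abs(train_lift)
    if ratio > 1.0:
        return "STRONGER"
    if ratio >= DECAY_RATIO:
        return "held"
    return "decayed"


verdicts = {name: verdict(train, test) for name, (train, test) in RULES.items()}
sign_held = sum(1 for v in verdicts.values() if v != "flipped")
```

Under these assumptions all 11 rules keep their sign, with six labelled STRONGER, three held, and two decayed.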
Playoff biases (re-derived 2026-05-30)
The original documentation claimed QB rushing has a -4.73yd playoff bias. We pulled 1,240 playoff player-games (2020-2024 via nfl_data_py) and re-measured. The QB rushing claim was wrong. The real playoff biases are larger and on different markets.
| Market | Regular bias | Playoff bias | Δ (playoff effect) | Applied as |
|---|---|---|---|---|
| RB rushing | +1.7yd | -11.8yd | -13.5yd | prediction -13.5 |
| QB passing | +3.4yd | -7.2yd | -10.5yd | prediction -10.5 |
| TE receiving | +1.6yd | -4.5yd | -6.1yd | prediction -6.0 |
| WR receiving | -0.0yd | -2.6yd | -2.5yd | prediction -2.5 |
| QB rushing | +0.5yd | +0.1yd | -0.4yd | no adjustment (old rule rescinded) |
The pick selector applies these as explicit prediction adjustments when ctx['is_playoff'] is true. RB rushing OVERs in playoffs get a -13.5yd haircut to the model prediction before tier assignment, so a 90yd projection becomes 76.5yd.
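A minimal sketch of that adjustment step, assuming a `ctx` dict with an `is_playoff` key as referenced above. The function name, the `(market, position)` keying, and the zero default for markets not listed in the table (such as RB receiving) are assumptions made for this example.

```python
# Yardage adjustments from the "Applied as" column of the table above.
PLAYOFF_ADJUSTMENT_YDS = {
    ("rushing", "RB"): -13.5,
    ("passing", "QB"): -10.5,
    ("receiving", "TE"): -6.0,
    ("receiving", "WR"): -2.5,
    ("rushing", "QB"): 0.0,  # old -4.73yd rule rescinded
}


def adjust_for_playoffs(prediction, market, position, ctx):
    """Shift the yardage prediction when the game is a playoff game.

    Markets without an entry in the table are assumed to get no adjustment.
    """
    if not ctx.get("is_playoff", False):
        return prediction
    return prediction + PLAYOFF_ADJUSTMENT_YDS.get((market, position), 0.0)


# Example from the text: a 90yd RB rushing projection in the playoffs.
adjusted = adjust_for_playoffs(90.0, "rushing", "RB", {"is_playoff": True})
```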
Calibration correction
Raw model probabilities were systematically overconfident vs proxy lines by 2-9pp. We apply p_calibrated = 0.5 + α × (p_raw − 0.5) where α is per-market: WR 0.88, TE 0.82, RB receiving 0.78, RB rushing 0.80, QB rushing 0.82, QB passing 0.94. This is applied before tier assignment.
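The correction is a linear shrink of the raw probability toward 0.5. The sketch below applies the formula with the per-market α values listed above. The function name and dictionary layout are assumptions for illustration.

```python
# Per-market shrink factors (alpha) from the text.
ALPHA = {
    ("receiving", "WR"): 0.88,
    ("receiving", "TE"): 0.82,
    ("receiving", "RB"): 0.78,
    ("rushing", "RB"): 0.80,
    ("rushing", "QB"): 0.82,
    ("passing", "QB"): 0.94,
}


def calibrate(p_raw, market, position):
    """p_calibrated = 0.5 + alpha * (p_raw - 0.5)."""
    if not 0.0 <= p_raw <= 1.0:
        raise ValueError("p_raw must be a probability in [0, 1]")
    alpha = ALPHA[(market, position)]
    return 0.5 + alpha * (p_raw - 0.5)


# Example: a raw 60% TE receiving probability.
p_te = calibrate(0.60, "receiving", "TE")
```

Because every α is below 1, the calibrated probability always sits closer to 50% than the raw one. A raw 60% on a TE receiving prop becomes 58.2%, while 50% is left unchanged.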