Why We Don't Model Injuries (And Why You Should Be Suspicious of Models That Do)
Every betting model that claims to factor in injuries is either using bad data or pretending. Here's what we tried, why we stopped, and what we use instead.
A reader emailed last week: "Your model rates Liverpool at 18% to win the league, but Salah just picked up a hamstring strain. Shouldn't you be adjusting?"
Honest answer: no. Or at least, not the way you'd expect.
This post is about why our model doesn't ingest injury data, even though it would seem obvious to do so, and why most of the AI football tools claiming they "factor in injuries" are either using poor data or being economical with the truth.
To genuinely factor in injuries, a model needs three things, all working in real time:
Every step is harder than it sounds.
Injury reporting in football is a mess. Premier League clubs don't publish official injury lists. They release vague pre-match press conference quotes ("we'll see how he is on Saturday") that get aggregated by sites of varying reliability.
Even the better aggregators (Transfermarkt, premierinjuries.com) run a few hours behind reality at best, days behind at worst. By the time a model could ingest "Salah is out," the bookmakers have already adjusted their lines. The edge has vanished.
There's also the strategic ambiguity problem. Pep Guardiola is famous for declaring players "75% fit" or "we'll decide on the morning of the game." That's not a data point. That's a manager hedging.
Some sportsbooks have insider sources at training grounds. Their lines move on injury news 2-3 hours before public reports. If you're modelling against bookmaker odds, by the time you've ingested the public injury data, you're acting on stale information.
Even if the data were clean, quantifying player importance is genuinely hard.
The naive approach is to use minutes played, goals, assists, or some xG-based contribution metric. Multiply by an "importance weight" and subtract from the team's expected goal output. Easy.
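For concreteness, here is a minimal sketch of that naive adjustment. The `Absence` class, the field names, and every number in it are hypothetical, made up for illustration; this is not code from our model.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Absence:
    """One missing player (all fields are illustrative assumptions)."""
    player: str
    xg_contribution_per_match: float  # hypothetical xG-based contribution per match
    importance_weight: float          # hypothetical hand-tuned weight in [0, 1]


def naive_adjusted_xg(team_xg: float, absences: list[Absence]) -> float:
    """Subtract each absent player's weighted contribution from the team's expected goals."""
    lost = sum(a.xg_contribution_per_match * a.importance_weight for a in absences)
    # Expected goals cannot go negative, so clamp at zero.
    return max(team_xg - lost, 0.0)


# Made-up example: a team expected to score 2.1 goals loses one key attacker.
baseline_xg = 2.1
missing = [Absence("Star winger", xg_contribution_per_match=0.6, importance_weight=0.8)]
adjusted_xg = naive_adjusted_xg(baseline_xg, missing)
```

Note what the function never asks: who replaces the missing player, whether the formation changes, or how much of his team-mates' output depended on him.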
It doesn't work, for several reasons.
Replacement quality varies. Liverpool without Salah don't just lose Salah's contribution; they replace him with a worse winger. But how much worse? Diogo Jota stepping in is different from a 19-year-old academy product debuting. The model would need to know who's actually starting.
Tactical reorganisation isn't linear. A team without their first-choice ball-playing centre-back doesn't just defend X% worse. They sometimes change formation entirely, drop a midfielder back, alter their pressing structure. The system change can matter more than the individual.
Star players have multiplicative effects. Kevin De Bruyne creates chances that wouldn't exist without him. Subtract his minutes and you don't just lose his goals and assists: you lose the chances his teammates would have had from his passing.
There's academic literature on this (it's called "player valuation in team sports") and the honest summary is: nobody has cracked it. The best models can identify which players move betting markets the most when injured, but that's a downstream signal, not a pre-game predictor.
Bookmakers don't model injuries cleanly either. What they do, and what gives them their edge, is react to injury news faster than the public.
When Bukayo Saka tweets a fitness update, the lines move within minutes. The bookmaker doesn't run a probability calculation. They make a discretionary adjustment based on internal heuristics ("a Saka injury is worth roughly +0.4 to Arsenal's opponent moneyline"), then let the market self-correct as more sharp money flows in.
This is fundamentally different from a model. It's a reaction loop, not a forecast. Anyone trying to mimic it with public data is always one step behind.
We've made a deliberate choice: our model treats injuries as noise, not as signal.
The model assumes a roughly representative starting eleven for each team. When teams have prolonged absences (a defender out for 6+ matches with a torn ACL, say), the team's recent form will reflect their reduced effectiveness: they'll have lost or drawn matches they'd otherwise have won, and that filters into their Elo rating naturally.
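To show how that filtering works, here is a sketch using the textbook Elo update. The K-factor of 20, the ratings, and the results are assumptions for illustration, and this plain version ignores refinements such as home advantage or goal difference; it is not a description of our production settings.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Standard Elo expectation for side A (win = 1, draw = 0.5, loss = 0)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a: float, rating_b: float, score_a: float, k: float = 20.0):
    """Return both sides' new ratings after one match. k=20 is an assumed K-factor."""
    delta = k * (score_a - expected_score(rating_a, rating_b))
    return rating_a + delta, rating_b - delta


# Hypothetical run: an 1800-rated side drops points against weaker opponents
# while a key defender is out. Results are from the stronger side's viewpoint.
rating = 1800.0
results = [(1600.0, 0.5), (1650.0, 0.0), (1600.0, 0.5)]
for opponent_rating, score in results:
    rating, _ = update_elo(rating, opponent_rating, score)
```

Nothing in the loop mentions an injury. The rating falls only because the results were worse than the rating predicted, which is how a long absence gets priced in without any injury feed.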
But for short-term, match-by-match injury news? We don't try.
This means our model will sometimes look outdated. We rated Liverpool at 78% to win against Everton last March, then Salah picked up a knock 90 minutes before kickoff. The match finished 1-1. A model with live injury data would have re-priced; ours didn't.
We accept this trade-off because the alternative, pretending we have injury data we don't actually have or making up adjustments based on rumour, would be dishonest.
Two practical implications:
Check team news yourself before betting. If our model rates a match 80% / 15% / 5% but the favourite is missing their starting goalkeeper and three first-team midfielders, our number is wrong on that day. Use your own judgement.
Trust our edge less in matches where one team has heavy absences. When teams have squad chaos, single-match outcomes are dominated by factors our model can't see. We do better in normal weeks than in chaotic ones.
You can see this in our calibration data. Match-week 22 of the 2025-26 season, where 4 of the top 6 teams had multiple first-team injuries, was our worst week of the season for prediction accuracy. We weren't surprised. The model wasn't built for that.
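If you want to run a week-by-week check like this on any set of predictions, one common option is the multi-class Brier score (lower is better). The sketch below is an illustration under assumptions: the sample data is made up, and the Brier score is not necessarily the metric behind our calibration chart.

```python
from collections import defaultdict

OUTCOMES = ("home", "draw", "away")


def brier(probs: dict, actual: str) -> float:
    """Multi-class Brier score for one match: squared error summed over the three outcomes."""
    return sum((probs[o] - (1.0 if o == actual else 0.0)) ** 2 for o in OUTCOMES)


def brier_by_week(predictions) -> dict:
    """Average the per-match Brier score within each match-week."""
    per_week = defaultdict(list)
    for week, probs, actual in predictions:
        per_week[week].append(brier(probs, actual))
    return {week: sum(scores) / len(scores) for week, scores in per_week.items()}


# Made-up sample: the same confident forecast, right in one week and wrong in the next.
sample = [
    (21, {"home": 0.80, "draw": 0.15, "away": 0.05}, "home"),
    (22, {"home": 0.80, "draw": 0.15, "away": 0.05}, "draw"),
]
weekly = brier_by_week(sample)
```

A week with a much higher average than its neighbours is worth checking against the team news for that round.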
A lot of AI tools in this space claim more than they can deliver. "Our AI factors in 200+ data points including real-time injury news, weather, referee tendencies, manager mood, and pre-match press conference sentiment."
When you press for specifics, the data sources get vague. The "real-time" part means "scraped from a public site that updates twice a day." The "sentiment analysis" is a pre-trained model run on press conference transcripts that nobody validates.
We could build the same. We've thought about it. The reason we don't is that we'd then have to defend it. When a journalist asks "your model said 80%, what happened?" we want to be able to say the model was honest about what it knew and didn't know. Adding fake sophistication to seem smarter would make us less credible, not more.
The model we ship is simpler than what marketing wants it to be. That's a feature, not a bug.
If real-time injury data ever becomes reliably available (clean, current, with proper player-importance weighting), we'll integrate it. The methodology page will be updated and you'll see it called out explicitly.
Until then: when our predictions look wrong because of obvious team news, that's exactly what's happening: the model hasn't seen the news. Don't take the number at face value during chaotic weeks. The model is right about the average week and silent about the specifics.
That's the honest answer.
More reading: Methodology page · Calibration chart · How our Premier League model works
OddsIQ provides AI analysis, not financial or betting advice. Past performance does not guarantee future results. Gamble responsibly: BeGambleAware, GamCare, GamStop.