Why xG Is Overrated for Predicting Upsets
Expected goals revolutionised football analysis. It also produces some of the worst upset predictions in betting. Here's why xG falls down in the matches you actually care about.
Expected goals, or xG, is the most important football statistic of the last 15 years. It turned what every analyst already knew (some shots are better than others, some defences concede higher-quality chances than others) into a number you can put on a chart.
It's a real advance. We use xG-derived inputs in our model. Every credible analytics outfit does.
But xG has a specific failure mode, and it shows up exactly where most casual football fans meet statistical analysis: predicting upsets.
If you've ever seen a "Burnley dominated the xG and lost 0-3" hot take, you've seen the failure. This post is about what's actually happening when xG and the result diverge, why it happens more often than people expect, and what it means for anyone using xG-based predictions.
xG estimates the probability that a given shot becomes a goal, based on factors like distance, angle, body part, defensive pressure, and assist type. Sum across a match's shots and you get the expected goals each team should have scored, on average, given the chances they created.
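As a minimal sketch of that summation (the per-shot values below are made up for illustration and do not come from a real match or a real shot model):

```python
# Hypothetical per-shot xG values for one team in one match.
# In practice each value would come from a fitted shot model that looks at
# distance, angle, body part, defensive pressure and assist type.
shots = [0.45, 0.08, 0.03, 0.12, 0.30, 0.05]


def match_xg(shot_probabilities):
    """Match xG is simply the sum of the per-shot scoring probabilities."""
    return sum(shot_probabilities)


total_xg = match_xg(shots)
print(f"{len(shots)} shots, {total_xg:.2f} xG")
```

Note that the total says nothing about how the chances were distributed: one big chance plus five speculative efforts can add up to the same number as six middling ones.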
A team that creates 1.8 xG and concedes 0.6 xG should win comfortably most of the time. They had better and more chances.
That's the average. The variance is huge.
A 0.3 xG shot has a 30% probability of becoming a goal, averaged across all shots ever taken with similar characteristics. In any specific instance, the shot either goes in or doesn't. There is no "30% goal." There's a binary outcome.
Across 90 minutes, a team takes maybe 12-15 shots. Each is a probabilistic event. The realised goals can vary widely from the expected total.
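A simple example: a team with 10 shots, each at 0.15 xG, has 1.5 expected goals. Treating each shot as an independent 15% chance, the probability they score no goals is about 20%, one goal about 35%, two goals about 28%, and three or more about 18%.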
Most likely outcome is one goal. But scoring zero is nearly 1 in 5. Scoring three or more is also nearly 1 in 5. The xG number (1.5) hides this distribution.
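For anyone who wants to check the arithmetic, here is a small sketch that builds the full goal distribution from a list of per-shot probabilities. It assumes the shots are independent of each other, which is a simplification (rebounds and game state make real shots correlated), and the function name is ours for illustration only.

```python
def goal_distribution(shot_probs):
    """Return [P(0 goals), P(1 goal), ...] for a list of per-shot xG values.

    Assumes every shot is an independent yes/no event.
    """
    dist = [1.0]  # before any shot is taken, 0 goals is certain
    for p in shot_probs:
        updated = [0.0] * (len(dist) + 1)
        for goals, prob in enumerate(dist):
            updated[goals] += prob * (1 - p)   # this shot is missed
            updated[goals + 1] += prob * p     # this shot is scored
        dist = updated
    return dist


dist = goal_distribution([0.15] * 10)  # the 10-shot, 1.5 xG example
p_three_plus = sum(dist[3:])

for goals in range(3):
    print(f"{goals} goals: {dist[goals]:.1%}")
print(f"3+ goals: {p_three_plus:.1%}")
```

Because the function takes one probability per shot, it also works for uneven chance quality, for example one 0.6 xG sitter plus nine 0.1 xG efforts, which has the same 1.5 total but a different spread of outcomes.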
Now combine two teams with their own distributions. The probability of any specific scoreline starts to look much fuzzier than the xG totals suggest.
Consider Burnley vs Manchester City. xG totals at full time: Burnley 0.4, Man City 2.1.
City should win comfortably most of the time. Their xG advantage is real. But "most of the time" isn't always.
Run that match 100 times in your head: City win roughly 75 of them, roughly 15 end level, and Burnley win roughly 10.
That last category isn't impossible. It happens about 10% of the time. Burnley scores their one chance, City misses everything, and the table reads 1-0 to Burnley.
When this happens, the post-match analysis writes itself: "Burnley won despite being dominated on xG. Their tactical organisation and clinical finishing overcame possession dominance."
This is the wrong frame. Burnley didn't "overcome" anything. They got lucky in a specific match. The 10% outcome happened. xG was a perfectly fine prediction; the realised outcome just landed in the tail of the distribution.
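The "run it 100 times" thought experiment can be roughed out in code. The sketch below makes a strong simplifying assumption: each side's goals follow an independent Poisson distribution whose mean is its xG total. That is not how our model produces its numbers, and the exact split depends heavily on the assumptions, so it should not be expected to reproduce the illustrative percentages in the text. The point is only that the underdog's share is never zero.

```python
from math import exp, factorial


def poisson_pmf(k, mean):
    """Probability of exactly k goals for a Poisson variable with this mean."""
    return exp(-mean) * mean ** k / factorial(k)


def outcome_probabilities(xg_home, xg_away, max_goals=10):
    """Return (home win, draw, away win) under independent Poisson scoring.

    Scorelines above max_goals per side are ignored; at these xG levels
    they carry negligible probability.
    """
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, xg_home) * poisson_pmf(a, xg_away)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win


burnley, level, city = outcome_probabilities(0.4, 2.1)
print(f"Burnley win {burnley:.1%}, draw {level:.1%}, City win {city:.1%}")
```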
Models that use xG as their primary input have a specific failure mode for upset matches.
They underestimate variance. Quoting "City has 2.1 xG vs Burnley's 0.4" sounds like a confident prediction of a City win. But the underlying probability of any single goal materialising is much fuzzier than the xG totals suggest. A 2-1 underdog upset isn't a failure of xG; it's a normal outcome that xG itself implies.
They underweight defensive concentration. Some teams are deliberately defensive. Burnley under Sean Dyche didn't try to dominate possession or chances; they tried to stay compact, deny the opposition good chances, and counter once or twice with whatever they had. Their xG totals look terrible because that's the system. But their results were better than xG predicted, consistently, across many seasons.
A pure xG-based prediction model treats a game in which Burnley hold the opposition to 0.4 xG as if Burnley were just lucky to keep it that low. In reality, Burnley engineered that 0.4: the opposition didn't get good chances because Burnley's defensive structure prevented them. Predicting their next match from their attacking xG without giving proper weight to their defensive xG gives you wrong answers.
They miss situational factors. A team's xG average over a season tells you about their typical performance. A specific match might be unusual. A team playing for a draw against superior opposition will produce different patterns than the same team chasing a win. Pure xG models don't capture this without explicit context features.
Our model uses xG-derived inputs (specifically, attack and defence quality estimates that lean on xG-style chance quality), but it builds them into a Dixon-Coles Poisson framework that explicitly models the variance.
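For readers who haven't met the framework: Dixon-Coles starts from two Poisson scoring rates and adjusts the probabilities of the four lowest scorelines (0-0, 1-0, 0-1, 1-1) through a dependence parameter usually written rho. The sketch below illustrates that published adjustment only. It is not our production code, and the rates and the rho value are hypothetical numbers chosen for the example; in a fitted model the rates come from attack and defence strengths plus home advantage.

```python
from math import exp, factorial


def poisson_pmf(k, mean):
    return exp(-mean) * mean ** k / factorial(k)


def dc_tau(h, a, lam, mu, rho):
    """Dixon-Coles low-score correction; lam = home rate, mu = away rate."""
    if h == 0 and a == 0:
        return 1 - lam * mu * rho
    if h == 0 and a == 1:
        return 1 + lam * rho
    if h == 1 and a == 0:
        return 1 + mu * rho
    if h == 1 and a == 1:
        return 1 - rho
    return 1.0


def score_matrix(lam, mu, rho, max_goals=10):
    """matrix[h][a] = probability of the scoreline h-a."""
    return [
        [dc_tau(h, a, lam, mu, rho) * poisson_pmf(h, lam) * poisson_pmf(a, mu)
         for a in range(max_goals + 1)]
        for h in range(max_goals + 1)
    ]


def outcome_probabilities(matrix):
    """Collapse a score matrix into (home win, draw, away win)."""
    n = len(matrix)
    home = sum(matrix[h][a] for h in range(n) for a in range(n) if h > a)
    draw = sum(matrix[i][i] for i in range(n))
    away = sum(matrix[h][a] for h in range(n) for a in range(n) if h < a)
    return home, draw, away


# Hypothetical inputs: home rate 0.4, away rate 2.1, rho = -0.1.
adjusted = outcome_probabilities(score_matrix(0.4, 2.1, rho=-0.1))
plain = outcome_probabilities(score_matrix(0.4, 2.1, rho=0.0))
print("with correction:   ", [f"{p:.1%}" for p in adjusted])
print("without correction:", [f"{p:.1%}" for p in plain])
```

With a negative rho the correction moves probability towards 0-0 and 1-1, so draws become slightly more likely than plain independent Poisson would suggest. The output is always a full distribution over scorelines, never a single predicted winner.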
This matters. When we say "City 75% to win, 15% to draw, 10% Burnley win," we're not collapsing those percentages into a winner prediction. We're saying this is the probability distribution. The 10% Burnley win isn't a flaw; it's an honest part of the forecast.
If Burnley wins, our model wasn't "wrong." Our model said 10%. That's a reasonable probability that gets realised about 1 time in 10. If we say 10% on a thousand similar matches and the underdog wins 100 times, our model is calibrated.
If we never said 10% on those matches and instead said 2%, then we'd be wrong.
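A calibration check of this kind can be sketched in a few lines: group forecasts into probability buckets and compare the average forecast in each bucket with how often the event actually happened. The data below is synthetic and constructed to mirror the 10%-versus-2% example; it is not real match data, and the function name is illustrative.

```python
def calibration_table(forecasts, outcomes, n_bins=10):
    """Return rows of (mean forecast, observed frequency, count) per bucket.

    forecasts: probabilities assigned to an event (e.g. underdog win)
    outcomes:  True/False for whether the event happened
    """
    buckets = {}
    for p, happened in zip(forecasts, outcomes):
        idx = min(int(p * n_bins), n_bins - 1)  # which probability bucket
        total_p, hits, count = buckets.get(idx, (0.0, 0, 0))
        buckets[idx] = (total_p + p, hits + int(happened), count + 1)
    return [
        (total_p / count, hits / count, count)
        for _, (total_p, hits, count) in sorted(buckets.items())
    ]


# Synthetic example: 1,000 matches in which the underdog won 100 times.
outcomes = [True] * 100 + [False] * 900

honest = calibration_table([0.10] * 1000, outcomes)
overconfident = calibration_table([0.02] * 1000, outcomes)

for name, table in (("said 10%", honest), ("said 2%", overconfident)):
    for mean_forecast, observed, count in table:
        print(f"{name}: forecast {mean_forecast:.0%}, "
              f"observed {observed:.0%} over {count} matches")
```

A well-calibrated model has forecast and observed frequency close together in every bucket. The second case is the one that deserves to be called wrong: it said 2% for something that happened 10% of the time.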
A specific bad-faith argument shows up in football discourse: "Team X dominated the xG but lost. Therefore xG is broken / clutch matters / there's something the data can't see."
The first two are wrong; the third has a kernel of truth.
xG isn't broken. A team with 1.8 xG in a match is more likely to win than a team with 0.6 xG, on average across many matches. The single match outcome doesn't refute this; it's one sample from the distribution.
Clutch is mostly noise. "Clinical finishers" sometimes exist for a season or two and then regress to the mean. The narratives outlast the underlying truth. Some shots in some moments do matter more (penalties, shootouts, late equalisers in tight games), but most "clutch" patterns disappear in larger samples.
Some things xG can't see, sometimes. A team that's deliberately playing for a draw, parking the bus and soaking up pressure to score on a counter, has an xG profile that underrepresents what they're trying to do. They'd rather have 0.4 xG with a goal than 1.4 xG with no goal. Pure xG misses this strategy.
This is a real limitation of xG, but it's not "xG is broken." It's "xG is a measure of chance quality; it doesn't capture intent or strategic context."
Three rules of thumb for using xG without being misled.
Treat xG totals as expected averages, not predictions. "City has 2.1 xG and Burnley has 0.4 xG" doesn't predict the score; it describes what the chances created suggest should happen on average. The actual match has variance.
Look at xG over many matches, not single matches. A team's underlying xG performance over 10-15 matches is much more meaningful than any single match. If a team consistently outperforms or underperforms their xG over a long period, that's signal. If it's just a few matches, it's likely noise.
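To put a rough number on "signal versus noise", the sketch below estimates how far a team's goals per match can drift from its xG per match through chance alone. It assumes goals in each match are Poisson-distributed with a mean equal to the xG, which is only an approximation, and the 1.5 xG per match figure is a hypothetical example.

```python
from math import sqrt


def per_match_noise(xg_per_match, n_matches):
    """Standard deviation of (goals per match - xG per match) from luck alone.

    Under the Poisson assumption the variance of goals in one match equals
    the xG, so averaging over n matches shrinks the spread by sqrt(n).
    """
    return sqrt(xg_per_match / n_matches)


for n in (1, 5, 15, 38):
    noise = per_match_noise(1.5, n)
    print(f"{n:>2} matches: typical gap of about {noise:.2f} goals per match")
```

The spread shrinks with the square root of the sample size, which is why a gap between goals and xG after three or four matches tells you very little, while the same gap sustained over a season is worth investigating.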
Don't use xG alone to predict upsets. When the xG suggests a one-sided match, the upset scenario is real but harder to see in the numbers. Combining xG with other inputs (defensive style, recent form, tactical context) gets you closer to a useful prediction.
To be clear: xG remains the best widely-available football statistic for many things.
Where xG falls down is in single-match upset prediction. That's a narrow but visible failure mode that's worth understanding.
xG is a great statistic. It's also one that gets overinterpreted constantly. The fix isn't to abandon it; it's to remember that probabilities are distributions, single matches are draws, and upsets you call "xG dominance failures" are usually just the model behaving exactly as expected.
OddsIQ provides AI analysis, not financial or betting advice. Past performance does not guarantee future results. Gamble responsibly: BeGambleAware, GamCare, GamStop.