A small adjustment to the Poisson model that improves predictions.

There are a lot of extensions to the basic Poisson model for predicting football results, where perhaps the most popular is the Dixon-Coles model, which I and others have written a lot about. One paper that seems to have received little attention is the 2001 paper Prediction and Retrospective Analysis of Soccer Matches in a League by Håvard Rue and Øyvind Salvesen (preprint available here). The model they describe in the paper extends the Dixon-Coles and Poisson models in several ways. The most interesting extension is how they allow the attack and defense parameters to vary over time, by estimating a separate set of parameters for each match. This might at first seem like a task that should be impossible, but they manage to pull it off by using some Bayesian magic that lets the estimated parameters borrow information across time. I have tried to implement something similar in Stan, but I haven’t gotten it to work quite right, so that will have to wait for another time. There are many other interesting extensions in the paper as well, and here I am going to focus on one of them, which is an adjustment for the tendency of teams to over- and underestimate opponents when they differ in strength.

The adjustment is added to the formulas for calculating the log-expected goals. So if team A plays team B at home, the log-expected goals \(\lambda_A\) and \(\lambda_B\) are

\( \lambda_A = \alpha + \beta + attack_{A} - defense_{B} - \gamma \Delta_{AB} \)

\( \lambda_B = \alpha + attack_{B} - defense_{A} + \gamma \Delta_{AB} \)

In these formulas \(\alpha\) is the intercept, \(\beta\) is the home team advantage, and \(\Delta_{AB}\) is a factor that determines how much a team under- or overestimates the strength of the opponent. This factor is given as

\(\Delta_{AB} = (attack_{A} + defense_{A} - attack_{B} - defense_{B}) / 2\)

The parameter \(\gamma\) determines how large this effect is. A positive \(\gamma\) implies that a strong team will underestimate a weak opponent, and thereby score fewer goals than we would otherwise expect, and vice versa for the opponent.
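To make the formulas concrete, here is a minimal sketch in Python of how the adjusted expected goals could be computed. The function name and the parameter values are made up for illustration; they are not estimates from any real league.

```python
import math

def adjusted_expected_goals(alpha, beta, attack_a, defense_a,
                            attack_b, defense_b, gamma=0.1):
    """Expected goals for home team A and away team B, with the adjustment."""
    # Strength difference between the two teams (Delta_AB in the text).
    delta_ab = (attack_a + defense_a - attack_b - defense_b) / 2
    # Log-expected goals; defense enters with a minus sign, as in the formulas.
    log_lambda_a = alpha + beta + attack_a - defense_b - gamma * delta_ab
    log_lambda_b = alpha + attack_b - defense_a + gamma * delta_ab
    return math.exp(log_lambda_a), math.exp(log_lambda_b)

# Made-up values: a strong home team A against a weak away team B.
no_adj = adjusted_expected_goals(0.1, 0.3, 0.4, 0.3, -0.2, -0.1, gamma=0.0)
with_adj = adjusted_expected_goals(0.1, 0.3, 0.4, 0.3, -0.2, -0.1, gamma=0.1)
print(no_adj, with_adj)
```

With a positive \(\gamma\), the stronger team's expected goals go down a bit and the weaker team's go up, compared with \(\gamma = 0\).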

In the paper they do not estimate the \(\gamma\) parameter directly together with the other parameters, but instead set it to a constant, with a value they determine by backtesting to maximize predictive ability.

When I implemented this model in R and estimated it using Maximum Likelihood, I noticed that adding the adjustment did not improve the model fit. I suspect that this might be because the model is nearly unidentifiable. I even tried to add a Normal prior on \(\gamma\) and get a Maximum a Posteriori (MAP) estimate, but then the MAP estimate was completely determined by the expected value of the prior. Because of these problems I decided to use a different strategy: I estimated the model without the adjustment, but added the adjustment when making predictions.

I am not going to post any R code on how to do this, but if you have estimated a Poisson or Dixon-Coles model, it should not be that difficult to add the adjustment when you calculate the predictions. If you are going to use some of the code I have posted on this blog before, you should note the important detail that in the formulation above I have followed the paper and changed the signs of the defense parameters, so that a larger defense value means fewer goals for the opponent.
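As a rough illustration in Python rather than R, the sketch below adds the adjustment to predictions from a model that has already been fitted. It assumes the fitted model uses the convention where the defense parameters enter with a plus sign, so they are flipped before the adjustment is applied. The `fitted` dictionary, its layout, and the team names are hypothetical.

```python
import math

def predict_expected_goals(params, home, away, gamma=0.1):
    """Expected goals (home, away) from a fitted model, with the adjustment added.

    `params` is a hypothetical dict with estimates from a model where
    log(lambda_home) = intercept + home + attack[home team] + defense[away team],
    i.e. defense enters with a plus sign.
    """
    attack = params["attack"]
    # Flip the sign so that defense enters with a minus sign, as in the paper.
    defense = {team: -value for team, value in params["defense"].items()}

    delta = (attack[home] + defense[home] - attack[away] - defense[away]) / 2
    log_home = (params["intercept"] + params["home"]
                + attack[home] - defense[away] - gamma * delta)
    log_away = params["intercept"] + attack[away] - defense[home] + gamma * delta
    return math.exp(log_home), math.exp(log_away)

# Made-up estimates for two teams, for illustration only.
fitted = {
    "intercept": 0.1,
    "home": 0.3,
    "attack": {"Strong FC": 0.4, "Weak United": -0.2},
    "defense": {"Strong FC": -0.3, "Weak United": 0.1},
}
print(predict_expected_goals(fitted, "Strong FC", "Weak United", gamma=0.1))
```

Setting `gamma=0` gives back the predictions of the unadjusted model, which is a handy sanity check that the signs have been flipped correctly.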

In the paper Rue and Salvesen write that \(\gamma = 0.1\) seemed to be an overall good value when they analyzed English Premier League data. To see if my approach of adding the adjustment only when doing predictions is reasonable, I did a leave-one-out cross validation on some seasons of the English Premier League and the German Bundesliga. I fitted the model to all the games in a season, except one, and then added the adjustment when predicting the result of the left-out match. I did this for several values of \(\gamma\) to see which value works best.
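Predicting the result of the left-out match means turning the two expected goal values into probabilities for a home win, a draw and an away win. A minimal Python sketch of that step is shown here, assuming independent Poisson distributed goals (so without the Dixon-Coles correction for low scores). The function names and the `max_goals` cutoff are my own choices for the illustration.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k goals when the expected number is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def outcome_probabilities(lambda_home, lambda_away, max_goals=15):
    """Probabilities of (home win, draw, away win) under independent Poisson goals."""
    home = draw = away = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, lambda_home) * poisson_pmf(a, lambda_away)
            if h > a:
                home += p
            elif h == a:
                draw += p
            else:
                away += p
    # Renormalize to account for the small probability mass above max_goals.
    total = home + draw + away
    return home / total, draw / total, away / total

print(outcome_probabilities(1.8, 1.1))
```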

Here is a plot of the Ranked Probability Score (RPS), which is a measure of prediction accuracy, against different values of \(\gamma\) for the 2011-12 Premier League season:

As you can see, I even tried some negative values of \(\gamma\), just in case. At least in this season the result agrees with the estimate \(\gamma = 0.1\) that Rue and Salvesen reported. In some of the later seasons that I checked, the optimal \(\gamma\) varies somewhat. In some seasons it is almost 0, but then again in some others it is around 0.1. So at least for the Premier League, using \(\gamma = 0.1\) seems reasonable.

Things are a bit different in the Bundesliga. Here is the same kind of plot for the 2011-12 season:

As you can see, the optimal value here is around 0.25. In the other seasons I checked, the optimal value was somewhere between 0.15 and 0.3. So the effect of over- and underestimating the opponent seems to be greater in the Bundesliga than in the Premier League.
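If you want to repeat this kind of comparison yourself, the RPS for a single match can be computed as in the following Python sketch. It uses the standard definition for ordered outcomes (home win, draw, away win), where lower values mean better predictions. The function name and the example probabilities are made up for illustration. Averaging the score over all left-out matches for each candidate \(\gamma\) gives the kind of curve shown in the plots.

```python
def ranked_probability_score(probs, outcome_index):
    """RPS for one match.

    probs: predicted probabilities for the ordered outcomes
           (home win, draw, away win).
    outcome_index: position of the observed outcome in that ordering.
    """
    observed = [0.0] * len(probs)
    observed[outcome_index] = 1.0

    cumulative_pred = 0.0
    cumulative_obs = 0.0
    total = 0.0
    # Sum squared differences of the cumulative distributions; the last
    # category is skipped because both cumulative sums equal 1 there.
    for p, o in zip(probs[:-1], observed[:-1]):
        cumulative_pred += p
        cumulative_obs += o
        total += (cumulative_pred - cumulative_obs) ** 2
    return total / (len(probs) - 1)

# Made-up prediction for a match that ended in a home win.
print(ranked_probability_score([0.5, 0.3, 0.2], 0))
```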