It may be that a football team that has had a hectic period with a lot of games will perform worse because of a lack of training and recovery. The Wikipedia page for the FA Cup mentions Manchester United’s absence from the cup as a reason why they won the Premier League by 18 points in the 1999-2000 season. If this is indeed the case, then it is something we could try to exploit in a prediction model.
I used basically the same data and model as I have used before. I used data from the English Championship and the Premier League, and predicted the Premier League games from January 2007 until January 2015 using the independent Poisson model with the Dixon & Coles weighting method (more details on the setup here and here). In addition I constructed a new variable, the number of matches each team has played in the last x days, where we can try different values of x. As a pretentious shorthand I will call this the Match Schedule Intensity Index (MSII). Matches from the FA Cup, Europa Cup and Champions League were also included in the calculations.
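To make the definition of the MSII concrete, here is a minimal Python sketch of how the count could be computed for one team. The function name `msii`, the example fixture list and the choice to count matches from exactly x days before kickoff up to (but not including) the match itself are my own assumptions for illustration, not necessarily how the variable was built for the analysis.

```python
from bisect import bisect_left
from datetime import date, timedelta


def msii(match_dates, kickoff, days_back=28):
    """Count the matches a team played in the `days_back` days before `kickoff`.

    Assumption: the window is [kickoff - days_back, kickoff), so the match
    being predicted is not counted itself.
    """
    played = sorted(match_dates)
    window_start = kickoff - timedelta(days=days_back)
    first_in_window = bisect_left(played, window_start)
    first_not_before_kickoff = bisect_left(played, kickoff)
    return first_not_before_kickoff - first_in_window


# Hypothetical fixture list for one team, all competitions combined.
fixtures = [
    date(2014, 12, 6),
    date(2014, 12, 13),
    date(2014, 12, 20),
    date(2014, 12, 26),
    date(2014, 12, 28),
    date(2015, 1, 1),
]

for x in (21, 25, 28, 31):
    print(x, msii(fixtures, date(2015, 1, 1), days_back=x))
```

The count would be computed separately for the home and away team of each match and then added to the model as a covariate.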
As usual the ranked probability score (RPS) is used to assess the prediction accuracy.
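For readers who have not seen the RPS before, here is a small sketch of how it can be computed for a single match, using the standard definition based on cumulative probabilities over the ordered outcomes (home win, draw, away win). The function name `rps` and the example probabilities are made up for illustration. Lower values are better, and the score reported for a model is the average over all predicted matches.

```python
def rps(probs, outcome):
    """Ranked probability score for one match.

    probs   -- predicted probabilities for the ordered outcomes,
               e.g. [home win, draw, away win]
    outcome -- index of the outcome that actually happened
    """
    cum_pred = 0.0
    cum_obs = 0.0
    total = 0.0
    # The last cumulative term is always 1 - 1 = 0, so it is skipped.
    for i in range(len(probs) - 1):
        cum_pred += probs[i]
        cum_obs += 1.0 if i == outcome else 0.0
        total += (cum_pred - cum_obs) ** 2
    return total / (len(probs) - 1)


prediction = [0.5, 0.3, 0.2]  # hypothetical probabilities
print(rps(prediction, 0))  # home win: the favoured outcome, low score
print(rps(prediction, 2))  # away win: the least likely outcome, high score
```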
I tried four different numbers of days backwards in time (21, 25, 28 and 31 days) and also varied the time weighting parameter \(\xi\) a bit to see how these things varied together.
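As a reminder of what \(\xi\) does, the Dixon & Coles method down-weights older matches exponentially, so that a match played t time units ago gets weight \(\exp(-\xi t)\) when the model is fitted. The sketch below assumes that t is measured in days and uses the hypothetical function name `dc_weight`; the values of \(\xi\) other than 0.0020 are only there to show the shape of the weighting and are not necessarily the ones I tried.

```python
import math


def dc_weight(days_ago, xi=0.0020):
    """Dixon & Coles style time weight for a match played `days_ago` days ago.

    Assumption: time is measured in days.
    """
    return math.exp(-xi * days_ago)


# Larger xi means that old matches lose their influence faster.
for xi in (0.0015, 0.0020, 0.0025):
    weights = [round(dc_weight(d, xi), 3) for d in (0, 30, 365, 730)]
    print(xi, weights)
```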
Plotting the RPS, number of days back in time and the different values of \(\xi\) against each other gives the following:
We see that looking back 28 days, or four weeks, gives the lowest RPS and thus the most accurate predictions of the four alternatives. 25 days is almost as good as 28 days, while 21 and 31 days perform worse than not having the MSII in the model at all. I am not sure how important the drop in RPS is, as the changes are around the 4th and 5th decimal place. It is probably not that much, but on the other hand, this is an average over 3000 matches, and the number of days backward in time seems to be a more important parameter than the small changes in \(\xi\) that I tried.
It is also interesting to see what effect the MSII has on the number of goals scored. I plotted the estimated multiplicative effect for each additional match for all the fitted models from 2007 to 2015 using the best model with 28 days and \(\xi=0.0020\).
I expected the effect of additional matches to be negative, meaning the more games a team has recently played, the fewer goals it will be expected to score. This seems to be true for much of the period, with the exception of a few excursions to the positive side around 2010 and 2013-2014, and a rather large positive effect from the start in 2007 until 2008. This was a bit surprising, and I don’t know why it happens. It would be interesting to redo the analysis with data going further back in time to see how far back the positive effect goes.
Is the effect large? Not really. The most extreme values of the multiplicative effect for the MSII are around 0.97 and 1.04. These values mean that for each additional match a team has played in the last four weeks, it is expected to score around 3% fewer or 4% more goals. For a team that has played four matches in four weeks, which is a typical mid-season schedule, this compounds to roughly 11% fewer goals (\(0.97^4 \approx 0.89\)) at one extreme and roughly 17% more (\(1.04^4 \approx 1.17\)) at the other. This is not that big of a deal for individual matches, but it seems to improve the predictions in the long run. I also think it is necessary to keep in mind that the effect seems to be mostly absent in some periods.
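Because the MSII enters the model as a multiplicative effect per match, the effect compounds when a team has played several matches. The short sketch below works through the arithmetic for the two extreme estimates, assuming the comparison is against a team with no matches in the window; the function name `cumulative_effect` is just for illustration.

```python
def cumulative_effect(per_match_effect, n_matches):
    """Total multiplicative effect on expected goals after `n_matches` matches."""
    return per_match_effect ** n_matches


for effect in (0.97, 1.04):
    total = cumulative_effect(effect, 4)
    change = (total - 1.0) * 100
    print(f"per match {effect:.2f} -> four matches {total:.3f} ({change:+.1f}%)")
```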