A couple of weeks ago I posted a data set with the locations of the stadiums of many of the football teams in Europe. One thing I wanted to use the data set for was to see whether the traveling distance between two teams (measured as the distance between the two teams’ home stadiums) influences home field advantage.

To calculate the home field advantage for each match I did the following: For each team, the average goal difference over the season is calculated (goals scored minus goals conceded, divided by the number of matches). The expected goal difference for a match is then the difference between the two teams’ average goal differences (home minus away). The home field advantage is the observed goal difference minus the expected goal difference.

In the 2012-13 Premier League season, for example, Chelsea scored 75 goals and conceded 39 in total. Everton scored 55 and conceded 40. Both teams played 38 matches during the season. Chelsea’s average goal difference per match was 0.947 and Everton’s was 0.395. With Chelsea meeting Everton at home, the expected goal difference is 0.947 – 0.395 = 0.553 (computed from the unrounded averages). The actual outcome of this match was 2-1, a goal difference of 1. The home field advantage for this match is then 1 – 0.553 = 0.447.
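The arithmetic in this example can be reproduced with a few lines of Python. This is only a minimal sketch of the calculation as described above; the function names are made up for illustration and are not taken from the original analysis code.

```python
def average_goal_difference(goals_for, goals_against, matches):
    """Season goal difference per match: (scored - conceded) / matches."""
    return (goals_for - goals_against) / matches


def home_field_advantage(home_avg, away_avg, home_goals, away_goals):
    """Observed goal difference minus expected goal difference (home minus away)."""
    expected = home_avg - away_avg
    observed = home_goals - away_goals
    return observed - expected


# Season totals quoted in the text (2012-13 Premier League, 38 matches each).
chelsea = average_goal_difference(75, 39, 38)
everton = average_goal_difference(55, 40, 38)

# Chelsea at home against Everton, final score 2-1.
hfa = home_field_advantage(chelsea, everton, 2, 1)
print(round(chelsea, 3), round(everton, 3), round(hfa, 3))  # 0.947 0.395 0.447
```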

Using data from the 2011-12 and 2012-13 seasons from the top divisions of Spain, France and Germany, and the 2012-13 season from England, I used the stadium coordinates to calculate the traveling distance for the visiting team and the home field advantage. Plotting these two against each other and drawing the least squares line gives this:

There is a great deal of noise in this plot, to put it mildly. The slope of the red line is 0.00006039. This is the estimated increase in home field advantage, in goals, for each kilometer the away team has traveled. It is not significantly different from 0 (p-value = 0.646). The intercept, where the red line crosses the vertical axis, is 0.4, meaning that the home team is estimated to score 0.4 more goals than expected if the opposing team has traveled 0 kilometers. This is highly significant (p-value = 1.71e-11).
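For readers who want to try something similar, here is a rough sketch in Python of the two steps involved: turning stadium coordinates into a distance, and fitting a least squares line. It is not the code used for this post. It assumes the distance is a great-circle (haversine) distance, and it gets p-values from a normal approximation rather than the t-distribution, which should make little difference with many matches. All names and the toy numbers are made up for illustration.

```python
import math
from statistics import NormalDist, fmean

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x.

    Returns (slope, intercept, se_slope, se_intercept).
    """
    n = len(x)
    x_bar, y_bar = fmean(x), fmean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sigma2 = sum(r * r for r in residuals) / (n - 2)  # residual variance
    se_slope = math.sqrt(sigma2 / sxx)
    se_intercept = math.sqrt(sigma2 * (1 / n + x_bar ** 2 / sxx))
    return slope, intercept, se_slope, se_intercept


def two_sided_p(estimate, std_error):
    """Two-sided p-value for estimate = 0, using a normal approximation."""
    z = abs(estimate / std_error)
    return 2 * (1 - NormalDist().cdf(z))


# Made-up toy data, NOT from the data set: distance traveled by the away
# team (km) and the home field advantage of each match (goals).
distances_km = [50.0, 120.0, 300.0, 450.0, 800.0, 1000.0]
advantages = [1.2, -0.5, 0.8, 0.1, 1.5, -0.3]

slope, intercept, se_slope, se_intercept = fit_line(distances_km, advantages)
print(f"slope = {slope:.6f} (p = {two_sided_p(slope, se_slope):.3f})")
print(f"intercept = {intercept:.3f} (p = {two_sided_p(intercept, se_intercept):.3f})")
```

In practice `distances_km` would hold one `haversine_km(...)` value per match, computed from the home and away stadium coordinates.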

To be honest, I am a bit surprised to see such a clear lack of effect of traveling distance. I did not expect a particularly strong, or even very significant, effect, but I had hoped to see at least a hint of something. Perhaps one reason for the lack of effect is that traveling distance is not necessarily the same as traveling time: longer distances may be covered by air, making them comparable to shorter trips by land.

Keep in mind that these results apply only to the leagues included in the data. Traveling distance could still have a significant effect over longer distances, for example in international competitions such as the Champions League or between national teams.

This is a very interesting result. You may have expected something different, but the fact that travel distance has no such effect rules out one of the possible reasons why home field advantage (HFA) exists at all, which is still a mystery.

What I think is most likely to be the reason for HFA is that you play less well in unfamiliar surroundings. At home, you know every corner of your stadium; you know exactly where you are on the pitch even when you can only see the stands. Away from home, players do not have that.

That is not necessarily so. The biggest factor is the crowd. If you could somehow measure the effect of the crowd on home team advantage, you would probably see that a bigger crowd means a bigger home advantage, unless the attendance figures do not distinguish between home and away supporters. E.g. in the Turkish leagues, away teams are given only 10% of the total tickets to sell to their supporters.

Hi, would you mind sharing your R code? I added stadium coordinates for the second leagues and Italy to your data set and would like to check whether there might be an effect in the second leagues. You can reach me at klemens[-at-]rationalsoccer.com

Very neat analysis, by the way!

That would be interesting. I will put up the code this weekend; it needs some refactoring etc. to make it useful for others.