{"id":913,"date":"2014-11-25T19:39:27","date_gmt":"2014-11-25T19:39:27","guid":{"rendered":"http:\/\/opisthokonta.net\/?p=913"},"modified":"2015-08-22T18:19:16","modified_gmt":"2015-08-22T18:19:16","slug":"the-dixon-coles-model-for-predicting-football-matches-in-r-part-2","status":"publish","type":"post","link":"https:\/\/opisthokonta.net\/?p=913","title":{"rendered":"The Dixon-Coles model for predicting football matches in R (part 2)"},"content":{"rendered":"<p><a href=\"https:\/\/opisthokonta.net\/?p=890\">Part 1<\/a> ended with running the optimizer function to estimate the parameters in the model:<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\nlibrary(alabama)\r\nres &lt;- auglag(par=par.inits, fn=DCoptimFn, heq=DCattackConstr, DCm=dcm)\r\n\r\n# Take a look at the parameters\r\nres$par\r\n<\/pre>\n<p>In part 1 I fitted the model to data from the 2011-12 Premier League season. Now it&#8217;s time to use the model to make a prediction. As an example I will predict the result of Bolton playing at home against Blackburn. <\/p>\n<p>The first thing we need to do is to calculate the lambda and mu parameters, which are (approximately, anyway) the expected numbers of goals scored by the home and away teams, respectively. To do this we need to extract the correct parameters from the <em>res$par<\/em> vector. Recall that in the last post I gave the parameters informative names that consist of the team name prefixed by either <em>Attack<\/em> or <em>Defence<\/em>.<br \/>\n<del datetime=\"2014-12-21T11:47:31+00:00\">Also notice that I have to multiply the team parameters and then exponentiate the result to get the correct answer. <\/del><\/p>\n<p><em>Update: For some reason I had the idea that the team parameters should be multiplied together instead of added together, but I have now fixed the code and the results. 
<\/em><\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\n# Expected goals home\r\nlambda &lt;- exp(res$par['HOME'] + res$par['Attack.Bolton'] + res$par['Defence.Blackburn'])\r\n\r\n# Expected goals away\r\nmu &lt;- exp(res$par['Attack.Blackburn'] + res$par['Defence.Bolton'])\r\n<\/pre>\n<p>This gives an expected 2.07 goals for Bolton and 1.59 goals for Blackburn.<\/p>\n<p>Since the model assumes dependence between the numbers of goals scored by the two teams, it is not enough to plug the lambda and mu parameters into R&#8217;s built-in Poisson density function (<em>dpois<\/em>) to get the probabilities for the number of goals scored by each team. We also need to incorporate the adjustment for the low-scoring results. One way to do this is to first create a matrix based on the simple independent Poisson distributions: <\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\nmaxgoal &lt;- 6 # will be useful later\r\nprobability_matrix &lt;- dpois(0:maxgoal, lambda) %*% t(dpois(0:maxgoal, mu))\r\n<\/pre>\n<p>The number of home goals runs along the vertical axis (rows) and the number of away goals along the horizontal axis (columns), so the entry in row i and column j is the probability that the home team scores i-1 goals and the away team scores j-1 goals. <\/p>\n<p>Now we can use the estimated dependency parameter rho to create a 2-by-2 matrix of scaling factors, which is then multiplied element-wise with the top-left 2-by-2 block of the matrix calculated above (the results 0-0, 1-0, 0-1 and 1-1):<\/p>\n<p><em>Update: Thanks to Mike who pointed out a mistake in this code. 
<\/em><\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\n# matrix() fills column-wise, so the entries are tau(0,0), tau(1,0) in the\r\n# first column and tau(0,1), tau(1,1) in the second (home goals by row)\r\nscaling_matrix &lt;- matrix(tau(c(0,1,0,1), c(0,0,1,1), lambda, mu, res$par['RHO']), nrow=2)\r\nprobability_matrix[1:2, 1:2] &lt;- probability_matrix[1:2, 1:2] * scaling_matrix\r\n<\/pre>\n<p>With this matrix it is easy to calculate the probabilities for the three match outcomes:<\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\n# Below the diagonal the home team has scored more goals than the away team\r\nHomeWinProbability &lt;- sum(probability_matrix[lower.tri(probability_matrix)])\r\nDrawProbability &lt;- sum(diag(probability_matrix))\r\nAwayWinProbability &lt;- sum(probability_matrix[upper.tri(probability_matrix)])\r\n<\/pre>\n<p>This gives a probability of 0.49 for a home win, 0.21 for a draw and 0.29 for an away win. <\/p>\n<p>Calculating the probabilities for the different goal differences is a bit trickier. The probability of each goal difference can be found by adding up the numbers on the corresponding diagonal, with the sum of the main diagonal being the probability of a draw. <\/p>\n<pre class=\"brush: r; title: ; notranslate\" title=\"\">\r\n# awayG[k]: probability that the away team wins by k goals\r\nawayG &lt;- numeric(maxgoal)\r\nfor (gg in 2:maxgoal){\r\n  awayG[gg-1] &lt;- sum(diag(probability_matrix[,gg:(maxgoal+1)]))\r\n}\r\nawayG[maxgoal] &lt;- probability_matrix[1,(maxgoal+1)]\r\n\r\n# homeG[k]: probability that the home team wins by k goals\r\nhomeG &lt;- numeric(maxgoal)\r\nfor (gg in 2:maxgoal){\r\n  homeG[gg-1] &lt;- sum(diag(probability_matrix[gg:(maxgoal+1),]))\r\n}\r\nhomeG[maxgoal] &lt;- probability_matrix[(maxgoal+1),1]\r\n\r\n# Goal differences from -maxgoal (away win) to +maxgoal (home win)\r\ngoaldiffs &lt;- c(rev(awayG), sum(diag(probability_matrix)), homeG)\r\nnames(goaldiffs) &lt;- -maxgoal:maxgoal\r\n<\/pre>\n<p>It is always nice to plot the probability distribution:<\/p>\n<p><a href=\"https:\/\/opisthokonta.net\/wp-content\/uploads\/2014\/12\/DCBoltonBlackburn.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/opisthokonta.net\/wp-content\/uploads\/2014\/12\/DCBoltonBlackburn.png\" alt=\"DCBoltonBlackburn\" width=\"682\" height=\"430\" class=\"aligncenter size-full wp-image-1004\" 
srcset=\"https:\/\/opisthokonta.net\/wp-content\/uploads\/2014\/12\/DCBoltonBlackburn.png 682w, https:\/\/opisthokonta.net\/wp-content\/uploads\/2014\/12\/DCBoltonBlackburn-300x189.png 300w, https:\/\/opisthokonta.net\/wp-content\/uploads\/2014\/12\/DCBoltonBlackburn-475x300.png 475w\" sizes=\"auto, (max-width: 682px) 100vw, 682px\" \/><\/a><\/p>\n<p>We can also compare this distribution with the distribution without the Dixon-Coles adjustment (<em>i.e.<\/em> the goals scored by the two teams are independent):<\/p>\n<p><a href=\"https:\/\/opisthokonta.net\/wp-content\/uploads\/2014\/12\/DCboltonBlackburn2.png\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/opisthokonta.net\/wp-content\/uploads\/2014\/12\/DCboltonBlackburn2.png\" alt=\"DCboltonBlackburn2\" width=\"669\" height=\"431\" class=\"aligncenter size-full wp-image-1005\" srcset=\"https:\/\/opisthokonta.net\/wp-content\/uploads\/2014\/12\/DCboltonBlackburn2.png 669w, https:\/\/opisthokonta.net\/wp-content\/uploads\/2014\/12\/DCboltonBlackburn2-300x193.png 300w, https:\/\/opisthokonta.net\/wp-content\/uploads\/2014\/12\/DCboltonBlackburn2-465x300.png 465w\" sizes=\"auto, (max-width: 669px) 100vw, 669px\" \/><\/a><\/p>\n<p>As expected, we see that the adjustment gives a higher probability for a draw, and lower probabilities for goal differences of one goal.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Part 1 ended with running the optimizer function to estimate the parameters in the model: In part 1 I fitted the model to data from the 2011-12 Premier League season. 
Now it&#8217;s time to use the model to make a &hellip; <a href=\"https:\/\/opisthokonta.net\/?p=913\">Continue reading <span class=\"meta-nav\">&rarr;<\/span><\/a><\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[48,5,6],"tags":[],"class_list":["post-913","post","type-post","status-publish","format-standard","hentry","category-dixon-coles-model","category-r","category-soccer"],"_links":{"self":[{"href":"https:\/\/opisthokonta.net\/index.php?rest_route=\/wp\/v2\/posts\/913","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/opisthokonta.net\/index.php?rest_route=\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/opisthokonta.net\/index.php?rest_route=\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/opisthokonta.net\/index.php?rest_route=\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/opisthokonta.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcomments&post=913"}],"version-history":[{"count":21,"href":"https:\/\/opisthokonta.net\/index.php?rest_route=\/wp\/v2\/posts\/913\/revisions"}],"predecessor-version":[{"id":1006,"href":"https:\/\/opisthokonta.net\/index.php?rest_route=\/wp\/v2\/posts\/913\/revisions\/1006"}],"wp:attachment":[{"href":"https:\/\/opisthokonta.net\/index.php?rest_route=%2Fwp%2Fv2%2Fmedia&parent=913"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/opisthokonta.net\/index.php?rest_route=%2Fwp%2Fv2%2Fcategories&post=913"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/opisthokonta.net\/index.php?rest_route=%2Fwp%2Fv2%2Ftags&post=913"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}