I was asked in the comments for the R code for the ranked probability score, so instead of posting it deep down in the comments I thought I’d post it as a proper blog post instead. The ranked probability score (RPS) is a measure of how similar two probability distributions are and is used as a way to evaluate the quality of a probabilistic prediction. It is an example of a proper scoring rule.

The RPS was brought to my attention in the paper *Solving the problem of inadequate scoring rules for assessing probabilistic football forecasting models* by Constantinou and Fenton. In that paper they argue that the RPS is the best measure of the quality of football predictions when the predictions are of the type where you have probabilities for the outcome (home win, draw or away win). The thing about the RPS is that it also reflects that an away win is in a sense *closer* to a draw than to a home win. That means that a higher probability predicted for a draw is considered better than a higher probability for a home win if the actual result is an away win.

You can also find some more details at the pena.lt/y blog.

The following R function takes two arguments. The first argument (*predictions*) is a matrix with the predictions. It should be laid out so that each row is one prediction, laid out in the proper order, where each element is a probability and each row sums to 1. The second argument (*observed*) is a numeric vector that indicates which outcome was actually observed.

For assessing football predictions the predictions matrix would have three columns, with the probabilities for the match ordered as home, draw and away (or in the opposite order).

rankProbScore <- function(predictions, observed){
  ncat <- ncol(predictions)
  npred <- nrow(predictions)
  rps <- numeric(npred)

  for (rr in 1:npred){
    obsvec <- rep(0, ncat)
    obsvec[observed[rr]] <- 1
    cumulative <- 0
    for (i in 1:ncat){
      cumulative <- cumulative + (sum(predictions[rr, 1:i]) - sum(obsvec[1:i]))^2
    }
    rps[rr] <- (1/(ncat - 1)) * cumulative
  }
  return(rps)
}
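As a quick check of the claim above that, when the actual result is an away win, putting more probability on the draw beats putting it on the home win, here is a small usage sketch (the probabilities are made up for illustration):

```r
# Two hypothetical predictions (columns: home, draw, away).
preds <- matrix(c(0.6, 0.3, 0.1,   # heavy on the home win
                  0.3, 0.6, 0.1),  # heavy on the draw
                nrow = 2, byrow = TRUE)

# Suppose both matches ended in an away win (column 3).
rankProbScore(predictions = preds, observed = c(3, 3))
# 0.585 0.450
```

The draw-heavy prediction gets the lower (better) score, even though both gave the same probability to the away win.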

Hi,

I’m using excel to calculate RPS. Could you help me to understand how it works?

Thank you

It works by comparing the cumulative probability distributions of the prediction and the observed outcome. Since the outcomes are ordered, it makes sense to use the cumulative distributions: for the observed outcome the cumulative distribution jumps abruptly from 0 to 1 at the point of the outcome.

Consider the following hypothetical scoring for a categorical (non-ordinal) outcome: For each possible outcome, take the squared difference between the observed probability (either 0 or 1) and your predicted probability. The higher the probability you managed to give the actual outcome, the lower the score you would get, indicating a better prediction. Now you can think of the RPS as an extension of this, where for each outcome you don’t just take the squared difference for that particular outcome, but for the cumulative probabilities, i.e. you also add in the probabilities of all outcomes “below” or “before” it. That way high probabilities far away from the observed outcome will have a greater influence on the score.
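To make this concrete, here is a small worked example (the numbers are invented for illustration). Both the naive per-outcome score and the RPS are computed by hand for a prediction of (0.5, 0.3, 0.2) over home/draw/away when the actual result is a draw:

```r
pred <- c(0.5, 0.3, 0.2)  # hypothetical prediction: home, draw, away
obs  <- c(0, 1, 0)        # the match ended in a draw

# Naive categorical score: squared difference per outcome.
sum((pred - obs)^2)
# 0.25 + 0.49 + 0.04 = 0.78

# RPS: squared differences of the cumulative distributions,
# divided by (number of outcomes - 1).
sum((cumsum(pred) - cumsum(obs))^2) / (length(pred) - 1)
# ((0.5 - 0)^2 + (0.8 - 1)^2 + (1 - 1)^2) / 2 = 0.29 / 2 = 0.145
```

Note how in the RPS the home-win probability (0.5) is counted again in the cumulative term for the draw, which is what penalizes probability mass far from the observed outcome.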

Thank you for your answer.

Could you help me with a practical example?

There are some nice examples in the paper I linked to in the second paragraph.

I found this document:

http://trap.ncirl.ie/1875/1/lucaairoldi.pdf

I’m trying to understand how to reproduce the RPS results on page 28.

Could you help me?

Thank you

Hello, your R function doesn’t work right. I read the paper “Solving the problem of inadequate scoring rules for assessing probabilistic football forecasting models”.

I applied your function to match 3 of this paper (for models alpha and beta) and I got different results from those in the paper (see Table 3).

It works for me. I think perhaps you are using it wrong. The *observed* argument should be a number in {1, 2, 3}, not a matrix or a length-3 vector of 0’s and 1’s. The number points to the column in the prediction matrix that the observed result corresponds to.

Here is how I managed to reproduce the RPS for match 3:
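A call of this shape is what does the trick. The probabilities below are placeholders (the actual Table 3 values from the paper are not repeated here), so only the call shape matters:

```r
# Placeholder probabilities -- NOT the actual values from Table 3
# of the paper. One row per match, columns: home, draw, away.
alpha <- matrix(c(0.55, 0.25, 0.20), nrow = 1)

# The observed result given as a column index (here: a draw).
rankProbScore(predictions = alpha, observed = 2)
```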

I see that I didn’t explain this very well in the post. When I tried to use it myself I also got wrong results and it took me some time to figure out what was wrong. Do you get this to work?

Ok, you are right. I was using

rankProbScore(predictions=alpha, observed=c(0,1,0))

instead of

rankProbScore(predictions=alpha, observed=2)

Thank you