Can new metrics improve prediction accuracy?




On the 17th of October 2017, Leicester City sacked manager Craig Shakespeare. A poor start left Shakespeare’s men sitting 18th in the table – in the relegation zone – with just six points from an available 24.

The decision to sack Shakespeare didn’t come as much of a surprise given Leicester’s poor results, but digging a little deeper with some more advanced metrics reveals a more nuanced story.

Goals are rare events, and in small samples luck can play a major role, producing results that don’t accurately reflect underlying performances. Model-based traders and syndicates alike start a match with an estimate of how many goals each team will score (an ‘expectation’). This is an answer to the question: ‘if this match were played an infinite number of times, how many goals would each team score on average?’ To better understand how a team is performing and derive a ‘fair score’, we need stats that are more predictive in nature and reflect performance rather than just results.
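To make the idea of an expectation concrete, here is a minimal Python sketch. It assumes each team’s goals follow an independent Poisson distribution, which is a common simplification and not something the approach above depends on. The function names and the example expectations of 1.5 and 1.0 goals are purely illustrative.

```python
from math import exp, factorial


def poisson_pmf(k: int, mean: float) -> float:
    """Probability of exactly k goals when the goal expectation is `mean`."""
    return exp(-mean) * mean ** k / factorial(k)


def outcome_probabilities(home_exp: float, away_exp: float, max_goals: int = 10):
    """Return (home win, draw, away win) probabilities.

    ASSUMPTION: goals for each side are independent Poisson variables.
    Scorelines above `max_goals` per team are ignored, so the three
    values sum to very slightly less than 1.
    """
    home_win = draw = away_win = 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_exp) * poisson_pmf(a, away_exp)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
    return home_win, draw, away_win


# Illustrative expectations only, not taken from any real match.
home_win, draw, away_win = outcome_probabilities(1.5, 1.0)
print(f"home {home_win:.3f}  draw {draw:.3f}  away {away_win:.3f}")
```

Even with a clear edge in expectation, the weaker side keeps a sizeable share of the outcomes, which is why a handful of results can mislead.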

Consider expected goals (or xG for short), a metric which measures the likelihood of any individual shot being scored. By looking at the total xG value of the shots a team has taken and conceded we can get a better understanding of how a team is actually performing by quantifying the quality of all these chances. This gives us much greater insight into performance and allows us to better predict future performances. For a more illustrative explanation of xG, see the following link:

Below is a table highlighting Leicester’s xG numbers under Shakespeare and how they ranked relative to the rest of the league.

xG For | Goals For | xG Against | Goals Against
12.04 (6th) | 10 (8th) | 10.48 (12th) | 13 (17th)

What stands out is that the underlying performances weren’t that bad under Shakespeare. Leicester scored about two fewer goals than the quality of their chances would suggest (10 against an xG of 12.04) and conceded about two and a half more than the quality of chances they allowed would suggest (13 against an xG of 10.48). They generated the 6th best total quality of scoring chances in the league, according to the xG of their shots, and were 12th in the league in terms of xG conceded. Despite being in the relegation zone, they were performing like a mid-table team.
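The arithmetic behind that comparison can be sketched in a few lines of Python. The figures are the ones from the table; the dictionary layout and variable names are illustrative assumptions, not an Opta data format.

```python
# Leicester's numbers under Shakespeare, copied from the table above.
leicester = {
    "xg_for": 12.04,
    "goals_for": 10,
    "xg_against": 10.48,
    "goals_against": 13,
}

# Negative attack gap: fewer goals scored than chance quality suggested.
attack_gap = round(leicester["goals_for"] - leicester["xg_for"], 2)

# Positive defence gap: more goals conceded than chance quality suggested.
defence_gap = round(leicester["goals_against"] - leicester["xg_against"], 2)

# Actual goal difference versus the difference implied by xG.
goal_difference = leicester["goals_for"] - leicester["goals_against"]
xg_difference = round(leicester["xg_for"] - leicester["xg_against"], 2)

print(attack_gap, defence_gap)         # -2.04 2.52
print(goal_difference, xg_difference)  # -3 1.56
```

A goal difference of -3 alongside an xG difference of +1.56 is the gap between results and performance that the table points to.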

Following Shakespeare’s sacking, Leicester have risen to 8th in the table and now look to be fighting for a Europa League spot. This change in form would have been surprising had we looked only at the results, but advanced metrics like xG show that this regression to the mean is exactly what we should have expected, and it is certainly something a professional trader should account for to derive value from the market.

This highlights one of the key features of xG: it can predict future results (in this case future goals) better than results and goals alone. Opta’s advanced metrics are a powerful way to gain a better understanding of the actual state of play, and can be a crucial tool for syndicates and professional traders alike.

What xG doesn’t tell us is how a team is generating or preventing these chances. Sequence data can be used to better understand how a team is playing, their approach to attack and defence as well as other more stylistic features.

A sequence is defined as an uninterrupted passage of play where the ball is in the possession of one team and is ended by a defensive action, a shot or a stoppage in play – you can find a more thorough description of the sequence and possession framework here.
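As a rough illustration of that definition, the sketch below splits a flat list of events into sequences. The `Event` record, the event names and the rule that a change of team also closes a sequence are all assumptions made for the example; they are not Opta’s actual event schema or framework.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Event:
    team: str
    kind: str  # hypothetical labels: "pass", "carry", "shot", "tackle", ...


# ASSUMPTION: these event kinds close a sequence without belonging to one.
BREAK_KINDS = {"tackle", "interception", "clearance", "stoppage"}


def split_sequences(events: List[Event]) -> List[List[Event]]:
    """Group events into uninterrupted passages of play by one team."""
    sequences: List[List[Event]] = []
    current: List[Event] = []

    def close() -> None:
        nonlocal current
        if current:
            sequences.append(current)
            current = []

    for ev in events:
        if ev.kind in BREAK_KINDS:
            close()          # defensive action or stoppage ends the sequence
            continue
        if current and ev.team != current[0].team:
            close()          # possession changed hands
        current.append(ev)
        if ev.kind == "shot":
            close()          # a shot ends the sequence and is part of it
    close()
    return sequences


events = [
    Event("A", "pass"), Event("A", "pass"), Event("A", "shot"),
    Event("B", "pass"), Event("A", "tackle"),
    Event("A", "pass"), Event("A", "stoppage"),
    Event("B", "pass"),
]
sequences = split_sequences(events)
print([len(s) for s in sequences])  # [3, 1, 1, 1]
```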

Using this data, we can identify which players are involved in the build-up to certain chances – beyond just the player who takes the shot or makes the final key pass – and compute team-level statistics covering things like width and speed. For professional traders, this further granularity allows a deeper understanding of the true state of play and provides more data to inform player and team ratings.

One useful team metric is direct speed, which measures how quickly a team moves the ball down the field. Leicester have historically been a team with a very high direct speed, and Shakespeare’s Leicester were no exception, moving the ball downfield at 1.71 m/s – the fourth highest in the Premier League.

However, since Shakespeare left the club and was replaced by Claude Puel, this has dropped significantly, with Leicester on average moving the ball downfield at 1.45 m/s, only the 14th highest in the Premier League. This supports the general perception that Puel plays a slower and more methodical brand of football than his predecessor.
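One plausible way to compute a figure like this is to divide the total distance gained upfield by the total time spent in possession across a team’s sequences. The sketch below takes that approach; the `SequenceSummary` fields and the exact formula are assumptions for illustration, not Opta’s published definition. The last lines only restate the drop between the two speeds quoted above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SequenceSummary:
    start_x: float     # metres from own goal line where the sequence began
    end_x: float       # metres from own goal line where it ended
    duration_s: float  # length of the sequence in seconds


def direct_speed(sequences: List[SequenceSummary]) -> float:
    """Net metres gained upfield per second of possession (assumed formula)."""
    total_time = sum(s.duration_s for s in sequences)
    if total_time == 0:
        return 0.0
    total_progress = sum(s.end_x - s.start_x for s in sequences)
    return total_progress / total_time


# Made-up sequences: 30 m gained in 10 s, then 10 m gained in 15 s.
example = [
    SequenceSummary(20.0, 50.0, 10.0),
    SequenceSummary(30.0, 40.0, 15.0),
]
example_speed = direct_speed(example)  # 40 m / 25 s = 1.6 m/s

# Relative change between the two speeds quoted in the article.
shakespeare_speed, puel_speed = 1.71, 1.45
relative_change_pct = round((puel_speed - shakespeare_speed) / shakespeare_speed * 100, 1)
print(example_speed, relative_change_pct)  # 1.6 -15.2
```

On those quoted numbers, the switch to Puel corresponds to a drop of roughly 15% in direct speed.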

Speed data could be used to refine models that take possession and shots into account. Sequences can be combined with speed to differentiate long ball play from quick counterattacks, and to identify partnerships within a team that produce the best chances, such as a good passer and a quick striker in a counterattacking team. Utilising this extra level of analysis can help traders make better decisions about player performance and ratings, and about expected game flow.
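As a toy illustration of combining sequences with speed, the sketch below labels a sequence by how fast it moved upfield and how many passes it contained. The labels, the 3.0 m/s threshold and the two-pass cut-off are arbitrary assumptions chosen for the example; a real model would fit such boundaries to data.

```python
def classify_sequence(
    progress_m: float,
    duration_s: float,
    passes: int,
    fast_threshold: float = 3.0,    # ASSUMED cut-off in m/s for a "fast" move
    long_ball_max_passes: int = 2,  # ASSUMED pass count that suggests a long ball
) -> str:
    """Label a sequence using its upfield speed and number of passes."""
    speed = progress_m / duration_s if duration_s > 0 else 0.0
    if speed < fast_threshold:
        return "slow build-up"
    if passes <= long_ball_max_passes:
        return "long ball"
    return "quick counterattack"


# Made-up sequences: (metres gained, seconds, passes).
labels = [
    classify_sequence(50.0, 5.0, 1),   # fast, few passes
    classify_sequence(50.0, 10.0, 5),  # fast, several passes
    classify_sequence(20.0, 20.0, 8),  # slow
]
print(labels)  # ['long ball', 'quick counterattack', 'slow build-up']
```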

Using Leicester this season as a case study we’ve been able both to predict their upswing since sacking Craig Shakespeare and highlight the tactical differences between Shakespeare and Puel using advanced metrics.

This is a very specific case about a team in one of the most popular and widely covered leagues in the world, but similar techniques could be applied to learn more about teams and players across all of the competitions Opta data covers.

This is one of the major advantages of using advanced metrics and statistical analysis in general. We can learn a lot about the underlying performances and tactical approaches of teams that we may not be able to watch on a regular basis. For professional traders, the availability of advanced metrics provides a more substantial basis on which to form or cross-reference player and team ratings, which are the foundation for refining model expectancies.


Key Authors: Thomas Worville & Sam Gregory (Data Scientist Team – Opta).

Part of Perform Content, a division of Perform Group, Opta is the world’s leading live, detailed sports data provider.
