- Golf predictions: Where to begin?
- The basis of a golf predictions model
- The impact of survivorship bias in golf

Golf is a notoriously hard sport to predict. Data Golf have spent years honing a golf predictions model that uses statistical modelling to help provide a more accurate reflection of player performance. How can you use statistics to make golf predictions? Read on to find out.

On its surface, predicting a sport like golf seems complicated: typically anywhere from 132 to 156 players compete on playing fields (i.e. golf courses) that can differ drastically from one tournament to the next. If the goal is to effectively predict outcomes of golf tournaments, where should one start?

The answer, in our opinion, lies in the domain of statistical modelling. A statistical model describes the process by which a set of data (e.g. scores in a golf tournament) is generated.

In this article, we describe a simple model of golf scores and analyse its main implications for interpreting golf data.

### Golf predictions: Where to begin?

What matters in golf tournaments is not a player’s raw score, but their score relative to the field. A 72 in a tournament where the field averaged 74 would be deemed a performance 4 strokes better than a 72 where the field averaged 70. This adjustment is problematic if the golfers comprising the two tournament fields are not of equal quality (this is a point we will ignore for the time being).
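As a minimal sketch of this adjustment, the snippet below subtracts each score from the field average. The three-player fields are made up for illustration, and the sign convention (positive means better than the field) is our assumption.

```python
from statistics import mean

def relative_to_field(scores):
    """Return field average minus each score, so positive = better than the field.

    The sign convention is an assumption made for this illustration.
    """
    field_average = mean(scores.values())
    return {player: field_average - score for player, score in scores.items()}

# Hypothetical three-player fields, for illustration only.
tough_day = {"A": 72, "B": 74, "C": 76}  # field averages 74
easy_day = {"A": 72, "B": 70, "C": 68}   # field averages 70

tough = relative_to_field(tough_day)["A"]  # 2 strokes better than the field
easy = relative_to_field(easy_day)["A"]    # 2 strokes worse than the field
print(tough, easy, tough - easy)           # the same raw 72 differs by 4 strokes
```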

With scores adjusted relative to the field, which we will refer to simply as “score” from here on, the next step is to describe how these scores are generated (i.e. to build a model).

First, we make an assumption that greatly simplifies the problem: suppose that the scores of different golfers at a given course are independent – that is, the performance of one golfer tells us nothing about the performance of another.

This reduces the problem of predicting outcomes of golf tournaments into many separate, simpler problems: namely, predicting the scores of each individual golfer.

Next, let’s define a golfer’s ability, at each point in time, to be their hypothetical average score from an infinitely repeated round of golf. For example, Tiger Woods’ ability at the Genesis Open is defined to be his average score at Riviera Country Club from an infinitely large sample of rounds. While it is never possible to know the value of this quantity, it is useful as a conceptual tool.

An individual golfer’s scores show considerable variation over time. This variation can be thought of as having two components: one due to changes in the golfer’s ability, and a catch-all residual component that includes everything else that affects scores. The latter could be labelled “random” variation, or, depending on your philosophical leanings, the variation in scores due to “unobservable factors”.

On a given day, a golfer’s score is defined to be the sum of their ability and the effect of these unobservable factors. For example, Tiger Woods’ score of 65 in the third round of the Genesis Open was six strokes better than the field average; this would be described in our model as the sum of Woods’ ability (say, two strokes better than the field average) and a positive four stroke random shock.

To complete the model we invoke a final simplifying assumption: suppose that golfers’ abilities are fixed over time. If a golfer’s ability is fixed, it follows that all of the variation in scores we observe over time is due to what we’ve labelled “random” variation.

### The basis of a golf predictions model

It may not be obvious, but we have just fully (albeit informally) specified a statistical model that describes how the outcomes of golf tournaments are generated. Here is the model in three statements:

- Every golfer has a fixed ability.
- Each golfer’s relative-to-the-field score on a given day is a combination of their ability and random variation (i.e. the mythical “unobserved factors”).
- Relative-to-field scores are independent across golfers.

All outcomes of a golf tournament (e.g. winning, making the cut) are a deterministic function of the relative-to-field scores of each golfer; therefore this model provides us with a description of any outcome of a golf tournament we desire.
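To make this concrete, here is a rough simulation sketch of the three statements. The normal distribution for the random component, the 2.75-stroke standard deviation, the four-round total and the three golfers' abilities are all assumptions chosen for illustration, not estimates from real data.

```python
import random

def simulate_win_probabilities(abilities, sd=2.75, rounds=4, n_sims=5000, seed=0):
    """Estimate win probabilities under the fixed-ability model.

    abilities: golfer -> fixed ability in strokes better than the field
               per round (positive = better). Hypothetical inputs.
    sd:        per-round standard deviation of the random component
               (normal shocks are an assumption of this sketch).
    """
    rng = random.Random(seed)
    wins = {name: 0 for name in abilities}
    for _ in range(n_sims):
        # Statements 1 and 2: score = fixed ability + random variation.
        # Statement 3: each golfer's shocks are drawn independently.
        totals = {
            name: sum(ability + rng.gauss(0.0, sd) for _ in range(rounds))
            for name, ability in abilities.items()
        }
        # The winner is a deterministic function of the simulated scores.
        winner = max(totals, key=totals.get)
        wins[winner] += 1
    return {name: count / n_sims for name, count in wins.items()}

# Hypothetical three-golfer field.
probs = simulate_win_probabilities({"strong": 2.0, "average": 0.0, "weak": -1.0})
for name, p in probs.items():
    print(f"{name}: {p:.1%}")
```

Any other outcome (making the cut, finishing in the top ten) could be counted in the same loop, since each is just a function of the simulated scores.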

This basic model setup serves as a useful foundation for thinking about golf scores; the rest of this article explores some of its practical implications.

### Sample size will always be important

The logical first step towards putting this model to use is an attempt at estimating golfers’ abilities. Suppose for an individual golfer we have an historical sample of scores. If this sample is large enough, its mean will be very close to the golfer’s ability. What sample size can be thought of as “large enough”?

Empirically, it is typical for a golfer’s scores to have a standard deviation somewhere around 2.75 strokes. Assuming scores are normally distributed, about 68% of them will fall within 2.75 strokes of the mean, and about 95% within 5.5 strokes. Basic statistical theory then says that the average of a 100-round sample has a standard error of 2.75/√100 = 0.275 strokes: we can be about 68% confident that it lies within 0.275 strokes of the golfer’s ability, and about 95% confident that it lies within 0.55 strokes.
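The precision of a sample average can be sketched with the usual standard error formula, sd/√n. This assumes independent rounds and a fixed ability, as in the model above; the sample sizes in the loop are arbitrary choices for illustration.

```python
import math

SCORE_SD = 2.75  # typical per-round standard deviation quoted in the text

def standard_error(n_rounds, sd=SCORE_SD):
    """Standard error of the mean of n_rounds independent scores."""
    return sd / math.sqrt(n_rounds)

for n in (10, 50, 100, 400):
    se = standard_error(n)
    print(f"{n:>4} rounds: 68% within ±{se:.3f}, 95% within ±{1.96 * se:.3f} strokes")
```

Note that quadrupling the sample only halves the standard error, which is why precision improves slowly as rounds accumulate.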

For context, consider the fact that the 50th and 100th worldwide ranked golfers’ season-long scoring averages are separated by less than half a stroke. This brings us to the main practical implication of this model: to draw useful inferences about golfer abilities, you need to rely on large samples of historical data.

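Under this model, differences in scores observed between golfers in any given week, month, or even year are mainly due to random variation. Confidently separating two golfers whose abilities are within 0.5 strokes of each other takes well over 100 rounds from each: with 100 rounds apiece and a standard deviation of 2.75 strokes, the standard error of the difference between their averages is about 0.39 strokes, so a 0.5-stroke gap amounts to only about 1.3 standard errors.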
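A back-of-the-envelope calculation gives a feel for the sample sizes involved. The sketch below assumes both golfers have the same 2.75-stroke standard deviation and the same number of independent rounds, and it uses a simple rule of thumb (the true gap should be about 1.96 standard errors of the difference in averages) rather than a formal power analysis.

```python
import math

def gap_in_standard_errors(gap, n_rounds, sd=2.75):
    """Size of an ability gap relative to the standard error of the
    difference between two golfers' averages (n_rounds each)."""
    se_difference = sd * math.sqrt(2 / n_rounds)
    return gap / se_difference

def rounds_needed(gap, sd=2.75, z=1.96):
    """Rounds per golfer for the gap to equal z standard errors of the difference."""
    return math.ceil(2 * (z * sd / gap) ** 2)

print(f"{gap_in_standard_errors(0.5, 100):.2f} standard errors with 100 rounds each")
print(f"{rounds_needed(0.5)} rounds each for a 0.5-stroke gap")
print(f"{rounds_needed(0.25)} rounds each for a 0.25-stroke gap")
```

Halving the gap roughly quadruples the number of rounds required.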

Crucially, this implication is only made possible by the assumptions of our model, and maybe this model has it wrong. Perhaps a golfer’s ability is not fixed over time and, to take a commonly used example, perhaps it isn’t fixed across different golf courses either. In that case, what we have rather lazily labelled “unobserved factors” may not in fact be unobservable!

### Fixed ability or course-specific ability?

With fixed abilities, differences in a golfer’s performance across courses are assumed to be the result of random variation; but in a model with course-specific abilities, this performance gap at least partially reflects differences in ability.

This is not merely a semantic difference. The degree to which you believe differences in golfer performance across courses are due to genuine differences in ability, as opposed to random fluctuations, greatly impacts how you would estimate their ability (and ultimately how you form your predictions).

The larger the role of random variation, the larger the sample size required to precisely estimate a golfer’s ability. If abilities are fixed, all the variation in a golfer’s scores is random, and consequently a very large sample of scores is required to average out that variance.

However, in a world in which course-specific abilities are responsible for much of the variation we observe, it is possible that only a few rounds of data at the relevant course would be required to obtain reasonable estimates of a player’s course-specific ability.

Which model is closer to reality? Without formally analysing the data, there is a prima facie case to be made that golf scores are generated by a process that is closer to the “fixed ability” model than a “frequently-varying ability” model.

Sticking with the course-specific ability example, note that there is only slightly less variation in a golfer’s scores within a tournament (i.e. from round-to-round played at the same course), than there is overall (i.e. across rounds played at all courses).

This is transparent evidence that factors apart from player-course fit still play the dominant role in determining golfers’ scores; and, as before, the implication is that large sample sizes will be required to uncover any course-specific ability.
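To see why a small gap between within-tournament and overall variation implies a small role for course fit, consider a rough variance decomposition. It assumes that overall variance is the sum of within-course variance and a course-fit variance; the two standard deviations used (2.70 within a tournament, 2.75 overall) are hypothetical numbers chosen only to illustrate "slightly less".

```python
import math

def implied_course_fit(within_sd, overall_sd):
    """Back out the course-fit standard deviation and its share of total
    variance, assuming overall variance = within variance + course-fit variance."""
    course_variance = overall_sd ** 2 - within_sd ** 2
    return math.sqrt(course_variance), course_variance / overall_sd ** 2

# Hypothetical standard deviations, for illustration only.
course_sd, share = implied_course_fit(within_sd=2.70, overall_sd=2.75)
print(f"implied course-fit SD: {course_sd:.2f} strokes")
print(f"share of score variance due to course fit: {share:.1%}")
```

Under these made-up inputs, course fit would account for only a few percent of the variance in scores.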

### The impact of survivorship bias in golf

In general, it is very difficult to explain (in the statistical sense) the enormous variation in golfers’ scores with observable factors (observable should be taken to mean “observable before the tournament starts”).

On the other hand, is the “fixed-ability” model consistent with some of the bewildering patterns we see in the data? For example, Tony Finau recently missed his fourth consecutive cut at the PGA Tour’s Phoenix Open. Is this definitive evidence that Finau has a lower ability at TPC Scottsdale than elsewhere? Possibly, but patterns like this would still show up if the “fixed ability” model were true.

The logic is similar to survivorship bias amongst betting tipsters. Even though there is maybe a 1 in 500 chance that a golfer of Finau’s calibre misses four consecutive cuts, if you consider all possible combinations of golf courses and players (of which there are thousands), we should expect 1 in 500 events to happen not infrequently over the course of several PGA Tour seasons. To focus on one or two examples while ignoring the rest will not paint an accurate portrait of course-player fit.
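The arithmetic behind this argument is simple enough to sketch. The 1-in-500 probability comes from the text; the count of 3,000 player-course combinations is a hypothetical stand-in for "thousands", and the calculation assumes the combinations are independent and equally likely to produce a streak.

```python
def expected_streaks(p, n_combinations):
    """Expected number of player-course pairs showing the streak by chance."""
    return p * n_combinations

def prob_at_least_one(p, n_combinations):
    """Probability that at least one pair shows the streak by chance."""
    return 1 - (1 - p) ** n_combinations

P_STREAK = 1 / 500     # chance of four straight missed cuts, as quoted in the text
N_COMBINATIONS = 3000  # hypothetical number of player-course pairs

print(f"expected streaks: {expected_streaks(P_STREAK, N_COMBINATIONS):.1f}")
print(f"chance of at least one: {prob_at_least_one(P_STREAK, N_COMBINATIONS):.1%}")
```

Under these assumptions, several such streaks would be expected even if no golfer had any course-specific weakness at all.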

### Developing this simple golf predictions model

The simple model laid out in this article is useful for understanding the different ways one could analyse golf scores. Seemingly very different philosophies, such as the fixed-ability and course-specific ability models, can both be analysed through a similar framework, allowing for the drawbacks and benefits to be made clear.

In this case, the key tradeoff to recognise is that the fewer dimensions along which you allow golfers’ abilities to vary, the more data you will have to estimate the relevant quantities. For example, to estimate a unique golfer ability at each course played on the PGA Tour, in most cases only 5-10 rounds will be available on which to base the estimate.

Conversely, to estimate a single fixed ability for each golfer, all of their data can be used to form the estimate. Neither philosophy is inherently better than the other, but the greater the role played by random variation in determining golf scores, the better the fixed-ability model will perform.
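This tradeoff can be sketched as a comparison of expected squared errors. The setup is a simplification of our own: a golfer's ability at a course is assumed to be their overall ability plus a course effect with mean zero and standard deviation `course_effect_sd`, the pooled average is treated as an estimate of overall ability only, and the sample sizes (8 rounds at the course, 100 rounds overall) are hypothetical.

```python
def mse_course_specific(sd, n_course_rounds):
    """Squared error from averaging only the rounds played at the course:
    no bias, but a lot of noise from the small sample."""
    return sd ** 2 / n_course_rounds

def mse_pooled(sd, n_total_rounds, course_effect_sd):
    """Squared error from using the golfer's overall average at that course:
    little noise, but it ignores the course effect entirely."""
    return course_effect_sd ** 2 + sd ** 2 / n_total_rounds

SD = 2.75  # per-round standard deviation of scores
course_only = mse_course_specific(SD, n_course_rounds=8)
for course_effect_sd in (0.25, 0.5, 1.0, 1.5):
    pooled = mse_pooled(SD, n_total_rounds=100, course_effect_sd=course_effect_sd)
    better = "pooled" if pooled < course_only else "course-specific"
    print(f"course effect SD {course_effect_sd:.2f}: "
          f"pooled {pooled:.2f} vs course-only {course_only:.2f} -> {better}")
```

With these inputs the pooled estimate wins unless course effects are close to a full stroke per round, which illustrates why the answer hinges on how large the role of random variation is.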

Our approach to understanding golf scores is more closely aligned with the fixed-ability model. While this model is clearly “wrong”, its power to rationalise (and ultimately predict) patterns in golf scores is impressive.

In future articles, we will provide evidence that supports this claim, but also explore the many ways this simple model can be improved. If you accept the fixed-ability model as a reasonable approximation to reality, its main practical lesson is that it is incredibly easy, to borrow a phrase, to be “fooled by randomness” when analysing golf data.