Last week we held an event in New York in which Mark Broadie from Columbia University talked about his book “Every Shot Counts”. The talk and the book detail his analysis of a very large and complex data set…specifically the “ShotLine” data collected for over a decade by the PGA. It details every shot taken by every pro at every PGA tournament. He was able to use it to challenge some long held assumptions about golf…such as “Do you drive for show and put for dough?”
On the surface the data set was not easy to work with. Sure it had numbers like how long the hole was, how far the shot went, how far it was from the hole and so on. It also had data like whether it ended up in the fairway, on the green, in the rough, in a trap or the dreaded out of bounds. Every pro has a different set of skills and there were a surprising range of abilities even in this set, but he added the same data on tens of thousands of amateur golfers of various skill levels. So how can anyone make sense of such a wide range of data and do it in a way that the amateur who scores 100 can be compared to the pro who frequently scores n the 60’-s?
You might be tempted to say that he would use a regression analysis, but he did not. You might assume that because the concept of “Money Ball” used Hierarchical Bayesian estimation he did as well. In fact, while HB has become more common place (it drives discrete choice conjoint, Max Diff and our own Bracket™), he didn’t use it here.
Instead, he used simple arithmetic. No HB, no calculus, no Greek letters, just simple addition, subtraction, multiplication and division. At the base level, he simply averaged similar scores. Specifically he determined how many strokes it took on average for players to go from where they were to the hole. These averages were further broken down to account for where the ball started (not just distance, but rough, sand, fairway, etc) and how good the golfer was.
These simple averages allow him to answer any number of “what if” questions. For example, he can see on average how many strokes are saved by going an extra 50 yards off the tee (which turns out to be more than for being better at putting). He can also show that in fact neither driving nor putting is as important as the approach shot (the last full swing before putting the ball on the green). The ability to put the ball close to the hole on this shot is the biggest factor in scoring low....