Last week we held an event in New York at which Mark Broadie of Columbia University talked about his book “Every Shot Counts”. The talk and the book detail his analysis of a very large and complex data set: the “ShotLink” data that the PGA Tour has collected for over a decade. It records every shot taken by every pro at every PGA Tour tournament. He used it to challenge some long-held assumptions about golf, such as “Do you drive for show and putt for dough?”
On the surface, the data set was not easy to work with. Sure, it had numbers such as how long the hole was, how far the shot went, how far the ball ended up from the hole and so on. It also had categorical data, such as whether the ball ended up in the fairway, on the green, in the rough, in a trap or the dreaded out of bounds. Every pro has a different set of skills, and there was a surprising range of abilities even in this set, but he also added the same data on tens of thousands of amateur golfers of various skill levels. So how can anyone make sense of such a wide range of data, and do it in a way that lets the amateur who scores 100 be compared to the pro who frequently scores in the 60s?
You might be tempted to say that he used regression analysis, but he did not. You might assume he used Hierarchical Bayesian estimation, as it has become more commonplace (it drives discrete choice conjoint, Max Diff and our own Bracket™), but he didn’t use it here either.
Instead, he used simple arithmetic. No HB, no calculus, no Greek letters, just simple addition, subtraction, multiplication and division. At the base level, he simply averaged similar scores. Specifically, he determined how many strokes it took players, on average, to get from where they were into the hole. These averages were further broken down by where the ball started (not just distance, but rough, sand, fairway, etc.) and by how good the golfer was.
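To make the idea concrete, here is a minimal sketch of that kind of averaging. It is not Broadie's actual procedure or data: the shot records, the 25-yard distance buckets and the function names are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical shot records (invented numbers, not real golf data):
# (skill group, lie, yards to the hole, strokes it took to hole out from there)
SHOTS = [
    ("pro", "fairway", 152, 3),
    ("pro", "fairway", 160, 2),
    ("pro", "rough", 155, 3),
    ("pro", "rough", 165, 4),
    ("amateur", "fairway", 151, 4),
    ("amateur", "fairway", 170, 3),
    ("amateur", "rough", 153, 5),
    ("amateur", "rough", 162, 4),
]


def distance_bucket(yards, width=25):
    """Round a distance down to the start of its bucket, e.g. 152 -> 150."""
    return (yards // width) * width


def average_strokes(shots, width=25):
    """Average strokes to hole out for each (skill, lie, distance bucket)."""
    totals = defaultdict(lambda: [0, 0])  # key -> [sum of strokes, shot count]
    for skill, lie, yards, strokes in shots:
        key = (skill, lie, distance_bucket(yards, width))
        totals[key][0] += strokes
        totals[key][1] += 1
    return {key: total / count for key, (total, count) in totals.items()}


if __name__ == "__main__":
    for key, avg in sorted(average_strokes(SHOTS).items()):
        print(key, avg)
```

Nothing here goes beyond adding and dividing. The only real decision is how to group "similar" shots, which in this sketch means the same skill group, the same lie and the same distance bucket.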
These simple averages allow him to answer any number of “what if” questions. For example, he can see how many strokes are saved, on average, by hitting the ball an extra 50 yards off the tee (which turns out to be more than are saved by being better at putting). He can also show that neither driving nor putting is as important as the approach shot (the last full swing, the one meant to put the ball on the green). The ability to put the ball close to the hole on this shot is the biggest factor in scoring low.
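A “what if” question of this kind comes down to subtracting one average from another. The sketch below assumes a small lookup table of average strokes to hole out from a given position. The table values and the `strokes_saved` helper are hypothetical and are not taken from the book.

```python
# Hypothetical average strokes to hole out, keyed by (lie, yards to the hole).
# These values are invented for illustration only.
AVG_STROKES = {
    ("fairway", 200): 3.25,
    ("fairway", 150): 3.0,
    ("rough", 150): 3.5,
}


def strokes_saved(table, baseline, alternative):
    """Average strokes saved by starting from `alternative` instead of `baseline`.

    A positive result means the alternative position is better on average.
    """
    return table[baseline] - table[alternative]


if __name__ == "__main__":
    # What if the drive went 50 yards farther and stayed in the fairway?
    print(strokes_saved(AVG_STROKES, ("fairway", 200), ("fairway", 150)))  # 0.25
    # What if those extra 50 yards put the ball in the rough instead?
    print(strokes_saved(AVG_STROKES, ("fairway", 200), ("rough", 150)))  # -0.25
```

With these made-up numbers, the longer drive is worth a quarter of a stroke if it finds the fairway and costs a quarter of a stroke if it finds the rough. The real answer depends entirely on the real averages.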
So far his book has not helped my golf game (of course, with the long winter I haven’t been able to apply it yet), but it has given me newfound respect for the simple ways in which data can be analyzed in new product research. Now don’t get me wrong: regression, Max Diff, discrete choice conjoint and other advanced methods are powerful tools that we’d be foolish not to use, but sometimes they are not appropriate. What do we do then?
Well, consider classic cross tabs. They are, in essence, simple averages like the golf scores he used. Used thoughtfully, they can drive results that are at least as valuable as a golfer’s approach shot.
Rich brings a passion for quantitative data and the use of choice to understand consumer behavior to his blog entries. His unique perspective has allowed him to muse on subjects as far afield as dinosaurs and advanced technology, with insight into what each can teach us about doing better research.