A machine learning algorithm has crushed the Kentucky Derby
For three years in a row, a machine learning algorithm trained on past results data provided by BrisNet has crushed the Kentucky Derby result. You'll be lucky to hear about it outside of this blog, though, as it's not something the mainstream horse media necessarily want to hear: a machine doing better than the human pundit (where have we seen this before?).
Eureqa is a machine learning program that uses evolutionary algorithms (that is, algorithms that learn off one another) to generate predictive models. The program was developed by Michael Schmidt and his team at Nutonian and has been used in a number of applications. We actually used Eureqa here at Performance Genetics to do our initial model building.
Eureqa, or more specifically Nutonian, approached Ed de Rosa, the marketing manager at leading horse data supplier BrisNet, back in 2014 with the idea that they would try to predict the most likely winner of the 2014 Kentucky Derby. That year, after analyzing the data provided by BrisNet (which includes some proprietary figures developed by BrisNet for their handicappers), they arrived at the following five horses:
Vicars in Trouble
You can read their predictions here. As we now know, California Chrome won the Kentucky Derby, but it is worth noting that they also correctly picked Wicked Strong, Danza and Samraat to fill the first five over the line, missing only longshot Commanding Curve.
Interestingly, along with a detailed discussion on how they went about their work, they supplied the final algorithm for their predictive score:
Horse Score = 5.614695362 + 2.634162332*(Racing Style_Early) + 0.5869793526*(Trainer Meet %)*Speed - 0.06186576034*Speed - 57.63578215*(Trainer Meet %) - 1.000054353*exp(1.027235778*(Starting Price Implied Probability Standardized))
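To make the formula a little more concrete, here is a minimal Python sketch of it. The coefficients are copied from the equation above; everything else is our assumption rather than Nutonian's published method. We treat "Racing Style_Early" as a 0/1 flag and "Trainer Meet %" as a fraction (0.20 rather than 20), and the helper that turns odds into a standardized implied probability (normalize across the field, then take a z-score) is just one hypothetical reading of "standardized". The text also doesn't say whether a higher or a lower score is better, so the sketch only computes scores and does not rank them.

```python
import math
from statistics import mean, pstdev


def standardized_implied_probabilities(odds_to_one):
    """Hypothetical helper: odds-to-1 prices -> z-scored implied probabilities.

    ASSUMPTION: "standardized" means a z-score across the field. The source
    does not say how the figure was actually standardized.
    """
    # Odds of X-to-1 imply a raw win probability of 1 / (X + 1).
    raw = [1.0 / (o + 1.0) for o in odds_to_one]
    # Rescale so the field sums to 1.
    total = sum(raw)
    probs = [p / total for p in raw]
    mu, sigma = mean(probs), pstdev(probs)
    if sigma == 0:
        # Every runner at the same price: no spread to standardize.
        return [0.0 for _ in probs]
    return [(p - mu) / sigma for p in probs]


def horse_score(early_style, trainer_meet_pct, speed, odds_prob_std):
    """Evaluate the published 2014 Horse Score equation.

    ASSUMPTIONS: early_style is 1 for an early racing style, else 0;
    trainer_meet_pct is a fraction; speed is a BrisNet-style speed figure.
    """
    return (
        5.614695362
        + 2.634162332 * early_style
        + 0.5869793526 * trainer_meet_pct * speed
        - 0.06186576034 * speed
        - 57.63578215 * trainer_meet_pct
        - 1.000054353 * math.exp(1.027235778 * odds_prob_std)
    )


if __name__ == "__main__":
    # Made-up runners for illustration only, not real Derby data.
    field = [
        ("Horse A", 1, 0.22, 101.0, 2.5),   # name, early flag, trainer %, speed, odds-to-1
        ("Horse B", 0, 0.15, 98.0, 8.0),
        ("Horse C", 0, 0.10, 95.0, 20.0),
    ]
    z_scores = standardized_implied_probabilities([r[4] for r in field])
    for (name, early, trainer, speed, _), z in zip(field, z_scores):
        print(name, round(horse_score(early, trainer, speed, z), 3))
```

One thing the sketch makes easy to see is the interaction between trainer percentage and speed: the two trainer terms cancel at a speed figure of roughly 98, so under our reading the trainer's meet record only helps a horse whose speed figure is above that mark.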
The following year they were back at it again. This time they didn't reproduce their final algorithm, which would have to be somewhat similar to the one from the year prior as there was only another year's worth of data to add, but they again did very well, with their predicted first five in the Kentucky Derby being:
Again, with American Pharoah, they had the winner, but they also had the third placegetter Dortmund and the fifth placegetter Danzig Moon in their top five selections.
So we come to this year's predictions. In their blog post they mention a couple of new factors that their algorithm has learned to weight, but the model rests on just five variables:
Standardized live odds probability
Speed over the past two races
You will see from the algorithm they posted two years ago that "Racing Style", "Speed" and "Odds Probability" were already in use, while Post Position and Track Conditions have since become of interest. Racing Style and Speed are figures developed just for BrisNet customers, while the odds probability is pretty much standard in any predictive algorithm, as the betting market is somewhat rational in its assessment of each runner. Their predicted top 5 for the 2016 Kentucky Derby:
As we now know, the first four across the line were Nyquist, Exaggerator, Gun Runner and Mohaymen, so their five selections included the Superfecta (first four across the line), which paid a healthy $542 for a $1 bet. While the winners have all been favorites, they have picked the winner of the Kentucky Derby as their first selection in each year and had many of the placegetters in their first five. It's an interesting use of machine learning, and it's certainly better than a lot of the pundits out there, that is for sure!