Today I am pleased to announce the launch of this project, the culmination of years of work gathering data at major two-year-old in training sales across North America and Europe and using racetrack outcomes to develop a predictive algorithm for those sales. In all, I have data on just over 6,000 horses that subsequently had 3 or more starts, and so are valid records on which to build a prediction algorithm. It uses the latest in machine learning, specifically XGBoost, to select elite horses from two-year-old in training sales. More than half of the winning solutions in machine learning challenges hosted at d

Assortative Matings and Sire Production Class

I've been working on some new data models for the yearling sales season. The goal is this: if the average sale has 3% Elite racehorses among all horses offered, and the average bloodstock agent gets about 10% Elite racehorses among purchases, then a data algorithm that starts you off at each sale with a shortlist running a little north of that latter figure puts you in a position for more consistent success. In analyzing the dataset, which is substantial, it is interesting to see which features rank as more important, or indeed describe a phenomenon in a way you wouldn't expect. One "variable" is the attempt to explain the production class of a sire. That's obvious, right? We use some

©2017 by Performance Genetics LLC