The Cobalt Threshold
Racing in Australia is under a bit of a microscope with what can only be described as a plethora of cobalt 'positives' coming to light in recent months (with apparently more to come).
Cobalt Chloride has been used in the racing industry for some time. Indeed, one of the more favored 'treatments' in the North American Harness racing community has been the double of Thyroxine (sold off-label as Thyro-L, Levoxine Powder, or Levo-Powder) and Cobalt Chloride. Thyroxine has been around in racing since the 1970s and is used to jack up the metabolism of the horse to make it leaner and fitter, while Cobalt Chloride is used to boost red blood cell numbers. The result is fit, lean horses with an abundance of red blood cells.
While there is no evidence that this stack has made its way to Australia, Cobalt Chloride certainly has. Starting in the Harness industry, it made the jump to the Thoroughbred industry a couple of years ago. Harness Racing regulators started the ball rolling by setting a Cobalt threshold, and more recently in the Thoroughbred industry Racing Victoria, followed by the Australian Racing Board, introduced a Cobalt threshold of 200 ug/l. The question being asked by many in the industry is: is that number a fair one?
To answer the question we need data, and fortunately the regulator Harness Racing New South Wales publishes the Cobalt levels of all the horses it has tested since April 2013. There is no difference in the metabolic processes of a Standardbred and a Thoroughbred, so the Harness data is a reasonable guide for both. The data contains 1474 observations ranging from <5 ug/l (515 observations are less than 5 ug/l) through to 2,600 ug/l. Before we can do any analysis we need to replace the "<5" observations with reasonable numbers. In this case I distributed them normally over the range 2 to 4.95, so that the bulk of the "<5" measurements fell around 3.5, which is a fair spread of values.
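For anyone who wants to try the "<5" replacement step, here is a minimal sketch of one way to do it. The function name, the spread of 0.5 and the fixed seed are my own assumptions for illustration; the only figures taken from the text above are the range of 2 to 4.95, the centre of about 3.5 and the count of 515.

```python
import random

def impute_below_detection(count, low=2.0, high=4.95, centre=3.5, spread=0.5, seed=42):
    """Draw `count` stand-in values for "<5" readings from a normal
    distribution centred on `centre`, keeping only draws inside [low, high].
    The spread and seed are assumed values, not taken from the article."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    values = []
    while len(values) < count:
        draw = rng.gauss(centre, spread)
        if low <= draw <= high:  # reject draws outside the allowed range
            values.append(draw)
    return values

imputed = impute_below_detection(515)  # 515 readings were reported as "<5"
print(min(imputed), max(imputed), sum(imputed) / len(imputed))
```

Rejecting draws that land outside the range keeps every stand-in value below 5 while still clustering most of them near 3.5.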
The second transformation is to turn the data into a normally distributed data set. Prior to this transformation the data is far from normally distributed, with most of the observations falling at or below 6 ug/l. To fix this we use what is known as a Box-Cox transformation, raising each of the values to a power (in this case a negative power) in order to make the distribution closer to normal so that the usual assumptions hold. This also allows us to calculate a meaningful average and standard deviation for the data set.
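As a rough illustration of what a Box-Cox transformation does, the sketch below applies the standard formula, (x^lambda - 1) / lambda, and its inverse. The power of -0.5 and the sample readings are placeholders I have assumed for the example; the actual power fitted to the Harness Racing New South Wales data is not stated here.

```python
import math

ASSUMED_LAMBDA = -0.5  # hypothetical negative power, for illustration only

def box_cox(x, lam):
    """Box-Cox transform of a positive value x."""
    if x <= 0:
        raise ValueError("Box-Cox needs positive values")
    if lam == 0:
        return math.log(x)  # the limiting case of the formula
    return (x ** lam - 1) / lam

def inverse_box_cox(y, lam):
    """Convert a transformed value back to the original scale."""
    if lam == 0:
        return math.exp(y)
    return (lam * y + 1) ** (1 / lam)

# Illustrative readings in ug/l, not the real dataset.
sample = [3.5, 6.0, 7.7, 21.0, 150.0]
transformed = [box_cox(x, ASSUMED_LAMBDA) for x in sample]
print(transformed)
```

A negative power squeezes the large readings together far more than the small ones, which is what pulls a long right-hand tail in towards the rest of the data. The average and standard deviation are calculated on the transformed values and then converted back to ug/l with the inverse.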
Once we know what the average cobalt reading and the standard deviation are, we can apply the three sigma rule. Stick with me... In statistics, the 68–95–99.7 rule is a shorthand used to remember the percentage of values that lie within a band around the average in a normal distribution with a width of one, two and three standard deviations, respectively. That is, 68% of all observations will fall within one standard deviation either side of the average, 95% will fall within two standard deviations and 99.7% will fall within three standard deviations.
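The percentages in the rule can be checked directly from the normal distribution. This short sketch uses Python's standard library; the helper name `coverage` is my own.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1

def coverage(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} sigma: {coverage(k):.2%}")
```

The three results round to the familiar 68%, 95% and 99.7%.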
So, in the case of the Harness racing dataset the average Cobalt reading is 7.7 ug/l. One standard deviation above the average is 21 ug/l, two standard deviations above is 55 ug/l and three standard deviations above is 150 ug/l.
What does this mean?
99.7% (Sigma 3) of all the observations can be found between 0 and 150 ug/l. If a horse returns a Cobalt level greater than 150 ug/l, there is less than half of one percent chance of that occurring naturally.
Now it must be said that we are starting with a data set that included measurements of 'abnormal' cobalt levels. If we had a data set without these 'outliers', the Sigma 3 value would be much less than 150 ug/l, probably closer to the 100 ug/l threshold that the Hong Kong Jockey Club has set.
But there is something more to consider. Even with our data set that included horses whose Cobalt levels were manipulated, Sigma 6 (6 standard deviations above the average) works out to a level of 420 ug/l. Mathematically that is about 2 parts per billion or, put more simply, if an event occurred every day, a reading of 420 would equate to the event occurring naturally once every 1.4 million years.
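The "2 parts per billion" and "1.4 million years" figures follow from the normal distribution itself, as this small sketch shows. It assumes the two-sided version of the six sigma probability and a year of 365.25 days; the variable names are my own.

```python
import math

SIGMAS = 6
DAYS_PER_YEAR = 365.25  # assumed average year length

# Chance of a value landing more than six standard deviations from the
# average (either side) in a normal distribution.
p_beyond = math.erfc(SIGMAS / math.sqrt(2))

# If one observation is made per day, this is the average wait for such a value.
days_between = 1 / p_beyond
years_between = days_between / DAYS_PER_YEAR

print(f"probability: {p_beyond:.3e}")
print(f"roughly one day in every {years_between:,.0f} years")
```

The probability comes out at just under 2 in a billion, and the waiting time at a little under 1.4 million years.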
Based on the dataset, 200 ug/l looks very generous.