Considering it is only 16 months since EQB, Inc. trumpeted that “(T)his year EQB adds DNA genetic profiling to its bag of tricks” we were rather surprised to receive an e-mail blast from them this week with the headline “DNA testing – re: state of the science (underwhelming).”
Further reading revealed that the statement, which appears to blithely dismiss a whole field of scientific endeavor, is based on an article titled “ ’Moonshot’ Medicine Will Let Us Down” written by Michael J. Joyner of the Mayo Clinic for the New York Times Opinion Pages. The primary tenet of that article is that “precision medicine”, the interpretation of an individual’s genetic code with the goal of providing more effective therapeutic and preventive strategies, for which President Obama’s new budget is likely to include millions, is a less effective step toward improved health than modifying “how much we exercise, eat, drink and smoke..”
EQB then take what they extract from the article, that DNA testing for disease has “yielded little as a predictor..”, and go on to posit – without any evidence – that DNA testing for disease is looking for “simpler predictions than the complex phenomena of ‘performance’ or ‘endurance’”, despite highlighting a quote from the article that “..for most common diseases, hundreds of genetic risk variants with small effects have been identified.” They then jump to the implied conclusion that DNA testing for aptitude and ability in the thoroughbred will have little predictive value.
We’d suggest, however, that this conclusion is an erroneous one, arrived at by compounding logical fallacies. The first of these is that the New York Times article presents a false “either/or” argument. The suggestion by the author (an anesthesiologist and physiologist, not a geneticist) that we have greater control over our lifestyle than our genome can be allowed without accepting that research into the genetic basis of disease is not worthwhile. The choice need not be one or the other. For example, the article expresses the fear that those for whom genetic testing indicates a low risk of a specific disease may feel “bullet-proof” and take little care over diet and exercise, while those who are at a higher risk may adopt a fatalistic attitude and take equally little care. That fear does not indicate that genomic research has failed to find a gene or multiple genes with variants that predispose towards that disease (of course, the author fails to mention that those tested may choose not to embrace either of these extreme reactions, and instead may act in a way that is informed by the results of their tests).
There are, in fact, quite literally thousands of genes that are known to be associated with specific disorders or diseases, and even the article cited by EQB admits that for cystic fibrosis “exciting new drugs have been developed using genetic information...” Of course, that is not even considering inherited diseases such as sickle cell anemia, Tay Sachs disease and haemophilia, which can be screened for and their transmission avoided (similarly, in the equine world, Quarter Horses are screened for Hyperkalemic Periodic Paralysis, HYPP, and those with a double copy of the responsible variant are not accepted for registration). Nor does the article consider genetic screening for a disease such as HCM (hypertrophic cardiomyopathy), which causes the walls of the left ventricle of the heart to thicken, such that it does not relax completely between beats and can impede blood flow into the heart itself. This is one instance where a lifestyle change that involves increased exercise can be fatal. Indeed, in his excellent book “The Sports Gene”, David Epstein relates the story of an individual who embarked on a vigorous exercise regimen because of the death of his brother due to “idiopathic hypertrophic subaortic stenosis” – a heart that is enlarged for unknown reasons – only to collapse and die when working out on a treadmill. Subsequently both siblings were found to have a variant that causes HCM, genetic information that might have saved their lives. Overall, however, the New York Times article amounts to little more than a suggestion from a physiologist that age, environmental factors, and lifestyle – predominantly diet and exercise – might be a better predictor of the likelihood of developing some diseases than information gained from genetic screening.
It’s quite a step from this relatively modest proposition to implying, as EQB seem to do, that DNA testing of thoroughbreds is unlikely to be of value as a predictor of athletic potential. Oddly, considering their core activity, as identifiers of athletic talent, they cite in their musings the areas of potential prediction difficulty as “‘performance’ or ‘endurance’ ” which suggests a certain confusion of thought. Performance is actual achievement, but a horse with great athletic potential may still produce poor performances if he lacks desire, is poorly prepared to compete, or is asked to compete over inappropriate surfaces or distances. Thus, we are seeking not to predict performance, but athletic potential. Endurance, on the other hand, is a question of athletic aptitude, not of athletic class. For example, Usain Bolt has accomplished many magnificent performances over sprint distances, but if asked to run a 5k x-country race against even a talented college student, would be shown to have very little endurance. He has a stratospheric level of performance (an outworking of his athletic potential) at sprint distances, but almost certainly a poor aptitude for endurance.
Endurance (or sprint) potential is governed to a large degree by muscle-fiber composition. So we have instances such as the 1972 Olympic Marathon victor, Frank Shorter (80% slow twitch muscle fibers in his legs), or another marathon great of that era, Alberto Salazar (93% slow twitch leg muscle fiber), who had no hope of being competitive at a world-class level at distances under 10k no matter how they trained. Similarly, a typical world-class sprinter will show 80% or more type II fast-twitch muscle fibers, which does make for speed, but guarantees 400m as the upper limit of world-class performance. As far as predicting muscle fiber composition in the thoroughbred race horse – equally vital to its optimum distance – is concerned, it is well established that much can be gleaned from the Myostatin gene, among other genes (such as PPARGC1a). There, a SINE insertion, rather than the intron 1 C:T SNP (the variant promoted by Equinome), is driving observed muscle fiber type characteristics and is the variant targeted by selection for short-distance racing in thoroughbreds.
Of course, genomic prediction of athletic class is more complex, but not as complex as EQB would have us believe. In fact, because of what is known as linkage disequilibrium – the tendency for genes to be inherited in blocks – one genetic variant can frequently stand as proxy for many more. As a result (and as our tests on over 3,500 horses have indicated), as few as 25 genetic variants within exercise-relevant genes are all that is needed to establish a good association between genetic markers alone and athletic potential with a simple polygenic profile.
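To make the idea of a "simple polygenic profile" concrete, here is a minimal sketch in Python. It assumes the profile is a weighted count of favorable alleles, which is one common way to build such a score and not necessarily the model we use. The marker names and weights are hypothetical placeholders chosen for illustration; they are not values from any real test.

```python
from typing import Dict

# Hypothetical per-variant weights: the assumed contribution of each copy of
# the favorable allele. Illustrative placeholders only, not real effect sizes.
WEIGHTS: Dict[str, float] = {
    "MSTN_variant": 0.5,
    "PPARGC1A_variant": 0.25,
    "PDK4_variant": 0.25,
    "CKM_variant": 0.125,
}


def polygenic_score(genotype: Dict[str, int],
                    weights: Dict[str, float] = WEIGHTS) -> float:
    """Return the weighted sum of favorable-allele counts.

    `genotype` maps a marker name to the number of favorable copies (0, 1 or 2).
    Markers missing from `genotype` are counted as 0 copies (an assumption).
    """
    total = 0.0
    for marker, weight in weights.items():
        copies = genotype.get(marker, 0)
        if copies not in (0, 1, 2):
            raise ValueError(f"{marker}: expected 0, 1 or 2 copies, got {copies}")
        total += weight * copies
    return total


def normalized_score(genotype: Dict[str, int],
                     weights: Dict[str, float] = WEIGHTS) -> float:
    """Scale the score to the 0..1 range by dividing by the best possible score."""
    best = 2 * sum(weights.values())
    return polygenic_score(genotype, weights) / best
```

A real profile would carry around 25 markers rather than four, and the weights would have to be estimated from performance data rather than assigned by hand.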
EQB is at a certain level also trying to make a virtue of a vice. They have tried using genetic data but because they have made the mistake of using it as a standalone test, rather than integrating the DNA variants with other data that they generate (and we will be the first to say that what they do in terms of cardio and biomechanics is world class) to come up with a better prediction model, they have decided to reject the science in its entirety.
If you have ever met EQB's Jeff Seder, this is a surprising decision, and it is especially so when you consider the limitations of the technology used by EQB to predict athletic potential. Measurement of cardiovascular systems is itself a very good negative predictor; that is, horses with poor cardiovascular systems rarely (but not never) compete at the highest level. However, EQB by their own admission state that there are more horses with good cardiovascular parameters than there are good horses, and this is a limitation of athletic prediction relying on cardiovascular systems alone. The cardiovascular system (and a measurement of the spleen and lung, as EQB and others do) really only describes the ability and efficiency of the cardiovascular system to supply blood to the muscle. It doesn't describe what the muscle does, and how effectively it does it, when the oxygenated blood gets there. Variations within exercise-relevant genes such as MSTN, PPARGC1a, PDK4 and CKM that are associated with elite performance will by proxy describe how, and how effectively, the muscle deals with the oxygenated blood, and do go a long way in describing why some horses with really good cardiovascular systems are just plain slow.
The flip-side to EQB relying on phenotypic measures alone to describe athletic potential is relying solely on genetic variants to do the same. We have been down that road and it is one where you can make significant errors. In using DNA markers, as important as it is to have the genetic variants in exercise-relevant genes, it is equally important to have the environment that allows these genes to express. That is, the environment that a horse is brought up in is a big influence on the way the genes will work. Unfortunately, it isn't possible to easily test for environmental expression of genes at yearling sales, which is why testing for DNA markers alone via hair or blood samples has significant error associated with it.
Through trial and error we find ourselves in between EQB and Equinome, using genetic markers, cardiovascular and biomechanical measurements in a single model to predict athletic potential. We use the latest in machine learning techniques (Microsoft's Azure ML, which we have found to be the best both for managing a database in the cloud and for machine learning development, including scripts in R) and consistently and constantly iterate the prediction algorithm as our database matures. As a result, we have found that we are in a better place to explain the horses that don't make immediate sense to those relying on either cardiovascular measurements or genetic markers alone. The machine learning algorithm, which is a random forest regression, can more accurately weigh each variable involved and describe the horse better than any standalone measure.
At the end of the day, when all the science is done, this is not a scientific problem but a data problem. Those with the data and the model that explains the greatest variance of athletic potential and subsequent performance will find the greatest advantage and, in turn, the winners.
Interestingly enough, we are finding that when we fix for mitochondrial and myostatin haplotype, that is, take yearlings from the same female family and with the same myostatin variations, the cardiovascular and biomechanical data separate elite and non-elite performers more easily and prove more predictive. It's how we identified the high class filly Fontiton for Matchem Racing and how we are hoping to repeat that performance this year with their selection at the yearling sales in Australasia. That itself should give pause to EQB about being so adamant that DNA hasn't a role to play in racehorse selection.
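For readers wondering what "fixing for" haplotype looks like in practice, the following Python sketch shows one simple way to do it. It groups horses by mitochondrial haplotype and myostatin genotype, then compares a cardiovascular measure between elite and non-elite horses within each group. The record layout, the haplotype labels and the single "cardio score" are assumptions made for illustration, and a difference in means stands in for whatever model is actually fitted.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, List, Tuple

# Hypothetical record layout:
# (mitochondrial haplotype, myostatin genotype, cardio score, is_elite)
Horse = Tuple[str, str, float, bool]
Stratum = Tuple[str, str]


def elite_gap_by_stratum(horses: List[Horse]) -> Dict[Stratum, float]:
    """Within each (mito haplotype, myostatin genotype) group, return the mean
    cardio score of elite horses minus the mean of non-elite horses.

    Groups that lack either an elite or a non-elite horse are skipped,
    because no within-group comparison is possible for them.
    """
    groups: Dict[Stratum, Dict[bool, List[float]]] = defaultdict(
        lambda: {True: [], False: []}
    )
    for mito, mstn, cardio, is_elite in horses:
        groups[(mito, mstn)][is_elite].append(cardio)

    gaps: Dict[Stratum, float] = {}
    for stratum, by_class in groups.items():
        if by_class[True] and by_class[False]:
            gaps[stratum] = mean(by_class[True]) - mean(by_class[False])
    return gaps
```

Comparing like with like in this way removes the variation due to female family and myostatin type, so any remaining gap within a group reflects the phenotypic measurements themselves.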