Genetics and Pedigree Analysis - just names on a page


Without in any way belittling or demeaning those who struggle daily with substance abuse issues, I feel that I have to start this post with an Alcoholics Anonymous-style confession - I was once a Pedigree Analyst.

As a preface, this is a bit more of a personal post, so if you were hoping for (or expecting) something of a more technical nature, feel free to move on. For those that are happy to indulge, I was one of 'those guys' who believed that the pedigree held the answer, and that by arranging the names on the page in a certain way I could help breed or select a superior racehorse. That is not to say that I was totally naive about the racing class of the immediate parents and grandparents of a foal or yearling, or the variability of genetics. It is more to say that I down-weighted them compared to how I looked at or arranged more distant ancestors.

I had a lot of success, and a lot of downright failures. Like any analyst worth their salt, I continually reviewed the process that I undertook in advising on matings and looked at how successful I had been, where I had gone wrong and where I could improve. Admittedly pedigree analysis is just a small part of the larger effort by farm managers and owners to breed or select a superior racehorse, and the environment in which horses are raised has a real influence (as does the trainer), but it is an important starting point.

This process of review has been ongoing for years now, but it has been accelerated and enhanced by tools such as the Jockey Club's Pedigree Analysis Program and other information that is now readily available where it had not been in the past. A short summation of my review (and this is ongoing) would be this - my hand in success had little to do with how the names on the page were arranged, and much more to do with the quality of the material that I was working with.

The first thing that became apparent when I went back through all the matings that I had a hand in, right back to the early days at Arrowfield Stud, was that for all the patterns that one might arrange on the page, there is an inescapable fact - the racing quality of the immediate parents and grandparents counts for far more than anyone gives it credit for, or at least far more than I gave it credit for.

Longshot bias, where we treat the 1000-1 shot as if it were a 50-1 shot and undervalue the 2-1 favorite, exists in pedigree analysis, and certainly existed in the way that I looked at matings. The numbers back this up. We all love to hear stories about a successful mating, like the New Zealand freak Veandercross, who was by the lowly Crossways out of Lavender, an unraced Super Gray mare, but the reality is that these types of matings, and their results, are literally the 100,000-1 shot coming in. For every Veandercross, John Henry, Skip Away and Ramonti, there are thousands upon thousands out there with similar pedigrees in terms of structure (ancestors arranged on the page) and racing class (of the immediate parents) that amounted to absolutely nothing.

The Veandercross types are the extreme outliers of thoroughbred pedigrees, almost black swan events. With hindsight a Pedigree Analyst would like to explain the pedigree pattern of Veandercross as a perfectly understandable phenomenon, but the reality is that it isn't. The simplest and most accurate explanation is that he is a Mendelian genetic outlier whose repeatability is nearing zero. Try to find a full brother or sister to these types of horses that amounted to anything on the racetrack.

This switch, from looking at distant ancestors to looking at the quality of the more immediate parents, started during this review process with a statistical analysis of pedigrees. Try as I might, I couldn't find any evidence to suggest that inbreeding to superior ancestors (the Rasmussen Factor), or any other pattern of breeding that occurred outside of the third generation of a foal, made any difference at all. The good racemare Lalun was about the only one I could find that got close to being associative, and the reality was that if it wasn't involving Sadler's Wells and/or Darshaan (i.e. good material) it made no difference. I couldn't find any evidence that deep line breeding to Maid of Masham had any impact at all. There were just as many slow horses as good ones, in proportion to a normal population. Dosage, which admittedly has been warped beyond comprehension from its original purpose, is just noise, as are Lowe numbers, which in some cases have no relationship to mitochondrial haplotypes.

More recently this has been accelerated by much of the genetic work that we have been doing at Performance Genetics. Increasingly the evidence suggests that while common orthodoxy holds that the ancestors of a racehorse beyond three generations should matter, in reality they don't. They are just names on a page.

As a rather simplistic model, but one that illustrates the point, take a variant within the myostatin gene that we have looked at. The possible genotypes for this variant are G:G, A:G or A:A. Within two generations you can have a completely different genotype if each sire is mated to the right mare - a G:G sire has an A:G son, who in turn has an A:A foal. Within two generations the foal shares neither allele at this variant with its grandsire - they are two different horses. There is a great post on Slate.com on genetics and ancestry which covers this issue, and if you have the time I thoroughly recommend a read of it as it explains this better than I can. One of the more interesting parts of the article is:

Imagine that you could know that 22 percent of the genome of your child derives from your mother, and 28 percent from your father. Also imagine that you know that 23 percent of the genome of your child derives from your partner’s mother, and 27 percent derives from your partner’s father. And you could know exactly how closely your child is related to each of its uncles and aunts. This isn’t imaginary science fiction, it is science fact.

This raises a question: do the elite grandsons and granddaughters of A.P. Indy carry a higher proportion of his genome, relative to their other three grandparents, than the non-elite ones? Is that the difference between them? Some food for thought.
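As a rough sketch of the two ideas above, the short Python example below works through the myostatin genotypes as a simple Mendelian cross and then simulates how much of a foal's genome might come from one grandsire. The function names are my own, for illustration only. The second function rests on a crude assumption, that each of the horse's 32 chromosomes is passed on whole with no recombination and that all chromosomes are the same size, so it shows the idea of variation around 25 percent rather than realistic figures.

```python
import random
from collections import Counter
from itertools import product


def offspring_genotype_probs(sire, dam):
    """Mendelian probabilities for a foal's genotype at one two-allele variant.

    Genotypes are written like "A:G"; each parent passes on one of its two
    alleles with equal probability.
    """
    counts = Counter(
        ":".join(sorted(pair))
        for pair in product(sire.split(":"), dam.split(":"))
    )
    total = sum(counts.values())
    return {genotype: n / total for genotype, n in counts.items()}


def grandsire_share(n_chromosomes=32, rng=random):
    """Simulated fraction of a foal's genome that traces to its paternal grandsire.

    Crude assumption (hypothetical model): no recombination and equal-sized
    chromosomes, so the sire hands on each chromosome whole, drawn from his
    own sire or his own dam with equal chance. The sire supplies half of the
    foal's genome, so the result lies between 0.0 and 0.5.
    """
    from_grandsire = sum(rng.random() < 0.5 for _ in range(n_chromosomes))
    return 0.5 * from_grandsire / n_chromosomes


if __name__ == "__main__":
    # A G:G grandsire mated to an A:G mare, then his A:G son to an A:G mare.
    print(offspring_genotype_probs("G:G", "A:G"))
    print(offspring_genotype_probs("A:G", "A:G"))

    # Spread of grandsire share across a batch of simulated grandchildren.
    rng = random.Random(0)
    shares = [grandsire_share(rng=rng) for _ in range(1000)]
    print(min(shares), sum(shares) / len(shares), max(shares))
```

Under this model a G:G sire mated to an A:G mare gets an A:G son half the time, and that son mated to another A:G mare gets an A:A foal a quarter of the time. The simulated grandsire share averages about a quarter but differs from foal to foal, which is the variation the question about A.P. Indy's grandchildren is pointing at.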

In addition to the overweighting of distant ancestors, looking at the 'failures' threw up a few observations, based on those horses who started at least 5 times and couldn't get out of their own way on the racetrack:

  • A high proportion of these came out of mares that were unraced or moderately good performers, who in turn were out of unraced or unplaced mares that were good producers (thus the dams of these horses were half or full sisters to good horses). That raises the question: is it better to have the full or half sister to a Champion as a broodmare, or the daughter of a Champion?

  • I overweighted the grand-dam on the page in terms of her racing class. The broodmare sire, who is just a blip on the catalog page, is almost as important as the grand-dam in determining the potential of the horse as a runner, but he's like the Rodney Dangerfield of a mating - he gets no respect.

  • The 'fish and fowl' mating is also a trap. Best Race Distance is a highly heritable characteristic, so sending true sprinters to distance horses and hoping for something in between is a mistake. The owners of Black Caviar did the right thing in sending her to Exceed and Excel rather than, say, Galileo.

So today I no longer consider myself a Pedigree Analyst.

I am solely concerned with performance as a heritable characteristic or trait, and with the factors that influence this outcome. Line breeding and deeper pedigree analysis aren't among them. That is not to say I'm right and 'they' are wrong, more to say that I no longer believe that anything beyond the third generation of a foal really matters at all, because the data and genetics are just not there to support such a belief.

#Thoroughbred #Data #Genomics
