Down in Australia we have created a partnership and are selecting, purchasing and syndicating yearlings under the name of Matchem Racing. Our first year of operation was this year and we purchased 12 horses for A$2,172,500 (you can see a full list of the purchases here) and the first of them started racing last month.
The syndication company got off to a fast start with the highly rated first starter Fontiton, who skated clear by 6 lengths at her first start and is the highest Timeform rated two year old in Australia seen to date.
This week we were interviewed by Dave Duffield of Champion Picks discussing the methodology behind Matchem Racing...
Matchem Racing hit the headlines recently when Fontiton demolished the field in the $250,000 Inglis Banner race.
They use prediction modelling including DNA and cardio testing to try and identify racehorses that have the genetic and physical characteristics to become elite performers.
The technology has been developed by Byron Rogers, who spent eight years at Arrowfield Stud in Australia as Stallion Nominations and Bloodstock Manager, where he was part of the team responsible for guiding the stud careers of Redoute’s Choice, Flying Spur and Danzero amongst others. He also has extensive experience in the USA and has been based in Kentucky for a number of years. Combine that hands-on experience with a computer science background and you get a very interesting way of assessing yearlings.
Byron’s on the podcast this week to explain their unique approach.
Punting Insights You’ll Find:
The varying characteristics of sprinters and stayers
A very simple yet often overlooked breeding factor
The underestimated importance of the trainer effect
Whether any of this applies to the betting market
Byron Rogers – Matchem Racing
Read the Transcript
David Duffield: Can you start by filling in the listeners on a bit about your background just to let them know your experience in the racing industry. Then, we’ll talk about Matchem Racing and obviously Fontiton being the headline horse at the moment.
Byron Rogers: I started off working at Arrowfield Stud many moons ago, going back to when Redoute’s Choice and Flying Spur were just starting their careers. I’d actually been to university and done some computer science there, so I’d worked with data and all that stuff.
I’d been doing stuff with various companies up here in America and got more interested in how really to measure a horse. If you do the same thing that everyone else is doing you get the same results. I took the view of let’s try something different.
We started looking at measuring cardiovascular capacity in horses. That’s where we started with it and we took an ultrasound and just measured the cardiovascular capacity; how big the heart is. That’s what led me into then looking to genetics as to why these horses were running certain distances.
Obviously with my background at the University of Technology I took that feel for data. At the end of the day, when you start measuring everything, it all just comes down to data, and data modeling is what I was good at.
We started to build more and more data. We got to, geez, we’re now at a 3,500-horse database where we’ve got known racing outcomes. We’ve been able to model different things in terms of how far a horse really wants to run, and that’s not just genetic: you can get horses that are genetically milers but end up being sprinters because they’ve got slightly different cardiovascular capacities.
We’ve done a whole lot of modeling on really what makes a racehorse tick. We’ve been doing that for four or five years now. We’ve had a lot of success up here in America with horses like Verrazano, a horse we helped select as a yearling for clients, and New Year’s Day, who won the Breeders’ Cup Juvenile two years ago. This year we had two runners in Breeders’ Cup races out in California.
We’ve taken that technology and said what is the best application for it and that’s where Matchem Racing came from and that’s where we are today.
David Duffield: With the data base that you’d mentioned of 3,500 racehorses is that on a global basis?
Byron Rogers: Yes, that’s horses all around. We’ve got horses like Winchester who raced in America here and ended up racing in Australia. We’ve got horses like Icon Project that are all American based horses and all these, very good Tiznow and a bunch of those horses. We’ve also got horses in Australia like Rebel Raider and horses in Japan and horses in Europe.
Yeah, the data base is pretty well spread in terms of where everything is and where they raced. I’ve got a pretty good feel of what you need to do and why horses, some stallions in particular, work in Australia and don’t work overseas.
We seem to get a pretty good idea. When you look at the genetic markers and what the stallions do with them in terms of producing, and at what the average cardio of a stallion is, we can pretty much work out in what country they’re going to be best suited.
David Duffield: With the data base of 3,500 did you purposely include any horses that might not have been that successful on the track?
Byron Rogers: Yeah, there’s a bunch of those. It’s easier to find a slow horse than it is a good one. Yeah, there’s a lot of slow, average horses that we’ve got in the data base. That was the main thing is that when we started off we said let’s take 200 really fast horses and 200 really slow horses and let’s look at the genetic difference between them.
We did a genome-wide association study where we sequenced not the whole DNA, but the SNP markers of relevance, and found out which SNPs, or changes in genetic code, were the most important difference between them. That’s what we started to build the genetic side of the model on.
We started to look at what that all meant. The issue with just using DNA, and there are companies out there that just do DNA, is that your DNA isn’t your destiny, basically. It’s a very good negative predictor. If the DNA markers you’ve got in a horse are associated with slow runners then those horses are usually slow. That’s usually to do with the fact that when the oxygenated blood gets to the muscle they don’t process it properly and they don’t have the type of muscle fiber that you’re after, so they’re generally slow.
There are a lot of horses that have average genetic profiles in terms of their genetic markers but have great cardios. When you take a DNA sample from a horse, it’s the same DNA the horse had when it came out of its mother.
What we’re finding is that a thing called epigenetics, which is basically how that genome interacts with the environment, is almost as important as the DNA itself. You can find horses with average or below-average DNA markers that were raised very well and went to a good trainer, and they will outperform what the DNA alone would predict.
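As a rough illustration of the fast-versus-slow comparison described here, the sketch below runs a Pearson chi-square test on allele counts for one marker at a time. The allele counts and the function name `allele_chi_square` are made up for illustration; this is a generic case-control test, not necessarily the analysis Byron's team ran.

```python
def allele_chi_square(fast_alt, fast_ref, slow_alt, slow_ref):
    """Pearson chi-square statistic for a 2x2 table of allele counts."""
    table = [[fast_alt, fast_ref], [slow_alt, slow_ref]]
    total = fast_alt + fast_ref + slow_alt + slow_ref
    row_sums = [sum(row) for row in table]
    col_sums = [fast_alt + slow_alt, fast_ref + slow_ref]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            stat += (table[i][j] - expected) ** 2 / expected
    return stat

# 200 fast and 200 slow horses give 400 alleles per group (two per horse).
# Both sets of counts are hypothetical.
marker_a = allele_chi_square(260, 140, 180, 220)  # allele much commoner in fast horses
marker_b = allele_chi_square(200, 200, 198, 202)  # allele equally common in both groups
```

With one degree of freedom, a statistic above about 3.84 corresponds to p < 0.05 for a single test. A real genome-wide study tests many markers at once and would need a far stricter threshold.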
Equally, there’s horses with very good, what we would say have good genetic profiles, but they’re raised on a poor farm. They’re raised poorly and so that DNA never gets a chance to express itself the right way. We see all of those.
If you just do DNA we see all of those sorts of things occur quite frequently which is why the model we use is not just DNA. We took the view that you need to fill in the gaps in terms of how has that horse reacted to its environment? In terms of when it gets under pressure for the first time, as a young foal when it’s weaned, how does it react in that environment?
We do the cardios, and we also have a biomechanics model that puts a skeleton up on the horse and looks at bone lengths and all those sorts of things. When we put it all into one model, basically looking at 300 data points on each individual horse, we used a k-means clustering algorithm and then a Bayesian algorithm to classify them.
When we did that we found that obviously the genetic markers were very important, but there were also cardiac parameters and biomechanical parameters which were really important as well. A horse that has a sprinter genotype doesn’t need to have as big a cardio, for example, whereas a horse with a distance genotype really needs a big cardio because you’ve got to be able to get the oxygenated blood to the muscle.
Some of these horses also with big cardios, there’s a lot of horses with big cardios walking around but they’re slow because they don’t have what you need in terms of the muscle fiber type or the mitochondrial capacity to use that oxygenated blood the right way. There’s a whole lot of nuances and a whole lot of little things that matter.
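To make the cluster-then-classify idea concrete, here is a minimal sketch on two features instead of 300. The horses, their outcomes, and the starting centroids are invented, and the Bayesian step is a simple Beta-prior success rate per cluster, swapped in as an assumption because the interview does not say which Bayesian algorithm is used.

```python
import math

def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    labels = []
    for _ in range(iterations):
        labels = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        updated = []
        for c, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        centroids = updated
    return labels, centroids

def posterior_rate(successes, trials, prior_a=1, prior_b=1):
    """Posterior mean of a success rate under a Beta(prior_a, prior_b) prior."""
    return (successes + prior_a) / (trials + prior_a + prior_b)

# Hypothetical horses: (distance-genotype score, heart measurement in cm).
horses = [(0.10, 9.0), (0.20, 9.2), (0.15, 8.8), (0.90, 11.5), (0.80, 11.8), (0.85, 11.2)]
stakes_winner = [1, 0, 0, 1, 1, 0]  # made-up racing outcomes

labels, centroids = kmeans(horses, centroids=[horses[0], horses[3]])

# Estimated chance of a stakes winner within each cluster.
cluster_rates = {}
for c in sorted(set(labels)):
    outcomes = [won for won, lab in zip(stakes_winner, labels) if lab == c]
    cluster_rates[c] = posterior_rate(sum(outcomes), len(outcomes))
```

A new yearling would be assigned to its nearest centroid and inherit that cluster's estimated rate. In practice the features would need scaling so that no single measurement dominates the distance.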
David Duffield: You talk about the data modeling that’s one key aspect for the Matchem racing approach, how do you go about providing a performance rating for a horse purely from the catalogue?
Byron Rogers: That’s a step back from that. We’ve done a lot of data modeling in terms of how certain things influence outcomes. Some things are very underestimated when you’re looking at the catalog page. After all, the catalog page is just a sales document. It’s there to sell the horse.
There’s a couple things that are very much underestimated one of which is actually the actual race performance of the mare. Everyone looks down there and says she’s unraced or she’s unplaced and they skip that whereas actually that’s probably one of the most important things you can actually look at is how good the mare was as a runner.
There was a study put out just last week by the University of Sydney which looked at the heritability of race performance, using the log of earnings per start to signify class, and showed that it was actually quite heritable. So when we start looking at a catalogue we do a lot of modeling on knocking out horses that have certain characteristics in terms of what their mother did.
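The class measure mentioned here, the log of earnings per start, is easy to compute. The sketch below is only an illustration: the `+ 1` offset (so that a mare with zero earnings still gets a score) and the example mares are assumptions, not details from the study.

```python
import math

def class_score(total_earnings, starts):
    """Log of earnings per start; returns None for an unraced mare."""
    if starts == 0:
        return None
    # The +1 offset is an assumption so that zero earnings maps to a score of 0.
    return math.log(total_earnings / starts + 1)

# Hypothetical mares (earnings in dollars, number of starts).
good_mare = class_score(500_000, 10)
modest_mare = class_score(20_000, 10)
unplaced_mare = class_score(0, 5)
unraced_mare = class_score(0, 0)
```

Taking the log compresses the huge spread in prize money, so a mare earning 25 times as much per start scores a few points higher on this scale, not 25 times higher.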
We also look at other things like foal ranks and how old the stallion is. As a stallion ages his production of Stakes horses goes down regardless of the qualities of the mare. If you take out the quality of the mare that they’re bred to, as a stallion ages their production of superior runners goes down.
It’s the same with the mares: as the mare ages, the chances of getting a Stakes horse go down. So we’ve done a lot of probabilistic modeling to look at where, on a catalog page, we’re most likely to find Stakes horses. If you take a horse like Fontiton, she was by Turffontein, who at the time hadn’t done what he’s done today. He didn’t have Fontein Ruby and all these horses running around at the time.
He was a young sire in his first six or so crops, and she was out of a mare who had had, I think she was on her fifth foal. She was the fifth foal, but the mother could really run. Personal Ensign was a very good race mare, and Fontiton, as an example, fits into the modeling that we do prior to the catalog that gives us the best chance of finding the good horse.
David Duffield: It’s interesting because we’ve written about the favorite/longshot bias many, many times and some people are probably a little bit bored about hearing about it that it applies across, not just racing but financial markets and across the board.
I hadn’t really thought about it in a breeding context. I did read on your site that it applies there just as much, and that some people look at the real outlier results, where the occasional absolute long shot comes in, and underestimate something as simple as the race performance of the parents and grandparents?
Byron Rogers: Yeah, that definitely happens the minute that people look down the catalog page and search for answers. Even from a genetic viewpoint, let’s say you’ve got your yearling and you’re looking down the page and you see there’s something out of a daughter of the second dam. That’s actually got very little genetic relevance to the horse in front of you.
Yet we look at that and we sort of make decisions on it, so there’s a lot of bias in there from the viewpoint of looking at the catalogue, and we strip all of that out. We said let’s get rid of all that. It’s like a shiny object moment. You say look over here to the shiny object. We actually say let’s get rid of all that and let’s just get down to what actually matters on the catalogue page.
One of the other things that’s missed is the effect of the broodmare’s sire. If you look at the catalog page you can’t really make much judgement about the broodmare’s sire. It’s just a single name on the page, whereas actually he has got some significant relevance to the outcome.
There are all sorts of things that you, as you say, you overestimate when you’re looking at the catalogue or you’re looking at yearling sales. You overestimate certain things and underestimate others. We try to strip that away a little bit.
David Duffield: What’s your reasoning behind or I suppose your hypothesis that as the stallion ages and as the mare ages that the likelihood of a really fast race horse reduces?
Byron Rogers: From the mare’s side of things it’s a little bit easier to explain in terms of the endometrial wall and the age of the mare. I remember when we had Lady Giselle at Arrowfield, Terry at the time saying she’s got the uterus of a five year old mare. Even though she was at the time I think 15 or 16, she had a really young body relatively speaking. With those mares, when they age the endometrial wall is different and they produce old foals; you can actually physically see them anyway.
The stallion’s a little bit more difficult. We’ve done some numbers there where once the stallion gets past about 1,000 to 1,200 foals they really drop off. Those proven sires really drop off quickly in terms of their percentages.
I’ll give you an example there. Sadler’s Wells, one of the greatest sires of all time, ended up having 2,400 foals. From the first 1,200 foals he had 4.4% Group One winners to foals born, which is phenomenal. Most stallions would be lucky to have 4.4% Stakes winners and he had 4.4% Group One winners. In the next thousand he dropped from 4.4 to 1.1, and in the last 400 foals he never had a Group One winner. Basically he dropped off as he aged and as he served more; we’re not sure whether it’s an aging thing or a frequency thing. As they get older they definitely drop off. Whether it’s to do with DNA methylation patterns in the sperm, I’d be speculating. It’s very hard to know what the answer is other than that it’s an effect.
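The Sadler’s Wells figures can be turned into implied winner counts with a few lines of arithmetic. This sketch uses only the round numbers quoted in the interview, which are approximate, so the counts are indicative and not official statistics.

```python
# (segment, foals, Group One winners as a share of foals), as quoted in the interview.
segments = [
    ("first 1,200 foals", 1200, 0.044),
    ("next 1,000 foals", 1000, 0.011),
    ("last 400 foals", 400, 0.0),
]

# Implied number of Group One winners in each segment, rounded to whole horses.
implied_winners = {name: round(foals * rate) for name, foals, rate in segments}

# How many times higher the strike rate was early in his career than in the middle.
early_vs_middle = segments[0][2] / segments[1][2]
```

On these figures the first 1,200 foals imply about 53 Group One winners against about 11 from the next 1,000, a four-fold drop in strike rate.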
David Duffield: In that Sydney University study that you mentioned, the fact that log of earnings per race start you said was moderately heritable. The best race distance – that seems to pass down from one generation to the next?
Byron Rogers: Yeah, race distance, in terms of the genetics behind it, is highly heritable. Yet I think that most people listening to this and most people that breed horses would know it’s very hard to change the genotype of a horse over one generation. You’ve basically got two to three generations before you can change.
This in some ways gets back to the problem of breeders saying we need to breed Melbourne Cup winners or horses that are going to run classic distances. That’s a very hard thing to do, in that since the Golden Slipper and Blue Diamond and all those races came along, Australian breeders have been breeding for speed for 30 or 40 years or longer.
To switch that around takes two or three generations, so you’re really talking about everyone making a concerted effort to change that, and the generation interval for the thoroughbred is about seven years. To turn it around you’re looking at somewhere between 14 and 21 years to change the DNA profile of all the horses involved. It’s not an easy task to change that, but race distance, how far a horse wants to run, is a highly heritable characteristic.
David Duffield: What about trainers? Obviously a poor or mediocre trainer isn’t likely to help your chances at all, but how do you actually gauge that in the analysis that you do?
Byron Rogers: Trainer effect is one of the most bizarre things. Basically, that’s one thing people underestimate if you’re buying a yearling and sending it to a trainer. The trainer effect is quite large; I want to say it’s probably around the 30% mark in terms of the effect on the outcome.
All the other things with genetics and the body shape, cardiovascular and capacities of the horse and those sorts of things are all part of that. There’s some part of it which is just unexplainable luck, but one of the biggest effects is the trainer. If you’re sending your horses to an average trainer you’re giving the horse a fair handicap to start with.
David Duffield: All right, so back to Fontiton then. I’m pretty sure I heard, maybe with Matthew Cain after the win, but there were some other two year olds that you bought that had higher was it rankings or ratings?
Byron Rogers: Yeah, we rate them out from one to 100. She rated a 77 or something like that.
David Duffield: Was that an all-inclusive rating that one, the 77?
Byron Rogers: Yeah, that includes everything. That looks at the genetic markers and it’s an ensemble algorithm so it weights everything and looks at the genetic markers and the cardiovascular capacity measurements and all those sorts of things.
She rated a 77, and Matt and I talked about this, and basically we like any horse that rates a 70 or above. Once you get past about 70 it’s like being intelligent. Once you’re a smart person, you’re a smart person. It’s hard to work out the difference between a person with an IQ of 160 and 170. They’re just smart.
It’s the same thing with the way we do the ratings. We score them over. The highest rating I think we’ve ever had was a 94 or something like that. We’ve never had anything much over those numbers. I think anything over you’re effectively looking for a unicorn there, they don’t exist.
I think anytime that they get over the 70’s we’re pretty excited about the horse. We do have some horses out there that are in the high 80’s and stuff like that, so that we’re pretty excited about what’s coming through. Yeah, I think for anyone out there listening and thinking about what we do it’s just once you get over the bar of 70 they’re pretty good horses.
David Duffield: We have to turn our ratings into a price, but so do you, but it’s a dollar price. How do you turn 77 or whatever it is into okay, we’ll spend X amount on a horse?
Byron Rogers: That’s a good question. We look at it in terms of a price model, basically. What we worked out is that the key variable is actually not the service fee that the sire of the yearling stood at when the yearling was conceived, it’s the service fee in the season prior to the yearlings being sold.
Say for example his yearling [inaudible 00:20:17] about a B3. The biggest effect on the yearling price is actually the service fee that the stallion’s standing for and how popular the stallion was this breeding season. The fact that the I Am Invincibles, say for example, were bred at $10,000 apiece is irrelevant. It’s what he’s actually standing for this season, at $25,000, when he’s full and you couldn’t get into him. That’s the most relevant thing in terms of what drives the price up.
We’ve looked at some things there. If you take the horses standing at certain fees, look at all the yearling sales, and start to divide them out by standard deviations, you can actually work out that a really good looking horse by, say, a $20,000 stallion is going to cost you $150,000.
If you take a horse like Fontiton, at the time I think Turffontein was $8,000 or $10,000, but she was a very good looking horse and she rated well for us, so we knew that she was going to cost somewhere between $75,000 and $100,000, and she ended up costing $110,000.
We do some sort of modeling on what you think these horses will make given the stallion service fee, and there are some things there like if they’re out of a Stakes-winning mare or whatever, that has an influence as well. Yeah, so we’ve got a broad idea. Sometimes we buy horses, like we bought a filly for, I think, $70,000 at the Magic Millions where we thought she was one of the best fillies on the grounds in terms of her class rating and that sort of stuff.
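A minimal sketch of the "divide them out by standard deviations" idea follows: take the sale prices of yearlings by stallions in one service-fee band, then price a yearling by how many standard deviations above average it is on type. The sale prices in `band_20k` and the function name are hypothetical; only the Fontiton figures come from the interview.

```python
import statistics

def expected_price(band_prices, type_sd):
    """Price for a yearling `type_sd` standard deviations above its fee band's mean."""
    mean = statistics.mean(band_prices)
    sd = statistics.stdev(band_prices)
    return mean + type_sd * sd

# Hypothetical sale prices for yearlings by stallions standing at about $20,000.
band_20k = [40_000, 60_000, 80_000, 100_000, 120_000]

average_type = expected_price(band_20k, 0)   # an average-looking yearling
standout_type = expected_price(band_20k, 2)  # a really good looking one

# Fontiton's actual price as a multiple of the two service fees quoted for her sire.
fontiton_price = 110_000
fee_multiples = [fontiton_price / fee for fee in (8_000, 10_000)]
```

A real model would be fitted to actual sales results and would probably work on log prices, since yearling prices are heavily skewed; that refinement is left out here to keep the sketch short.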
For us we were pretty happy to buy her at that price. We would have paid a fair bit more, so I think the pricing we know if we’re buying a horse that’s a very good type and it’s at a certain price level, you have to step up and pay for them because A you really want them because we’ve done all the things we do, but B we also know that, we’re usually not Robinson Crusoe. We’re usually not by ourselves. There’s usually a few other people that are interested in the horses that we like.
David Duffield: Will you ever spend seven figures on a yearling?
Byron Rogers: We just bought one for $450,000 and one for $400,000 this year, I should say. Yeah, I think if we had the right horse, yeah. I mean there was a horse there that we couldn’t buy. I think it was Gerald Ryan’s thing out of Defiant Dame. He was I think a Fastnet Rock. We liked him as a yearling. We liked another horse, he only made $300,000 because we had some questions over the vet, but I think he would’ve made a lot more money, the Street Cry that James Bester bought called Wolf Cry that’s had one start; we rated him very highly as a yearling.
I think, yeah, if the right horse is there we could certainly have a stab at it. But when you get to those levels when they start talking about horses that are going to bring anything over half a million dollars you never know where they’re going to land.
David Duffield: The horse that rated 94 on your catalogue rankings. Do you remember how that one turned out?
Byron Rogers: Yeah, he’s actually a Group Two winner. That was probably his mark. He may have been a bit better, but he was a Group Two winner, wasn’t an absolute superstar. That’s what I’m saying: once they get above that 70 mark it’s very hard to say that because this horse scores here it’s that much better than that horse there, when it’s only a five point difference.
The one thing that is happening is we’re seeing all these horses. Because of the way we do the modeling, in Bayesian terms, every time you add a group of horses and you know more things about them you start to refine the model. You take your prior assumptions, you get the data and you update your assumption based on the data.
We do about 700 horses in terms of their genetics and cardios and everything we do every year. Every time that they filter through and they end up at the end of their four year old year there’s 700 more horses every time that we keep adding to the data base in terms of known racing outcomes. We’re starting to get better and better every year as to working out how good these horses can be.
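The prior-data-update loop described above can be sketched with the simplest Bayesian model there is, a Beta prior on a success rate. The prior of Beta(3, 97) and the count of 28 stakes winners are invented for illustration; only the cohort size of 700 horses a year comes from the interview, and the real model is certainly richer than this.

```python
def update_beta(alpha, beta, successes, failures):
    """Conjugate update: add the observed successes and failures to the prior counts."""
    return alpha + successes, beta + failures

def posterior_mean(alpha, beta):
    """Mean of a Beta(alpha, beta) distribution."""
    return alpha / (alpha + beta)

# Hypothetical prior belief: roughly 3% of this type of horse become stakes winners.
prior = (3, 97)

# One more year of known racing outcomes: 700 horses, of which a made-up 28 won stakes races.
cohort_size = 700
cohort_stakes_winners = 28
posterior = update_beta(*prior, cohort_stakes_winners, cohort_size - cohort_stakes_winners)

prior_rate = posterior_mean(*prior)
updated_rate = posterior_mean(*posterior)
```

Each new cohort's posterior becomes the next year's prior, which is why the estimates tighten as more horses reach the end of their four year old season.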
David Duffield: I know you’ve got a fair bit on your plate, but have you ever looked at applying this from a betting perspective, not just yearling selection?
Byron Rogers: I think there’s a guy in Hong Kong who does a similar sort of thing. Bill Benter. But I think that you’d need a, one of the reasons we’re successful is that we only look for a closed pool of horses.
We can predict a whole lot of different horses very well in terms of stallions and in terms of horses, but we know, one of the things that we know is we know we can predict a particular type of horse really well because we’ve got a lot of them. We know what separates the good one and the bad ones.
Just as an example, say if you take a horse, you take the average Australian yearling, you could look at their cardio internal diameter and the internal diameter will vary between like 7 cm up to 12 cm in size. If you take the average Australian yearling they’re sort of sitting around that 8 and 8 ½ cm. They’re not going to cut it when they grow. We have growth curves and all that stuff so when they grow they’re not going to make it into a really big cardio. Whereas a thing like Fontiton she starts off at 9 ½ cm, she’s going to end up at 11, so she’s got, so those little things make, in terms of the sprinters, make a big difference.
While the cardio is not absolute, if you’ve got a good size cardio in a sprinter or a sprinter-miler, we know, with one particular genotype and one particular type of horse, how to find those horses really well. We’ve really now circled around, so we’ve said OK, we’re going to find those horses better than anyone else.
I think if you’re going to apply it to betting you have to have A, a closed pool like Hong Kong where you can get all the data, or B, a very finite race type or set of variables that you would only bet under, to use it effectively. You’d have to be pretty disciplined to keep that going for a long period of time.
David Duffield: It’s just interesting, I mean breeding isn’t a real focus of the way we do the form because it has been very hard to quantify. Obviously you’ve put your life’s work into being able to quantify, but it’s something where you really need to have some expertise or otherwise if you’re dealing with it on a fairly shallow basis it doesn’t really add too much value.
Byron Rogers: Yeah, I think in terms of handicapping or betting and all that other stuff it’s very hard. I think the only thing you can do on horses that are first time out, like daily runners and stuff like that, is probably look up the mare’s race record and make sure that she was a 1000m winner and the thing was running you’d have a pretty good. That may have an effect.
As I said earlier on the trainer effect is quite large in terms of especially those early 2yo races. You know the trainers yourself that are really good at getting out those early 2yo’s.
Breeding outcomes are very difficult. Even after all the modeling we’ve done, one of the things we haven’t been able to do is predict breeding outcomes of hypothetical matings, in terms of if you just said here’s our sire and here’s our dam, what’s the outcome going to be? There are a lot of different permutations in terms of genetics and lots of stuff that can occur. We’re working on it but it’s still very hard to do.
I think the advantage we’ve got doing what we do in terms of selecting yearlings, at Matchem and up here in America with Performance Genetics, is that it’s in front of us. You know what you’ve got, you can’t go hiding from it. Yeah, I think that’s the best application for the technology at the moment.
David Duffield: Fair enough. It’s been a fascinating chat. Like I said, I’m no breeding buff on my own. We stick to the form side of things, so it’s been really interesting to learn the way that you guys go about it. Obviously, the success of Fontiton will help keep business kicking over. We really appreciate your time in joining us today.
Byron Rogers: Not a problem, thanks very much for having me.