Data collection and analytics in eventing

Video recordings are so ubiquitous. Why don’t officials watch videos of riders after an event and call out any behavior that wasn’t noted or marked on the day? You can still give a yellow card or a warning a week later.

You all do realize we were hashing this out 15 years ago? IFG even developed a rubric to enable statistical comparison for all incidents.

Record keeping at events is spotty at best and wholly incomplete at worst. There is no consistent reporting, e.g. uniform nomenclature, that allows effective comparison.

I’ll support you all if you can get it done this time. But as one who has a couple of clinical research trials going, I can tell you that the data you need to collect will need to be consistent and clear (IFG is the person who can create that).

Data analytics is only as good as the available data. And that is where this effort will fail without the development of a consistent system and training of TDs to fill out the required forms correctly (welcome to trying to do epidemiology).

Additionally, you need to have upfront the statistical parameters that will be measured (a parameter is not a measurement) in a matrix that enables testing. Think about exactly what your hypotheses will be and what parameters will need to be quantified in order to compare those parameters.

15 years is a long time. Maybe now is the time for people to listen. We are the ones who change our sport. Maybe those who tried in the past can give us a jump start.

Seems like US and Europe are the same on this, what a shame :frowning:

Totally in support of the effort! I’d love to see eventing as easily tracked and published as horse racing, and we have the tech to do it. Here are some articles I remembered that might help your quest, or you could contact their author to see where she got the information.
These have a data category of horse falls and rider falls broken down by event:
https://eventingnation.com/eventing-analytics-the-math-of-moving-up-east-coast-edition/
https://eventingnation.com/eventing-analytics-the-math-of-moving-up-west-coast-edition/

I suspect the USEA and FEI have far more data than they let on; gaining access to it will be the challenge.

First off, thank you to everyone who has contributed to this discussion! I have been reading all the responses and seriously thinking about this all week. I’ll respond more directly to people this weekend when I get some more time, but I do want to say that I’m working on a proposal of where I’d like to start with the data. One idea I have, and the one I think would be easiest to start with, is getting some data about every rider, XC jump, and show combination and putting it through a machine learning model called a random forest. I think we could use this to get a variable importance ranking, which would at least point us in the right direction in figuring out which variables are most strongly associated with falls. I’m putting together a list of the variables I’d like to use, how hard they would be to get in the current climate, and what an action item might be if we found a variable to be highly correlated with falls. This would not be used to advise any single person, but rather to advise the system as a whole on what to focus on. I’ll post the write-up this weekend and would love to get everyone’s thoughts on it!
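
For anyone who wants to see what a variable importance ranking looks like in practice, here is a minimal sketch. Everything in it is an assumption made for illustration: the use of scikit-learn, the three feature names, and the data itself, which is randomly generated and is not real eventing data.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_rows = 2000  # one row per rider, XC jump, and show combination

# Hypothetical variables; the real list would come from the proposal.
feature_names = ["fence_height_cm", "rider_starts_at_level", "downhill_approach"]
X = np.column_stack([
    rng.normal(110, 5, n_rows),    # fence height
    rng.integers(0, 30, n_rows),   # prior starts at this level
    rng.integers(0, 2, n_rows),    # 1 if the approach is downhill
])

# Synthetic outcome: falls are rare and, by construction, more likely downhill.
fall_probability = 0.02 + 0.10 * X[:, 2]
y = rng.random(n_rows) < fall_probability

# class_weight="balanced" is one common way to cope with rare positives.
model = RandomForestClassifier(n_estimators=200, random_state=0,
                               class_weight="balanced")
model.fit(X, y)

# Impurity-based importances, sorted from most to least important.
ranking = sorted(zip(feature_names, model.feature_importances_),
                 key=lambda pair: pair[1], reverse=True)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

One caution: impurity-based importances tend to favor continuous variables with many distinct values, so a ranking like this is usually cross-checked with permutation importance on held-out data before anyone acts on it.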

You still have to develop the parameters of the learning set. You don’t just take random data. Thus, you still need uniform parameters, and those may not be what gets reported, e.g. type of fence: a vertical or a palisade could be the same fence or different fences depending on who filled out the report. Hence the need for uniform reporting. Otherwise you will need to go through all the written reports and develop your own nomenclature.
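
To make the nomenclature point concrete, here is a minimal sketch of mapping free-text fence descriptions onto one controlled vocabulary. The vocabulary, the synonym choices, and the function name are all hypothetical; a real list would have to be agreed on by the people who fill out and read the reports.

```python
import re
from typing import Optional

# Hypothetical controlled vocabulary: free-text label -> canonical label.
# Whether two labels really describe the same fence is a decision for
# officials and course designers, not something this table can settle.
FENCE_TYPE_SYNONYMS = {
    "vertical": "vertical",
    "upright": "vertical",
    "palisade": "palisade",
    "table": "table",
    "picnic table": "table",
}

def normalize_fence_type(raw: str) -> Optional[str]:
    """Return the canonical label, or None if the text is not recognized."""
    key = re.sub(r"\s+", " ", raw.strip().lower())
    return FENCE_TYPE_SYNONYMS.get(key)

# Invented report entries, as different people might have written them.
reports = ["Vertical", " upright ", "Palisade", "picnic  table", "big scary thing"]
normalized = [normalize_fence_type(text) for text in reports]

# Anything unrecognized goes to a person for review instead of being guessed.
needs_review = [text for text, label in zip(reports, normalized) if label is None]
print(normalized)
print(needs_review)
```

Returning None for unknown text is deliberate: guessing a category would quietly reintroduce the inconsistency the uniform reporting is meant to remove.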

Make an array of parameters. I bet you would have somewhere between 25 and 50, with about 10 having a significant principal component effect.

Thanks for those links, they were really interesting!

It would be great if the USEA or USEF could use the show results they already have to calculate and publish rates of horse falls, rider falls, refusals, and completion at each venue and level. Perhaps that would encourage some accountability for venues and course designers with abnormally high fall rates.
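
As a rough sketch of what that calculation could look like, assuming one row per cross-country start: the row layout, venue names, and every value below are invented placeholders, not real results.

```python
from collections import defaultdict

# Invented rows, one per cross-country start:
# (venue, level, horse_fall, rider_fall, refusals, completed)
starts = [
    ("Venue A", "Novice", False, True,  0, False),
    ("Venue A", "Novice", False, False, 1, True),
    ("Venue A", "Novice", False, False, 0, True),
    ("Venue A", "Novice", False, False, 0, True),
    ("Venue B", "Novice", True,  False, 0, False),
    ("Venue B", "Novice", False, False, 0, True),
]

def rates_by_venue_and_level(rows):
    """Return rates per start for each (venue, level) pair."""
    totals = defaultdict(lambda: {"starts": 0, "horse_falls": 0,
                                  "rider_falls": 0, "with_refusal": 0,
                                  "completions": 0})
    for venue, level, horse_fall, rider_fall, refusals, completed in rows:
        t = totals[(venue, level)]
        t["starts"] += 1
        t["horse_falls"] += horse_fall      # booleans count as 0 or 1
        t["rider_falls"] += rider_fall
        t["with_refusal"] += refusals > 0   # starts with at least one refusal
        t["completions"] += completed
    rates = {}
    for key, t in totals.items():
        n = t["starts"]
        rates[key] = {
            "starts": n,
            "horse_fall_rate": t["horse_falls"] / n,
            "rider_fall_rate": t["rider_falls"] / n,
            "refusal_rate": t["with_refusal"] / n,
            "completion_rate": t["completions"] / n,
        }
    return rates

rates = rates_by_venue_and_level(starts)
for key, summary in sorted(rates.items()):
    print(key, summary)
```

The number of starts is kept next to each rate on purpose: a venue with a handful of starters can show an alarming rate from a single fall, so the denominator needs to be published with the rate.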

There is a lot of analysis associated with the ERQI ratings, but it is focused on identifying horses that are likely to fall, rather than fences that are likely to cause falls. You might be able to piggyback on their analysis.

How do they actually do that? Do they incorporate the accident data into their algorithms? Do they take fence, weather, and venue data? Are the rider qualifications considered?

Janet, how can the system identify specific horses? Has it ever predicted a horse and rider fall before it ever happened? What sort of validation is done? Has this ever been done at low level, small events, e.g. did it ever predict a fall at BN in Arizona?

Analysis without verification is just that. It is economics. It is man-made numbers that GUESS at an outcome.

ERQI isn’t perfect, and to my knowledge it doesn’t include every single one of the variables you discussed above (although it is proprietary, so there is only so much information we have about the exact calculation). This is what has been disclosed publicly:

The ERQI value takes into account the class at which a horse is competing, the rider who is competing on the horse and the level of performance displayed by all of those who competed in the same class. An ERQI can react to whether a competition was statistically harder or easier than the average for that level of competition.

Source: https://www.eventingireland.com/Portals/0/EasyDNNNewsDocuments/936/ERQIs%20FAQ%20feb16%20v2.pdf

Early data indicates it is more effective than MERs alone as qualifications, explained here:

Eventing Ireland (EI) was the first and only national federation to utilize the ERQI during the 2016 season, targeting all national levels. They saw a 56% reduction at the national two- and three-star levels, with a staggering 66% reduction in horse falls at the national two-star level alone.

Source: https://eventingnation.com/equiratings-quality-index-uses-risk-analysis-for-a-safer-sport/

ERQI does not predict that a horse will fall, simply that a horse is at a higher risk of falling than others. That said, since you asked for an example of a horse and rider fall it predicted before it happened, one I know of was Lauren and Veronica at Rio. Equiratings had alluded to that combination several times, as clearly as they could without outright naming them, prior to that fall. Colleen Rutledge and Escot 6 at Kentucky in 2016 was another they had not-so-subtly flagged up prior to their horse fall there. That’s just off the top of my head.

If you would like to investigate further, additional links are below (depending on how much time you have on your hands):

Detailed explanation: https://useventing.com/news-media/news/equiratings-quality-index-explained
FAQs: https://useventing.com/safety-education/safety/equiratings-quality-index-faq
Podcast: https://usea.podbean.com/e/introducing-erqi-the-equiratings-quality-index/

Yeah. Yeah. I’ve seen and read all that. The questions still stand. You use a very high level example. How about prelim at St. John’s in New Mexico? Your example is one where I would ask whether there is a high-level bias, which would suggest algorithm bias.

At the same time, how does one account for any officials bias, e.g. a dressage judge who is consistently 10% different than another?

ERQI is a good validation of risk management but is not actually a risk management tool.

I have yet to see how Ireland actually did their comparisons. Was it normalized to participant rate? How do you know if that claim is a real effect or simply an outlier of random chance?
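
To show what those two questions look like as a calculation, here is a minimal sketch: falls are divided by starters to get a rate for each season, then a pooled two-proportion z-test asks how surprising the difference would be if the true rate had not changed. The counts are invented for illustration and are not Eventing Ireland’s figures, and the z-test is my choice of method, not theirs.

```python
import math

def two_proportion_z_test(falls_a, starts_a, falls_b, starts_b):
    """Compare two fall rates; returns (rate_a, rate_b, z, two-sided p-value)."""
    rate_a = falls_a / starts_a          # normalize to participation
    rate_b = falls_b / starts_b
    pooled = (falls_a + falls_b) / (starts_a + starts_b)
    standard_error = math.sqrt(pooled * (1 - pooled)
                               * (1 / starts_a + 1 / starts_b))
    z = (rate_a - rate_b) / standard_error
    # Two-sided tail probability of a standard normal variable.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return rate_a, rate_b, z, p_value

# Invented counts: 20 falls in 1000 starts, then 10 falls in 1000 starts.
rate_before, rate_after, z, p_value = two_proportion_z_test(20, 1000, 10, 1000)
reduction = 1 - rate_after / rate_before
print(f"before={rate_before:.3f} after={rate_after:.3f} "
      f"reduction={reduction:.0%} z={z:.2f} p={p_value:.3f}")
```

With these made-up counts the rate is cut in half, yet the p-value lands a little above 0.05, which is the point: a large percentage drop built on a small number of falls can still be consistent with chance. With counts this small an exact test would be the safer choice.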

This is where the real research must be done if we want data driven rules and standards.

I’d love to help out. I have a background in data analysis and would love to help where I can. I’ve been thinking along these same lines for a while.

Here it is in case anyone is having difficulty finding it:
https://inside.fei.org/system/files/Eventing%20Audit%20-%20Charles%20Barnett%20-%20Final%20Report%2026.07.16.pdf

https://useventing.com/news-media/eventing-tv/erqi-reports-for-officials-explained

Thanks for this link. It’s interesting - especially the risk analysis on fences. It would be nice to see this report updated every two to three years. I see the FEI has an online reporting tool that is supposed to be extended to Eventing this year.

I sat down for several years and collected videos of accidents, probably 200 or so, and did stride-by-stride analyses.
Most of the time I looked at as much video as I could get of each ride, then concentrated on the jump and on how things developed during the ride.
I sent those analyses to the FEI, USEF and so on. Hundreds of hours of work, in 2008, 2009 and into 2010.
Never heard a word.
I might have killed the big plastic imitation log, when it was used at the WEG, by analyzing the video of the South American rider, which was heralded as such a big success.
The log nearly killed him.

Do not expect any response from the powers.
They are too used to working in their own self-interest, which is big money and the good old boy network.

A risk analysis of fences is BS. You have to understand the course design and the terrain and how the jump was or is used.
Besides eventing, I built jumps and worked with course designers. A sobering experience.

A methodology was included in this report. The analysis included those factors (to the extent that they had the information) and made recommendations on how to improve the data collected. The report also recommends yearly analysis, as data collection is improved and updated.

Sad but true