Musings on the meanings of scores

You can measure and quantify human performance. Done every day in multiple industries, social sciences, psychology, medicine…

Judging to a standard can be measured…Dressage judging ain’t rocket science… perhaps the judges don’t want to submit themselves to that level of scrutiny.

Are you a dressage judge?

1 Like

Au contraire. A good understanding of physics makes understanding dressage and correctly executing it easier. It’s not magic.

4 Likes

Judging IS done to a standard. There are small degrees of variability within that standard because judges are human and horses aren’t machines. But to state there is no standard or the standard isn’t followed is simply false.

6 Likes

I never said there is “no standard”…pls read for comprehension. I have repeatedly quoted Objectives and Directives of the test…and listed the meanings of the scores multiple times.

As far as there being “a small degree of variability”…please enlighten me where those studies have been performed, by whom, using what methods, and what the quantification of error has been…because I would be very interested to read about these studies.

1 Like

The thing about judging to standards is that it is easiest to apply standards when they are simple, objective, and isolated. It is increasingly difficult to create and apply standards as the task becomes more complex, more expressive, and more entwined.

At one end of complexity, the shot put, the high jump, and the marathon race are judged on one metric of distance or speed. In this they are like show jumping or TB racing.

At increasing complexity, we have things like gymnastics and ice skating in which you perform a set routine but every movement has its own set of scores. And there is a component of aesthetic appeal too. This is closer to dressage. It’s already much more complex than evaluating safety protocol in a power plant.

At an even more complex level is actual dance. Or any of the arts. Here originality is important, even in a traditional discipline like ballet or classical piano. Once I was watching “So You Think You Can Dance” and they interspersed it with a few minutes of really good modern dancers from a big company. It was so obvious that we were dealing with two completely different realities. The professional dancers were expressive and fascinating while the contestants all honestly looked like they were doing gymnastics routines.

I could easily see that the professional dancers were miles better than the contestants but it would be hard to quantify this. And it might be hard to rank the various professional dancers using a simple scale because they were all masters of the skills, and the difference would be in expression and originality.

4 Likes

Agreed. But the fact that the “standard” is complicated is no reason NOT to quantify how well the judges adhere to the standard.

1 Like

I’m not a dressage judge but I am out there competing on a regular basis. Are you out there competing?

Judging IS done to a standard, a standard that you don’t agree with or seem to have difficulty relating to real life. You can’t quantify what you seem to want to quantify. Your ideas of how the judging scale works are simply untrue. Judges ARE adhering to a standard, the issue is you disagree with that standard. For example, your personal idea that a buck in a canter transition should earn a 1 simply does not match with the judging standard. Your idea that in the other post a horse is performing an extended trot is a misunderstanding of the standard.

Your idea that green horses don’t belong at shows is simply your personal belief that I disagree with.

12 Likes

We don’t know if judges are judging to the standard because that performance hasn’t been measured…at least not that anyone has shown. I have never seen any studies of dressage judge performance using standard statistical methods for measurement of human performance.

My agreeing or disagreeing with the standard is irrelevant.

My OPINION of how the judging marks are used is a totally different discussion than whether judges’ performance to the standard has been quantified

2 Likes

I scribed for Gen. Burton (LONG time ago, LOL). He was the best judge I ever scribed for. I told him I was going to do the L program because I was interested in judging. He then judged a 3rd level class, and every time he gave me a score, he would say, “why a 7?” or “why a 4?” If I responded correctly, he would say, “write that down.” If I was wrong, he told me why and gave me the comment to write. I got a HUGE education about judging that day.
Nonetheless, I still maintain that the judges are held to a standard and very few of them judge outside that. I don’t think it’s as willy-nilly as you do.

11 Likes

Well, there ya go…Gen. Jack Burton was former cavalry. I am probably of a similar “vintage” as you and rode and trained with these old cavalry guys, now long gone. Things seem to have been more clear and more well-defined back then when these guys were the dressage decision makers.

My PERSONAL experience…take that FWIW…with the L-Judges training was the following. We were allowed to ask questions after the videos were shown. One video showed a horse behind the vertical. I naively asked a question of how that would be scored…to learn…not to be provocative.

The answer I got was, (direct quote), “The horse was not behind the vertical.” Well, I may be old, but I am not blind. A simple re-wind of the video would have led to a purposeful discussion…if the trainer actually wanted to have that discussion.

DR-101-6 says the following

  1. In all the work, even at the halt, the horse must be “on the bit.” A horse is said to be “on the bit” when the neck is more or less raised and arched according to the stage of training and the extension or collection of the gait, accepting the bridle with a light and consistent soft submissive contact. The head should remain in a steady position, as a rule slightly in front of the vertical, with a supple poll as the highest point of the neck, and no resistance should be offered to the rider.

I simply wanted to understand how the horse shown in the video would be judged against this “standard.”

Again, I ask why no one has thought to quantitatively measure dressage judges’ performance to “the standard.” If the judges are as good as I am being told, then a measurement using standard quality tools would validate that performance. If there is large variation, then a program can be implemented and adjusted to reduce that variability to improve judging.

3 Likes

Perhaps it is you?

You seem to have trouble connecting the standard to real life movements. In the story about a horse that bucked into the canter you felt that was worth a 1, when people that had actually passed the L and judges said that it would always be a 3 or a 4. On the other post you felt that a horse was doing an extended trot when many other posters tried to explain to you why the gait of the horse did not meet the standard of scoring. People who were in the judges program and actively show and judge.

Do you show? You mention scribing many years ago.

10 Likes

So call me a hard-ass.

This is a discussion board no? We are discussing different opinions. Some people have different opinions. The horse in question was not in a show environment, so the scoring of that trot is a moot question.

1 Like

So there are several ways scores IRL can diverge from rubrics on paper.

One way is if there are huge differences in innate ability between different groups. Obviously there is no shared rubric for writing programs across universities. But if I tried to evaluate my sweet 4 year college kids by the standards of the super select research university it would be ridiculous. Similar things apply to the slope between CDI, rated shows and schooling shows.

Another way is if the written rubrics get in the way of outcomes that the judges, participants, and community feel are more important. At this point, the big trot with flashy front legs is the A plus movement in dressage. To get the exaggerated trot, biomechanically you have to ride more like saddle seat, which can mean behind the vertical, DAP, and trailing hind legs. Therefore everyone agrees to overlook some directives because the discipline has evolved away from that. There would certainly be parallels in writing, but I can’t think of any because I make up my own rubrics for each assignment.

The judge can also have some discretion how much to mark down flaws in a complex movement (do I take off points for spelling errors in a really good paper?) or could be outright incompetent but that’s actually rare.

The reason it’s not done for every national judge and show - cost for the show and USEF/USDF organizations.

I believe at some FEI competitions there is an overseer judge who watches all the scores come in from the panel, analyzes and compares, and will talk to an individual judge who is over or under scoring.

Always complete a judge/show evaluation form at the show if you feel like the judge’s quality and scoring should be reviewed.

1 Like

So I got to thinking about this a bit, and the idea of standards. To the point made earlier by Big Mama, there will be variability. The easiest place to see this is to look at scores from the CDIs where there are multiple judges. Five judges will have five different scores for the same ride. Often they are fairly close, but there are also times when they aren’t. Sometimes it may have to do with the angle of the particular judge - things look different at B or E than at C. It also has to do with the fact that the test is moving along, so the judge doesn’t have time to mull over any movement for more than a few seconds.
And then the question becomes which of these judges is judging to the standard? Or not?

2 Likes

That is not how a “real” statistical analysis would be done as that is not a controlled test.

The judges are already required to attend regular training for their license renewal. At these training sessions it would be pretty straightforward to structure a protocol that would not interfere with the judge training, AND would provide robust data for a statistical analysis.
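
As a rough sketch of what such a protocol could measure, assume every judge at a training session marks the same videoed rides. The judge names and percentage scores below are invented purely for illustration, and the function names are hypothetical. Note that this only quantifies how judges compare with each other, not whether the panel as a whole matches the written standard.

```python
from statistics import mean, stdev

# Hypothetical data: each judge scores the same three videoed rides (percent).
# These numbers are made up for illustration, not real results.
scores = {
    "judge_a": [62.0, 68.0, 71.0],
    "judge_b": [64.0, 70.0, 73.0],
    "judge_c": [60.0, 66.0, 69.0],
}


def panel_means(scores):
    """Mean score for each ride across all judges."""
    return [mean(ride) for ride in zip(*scores.values())]


def judge_bias(scores):
    """Average difference between each judge's marks and the panel mean."""
    means = panel_means(scores)
    return {
        judge: mean(mark - m for mark, m in zip(marks, means))
        for judge, marks in scores.items()
    }


def ride_spread(scores):
    """Sample standard deviation across judges for each ride."""
    return [stdev(ride) for ride in zip(*scores.values())]


if __name__ == "__main__":
    print("panel means:", panel_means(scores))
    print("judge bias:", judge_bias(scores))
    print("spread per ride:", ride_spread(scores))
```

With these invented numbers, judge_b sits about two points above the panel on every ride and judge_c about two points below. A real study would need many more rides and judges, plus an agreed reference score for each ride, before saying anything about adherence to the standard.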

As far as the FEI, I haven’t followed details for a few years, but David Stickland cobbled together some sort of measurement. Stickland is a PhD physicist, not a statistician or a quality improvement professional. When I discussed what Stickland was doing with a PhD statistician, the reply I got was (direct quote), “He is trying to reinvent statistical methodology.”

Out of Stickland’s work came the FEI Judge Supervisory Panel, JSP.

Although I give the FEI credit for stepping up to the question of quality in judging, this effort is still not based on tried and true statistical methodology.

And before someone jumps down my throat for this as being “too critical”…I adhere to the Japanese Quality Principles…e.g., perfection can never be attained, so there is always room for improvement.

3 Likes

Who would not pass the L with distinction. The concept of the judge programs is to create judges that uphold THE standard AS DESIGNED. If you don’t want to accept that standard methodology, you would not pass. (And this goes for the higher level programs as well). You are more than welcome to hold yourself and your students to your standards. Just don’t think you have the right to tell the rest of us that our understanding of the standard methodology is wrong.

16 Likes

Sorry to be cynical, but a study like this would cost money, and I don’t think there’s the political will to spend members’ fees on it.

1 Like

I do not think it is necessary, nor do I think it is even possible, to quantify what pluvinel wants to quantify.

4 Likes

It’s always good to check in on what is happening. Maybe such a study would come back and show everything is awesome, and nothing needs to change. Even knowing that is valuable. I’m not sure why there’s such strong pushback on this?

Self reflection is a useful tool, and that is what this would be.

2 Likes