Positive Reinforcement Training

Ah, my point exactly. Putting a reward on an R- behavior doesn’t make it R+.

Also, while I’ve had exceptional fun clicker training my mare, the poster upthread (Bluey, I think) who said you don’t know what kind of partner you’ll have until you try is spot on.

My mare is smart, extroverted, friendly but not cuddly, and confident. And super food motivated. She will add her own flourishes to tricks and yes, she can get way too excited.

I don’t know how well I’d do with a normal horse! I sometimes try to set friends up with a trick or two, and between the humans’ lack of timing and the horses’ lack of interest, they never get anywhere.

On the other hand, every horse can learn to give to pressure, and in very subtle ways, if the humans around him have tact and timing. And pressure can be just a wisp of a touch. The thing that makes it R- is that it isn’t necessarily followed up by a treat.

Also, while R+ is a strong motivator in a controlled situation, it is not always strong in the wide world.

Or there can be competing good things! At the fall fair, we perform on a nice patch of lawn. The first year, maresy wouldn’t lie down because every time she lowered her head, she started to graze, and our regular kibble treats were not as desirable as grass.

I solved that by getting “high value” treats, peppermints and gingersnap cookies, and letting her hang out in a paddock with grass between performances.

Thing is, I can work her with pressure on grass and she will listen. But the kibble reward was not enough to overcome the more pleasurable act of grazing.

Why does this difference matter? Because you need to understand the basis for your own actions before you can be an effective trainer. I agree that you can take a behavior that was taught in hand with R- and confirm it at liberty using treats. But you will find it difficult or impossible to get that behavior from scratch using just R+.

Also I don’t know why R- is somehow suddenly a terrible thing. I’m currently hacking out a dressage school mistress mare while my own horse is at pasture vacation. And 95% of our ride is just a whisper of aids, shoulder in and half pass and transitions. The other 5% is more directional on my part, if she spooks or decides it’s time to go home :wink: because nobody’s perfect.

My own horse isn’t this advanced but certainly can go through much of her repertoire under saddle and on the ground with minimal R- cues. Indeed in the first 6 months I was riding her and she was still green, there came a day when I suddenly realized she was “with” me handwalking, matching her pace to mine. Except under exceptional circumstances she would halt when I halted and walk or trot with her head at my shoulder. We had already refined pressure cues to the point they were invisible, indeed without me doing anything other than demand correct behavior every time in hand (because otherwise she’d pull me into the ditch to graze). This was before we were introduced to clicker training.

Anyhow the classic groundwork gurus all stress the use of R- pressure starting with very clear obvious and if necessary big signals on a green horse, and working up to very subtle signals on an advanced horse. That’s how you longe at liberty and make a horse stop and change direction at the canter. You don’t in general get there by pure R+.

I would say, ignore this at your peril. These body cues are what horses inherently understand because they are how they communicate with each other. And honestly if you don’t have tact, timing, and full self awareness of how you are using R- pressure every single minute around a horse, even just grooming, your clicker training will probably not be that effective because it is built on the same skills of tact, timing, and release.

Anyhow, honestly I think R- done at a very high level of subtlety is more impressive than R+. With good R- you can have a horse play with you for extended periods of time, at speed, and on minimal, almost invisible cues.

With R+ the extrinsic reward is always there and a horse can walk away if the treats aren’t forthcoming. Some tricks can only be taught R+, as I said upthread. But most speed and movement work is better off R-. Horse doesn’t really want to stop galloping for a single peppermint, once he is in the groove of going forward.

I think it started in dog training. “Positive only” is suddenly the only “humane” way to train dogs, and anything else is “cruel.”

I have had this discussion a lot regarding dogs - particularly because in hunting dogs R+ is not used as the primary training method. It can be used for many things, but R- and sometimes P+ or P- can be more effective. It is really difficult to teach a dog NOT to do something with R+.

“Don’t chase that bird” - “Don’t break point” - “Don’t eat the bird on a retrieve.” The biggest problem is that any reward offered by the handler is usually less desirable to the dog. The bird is the ultimate reward. That’s what makes them a good hunting dog - if they don’t want to get the bird, you can never teach them to hunt.

This comes up on the Menagerie at least once a year. The idea that recall and off lead obedience can be taught using R+ only. And for some dogs - it can. For dogs bred for the purpose of hunting - it’s probably not going to be the best/only quadrant to work.

That makes sense! Also some dogs are so cuddly that just a pat and praise is enough for R+. There are sniffer dogs whose reward in training is to find a toy. But absolutely even the biggest goofiest Labrador retriever isn’t going to be taught to stay off the couch by R+ only. He might only need one “bad dog” to slink away in shame. But that’s still P+ no matter how mild.

Horses have less repertoire of acceptable rewards and really food treats are the only reliable R+

Whereas with good dogs, they become super attuned to human praise and punishment and seek praise for its own sake.

Don’t forget the value of keeping engaged/keep working as a reward.
Had horses and dogs that went there.
Even judiciously applied jackpots were not that interesting over continuing the fun games.
Don’t know with sporting dogs, but herding dogs, yes, to stay in play itself is a top interest.

When our dog club started using clicker training after some of our members went to seminars, oh, some 30 or 40 years ago, everything changed.
We had some of those clinicians come here for seminars, we eventually started having clicker classes along with standard ones, then incorporating operant conditioning itself partly in the classes, where suitable.
Even those that had already been playing with treats learned how to use them properly as rewards instead of just being a treat dispenser:

https://m.facebook.com/dogsdentraini…846679/?type=3

Some of the clinicians mentioned here were part of the dog world then. Sue Ailsby came regularly for some years to help with her seminars, and before that, reports coming out of the Baileys’ talks (they of the dolphin world) were followed closely.
Remember, you can’t put a halter on a dolphin, you have to train by your wits and understanding what you are doing.

I am still scratching my head about someone thinking they can use “only xyz”, here R+?
I think someone has been missing the bigger picture of learning theory R+ is but a part of.

In my experience, most horses respond well to people who they associate with safety, and look to for guidance in times when they’re unsure.

My own mantra is “never betray a horse’s trust”. I believe the very presence of the “right” person in the horse’s life can embody the element of positive reinforcement, even while the person is correcting undesirable behavior using what some would call negative reinforcement.

It’s the same idea as how a good parent of human children sometimes has to say “no” to what their child wants. Because the parent knows what children think they want in any moment will not always be good for them to actually have.

All I think that really matters is the form the horse/person relationship grows towards becoming. If horse and owner are on a path towards building a bond that brings mutual trust and contentment to both, then why do the elements of their interaction need to be labeled good or bad by some third party whose likely interest is to sell you something, or pretend they have secret knowledge?

OP-what are your plans for this horse-how are you going to use him (for lack of a better term)? I’m having trouble understanding how you are going to have any practical, real world riding with only positive reinforcement, especially in a group. As we all know, every time you put your hands on your horse, you are training, whether you mean to or not.

I’ll tell you right now all the cookies and “good pony”s in the world will not get my forward galloping mare to come back when the horse in front of her slows down - that takes a gag bit and a strong seat. Now, I love her for it, but that’s just her. She’s hard wired to be that way, and I don’t see “nurture” overriding “nature”. She gives everything she has, and sometimes it’s more than necessary in that circumstance!

I think the kindest (and most successful) training is having a clearly defined goal, unrelenting consistency and impeccable timing.

They are taught first with leading so they walk with a single click before I walk. Halt before I halt when I say halt and go back with thumb on chest and the word back. Always 2 signals for back as you only want one horse to back out at a time in a float (trailer) when you say back.

After that they are taught to go back with a finger going side to side and the word back so you can ask a horse to back away from a gate so as you can open it and go through on another horse and a gentle tug on the tail and the word back for backing out of the float.

All with release of pressure in the beginning. With these signals they can be asked with no halter on.

Then onto lunging. You click, they walk forward, you step back and you are in the correct place to lunge. They are taught to stay in the gait they are in by lunge whip being cracked when they go out of the gait. Not when they are in the gait, so they learn to stay in the gait they are in. They know the words walk, trot, canter and halt. Later after being ridden, they also understand slow and quick.

So now when you see me lunge you see a horse that will respond to voice alone, no whip being cracked, AND either without side reins or working correctly in them.

So pressure and release is used to start with, if that is what you call using the lead rope to ask for forward after clicking (and praise), and using the lead rope when you say halt (and praise), but I don’t see that as punishment.

Once under saddle they know to walk with a click the words trot and canter and they can be used. Adding aids so that this can be non verbal. Again I don’t see this as punishment.

With what I said about slow and quick, the leg is not removed. It is on and asking for bend and flexion and inside leg into outside rein.

When I use good boy as a reward it is while he is doing it and the stimulus of seat, leg and rein are being used and not removed. THAT was the biggest breakthrough with Sim.

So yes they can go on voice alone, they can go with aids alone. These are dressage horses who are being worked and not just being taught tricks. They have also started over trot poles and cavalletti.

You think that all horses originally move away from pressure. That is not so. Originally when they came both Twiggy and Sim moved into pressure.

So picking up Twiggy sight unseen on the side of the road and she did not want to go on. Asking her forward with the halter did nothing. Tapping gently with a whip on her side did nothing. Tap her on the chest and she jumped forward. She jumped forward each time and was in.

So I had to adapt training for her. Sim also went into pressure when he arrived. You have to work with the horse you have and learn to speak the language they understand. 2 horses may not speak the same language.

Which is where I think you and I differ the most. I adapt to each horse. I don’t expect them to adapt to me. They learn faster when you do that.

Also remember that it is going to be different if you are training a horse from scratch or if you are retraining a horse.

To start, here is a website that briefly explains BF Skinner’s study of Operant Conditioning and positive/negative reinforcement/punishment. (Conditioning is something I have great interest in and studied a lot while getting my Masters in Psychology).

https://www.simplypsychology.org/operant-conditioning.html

It seems some people are confusing what positive and negative mean. They do not have to do with ‘stimuli’ per se, but with taking away or adding something. I will use the example of a dog I have been working with. He has to sit and wait for the release command for dinner. I tell him to sit and wait. If he does not wait while I am putting the food down, I pick it back up - I am taking the food away in response to him not doing as told. That is negative (taking something away) punishment (he didn’t do what he was told).
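
To make the add/remove distinction concrete, here is a minimal Python sketch. It is not from the linked article. It assumes the usual textbook reading that reinforcement makes a behavior more likely and punishment makes it less likely, and the names `Change`, `Effect`, and `quadrant` are made up purely for illustration.

```python
from enum import Enum


class Change(Enum):
    ADDED = "added"      # something is introduced after the behavior
    REMOVED = "removed"  # something is taken away after the behavior


class Effect(Enum):
    MORE_LIKELY = "more likely"  # the behavior happens more often afterwards
    LESS_LIKELY = "less likely"  # the behavior happens less often afterwards


def quadrant(change: Change, effect: Effect) -> str:
    """Return the shorthand label (R+, R-, P+, P-) for one consequence.

    Hypothetical helper: "+" / "-" only records whether something was
    added or removed, and "R" / "P" only records whether the behavior
    becomes more or less likely. Neither says anything about kindness.
    """
    kind = "R" if effect is Effect.MORE_LIKELY else "P"
    sign = "+" if change is Change.ADDED else "-"
    return kind + sign


# Food bowl picked back up when the dog breaks his wait:
# something removed, breaking the wait becomes less likely.
print(quadrant(Change.REMOVED, Effect.LESS_LIKELY))

# Treat handed over after a behavior we want to see more of:
# something added, the behavior becomes more likely.
print(quadrant(Change.ADDED, Effect.MORE_LIKELY))
```

The point of splitting it into two independent questions is that the label falls out of what was changed and what happened to the behavior afterwards, not from how mild or strong the stimulus felt.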

Also, there is a difference between cues and reinforcement/punishment. A horse that is voice trained will trot with the word ‘trot’. A horse that is cue trained will trot with use of the leg. Neither of these is conditioning; these are cues. A horse that trots with the word or leg is trained, so there is no need for operant conditioning, as previously used operant conditioning has done its job.

Likewise, if someone taps you on the shoulder to get your attention, that is not positive punishment, that is a cue to turn in the direction of the shoulder you tapped.

It seems some people also are not understanding how conditioning is used. Quite often more than one form is used in quick succession. For example, the same dog must sit before I open his kennel. If he gets up as I reach for the gate, I stop and tell him to sit again (negative punishment). If he continues to sit, I open the gate (positive reinforcement). We must sit and wait before exiting. If he gets up, I don’t move (negative punishment) and he hits the end of the lead (pressure of the lead is positive punishment). Once I give the release command, I give praise and we go out (both positive reinforcement). I rarely use negative reinforcement with this dog as attention is what he wants.

I used negative reinforcement with horses a lot, for example when teaching a nervous horse how to stand at the mounting block. Line up to the block. If the horse moves, it hits the end of the lead/reins (positive punishment). Once he stands for the prescribed time, I step away or walk the horse away from the block (or a bit of both, which is negative reinforcement since I am removing myself/the mounting block/source of worry). Sometimes I will give a treat as well (positive reinforcement).

Likewise, say you are teaching your horse to trot: you give a little pressure with the leg (cue), the horse does not respond, so you kick (positive punishment). If the horse responds, you relax your leg (negative reinforcement). If you give praise at the same time then you are using both positive and negative reinforcement at the same time.

I really do not see any way to use ONLY positive reinforcement and would be interested to see the progress of someone attempting this.

@Ajierene That is a fantastic article! I did find it the other day and it is very clear and easy to understand. I’d like to pull a quote from the “Negative Reinforcement” section:

“The removal of an unpleasant reinforcer can also strengthen behavior. This is known as negative reinforcement because it is the removal of an adverse stimulus which is ‘rewarding’ to the animal or person. Negative reinforcement strengthens behavior because it stops or removes an unpleasant experience.”

Negative reinforcement is the removal of an ADVERSE stimulus. The “negative” in “negative reinforcement” does not refer to the adverse stimulus, but the removal of it. But the adverse, or undesired, stimulus is absolutely present. If you were to remove a desired stimulus, that would be negative punishment, as shown in @Ajierene 's negative punishment example with the dog and the food bowl.

I use positive reinforcement because I do not want to introduce an adverse stimulus to my horse to produce a behavior, movement, etc. I do not bash those that use an adverse stimulus in their training, because it is not my place and they are free to train with the method of their choosing.

It is my personal choice to use positive reinforcement. I am confused as to why I received so many negative replies in my thread. I am all for a healthy debate, but I am afraid that some posters seem to be bent on changing my mind and criticizing my understanding of basic psychology.

I have practiced negative reinforcement for 8 years of my riding career, under numerous trainers, with countless horses. I did not like my lengthy experience with negative reinforcement, not because it isn’t effective or because it is physically abusive, but because of the way I noticed that it impacted the relationship between me and my horses. I am by no means a professional, but a mere equestrian who has tried two training methods, liked the effect on the horse from one method, and did not like the effect on the horse from another method.

I never called negative reinforcement inhumane, abusive, or cruel. Please quote me if you find these terms in my posts.

It is okay if you do not agree with the way I train my horse, and it is okay if I do not agree with the way that you train your horse. There is no need to fight to the death and prove that I must be misunderstanding you, or that I must not be educated enough, or that I am just plain wrong.

I am respectfully removing myself from this discussion, as I seem to be repeating myself with no end in sight. The purpose of this thread was to share experiences with positive reinforcement and tips, which has completely derailed.

OP, you seem to be missing the point that many of us use positive reinforcement successfully in some situations and are experienced in it.

You also seem to be missing the point that many of us use negative reinforcement to create a subtle relationship with well trained and willing horses both in the saddle and on the ground.

And that we are asking: what are your concrete goals for R+ training? What behavior do you want to teach by R+? How will you go about removing R- from your training?

We are not just being “negative” about R+. If you could give specific tasks you want to accomplish we might be able to give you suggestions.

At the moment you give the impression of having an abstract idea (R+ is morally better than any other quadrant) but no practical experience training horses yet.

What behavior do you plan to teach, and how do you see R+ working to teach that behavior?

OK, but seriously, why not?

I think you are hung up on the word “adverse.”

We have given many examples that the stimuli could be as minimal as stepping toward your horse, or resting your leg on them. You can use a gentle tap, a press of the leg, a rein aid that “closes” one side off so the horse chooses to move the other way.

Why do you think these are bad and/or you do not “want to introduce them”? What about this type of training is harmful, unpleasant, or in any way not kind or gentle? Why wouldn’t you use negative reinforcement if it is effective, is not harmful, and easier? Especially because you have even said that horses naturally move away from pressure?

Since you say that you never said R- was inhumane, abusive or cruel - what exactly is the objection?

This thread has only been “derailed” because most people do not use positive reinforcement for under saddle work. Because R- is more effective, and since it is not inherently unkind, makes far more sense.

As I stated, that article is basic and it uses a non-stimulus type example to illustrate negative reinforcement. It is interesting that you correct my explanation of negative reinforcement but reference my examples of negative punishment. Removing the food (positive thing dog wants) is negative punishment for the dog not obeying the command. The giving of the food is the positive reinforcement for the dog obeying the command.

Using my example of training a nervous horse to stand at the mounting block, moving away from the mounting block (or moving the mounting block) is negative reinforcement because you are removing the item the horse does not (at the time) like. You can use treats (positive reinforcement) to make the horse more relaxed at the mounting block. Neither is necessarily better than the other but one may work better than the other depending on the horse.

Similarly, when teaching a horse to turn left, you pull the left rein and once the horse turns, you stop pulling on the rein - that is negative reinforcement because you are removing the action (in this case the stimulus). How are you going to train a horse to turn left without using negative reinforcement?

Well, I’m sure you know that in true R+ scenarios you just wait for the animal to voluntarily offer the behavior, and then reward it and continue to shape the behavior.

How long will it take? Maybe somewhere between 5 minutes and never. What if the horse never turns left or even looks to the left?

There are dog trainers that literally use R+ only. And they just wait. I’ve used R+ only to teach (or try to teach) my dogs to speak. The theory is that they may offer random behaviors trying to please me…and if they make a sound I can reward, repeat, shape, etc. until they figure out the pattern.

One dog was great at this and learned it quickly after vocalizing out of frustration, got rewarded and now is perfect at this command. One dog just sits there politely and waits eternally, never offering behaviors and never vocalizing…maybe he’s not so smart? And sadly, one gets hysterical…offering frantic behaviors and spinning in circles, but never vocalizes her frustration, so she never gets rewarded and never understands.

Personally, I actually think this is unkind, at least for her. I could continue to wait and wait, hoping for a vocalization, but I have never managed to wait that long - and instead ask her for a command she does know and reward her for that.

So…just because R+ has the word “positive” in the name…doesn’t mean by definition that it is kind or humane. :slight_smile:

This is a really good point. I can get my mare overstimulated or confused with clicker training, and I watch carefully for those signs.

Pure R+ really does only work with offered behavior. I can teach my horse to play fetch but cannot teach a quieter minded less mouthy busy horse to do this.

I think all animal training has to contain both a positive and a negative element.

Once the animal is conditioned to expect a reward, the withholding of the reward becomes the negative element.

Who’s to say withholding a reward doesn’t fire the exact same neurons in the horse’s mind as does a light tap with a crop or spur?

This distinction between negative reinforcement and learned cues is important and I was skipping over it. Thank you for making this so clear.

My horse has cues she has learned from R+ and cues she has learned from R- that at this point in her career can look similar. For instance the cue to smile is holding up my arm and wiggling my fingers, and the cue to back up can be as subtle as a hand cue at liberty.

But I fully understand that my handcue to back up is grounded in R- and indeed in a very few but necessary moments of P+ in our early days.

Yeah, it all looks so seamless during our trick work. But that’s because it’s trick work. What you see is not always what’s really going on.

It has also been my impression that when horses learn to give to your pressure, when they learn that your pressure is consistent, fair, and sensible, that is when they fall in love with you. By which I mean they want to be close to you, they watch what you are asking, and they offer behavior they think you want (putting their head in the halter, stepping back to let you enter the stall, lifting a foot to be cleaned) without any treats involved.

I love my trick work, but honestly I am more deeply impressed by the everyday dance of attention and recognition that you get in any well mannered horse that has a tactful handler. And this is based on pressure refined to cues, but learnt through pressure.

OP says they have been riding many horses in less than satisfactory lessons for 8 years and now have their own horse. Reading between the lines, I figure an intermediate rider who has come up against a lot of sour lesson horses and hectoring beginner instructors in group lessons, and has probably got more than a few bad habits or crude harsh aids going on. Because that’s what that kind of lesson creates. When you start riding your own horse there is a lot to unlearn and relearn.

Key thing being to refine your aids. If you practice correct aids at just the force needed for effect, your horse will get lighter and lighter. Then you can turn him while riding just by looking in the direction you want to go. If you stop kicking with your heels and use your calves correctly, eventually horse will move forward at a whisper of leg. Etc.

OP, you do not need to give up riding to be kind to your horse. You just need to become a better rider, which means finding better teachers.

This is an interesting conversation.

Someone asked up thread about when you might use P- in horse training.

I think I use P- in combination with R+, when teaching my horses how to interact with human hands, including hand-feeding/taking treats. If they’re rude or pushy approaching my hand for a treat, then hand goes away/treat goes away (P-), time out, start over. If they’re polite, they get the treat (R+). This is more effective than just waiting for them to stop being pushy, ime. I also like my horses to be willing to lip at my closed fist - lips ONLY, without escalating - and use this to engage/distract them during vet/farrier visits. I basically just generalize what I do with taking treats politely to show them the barrier between “this amount of lippy-ness is ok, that is not.” I know lots of people don’t like interacting with a horse’s mouth like this, and I’m sure there are horses out there that wouldn’t be safe, no matter how much you worked at it, but it’s been a useful tool for me so far.

I can think of an example where one might try P- only, but it’s rather specific: My gelding loves to play with another gelding at our barn. If they’re turned out in a paddock together, they just keep ramping each other up, to the point where we get worried that they will hurt themselves or damage the fence. So we’ve stopped turning them out together, which is a shame because their other turn-out buddies don’t like to play so much. It’d be interesting to see if a P- approach of removing one of them from the paddock when the intensity passes some threshold, then returning him to the paddock after a brief time-out could teach them how to self-regulate the max intensity level of their play.

Regarding R+ under saddle, I agree it’s not feasible to do purely R+. The language of cues that we use to influence their movement is intrinsically R-, and that includes weight aids, which can not be avoided when you are on the horse. However, I’ve deliberately added some R+ to my under saddle work to improve clarity for my horse, and I think it has been good for our working relationship. I don’t believe horses understand pats or positive words like “good” on their own. I think they only learn that those things are rewards by associating them with things like walk breaks. So to make a more powerful association, I tried to power up the word “good” with treats on the ground the way you train them to have a positive association to a clicker. This makes an impact under saddle because if I’m asking him for something hard, like maintaining a lengthening or some lateral movement or to stay engaged on a circle, I need to keep my aids on – a well-timed “good” lets him know that he’s doing the right thing, even if it’s physically difficult for him, and even if he’s not getting the R- reward yet. My totally subjective and biased opinion is that this has had a positive impact on his relaxation and motivation, and I feel encouraged to incorporate this into my training with any future horse as well. One more tool for the toolbox.

Yeah…never is likely the scenario for my horse. She just isn’t that into treats to offer something all the time and I like to joke that she has ADD. She’s a very independent horse and while she does know voice commands (walk/trot/canter) and is in general great with hand/leg aids, as well as subtle body shifts, she can get distracted sometimes.

It’s not just me, it’s her personality. She’s very independent and is one of the ones you will find happily grazing alone. She was in a small paddock with one other horse for a bit and she LOVED that horse but when they got out (she itched her face on the gate in such a way to both cut herself, leaving evidence, and break out), she did not follow the other one back into the paddock - neither had halters so I grabbed the one closer and assumed because he was the boss she would follow but nope… She also jumped out when a top rail was down and went on an adventure. So no, strictly positive reinforcement like you said…is much closer to never on the spectrum and I don’t have that long to wait.

I agree and have seen other animals like you describe or animals not trained because people knew one way to train and didn’t have either the patience and/or training and/or willingness to research to figure out the best way to train THAT animal.

Yes, I wasn’t thinking of you when I wrote that but I did see some posts that seemed to at least imply every time you put your leg on a horse (nudge, kick, squeeze, whatever) it is some form of punishment when it is in reality the cue you taught your horse.

Yes, you noted earlier that grass was more interesting than reward so for some learning curves you had to use punishment rather than reinforcement. It is important to pay attention to whatever or whomever you are training to understand how they learn and what motivates them.

While I have a different focus for my horse and hence spend more time riding and less (read: none) teaching my horse tricks, I can definitely understand the joy of an attentive horse. When I’m in galloping position and we are cantering around and I “think” 20M circle and feel her start turning or feel her looking for the next jump and waiting for my subtle body shifts to point to it, or get that seemingly invisible upward or downward transition in dressage - pure joy and happiness.

Yeah, riding is also what I enjoy most.

The trick training evolved because the horse enjoyed it so much as she also enjoys “agility” work like circus boxes and teeter totters. But also good clicker training works in very short stints, like 10 minutes sessions.

So a lot of our trick training has been just little moments of play before evening feed especially during a couple of winters where I was riding mornings, teaching afternoon/evening and popping in for a late night visit on my way home.

I wouldn’t have given up saddle time for tricks!

It has however been eye opening how fast she can learn a behavior in a couple of repetitions now, if there is no physical impediment.

And I have to be careful saying “good girl” in the saddle. She might just slam on the brakes and turn her head to look at me. Even with R+ you can inadvertently be teaching the wrong thing from what you wanted.

Our very first session, when I was working with a semi pro clicker trainer to “load the clicker,” and we were working on touching a target, I noticed that my mare had taught herself a sequence. Something like look around, touch the handler, look at the ground, touch the target. Click reward. Then she ran through the whole exact identical sequence each time. She didn’t move on to just touching the target (now she does). It was pretty funny and I recognized it, the trainer didn’t.

A horse that will overcomplicate things for sure!
