Reward-based Training is not 'Bribing' Your Dog

More For Dogs
Apr 30, 2020

Updated: May 9, 2020

Now that we know dominance-style training is not an appropriate way to teach our dogs, let's take a look at the more modern and scientifically backed approach: reward-based training.

On several occasions I have heard dog owners or dominance-style trainers liken using rewards in dog training to 'bribing' the dog. In this blog post I would like to explain why that is not the case, and set out the science behind reward-based training.

Reward-based training methods are the most effective way of training our dogs and ensure a good bond between owner and dog. With this method, dogs are taught to trust their owner rather than fear them. They are allowed to freely display natural behaviours, which the owner can choose to reward if they want the behaviour to be repeated, because any animal will exhibit a certain behaviour if it results in a worthy reward. For example, as humans we go to work for the reward of money, or perhaps for the reward of helping others if we are carrying out volunteer work. Some wild animals, such as macaques, have learned to live near human populations for the reward of stealing leftover food. And a honey badger in a zoo in South Africa has learned how to escape his enclosure in a number of inventive ways for the reward of having more space to explore! Therefore, if we can control the reward for the animal, they will be willing to work for us rather than being forced to. However, this is not to say that punishment is completely avoidable when training animals.

Reward-based training is one quadrant of operant conditioning, a form of learning which can be applied to many different animals, both wild and captive, as well as humans. You may be familiar with operant conditioning or you may have heard of Skinner's box experiment. In this experiment, a hungry rat was placed in a box with a lever inside. The rat was given free roam to explore the new environment and, through this random exploration, the rat eventually pushed the lever, which released food into the box. The rat then continued to explore the box and at some point happened to press the lever again. He then began to learn that pressing the lever resulted in food, which became his reward for the behaviour. This was a very early example of reward-based learning in animals. The experiment was then expanded to use another form of reward. This time, a rat was placed in a box which had an electric current running through the floor, meaning that he was constantly receiving electric shocks. Out of panic, the rat raced around the box and accidentally pressed the lever, which turned off the electric current. After a few repetitions, the rat learned to go to the lever immediately to stop the shocks as soon as he was placed in the box. Therefore, this rat's reward was the termination of the electric shock.

Clearly these were two very different forms of training, but both can be categorised as 'operant conditioning'. This is because operant conditioning can be broken down into four quadrants. They are as follows:

- Positive Reinforcement. This means adding a reward to encourage a behaviour (for example, giving your dog a treat when he sits)

- Negative Reinforcement. This means taking away an unpleasant stimulus once a behaviour is achieved (for example, spraying water at a dog who is trying to steal food from the table and stopping as soon as he gets off)

- Positive Punishment. This means adding an unpleasant consequence to discourage a behaviour (for example, hitting a dog's nose when he jumps up)

- Negative Punishment. This means taking away a desired reward to discourage a behaviour (for example, turning and walking the other way when a dog pulls on the lead)

As you can see, not all parts of operant conditioning are positive, and this actually mirrors the normal daily life of animals. For example, I'm sure many of us have turned and walked the other way when our dog has tried to pull on the lead, meaning that he was 'punished' for his behaviour as he was not allowed to greet the other dog he was pulling towards. Other forms of punishment can be more severe and should be avoided, such as hitting a dog on the nose when he jumps up. Studies show that using punishment-based methods has a higher likelihood of causing further problematic behaviours in dogs and often leads to dogs being given up by their owners. They also show that dogs have a higher rate of success in training when reward-based methods are used instead. Therefore, rather than hitting a dog on the nose when he jumps up, we should instead teach him to sit when greeting a person and reward this behaviour. The dog will learn that sitting results in a reward, whereas if he jumps up he will be ignored. This method of training uses two quadrants of operant conditioning (positive reinforcement and negative punishment) but does not result in any physical or harmful punishment.

I do not believe that teaching a dog that sitting when greeting someone is more worthwhile than jumping up is 'bribing' him. We are offering him a free choice and then teaching him the consequence of his chosen behaviour. Once he has learned that sitting instead of jumping up is the best choice, we can remove the food reward and the greeting itself becomes the reward. Over time, even without a greeting (for example, if the dog is attempting to greet someone who does not like dogs), the dog will still choose to sit rather than attempt to jump up due to his previous learning. A food reward is no longer needed once the behaviour is learned.

Classical conditioning is another way in which animals, including dogs, learn. This was first discovered through research in the early 1900s by Ivan Pavlov. Pavlov was actually conducting an experiment to research digestion and came across classical conditioning (or Pavlovian conditioning) by accident. After the accidental discovery, Pavlov designed an experiment to research his finding further. He had noticed that the dog in his study would salivate when food was presented to him. Over time, the dog would salivate when he heard someone bringing his food. Pavlov started to ring a bell just before the dog's food was served. Each time before the dog was fed, the bell was rung. Eventually, Pavlov started to ring the bell without serving the dog food and noted that the dog would still salivate, even though there were no other indicators that food was coming. Therefore, the dog had learned to salivate at the sound of the bell ringing even without the original reason for the salivation (the food). Extensions to the experiment showed that if the bell was initially rung too long before food was served, the dog would not learn to salivate at the sound of the bell. This showed that the learning could only occur if the bell and the food were presented close together. We now use this knowledge when teaching our dogs.

If you have tried reward-based training in the past but had limited success, there are a few pitfalls that may have caused this. One possible problem is that you were not using the correct reward for your dog. Everyone works for rewards, but a reward is different for every person or animal. For example, some people strive for higher pay at work whilst for others, having more time working from home is more important. As every dog has his own personality, dogs all work for different rewards, too. The first step is to figure out what works for your dog. This could be cooked chicken, cheese or sausage, or it could be a particular game or a favourite scratch behind his ears!

Secondly, as the famous saying goes, we must learn to walk before we can run! This applies to dog training, too. If there are too many distractions or your dog is in a state of high anxiety or hyperactivity, he will struggle to concentrate on the training. For example, if you are having trouble with training a 'stay' because your dog regularly breaks away to play with a nearby toy instead, it would be far too difficult for your dog to learn in a room full of toys! Instead, we must start the learning in a low-distraction environment, such as at home with the toys stored away. Then, once your dog understands what we are teaching him and he is having success, we can start to increase the distractions gradually - perhaps by adding one toy at a time. Trying to increase the distractions too quickly means that your dog is more likely to fail and therefore learn to ignore you.

Thirdly, the timing of the reward is very important. For example, if you are trying to teach your dog to ‘sit’ then the reward must be delivered within one or two seconds of your dog’s bum hitting the ground. If you are not quick enough and your dog has already stood back up by the time you deliver the reward, your dog will think he is being rewarded for standing up rather than for the sit he performed previously. In this instance, with repetition your dog would learn that the word ‘sit’ means stand up! A common timing mistake is made in recall training. Many people want to make sure their dog does not jump up in excitement when he returns to them, which is fine, but this needs to be added at a later point in the recall training. The mistake is made when the dog owner calls their dog and, the moment the dog arrives, asks him to sit first instead of giving the reward immediately. This means that the dog receives the reward for the sit and has no idea that it was supposed to be for the brilliant recall he just did!

Overall, the claim that using rewards to teach a dog is 'bribing' him is not true. 'Bribing' implies that your dog is being tricked into behaving in a certain way, rather than having free choice. Instead, dogs learn the consequences of their chosen actions and behaviours. The behaviours which result in rewards will be repeated and those which do not will cease. Multiple studies show that reward-based dog training is highly effective and that punishment-style methods are more likely to lead to further problems.

Phone: 07719 616911

Email: morefordogs@outlook.com
