The Golden Retriever moved across the floor to where half a dozen dumbbell-shaped objects were spread out. As the dog approached one of these, the trainer said "Yes!" in an enthusiastic voice. The dog immediately grabbed the article and returned it to her in exchange for a treat.

A few moments later the article with the handler's scent had been returned to the group of other items and placed in a different location. Again the big yellow dog moved forward and was about to lift another object from the floor. This time, however, it was not the item with the correct scent. Upon seeing this, the handler announced "Sorry!" The dog stopped reaching for the wrong item, looked back at the trainer, and then in a rather subdued manner began to explore the other items. Ultimately he decided on the correct item, which triggered another happy "Yes!", and when he returned with it he got another treat.

The first part of this training sequence is quite a common and familiar aspect of dog training. It is similar to what is called "clicker training," where a sound or a signal serves as a reward marker to indicate that the dog has made the correct response. This marker informs him that a treat will be waiting when he returns to the handler. The second part of the training sequence is considerably less common, since in this case the word "Sorry" is a marker which tells the dog that he was wrong and that no reward is coming this time.

I asked the trainer why she chose to use this "no reward marker" and she said, "Telling the dog that it is wrong simply provides him with extra information and allows him to abandon any dead-end responses and move on to other behaviors that are more likely to be rewarded. I have read a number of times, and been to workshops where several well-known dog trainers have claimed that telling the dog when he is incorrect as well as when he is doing the right thing is a more efficient method of training."

The idea of a reward marker when training animals was introduced by the psychologist B. F. Skinner. I had a number of opportunities to speak with him since he would often visit Vancouver because his daughter was married to a faculty member in the history department at my university. On such visits he would often drop by the Psychology Department to visit with friends and acquaintances. At one point I remember asking him how dog trainers might use markers. Specifically, I wanted to know whether we should tell an animal when he is doing something wrong and is not about to be rewarded, in the same way that we tell him when he is correct and is about to get his reward.

He shook his head and smiled. "Every time you reward an animal for doing the correct thing you strengthen that response and make it more likely that it will occur again. But signalling to an animal that it is wrong makes that very signal a kind of punisher. And the truth is that animals want to avoid anything associated with any situation where they might get punished. Do you do crossword puzzles?" I nodded. "Well the fun of doing crossword puzzles is that whenever you get something right you feel as if you have been rewarded. Imagine what would happen if each time you put down a wrong word or wrote a wrong letter in a square, the puzzle buzzed to tell you that you were wrong. Do you imagine that working such a crossword puzzle would be as much fun as working one where you simply got to poke around until you got the right answer without any negative commentary? Do you think that you would voluntarily choose to work at that puzzle which gave you that extra negative feedback rather than opting for the more traditional format?"

I thought about it and came to the conclusion that Skinner was probably right. I believe that I would prefer the situation where all of my feedback focused only on my correct responses without making any fuss over my errors. If that is the case for a person, isn't it reasonable to presume that animals being trained would likely feel the same way? However, there were no actual data to support that conclusion, until now.

I recently obtained a copy of a thesis written by Naomi Rotenberg, who was a Master's degree student at the City University of New York's Hunter College*. The experiment reported in that thesis directly addressed this issue. Rotenberg's study was rather straightforward and involved 27 dogs which were being trained to perform a simple trick (to place both of their legs into a hoop on the floor). About half of the dogs were taught using only a reward marker, in this case the typically used sound of a clicker. The other dogs were taught with both the rewarding clicker sound and a tone (the note that we call "middle C" on a piano) which told the dog that he had made a mistake and chosen the wrong behavior.

The training sequence involved the experimenter issuing the command "Hoop" and then luring the dog into making the correct response, after which he heard the click and got a reward. The training was broken up into six different levels in which the lure was gradually phased out. At the highest level the dog simply received the verbal command and was expected to perform the behavior. The number of levels the dog successfully completed during the training session was one indication of how much the animal had learned. In addition, the percentage of correct responses served as another measure of the dog's proficiency.

The results were quite unambiguous. The dogs who were rewarded for their correct responses and who simply had their incorrect responses ignored did considerably better. These dogs learned more quickly and reached a higher level of proficiency than the dogs who received the "extra information" telling them when they were wrong. During the training sessions the median level of achievement for the dogs whose errors were ignored was level 4 (out of 6), while those who were told when they were wrong as well as when they were right achieved a median performance of only level 1. In terms of percentage correct, the dogs who received only the markers indicating when they did the right thing achieved a correct response rate of 60%, while the dogs who were also told when they did the wrong thing managed to be correct only 27% of the time. A statistical analysis showed that overall the dogs that were told only when they were correct were nearly twice as proficient at the end of training.

Rotenberg summarizes her results in this way (I have spelled out her abbreviations in square brackets):

Not only did [non-reward markers that told dogs that they were wrong and no treat was coming] significantly affect dogs' performance overall, but they led many dogs to fail very early on in the training session. Dogs that heard a [non-reward marker] following an early error continued to make errors, and none were able to progress to lure level 2. In contrast dogs whose early errors were ignored were able to recover and eventually move on to at least lure level 2. This pattern of results lends credence to some trainers' assertions that hearing [non-reward markers] might cause certain dogs to abandon training, rather than attempting to work past their errors to perform the behavior correctly…

In other words, dogs who are simply working to discover the correct behaviors, and are rewarded for those behaviors, keep at the training task and ultimately succeed, while those dogs who are not only told when they have made the correct response but are also told when they have made the wrong response seem to become despondent and give up on the whole learning task.

* Data from: Naomi Rotenberg (2015). Training a New Trick Using No-Reward Markers: Effects on Dogs’ Performance and Stress Behaviors. Master's Thesis, Hunter College, New York. CUNY Academic Works.

