Beginning in the 1800s, behavioural scientists were in their labs discovering the principles that laid the groundwork for the 1938 arrival of operant conditioning. At the same time, without using the technical terminology or being aware of the scientific theories, dog trainers were using many operant conditioning methods. But first, let’s consider the theory of classical conditioning.
Ivan Pavlov (1849 – 1936)
Today’s theories of behaviour began with the work of Ivan Pavlov. A Nobel Prize winner, Pavlov was a Russian physiologist who studied digestion in dogs. In the course of his research, Pavlov observed that the dogs he was studying would salivate before food was placed in their mouths. He thought the dogs were associating the lab assistants, or the sound of the door opening, with food. He tested this theory by ringing a bell just before feeding the dogs. After a number of trials, ringing the bell would cause the dogs to salivate even if food was not forthcoming. This became known as a conditioned reflex. The development of such reflexes has come to be known as Pavlovian conditioning or, more commonly, classical conditioning. This is all to do with reflexes – learning by association; the pairing of two stimuli. Pet owners will be fully aware that shaking the biscuit box will bring their pet running in anticipation.
Edward Lee Thorndike (1874 – 1949)
While Pavlov was busy in Russia studying the kind of learning that involves reflexive responses, in the United States Edward Lee Thorndike began studying the effect that different consequences have on new behaviours. This was important groundwork for the development of what is now known as operant conditioning. Thorndike is known for the Law of Effect, which basically states that behaviours that produce rewards will increase in frequency. If you do something that brings a reward, you are more likely to do it again. For example, if you get up and go to work and then get paid at the end of the week, you will likely do it again next week. Thorndike’s work provided the foundation of all the treat training we use with dogs today.
John Broadus Watson (1878 – 1958)
J B Watson was a psychologist who worked at Johns Hopkins University and the University of Chicago. In spite of the studies of his counterparts, he is credited as the father of modern behaviourism. His view was that thoughts and feelings were unscientific and that a more objective and observable view was needed. Watson’s ‘Little Albert’ is enshrined in the history of psychology. Albert was an 11 month old boy. Watson and his colleague, R Rayner, conditioned a fear reaction in Albert. Initially, Albert was allowed to play freely with a rat. Then a loud bang was presented whenever Albert reached out to touch the rat. Within days, whenever the rat was presented, Albert would withdraw and cry, even without the bang. He also generalised his fear to other things, including a rabbit, a dog and a Santa Claus mask. Watson was using classical conditioning – in this case a startle reflex – to modify Albert’s behaviour. A YouTube video showing the experiment proves difficult to watch. Of course, today it would be considered unethical, indeed illegal. Watson is responsible for today’s branch of psychology known as behaviourism.
Burrhus Frederic Skinner (1904 – 1990)
For obvious reasons he was known as B F Skinner, an American psychologist, behaviourist, author, inventor and social philosopher. He was a professor of psychology at Harvard University from 1958 until his retirement in 1974. Considering free will to be an illusion, Skinner saw human action as dependent on the consequences of previous actions, a theory he would articulate as the principle of reinforcement. If the consequences of an action are bad, there is a high chance the action will not be repeated; if the consequences are good, the probability of the action being repeated increases.
He was influenced both by Pavlov and Watson. He expanded Watson’s work on behaviourism when he described the science of operant conditioning. When he was a doctoral student at Harvard University, he discovered that he could systematically change the behaviour of rats by giving them food rewards for pressing a lever in his (infamous) Skinner box. He was the first to talk about conditioned reinforcers and conditioned punishers.
I am writing this blog during the International Dog Trainers Winter Summit 2020. This was opened with an engaging presentation by Ian Dunbar. Ian will, I’m sure, go down in history as one of the GREAT dog trainers. At the summit he spoke about, amongst other things, putting an unwanted behaviour on cue, schedules of reinforcement, making an unwanted behaviour the reward, the importance of games in training, luring vs. bribing, ‘response reliability ratio’ and ‘analogue feedback’.
PUTTING AN UNWANTED BEHAVIOUR ON CUE
We may have a dog who loves to bark at the sound of the front door bell. This may be desirable to alert household members, but is it out of control? We could put the behaviour on cue, for example ‘bark’ or ‘talk’ along with an appropriate visual cue. A reward may not be necessary as the unwanted behaviour of barking becomes the reward! Once the dog has mastered the new ‘trick’ we can then teach the opposite – ‘shush’, again with an appropriate visual cue. Practise this regularly when the dog is sitting quietly. The reward then becomes ‘bark’!
MAKING AN UNWANTED BEHAVIOUR THE REWARD
See above. Another example may be a dog who loves to run away and chase. Assuming we have trained our dog to ‘sit, down, stay’, we then cue the dog ‘go play’ as the reward. This removes the need for a food reward and may prove more reliable, as the dog places great value on running away. We have also put the unwanted behaviour – running away – on cue. Next, recall him to the ‘sit, down, stay’ position – he will come running, ready for the next ‘go play’ – and the cycle is repeated. It forms part of the dog’s repertoire of tricks and adds the element of fun and games.
THE IMPORTANCE OF GAMES IN TRAINING
See above. Tug, chase and ‘hide and seek’, amongst others, help build a dog’s confidence and sociability. Search and rescue dogs, drug detection dogs and others are trained using the element of fun and games. The ‘real thing’ then becomes a game – the dog does not differentiate. Sounds like nirvana to me!
REINFORCERS AND SCHEDULES OF REINFORCEMENT
Here I will discuss reinforcement as opposed to punishment (think of the two as mirror images). Reinforcers can be either positive (something the dog wants is added to increase behaviour) or negative (something the dog finds unpleasant is withdrawn to increase behaviour). They can also be secondary (such as a mechanical ‘click’ or verbal praise) or primary (such as food). Depending on the character of the individual, a secondary reinforcer may cross over and become a primary reinforcer – for example enthusiastic words of praise.
A reinforcement schedule is a rule stating which instances of behaviour, if any, will be reinforced. This is a component of operant conditioning for which we can thank Skinner. Schedules can be divided into two broad categories: continuous schedules and intermittent schedules.
- Continuous Reinforcement Schedule
The desired behaviour is reinforced (rewarded) every single time. Because of this the association is easy to make and learning occurs quickly. However, this also means that extinction occurs quickly once reinforcement is no longer provided. Furthermore, the value of the reinforcement, for example food, is lessened and the dog potentially loses interest.
- Intermittent Reinforcement Schedules
Unlike a continuous schedule, intermittent schedules only reinforce the desired behaviour occasionally rather than all the time. This leads to slower learning since it is initially more difficult to make the association between behaviour and reinforcement. However, intermittent schedules also produce behaviour that is more reliable over time and resistant to extinction. In reality, a trainer may start with continuous reinforcement and phase it out in favour of intermittent reinforcement. Here it can get complex for the novice trainer, and the detail is beyond the scope of this blog. However, the six intermittent schedules are listed below:
(1) Fixed ratio
(2) Variable ratio
(3) Fixed interval
(4) Variable interval
(5) Fixed duration
(6) Variable duration
Schedules 2, 4 and 6 reinforce at random and may cause the younger dog to become frustrated. Conversely, they may improve reliability insofar as they keep the dog guessing.
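For readers who like to see things spelled out, here is a rough sketch in Python of the difference between schedules 1 and 2. It is my own illustration, not something presented at the summit. The function names, the seed and the way the 'random' count is drawn are all assumptions made for the example: fixed ratio rewards every nth correct response, while variable ratio rewards after an unpredictable number of responses that averages out at n.

```python
import random


def fixed_ratio(n_responses, ratio):
    """Reward every `ratio`-th correct response (e.g. every 3rd sit)."""
    return [i % ratio == 0 for i in range(1, n_responses + 1)]


def variable_ratio(n_responses, mean_ratio, seed=0):
    """Reward after an unpredictable number of responses.

    Assumption for this sketch: the number of responses needed for the
    next reward is drawn uniformly from 1 to 2 * mean_ratio - 1, which
    averages out at mean_ratio.
    """
    rng = random.Random(seed)  # seeded so the example is repeatable
    rewards = []
    needed = rng.randint(1, 2 * mean_ratio - 1)
    count = 0
    for _ in range(n_responses):
        count += 1
        if count >= needed:
            rewards.append(True)   # the dog gets the treat
            count = 0
            needed = rng.randint(1, 2 * mean_ratio - 1)
        else:
            rewards.append(False)  # correct response, but no treat this time
    return rewards


# True marks a rewarded response.
print(fixed_ratio(6, 3))
print(variable_ratio(12, 3, seed=1))
```

With the fixed schedule the dog can learn exactly when the treat is coming; with the variable one he cannot, which is the 'keeps the dog guessing' effect.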
RESPONSE RELIABILITY RATIO AND ANALOGUE FEEDBACK
RRR is described as (number of correct responses ÷ number of cues) × 100. For example, if we cue the dog ‘sit’ 10 times and he responds correctly 5 times then: 5 ÷ 10 × 100 = 50%. Ian showed the results collated as bar charts over two time periods for comparison. This is what science is all about – collecting and collating data then presenting it in a clear and understandable way. The results are unambiguous and prompt us (and our clients) to move to the next stage of training.
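If you keep your training records on a computer, the sum is easy to automate. The snippet below is only a minimal sketch of the formula as I have described it. The function name is my own, and the second week's figures are made up purely to show a before-and-after comparison; only the 5 out of 10 comes from the example above.

```python
def response_reliability_ratio(responses, cues):
    """Return RRR as a percentage: correct responses / cues x 100."""
    if cues <= 0:
        raise ValueError("the dog must be cued at least once")
    return 100 * responses / cues


# Week 1 is the worked example from the text; week 2 is invented.
week_1 = response_reliability_ratio(5, 10)
week_2 = response_reliability_ratio(8, 10)

print(f"Week 1: {week_1:.0f}%")
print(f"Week 2: {week_2:.0f}%")
```

Recording the two numbers side by side is the same idea as the paired bar charts: one glance tells you whether the dog is ready for the next stage.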
Analogue feedback is a term borrowed from electronics. Feedback can be either analogue or digital. As with dog training there is both negative and positive feedback. Imagine digital as an on/off light switch – there are two poles, there is no in-between except a nanosecond’s time delay between the two. Now imagine a dimmer light switch as analogue – it is continuously variable and instantaneous. With dog training the aim is to give the dog appropriate and instantaneous feedback. Hence we have ‘analogue feedback’. As with voltage and current, input from the trainer = output from the dog!
LURING vs. BRIBING
This causes much debate amongst dog trainers. We can, for example, lure a dog over a fixed series of agility hurdles (fixed ratio) using a small piece of frankfurter. If the dog is willing it’s a lure; if he is unwilling it becomes a bribe.