The information on this page should be used in conjunction with the other training resources, such as training for restraint, procedure-specific guides, troubleshooting and monitoring welfare on the table.
There are a number of reasons that laboratory animals may be trained. Prescott, Buchanan-Smith and Rennie (2005) surveyed non-human primate (NHP) facilities and found that 8 of the 13 facilities surveyed trained their primates. The benefits reported included fewer injuries to handlers, increased satisfaction with interactions with primates, reduced stress for both staff and primates, better quality data and a greater volume of data obtained from each primate.
There may be a number of reasons training programmes are set up for dogs. The needs of the sponsor or the study design are commonly cited drivers for training (Refining Dog Care survey, 2015).
Laboratory-housed animals may experience frequent interactions with staff, during husbandry, restraint and regulated procedures, which have the potential to cause stress. It is desirable to minimise the stress associated with these events, not only to promote positive welfare, but to avoid unwanted effects of stress on the data obtained from the animals’ use.
There is a wealth of information on the training of dogs as pets and working dogs, and on the benefits of positive techniques. However, few of the available training guides are applicable to the laboratory environment, where the behaviours required and pressures on resources are different from other environments. As a result, training is carried out less frequently or is less optimised than it might be and valuable opportunities to improve dog welfare are missed.
There are many reasons that a training programme might be difficult to implement. Staff are often responsible for a large number of duties besides training, such as cleaning, feeding, health checks and regulated procedures. When staff are also responsible for a large number of animals, this leaves little time to dedicate to training dogs individually. In fact, insufficient staff time is the most commonly cited reason for not doing more training. Other reasons include not having the knowledge or confidence to introduce new techniques, a lack of resources which apply to the laboratory environment, concerns about the impact of a training programme on study data and concerns about negative impacts on behaviour.
Animals are being trained constantly, whether or not it is intentional. The appearance of staff in the animal unit before feeding or dosing provides a signal to the animals that becomes learned. Handling techniques during routine husbandry events such as cleaning or weighing impact on other events such as restraint for dosing or bleeding.
The purpose of training, or the end goal, has a big influence on the programme that will be designed. There are several potential purposes:
1. To create a well-trained animal which voluntarily co-operates with husbandry and regulated procedures.
2. To create a relaxed dog which does not experience stress as a result of husbandry or regulated procedures.
3. To improve the welfare of the dog.
4. To increase the efficiency of carrying out tasks.
Training occurs through classical conditioning (also known as Pavlovian or respondent conditioning) and operant conditioning. Dogs are also capable of social learning. To cause and monitor changes in behaviour, it is necessary to understand how animals learn. Much learning occurs through conditioning, which is a change in behaviour caused by experience.
Classically conditioned behaviours are reflexive behaviours automatically induced by an antecedent (preceding) stimulus. Behaviours such as blinking or salivating are respondent behaviours. Classical conditioning can be described by a stimulus response contingency, S → R, which is usually insensitive to consequences. Classical conditioning is a process by which a neutral stimulus (NS, which causes no response) becomes a conditioned stimulus (CS, which elicits the conditioned response). This occurs when the NS is repeatedly paired with an unconditioned stimulus (US) which already causes the unconditioned response (UR). Through repeated exposure, the NS will become the CS and elicit the conditioned response (CR) which is usually very similar to the UR. This relationship is described in the figure below.
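The pairing sequence described above can be written out schematically in the same S → R notation. This is only a sketch of the paragraph's own description; the three stage labels are added here for illustration and are not taken from the original figure.

```latex
\begin{align*}
\text{Before conditioning:} \quad & \mathrm{US} \rightarrow \mathrm{UR}, \qquad \mathrm{NS} \rightarrow \text{no response} \\
\text{During conditioning (repeated pairing):} \quad & \mathrm{NS} + \mathrm{US} \rightarrow \mathrm{UR} \\
\text{After conditioning:} \quad & \mathrm{CS} \rightarrow \mathrm{CR} \quad (\text{usually very similar to the UR})
\end{align*}
```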
Examples of classical conditioning in the dog unit:
Operant conditioning can be described by a stimulus → response → consequence contingency. According to Thorndike's Law of Effect, pleasant (appetitive) consequences make the behaviour more likely to reoccur in the future, while unpleasant (aversive) consequences make the behaviour less likely to reoccur. Whether a consequence is pleasant or unpleasant depends on the subjective experience of the animal.
There are commonly considered to be four quadrants of operant conditioning; however, O'Heare (2010) considers extinction to be a fifth principle of operant conditioning, as the reduced rate of responding can be considered a function of the lack of reinforcement. The diagram below demonstrates how each of the four quadrants, and the fifth principle of extinction, relate to each other.
Reinforcement is a process in which the consequence of a behaviour (reinforcement) results in an increased likelihood of the behaviour being repeated in the future. Reinforcement strengthens the ability of a discriminative stimulus to cause the behaviour.
Positive reinforcement is a process in which the presentation of stimulation, or increasing its intensity, during or immediately after a behaviour, increases the likelihood that the behaviour will reoccur in the future. Positive reinforcement strengthens the ability of the discriminative stimulus to cause the behaviour. Reinforcers are typically stimuli which cause pleasant changes in the animal, such as food, petting or vocal praise.
Negative reinforcement is a process in which the removal of stimulation, or decreasing its intensity, during or immediately after a behaviour increases the likelihood that it will reoccur. Negative reinforcement strengthens the ability of the discriminative stimulus to cause the behaviour. The stimuli removed are typically aversive to the animal, such as pulling, jerking or pushing. Negative reinforcement has traditionally been used to teach many obedience behaviours, such as pushing on the hind quarters until the dog sits, or jerking on the leash until the dog walks at heel. Because negative reinforcement typically uses aversive stimuli, positive reinforcement should be given preference for teaching new behaviours.
Punishment is a process in which the consequence (punishment) of a behaviour results in a decreased likelihood of the behaviour being expressed in the future. Punishment weakens the ability of the discriminative stimulus to cause the behaviour.
Positive punishment is the introduction of a stimulus, or an increase in its intensity, which results in a decreased likelihood that a behaviour will reoccur. Positive punishments are usually aversive stimuli which cause pain or other unpleasant sensations. Punishments are often delivered for a variety of behaviours, including toilet training, separation anxiety and obedience behaviours. However, punishments can only suppress existing behaviours, not teach new behaviours, and their use can lead to increased anxiety and fear. If the original, undesirable behaviour continues to be reinforced, punishment is unlikely to suppress it.
In order to be effective at preventing a behaviour from recurring, a punishment must be delivered in a timely manner, so that it is associated with the behaviour, and be of sufficient strength to prevent the behaviour from recurring. Gradually increasing the strength of a punishment leads to habituation and renders the punishment ineffective. Behaviours which have been suppressed using punishment are also subject to post-punishment recovery, where the behaviour returns and is more difficult to suppress (Fraley, 2008). For these reasons, it is very difficult to use punishment effectively to change behaviour, and as such it should be avoided by the majority of trainers.
Negative punishment is the removal of a stimulus, or the reduced intensity of a stimulus, during or following a behaviour, which results in a decreased likelihood that the behaviour will reoccur. Negative punishments might be used intentionally or unintentionally during training and the removal of stimuli the animal likes may lead to unpleasant sensations for them.
Examples of negative punishments include withholding stimuli, such as attention, a food reward or play, when an undesirable behaviour, such as over-excitement, barking or jumping, is displayed.
Negative punishment might also be used unintentionally. Inexperienced trainers who are shaping behaviour using positive reinforcement may try to increase the steps between behaviours too quickly, resulting in the dog failing to display the correct behaviour and failing to be rewarded. This is likely to lead to frustration and stress. Trainers may also unintentionally negatively punish their dog by failing to pay attention during a training session and missing an opportunity to reward a correct behaviour, or by unexpectedly cutting short a training session. In both of these situations, the dog should have been rewarded for displaying the correct behaviour, and would anticipate reward, so failing to reward constitutes a negative punishment which will decrease the likelihood of the behaviour being performed again, affecting motivation and the likelihood of success in training.
As with positive punishment, negative punishment can only serve to suppress an existing behaviour and cannot be used to teach a new behaviour. Therefore as a behaviour modification strategy, it must be used in conjunction with positive reinforcement to teach an appropriate behaviour in its place. Some behaviours may also be self-rewarding, such as running away and failing to recall, destruction or barking, and withholding a reward is unlikely to be effective at stopping the behaviour. Although the dog may naturally end its display of the behaviour, it is unlikely to be because learning has occurred and the behaviour is likely to be exhibited in the future.
Unlike the other contingencies of reinforcement and punishment, extinction operates because the behaviour has a history of reinforcement, but is no longer reinforced. Unlike punishment, which only serves to suppress a behaviour, extinction changes the relationship between the stimulus and the response (behaviour) so that the behaviour is no longer expressed. A phenomenon frequently encountered in extinction is the extinction burst, in which the behaviour increases in frequency or intensity as the animal attempts to achieve reinforcement. While extinction is continued, the behaviour will gradually decrease to zero.
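The four quadrants and the fifth principle of extinction can be summarised as a simple lookup from what happens to the stimulus and what happens to the behaviour. The sketch below is only an illustration of the definitions above: the function name `classify_contingency` and the string labels are hypothetical, and it assumes the trainer has already observed whether the behaviour became more or less frequent.

```python
def classify_contingency(stimulus_change: str, behaviour_change: str) -> str:
    """Name the operant principle for a consequence and its observed effect.

    stimulus_change:  "added" (presented or intensified), "removed"
                      (taken away or reduced), or "reinforcer_withheld"
                      (a previously reinforced behaviour is no longer reinforced).
    behaviour_change: "increases" or "decreases" (likelihood of the behaviour).
    These labels are illustrative assumptions, not standard terminology.
    """
    principles = {
        ("added", "increases"): "positive reinforcement",
        ("removed", "increases"): "negative reinforcement",
        ("added", "decreases"): "positive punishment",
        ("removed", "decreases"): "negative punishment",
        ("reinforcer_withheld", "decreases"): "extinction",
    }
    key = (stimulus_change, behaviour_change)
    if key not in principles:
        raise ValueError(f"No operant principle defined for {key!r}")
    return principles[key]


# A food treat is delivered after a sit, and sitting becomes more frequent.
print(classify_contingency("added", "increases"))
# Attention is taken away when the dog jumps up, and jumping becomes less frequent.
print(classify_contingency("removed", "decreases"))
```

Note that the classification depends on the observed change in behaviour, not on the trainer's intention: a consequence only counts as reinforcement if the behaviour actually becomes more likely.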
1. Contingency
Contingency refers to the degree to which a consequence can be predicted. Reinforcement requires there to be a high degree of contingency between the target behaviour and the consequence (reinforcer).
2. Contiguity
Contiguity refers to the temporal relationship between a behaviour and the consequence. The shorter the time between a behaviour and the consequence, the greater the impact of the reinforcer. A bridging stimulus (secondary reinforcer) such as a clicker or verbal marker can be used to mark the behaviour where delivering the reinforcer in a timely manner is impractical.
3. The nature of the positive reinforcer
A positive reinforcer must be sufficiently rewarding for it to be successfully applied in training. This means it must be pleasant and appealing to the dog, something which varies between individuals. One dog may prefer food rewards while another may prefer vocal praise or petting. A reinforcer must also be sufficiently rewarding for the dog to work for, but must not be the most valuable item available, so that the most valuable item can be kept in reserve for complex behaviours or jackpotting if required.
4. Other events/distractors
The presence of other animals, people or distracting events also influences how successful reinforcement is. Distractors reduce the dog’s ability to pay attention to the contingency and contiguity of reinforcers, and the presence of several stimuli makes it more difficult for the dog to determine which is the discriminative stimulus. Training a new behaviour in a busy environment makes it more difficult for the dog to learn, and distractors should only be introduced when a behaviour has been proofed at the previous stages of training.
5. Motivation
The motivation of a dog to obtain a reinforcer will affect its performance in training. A dog which has recently eaten will be less motivated to obtain a food reward. A dog which is nervous of strangers will not be motivated by verbal praise from a stranger.
6. Schedule of reinforcement
The schedule of reinforcement affects the success of a programme of reinforcement training. The schedules are discussed below.
Manipulating the schedules of reinforcement is an important part of training a new behaviour. In general, a behaviour must be reinforced on every occasion it is displayed. For long-term maintenance of behaviours (e.g. where a dog will be asked to perform the behaviour repeatedly over a long period of time), variable schedules of reinforcement increase the strength of the behaviour and prevent extinction. For the majority of dogs which are used in short-term studies, continuous reinforcement will be necessary to establish and maintain the behaviour. More information on these schedules is presented below for reference.
On a continuous reinforcement schedule, every occurrence of a behaviour is reinforced. Continuous reinforcement is typically used in the early stages of training to acquire a behaviour and to increase the rate of response. Behaviours which are maintained using continuous reinforcement are usually susceptible to extinction as soon as the reinforcer is withheld. Although it is normal to move to intermittent schedules of reinforcement before raising the criteria of a shaping protocol during training, some types of learning, such as complex behaviours or discrimination, should be maintained on a continuous schedule (Pryor, 2009).
There are six intermittent schedules of reinforcement. Variable schedules of reinforcement produce more responses which are more resistant to extinction.
On a fixed ratio schedule, the behaviour will be reinforced once a set number of responses have been given, for example, every third response or every sixth response. When using a fixed ratio schedule, the animal will typically respond at a high rate but pause after each reinforcement, with greater effort exerted on the response which will be reinforced, as it is predictable.
On a variable ratio schedule, the number of responses between reinforcements is varied in a seemingly random manner, for example, with a gap of three, then six, then four responses between reinforcements. Response rates are high, with less pausing following reinforcements. In training, it is typical to move to a variable ratio schedule. This increases the rate of response, because the animal is unsure how many times it needs to respond to be reinforced, and it also increases the strength of the behaviour and proofs against extinction, because the animal does not need to be reinforced each time it performs the behaviour.
On a fixed interval schedule, reinforcement is delivered for the first correct response after a set interval of time. Response rates tend to be lower, and the animal is likely to stop responding after each reinforcement because it is the time interval, rather than the number of responses, which predicts when the reinforcer will be delivered.
On a variable interval schedule, reinforcement is delivered after a varied interval of time has passed. This increases the rate of response over fixed interval, although it is still lower than under a variable ratio.
On a fixed duration schedule, reinforcement is delivered after a behaviour has been occurring for a specified length of time.
On a variable duration schedule, reinforcement is delivered after a behaviour has been occurring for a varied length of time. A variable duration schedule may also be used to increase the strength of an established behaviour by varying the length of time the animal has to perform the behaviour before it is rewarded. This also helps to proof the behaviour against extinction because the animal is unsure how long it has to perform the behaviour before it will be reinforced.
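The schedules described above differ only in the rule that decides whether a given response is reinforced. The following is a minimal sketch of three of them, assuming a session is recorded either as a count of responses (ratio schedules) or as a list of response times in seconds (interval schedules). The function names, and the choice to draw the variable ratio gap uniformly between 1 and `2 * mean_ratio - 1`, are assumptions made for illustration only; variable interval and the duration schedules would follow the same pattern.

```python
import random
from typing import List


def fixed_ratio(n_responses: int, ratio: int) -> List[bool]:
    """Reinforce every `ratio`-th response (e.g. every third response)."""
    return [i % ratio == 0 for i in range(1, n_responses + 1)]


def variable_ratio(n_responses: int, mean_ratio: int, rng: random.Random) -> List[bool]:
    """Reinforce after a varying number of responses.

    Assumption: the gap is drawn uniformly from 1..(2 * mean_ratio - 1),
    so it averages `mean_ratio` responses per reinforcement.
    """
    outcomes = []
    required = rng.randint(1, 2 * mean_ratio - 1)
    count = 0
    for _ in range(n_responses):
        count += 1
        if count >= required:
            outcomes.append(True)
            count = 0
            required = rng.randint(1, 2 * mean_ratio - 1)
        else:
            outcomes.append(False)
    return outcomes


def fixed_interval(response_times: List[float], interval: float) -> List[bool]:
    """Reinforce the first response made once `interval` seconds have passed
    since the last reinforcement (or since the start of the session)."""
    outcomes = []
    last_reinforced = 0.0
    for t in response_times:
        if t - last_reinforced >= interval:
            outcomes.append(True)
            last_reinforced = t
        else:
            outcomes.append(False)
    return outcomes


# Every third response is reinforced.
print(fixed_ratio(6, 3))
# Only the first response after each 10-second interval is reinforced.
print(fixed_interval([1, 4, 6, 11, 12, 25], 10.0))
# The gap between reinforcements varies; a seed makes the run repeatable.
print(variable_ratio(12, 3, random.Random(0)))
```

On the fixed schedules the reinforced response is predictable from the record, which is consistent with the pausing after reinforcement described above; on the variable ratio schedule it is not.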
What is effective positive training for laboratory dogs?
There are many terms used to describe training in dogs, which can make identifying effective techniques difficult. The word ‘training’ can mean anything from habituation through classical conditioning, which requires no active behaviour from the dog, to the shaping of behaviour through positive reinforcement. Training can also refer to negative reinforcement or punishment. Simply delivering a food treat does not mean that positive reinforcement has been used. Throughout Refining Dog Care, where ‘training’ is referred to, what is implied is effective positive training, unless otherwise stated.
How to implement effective positive training for laboratory-housed dogs:
The sections below detail the design and implementation of an effective, positive training programme for laboratory-housed dogs. There are three key factors which should be kept in mind when designing a training programme:
1. Have a clear programme of training designed with distinct stages
2. Identify behaviours which can reliably be used to identify success in training and positive welfare at each stage
3. Ensure that the dog is set up to succeed and has the opportunity to gain reinforcement at each stage of training
To train effectively, the method of training currently being used needs to be understood, as well as the best method to achieve the goals of training. Dogs can be trained constantly in interactions with staff, even if it is unintentional. However, not all forms of training are created equal. If you are following a training programme, are you sure that all staff are only using the forms of training outlined in the programme?
Below, different types of training are discussed as well as their suitability for the types of tasks that might be trained.
|Term|Definition|
|---|---|
|Classical or respondent conditioning|Learning by association|
|Operant conditioning|Learning by consequences|
|Positive reinforcement|A consequence of behaviour which increases the likelihood of the behaviour reoccurring, such as delivering treats|
|Negative reinforcement|A consequence of behaviour which increases the likelihood of the behaviour reoccurring by removing an aversive stimulus|
|Positive punishment|A consequence of behaviour which decreases the likelihood of the behaviour reoccurring by delivering an aversive stimulus|
|Negative punishment|A consequence of behaviour which decreases the likelihood of the behaviour reoccurring by removing a pleasant stimulus, such as withholding reward|
|Primary reinforcer|A reinforcer that is rewarding without any prior conditioning|
|Secondary reinforcer|A reinforcer that becomes reinforcing after being paired with an already rewarding reinforcer|
|Desensitisation|The process of counter-conditioning an aversive stimulus by systematically exposing the animal to it through graded exposures|
|Counter-conditioning|Changing the negative association with a stimulus by pairing it with pleasant or rewarding stimuli|
|Shaping|The process of training a behaviour through successive approximations|
|Habituation|Repeatedly presenting a stimulus until the response becomes extinct, usually temporarily|
|Targeting|A differential reinforcement of successive approximations (similar to shaping) usually towards a goal of touching a body part to another object|
|Stationing|Teaching an animal to stay in a location|
|Errorless training|A type of training in which learning is set up so that animals do not make errors|
|Social learning|Learning which takes place in a social context and occurs through observation or instruction|
Habituation occurs when an unconditioned stimulus is repeatedly presented and the unconditioned response gradually declines. An example is restraint, where the unconditioned stimulus (restraint) gradually stops causing the unconditioned response (escape behaviour). However, habituation is usually temporary and is subject to spontaneous reversal, so it is not suitable for training protocols.
Desensitisation is often used in combination with counter-conditioning. Counter-conditioning is the pairing of a pleasant stimulus with an unpleasant one to change the response by classical conditioning. Desensitisation may use elements of classical and operant conditioning and uses systematic exposure to create a behaviour change. For desensitisation to be effective, it is necessary for dogs to be relaxed, with minimal arousal. It is also necessary to evaluate stimuli for their intensity, so that desensitisation can progress from least to most intense, and to expose the dog to the aversive stimulus along a gradient of intensity.
The first step in a training programme requires us to make sure that the dog is happy (1) around the handler, (2) taking food rewards and (3) in the training/procedural area. (In some facilities, habituating dogs to staff might be called socialisation, although that term might instead refer to dog-dog interactions or exercise.)
Needless to say, if the dog is too nervous, excited or stressed to interact with the handler, take food or be comfortable in the training area, it will not be possible for learning to take place.
This protocol outlines the steps that can be taken if those three factors aren’t met before training begins. In addition, a familiar handler offering food treats to a nervous dog in the home pen can often positively impact its responsiveness in training sessions. It is important not to skip this stage and rush into the next, as this will result in a dog which is trained using habituation, leading to a reduced behavioural response but a maintained internal stress response and, potentially, spontaneous reversal of the training. If a dog appears to comply and then suddenly develops an aversion to a procedure, this might well be why. Remembering from above that behaviours which have previously undergone extinction (natural fading from lack of reinforcement) are stronger when they re-emerge, it might be even harder to correct a problem behaviour when it reappears, making correct training essential from the start.
There are a number of limitations that might constrain the type of reinforcer or reward chosen. These might include: facility-specific restrictions, no prior history of using additional food reinforcers in study protocols, sponsor requests, uncertainty about the costs and benefits of using extra food reinforcers or a lack of understanding of the purpose of the training.
A number of facilities use food reinforcers as part of their standard training protocol, with the provision of extra food items in addition to the daily diet being reflected in the pre-study acclimatisation period. Some facilities may also deliver a small food reward immediately after a procedure for studies which don’t require fasting, or where the compound is not known to be affected by a small food item. This should be evaluated on a study-by-study basis, and the provision of food items should not be totally discounted from training programmes without justification.
While there are food reinforcers available which come with certificates of analysis, these may be prohibitively expensive. It may be possible to use off-the-shelf products (all of the videos on this website use Pedigree Tasty Bites with Cheese in training sessions).
In fact, it might be necessary to have a variety of reinforcers depending on the individual differences in the dogs being trained. As a rule, the reinforcer must be reinforcing enough to motivate the dog to work, but should not be the highest-value item, which should be saved for when it is needed.
What is reinforcing for one dog may not be reinforcing for another. While many beagles are strongly motivated by the smell and taste of food, many are also motivated by praise and petting. Particularly in the early stages of training, it is recommended to use all three of these (providing the dog is comfortable with them). Gentle petting and a ‘marker word’ (such as ‘yes’ or ‘good’) not only tell the dog when it has done the correct thing, but also use social communication to tell the dog that it is in a positive situation, which can reduce stress and increase positive associations with the handler and training sessions.
Food: food items should be several things: (1) small enough to be easily consumed without holding up the training and which won’t result in overfeeding; (2) tasty enough to appeal to most, if not all, dogs; (3) approved for use in a research setting. Number (3) might vary depending on whether food treats are being delivered pre-study or during a study.
Food items which don’t work: regular dry diet, which might be appealing in the home pen when there is plenty of time to eat or competition from pen mates, but which most dogs will not find appealing in training; wet diet, which most dogs find appealing and which makes a good back-up for dogs who don’t like anything else, but which is very messy and not practical to train with; and large items like chew sticks, which require a lot of chewing and interrupt the flow of a training session. A small item which can be swallowed easily and is tasty enough for most, if not all, dogs to like it is ideal.
Food items which do work: small food treats like Pedigree Tasty Bites appeal to most dogs and are very quickly eaten; Bioserv Beefy Bites, which are tasty and come with a certificate of analysis; and where a certificate of analysis is of less concern (e.g. dogs held as stock which have not been assigned to a study), small pieces of human-grade food like cheese or sausage are very effective.
Shaping means using a series of steps to teach the dog a new behaviour. By breaking the behaviour into easily-achievable steps, the dog will feel successful during the training process. This will also speed up learning, and reduce confusion and frustration. Most importantly, as the dog learns more and more new behaviours using shaping and positive reinforcement, he will start to enjoy learning new things and will look forward to interacting with the trainer during training.
Targeting is a form of training in which the dog is taught to touch an object with a part of his body, for example touching his nose to the trainer's hand.
Stationing is a form of targeting in which the target is a location. For example the dog might be trained to come to the pen front, or to sit on a table. Stationing allows the dog to remain in one location without physical restraint.
To shape a behaviour, a pre-determined schedule of steps should be created. We have created an example set of stages for training for restraint which can be found in our Welfare Monitoring Tool for table training.
1. Have a clear training protocol in mind before beginning training.
2. Practice. More practice means being better able to adapt to the dog's responses and making training time more efficient.
3. Be clear about the goals of the training. Are they specific to an activity or event? Is the goal to prepare a dog for study in a short period of time or is it to have the dog perform the same behaviour reliably for months or years? The training protocols may be different.
4. Break down each stage of training into manageable steps where the dog has a realistic chance of reinforcement. If too much is attempted at once, the dog might never have a chance to be rewarded and learning won’t occur.
5. Plan ahead - know what the dog is going to do before it does it. If the dog unexpectedly jumps ahead a few stages (a ‘lightbulb’ moment!) and the trainer is not prepared, they will miss a chance to reinforce and because the behaviour wasn’t reinforced, the dog might not attempt it again.
6. Similarly, if the dog does something unwanted, know how to react before your training session is derailed. Training time is limited and it can’t be wasted on behaviours which are undesirable.
7. If the dog fails to perform the behaviour expected at the stage of training, go backwards until a stage is found where the dog will reliably perform the behaviour. Training can then be built up again quickly from that stage.
8. Never do nothing. If the dog’s attention wanders, get it back. If the dog isn’t motivated to cooperate, change the motivation. If you get stuck, end the training session rather than persisting with something which isn't working. Coming back with a fresh perspective or asking a colleague is much more likely to achieve results.
9. Observe your own training. One of the best ways to improve your training technique is to observe your own training by video recording a training session and watching it back. You can pick up on inefficiencies and gain a fresh perspective on your technique.
10. Never suddenly end a training session. This constitutes a negative punishment (the withholding of an expected reward) and will affect the dog’s motivation to cooperate in the future.
11. Try to end on a high. If the dog achieves its goals for the stage of training, end the session there rather than trying to use all of the time available to push for new behaviours. This is likely to confuse the dog and affect its motivation.
12. Use clear scoring methods for training which can be agreed amongst all trainers so that accurate records can be kept. This is particularly important if more than one trainer is working with the dog.
13. Only train one aspect of a behaviour at a time. This is where a good training schedule will be key.
14. When introducing a new aspect of a behaviour, temporarily relax the criteria applied to the last stage. Dogs might make mistakes in a learned behaviour while attempting a new stage, but will quickly catch up.
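Points 4, 7 and 11 above amount to a simple rule for moving through a pre-determined schedule of stages: move on once the dog is reliably succeeding, and step back when it is failing. The sketch below is one hypothetical way to write that rule down so that all trainers apply it consistently. The function name `next_stage` and the thresholds (three consecutive successes to advance, two consecutive failures to step back) are assumptions chosen for illustration, not values from this guide.

```python
from typing import List


def next_stage(current_stage: int, recent_trials: List[bool], n_stages: int,
               advance_after: int = 3, step_back_after: int = 2) -> int:
    """Suggest which stage (0-based index) to train next.

    recent_trials holds the outcomes at the current stage, oldest first:
    True means the dog performed the behaviour and was reinforced.
    The default thresholds are illustrative assumptions only.
    """
    def trailing_run(value: bool) -> int:
        # Count how many of the most recent trials in a row equal `value`.
        count = 0
        for outcome in reversed(recent_trials):
            if outcome != value:
                break
            count += 1
        return count

    # Reliable success: move on, unless this is already the final stage.
    if trailing_run(True) >= advance_after and current_stage < n_stages - 1:
        return current_stage + 1
    # Repeated failure: go back a stage, unless already at the first stage.
    if trailing_run(False) >= step_back_after and current_stage > 0:
        return current_stage - 1
    return current_stage


# Three successes in a row at stage 2 of 5: suggest stage 3.
print(next_stage(2, [True, True, True], n_stages=5))
# Two failures in a row at stage 2: suggest going back to stage 1.
print(next_stage(2, [True, False, False], n_stages=5))
```

Stepping back one stage at a time and re-checking is a deliberately cautious choice; a trainer might reasonably go back further in one step if the dog is clearly struggling.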
In surveys of animal units, there are several common concerns about introducing or expanding a training programme for animals.
1. Staff time
Although training might seem to add to staff time on top of regular duties, the efficiency gained and the associated improvement in welfare can make work more efficient overall. For example, we found that dogs trained for restraint using positive reinforcement were 20 seconds faster to dose than sham-dose trained dogs (Hall, Robinson & Buchanan-Smith, 2015).
2. Cost
The cost of a Refinement is always balanced against its benefits. Using food treats can be expensive if approved items with a certificate of analysis are used; however, there are many other options, including commercially available dog treats. The benefits of increased efficiency gained from training, as well as increased confidence in results from dogs with better welfare, can balance out any costs associated with using food treats.
3. Introducing unwanted variables
A frequent concern, particularly in regulatory environments, is that using food treats as positive reinforcement will adversely affect study outcomes. However, it is well documented (e.g. see Wurbel, 2000) that attempts to standardise the laboratory environment have failed, and that harmonising welfare by providing effective Refinements is more successful at ensuring good quality data. The benefits gained by improving welfare should be considered to be greater than the unwanted variability introduced by Refinements, and therefore Refinements should not be withheld without evidence that they would adversely influence study outcomes.
1. O'Heare, J. (2010). Changing Problem Behavior: A Systematic & Comprehensive Approach to Behavior Change Project Management. BehaveTech Publishing.
2. Hall, L. E., Robinson, S., & Buchanan-Smith, H. M. (2015). Refining dosing by oral gavage in the dog: A protocol to harmonise welfare. Journal of Pharmacological and Toxicological Methods, 72, 35-46.
3. Wurbel, H. (2000). Behaviour and the standardization fallacy. Nature Genetics, 26(3), 263.