Operant conditioning

The theory of operant conditioning (also known as instrumental learning) was devised by B. F. Skinner. It is a theory of learning which suggests that people learn by operating on (interacting with) their environments.

Reinforcement and punishment

A stimulus or event that increases the likelihood that a behaviour will be repeated is called a reinforcer. Reinforcement can be either positive or negative. In both cases the behaviour is strengthened. Positive reinforcement occurs when a behaviour is strengthened by adding a rewarding stimulus, e.g. you are more likely to come to work if you get paid. Negative reinforcement occurs when a behaviour is strengthened by the removal of an unpleasant stimulus, e.g. you are more likely to put the bins out if doing so puts an end to your partner nagging you.

A stimulus that decreases the likelihood that a behaviour will be repeated is called a punisher. Again, punishment can be positive or negative. Positive punishment occurs when a behaviour is reduced in frequency by adding an unpleasant stimulus. For example, if a dog growls at someone who tries to stroke it (an unpleasant stimulus), that person will be less inclined to try to stroke the dog again (a reduction in behavioural frequency). Negative punishment occurs when a behaviour is reduced in frequency by removing a pleasant stimulus. For example, if a parent takes a child's toys away (removal of a pleasant stimulus) for throwing food against the wall, the child should be less inclined to repeat the behaviour.
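The four cases above come down to two yes/no questions: was a stimulus added or removed, and did the behaviour become more or less frequent? The short Python sketch below shows that two-by-two grid. The function name and its parameters are made up for illustration and are not standard terminology.

```python
def classify_consequence(stimulus_added: bool, behaviour_increases: bool) -> str:
    """Name the operant procedure from two yes/no questions.

    stimulus_added: True if something was added, False if something was removed.
    behaviour_increases: True if the behaviour becomes more frequent afterwards.
    """
    # "Positive" means a stimulus is added, "negative" means one is removed.
    valence = "positive" if stimulus_added else "negative"
    # Reinforcement strengthens a behaviour, punishment reduces it.
    kind = "reinforcement" if behaviour_increases else "punishment"
    return f"{valence} {kind}"


# The four examples from the text, expressed as (added?, increases?) pairs.
examples = {
    "being paid for coming to work": (True, True),
    "nagging stops once the bins are out": (False, True),
    "dog growls when stroked": (True, False),
    "toys taken away for throwing food": (False, False),
}

for description, (added, increases) in examples.items():
    print(f"{description}: {classify_consequence(added, increases)}")
```

Note that "positive" and "negative" describe whether a stimulus is added or removed, not whether the outcome is pleasant.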

Primary and secondary reinforcers

Primary reinforcers are stimuli that are innately rewarding, such as food, water, social approval and sex (be careful with social approval, as it is considered by some to be a secondary reinforcer).

Secondary reinforcers (also known as conditioned reinforcers) are not innately rewarding; people have to learn to value them through classical conditioning or other methods. Money is a common example of a secondary reinforcer.

Schedules of Reinforcement

Different patterns of reinforcement have different influences on the response. There are five main reinforcement schedules:

  • Fixed interval - a reward becomes available after a fixed amount of time
  • Variable interval - a reward becomes available after a varying amount of time
  • Fixed ratio - a reward occurs after a behaviour is repeated a fixed number of times
  • Variable ratio - a reward occurs after a varying, unpredictable number of responses
  • Random - no pattern

Variable ratio schedules are the most resistant to extinction (gambling works in this way).
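The difference between ratio and interval schedules is easier to see when they are simulated. The Python sketch below is illustrative only. Each schedule is modelled as a function that is called once per response and returns whether that response is rewarded. The function names, the use of seconds as the time unit and the way the variable schedules draw their targets (uniformly around a mean) are all assumptions made for this example. Interval schedules are modelled in the usual way, rewarding the first response made after the interval has elapsed. The "random" schedule is left out.

```python
import random


def fixed_ratio(n):
    """Reward every n-th response."""
    count = 0

    def respond(time):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Reward after an unpredictable number of responses averaging mean_n."""
    count = 0
    target = rng.randint(1, 2 * mean_n - 1)  # assumed distribution

    def respond(time):
        nonlocal count, target
        count += 1
        if count >= target:
            count = 0
            target = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return respond


def fixed_interval(seconds):
    """Reward the first response made once `seconds` have passed since the last reward."""
    last_reward = 0.0

    def respond(time):
        nonlocal last_reward
        if time - last_reward >= seconds:
            last_reward = time
            return True
        return False

    return respond


def variable_interval(mean_seconds, rng):
    """Like fixed_interval, but the required wait varies around mean_seconds."""
    last_reward = 0.0
    wait = rng.uniform(0, 2 * mean_seconds)  # assumed distribution

    def respond(time):
        nonlocal last_reward, wait
        if time - last_reward >= wait:
            last_reward = time
            wait = rng.uniform(0, 2 * mean_seconds)
            return True
        return False

    return respond


def count_rewards(schedule, response_times):
    """Number of rewarded responses when responding at the given times."""
    return sum(schedule(t) for t in response_times)


# A learner who responds once per second for one minute.
times = range(1, 61)
rng = random.Random(0)  # seeded so that runs are repeatable
print("fixed ratio 5:       ", count_rewards(fixed_ratio(5), times))
print("variable ratio 5:    ", count_rewards(variable_ratio(5, rng), times))
print("fixed interval 10:   ", count_rewards(fixed_interval(10), times))
print("variable interval 10:", count_rewards(variable_interval(10, rng), times))
```

On the ratio schedules, responding faster earns rewards faster. On the interval schedules, extra responses made before the interval has elapsed earn nothing.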

Shaping and chaining

Sometimes an exact behaviour cannot be performed and so cannot be rewarded. In this instance it is helpful to reward successive, increasingly accurate approximations to the behaviour. This is called shaping.
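Shaping can be thought of as a reward criterion that tightens as the learner improves. The Python sketch below is a toy illustration and not a model of real learning. Each attempt is represented by an assumed "error" score (how far it is from the target behaviour), and the function name and parameters are hypothetical.

```python
def shape(attempt_errors, start_criterion):
    """Return, for each attempt, whether it would be reinforced.

    attempt_errors: distance of each attempt from the target behaviour
                    (0 means the exact behaviour was performed).
    start_criterion: the largest error that is rewarded at the outset.
    """
    criterion = start_criterion
    reinforced = []
    for error in attempt_errors:
        if error <= criterion:
            reinforced.append(True)
            # Tighten the criterion: later rewards need at least this accuracy.
            criterion = error
        else:
            reinforced.append(False)
    return reinforced


# Attempts that drift towards the target, with a couple of steps backwards.
print(shape([8, 9, 6, 7, 3, 3, 1], start_criterion=10))
# → [True, False, True, False, True, True, True]
```

Attempts that are worse than an approximation already rewarded (the 9 and the 7) go unrewarded, so reinforcement only ever follows equal or closer approximations.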

Chaining involves breaking a complex task into smaller, more manageable steps that are learned in sequence.

Shaping and chaining are similar but different in two main ways:

  • Shaping always moves forward, whereas it is quite possible to move backward with backward chaining.

  • Another difference involves when reinforcers are delivered. In shaping, each new approximation is reinforced. In chaining, reinforcers are usually provided at the end of the chain.

Escape conditioning

This actually involves both classical and operant conditioning.

Escape conditioning refers to a situation whereby an aversive stimulus is removed after a response. It is a form of negative reinforcement. For example, imagine a rat standing on a raised platform in a pool of water. When an electrical current is applied to the platform, the rat will jump into the water to stop the unpleasant sensation of the electric shock (the shock is removed following the response).

When a person or animal learns to respond to a signal in a way that avoids an aversive stimulus before it arrives, this is avoidance conditioning. For example, imagine that in the above example a buzzer sounded just before the electric shock was applied. Eventually the rat would learn to jump off the platform at the sound of the buzzer rather than wait for the shock.

Habituation

Habituation refers to the phenomenon whereby there is a decrease in response to a repeated stimulus over time (for example, over time you pay less attention to repeated sounds in your environment). If the stimulus is removed for a period of time and then reintroduced, the response will reappear at full strength. This is referred to as spontaneous recovery.

Covert sensitisation

This is a technique in which someone learns to use mental imagery (hence "covert") to associate a behaviour with a negative consequence.

For example, a person may be encouraged to use imagery to link smoking a cigarette with the development of lung cancer.