1.4 Instrumental Conditioning (aka operant conditioning)

What is it? The learning of an association between actions and rewards or punishments.

What is it for? Learning what works and what doesn’t.

This is different from classical conditioning. Classical conditioning is passive. Instrumental conditioning is based on your own action. You act on the world and then the way the world treats your actions shapes the nature of your future actions.

Skinner didn’t devise the idea of instrumental conditioning but built on it. His major theoretical and experimental research program focused on the extent to which operant conditioning could shape the behavior of humans and of other animals.

The theoretical foundations for operant conditioning were established by the psychologist Edward Thorndike. Thorndike noticed in his studies that animals don’t seem to learn through sudden insight, but rather through a series of random activities that they get better and better at. His example was putting a cat in a puzzle box: the cat gets out by trial and error, and over repeated trials it escapes faster. Thorndike summarized what was going on here as the law of effect: the tendency to perform an action is increased if rewarded, weakened if not.
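The law of effect can be sketched as a simple update rule. This is only an illustration, not Thorndike’s own formulation: the function name law_of_effect, the 0-to-1 "strength" scale, and the step size of 0.2 are all assumptions made for the example.

```python
def law_of_effect(strength, rewarded, step=0.2):
    """Nudge the tendency to act (0..1) up if rewarded, down if not.

    Hypothetical update rule for illustration: move a fixed fraction
    of the way toward 1.0 after a reward, toward 0.0 otherwise.
    """
    target = 1.0 if rewarded else 0.0
    return strength + step * (target - strength)


# A response that starts at 0.5 and is rewarded three times in a row
# becomes more likely; one unrewarded trial then weakens it again.
strength = 0.5
for _ in range(3):
    strength = law_of_effect(strength, rewarded=True)
after_rewards = strength
after_miss = law_of_effect(strength, rewarded=False)
```

Under these assumptions the tendency climbs gradually with each reward rather than jumping in one step, which matches the observation that learning here is incremental and not a matter of sudden insight.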

Suppose you have to train a pig. How would you do it? You would reinforce it. There are two types of reinforcement: positive reinforcement, giving it something it wants, and negative reinforcement, releasing it from something aversive. If the pig is doing something you want it to do, you could reinforce it, and if it’s doing something you don’t want it to do, you could punish it.

But that’s really limited. Suppose you wanted to make the pig do something it’s never done before, like dance. You can’t just wait for it to dance and then reinforce it, because it will never do that.

  • Shaping: When the pig moves in a certain way that approximates dancing, you reward it, and now it’ll start to do that. Then when it moves in that way and another way and it’s looking even more like dancing, you reward it again. In other words, as the pig gradually approximates the behavior you want, you reward it along the way.
  • Skinner described shaping as analogous to natural selection. The way eyes evolve is that animals first evolve something that very roughly approximates an eye, which leads to increased survival and reproduction. Then, step by step, they evolve complex structures.
  • But real life doesn’t reinforce every single instance of a behavior. There are all sorts of partial reinforcement in the world, in which we get reinforced some of the time, and not all of the time.
  • Schedules of reinforcement
    • fixed ratio
    • variable ratio
    • fixed interval
    • variable interval
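The four schedules can be sketched in code. This is a minimal sketch using the standard textbook definitions, which the list above does not spell out: ratio schedules reinforce based on the number of responses, interval schedules reinforce the first response after some amount of time has passed, and "variable" means the requirement changes unpredictably around an average. All function names, the use of a per-response probability for the variable ratio, and the exponential waiting time for the variable interval are assumptions chosen for the example.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response."""
    count = 0

    def respond(_t):
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Reinforce each response with probability 1/mean_n, so the number
    of responses per reward varies but averages mean_n (an assumption)."""
    def respond(_t):
        return rng.random() < 1.0 / mean_n

    return respond


def fixed_interval(interval):
    """Reinforce the first response made at least `interval` time units
    after the previous reinforcement."""
    last = 0.0

    def respond(t):
        nonlocal last
        if t - last >= interval:
            last = t
            return True
        return False

    return respond


def variable_interval(mean_interval, rng):
    """Like fixed_interval, but the required wait is redrawn after each
    reinforcement (exponential with the given mean, an assumption)."""
    last = 0.0
    wait = rng.expovariate(1.0 / mean_interval)

    def respond(t):
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.expovariate(1.0 / mean_interval)
            return True
        return False

    return respond


# One response per time unit, at t = 1..12.
fr = fixed_ratio(3)
fr_hits = [t for t in range(1, 13) if fr(t)]   # [3, 6, 9, 12]
fi = fixed_interval(5)
fi_hits = [t for t in range(1, 13) if fi(t)]   # [5, 10]
```

With one response per time unit, the fixed ratio pays off on every third response and the fixed interval pays off on the first response after each five-unit wait. The two variable schedules pay off at about the same average rate, but the animal cannot predict which response will be the rewarded one.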