Reinforcement and Punishment
Operant conditioning is built on a simple idea: behaviors are shaped by what happens after them. If a consequence is favorable, the behavior is more likely to happen again. If the consequence is unfavorable, the behavior is less likely to repeat. B.F. Skinner formalized these principles, and they remain central to how educators, therapists, and parents approach behavior change.
The trickiest part of this topic is the terminology. "Positive" and "negative" don't mean "good" and "bad" here. Positive means adding a stimulus. Negative means removing a stimulus. Keep that distinction locked in, and the rest falls into place.
Types of Reinforcement
Reinforcement always increases the likelihood of a behavior happening again.
- Positive reinforcement adds a desirable stimulus after the behavior. A teacher praising a student for raising their hand, a parent giving a sticker for completing homework, or a dog getting a treat for sitting on command are all positive reinforcement. The pleasant addition makes the behavior more likely to repeat.
- Negative reinforcement removes an aversive stimulus after the behavior. Taking ibuprofen removes a headache, so you're more likely to take ibuprofen next time you have one. A student who finishes classwork early gets excused from a boring review session. The relief from something unpleasant is what strengthens the behavior.
A common mistake: students confuse negative reinforcement with punishment. Remember, all reinforcement increases behavior. Negative reinforcement increases behavior by taking away something unpleasant.
Types of Punishment
Punishment always decreases the likelihood of a behavior happening again.
- Positive punishment adds an aversive stimulus after the behavior. A teacher scolding a student for talking out of turn, or assigning extra chores after a child breaks a rule, are examples. The unpleasant addition discourages the behavior.
- Negative punishment removes a desirable stimulus after the behavior. Taking away screen time after a child refuses to clean their room, or losing recess privileges for disruptive behavior, are negative punishment. Losing something valued discourages the behavior.
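The four categories above reduce to two yes/no questions: was a stimulus added or removed, and did the behavior become more or less likely? A minimal Python sketch of that lookup is below. The function name `classify_consequence` and its parameter names are illustrative assumptions, not standard terminology.

```python
def classify_consequence(stimulus_added: bool, behavior_increases: bool) -> str:
    """Name the operant category for a consequence.

    stimulus_added: True if something was added after the behavior,
        False if something was removed.
    behavior_increases: True if the behavior becomes more likely,
        False if it becomes less likely.
    """
    sign = "positive" if stimulus_added else "negative"
    effect = "reinforcement" if behavior_increases else "punishment"
    return f"{sign} {effect}"


# Examples taken from the text above.
print(classify_consequence(True, True))    # sticker for completing homework
print(classify_consequence(False, True))   # ibuprofen removes a headache
print(classify_consequence(True, False))   # scolding for talking out of turn
print(classify_consequence(False, False))  # losing recess privileges
```

Note that the second argument describes the effect on the behavior, not whether the stimulus feels pleasant. That is why negative reinforcement and punishment land in different cells.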
Shaping and Reinforcement Schedules

Shaping Behavior
Shaping is the process of reinforcing successive approximations of a target behavior. Instead of waiting for the full desired behavior to appear on its own, you reward small steps toward it.
Here's how shaping works in practice:
- Identify the target behavior (e.g., a shy student participating in class discussion).
- Reinforce the first rough approximation (e.g., praise the student for making eye contact during discussion).
- Gradually raise the bar for reinforcement (e.g., next reinforce nodding in response to a question, then giving a one-word answer, then a full response).
- Continue until the target behavior is achieved.
Shaping is especially useful for building complex behaviors that a learner wouldn't produce all at once. Training animals, teaching new skills to young children, and building social behaviors in students with autism spectrum disorder all rely heavily on shaping.
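The shaping steps can be sketched as a simple loop. This is a simplified illustration, not a clinical protocol: it assumes the bar is raised after a single reinforced success, and the function name `shape` and the sample observations are made up for the example.

```python
def shape(steps, observed):
    """Reinforce successive approximations of a target behavior.

    steps: ordered approximations, ending with the target behavior.
    observed: behaviors the learner shows, in order.
    Returns (log, reached) where log is a list of (behavior, reinforced)
    pairs and reached says whether the target behavior was reinforced.
    """
    criterion = 0  # index of the approximation currently required
    log = []
    for behavior in observed:
        # Behaviors outside the plan get level -1 and are never reinforced.
        level = steps.index(behavior) if behavior in steps else -1
        reinforced = level >= criterion
        log.append((behavior, reinforced))
        if reinforced:
            criterion = level + 1  # raise the bar (assumption: after one success)
        if criterion == len(steps):
            break  # target behavior achieved
    return log, criterion == len(steps)


# Approximations from the shy-student example above.
STEPS = ["eye contact", "nod", "one-word answer", "full response"]
observed = ["eye contact", "eye contact", "nod", "silence",
            "one-word answer", "nod", "full response"]

log, reached = shape(STEPS, observed)
for behavior, reinforced in log:
    print(f"{behavior}: {'reinforced' if reinforced else 'not reinforced'}")
print("target reached:", reached)
```

The second "eye contact" is not reinforced because the criterion has already moved on. Earlier approximations stop earning reinforcement once the bar is raised.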
Reinforcement Schedules
Schedules of reinforcement describe when and how often reinforcement is delivered. The schedule you use has a big effect on how quickly a behavior is learned and how resistant it is to extinction.
- Continuous reinforcement delivers reinforcement after every instance of the behavior. This is the fastest way to teach a new behavior, but the behavior tends to extinguish quickly once reinforcement stops.
- Partial (intermittent) reinforcement delivers reinforcement after some instances. Behaviors learned on partial schedules are slower to develop but much more resistant to extinction. There are four subtypes:
  - Fixed-ratio (FR): Reinforcement after a set number of responses (e.g., a sticker after every 5 completed assignments).
  - Variable-ratio (VR): Reinforcement after an unpredictable number of responses (e.g., slot machines, or a teacher giving praise after a random number of correct answers). This schedule produces the highest, steadiest response rates.
  - Fixed-interval (FI): Reinforcement for the first response after a set time period (e.g., a quiz every Friday). Response rates tend to rise as the end of the interval approaches and drop off right after reinforcement.
  - Variable-interval (VI): Reinforcement for the first response after an unpredictable time period (e.g., pop quizzes given at random). This produces a slow, steady response rate.
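One way to see the difference between the four partial schedules is to write each one as a rule that decides whether a given response earns reinforcement. The sketch below is illustrative only: the function names are made up, and drawing the variable requirements from a uniform distribution is an assumption chosen for simplicity.

```python
import random


def fixed_ratio(n):
    """Reinforce every n-th response."""
    count = 0

    def respond():
        nonlocal count
        count += 1
        if count == n:
            count = 0
            return True
        return False

    return respond


def variable_ratio(mean_n, rng):
    """Reinforce after an unpredictable number of responses averaging mean_n."""
    count = 0
    required = rng.randint(1, 2 * mean_n - 1)  # assumption: uniform draw

    def respond():
        nonlocal count, required
        count += 1
        if count >= required:
            count = 0
            required = rng.randint(1, 2 * mean_n - 1)
            return True
        return False

    return respond


def fixed_interval(seconds):
    """Reinforce the first response at least `seconds` after the last reinforcer."""
    last = 0.0

    def respond(t):
        nonlocal last
        if t - last >= seconds:
            last = t
            return True
        return False

    return respond


def variable_interval(mean_seconds, rng):
    """Reinforce the first response after an unpredictable wait averaging mean_seconds."""
    last = 0.0
    wait = rng.uniform(0, 2 * mean_seconds)  # assumption: uniform draw

    def respond(t):
        nonlocal last, wait
        if t - last >= wait:
            last = t
            wait = rng.uniform(0, 2 * mean_seconds)
            return True
        return False

    return respond


# FR-5: the sticker-after-every-5-assignments example.
fr5 = fixed_ratio(5)
print([fr5() for _ in range(10)])

# VR-5: same average payoff, but the learner cannot predict which response pays.
vr5 = variable_ratio(5, random.Random(0))
print(sum(vr5() for _ in range(1000)), "reinforcers in 1000 responses")
```

Ratio schedules count responses, while interval schedules take a timestamp `t` and ignore how many responses happened during the wait. Continuous reinforcement is the special case `fixed_ratio(1)`.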
Token Economies and Extinction
A token economy is a behavior modification system where learners earn tokens (points, stickers, stars) as conditioned reinforcers. These tokens can later be exchanged for backup reinforcers like privileges, prizes, or free time. Token economies are widely used in classrooms, residential treatment programs, and even apps that reward healthy habits with virtual currency.
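A token economy is essentially a small ledger: tokens go in when the target behavior occurs and come out when they are exchanged. The sketch below assumes a hypothetical `TokenEconomy` class, and the rewards and prices are invented for illustration.

```python
class TokenEconomy:
    """Track tokens earned and exchanged for backup reinforcers."""

    def __init__(self, prices):
        self.prices = dict(prices)  # backup reinforcer -> cost in tokens
        self.balances = {}          # learner -> tokens currently held

    def earn(self, learner, tokens=1):
        """Award tokens (the conditioned reinforcer) for a target behavior."""
        self.balances[learner] = self.balances.get(learner, 0) + tokens

    def exchange(self, learner, reward):
        """Trade tokens for a backup reinforcer; return False if too few."""
        cost = self.prices[reward]
        if self.balances.get(learner, 0) < cost:
            return False
        self.balances[learner] -= cost
        return True


# Hypothetical classroom menu.
economy = TokenEconomy({"free time": 5, "prize box": 10})
for _ in range(6):
    economy.earn("Ana")  # one token per completed assignment

print(economy.exchange("Ana", "prize box"))  # not enough tokens yet
print(economy.exchange("Ana", "free time"))  # affordable
print(economy.balances["Ana"])
```

The tokens have no value on their own. They work as reinforcers only because they can be traded for something the learner wants.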
Extinction occurs when a previously reinforced behavior stops being reinforced, and the behavior gradually decreases. For example, if a teacher's attention has been reinforcing a student's outbursts and the teacher begins consistently ignoring them, the outbursts should decrease over time. One thing to watch for: extinction bursts. When reinforcement first stops, the behavior often temporarily increases in frequency or intensity before it fades. This is normal and expected.

Key Figures and Applications
B.F. Skinner's Contributions
B.F. Skinner is the most influential figure in operant conditioning. His key contributions include:
- Designing the operant conditioning chamber (commonly called the Skinner Box), where he systematically studied how consequences shape behavior in rats and pigeons. The controlled environment allowed precise measurement of response rates under different reinforcement schedules.
- Developing the concept of radical behaviorism, which explains behavior in terms of observable actions and environmental consequences rather than treating internal mental states as causes.
- Advocating for the application of operant principles to education, including programmed instruction, where material is broken into small steps and students receive immediate feedback.
Behavior Modification Applications
Behavior modification takes these laboratory principles and applies them to real-world settings. The general process follows these steps:
- Define the target behavior in specific, observable terms (not "be good" but "raise hand before speaking").
- Collect baseline data on how often the behavior currently occurs.
- Select appropriate consequences (reinforcers to increase desired behavior, or punishment/extinction to decrease unwanted behavior).
- Apply consequences consistently according to a chosen schedule.
- Monitor and adjust based on whether the behavior is actually changing.
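The "collect baseline data" and "monitor and adjust" steps come down to comparing how often the behavior occurs before and after the intervention. A minimal sketch is below. The function name `percent_change` and the session counts are made up for illustration and are not data from any study.

```python
from statistics import mean


def percent_change(baseline_counts, intervention_counts):
    """Percent change in average behavior count from baseline to intervention."""
    base = mean(baseline_counts)
    if base == 0:
        raise ValueError("baseline average is zero; percent change is undefined")
    return (mean(intervention_counts) - base) / base * 100


# Hypothetical counts of "raise hand before speaking" per class session.
baseline = [2, 1, 3, 2]      # before any consequences are applied
intervention = [4, 5, 6, 5]  # while praise is delivered for hand-raising

change = percent_change(baseline, intervention)
print(f"Change from baseline: {change:+.0f}%")
```

If the change is small or goes in the wrong direction, the chosen consequence may not be reinforcing for that learner, which is the signal to adjust.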
Common applications include:
- Education: Classroom token economies, behavior contracts, and praise systems for managing student behavior.
- Parenting: Reward charts, consistent consequences for rule-breaking, and strategic use of attention (reinforcing positive behavior, ignoring minor misbehavior).
- Therapy: Applied Behavior Analysis (ABA) for autism intervention, and behavioral components within cognitive-behavioral therapy (CBT).
- Self-improvement: Habit-tracking apps that use reinforcement principles, such as earning streaks or rewards for meeting daily goals.
The effectiveness of behavior modification depends heavily on consistency, choosing meaningful reinforcers for the individual, and clearly defining the target behavior. A reinforcer only works if the person actually finds it reinforcing.