Why This Matters
Operant conditioning is one of the most heavily tested learning concepts on the AP Psychology exam, and for good reason—it explains how consequences shape virtually every voluntary behavior you engage in daily. You're being tested on your ability to distinguish between reinforcement and punishment, positive and negative procedures, and different schedules of reinforcement. These distinctions trip up countless students because the terminology doesn't mean what you'd expect in everyday language.
The key insight here is that operant conditioning is fundamentally about behavioral consequences—what happens after a behavior determines whether it increases or decreases in the future. This connects directly to Thorndike's Law of Effect and Skinner's experimental work, both of which appear frequently on multiple-choice questions and FRQs. Don't just memorize definitions—know why each concept changes behavior and how to identify real-world examples using the correct terminology.
The Core Distinction: Reinforcement vs. Punishment
The most fundamental division in operant conditioning separates procedures that increase behavior from those that decrease it. Reinforcement always strengthens behavior; punishment always weakens it. This is true regardless of whether something is added or removed.
Positive Reinforcement
- Adding a desirable stimulus increases behavior—this is the most intuitive operant procedure and the foundation of most behavior modification programs
- Examples include praise, treats, money, or privileges—anything the organism finds rewarding that follows the target behavior
- Strengthens the behavior-consequence association—making the behavior more likely to occur in similar future situations
Negative Reinforcement
- Removing an aversive stimulus increases behavior—often confused with punishment, but remember: reinforcement always increases behavior
- Examples include taking aspirin to remove a headache or fastening a seatbelt to stop the car's beeping—the behavior is strengthened because it eliminates something unpleasant
- Creates escape or avoidance learning—organisms learn to perform behaviors that end or prevent discomfort
Compare: Positive reinforcement vs. negative reinforcement—both increase behavior, but positive adds something pleasant while negative removes something unpleasant. If an FRQ describes someone repeatedly doing something to "get rid of" an annoyance, that's negative reinforcement, not punishment.
Positive Punishment
- Adding an aversive stimulus decreases behavior—the word "positive" here means adding, not "good"
- Examples include verbal reprimands, physical discomfort, or extra work—anything unpleasant that follows the behavior
- Suppresses behavior but doesn't teach alternatives—this is why psychologists often prefer reinforcement-based approaches
Negative Punishment
- Removing a desirable stimulus decreases behavior—also called response cost or omission training
- Examples include losing privileges, time-outs, or fines—taking away something the organism values
- Often more effective than positive punishment—particularly when combined with reinforcement for alternative behaviors
Compare: Positive punishment vs. negative punishment—both decrease behavior, but through opposite mechanisms (adding unpleasantness vs. removing pleasantness). The classic exam trap: "taking away a teenager's phone" is negative punishment, not positive.
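Because the terminology is the whole game here, it can help to see the 2x2 grid as explicit logic. The short Python sketch below is a hypothetical illustration (the function and scenarios are made up for this guide, not standard psychology software): it names the procedure from just two facts about a scenario.

```python
def classify_procedure(stimulus: str, behavior: str) -> str:
    """Name the operant procedure from two facts about a scenario.

    stimulus: "added" or "removed" (what happens after the behavior)
    behavior: "increases" or "decreases" (the long-run effect)
    """
    valence = "reinforcement" if behavior == "increases" else "punishment"
    sign = "positive" if stimulus == "added" else "negative"
    return f"{sign} {valence}"

# The exam trap above: the phone is removed, and the behavior decreases.
print(classify_procedure("removed", "decreases"))  # negative punishment

# Buckling up silences the alarm, and buckling up increases.
print(classify_procedure("removed", "increases"))  # negative reinforcement
```

Notice that "positive"/"negative" track only the stimulus change, while "reinforcement"/"punishment" track only the behavioral effect; keeping those two dimensions separate resolves nearly every terminology question.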
Schedules of Reinforcement
How often and when reinforcement is delivered dramatically affects behavior patterns. Continuous reinforcement (reinforcing every response) produces fast learning but quick extinction. Partial reinforcement creates more persistent behavior—this is called the partial reinforcement extinction effect.
Fixed-Ratio Schedule
- Reinforcement after a set number of responses—produces high, steady response rates with brief pauses after reinforcement
- Examples include piecework pay or punch cards—"buy 10, get 1 free" programs follow this schedule
- Creates a "post-reinforcement pause"—organisms briefly stop responding right after receiving the reward, then resume at high rates
Variable-Ratio Schedule
- Reinforcement after an unpredictable number of responses—produces the highest, most consistent response rates
- Gambling and slot machines are classic examples—the unpredictability keeps organisms responding persistently
- Most resistant to extinction—because the organism can never be certain the next response won't pay off
Fixed-Interval Schedule
- Reinforcement for the first response after a set time period—produces a characteristic scalloped response pattern
- Examples include checking for mail delivery or studying for scheduled exams—response rates increase as the interval ends
- Low responding immediately after reinforcement—organisms learn that early responses don't pay off
Variable-Interval Schedule
- Reinforcement for the first response after unpredictable time periods—produces slow but steady responding
- Examples include checking social media for new posts or pop quizzes—you never know when the next reinforcement will be available
- Moderate resistance to extinction—more persistent than fixed-interval but less than ratio schedules
Compare: Variable-ratio vs. variable-interval schedules—both are "variable" and resist extinction well, but ratio schedules (based on number of responses) produce much higher response rates than interval schedules (based on time). Slot machines use variable-ratio; that's why they're so addictive.
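To see why ratio schedules favor fast responding while interval schedules do not, consider this minimal Python simulation. The classes and parameters are hypothetical illustrations of the delivery rules, not a standard library: a simulated learner responds on each time step with some probability, and we count reinforcers earned under a fixed-ratio versus a fixed-interval rule.

```python
import random

class FixedRatio:
    """Reinforce every n-th response; the clock is irrelevant."""
    def __init__(self, n):
        self.n, self.responses = n, 0

    def step(self, responded):
        if not responded:
            return False
        self.responses += 1
        if self.responses >= self.n:
            self.responses = 0
            return True
        return False

class FixedInterval:
    """Reinforce the first response after t time steps have elapsed."""
    def __init__(self, t):
        self.t, self.elapsed = t, 0

    def step(self, responded):
        self.elapsed += 1
        if responded and self.elapsed >= self.t:
            self.elapsed = 0
            return True
        return False

def reinforcers_earned(schedule, response_prob, steps=10_000):
    """A learner who responds on each time step with a fixed probability."""
    random.seed(0)  # deterministic for the demo
    return sum(schedule.step(random.random() < response_prob)
               for _ in range(steps))

for prob in (0.25, 0.5):  # a slow responder vs. one twice as fast
    fr = reinforcers_earned(FixedRatio(5), prob)
    fi = reinforcers_earned(FixedInterval(5), prob)
    print(f"response rate {prob}: FR-5 earned {fr}, FI-5 earned {fi}")
```

Doubling the response rate roughly doubles earnings under the ratio rule but changes interval earnings far less, since extra responses cannot make the clock run faster. That asymmetry is one way to understand why ratio schedules sustain much higher response rates.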
Building and Modifying Behavior
Operant conditioning isn't just about simple responses—it explains how complex behaviors are acquired and how existing behaviors can be changed. Shaping, discriminative stimuli, and different types of reinforcers are the tools that make this possible.
Shaping
- Reinforcing successive approximations toward a target behavior—the only way to teach behaviors that would never occur spontaneously
- Breaks complex behaviors into smaller, achievable steps—each step closer to the goal is reinforced until the full behavior emerges
- Used to train animals and teach new skills—from teaching a rat to press a lever to helping a child learn to speak (see the sketch after this list)
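Here is a minimal Python sketch of that loop. The learner's "behavior" is just a number (think of it as how close a rat stands to the lever); every value, threshold, and step size is a made-up illustration rather than an established model.

```python
import random

random.seed(1)
target = 10.0        # the full target behavior
behavior = 0.0       # where the learner starts
criterion = 1.0      # the first approximation we are willing to reinforce

for trial in range(1, 301):
    attempt = behavior + random.uniform(-1, 1)    # natural variability
    if attempt >= criterion:                      # close enough: reinforce it
        behavior = attempt                        # reinforced behavior persists
        criterion = min(target, criterion + 0.5)  # then raise the bar
    if behavior >= target:
        print(f"full target behavior reached on trial {trial}")
        break

print(f"final behavior: {behavior:.1f} (target was {target})")
```

Each time the criterion ratchets up, only closer approximations earn reinforcement, which is exactly how a trainer moves a behavior from "anything vaguely in the right direction" to the finished response.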
Discriminative Stimuli
- Signals that indicate when reinforcement is available—organisms learn to respond only in the presence of these cues
- Creates stimulus control over behavior—the behavior becomes associated with specific environmental contexts
- Examples include "OPEN" signs, green traffic lights, or a teacher's attention—these signal that a particular response will be reinforced
Compare: Discriminative stimuli in operant conditioning vs. conditioned stimuli in classical conditioning—both involve learned associations with environmental cues, but discriminative stimuli signal when to respond, while conditioned stimuli elicit automatic responses. This distinction frequently appears on the exam.
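The following toy Python simulation (a hypothetical setup and learning rule, invented for this guide) shows stimulus control emerging: responding pays off only when a cue is present, so the tendency to respond rises under the cue and extinguishes without it.

```python
import random

random.seed(2)
# One response tendency per context: cue present vs. cue absent.
tendency = {"cue_on": 0.5, "cue_off": 0.5}

for _ in range(2000):
    context = random.choice(["cue_on", "cue_off"])
    if random.random() < tendency[context]:       # the learner responds
        reinforced = (context == "cue_on")        # only the cue signals payoff
        step = 0.05 if reinforced else -0.05      # strengthen or weaken
        tendency[context] = min(0.99, max(0.01, tendency[context] + step))

print(tendency)  # cue_on tendency ends high; cue_off tendency ends low
```

After training, the behavior occurs almost exclusively in the presence of the discriminative stimulus, which is what "stimulus control" means in practice.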
Primary and Secondary Reinforcers
- Primary reinforcers satisfy biological needs—food, water, warmth, and sexual contact are inherently reinforcing without learning
- Secondary (conditioned) reinforcers acquire value through association—money, grades, praise, and tokens become reinforcing because they've been paired with primary reinforcers
- Secondary reinforcers are more practical for human applications—most behavior modification programs use conditioned reinforcers like tokens or points
Foundational Research and Applications
Understanding who developed these concepts and how they're applied gives you the context needed for exam questions about research methods and real-world implications.
Law of Effect
- Edward Thorndike's foundational principle—behaviors followed by satisfying consequences are "stamped in," while those followed by discomfort are "stamped out"
- Established through puzzle box experiments with cats—cats learned to escape faster over trials as successful behaviors were strengthened
- Forms the theoretical basis for all operant conditioning—Skinner built directly on Thorndike's work
Operant Chamber (Skinner Box)
- B.F. Skinner's controlled experimental apparatus—allows precise measurement of behavior and delivery of consequences
- Contains a manipulandum (lever or key) and reinforcement delivery system—the animal's responses are automatically recorded
- Enabled systematic study of reinforcement schedules—the cumulative recorder tracked response patterns over time
Compare: Skinner box vs. Thorndike's puzzle box—both study learning through consequences, but Skinner's apparatus allowed for continuous study of behavior over time, while Thorndike measured escape latency across discrete trials.
Token Economy
- A systematic application of secondary reinforcement—individuals earn tokens for target behaviors that can be exchanged for backup reinforcers
- Commonly used in classrooms, psychiatric hospitals, and prisons—provides immediate reinforcement even when primary reinforcers can't be delivered instantly
- Demonstrates the power of conditioned reinforcers—tokens have no inherent value but become powerful motivators through association (a toy sketch follows this list)
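A token economy is simple enough to express as bookkeeping. In this hypothetical Python sketch, the behaviors, token values, and backup reinforcers are invented for illustration; real programs are tailored by the clinician or teacher running them.

```python
token_values = {"completed homework": 2, "helped a classmate": 1}
backup_reinforcers = {"extra recess": 5, "small prize": 10}  # token costs

tokens = 0
for behavior in ["completed homework", "helped a classmate",
                 "completed homework", "completed homework"]:
    tokens += token_values[behavior]   # immediate conditioned reinforcement

print(f"earned {tokens} tokens")
for reward, cost in backup_reinforcers.items():
    if tokens >= cost:
        tokens -= cost                 # exchange tokens for a backup reinforcer
        print(f"exchanged for {reward}; {tokens} tokens left")
```

The tokens bridge the delay between the target behavior and the backup reinforcer, which is why they work even when the prize exchange happens only at the end of the month.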
Behavior Modification
- The systematic application of operant principles to change behavior—used in therapy, education, parenting, and organizational settings
- Involves careful measurement of baseline behavior and intervention effects—emphasizes observable, measurable outcomes
- Includes techniques like shaping, token economies, and contingency contracts—always based on manipulating consequences
Additional Key Concepts
Extinction
- Occurs when reinforcement is discontinued—the previously reinforced behavior gradually decreases in frequency
- Often produces an initial extinction burst—a temporary increase in behavior intensity before decline begins
- Partial reinforcement creates greater resistance to extinction—behaviors reinforced intermittently take longer to extinguish (see the simulation after this list)
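The sketch below simulates the partial reinforcement extinction effect with a deliberately toy learning rule: response strength grows on reinforced trials, and the assumed unlearning speed during extinction scales with how strongly reinforcement was expected (an intermittently reinforced learner is less "surprised" when a response goes unrewarded). The rule and all numbers are illustrative assumptions, not an established model, and the sketch shows resistance to extinction rather than the extinction burst.

```python
import random

def trials_to_extinguish(train_prob, threshold=0.2, lr=0.02, seed=3):
    """Train at a given reinforcement probability, then cut reinforcement off."""
    random.seed(seed)
    strength = 0.0
    for _ in range(1000):                  # training phase
        if random.random() < train_prob:   # reinforced responses strengthen
            strength += lr * (1.0 - strength)
    trials = 0                             # extinction phase: no payoffs now
    while strength >= threshold:
        # Toy assumption: unlearning is proportional to how confidently
        # reinforcement was expected, so partial training slows extinction.
        strength -= lr * train_prob * strength
        trials += 1
    return trials

print("continuous (100%):", trials_to_extinguish(1.0), "extinction trials")
print("partial (30%):   ", trials_to_extinguish(0.3), "extinction trials")
```

Under these assumptions the intermittently trained response takes roughly three times as many unreinforced trials to fall below threshold, mirroring the greater persistence described above.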
Premack Principle
- A high-probability behavior can reinforce a low-probability behavior—"first eat your vegetables, then you can have dessert"
- Also called "Grandma's Rule"—preferred activities serve as reinforcers for less preferred activities
- Highlights that reinforcement is relative—what functions as a reinforcer depends on the individual's preferences
Compare: Extinction in operant vs. classical conditioning—both involve the weakening of learned behavior, but operant extinction withholds the reinforcing consequence, while classical extinction presents the conditioned stimulus repeatedly without the unconditioned stimulus. Both can show spontaneous recovery.
Quick Reference Table
| Concept | Everyday Examples |
| --- | --- |
| Positive reinforcement | Praise, treats, money, privileges added after behavior |
| Negative reinforcement | Seatbelt silencing alarm, aspirin relieving headache |
| Positive punishment | Scolding, spanking, extra chores added after behavior |
| Negative punishment | Time-out, losing phone privileges, fines |
| Variable-ratio schedule | Slot machines, sales commissions, fishing |
| Fixed-interval schedule | Weekly paychecks, checking mail at delivery time |
| Shaping | Teaching tricks, speech therapy, successive approximations |
| Secondary reinforcers | Money, grades, tokens, praise |
Self-Check Questions
- A student studies more frequently because doing so has resulted in good grades. What type of operant procedure is this, and why?
- Compare the response patterns produced by fixed-ratio and fixed-interval schedules. Which produces a "scalloped" pattern, and what explains this difference?
- A child throws tantrums to get attention. The parents decide to completely ignore the tantrums. What operant procedure are they using, and what should they expect to happen initially before the behavior decreases?
- Explain why slot machines are more addictive than vending machines, using your knowledge of reinforcement schedules. Which schedule produces greater resistance to extinction?
- A teacher gives students points for completing homework, which can be exchanged for prizes at the end of the month. Identify the type of reinforcer the points represent and explain how this system relates to the concept of conditioned reinforcement.