Operant conditioning is learning through consequences: reinforcement makes a behavior more likely, and punishment makes it less likely. To use it well on the AP Psychology exam, you need to tell apart positive vs. negative and reinforcement vs. punishment, plus know shaping, schedules of reinforcement, and tricky effects like superstitious behavior and learned helplessness.

Operant Conditioning AP Psychology Definition

Operant conditioning is learning through consequences. A behavior becomes more likely when it is followed by reinforcement and less likely when it is followed by punishment. The key AP Psychology move is to identify the behavior first, then decide whether the consequence increases or decreases that behavior.

Positive and negative do not mean good and bad in operant conditioning. Positive means a stimulus is added, and negative means a stimulus is removed. Reinforcement increases behavior, while punishment decreases behavior.

Why This Matters for the AP Psychology Exam

Operant conditioning shows up in the learning portion of Unit 3, and it is a favorite for application questions. Multiple-choice questions often give you a short scenario and ask you to label the type of reinforcement, punishment, or schedule at work. Because the exam tends to test research scenarios with data, you may also see a reinforcement-schedule graph and need to interpret the response pattern.

This topic also builds the explanation skills you will use in free-response writing. You may be asked to define a learning concept and then apply it accurately to a described situation, so being able to connect the term to a clear behavioral example is what earns points.

Key Takeaways

The Law of Effect is the foundation: behaviors followed by reinforcing consequences increase, and behaviors followed by punishing consequences decrease.
"Positive" means adding a stimulus and "negative" means removing one; this is separate from whether the consequence increases or decreases behavior.
Reinforcers can be primary (satisfy a biological need) or secondary (learned value, like money or grades).
Shaping reinforces successive approximations toward a goal, but instinctive drift can pull an animal back to natural behaviors.
Superstitious behavior comes from accidental reinforcement; learned helplessness comes from repeated uncontrollable aversive outcomes.
Reinforcement schedules (continuous vs. partial, and fixed/variable interval/ratio) shape how fast learning happens and how resistant it is to extinction.

The Law of Effect

The Law of Effect, first described by Edward Thorndike, says that people and animals tend to repeat actions that lead to good outcomes and avoid actions that lead to bad outcomes. B.F. Skinner built operant conditioning on this idea, studying how consequences control voluntary behavior.

Key points:

Reinforcing consequences increase the frequency of a behavior.
Punishing consequences decrease the frequency of a behavior.
Timing matters; immediate consequences tend to have a stronger effect.
Consistency strengthens the association between a behavior and its consequence.

Types of Reinforcement and Punishment

Reinforcement and punishment each affect behavior differently, depending on whether something is added (positive) or removed (negative). "Positive" and "negative" here do not mean good or bad. Think of them like math signs: positive is +1 (adding) and negative is -1 (taking away).

Positive = adding something
Negative = taking something away
Reinforcement = increases a behavior
Punishment = decreases a behavior
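
The four labels above combine into a two-by-two grid, which can be sketched as a tiny lookup. This is only an illustrative Python sketch for studying, not part of the AP course framework; the function name `classify_consequence` and its parameter names are made up for this example.

```python
def classify_consequence(stimulus_added: bool, behavior_increases: bool) -> str:
    """Label a consequence from two yes/no observations.

    Hypothetical study helper: the name and parameters are assumptions,
    not terms from the course framework.
    """
    # Positive vs. negative: was a stimulus added or removed?
    sign = "positive" if stimulus_added else "negative"
    # Reinforcement vs. punishment: did the behavior increase or decrease?
    effect = "reinforcement" if behavior_increases else "punishment"
    return f"{sign} {effect}"


# A beeping sound is removed, and buckling up becomes more likely.
print(classify_consequence(stimulus_added=False, behavior_increases=True))
# -> negative reinforcement

# Extra chores are added, and talking back becomes less likely.
print(classify_consequence(stimulus_added=True, behavior_increases=False))
# -> positive punishment
```

Notice that the two questions are answered independently, which is why "negative" never tells you whether the behavior goes up or down.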

Reinforcement (increases behavior)

Positive reinforcement: adding something desirable to encourage a behavior.
- Example: A student gets candy for answering a question correctly, making them more likely to participate again.
Negative reinforcement: removing something unpleasant to encourage a behavior.
- Example: A driver buckles up to stop the annoying beeping sound, making them more likely to buckle up in the future.

Punishment (decreases behavior)

Positive punishment: adding something unpleasant to discourage a behavior.
- Example: A child gets extra chores for talking back, making them less likely to do it again.
Negative punishment: removing something enjoyable to discourage a behavior.
- Example: A teenager loses phone privileges for missing curfew, making them less likely to miss curfew again.

Effectiveness depends on:

When the consequence is delivered
How consistently it is applied
How much the individual cares about the specific reinforcer or punisher

Primary vs. Secondary Reinforcers

Primary reinforcers are naturally rewarding because they satisfy biological needs, such as food, water, warmth, or relief from pain.
Secondary reinforcers are learned reinforcers that gain value through association with primary reinforcers, such as money, grades, praise, or trophies.
Example: A dog treat is a primary reinforcer, while a paycheck is a secondary reinforcer.

Reinforcement Discrimination and Generalization

Reinforcement discrimination occurs when an organism learns that a behavior is reinforced in one situation but not in another.
- Example: A student jokes around with friends at lunch but not during class because joking is reinforced in one setting and not the other.
Reinforcement generalization occurs when a behavior reinforced in one situation begins to occur in similar situations.
- Example: If a dog is rewarded for sitting in the kitchen, it may also start sitting in the living room to get a treat.

Shaping Behavior Through Reinforcement

Shaping teaches a behavior by reinforcing small steps toward the final goal, rather than waiting for the full behavior to happen all at once. It is useful for learning complex behaviors that do not happen automatically, and it is used in animal training, teaching new skills, and therapy.

Instead of expecting the full behavior right away, you reward progress in small steps called successive approximations. Each step gets closer to the goal.

Example: Teaching a dog to roll over

Pick the target behavior: rolling over completely.
Break it into smaller steps: reward the dog for lying down, then for turning its head, then for rolling halfway, and finally for rolling over.
Reinforce each step: give treats or praise for each small success.
Gradually raise the bar: only reward when the dog gets closer to fully rolling over.
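
The steps above can be sketched as a short simulation. This is a simplified illustration that assumes the learner's attempts are already known and that each step can be ranked by how close it is to the target; the names `STEPS` and `run_shaping` are hypothetical, and real training is far less tidy.

```python
# Ordered from least to most like the target behavior (rolling over).
STEPS = ["lie down", "turn head", "roll halfway", "roll over"]


def run_shaping(attempts, steps=STEPS):
    """Return (attempt, reinforced) pairs for a sequence of attempts.

    Assumption for this sketch: every attempt is one of the named steps;
    an unknown attempt would raise ValueError from list.index.
    """
    required = 0  # index of the least-advanced step that still earns a treat
    log = []
    for attempt in attempts:
        level = steps.index(attempt)
        reinforced = level >= required
        if reinforced:
            # Raise the bar: next time only a closer approximation is rewarded,
            # capped at the final target so the full behavior keeps paying off.
            required = min(level + 1, len(steps) - 1)
        log.append((attempt, reinforced))
    return log


attempts = ["lie down", "lie down", "turn head", "roll halfway",
            "turn head", "roll over", "roll over"]
for attempt, reinforced in run_shaping(attempts):
    print(f"{attempt}: {'treat' if reinforced else 'no treat'}")
```

In this run, the second "lie down" and the later "turn head" earn nothing because the criterion has already moved past them, which is the "gradually raise the bar" step in action.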

Limits to shaping:

The behavior must be something the person or animal can physically do.
Instinctive drift happens when an animal returns to its natural behaviors, even after training. For example, a raccoon trained to put a coin in a piggy bank might start rubbing the coin instead, because that is what raccoons instinctively do with food.

Superstitious Behavior vs. Learned Helplessness

Both concepts show how learning can go wrong, but they happen for different reasons.

Superstitious behavior happens when someone mistakenly connects an action with an outcome, even though they are not actually related. This comes from accidental reinforcement: a reward happens randomly after a behavior, so the person or animal believes the behavior caused it.

Example: A baseball player wears the same lucky socks for every game because they once hit a home run while wearing them, even though the socks had nothing to do with it.
The behavior can continue even with no real cause-and-effect relationship.

Learned helplessness happens when someone experiences repeated negative outcomes they cannot control. Over time, they stop trying to improve their situation, even when they later have the power to change things.

Example: A student repeatedly fails math tests despite studying, so they stop trying, believing nothing they do will help.
Even when the situation changes (like getting a great tutor), they might still expect to fail and not put in effort.

Key differences:

Superstitious behavior comes from a false belief in control; learned helplessness comes from believing there is no control at all.
Superstitions make people repeat unnecessary actions; learned helplessness makes them stop trying.
Superstitious behavior develops when good things happen by chance; learned helplessness develops when bad things happen repeatedly with no escape.

Reinforcement Schedules

The way rewards are given affects how behavior is learned and maintained. Reinforcement schedules determine when and how often a behavior is reinforced, which changes how quickly learning happens and how long the behavior lasts.

Continuous reinforcement provides a reward every time a behavior occurs. It is the fastest way to teach a new behavior because the learner quickly associates the action with the reward.

Example: A dog gets a treat every time it sits on command.
Works well for initial learning, but if reinforcement stops, the behavior disappears quickly (extinction).

Partial reinforcement gives rewards only sometimes, which makes the behavior more resistant to extinction. There are four main types, divided into interval-based (time-related) and ratio-based (response-related) schedules.

Interval-Based Schedules (Reinforcement Based on Time)

Fixed-interval schedule: the reward comes for the first response after a set period of time has passed.
- Example: A worker gets paid every two weeks.
- Behavior increases as the reward time approaches and slows down right after.
Variable-interval schedule: the reward comes for the first response after an unpredictable amount of time has passed.
- Example: Checking your phone for a text message. Messages arrive at unpredictable times, so you keep checking throughout the day.
- Produces steady, consistent responding because reinforcement is unpredictable.

Ratio-Based Schedules (Reinforcement Based on Responses)

Fixed-ratio schedule: a reward is given after a set number of responses.
- Example: A coffee shop gives a free drink after every 10 purchases.
- Creates a high response rate, with a brief pause after the reward.
Variable-ratio schedule: the number of responses needed for reinforcement changes randomly.
- Example: Slot machines pay out after an unpredictable number of plays.
- This schedule is the most resistant to extinction because the person keeps responding, hoping the next attempt will be rewarded.
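
To see how the time-based and response-based rules differ, here is a minimal Python sketch of the two fixed schedules. It assumes responses are recorded as timestamps in arbitrary units and that the fixed-interval timer restarts at each reinforcement; the function names are hypothetical. The variable versions would draw the ratio or interval at random around an average, which is left out here to keep the output predictable.

```python
def fixed_ratio(response_times, ratio):
    """Reinforce every `ratio`-th response, regardless of when it occurs."""
    return [t for count, t in enumerate(response_times, start=1)
            if count % ratio == 0]


def fixed_interval(response_times, interval):
    """Reinforce the first response made once `interval` time units have
    passed since the last reinforcement (or since the start, at time 0).

    Assumption for this sketch: the timer restarts when a reward is given.
    """
    reinforced = []
    last_reward = 0
    for t in response_times:
        if t - last_reward >= interval:
            reinforced.append(t)
            last_reward = t
    return reinforced


# One response per time unit, from t = 1 through t = 12.
responses = list(range(1, 13))

print(fixed_ratio(responses, ratio=5))        # -> [5, 10]
print(fixed_interval(responses, interval=4))  # -> [4, 8, 12]
```

On the ratio schedule, responding faster earns rewards sooner. On the interval schedule, extra responses before the time is up earn nothing, which fits the slowdown right after each reward.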

Graph Patterns of Reinforcement Schedules

Each reinforcement schedule produces a distinctive pattern of responding when graphed over time.

Continuous reinforcement: rapid learning, but responding drops quickly if reinforcement stops.
Fixed-interval: produces a scalloped pattern, with responses increasing as the expected reward time gets closer and slowing right after reinforcement.
Variable-interval: produces a slow, steady response pattern.
Fixed-ratio: produces a high rate of responding with a brief pause after reinforcement.
Variable-ratio: produces a very high, steady response rate and is the most resistant to extinction.

Which Schedule Works Best?

Continuous reinforcement is best for learning new behaviors quickly.
Partial reinforcement is better for maintaining behavior over time.
Variable schedules, especially variable-ratio, create the most persistent behaviors because the unpredictability keeps people responding.

How to Use This on the AP Psychology Exam

MCQ

Read the scenario and ask two questions in order: Is the behavior increasing or decreasing? That tells you reinforcement vs. punishment. Then ask: Was something added or removed? That tells you positive vs. negative.
Watch for negative reinforcement traps. Removing something unpleasant to increase a behavior is reinforcement, not punishment.
For schedule questions, decide first whether reinforcement is based on time (interval) or number of responses (ratio), then whether it is predictable (fixed) or unpredictable (variable).
If you see a scalloped response graph, think fixed-interval. A steep, steady line that resists extinction points to variable-ratio.
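
The two-step schedule check described in the tips above can be written out as a small function. This is a study sketch only; `classify_schedule` and its arguments are hypothetical names chosen for this example.

```python
def classify_schedule(based_on: str, predictable: bool) -> str:
    """Name a partial reinforcement schedule from two observations.

    based_on: "time" or "responses" (hypothetical labels for this sketch)
    predictable: True if the requirement is the same every time
    """
    if based_on not in ("time", "responses"):
        raise ValueError("based_on must be 'time' or 'responses'")
    # Step 1: time -> interval, number of responses -> ratio.
    kind = "interval" if based_on == "time" else "ratio"
    # Step 2: predictable -> fixed, unpredictable -> variable.
    prefix = "fixed" if predictable else "variable"
    return f"{prefix}-{kind}"


# Free drink after every 10 purchases: response-based and predictable.
print(classify_schedule("responses", predictable=True))   # -> fixed-ratio
# Slot machine payouts: response-based and unpredictable.
print(classify_schedule("responses", predictable=False))  # -> variable-ratio
```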

Free Response

When you define a term, follow it immediately with an applied example tied to the scenario in the prompt. Naming the concept alone usually is not enough.
Be precise with direction: state whether the consequence increases or decreases the behavior so it is clear you understand reinforcement vs. punishment.
For shaping, mention reinforcing successive approximations, not just "rewarding the behavior."

Common Trap

Do not equate "negative" with "bad" or "positive" with "good." They only describe whether a stimulus is removed or added.

Common Misconceptions

Negative reinforcement is not punishment. It removes something unpleasant to increase a behavior, so it strengthens behavior just like positive reinforcement does.
Punishment does not always mean physical pain. Negative punishment removes something enjoyable, like taking away phone privileges.
Primary and secondary reinforcers are about source, not strength. Primary reinforcers meet biological needs; secondary reinforcers gain value through learned association, like money or grades.
Shaping is not one big reward at the end. It reinforces small steps (successive approximations) that build toward the target behavior.
Superstitious behavior and learned helplessness are different. Superstitious behavior comes from accidental reinforcement and a false sense of control; learned helplessness comes from repeated uncontrollable bad outcomes and a sense of no control.
Continuous reinforcement is not the most resistant to extinction. It teaches fast, but partial schedules, especially variable-ratio, keep behavior going longer.

Vocabulary

The following words are mentioned explicitly in the AP® course framework for this topic.

Term	Definition
continuous reinforcement	A reinforcement schedule in which reinforcement is delivered after every correct behavior.
fixed-interval schedule	A reinforcement schedule in which reinforcement is delivered for the first correct behavior after a fixed amount of time has passed.
fixed-ratio schedule	A reinforcement schedule in which reinforcement is delivered after a fixed number of correct behaviors.
instinctive drift	The tendency of organisms to revert to instinctive behaviors even when those behaviors interfere with operant conditioning.
Law of Effect	The principle that behaviors followed by reinforcing consequences are more likely to be repeated, while behaviors followed by punishing consequences are less likely to be repeated.
learned helplessness	A condition in which an organism learns that it has no control over aversive consequences and stops attempting to escape or avoid them.
negative punishment	The removal of a desirable consequence following a behavior to decrease the likelihood of that behavior being repeated.
negative reinforcement	The removal of an undesirable consequence following a behavior to increase the likelihood of that behavior being repeated.
operant conditioning	A learning process in which behavior is modified by its consequences, with reinforcement increasing the likelihood of a behavior and punishment decreasing it.
partial reinforcement	A reinforcement schedule in which reinforcement is delivered after some, but not all, correct behaviors.
positive punishment	The addition of an undesirable consequence following a behavior to decrease the likelihood of that behavior being repeated.
positive reinforcement	The addition of a desirable consequence following a behavior to increase the likelihood of that behavior being repeated.
primary reinforcer	A reinforcer that satisfies a basic biological need, such as food or water.
punishment	A consequence that decreases the likelihood that a behavior will be repeated.
reinforcement	A consequence that increases the likelihood that a behavior will be repeated.
reinforcement discrimination	The ability to distinguish between stimuli that are followed by reinforcement and those that are not, leading to differential responding.
reinforcement generalization	The tendency to respond to stimuli similar to those associated with reinforcement in the same way as the original stimulus.
reinforcement schedule	The pattern or timing with which reinforcement is delivered following a behavior.
secondary reinforcer	A reinforcer that has acquired value through association with a primary reinforcer, such as money or praise.
shaping	A technique for conditioning a desired behavior by reinforcing successive approximations of that behavior.
superstitious behavior	Behavior that is reinforced by coincidental consequences unrelated to the behavior itself.
variable-interval schedule	A reinforcement schedule in which reinforcement is delivered for the first correct behavior after a variable amount of time has passed.
variable-ratio schedule	A reinforcement schedule in which reinforcement is delivered after a variable number of correct behaviors.

Frequently Asked Questions

What is operant conditioning in AP Psychology?

Operant conditioning is learning through consequences. Reinforcement makes a behavior more likely to happen again, while punishment makes a behavior less likely to happen again.

What is the difference between reinforcement and punishment?

Reinforcement increases a behavior, and punishment decreases a behavior. Positive and negative describe whether something is added or removed, not whether the consequence feels good or bad.

What is negative reinforcement?

Negative reinforcement removes an unpleasant stimulus to increase a behavior. For example, buckling a seat belt to stop a beeping sound is negative reinforcement because the behavior increases when the annoying sound is removed.

What is shaping in operant conditioning?

Shaping reinforces successive approximations, or small steps, toward a target behavior. Instead of waiting for the final behavior, each closer step is reinforced until the full behavior develops.

What is reinforcement discrimination?

Reinforcement discrimination happens when an organism learns that a behavior is reinforced in one situation but not another. For example, a student may learn that joking is rewarded with friends but not during a test.

What does a scalloped graph mean in AP Psychology?

A scalloped graph usually points to a fixed-interval schedule. Responses increase as the expected reinforcement time gets closer, then slow down right after reinforcement.