---
title: "Test-Retest Reliability — AP Psych Definition & Exam Guide"
description: "Test-retest reliability means a test gives consistent scores when the same people take it again later. Learn how AP Psych pairs it with validity in Topic 2.8."
canonical: "https://fiveable.me/ap-psych-revised/key-terms/test-retest-reliability"
type: "key-term"
subject: "AP Psychology"
unit: "Unit 2"
---

# Test-Retest Reliability — AP Psych Definition & Exam Guide

## Definition

Test-retest reliability is a psychometric principle measured by giving the same test to the same people at two different times and checking how consistent their scores are; high consistency means the test is reliable, though not necessarily valid.

## What It Is

Test-retest reliability answers a simple question. If you took an [intelligence](/ap-psych-revised/unit-2/8-intelligence-and-achievement/study-guide/CTKkcmSqii8BEYzZ "fv-autolink") test today and again next month, would you get roughly the same score? If yes, the test is reliable. If your IQ score bounced from 95 to 130 with no real change in you, the test is unreliable and basically useless, no matter how scientific it looks.
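One common way to put a number on "roughly the same score" is to correlate the two sets of scores. The sketch below is only an illustration: the scores are made up, the names `first_testing`, `stable_retest`, and `erratic_retest` are hypothetical, and using the Pearson correlation is an assumption about how the consistency check would be done, not something the CED requires you to compute.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Made-up IQ scores for the same five people, tested a month apart.
first_testing = [95, 102, 110, 88, 120]
stable_retest = [97, 100, 112, 90, 118]   # everyone lands near their first score
erratic_retest = [130, 85, 92, 121, 99]   # scores bounce with no real change in the person

r_stable = pearson_r(first_testing, stable_retest)
r_erratic = pearson_r(first_testing, erratic_retest)

print(f"stable retest:  r = {r_stable:.2f}")
print(f"erratic retest: r = {r_erratic:.2f}")
```

A correlation close to 1 is what "consistent scores" looks like numerically. A low or negative correlation is the bouncing-score situation described above.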

In the AP Psych CED, test-retest reliability lives under Topic 2.8 (Intelligence and Achievement) as one of the sound **[psychometric principles](/ap-psych-revised/key-terms/psychometric-principles "fv-autolink")** every psychological assessment must meet. Think of reliability as consistency and [validity](/ap-psych-revised/key-terms/validity "fv-autolink") as accuracy. A reliable test gives you the same answer every time. A valid test gives you the *right* answer. A broken scale that always reads 10 pounds heavy is perfectly reliable and completely invalid, and that distinction is exactly what the exam tests.

## Why It Matters

This term supports [AP Psych Revised](/ap-psych-revised "fv-autolink") learning objective 2.8.B, which covers how intelligence is measured. The essential knowledge here is blunt: all psychological assessments, including IQ tests, must follow sound psychometric principles to be useful, and reliability is one of those principles alongside [standardization](/ap-psych-revised/key-terms/standardization "fv-autolink") and validity. Topic 2.8 also asks you to think critically about intelligence testing (2.8.A and 2.8.C), and reliability is part of that critique. A test can produce wonderfully consistent scores and still be biased or measure the wrong thing, which is why the CED makes you evaluate reliability and validity as separate hurdles a test has to clear.

## Connections

### [Construct validity (Unit 2)](/ap-psych-revised/key-terms/construct-validity)

Reliability and [construct validity](/ap-psych-revised/key-terms/construct-validity "fv-autolink") are the two big checkpoints for any test. Test-retest reliability asks whether scores are consistent over time, while construct validity asks whether the test actually measures intelligence and not something else, like vocabulary or test-taking speed. A test can pass the reliability check and still fail the validity check.

### [Predictive validity (Unit 2)](/ap-psych-revised/key-terms/predictive-validity)

[Predictive validity](/ap-psych-revised/key-terms/predictive-validity "fv-autolink") asks whether a test forecasts future outcomes, like whether SAT scores predict college GPA. Reliability is the prerequisite. If a test can't even agree with itself across two administrations, it has no hope of predicting anything down the road.

### [Psychometric principles (Unit 2)](/ap-psych-revised/key-terms/psychometric-principles)

Test-retest reliability is one member of the family of psychometric principles the CED says every assessment must meet. The full checklist is standardization (consistent procedures), reliability (consistent scores), and validity (accurate measurement). Knowing which principle a scenario describes is a classic MCQ move.

### The Flynn Effect (Unit 2)

Here's a twist worth knowing. IQ scores rising across generations (the [Flynn Effect](/ap-psych-revised/key-terms/flynn-effect "fv-autolink"), covered in 2.8.C) does not mean IQ tests lack test-retest reliability. Reliability is about the same individuals scoring consistently over a short span, not about population averages drifting over decades due to better nutrition, health care, and education.
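To see why a generational shift and test-retest reliability are separate questions, here is a small sketch with invented numbers. The two cohorts, their scores, and the 10-point gap between them are hypothetical and are not estimates of the real Flynn Effect. The point is only that each cohort can retest consistently while the averages differ.

```python
from math import sqrt

def average(scores):
    return sum(scores) / len(scores)

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x, mean_y = average(x), average(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Invented scores: each cohort takes the same test twice, a few weeks apart.
earlier_first = [90, 95, 100, 105, 110]
earlier_retest = [91, 94, 101, 104, 111]
later_first = [100, 105, 110, 115, 120]
later_retest = [101, 104, 111, 114, 121]

# Within each cohort: do the same individuals score consistently?
r_earlier = pearson_r(earlier_first, earlier_retest)
r_later = pearson_r(later_first, later_retest)

# Between cohorts: how far has the average drifted?
generation_gap = average(later_first) - average(earlier_first)

print(f"earlier cohort retest r = {r_earlier:.2f}")
print(f"later cohort retest r   = {r_later:.2f}")
print(f"gap between cohort averages = {generation_gap}")
```

The retest correlations describe individuals over a short span. The gap describes population averages. A large gap does nothing to the correlations.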

## On the AP Exam

Test-retest reliability shows up almost entirely in scenario-based multiple choice. The stem describes what a researcher is doing, and you have to name the psychometric principle. The pattern to memorize: same test, same people, two time points, and a comparison of score consistency add up to test-retest reliability. Watch for the distractors. Comparing test scores with academic performance is predictive validity, not reliability. Training all proctors to follow an identical script is standardization, not reliability.

On the AAQ-style free response, you might need to evaluate whether a study's measurement was reliable and explain why that matters for the conclusions. No released FRQ has used this term verbatim, but it fits naturally into questions about whether a measure in a study can be trusted.

## Test-Retest Reliability vs. Validity

Reliability is consistency, validity is accuracy, and a test needs both. A test with high test-retest reliability gives the same person similar scores across administrations, but that says nothing about whether it measures intelligence correctly. Picture a scale that always reads 10 pounds too heavy. It is reliable (same wrong number every time) but not valid. The reverse is impossible: a test cannot be valid without being reliable, because a test that gives random scores can't be measuring anything accurately. AP MCQs love hiding validity answers in reliability questions and vice versa, so always ask whether the scenario is about consistency (reliability) or correctness (validity).
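The scale example can be written out as two separate checks. This is a minimal sketch with assumed numbers: the 150-pound true weight and the names `TRUE_WEIGHT` and `OFFSET` are made up for illustration.

```python
# Made-up setup: one object with a known true weight, weighed five times
# on a scale that always reads 10 pounds too heavy.
TRUE_WEIGHT = 150.0
OFFSET = 10.0

readings = [TRUE_WEIGHT + OFFSET for _ in range(5)]

# Reliability check: do repeated readings agree with each other?
spread = max(readings) - min(readings)

# Validity check: do the readings match the true value?
average_error = sum(readings) / len(readings) - TRUE_WEIGHT

print(spread)         # 0.0  -> perfectly consistent
print(average_error)  # 10.0 -> consistently wrong
```

The two checks use different comparisons. Reliability compares readings to each other, while validity compares readings to the truth, so passing the first tells you nothing about the second.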

## Key Takeaways

- Test-retest reliability means the same people get consistent scores when they take the same test at two different times.
- Reliability is about consistency, while validity is about accuracy, and the AP exam expects you to keep those two ideas separate.
- A test can be reliable without being valid, but it can never be valid without first being reliable.
- Test-retest reliability is one of the sound psychometric principles (alongside standardization and validity) that the CED says all psychological assessments must meet (2.8.B).
- Rising IQ scores across generations (the Flynn Effect) reflect societal changes, not a failure of test-retest reliability.
- On the exam, the trigger phrase is 'administered the same test to the same group later and compared scores,' which points to test-retest reliability.

## FAQs

### What is test-retest reliability in AP Psychology?

It's a way of checking a test's consistency by giving the same test to the same people at two different times and comparing the scores. If scores stay similar, the test has high test-retest reliability. It's part of the psychometric principles covered in Topic 2.8.

### Does high test-retest reliability mean a test is valid?

No. Reliability only means scores are consistent, not that they're accurate. A biased IQ test could give the same wrong score every single time. Validity (whether the test measures what it claims to) is a separate requirement, and AP MCQs frequently test this exact distinction.

### What's the difference between test-retest reliability and standardization?

Standardization means the test is administered with consistent procedures and environments, like all proctors following an identical script. Test-retest reliability is about whether the scores themselves stay consistent over time. One is about how the test is given, the other is about how the results behave.

### How is test-retest reliability different from predictive validity?

Test-retest reliability compares scores on the same test taken twice by the same people. Predictive validity compares test scores to a future outcome, like comparing an intelligence test to later academic performance. If a scenario mentions forecasting future success, it's predictive validity, not reliability.

### Does the Flynn Effect mean IQ tests aren't reliable?

No. The Flynn Effect is the rise in average IQ scores across generations, driven by societal factors like better nutrition, health care, and socioeconomic conditions. Test-retest reliability concerns the same individuals over a short window, so generational score increases don't count against it.

## Related Study Guides

- [2.8 Intelligence and Achievement](/ap-psych-revised/unit-2/8-intelligence-and-achievement/study-guide/CTKkcmSqii8BEYzZ)

## Structured Data

```json
{"@context":"https://schema.org","@graph":[{"@type":"LearningResource","@id":"https://fiveable.me/ap-psych-revised/key-terms/test-retest-reliability#resource","name":"Test-Retest Reliability — AP Psych Definition & Exam Guide","url":"https://fiveable.me/ap-psych-revised/key-terms/test-retest-reliability","learningResourceType":"Concept explainer","educationalLevel":"AP® / High School","about":{"@id":"https://fiveable.me/ap-psych-revised/key-terms/test-retest-reliability#term"},"audience":{"@type":"EducationalAudience","educationalRole":"student"},"dateModified":"2026-06-11T05:53:23.660Z","isPartOf":{"@type":"Collection","name":"AP Psychology Key Terms","url":"https://fiveable.me/ap-psych-revised/key-terms"},"publisher":{"@type":"Organization","name":"Fiveable","url":"https://fiveable.me"}},{"@type":"DefinedTerm","@id":"https://fiveable.me/ap-psych-revised/key-terms/test-retest-reliability#term","name":"test-retest reliability","description":"Test-retest reliability is a psychometric principle measured by giving the same test to the same people at two different times and checking how consistent their scores are; high consistency means the test is reliable, though not necessarily valid.","url":"https://fiveable.me/ap-psych-revised/key-terms/test-retest-reliability","inDefinedTermSet":{"@type":"DefinedTermSet","name":"AP Psychology Key Terms","url":"https://fiveable.me/ap-psych-revised/key-terms"}},{"@type":"FAQPage","mainEntity":[{"@type":"Question","name":"What is test-retest reliability in AP Psychology?","acceptedAnswer":{"@type":"Answer","text":"It's a way of checking a test's consistency by giving the same test to the same people at two different times and comparing the scores. If scores stay similar, the test has high test-retest reliability. It's part of the psychometric principles covered in Topic 2.8."}},{"@type":"Question","name":"Does high test-retest reliability mean a test is valid?","acceptedAnswer":{"@type":"Answer","text":"No. Reliability only means scores are consistent, not that they're accurate. A biased IQ test could give the same wrong score every single time. Validity (whether the test measures what it claims to) is a separate requirement, and AP MCQs frequently test this exact distinction."}},{"@type":"Question","name":"What's the difference between test-retest reliability and standardization?","acceptedAnswer":{"@type":"Answer","text":"Standardization means the test is administered with consistent procedures and environments, like all proctors following an identical script. Test-retest reliability is about whether the scores themselves stay consistent over time. One is about how the test is given, the other is about how the results behave."}},{"@type":"Question","name":"How is test-retest reliability different from predictive validity?","acceptedAnswer":{"@type":"Answer","text":"Test-retest reliability compares scores on the same test taken twice by the same people. Predictive validity compares test scores to a future outcome, like comparing an intelligence test to later academic performance. If a scenario mentions forecasting future success, it's predictive validity, not reliability."}},{"@type":"Question","name":"Does the Flynn Effect mean IQ tests aren't reliable?","acceptedAnswer":{"@type":"Answer","text":"No. The Flynn Effect is the rise in average IQ scores across generations, driven by societal factors like better nutrition, health care, and socioeconomic conditions. Test-retest reliability concerns the same individuals over a short window, so generational score increases don't count against it."}}]},{"@type":"BreadcrumbList","itemListElement":[{"@type":"ListItem","position":1,"name":"AP Psychology","item":"https://fiveable.me/ap-psych-revised"},{"@type":"ListItem","position":2,"name":"Key Terms","item":"https://fiveable.me/ap-psych-revised/key-terms"},{"@type":"ListItem","position":3,"name":"Unit 2","item":"https://fiveable.me/ap-psych-revised/unit-2"},{"@type":"ListItem","position":4,"name":"test-retest reliability"}]}]}
```
