
7.5 Measures of Intelligence


Written by the Fiveable Content Team • Last updated August 2025

Intelligence testing measures cognitive abilities like reasoning, memory, and problem-solving. Understanding how these tests are built, what they measure, and where they fall short is a major part of this unit.

Intelligence Testing

Steps in Intelligence Test Development

Building a valid intelligence test is a multi-stage process. Each step matters because a flaw at any point can undermine the entire test.

  1. Defining intelligence — Researchers first decide which cognitive abilities the test will measure (reasoning, memory, problem-solving, etc.) and who the test is for (children, adults, a specific population).

  2. Designing test items — Questions or tasks are created to assess those abilities. Think analogies, pattern recognition, or vocabulary tasks. Items need to be clear and appropriate for the target group in terms of language and cultural relevance.

  3. Standardizing administration — Every test-taker gets the same instructions, time limits, and testing conditions. This consistency is called standardization, and without it, you can't meaningfully compare scores between people.

  4. Scoring and norming — A scoring system is established, and the test is given to a large, representative sample. Those results become the norms, the baseline that lets you convert a raw score into a percentile rank or standard score. This is called norm-referenced testing.

  5. Assessing reliability — The test needs to produce consistent results. There are several ways to check this:

    • Test-retest reliability: Do people get similar scores when they take it again later?
    • Alternate-form reliability: Do two different versions of the test produce similar scores?
    • Inter-rater reliability: Do different scorers agree on the results?
  6. Evaluating validity — Consistency isn't enough; the test also has to measure what it claims to measure.

    • Construct validity: Does it actually assess the cognitive abilities it targets?
    • Concurrent validity: Do scores correlate with other established intelligence measures?
    • Predictive validity: Do scores predict real-world outcomes like academic or job performance?
  7. Refining and updating — Based on reliability and validity data, items are revised. Tests are periodically re-normed and updated to stay relevant and fair.
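The reliability checks in step 5 usually come down to a correlation between two sets of scores. As a rough sketch (using made-up illustration scores, not data from any real test), test-retest reliability can be estimated with a Pearson correlation between a first and second administration:

```python
# Sketch: checking test-retest reliability with a Pearson correlation.
# The score lists below are hypothetical illustration data.

from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# The same five (hypothetical) people tested twice, a few weeks apart.
first_administration = [98, 112, 105, 120, 90]
second_administration = [101, 110, 107, 118, 93]

r = pearson_r(first_administration, second_administration)
print(round(r, 2))  # values near 1.0 indicate strong test-retest reliability
```

The same correlation logic applies to alternate-form reliability (correlate two test versions) and concurrent validity (correlate scores with an established measure).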

Image: Steps in intelligence test development (Applied Psychometrics: The Steps of Scale Development and Standardization Process)

Evolution of IQ Testing

Intelligence testing has changed dramatically since the late 1800s, moving from crude physical measurements to sophisticated multi-ability assessments.

Early attempts (1880s–1890s) — Francis Galton tried measuring intelligence through physical traits and sensory tasks like reaction time. James McKeen Cattell took a similar approach with tests of sensory and motor skills (grip strength, reaction speed). These early measures didn't correlate well with actual intellectual ability.

Binet-Simon Scale (1905) — Alfred Binet and Théodore Simon created the first practical intelligence test in France. Their goal was specific: identify students who needed extra educational help. Instead of measuring sensory abilities, they focused on complex mental processes like judgment, comprehension, and reasoning. They introduced the concept of mental age, comparing a child's performance to what's typical for children of different ages.

Stanford-Binet Intelligence Scale (1916) — Lewis Terman at Stanford adapted the Binet-Simon Scale for American use. This version introduced the intelligence quotient (IQ), calculated as:

IQ = \frac{\text{mental age}}{\text{chronological age}} \times 100

So a 10-year-old performing at the level of a typical 12-year-old would score \frac{12}{10} \times 100 = 120. The Stanford-Binet became the standard for individual intelligence testing and has been revised multiple times since.
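The ratio formula is simple enough to sketch directly; this reproduces the 10-year-old example from the text:

```python
# Sketch of the original ratio IQ formula from the 1916 Stanford-Binet.
def ratio_iq(mental_age, chronological_age):
    """IQ = (mental age / chronological age) * 100."""
    return (mental_age / chronological_age) * 100

print(ratio_iq(12, 10))  # 10-year-old performing at a 12-year-old level -> 120.0
print(ratio_iq(10, 10))  # performing exactly at age level -> 100.0
```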

Wechsler Intelligence Scales (1939–present) — David Wechsler developed an alternative that addressed some Stanford-Binet limitations. Key features:

  • Separate verbal and performance (non-verbal) scales, giving a broader picture of cognitive ability
  • Deviation IQ scores based on the normal distribution (mean = 100, standard deviation = 15), which compare your performance to others in your age group rather than using the mental age ratio
  • Multiple versions for different ages: WAIS (adults), WISC (children), WPPSI (young children)

The deviation IQ approach solved a real problem with the original ratio formula: it didn't work well for adults, since mental age plateaus but chronological age keeps increasing.
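Because deviation IQ is defined on a normal distribution with mean 100 and standard deviation 15, a score translates directly into a percentile rank. A minimal sketch using Python's standard library:

```python
# Sketch: what a deviation IQ score means under the normal distribution
# (mean 100, SD 15) used by the Wechsler scales.
from statistics import NormalDist

iq_distribution = NormalDist(mu=100, sigma=15)

# Percentile rank: the share of same-age peers scoring below a given IQ.
print(round(iq_distribution.cdf(100) * 100))  # 50 -> exactly average
print(round(iq_distribution.cdf(115) * 100))  # 84 -> one SD above the mean
print(round(iq_distribution.cdf(130) * 100))  # 98 -> two SDs above the mean
```

Unlike the mental-age ratio, this comparison works at any age, which is why the deviation approach replaced the original formula.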

Cattell-Horn-Carroll (CHC) Theory (1990s–present) — This framework merges earlier theories into a comprehensive model of cognitive abilities organized in layers, from broad abilities (like fluid reasoning and crystallized knowledge) down to narrow, specific skills. Modern tests like the Woodcock-Johnson Tests of Cognitive Abilities are built on CHC theory.

Image: Hierarchy of test validity (Frontiers: A Hierarchical Taxonomy of Test Validity for More Flexibility of Validation)

Applications vs. Limitations of Intelligence Tests

Intelligence tests are used across many settings, but each application comes with real drawbacks worth knowing.

Educational settings

  • Applications: Identifying gifted students or those with learning disabilities, guiding placement decisions (accelerated programs, remedial support), and informing interventions like tutoring or accommodations.
  • Limitations: Overemphasizing IQ can overlook factors like motivation, creativity, and social-emotional skills. Cultural and linguistic biases can disadvantage students from diverse backgrounds or English language learners.

Clinical and diagnostic settings

  • Applications: Evaluating cognitive strengths and weaknesses, diagnosing intellectual disabilities (generally IQ below 70 along with deficits in adaptive functioning), identifying cognitive impairment from neurological conditions, and planning treatment or accommodations.
  • Limitations: IQ tests alone can't capture the full complexity of someone's cognitive profile. Diagnostic decisions should also consider adaptive functioning, medical history, and cultural context.

Occupational settings

  • Applications: Screening job applicants for roles requiring specific cognitive skills, identifying leadership potential, and assisting with career counseling.
  • Limitations: Over-reliance on IQ scores can lead to discrimination in hiring. Many critical job skills, such as creativity, emotional intelligence, and teamwork, aren't measured by IQ tests.

Research settings

  • Applications: Studying how intelligence develops, its genetic and environmental influences, its relationship to outcomes like achievement or mental health, and whether interventions can improve cognitive skills.
  • Limitations: Lab-based findings may not generalize to real-world settings or diverse populations. There are also ethical concerns about stigmatization or misuse of IQ data.

Advanced Concepts in Intelligence Testing

Psychometrics is the scientific study of psychological measurement. It covers how tests are developed, validated, and interpreted. Any time you hear about reliability coefficients or factor analysis in the context of testing, that's psychometrics.

Multiple intelligences, proposed by Howard Gardner, challenges the idea that intelligence is a single thing. Gardner identified several distinct types: linguistic, logical-mathematical, spatial, musical, bodily-kinesthetic, interpersonal, intrapersonal, and naturalistic. This theory is influential in education, though it's worth noting that many psychologists debate whether all of these qualify as "intelligences" versus talents or abilities.

Cultural considerations address how cultural differences affect test performance and interpretation. Some test items may reflect knowledge or experiences more common in certain cultures, which can skew results. Researchers have worked on developing culture-fair or culture-reduced tests that minimize this bias, though eliminating it entirely is very difficult.