Semantic Universals and Variation
Cross-linguistic studies of meaning ask a deceptively simple question: what aspects of meaning are shared across all human languages, and where do languages diverge? The answers matter because they tell us something fundamental about human cognition. If every language carves up color the same way, that points to shared biology; if languages differ wildly in how they encode space or number, that suggests culture and language structure play a bigger role than we might expect.
This topic sits at the intersection of semantics, cognitive science, and anthropology. You'll encounter key studies (Berlin & Kay, Levinson, Gordon) that use experimental methods to test whether linguistic differences correlate with cognitive differences.
Semantic Universals Across Languages
A semantic universal is a meaning distinction that appears in all (or nearly all) known languages. Three well-studied domains are color, kinship, and space.
Basic color terms. Berlin and Kay's landmark 1969 study proposed that languages acquire basic color terms in a predictable hierarchy:
- All languages have terms for black and white (light vs. dark).
- If a language adds a third color term, it's always red (the first chromatic term).
- Additional terms are added in a roughly fixed order: yellow and green (in either order), then blue, then brown, then purple, pink, orange, and grey.
This hierarchy suggests that color naming isn't arbitrary. Perceptual salience and the biology of human vision likely constrain which color categories get lexicalized first.
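The hierarchy is an implicational claim: a term from a later stage predicts the presence of every term from the earlier stages. A minimal Python sketch of that logic follows, using the stages exactly as listed above. The names `STAGES` and `fits_hierarchy` are illustrative assumptions, not part of Berlin and Kay's work, and the check covers only this simplified summary.

```python
# Stages of the hierarchy as summarized above. Terms inside one stage may be
# added in any order; a later stage presupposes all earlier stages.
STAGES = [
    {"black", "white"},
    {"red"},
    {"yellow", "green"},
    {"blue"},
    {"brown"},
    {"purple", "pink", "orange", "grey"},
]


def fits_hierarchy(terms):
    """Return True if a color-term inventory is consistent with STAGES."""
    terms = set(terms)
    known = set().union(*STAGES)
    if not terms <= known:
        raise ValueError(f"unknown terms: {sorted(terms - known)}")
    # The summary above says every language has both black and white.
    if not STAGES[0] <= terms:
        return False
    seen_incomplete = False
    for stage in STAGES:
        present = terms & stage
        # A term from this stage is not allowed once an earlier stage
        # has been left incomplete.
        if present and seen_incomplete:
            return False
        if present != stage:
            seen_incomplete = True
    return True
```

Under this sketch, an inventory of black, white, and red fits the hierarchy, while black, white, and blue does not, because blue would appear before red, yellow, and green.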
Kinship systems. Every language has kinship terms, but the distinctions they encode vary. Two common cross-linguistic patterns:
- Languages distinguish lineal relatives (direct line of descent: parents, grandparents, children) from collateral relatives (connected through a shared ancestor: siblings, cousins, aunts, uncles).
- Some languages also distinguish maternal relatives (mother's side) from paternal relatives (father's side), while others collapse these into a single term.
English, for instance, uses one word "uncle" for both your mother's brother and your father's brother. Many languages (e.g., Mandarin, Hindi) use separate terms for each.
Spatial relations. Languages universally encode basic spatial relationships between objects, though they do so in different ways. Core spatial concepts include:
- Proximity: near vs. far
- Containment: in vs. out
- Support: on vs. under
- Vertical orientation: up vs. down
- Horizontal orientation: front, back, left, right
These categories appear across languages, but how they're packaged differs. Korean, for example, distinguishes between tight-fit containment (kkita) and loose-fit containment (nehta), a distinction that English "in" does not mark.
Semantic Variation Between Languages
While universals reveal shared cognitive architecture, variation shows how languages can diverge in meaningful ways.
Grammatical gender. Some languages assign nouns to gender categories (masculine, feminine, neuter), and these assignments aren't consistent across languages. The word for "sun" is masculine in Spanish (el sol), feminine in German (die Sonne), and neuter in Russian (солнце). This variation matters because research suggests grammatical gender can subtly influence how speakers think about objects (e.g., describing a bridge with "feminine" adjectives in German, where Brücke is feminine, vs. "masculine" adjectives in Spanish, where puente is masculine).
Evidentiality. Some languages grammatically require speakers to mark how they know what they're saying. English lets you optionally say "apparently" or "I heard that," but in languages like Quechua, evidential markers are obligatory:
- The suffix -mi signals direct evidence (you saw it yourself).
- The suffix -si signals hearsay (someone told you).
This means Quechua speakers must encode their information source every time they make a statement, which is a semantic distinction English leaves largely implicit.
Numeral systems. Most familiar languages use a decimal (base-10) system, but other bases exist. French uses partial vigesimal (base-20) structure: quatre-vingts (4 × 20 = 80). The Babylonian system was sexagesimal (base-60), which is why we still have 60 seconds in a minute. Some systems are radically different: Oksapmin (Papua New Guinea) uses a body-part counting system where specific body parts map to specific numbers, progressing from the thumb on one hand across the face and down the other arm.
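The arithmetic behind different bases can be made concrete with a short Python sketch. The helper name `to_base` is an assumption introduced here for illustration. It only decomposes a number into place values and does not model how any language actually forms its numeral words.

```python
def to_base(n, base):
    """Return the digits of a non-negative integer n in the given base,
    most significant digit first."""
    if n < 0 or base < 2:
        raise ValueError("n must be >= 0 and base must be >= 2")
    digits = []
    while True:
        n, remainder = divmod(n, base)
        digits.append(remainder)
        if n == 0:
            break
    return digits[::-1]


# 80 in base 20 is "four twenties and zero units", like quatre-vingts.
print(to_base(80, 20))    # [4, 0]
# 3661 seconds in base 60 is 1 hour, 1 minute, 1 second.
print(to_base(3661, 60))  # [1, 1, 1]
```

The same number 80 comes out as [8, 0] in base 10, which shows that the quantity is constant and only the packaging into place values changes.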

Language, Thought, and Culture
Linguistic Relativity
The Sapir-Whorf hypothesis (also called linguistic relativity) proposes that the language you speak shapes how you think. It comes in two versions:
- Strong version (linguistic determinism): Language determines thought. If your language lacks a word for something, you literally cannot think about it. This version is largely rejected by modern researchers.
- Weak version (linguistic influence): Language influences thought and perception without fully determining it. This version has substantial experimental support.
Most current research tests the weak version by looking for cases where speakers of different languages perform differently on non-linguistic tasks. Three key domains:
- Color perception: Do speakers of languages with more color terms categorize colors faster or more accurately at category boundaries?
- Spatial cognition: Do speakers of languages that use absolute directions ("north," "south") instead of relative ones ("left," "right") reason about space differently?
- Numerical cognition: Do speakers of languages with limited or irregular number systems process quantities differently?
Key Cross-Linguistic Studies
Berlin and Kay (1969): Color terms
- Method: Collected basic color term data from 20 languages and mapped which terms each language had.
- Findings: Identified the universal hierarchy of color term acquisition described above.
- Criticisms: The sample of 20 languages was small and skewed toward languages with contact with industrialized societies. Later work (the World Color Survey) expanded the sample significantly and found the hierarchy holds broadly, though with more flexibility than Berlin and Kay originally proposed.
Levinson and colleagues (2002): Spatial language and cognition
- Method: Compared speakers of languages that use different spatial frames of reference. Some languages favor relative reference (left/right, based on the speaker's body), while others favor absolute reference (cardinal directions like north/south). Participants completed both linguistic and non-linguistic spatial reasoning tasks.
- Findings: Speakers of absolute-reference languages (like Tzeltal, a Mayan language) used absolute strategies even on non-linguistic tasks, while speakers of relative-reference languages (like Dutch) used relative strategies. This supports the weak version of linguistic relativity: spatial language correlates with spatial cognition.
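One way to see why a non-linguistic task can separate the two strategies is a rotation scenario: a participant sees an object on one side, turns to face a different direction, and is asked to place it "the same way" again. The Python sketch below works through that logic. The scenario and every name in it (`HEADINGS`, `side_to_compass`, `predict_placement`) are illustrative assumptions, not details of the study's procedure as reported above.

```python
HEADINGS = ["north", "east", "south", "west"]  # clockwise order


def side_to_compass(facing, side):
    """Compass direction that lies on the viewer's 'left' or 'right'."""
    offset = {"right": 1, "left": -1}[side]
    return HEADINGS[(HEADINGS.index(facing) + offset) % 4]


def compass_to_side(facing, direction):
    """Viewer-centered side ('left' or 'right') for a compass direction."""
    for side in ("left", "right"):
        if side_to_compass(facing, side) == direction:
            return side
    raise ValueError("direction is directly ahead of or behind the viewer")


def predict_placement(strategy, facing_before, facing_after, side_before):
    """Predict (side, compass direction) of the object after the viewer turns."""
    if strategy == "relative":
        # Body-centered coding: the object stays on the same side of the body.
        side = side_before
        return side, side_to_compass(facing_after, side)
    if strategy == "absolute":
        # Environment-centered coding: the object keeps its compass direction.
        direction = side_to_compass(facing_before, side_before)
        return compass_to_side(facing_after, direction), direction
    raise ValueError(f"unknown strategy: {strategy}")


print(predict_placement("relative", "north", "south", "right"))  # ('right', 'west')
print(predict_placement("absolute", "north", "south", "right"))  # ('left', 'east')
```

After a half turn the two strategies predict opposite placements, so a participant's response reveals which frame of reference they relied on without requiring them to say anything.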
Gordon (2004): Numerical cognition in Pirahã
- Method: Tested speakers of Pirahã, an Amazonian language with an extremely limited numeral system (roughly: "one," "two," "many"), on tasks requiring exact quantity matching and memory for specific numbers.
- Findings: Pirahã speakers performed well with small quantities but struggled with exact arithmetic and quantity matching for larger numbers (above 3 or so).
- Important caveats: This study generated significant debate. Alternative explanations include cultural factors (the Pirahã don't engage in trade or counting practices), limited formal education, and task unfamiliarity. The results are consistent with linguistic influence on cognition, but they don't prove language is the sole cause of the difficulty.
Takeaway for exams: When discussing these studies, always mention the methodology, the key finding, and the limitations or alternative explanations. Cross-linguistic research is powerful but messy. No single study proves linguistic determinism; instead, the cumulative evidence supports a moderate position where language is one of several factors shaping cognition.