Principles of Phrase Structure Grammar
Phrase structure grammar is a system for describing how words combine into phrases, and how phrases combine into sentences. Instead of treating a sentence as a flat string of words, it reveals the hierarchical structure underneath. That structure is what lets us produce and understand an infinite number of sentences, even ones we've never heard before.
Tree diagrams are the main tool for visualizing this structure. They show exactly how words group together into larger units, making relationships between parts of a sentence much easier to see.
Principles of phrase structure grammar
Constituent structure means that words don't just line up in a row. They group into phrases, and those phrases group into larger phrases, forming a hierarchy. In The cat sat on the mat, "the cat" forms one unit (a noun phrase) and "sat on the mat" forms another (a verb phrase). Those two units combine to form the full sentence.
Phrase structure rules are the formal way of writing out these combinations. They follow the format A → B C, which means "A is made up of B followed by C." For example:
- S → NP VP means a sentence is made up of a noun phrase plus a verb phrase
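A rule system like this is easy to represent directly as data. Here is a minimal sketch in Python (the rule set shown is illustrative, not exhaustive): each left-hand side maps to the right-hand sides it can rewrite to.

```python
# A phrase structure rule A -> B C, stored as a mapping from the left-hand
# side to a list of possible right-hand sides.
rules = {
    "S":  [["NP", "VP"]],   # S -> NP VP
    "NP": [["Det", "N"]],   # NP -> Det N
    "VP": [["V", "NP"]],    # VP -> V NP
}

def expand(symbol):
    """Return the first right-hand side for a symbol, or the symbol itself
    if no rule rewrites it (i.e., it is treated as terminal)."""
    return rules.get(symbol, [[symbol]])[0]

print(expand("S"))   # ['NP', 'VP']
```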
Recursion is what makes language so powerful. A phrase can contain another phrase of the same type inside it. This lets you build sentences of unlimited length: The cat that caught the mouse that ate the cheese that was on the table... Each "that" clause is a new structure embedded inside the previous one.
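The embedding can be sketched in code. The functions below are hypothetical helpers, not a standard API: `np` builds a noun phrase that may contain a relative clause, and that clause (via `vp`) contains another noun phrase, so the two functions call each other exactly the way the recursive rule feeds back into itself. A depth cap stops the demo, but nothing in the grammar itself forces one.

```python
def np(depth):
    # NP -> Det N (that VP): the optional relative clause contains another
    # NP, which is what makes the rule recursive.
    nouns = ["cat", "mouse", "cheese"]
    phrase = f"the {nouns[depth % len(nouns)]}"
    if depth < 2:                 # arbitrary cutoff so the demo terminates
        phrase += f" that {vp(depth + 1)}"
    return phrase

def vp(depth):
    verbs = ["caught", "ate"]
    return f"{verbs[depth - 1]} {np(depth)}"

print(np(0))   # the cat that caught the mouse that ate the cheese
```

Raising the cutoff adds another embedded clause each time, which is the sense in which the grammar licenses sentences of unlimited length.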
Constituency tests help you figure out where the phrase boundaries actually are. Three common tests:
- Substitution: Can you replace a group of words with a single word (like a pronoun)? If the big dog can be replaced by it, that group is a constituent.
- Movement: Can you move the group as a unit to another position in the sentence?
- Coordination: Can you join the group with and or or alongside a similar group?

Hierarchical structure in syntactic trees
A syntactic tree has three types of components:
- Root node (S): the topmost point, representing the whole sentence
- Internal nodes (NP, VP, PP, etc.): the phrases that sit between the sentence level and individual words
- Terminal nodes: the actual words at the bottom of the tree, connected by branches to the nodes above them
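One way to make the three node types concrete is to encode a tree as nested tuples, a common convention though not the only one: a tuple's first element is its label (the root or an internal node), and a bare string is a terminal node. A small recursive walk then recovers the sentence from the leaves.

```python
# (label, child, child, ...) for root/internal nodes; a plain string for words.
tree = ("S",
        ("NP", ("Det", "the"), ("N", "dog")),
        ("VP", ("V", "chased"),
               ("NP", ("Det", "the"), ("N", "cat"))))

def terminals(node):
    """Collect the words at the bottom of the tree, left to right."""
    if isinstance(node, str):        # terminal node: an actual word
        return [node]
    label, *children = node          # root or internal node
    return [w for child in children for w in terminals(child)]

print(" ".join(terminals(tree)))     # the dog chased the cat
```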
There are two ways to think about how a tree gets built:
Top-down analysis starts at S and breaks it into smaller and smaller pieces until you reach individual words. For example: S splits into NP + VP, the NP splits into the + dog, and the VP splits into chased + the + cat.
Bottom-up analysis works in reverse. You start with the words and group them into phrases, then combine those phrases until you reach S. So the + dog form an NP, chased + the + cat form a VP, and then NP + VP form S.
Both approaches produce the same tree. Top-down is often easier when you're generating a sentence from rules; bottom-up is often easier when you're analyzing a sentence someone gave you.
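The bottom-up direction can be sketched as a toy reduction loop, assuming the same illustrative rules and lexicon as above: tag each word with its lexical category, then repeatedly replace the leftmost sequence matching a rule's right-hand side with that rule's left-hand side, until only S remains.

```python
lexicon = {"the": "Det", "dog": "N", "cat": "N", "chased": "V"}
rules = [("NP", ["Det", "N"]), ("VP", ["V", "NP"]), ("S", ["NP", "VP"])]

def bottom_up(words):
    seq = [lexicon[w] for w in words]      # words -> lexical categories
    changed = True
    while changed:
        changed = False
        for lhs, rhs in rules:
            for i in range(len(seq) - len(rhs) + 1):
                if seq[i:i + len(rhs)] == rhs:
                    seq[i:i + len(rhs)] = [lhs]   # reduce RHS to LHS
                    changed = True
                    break
            if changed:
                break
    return seq

print(bottom_up("the dog chased the cat".split()))   # ['S']
```

This is only a sketch of the idea, not a real parser: it keeps no record of the tree it builds and would need backtracking for grammars where the leftmost reduction is the wrong choice.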

Labeling constituents in tree diagrams
Every node in a tree gets a label. These fall into two main categories:
Phrase-level labels identify larger syntactic units:
- NP (Noun Phrase): the big dog
- VP (Verb Phrase): chased the cat
- PP (Prepositional Phrase): on the mat
- AP (Adjective Phrase): very tall
- AdvP (Adverb Phrase): quite slowly
Word-level labels (also called lexical categories) classify individual words:
- N (Noun): dog, V (Verb): run, Adj (Adjective): happy
- Adv (Adverb): quickly, Det (Determiner): the, P (Preposition): on
Functional labels like Subject (Subj), Object (Obj), and Complement (Comp) describe the grammatical role a constituent plays. In The dog chased the cat, "the dog" is an NP that functions as the Subject, and "the cat" is an NP that functions as the Object. Notice that the phrase label (NP) and the functional label (Subject) describe different things: what the phrase is versus what it does.
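The what-it-is / what-it-does split can be illustrated in code. In the sketch below (a simplification that assumes the basic S → NP VP shape), both noun phrases carry the same phrase label, and their functional label is read off their position: the NP directly under S is the Subject, the NP inside the VP is the Object.

```python
tree = ("S",
        ("NP", ("Det", "the"), ("N", "dog")),
        ("VP", ("V", "chased"),
               ("NP", ("Det", "the"), ("N", "cat"))))

def functions(tree):
    """Assign Subj/Obj by tree position: NP under S is the subject,
    NP under VP is the object."""
    roles = {}
    s_label, *s_children = tree
    for child in s_children:
        if child[0] == "NP":
            roles["Subj"] = child
        elif child[0] == "VP":
            for grandchild in child[1:]:
                if grandchild[0] == "NP":
                    roles["Obj"] = grandchild
    return roles

roles = functions(tree)
print(roles["Subj"])   # ('NP', ('Det', 'the'), ('N', 'dog'))
print(roles["Obj"])    # ('NP', ('Det', 'the'), ('N', 'cat'))
```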
Application of phrase structure rules
Here are the core phrase structure rules you'll work with most often:
- S → NP VP
- NP → (Det) (Adj) N (PP)
- VP → V (NP) (PP)
- PP → P NP
Parentheses mean that element is optional. So an NP could be just a noun (dogs), or a determiner + noun (the dogs), or a determiner + adjective + noun + prepositional phrase (the big dog on the porch).
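Treating each optional slot as a choice between "absent" and "present" makes it easy to enumerate every noun phrase a rule like NP → (Det) (Adj) N (PP) licenses. The slot fillers below are illustrative examples, not a real lexicon.

```python
from itertools import product

# One list of choices per slot; "" means the optional element is absent.
slots = [["", "the"],            # (Det)
         ["", "big"],            # (Adj)
         ["dog"],                # N (obligatory)
         ["", "on the porch"]]   # (PP)

nps = [" ".join(w for w in combo if w) for combo in product(*slots)]

print(len(nps))    # 8 possible noun phrases
print(nps[-1])     # the big dog on the porch
```

With three optional slots and one obligatory head, the rule yields 2 × 2 × 1 × 2 = 8 distinct phrase shapes.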
Lexical rules connect word-level categories to actual words: Det → the; N → dog, cat; V → chased.
To generate a sentence, you start at S and keep applying rules until every symbol is a real word:
- Start with S
- Apply S → NP VP
- Apply NP → Det N, then Det → the and N → dog, to get The dog
- Apply VP → V NP to get chased + another NP
- Apply NP → Det N to that second NP to get the cat
- Result: The dog chased the cat
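The same derivation can be run mechanically. In this sketch, phrase rules rewrite the leftmost rewritable symbol, and lexical choices are consumed left to right, so the first N becomes dog and the second becomes cat; a real generator would choose words freely rather than from a fixed queue.

```python
rules = {"S": ["NP", "VP"], "NP": ["Det", "N"], "VP": ["V", "NP"]}
# Lexical insertions, consumed in order for this particular derivation.
words = {"Det": iter(["the", "the"]),
         "N":   iter(["dog", "cat"]),
         "V":   iter(["chased"])}

def derive():
    """Rewrite the leftmost non-word symbol until only words remain,
    recording every intermediate step."""
    symbols = ["S"]
    steps = [symbols[:]]
    while True:
        for i, sym in enumerate(symbols):
            if sym in rules:                       # apply a phrase rule
                symbols[i:i + 1] = rules[sym]
                break
            if sym in words:                       # apply a lexical rule
                symbols[i:i + 1] = [next(words[sym])]
                break
        else:                                      # nothing left to rewrite
            return steps
        steps.append(symbols[:])

steps = derive()
print(" ".join(steps[-1]))   # the dog chased the cat
```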
Visualization of sentence structure
One of the most useful things tree diagrams reveal is structural ambiguity, where a single sentence can have more than one valid tree structure, each with a different meaning.
The classic example is I saw the man with the telescope. This sentence has two possible trees:
- Tree 1: "with the telescope" attaches to the VP (modifying saw), meaning you used the telescope to see the man
- Tree 2: "with the telescope" attaches to the NP the man (modifying man), meaning the man was holding the telescope
This specific type is called attachment ambiguity, because the PP can attach at different points in the tree.
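The two attachments can be written out as two different nested-tuple trees (using the same encoding convention as earlier: a tuple is a labeled node, a bare string is a word). Both trees yield the identical word string; only the structure, and hence the meaning, differs.

```python
pp = ("PP", ("P", "with"), ("NP", ("Det", "the"), ("N", "telescope")))

# Tree 1: PP attaches to the VP — the seeing was done with the telescope.
tree_vp = ("S", ("NP", ("N", "I")),
           ("VP", ("V", "saw"),
                  ("NP", ("Det", "the"), ("N", "man")),
                  pp))

# Tree 2: PP attaches inside the object NP — the man has the telescope.
tree_np = ("S", ("NP", ("N", "I")),
           ("VP", ("V", "saw"),
                  ("NP", ("Det", "the"), ("N", "man"), pp)))

def words(node):
    """Left-to-right word yield of a tree."""
    if isinstance(node, str):
        return [node]
    return [w for child in node[1:] for w in words(child)]

assert words(tree_vp) == words(tree_np)   # identical surface string
print(" ".join(words(tree_vp)))           # I saw the man with the telescope
```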
Coordination ambiguity is another common type. In old men and women, does old modify just men, or both men and women? The two readings produce different tree structures.
These ambiguities get resolved in real life through context, intonation, and world knowledge. But the tree diagrams make the ambiguity visible in a way that a flat sentence on a page never could. That's a big part of why phrase structure grammar is so useful: it shows you exactly where and how different interpretations arise.