upgrade
upgrade

💾Intro to Database Systems

Types of Database Keys

Study smarter with Fiveable

Get study guides, practice questions, and cheatsheets for all your subjects. Join 500,000+ students with a 96% pass rate.

Get Started

Why This Matters

Database keys are the backbone of relational database design—they're how you ensure data stays accurate, relationships stay intact, and queries return exactly what you need. On exams, you're being tested on more than just definitions: you need to understand why a designer would choose one key type over another, how keys enforce referential integrity, and what happens when key constraints are violated. These concepts connect directly to normalization, entity-relationship modeling, and query optimization.

The key types fall into distinct categories based on their purpose: some exist to uniquely identify records, others to establish relationships, and still others to improve performance. Don't just memorize a list of nine key types—know what problem each one solves and when you'd reach for it in a real schema design. That conceptual understanding is what separates a passing answer from a strong one.


Keys for Unique Identification

These keys solve the fundamental problem of distinguishing one record from another. The core principle: every row in a table must be uniquely addressable, or your data becomes unreliable.

Primary Key

  • Uniquely identifies each record—the "official" identifier that the database enforces as the main access point for every row
  • Cannot contain NULL values, ensuring every record has a guaranteed, non-empty identifier
  • Enforces entity integrity by preventing duplicate entries and serving as the target for foreign key references

Candidate Key

  • Any column (or combination) that could serve as the primary key—meets all the same requirements of uniqueness and non-null values
  • Multiple candidate keys can exist in a single table, but only one gets promoted to primary key status
  • Represents minimal uniqueness, meaning no subset of the candidate key's columns would also be unique

Alternate Key

  • A candidate key not selected as the primary key—still unique, still valid, just not the "chosen one"
  • Provides secondary access paths for queries that naturally search by a different attribute
  • Often indexed to support efficient lookups without hitting the primary key

Compare: Candidate Key vs. Alternate Key—both guarantee uniqueness, but candidate keys are potential primary keys while alternate keys are the runners-up after selection. If an exam asks which key "could have been" the primary key, the answer is any candidate key (including the one that became an alternate).


Keys Defined by Composition

These keys are categorized by how many columns they contain and what they include. Understanding the hierarchy here—super key → candidate key → primary key—is frequently tested.

Super Key

  • Any set of columns that uniquely identifies records—the broadest category of identifying keys
  • May include unnecessary attributes, making it a superset of candidate keys (e.g., {StudentID, Name} is a super key even though StudentID alone suffices)
  • Useful for understanding key hierarchy: every candidate key is a super key, but not every super key is a candidate key

Composite Key

  • Combines two or more columns to achieve uniqueness—used when no single column can uniquely identify records
  • Essential for junction tables in many-to-many relationships (e.g., {StudentID, CourseID} in an Enrollment table)
  • Each component column may contain duplicates individually; only the combination must be unique

Compare: Super Key vs. Composite Key—a super key is defined by what it can do (uniquely identify), while a composite key is defined by how it's built (multiple columns). A composite key is always a super key, but a super key with extra unnecessary columns isn't minimal enough to be a candidate key.


Keys for Establishing Relationships

Foreign keys are the mechanism that links tables together, enabling the relational model to work. Without foreign keys, you'd have isolated tables with no way to enforce valid references.

Foreign Key

  • References the primary key of another table—creates a link that the database actively enforces
  • Enables referential integrity, meaning you can't insert a foreign key value that doesn't exist in the referenced table
  • Can contain NULL values when the relationship is optional (e.g., an Employee table where ManagerID is NULL for the CEO)

Compare: Primary Key vs. Foreign Key—primary keys identify records within a table; foreign keys reference records across tables. A foreign key in one table points to a primary key in another, creating the parent-child relationships that make joins possible.


Keys by Origin: Natural vs. Artificial

This distinction matters for real-world database design. The debate between natural and surrogate keys comes down to stability versus meaning.

Natural Key

  • Derived from real-world data that already exists—Social Security numbers, email addresses, ISBN codes
  • Carries business meaning, making data more intuitive to read and query without joins
  • Vulnerable to real-world changes—if someone's email changes or a government renumbers SSNs, your key changes too

Surrogate Key

  • Artificially generated by the database—typically an auto-incrementing integer or UUID with no business meaning
  • Completely stable because it's immune to changes in the actual data it represents
  • Simplifies foreign key relationships since you're always joining on a simple integer rather than a complex natural value

Compare: Natural Key vs. Surrogate Key—natural keys are meaningful but fragile; surrogate keys are meaningless but stable. FRQs often ask you to justify choosing one over the other: use natural keys when the real-world identifier is truly permanent (like country codes), surrogate keys when business data might change.


Keys for Constraints and Performance

These keys focus on enforcement and access speed rather than identification or relationships.

Unique Key

  • Enforces distinctness on a column that isn't the primary key—prevents duplicates where business rules require it
  • Allows one NULL value (unlike primary keys), useful for optional-but-unique fields like secondary email addresses
  • Creates an implicit index, improving query performance on that column

Secondary Key

  • Used for retrieval, not identification—a column you frequently search or sort by, even though it contains duplicates
  • Typically indexed to speed up queries (e.g., indexing LastName for fast alphabetical lookups)
  • Does not enforce uniqueness, distinguishing it from unique keys and candidate keys

Compare: Unique Key vs. Secondary Key—unique keys are constraints that prevent duplicates; secondary keys are access paths that improve performance. A unique key always creates an index, but a secondary key index doesn't enforce any constraint.


Quick Reference Table

ConceptBest Examples
Uniquely identifies recordsPrimary Key, Candidate Key, Alternate Key
Key hierarchy (broad to narrow)Super Key → Candidate Key → Primary Key
Multi-column identificationComposite Key
Cross-table relationshipsForeign Key
Real-world vs. artificial identifiersNatural Key, Surrogate Key
Constraint enforcementPrimary Key (no NULLs), Unique Key (allows NULL)
Performance optimizationSecondary Key, Unique Key (both create indexes)
Many-to-many junction tablesComposite Key (often as Primary Key), Foreign Keys

Self-Check Questions

  1. What's the difference between a super key and a candidate key? Give an example where adding a column to a candidate key creates a super key that's no longer minimal.

  2. You're designing a table for student course enrollments. Why would you use a composite key of {StudentID, CourseID} rather than creating a new surrogate key?

  3. Compare natural keys and surrogate keys: under what circumstances would you choose each? What are the trade-offs for data integrity and query readability?

  4. A table has three candidate keys: {Email}, {EmployeeID}, and {SSN}. You choose EmployeeID as the primary key. What are the other two keys now called, and can they still be used to uniquely access records?

  5. Explain why a foreign key can contain NULL values while a primary key cannot. Describe a real-world scenario where a nullable foreign key makes sense.