upgrade
upgrade

💾Intro to Database Systems

Essential SQL Commands

Study smarter with Fiveable

Get study guides, practice questions, and cheatsheets for all your subjects. Join 500,000+ students with a 96% pass rate.

Get Started

Why This Matters

SQL commands form the backbone of every database interaction you'll encounter in this course and beyond. Whether you're building a web application, analyzing business data, or designing enterprise systems, you're being tested on your ability to retrieve, manipulate, and structure data efficiently. These commands aren't just syntax to memorize—they represent fundamental operations that map directly to relational algebra concepts, data integrity principles, and query optimization strategies.

Understanding SQL commands means recognizing when and why to use each one. Exams will test whether you can write correct queries, but FRQs often dig deeper: Can you explain why a JOIN outperforms a subquery? Do you know the difference between WHERE and HAVING? Don't just memorize the syntax—know what category each command belongs to and what problem it solves.


Data Manipulation Language (DML): Reading and Writing Data

These commands handle the day-to-day operations of working with data—the CRUD operations (Create, Read, Update, Delete) that form the foundation of application development.

SELECT

  • Retrieves data from tables—the most frequently used SQL command and the foundation of all query operations
  • Column specification allows you to request specific fields or use * for all columns, though explicit column lists improve performance and clarity
  • Supports filtering, sorting, and grouping through clauses like WHERE, ORDER BY, and GROUP BY, making it the core of data analysis

INSERT

  • Adds new records to a table—the "Create" in CRUD operations
  • Requires column-value mapping where you specify target columns and corresponding values: $$INSERT INTO table (col1, col2) VALUES (val1, val2)$$
  • Batch insertion allows multiple rows in a single command, reducing network overhead and improving transaction efficiency

UPDATE

  • Modifies existing records—changes column values for rows matching specified conditions
  • WHERE clause is critical—omitting it updates every row in the table, a common and dangerous mistake
  • Atomic operation ensures all specified columns update together within a single transaction

DELETE

  • Removes records from a table—the "Delete" in CRUD, but data can be recovered if within a transaction
  • WHERE clause prevents disasters—without it, you'll delete every row (this is a favorite exam trap)
  • Differs from DROP in that DELETE removes data while preserving table structure

Compare: UPDATE vs. DELETE—both require WHERE clauses to target specific rows, both can affect all records if WHERE is omitted. The difference: UPDATE modifies values in place, DELETE removes entire rows. If an FRQ asks about data integrity risks, mention the missing WHERE clause problem for both.


Data Definition Language (DDL): Building Database Structure

DDL commands define the schema—the blueprint of your database. These commands create, modify, and remove database objects like tables, indexes, and views. Changes are typically auto-committed and cannot be rolled back.

CREATE TABLE

  • Defines table structure—specifies table name, column names, and data types that determine what data can be stored
  • Constraints enforce integrity including PRIMARY KEY (unique identifier), FOREIGN KEY (referential integrity), NOT NULL, and UNIQUE
  • Schema design decisions made here affect query performance, data quality, and application behavior throughout the database lifecycle

ALTER TABLE

  • Modifies existing table structure—adds, removes, or changes columns and constraints without recreating the table
  • Supports schema evolution as requirements change: $$ALTER TABLE users ADD COLUMN email VARCHAR(255)$$
  • Can rename tables and modify data types, though type changes may fail if existing data is incompatible

DROP TABLE

  • Permanently removes table and all data—cannot be undone outside of backup restoration
  • Cascading effects may delete related foreign key constraints or dependent views
  • Requires elevated privileges in production environments due to its destructive nature

CREATE INDEX

  • Accelerates data retrieval by creating a separate data structure (typically B-tree) that points to table rows
  • Trade-off between read and write speed—indexes speed up SELECT but slow down INSERT, UPDATE, and DELETE operations
  • Unique indexes enforce uniqueness constraints while also improving query performance on those columns

Compare: CREATE TABLE vs. ALTER TABLE—both modify schema, but CREATE builds from scratch while ALTER modifies in place. CREATE INDEX vs. PRIMARY KEY—both can enforce uniqueness, but indexes are optional performance tools while primary keys are mandatory identifiers.


Query Clauses: Filtering and Organizing Results

These clauses modify how SELECT statements process and return data. They execute in a specific order: FROM → WHERE → GROUP BY → HAVING → SELECT → ORDER BY.

WHERE

  • Filters rows before grouping—evaluates conditions against individual records to include or exclude them
  • Supports comparison operators including $$=$$, $$<$$, $$>$$, $$<=$$, $$>=$$, $$<>$$, plus logical operators AND, OR, NOT
  • Works with SELECT, UPDATE, and DELETE—the universal filtering mechanism across DML commands

ORDER BY

  • Sorts result sets by one or more columns in ascending (ASC, default) or descending (DESC) order
  • Executes last in query processing—sorting happens after all filtering and grouping is complete
  • Multiple column sorting creates hierarchical ordering: $$ORDER BY department ASC, salary DESC$$

GROUP BY

  • Aggregates rows with matching values—collapses multiple rows into summary rows based on specified columns
  • Requires aggregate functions like COUNT, SUM, AVG, MIN, MAX to produce meaningful grouped results
  • Column rule: every non-aggregated column in SELECT must appear in GROUP BY (common exam question)

HAVING

  • Filters groups after aggregation—applies conditions to grouped results, not individual rows
  • Requires GROUP BY—HAVING without grouping is syntactically valid but logically equivalent to WHERE
  • Enables aggregate conditions like $$HAVING COUNT(*) > 5$$ which WHERE cannot evaluate

Compare: WHERE vs. HAVING—both filter data, but WHERE filters individual rows before grouping while HAVING filters groups after aggregation. Classic exam question: "Show departments with more than 10 employees"—you need GROUP BY department with HAVING COUNT(*) > 10, not WHERE.


Multi-Table Operations: Combining Data Sources

These commands work across multiple tables or result sets, enabling the relational power of relational databases. Understanding these is essential for normalized database design.

JOIN

  • Combines rows from multiple tables based on related columns, typically primary and foreign key relationships
  • Four main types: INNER JOIN (matching rows only), LEFT JOIN (all left + matching right), RIGHT JOIN (all right + matching left), FULL JOIN (all rows from both)
  • Foundation of relational queries—without JOINs, normalized databases would require multiple separate queries

UNION

  • Stacks result sets vertically—combines rows from multiple SELECT statements into a single result
  • Removes duplicates by default; use UNION ALL to preserve all rows including duplicates (faster performance)
  • Column compatibility required—all SELECT statements must return the same number of columns with compatible data types

Compare: JOIN vs. UNION—both combine data from multiple sources, but JOIN combines columns horizontally (adding more fields) while UNION combines rows vertically (adding more records). FRQ tip: If asked to "combine customer data from two regional tables," that's UNION. If asked to "show customers with their orders," that's JOIN.


Database Objects: Virtual Tables and Abstraction

These commands create persistent database objects that simplify complex operations and provide abstraction layers over raw tables.

CREATE VIEW

  • Defines a virtual table based on a stored SELECT query—no data is duplicated, just the query definition
  • Simplifies complex queries by encapsulating multi-table JOINs, filters, and calculations into a single named object
  • Provides security abstraction—users can access views without permissions on underlying tables, hiding sensitive columns

Compare: CREATE VIEW vs. CREATE TABLE—both create queryable objects, but tables store actual data while views store query definitions. Views always reflect current underlying data; tables require explicit updates. Use views for reporting and access control, tables for persistent storage.


Quick Reference Table

ConceptBest Examples
Data RetrievalSELECT, WHERE, ORDER BY
Data ModificationINSERT, UPDATE, DELETE
Schema DefinitionCREATE TABLE, ALTER TABLE, DROP TABLE
Aggregation & GroupingGROUP BY, HAVING, aggregate functions
Multi-Table QueriesJOIN, UNION
Performance OptimizationCREATE INDEX
Abstraction & SecurityCREATE VIEW
Row FilteringWHERE (before grouping), HAVING (after grouping)

Self-Check Questions

  1. Which two commands both require a WHERE clause to avoid affecting all rows in a table? What happens if you omit it in each case?

  2. Compare WHERE and HAVING: If you need to find all departments where the average salary exceeds $50,000, which clause would you use and why?

  3. What's the fundamental difference between JOIN and UNION? Give a scenario where each would be the appropriate choice.

  4. CREATE INDEX improves SELECT performance but has a trade-off. What operations become slower when you add indexes, and why?

  5. FRQ-style: You're designing a database for a library system. Explain when you would use CREATE TABLE vs. CREATE VIEW, and provide an example of each that demonstrates their different purposes.