upgrade
upgrade

🧵Programming Languages and Techniques I

Common Data Structures

Study smarter with Fiveable

Get study guides, practice questions, and cheatsheets for all your subjects. Join 500,000+ students with a 96% pass rate.

Get Started

Why This Matters

Data structures are the foundation of everything you'll build as a programmer—they determine how efficiently your code runs and how elegantly it solves problems. You're being tested not just on what each structure is, but on when to use it and why it performs the way it does. Understanding the tradeoffs between structures like arrays vs. linked lists or hash tables vs. trees will show up repeatedly in exam questions, coding challenges, and real-world development.

Think of data structures as tools in a toolbox. A hammer and a screwdriver both connect things, but choosing the wrong one makes your job harder. The same principle applies here: time complexity, memory usage, and access patterns determine which structure fits your problem. Don't just memorize definitions—know what problem each structure solves best and what tradeoffs you're accepting when you choose it.


Linear Structures with Sequential Access

These structures store elements in a sequence where position matters. The key distinction is whether elements are stored contiguously in memory or linked through references.

Arrays

  • Fixed-size, contiguous memory storage—all elements sit next to each other in memory, enabling direct access via index calculation
  • O(1)O(1) access time makes arrays ideal when you need to read or update elements at known positions frequently
  • O(n)O(n) insertion/deletion because shifting elements is required; choose arrays when your data size is predictable and changes are rare

Linked Lists

  • Node-based structure where each node contains data plus a pointer to the next node (and previous, in doubly linked lists)
  • O(1)O(1) insertion/deletion when you have a reference to the position—no shifting required, just rewire the pointers
  • O(n)O(n) access time since you must traverse from the head; choose linked lists when you'll modify the structure frequently but don't need random access

Compare: Arrays vs. Linked Lists—both store sequences, but arrays win for access speed (O(1)O(1) vs. O(n)O(n)) while linked lists win for modification flexibility (O(1)O(1) vs. O(n)O(n)). If an exam asks about tradeoffs between memory locality and dynamic sizing, this is your go-to comparison.


Abstract Data Types with Restricted Access

These structures limit how you interact with data to enforce specific behaviors. The restriction itself is the feature—it guarantees predictable ordering of operations.

Stacks

  • Last In, First Out (LIFO)—the most recently added element is always the first removed, like a stack of plates
  • Two core operations: push adds to the top, pop removes from the top, both in O(1)O(1) time
  • Essential for recursion and backtracking—function call stacks, undo operations, and expression parsing all rely on LIFO behavior

Queues

  • First In, First Out (FIFO)—elements are processed in the order they arrived, like a line at a store
  • Two core operations: enqueue adds to the back, dequeue removes from the front, both in O(1)O(1) time
  • Critical for scheduling and BFS—task queues, print spoolers, and breadth-first search algorithms depend on FIFO ordering

Compare: Stacks vs. Queues—both restrict access to enforce ordering, but LIFO (stacks) handles nested/recursive problems while FIFO (queues) handles sequential processing. When asked about algorithm design, identify whether you need to "go back" (stack) or "process in order" (queue).


Key-Value Structures for Fast Lookup

These structures optimize for retrieval speed by using keys to locate data. The magic is in the hash function or key comparison that bypasses sequential searching.

Hash Tables

  • Key-value storage using a hash function—the function converts keys to array indices for near-instant lookup
  • Average O(1)O(1) for search, insert, delete—but can degrade to O(n)O(n) when many keys hash to the same index (collisions)
  • Collision resolution is critical: techniques like chaining (linked lists at each index) or open addressing (probing for empty slots) maintain performance

Dictionaries

  • Key-value pairs with unique keys—conceptually similar to hash tables, often implemented using them under the hood
  • O(1)O(1) average operations make dictionaries ideal for configuration data, caching, and any associative mapping
  • Language-specific implementations (Python's dict, Java's HashMap) add features like ordering guarantees or type constraints

Sets

  • Collection of unique elements with no duplicates allowed and typically no guaranteed order
  • Supports mathematical operations: union, intersection, and difference enable powerful data comparisons
  • Implemented via hash tables or trees—membership testing runs in O(1)O(1) average or O(logn)O(\log n) depending on implementation

Compare: Hash Tables vs. Sets—both use hashing for fast lookup, but hash tables store key-value pairs while sets store only keys (values). Use sets when you only care about membership ("Is X present?") and hash tables when you need to associate data with keys.


Hierarchical Structures for Organized Data

These structures represent parent-child relationships. The branching nature enables efficient searching and natural modeling of real-world hierarchies.

Trees

  • Hierarchical structure with root and child nodes—each node (except root) has exactly one parent, creating a branching pattern
  • Binary Search Trees (BSTs) maintain sorted order: left children are smaller, right children are larger, enabling O(logn)O(\log n) search
  • Models real hierarchies like file systems, DOM trees, and organizational charts where relationships matter

Heaps

  • Specialized tree satisfying the heap property—in a max-heap, parents are always ≥ children; in a min-heap, parents are always ≤ children
  • O(1)O(1) access to max/min element plus O(logn)O(\log n) insertion and deletion—perfect when you repeatedly need the extreme value
  • Powers priority queues and heapsort—any problem requiring "always process the highest/lowest priority item" benefits from heaps

Compare: BSTs vs. Heaps—both are tree-based, but BSTs optimize for searching any value (O(logn)O(\log n)) while heaps optimize for accessing the single max/min (O(1)O(1)). Choose BSTs for general lookup, heaps for priority-based processing.


Relationship-Based Structures

These structures model connections between entities where relationships are as important as the entities themselves. The flexibility to represent any connection pattern makes graphs uniquely powerful.

Graphs

  • Vertices connected by edges—unlike trees, graphs can have cycles, multiple paths, and complex connection patterns
  • Directed vs. undirected determines whether edges have one-way or two-way relationships, affecting traversal algorithms
  • Models networks of all kinds: social connections, road maps, internet routing, dependency chains—anywhere relationships matter

Compare: Trees vs. Graphs—trees are actually a restricted type of graph (connected, acyclic, one path between nodes). Use trees when hierarchy is clear; use graphs when relationships are more complex or cyclical.


Quick Reference Table

ConceptBest Examples
O(1)O(1) random accessArrays
O(1)O(1) insertion/deletionLinked Lists, Hash Tables
LIFO orderingStacks
FIFO orderingQueues
Key-value lookupHash Tables, Dictionaries
Unique element storageSets
Hierarchical relationshipsTrees
Priority-based accessHeaps
Complex relationshipsGraphs

Self-Check Questions

  1. You need to implement an "undo" feature that reverses the last 10 user actions. Which data structure is most appropriate, and why does its access pattern fit this problem?

  2. Compare arrays and linked lists: under what conditions would you choose a linked list despite its slower access time?

  3. Both hash tables and binary search trees can store key-value pairs. What are the tradeoffs in terms of time complexity and when might you prefer one over the other?

  4. A hospital emergency room needs to process patients by severity rather than arrival time. Which data structure should manage the patient queue, and what property makes it suitable?

  5. Explain why a social network's friend connections would be modeled as a graph rather than a tree. What specific graph characteristics apply here?