💻Intro to Programming in R

Key R Data Structures

Why This Matters

Every operation you perform in R—whether it's cleaning messy datasets, running statistical models, or creating visualizations—depends on understanding how data is stored and accessed. You're being tested not just on what these structures are, but on when to use each one and how they behave differently. The difference between a vector and a list, or between a matrix and a data frame, determines whether your code runs smoothly or throws cryptic errors.

These structures fall into patterns based on two key questions: does it hold one data type or many, and how many dimensions does it have? Master these distinctions, and you'll know exactly which structure to reach for in any situation. Don't just memorize definitions; know what problem each structure solves and how to access its elements.

Homogeneous Structures: One Data Type Only

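These structures enforce type consistency: every element must be the same type (numeric, character, logical, etc.). If you mix types, R does not raise an error; it silently coerces every element to the most flexible type present. This constraint enables faster computations and predictable behavior.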
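A minimal sketch of what single-type storage means in practice. The variable names are illustrative only; the point is that combining different types with c() converts them to one common type.

```r
# Mixing a number, a string, and a logical: everything becomes character.
mixed_chr <- c(1, "a", TRUE)

# Mixing numbers and logicals: logicals become numeric (TRUE -> 1, FALSE -> 0).
mixed_num <- c(1.5, TRUE, FALSE)

class(mixed_chr)  # "character"
class(mixed_num)  # "numeric"
```

Checking class() after building a vector is a quick way to catch an accidental conversion, such as a numeric column that turned into text.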

Vectors

The fundamental building block of R—nearly everything in R is built on vectors, including single values (which are just vectors of length 1)
Created with c() to combine elements; supports numeric, character, logical, integer, and complex types
Element-wise operations allow you to perform calculations on entire vectors without writing loops—a core R programming pattern
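The points above can be sketched with a small, made-up vector of scores. The names scores, curved, and passed are hypothetical and chosen only for illustration.

```r
# Create a numeric vector with c().
scores <- c(82, 95, 71)

# Element-wise arithmetic: no loop needed.
curved <- scores + 5        # 87 100 76

# Element-wise comparison produces a logical vector.
passed <- curved >= 80      # TRUE TRUE FALSE

# Indexing is 1-based; a logical vector can also be used to filter.
third <- scores[3]          # 71
top   <- scores[passed]     # 82 95

# A single value is just a vector of length 1.
length(42)                  # 1
```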

Matrices

Two-dimensional arrays with rows and columns, created using matrix() with specified nrow and ncol arguments; values fill column by column unless you set byrow = TRUE
Supports linear algebra operations including matrix multiplication (%*%), transposition (t()), and inversion (solve())
Indexed with [row, col] notation—leave one blank to select entire rows or columns (e.g., mat[1, ] gets row 1)
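As a rough illustration of creation, indexing, and the linear algebra functions mentioned above, here is a sketch using a small 2 x 2 matrix with arbitrary values.

```r
# Values fill column by column by default:
#      [,1] [,2]
# [1,]    2    0
# [2,]    1    3
mat <- matrix(c(2, 1, 0, 3), nrow = 2, ncol = 2)

row1 <- mat[1, ]      # entire first row: 2 0
col2 <- mat[, 2]      # entire second column: 0 3
elt  <- mat[2, 1]     # single element: 1

mat_t   <- t(mat)           # transpose
mat_inv <- solve(mat)       # inverse (the matrix must be square and invertible)
product <- mat %*% mat_inv  # matrix multiplication; should be close to the identity
```

Note that * on two matrices multiplies element by element, while %*% performs true matrix multiplication.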

Arrays

Multi-dimensional extension of matrices—can have 3, 4, or more dimensions for complex data like RGB images or panel data
Created with array() specifying a dim vector that defines the size of each dimension
Indexed by position in each dimension—arr[1, 2, 3] accesses row 1, column 2, layer 3
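A small sketch of a three-dimensional array, assuming arbitrary dimensions of 2 rows, 3 columns, and 4 layers filled with the integers 1 to 24.

```r
# dim gives the size of each dimension: rows, columns, layers.
arr <- array(1:24, dim = c(2, 3, 4))

dim(arr)                  # 2 3 4

# Row 1, column 2, layer 3.
value <- arr[1, 2, 3]     # 15

# Leaving positions blank selects everything along that dimension:
# this is the whole first layer, a 2 x 3 matrix.
layer1 <- arr[, , 1]
```

For an RGB image, the third dimension would typically have size 3, one layer per color channel.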

Compare: Vectors vs. Matrices vs. Arrays—all enforce single-type storage, but differ in dimensionality (1D, 2D, nD). If a question asks about storing image pixel data across color channels, arrays are your answer.

Heterogeneous Structures: Mixed Data Types Welcome

These structures can hold different data types simultaneously. This flexibility makes them essential for real-world datasets where you need numbers, text, and categories together.

Lists

The most flexible structure in R—can hold vectors, data frames, other lists, even functions as elements
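Access elements with [[ ]] or, for named elements, $; single brackets [ ] return a sub-list, while double brackets extract the actual element
Powers functional programming through lapply() and sapply(), which apply a function to each element; lapply() always returns a list, while sapply() simplifies the result to a vector or matrix when it can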
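The difference between the bracket styles is easiest to see side by side. This is a sketch with a hypothetical student list; the element names are made up for illustration.

```r
# A list can mix types and lengths.
student <- list(name = "Ada", scores = c(90, 85, 77), enrolled = TRUE)

sub_list <- student[2]       # still a list, containing one element
vec      <- student[[2]]     # the numeric vector itself
same_vec <- student$scores   # $ extracts by name, like [[ ]]

class(sub_list)              # "list"
class(vec)                   # "numeric"

# lapply() returns a list; sapply() simplifies to a vector when possible.
lens  <- lapply(student, length)
means <- sapply(list(a = 1:3, b = c(10, 20)), mean)   # a = 2, b = 15
```

Arithmetic such as mean(student[2]) does not work on the sub-list, which is why [[ ]] or $ is usually what you want when you need the values.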

Data Frames

R's workhorse for tabular data—each column is a vector (enforcing type within columns), but columns can differ from each other
Created with data.frame() or imported from CSV/Excel files; the standard format for statistical analysis in R
Compatible with tidyverse functions like dplyr::filter(), select(), and mutate() for powerful data manipulation
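Here is a minimal sketch of a data frame with made-up values, using only base R so it runs without extra packages. The dplyr equivalents appear as comments and assume the dplyr package is installed.

```r
# Each column is a vector of one type; columns can differ from each other.
# stringsAsFactors = FALSE keeps text as character in older R versions too.
students <- data.frame(
  name     = c("Ada", "Ben", "Cy"),
  age      = c(21, 19, 23),
  enrolled = c(TRUE, FALSE, TRUE),
  stringsAsFactors = FALSE
)

col_types <- sapply(students, class)   # character, numeric, logical

# Keep rows where age is over 20.
# dplyr equivalent: dplyr::filter(students, age > 20)
older <- students[students$age > 20, ]

# Add a derived column.
# dplyr equivalent: dplyr::mutate(students, age_next_year = age + 1)
students$age_next_year <- students$age + 1
```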

Compare: Lists vs. Data Frames—both hold mixed types, but data frames enforce equal-length columns in a rectangular structure. Use lists when elements have different lengths or structures; use data frames for traditional datasets with observations and variables.

Special-Purpose Structures

These structures serve specific analytical needs that general structures can't handle as elegantly.

Factors

Designed for categorical data—stores categories as integer codes with associated labels, saving memory for repeated values
Created with factor() and essential for statistical modeling—R treats factors differently than character vectors in regression and ANOVA
Levels control ordering and grouping: use the levels argument of factor() to set a custom order, which is critical for correct plot axis ordering and contrast coding
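A short sketch of how levels behave, using hypothetical education data. Without a levels argument, factor() sorts the levels alphabetically, which is rarely the order you want for ordered categories.

```r
education <- c("Bachelor's", "High School", "PhD", "Master's", "High School")

# Default: levels are sorted alphabetically.
edu_default <- factor(education)

# Custom order set with the levels argument.
edu_levels  <- c("High School", "Bachelor's", "Master's", "PhD")
edu_ordered <- factor(education, levels = edu_levels)

levels(edu_ordered)              # the custom order above
codes <- as.integer(edu_ordered) # underlying integer codes: 2 1 4 3 1
counts <- table(edu_ordered)     # counts reported in level order
```

In a regression, the first level is treated as the reference category by default, so the level order changes which comparisons the coefficients represent.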

Compare: Factors vs. Character Vectors—both store text-like data, but factors carry level information that affects statistical functions and plotting. Convert to factors when you need R to recognize categories; keep as characters for free-form text.

Quick Reference Table

Concept	Best Examples
Single data type, 1D	Vectors
Single data type, 2D	Matrices
Single data type, 3D+	Arrays
Mixed types, flexible structure	Lists
Mixed types, rectangular/tabular	Data Frames
Categorical data with levels	Factors
Element-wise operations	Vectors, Matrices, Arrays
Statistical modeling input	Data Frames, Factors

Self-Check Questions

You need to store a dataset with columns for name (text), age (numeric), and enrolled (TRUE/FALSE). Which structure should you use, and why wouldn't a matrix work?
Compare and contrast how you would access the third element of a vector versus the third element of a list. What's the difference between mylist[3] and mylist[[3]]?
Which two structures are both two-dimensional but differ in their type flexibility? When would you choose one over the other?
A function returns multiple outputs of different lengths—a numeric vector of coefficients, a character string for the model name, and a data frame of residuals. Which structure should you use to store all of these together?
You're building a regression model and have a variable for education level (High School, Bachelor's, Master's, PhD). Should this be stored as a character vector or a factor? What happens differently in the model based on your choice?
