What is AP Computer Science A unit 4?
Unit 4 is the largest unit in AP CSA by exam weight. It introduces three data structures for storing groups of related values, arrays, ArrayLists, and 2D arrays, and then builds a library of standard algorithms on top of them. The unit also opens with ethical questions about privacy and bias that apply whenever programs collect and process real data.
Unit 4 teaches you how to store, access, traverse, search, sort, and recursively process collections of data in Java, while also asking you to think critically about the social implications of data collection.
Three core data structures
A 1D array has a fixed length set at creation and uses arr[i] indexing. An ArrayList is dynamically sized and uses methods like add(), remove(), get(), and set(). A 2D array is an array of arrays accessed with arr[row][col] and traversed with nested loops.
Standard algorithms
Every data structure supports the same family of algorithms: find min or max, compute sum or average, count or check elements for a property, detect duplicates, shift or reverse elements, and insert or delete. Knowing these patterns for arrays, ArrayLists, and 2D arrays is the core skill of the unit.
Searching, sorting, and recursion
Linear search checks elements one at a time. Binary search cuts a sorted collection in half each step. Selection sort swaps the minimum into place; insertion sort shifts elements to insert each new value. Merge sort recursively splits and merges. Recursion itself requires tracing base cases and recursive calls through a call stack.
Why data collections handle the examArrays, ArrayLists, and 2D arrays are the building blocks of almost every non-trivial Java program. The AP exam tests whether you can read code that manipulates these structures, predict its output, identify bugs, and write new algorithms. Because this unit accounts for 30-40% of the exam, a strong grasp of traversal patterns, index arithmetic, and algorithm tracing pays off across both the multiple-choice and free-response sections.
Unit 4 review notes
4.1
Data Ethics and Data Sets
Before writing any code, AP CSA asks you to think about the human side of data. Topic 4.1 covers privacy risks and algorithmic bias; Topic 4.2 introduces the idea of a data set and how to plan algorithms that process it.
- Personally identifiable information (PII): Data that can identify a specific individual; programmers must protect it and minimize what they collect.
- Algorithmic bias: Systemic errors in a program that produce unfair outcomes for a specific group, often caused by biased or incomplete training data.
- Data set fitness: A data set must match the question being asked; using an unrelated or incomplete data set leads to incorrect conclusions.
- Element-wise processing: Data sets are analyzed by accessing and processing one value at a time, then aggregating results such as sums, counts, or averages.
- Diagram planning: Charts and tables can represent a data set visually and help plan the algorithm before writing code.
Can you explain two ways algorithmic bias can enter a program, and describe what makes a data set appropriate for a given question?
| Issue | Cause | Programmer response |
|---|
| Privacy breach | Storing unnecessary PII | Collect only what is needed; protect stored data |
| Algorithmic bias | Biased or unrepresentative training data | Audit data collection method before use |
| Incomplete data | Missing records or fields | Verify data set completeness before drawing conclusions |
4.3
Array Creation, Access, and Traversal
A 1D array stores multiple values of the same type under one variable. Its length is fixed at creation. You access elements with arr[i], where valid indices run from 0 to arr.length - 1. Traversal uses an indexed for loop, a while loop, or an enhanced for loop depending on whether you need to modify elements.
- Default values: When created with new, int elements default to 0, double to 0.0, boolean to false, and reference types to null.
- ArrayIndexOutOfBoundsException: Thrown at runtime when an index is negative or greater than or equal to arr.length.
- Enhanced for loop limitation: Assigning a new value to the enhanced for loop variable does not change the array; use an indexed loop to modify elements.
- Object references in arrays: When an array stores objects, the enhanced for loop variable holds a reference; you can call methods on it to change the object's state.
- Initializer list: Syntax like int[] nums = {3, 7, 2} creates and populates an array in one statement without using new.
Trace this loop: int[] a = {4, 1, 3}; for (int i = a.length - 1; i >= 0; i--) System.out.print(a[i] + " "); What prints?
| Loop type | Can modify elements? | Needs index? | Best use |
|---|
| Indexed for loop | Yes | Yes | Modify elements, access by position |
| While loop | Yes | Yes | Variable step size or early exit |
| Enhanced for loop | No (primitives) | No | Read-only traversal of all elements |
4.5
Standard Array Algorithms
Every standard array algorithm is a traversal with a specific accumulation or comparison pattern. Know how to implement and trace each one.
- Min/max tracking: Initialize a variable to the first element, then update it whenever a smaller or larger element is found during traversal.
- Sum and average: Accumulate a running total in a loop; divide by arr.length after the loop for the average.
- Existential and universal checks: Use a boolean flag or early return: any-element checks return true on first match; all-element checks return false on first failure.
- Shift and rotate: Shifting moves elements left or right by one position; rotating wraps the displaced element to the other end.
- Reverse in place: Swap arr[i] and arr[arr.length - 1 - i] for i from 0 to arr.length / 2 - 1.
Write the loop condition and body to find the index of the maximum value in an int array named scores.
4.6
Text Files and Wrapper Classes
Topic 4.6 covers reading persistent data from a file using File and Scanner. Topic 4.7 introduces Integer and Double wrapper classes, which are required when storing primitives in an ArrayList.
- File and Scanner pattern: Create a File object with the filename string, pass it to a Scanner constructor, then use hasNext() and nextInt()/nextDouble()/next() in a while loop.
- throws IOException: Any method that opens a file must declare throws IOException in its header; if the file is not found, the program terminates.
- Autoboxing: Java automatically converts a primitive int or double to its wrapper Integer or Double when the context requires an object, such as adding to an ArrayList.
- Unboxing: Java automatically converts an Integer or Double back to its primitive when the context requires a primitive, such as arithmetic.
- Integer.parseInt / Double.parseDouble: Static methods that convert a String to an int or double; useful when reading numeric data from a file as text.
What two import statements are needed to read from a file, and what happens if the file name passed to the File constructor does not exist?
| Wrapper class | Primitive | Parse method | Immutable? |
|---|
| Integer | int | Integer.parseInt(String) | Yes |
| Double | double | Double.parseDouble(String) | Yes |
4.8
ArrayList Methods, Traversal, and Algorithms
An ArrayList is a dynamically sized list of object references from java.util. Its five core methods are size(), add(E), add(int, E), get(int), set(int, E), and remove(int). Traversal and algorithms follow the same patterns as arrays but require careful handling when the list size changes during a loop.
- Generic type ArrayList<E>: Specifying the element type, such as ArrayList<String>, lets the compiler catch type errors at compile time instead of runtime.
- remove(int) shifting: Removing an element shifts all later elements left by one; if you remove inside an indexed for loop, decrement i after the removal to avoid skipping an element.
- ConcurrentModificationException: Thrown when you add or remove elements inside an enhanced for loop over an ArrayList; use an indexed for loop instead.
- set() return value: set(int index, E element) replaces the element at index and returns the element that was previously stored there.
- Multi-list algorithms: Some algorithms require traversing two ArrayLists simultaneously, such as merging two sorted lists or comparing corresponding elements.
An ArrayList<Integer> contains [5, 3, 8, 3, 2]. Write the loop that removes all elements equal to 3 without skipping any element.
| Method | Returns | Effect on size |
|---|
| size() | int | None |
| add(E) | void | Increases by 1 |
| add(int, E) | void | Increases by 1, shifts right |
| remove(int) | E (removed element) | Decreases by 1, shifts left |
| set(int, E) | E (previous element) | None |
4.11
2D Arrays: Creation, Traversal, and Algorithms
A 2D array is an array of arrays. You create it with new int[rows][cols] or an initializer list, access elements with arr[row][col], get the row count from arr.length, and the column count from arr[0].length. Nested loops traverse in row-major or column-major order.
- Row-major vs column-major order: Row-major: outer loop over rows, inner loop over columns. Column-major: outer loop over columns, inner loop over rows. The order matters when the algorithm depends on sequence.
- Enhanced for loop over 2D array: The outer enhanced for variable has type int[] (a row); the inner enhanced for variable has the element type. Assigning to the inner variable does not change the array.
- arr.length and arr[0].length: arr.length gives the number of rows; arr[0].length gives the number of columns in the first row.
- Subsection algorithms: Standard algorithms like min, max, sum, and count can be applied to the entire 2D array or restricted to a specific row, column, or rectangular region.
- Default values in 2D arrays: Same as 1D: int defaults to 0, double to 0.0, boolean to false, reference types to null.
Write nested for loops that compute the sum of all elements in a 2D int array named grid with r rows and c columns.
| Traversal order | Outer loop variable | Inner loop variable | Typical use |
|---|
| Row-major | row index i | column index j | Process each row left to right |
| Column-major | column index j | row index i | Process each column top to bottom |
4.14
Linear Search
Linear search checks each element in order until the target is found or the collection is exhausted. It works on unsorted and sorted data and can start from either end. For 2D arrays, apply linear search row by row using nested loops.
- Return convention: Return the index where the target was found, or -1 if it was not found.
- Boolean flag pattern: An alternative to returning an index: set a boolean found = false before the loop and set it to true when the target is located.
- 2D linear search: Use an outer loop over rows and an inner loop over columns; check arr[i][j] against the target at each position.
- No sorted-order requirement: Unlike binary search, linear search does not require the data to be sorted.
- Early termination: Once the target is found, the loop can exit immediately using a return statement or break to avoid unnecessary comparisons.
Trace a linear search for the value 7 in the array [2, 9, 7, 4, 1]. How many comparisons are made before the value is found?
4.15
Selection Sort and Insertion Sort
Both algorithms sort in place and run in O(n^2) time. Selection sort minimizes swaps; insertion sort is efficient on nearly sorted data. The exam asks you to trace each pass, not write the full algorithm from scratch.
- Selection sort pass: Find the index of the minimum value in the unsorted portion, then swap it with the first element of the unsorted portion. Repeat until sorted.
- Insertion sort pass: Take the next element from the unsorted portion as the key, then shift sorted elements right until the correct position for the key is found, and insert it.
- Sorted vs unsorted portion: Both algorithms maintain a growing sorted prefix and a shrinking unsorted suffix. After k passes, the first k elements are in their correct relative positions.
- Stability: Insertion sort is stable (equal elements keep their original order); selection sort is not guaranteed to be stable.
- Tracing tip: Write out the array after each outer-loop pass to estimate score progress; the exam often asks for the state after a specific number of passes.
Trace selection sort on [5, 2, 8, 1, 4]. What does the array look like after two complete passes?
| Algorithm | Key operation | Swaps per pass | Best case |
|---|
| Selection sort | Find min, swap into position | Exactly 1 | O(n^2) comparisons |
| Insertion sort | Shift elements, insert key | 0 to many | O(n) when already sorted |
4.16
Recursion
A recursive method calls itself. Every recursive method needs at least one base case that stops the recursion and at least one recursive call that moves toward the base case. On the AP exam, you trace recursive calls and determine the return value; writing recursive code is outside exam scope.
- Base case: The condition under which the method returns a value directly without making another recursive call; prevents infinite recursion.
- Recursive call: The call to the same method with a modified argument that moves closer to the base case.
- Local variables per call: Each recursive call has its own set of local variables and parameters on the call stack, independent of other calls.
- Tracing technique: Work from the innermost base case outward, substituting return values back up the call chain to find the final result.
- Recursion vs iteration: Any recursive solution can be rewritten iteratively and vice versa; parameter values in recursion play the same role as loop control variables.
Trace mystery(4) where mystery(int n) returns 0 if n == 0, else returns n + mystery(n - 1). What is the return value?
4.17
Binary Search and Merge Sort
Topic 4.17 applies recursion to searching and sorting. Binary search requires sorted data and eliminates half the collection each step. Merge sort recursively splits a collection to single elements and merges them back in order. Both can be written iteratively or recursively; the exam asks you to trace them.
- Binary search precondition: The array or ArrayList must be sorted before binary search is applied; using it on unsorted data produces incorrect results.
- Mid-index calculation: mid = (low + high) / 2; compare the target to arr[mid] and eliminate the left or right half accordingly.
- Merge sort split phase: Recursively divide the collection in half until each subarray contains one element, which is trivially sorted.
- Merge sort merge phase: Combine two sorted subarrays into one sorted array by comparing front elements and placing the smaller one first.
- Efficiency comparison: Binary search is more efficient than linear search on sorted data; merge sort runs in O(n log n) time, faster than O(n^2) selection and insertion sort for large inputs.
Trace binary search for the value 14 in [2, 5, 8, 11, 14, 17, 20]. List the mid value checked at each step.
| Algorithm | Precondition | Strategy | Time complexity |
|---|
| Linear search | None | Check each element in order | O(n) |
| Binary search | Sorted data | Eliminate half each step | O(log n) |
| Selection sort | None | Swap min into place each pass | O(n^2) |
| Insertion sort | None | Insert each element into sorted prefix | O(n^2) worst, O(n) best |
| Merge sort | None | Split then merge recursively | O(n log n) |
Practice AP Computer Science A unit 4 questions
Try AP-style multiple-choice questions and written prompts after you review the notes.
QuestionConsider the following method that processes an ArrayList<Integer>.
public static void processList(ArrayList<Integer> data) {
int first = data.get(0);
data.add(first);
data.remove(0);
}
Which of the following describes the initial condition that must be met for this method to execute without throwing an exception?
The list data must be instantiated and have some elements.
The list data must be instantiated but can remain empty.
The list data must be declared but remain uninstantiated.
The list data must be instantiated and have null elements.
QuestionA coffee shop is designing a SalesTracker class to determine if playing classical music increases the sales of premium espresso drinks compared to playing pop music. Which of the following sets of instance variables provides the most appropriate data set?
private int[] espressoSalesClassical; private int[] espressoSalesPopMusic;
private int[] pastriesSalesClassical; private int[] pastriesSalesPopMusic;
private int[] espressoSalesMornings; private int[] espressoSalesAfternoon;
private int[] pastriesSalesMornings; private int[] pastriesSalesAfternoon;
4. The Warehouse class is used to manage the storage of boxes in a grid layout. A partial declaration of the Box class is shown below.
public class Box
{
public String getCategory()
{ }
public double getWeight()
{ }
}
public class Warehouse
{
private Box[][] storage;
public int rowWithMaxWeight(String category)
{ }
}
When an element of the two-dimensional array storage is accessed, the first index is used to specify the row and the second index is used to specify the column.
Write the Warehouse method rowWithMaxWeight. The method should traverse the 2D array storage to calculate the total weight of boxes matching the parameter category for each row. It should return the index of the row with the highest total weight for that category. If there are multiple rows with the same maximum total weight, return the smallest of those row indices. If no boxes of the specified category are found in the entire array, the method should return -1.
Suppose storage has the following contents. For each element, the first value is the category and the second value is the weight.
Row | 0 | 1 | 2 |
|---|
0 | "parts" 10.0 | "toys" 5.0 | "parts" 5.0 |
1 | "books" 2.0 | "books" 3.0 | "books" 1.0 |
2 | "parts" 20.0 | "toys" 2.0 | "parts" 2.0 |
3 | "toys" 10.0 | "toys" 10.0 | "parts" 1.0 |
Call: rowWithMaxWeight("parts")
Expected Return: should return 2
Explanation: Row 0 total: 15.0. Row 1 total: 0.0. Row 2 total: 22.0. Row 3 total: 1.0. The maximum is 22.0 at index 2.
Call: rowWithMaxWeight("toys")
Expected Return: should return 3
Explanation: Row 0 total: 5.0. Row 1 total: 0.0. Row 2 total: 2.0. Row 3 total: 20.0. The maximum is 20.0 at index 3.
Call: rowWithMaxWeight("clothing")
Expected Return: should return -1
Explanation: No boxes with category "clothing" exist in the array.
Complete method rowWithMaxWeight.
public int rowWithMaxWeight(String category)
3. The DailyReading class represents weather data collected on a single day. A partial declaration of the DailyReading class is shown below.
public class DailyReading
{
public double getTemperature()
{ }
public double getWindSpeed()
{ }
public boolean isRaining()
{ }
}
public class WeatherStation
{
private ArrayList<DailyReading> readings;
public double averageTempWithCondition(boolean requireRain, double minWind)
{ }
}
Write the WeatherStation method averageTempWithCondition. This method calculates and returns the average temperature of the valid readings in the readings list. A reading is considered valid if it meets the wind speed requirement (at least minWind) and the rain requirement (if requireRain is true, the reading must be raining; if requireRain is false, the rain status does not matter).
Suppose readings contains the following five DailyReading objects.
Index | 0 | 1 | 2 | 3 | 4 |
|---|
Temperature | 70.0 | 80.0 | 60.0 | 90.0 | 65.0 |
Wind Speed | 10.0 | 5.0 | 15.0 | 20.0 | 12.0 |
Is Raining | true | false | true | false | true |
Call: averageTempWithCondition(true, 11.0)
Explanation: The method considers only readings where isRaining() is true AND wind speed is >= 11.0. Index 0 fails the wind condition (10.0 < 11.0). Index 1 and 3 fail the rain condition. Index 2 (Wind 15.0, Rain true) and Index 4 (Wind 12.0, Rain true) are valid. The average of their temperatures (60.0 and 65.0) is 62.5.
Complete method averageTempWithCondition.
public double averageTempWithCondition(boolean requireRain, double minWind)