AP Computer Science A Unit 4 ReviewData Collections

Verified for the 2027 examCompiled by AP educators
Pep mascot
Upgrade your Fiveable account to print any study guide

Download study guides as beautiful PDFs See example

Print or share PDFs with your students

Always prints our latest, updated content

Mark up and annotate as you study

Click below to go to billing portal → update your plan → choose Yearly→ and select "Fiveable Share Plan". Only pay the difference

Plan is open to all students, teachers, parents, etc
Pep mascot
Upgrade your Fiveable account to export vocabulary

Download study guides as beautiful PDFs See example

Print or share PDFs with your students

Always prints our latest, updated content

Mark up and annotate as you study

Plan is open to all students, teachers, parents, etc

AP Computer Science A Unit 4, Data Collections, covers arrays, ArrayLists, 2D arrays, sorting algorithms, and searching algorithms across 17 topics worth 30-40% of the AP exam. You'll work with both fixed-size arrays and dynamic ArrayLists, learning when each structure fits the job. AP CSA Unit 4 also gets into recursion, 2D array traversals, and the ethical side of collecting data at scale.

unit 4 review

AP Computer Science A Unit 4 is where you stop working with one variable at a time and start working with collections of data. You build and traverse arrays, ArrayLists, and 2D arrays, then run the classic algorithms on them, including linear search, binary search, selection sort, insertion sort, merge sort, and recursion. The single biggest idea is that a loop plus a data structure lets one short block of code process thousands of values. At 30-40% of the AP exam, this unit carries more weight than any other in the course.

What this unit covers

Data, ethics, and reading from files

  • Personal privacy is at risk whenever programs collect and store user data, so programmers have a responsibility to safeguard it when building new software.
  • Algorithmic bias means systemic, repeated errors in a program that produce unfair outcomes for a specific group of users. It often traces back to how a data set was collected, or to incomplete and inaccurate data.
  • A data set only answers the question it was collected for. Using it to extrapolate answers to a different question can produce wrong conclusions, so check the fit before you analyze.
  • Files store data that persists after a program stops running. You connect a file to your program with the File and Scanner classes, passing the file name as a String to the File constructor.

Arrays: fixed-size storage

  • An array stores multiple values of the same type, either primitives or object references. Its length is set at creation, cannot change, and is accessed with the length attribute (no parentheses).
  • Creating an array with new fills it with default values, 0 for int, 0.0 for double, false for boolean, and null for object references.
  • You traverse arrays with indexed for loops, while loops, or enhanced for loops. The enhanced for loop variable gets a copy of each element, so reassigning it does not change the array.
  • Standard array algorithms include finding a min or max, computing a sum or average, counting elements with a property, checking "at least one" versus "all," comparing consecutive pairs, detecting duplicates, shifting or rotating elements, and reversing order.

ArrayLists: dynamic-size storage

  • An ArrayList is mutable in size and holds object references only. That is why wrapper classes matter here. Integer and Double are immutable objects that wrap primitives, and Java autoboxes (int to Integer) and unboxes (Integer to int) automatically.
  • Use the generic form ArrayList<E>, like ArrayList<String> names = new ArrayList<String>();, so Java checks the element type for you.
  • Core methods to know cold are size(), add(obj), add(index, obj), get(index), set(index, obj), and remove(index).
  • Removing elements during a traversal takes care so you do not skip the element that slides into the removed slot. Changing an ArrayList's size inside an enhanced for loop causes a ConcurrentModificationException, and going past valid indices throws an IndexOutOfBoundsException.
  • The standard array algorithms (min/max, sum, count, duplicates, reverse) all apply to ArrayLists too, just written with method calls instead of bracket notation.

2D arrays: tabular data

  • A 2D array is literally an array of arrays, so grid[r] is a whole row and grid[r][c] is one element. Size is fixed at creation, and new fills it with defaults just like 1D arrays.
  • Nested loops traverse a 2D array. Row-major order finishes one entire row before moving to the next, which is the default pattern. Column-major order goes down each column first, which you get by swapping the loop nesting.
  • grid.length is the number of rows and grid[0].length is the number of columns. Mixing these up is the most common 2D bug.
  • 2D algorithms are the 1D algorithms applied to the whole grid or to a designated row, column, or subsection, including min/max, sum/average, counting, and "at least one" or "all" property checks.
  • To linear search a 2D array, access each row, then run a 1D linear search on that row.

Searching, sorting, and recursion

  • Linear search checks each element in order until it finds the target or runs out of elements. It works on unsorted data and can start from either end.
  • Binary search requires sorted data. It starts at the middle and eliminates half the remaining elements each step, which makes it typically far more efficient than linear search. It can be written iteratively or recursively.
  • Selection sort repeatedly picks the smallest element from the unsorted portion and swaps it into its correct and final position. Insertion sort takes the next unsorted element and slides it into its correct (but not necessarily final) spot in the sorted portion.
  • Merge sort is recursive. It splits the array into smaller subarrays until each has one element, then merges the sorted pieces back together in order.
  • A recursive method calls itself. It needs at least one base case to stop and at least one recursive call to make progress. Each call gets its own copy of local variables and parameters, and any recursive solution can also be written with iteration.

Unit 4, Iteration in Programming at a glance

Structure / AlgorithmWhat it isSizeKey syntax or moveWatch out for
1D arrayFixed list of same-type valuesSet at creationarr.length, arr[i]Defaults on creation; off-by-one at length
ArrayListResizable list of object referencesGrows and shrinkssize(), add, get, set, removeRemoving while traversing skips elements
2D arrayArray of arrays (table of data)Set at creationNested loops, grid[r][c]Rows vs. columns; grid.length is rows
Linear searchCheck every element in orderWorks unsortedOne loop, compare each elementReturns first match; slow on big data
Binary searchCut sorted data in half each stepNeeds sorted dataTrack low, mid, highOnly works on sorted collections
Selection sortSwap the min into final positionIn placeFind min of unsorted part, swapEach placed element is final
Insertion sortSlide next element into sorted partIn placeShift larger elements rightPosition is correct, not final
Merge sortSplit to single elements, merge sortedRecursiveDivide, then merge in orderIt is recursive, not iterative
RecursionA method that calls itselfn/aBase case plus recursive callNo base case means no stopping

Why Unit 4, Iteration in Programming matters in AP CSA

This unit is where the course's two threads finally meet. The control structures from earlier units and the objects you learned to build now combine into programs that actually process real data. Almost every interesting program, from a gradebook to a game board, is a collection plus a traversal.

  • The standard traversal algorithms (min/max, sum, count, search) are the building blocks of nearly every free-response answer in the course. You will reuse and remix them constantly.
  • Sorting and searching introduce the idea of algorithmic efficiency. Binary search beating linear search is your first concrete look at why algorithm choice matters.
  • The ethics topics tie code to consequences. Algorithmic bias and data privacy connect what you write in Java to how software affects real people.
  • Recursion gives you a second model of repetition. Seeing that loops and recursion can solve the same problems deepens how you think about both.

How this unit connects across the course

  • The for and while loops and boolean conditions from Selection and Iteration (Unit 2) are the engine here. Every traversal in this unit is a Unit 2 loop pointed at a collection, and nested 2D traversals are loops inside loops.
  • Objects, method calls, and String skills from Using Objects and Methods (Unit 1) come back through ArrayList methods, wrapper classes like Integer and Double, and recursive String traversals.
  • Class design from Class Creation (Unit 3) pays off when collections hold objects you wrote yourself. An ArrayList<Student> only makes sense if you can build the Student class, and FRQs love combining the two.
  • This unit is the destination the first three were building toward, so expect exam questions that blend all four. A typical FRQ asks you to write a method in a class you are given that traverses an array or ArrayList of objects.

Key syntax and algorithms

  • int[] arr = new int[10]; creates a fixed-size array with every element set to the default value for its type.
  • arr.length (attribute, no parentheses) gives an array's length; list.size() (method, with parentheses) gives an ArrayList's size.
  • for (int x : arr) is the enhanced for loop. The variable is a copy of each element, so use it for reading, not for replacing elements.
  • ArrayList<String> names = new ArrayList<String>(); builds an empty generic ArrayList.
  • add, add(index, obj), get, set, remove are the five ArrayList methods you need fluently, including what set and remove return.
  • new File(str) and Scanner connect a text file to your program so you can read persistent data.
  • Nested loops for 2D traversal, with grid.length rows and grid[0].length columns, in row-major or column-major order.
  • Linear search compares each element to the target in order; it is the only search that works on unsorted data.
  • Binary search repeatedly checks the middle of a sorted collection and discards half. Be able to list the exact elements checked.
  • Selection sort and insertion sort, traced step by step. Know what the array looks like after each pass of each one.
  • Merge sort, the recursive divide-and-merge sort. Know the splitting and merging pattern even though you trace it rather than write it from scratch.
  • Recursive methods with a base case and recursive call. Practice tracing calls by hand, tracking each call's own parameters.

Unit 4, Iteration in Programming on the AP exam

This content is worth 30-40% of the exam, the largest share of any unit. On the multiple-choice section, expect to trace code, including determining what an array or ArrayList holds after a traversal, predicting the output of a recursive method, counting how many elements binary search examines, and identifying the state of a list mid-sort. Off-by-one questions and row versus column questions on 2D arrays are classics.

On the free-response section, collections appear directly. One FRQ targets arrays and ArrayLists and another targets 2D arrays, and both ask you to write complete methods to a specification. That means producing working traversals from scratch, such as finding a value, counting matches, building a new list from an old one, or processing a grid by rows or columns. You will not be asked to write merge sort or binary search from memory, but you do need to determine the results of executing them step by step. Recursion shows up as trace-and-evaluate questions, not as code you write.

Essential questions

  • Why do programs need data structures at all, and when is a fixed-size array the right choice versus a resizable ArrayList?
  • How does the way data is organized (sorted versus unsorted, 1D versus 2D) change which algorithms you can use and how fast they run?
  • How can the same repetition be expressed as either a loop or a recursive call, and what does each version make easier?
  • Who is affected when a program is built on biased or low-quality data, and what responsibilities do programmers have to the people whose data they collect?

Key terms to know

  • Traversal: Using a loop (or recursion) to access all or an ordered sequence of elements in a collection.
  • Enhanced for loop: A loop whose variable receives a copy of each element in order, with no index needed.
  • Wrapper class: An object version of a primitive type, like Integer for int and Double for double, both immutable.
  • Autoboxing: The compiler's automatic conversion from a primitive to its wrapper object, like int to Integer; unboxing is the reverse.
  • Row-major order: Traversing a 2D array by finishing each entire row before moving to the next row.
  • Column-major order: Traversing a 2D array down each column before moving to the next column.
  • Linear search: Checking elements one at a time, in order, until the target is found or the collection is exhausted.
  • Binary search: Repeatedly checking the middle of a sorted collection and eliminating half the remaining elements each step.
  • Selection sort: Repeatedly swapping the smallest unsorted element into its correct and final position.
  • Insertion sort: Inserting each unsorted element into its correct (but not necessarily final) spot in the sorted portion.
  • Merge sort: A recursive sort that splits a collection into single-element pieces, then merges them back together in order.
  • Base case: The condition in a recursive method that halts the recursion instead of making another call.
  • Algorithmic bias: Systemic, repeated errors in a program that create unfair outcomes for a specific group of users.
  • IndexOutOfBoundsException: The runtime error thrown when you access an index outside a collection's valid range.

Common mix-ups

  • length versus size(). Arrays use the length attribute with no parentheses, ArrayLists use the size() method with parentheses, and Strings use the length() method. Mixing these up is a free point lost.
  • Selection sort versus insertion sort. Selection sort puts each chosen element in its final position immediately. Insertion sort places elements in the right relative order, but they can still shift later.
  • Enhanced for loops cannot modify the collection. Assigning to the loop variable does not change the array, and adding or removing from an ArrayList inside one throws a ConcurrentModificationException.
  • grid.length is the number of rows, not columns. In grid[r][c], the first index picks the row and the second picks the column. Reversing them flips your whole traversal.