The Knapsack problem asks: given a set of items, each with a weight and a value, which items should you pick to maximize total value without exceeding a weight limit? It's one of the most important dynamic programming problems because it clearly demonstrates how DP can turn an exponential brute-force search into something tractable.

Problem Definition and Characteristics

Think of it this way: you have a backpack that can hold a fixed amount of weight, and a pile of items to choose from. Each item is all-or-nothing: you either take it or leave it (that's the "0/1" part). Your goal is to pack the most valuable combination without going over the weight limit.

NP-complete problem: there's no known polynomial-time algorithm that solves every instance optimally
A greedy approach (grabbing items with the best value-to-weight ratio first) often seems reasonable but can miss the true optimum
Real-world applications include cargo loading, investment portfolio selection, resource allocation, and cutting stock problems

Mathematical Formulation

The formal setup uses $n$ items, where item $i$ has value $v_i$ and weight $w_i$ , and the knapsack has capacity $W$ .

Objective: maximize total value: $\text{maximize} \sum_{i=1}^n v_i x_i$
Constraint: stay within capacity: $\sum_{i=1}^n w_i x_i \leq W$
Decision variables: $x_i \in \{0,1\}$ for each item $i$ (1 means you take it, 0 means you don't)

Example Scenario

Suppose you're loading a truck with a 1000 kg capacity and you have three categories of packages:

Electronics: high value, moderate weight
Books: low value, high weight
Clothing: moderate value, low weight

You can't take everything, so you need to find the combination that maximizes total value without exceeding 1000 kg. A greedy strategy might grab all the electronics first, but that could leave you with wasted capacity that a different mix would have used better.

Dynamic Programming Solution

DP Table Construction

The DP approach builds a 2D table where rows represent items (considered one at a time) and columns represent possible weight capacities from 0 up to $W$ .

Each cell $dp[i][w]$ stores the maximum value achievable using only the first $i$ items with a weight limit of $w$ . For each cell, you make a simple choice: is it better to skip item $i$ , or include it?

Recurrence relation:

$dp[i][w] = \max(dp[i-1][w],\ dp[i-1][w - w_i] + v_i)$

$dp[i-1][w]$ : the best value if you skip item $i$
$dp[i-1][w - w_i] + v_i$ : the best value if you take item $i$ (only valid when $w_i \leq w$ )

After filling the entire table, $dp[n][W]$ holds the answer. To find which items were selected, you backtrack through the table: if $dp[i][w] \neq dp[i-1][w]$ , then item $i$ was included.

Problem Definition and Characteristics, Untitled Document [web.engr.ship.edu]

Step-by-Step Walkthrough

Say you have 3 items and capacity $W = 5$ :

Item	Weight	Value
1	2	3
2	3	4
3	4	5

Initialize row 0 (no items) to all zeros
For item 1 (weight 2, value 3): at each capacity $w \geq 2$ , you can fit it, so $dp[1][w] = 3$ for $w \geq 2$
For item 2 (weight 3, value 4): at $w = 5$ , you can take item 2 and item 1 (combined weight 5, combined value 7), so $dp[2][5] = 7$
For item 3 (weight 4, value 5): at $w = 5$ , taking item 3 alone gives value 5, but keeping items 1+2 gives 7, so $dp[3][5] = 7$

The optimal solution is items 1 and 2 with total value 7.

Implementation

</>Code
function knapsack(values, weights, capacity):
    n = length(values)
    dp = create_2D_array(n + 1, capacity + 1, initialized to 0)

    for i from 1 to n:
        for w from 0 to capacity:
            if weights[i-1] <= w:
                dp[i][w] = max(dp[i-1][w], dp[i-1][w - weights[i-1]] + values[i-1])
            else:
                dp[i][w] = dp[i-1][w]

    return dp[n][capacity]

Edge cases to handle:

Capacity of 0: no items can be taken, answer is 0
No items available: answer is 0

Time complexity: $O(nW)$ where $n$ is the number of items and $W$ is the capacity. Note that this is pseudo-polynomial because $W$ is a numeric value, not the size of the input.

Optimization Techniques

Space optimization: Since each row only depends on the previous row, you can use a single 1D array of size $W+1$ instead of the full 2D table. This drops space from $O(nW)$ to $O(W)$ . The trick is to iterate weights right to left so you don't overwrite values you still need.
Memoization (top-down): Instead of filling the entire table, use recursion with a cache. This only computes subproblems that are actually needed, which can be faster when many cells would go unused.
Branch and bound: For some instances, pruning the search tree with upper-bound estimates can beat DP in practice, though worst-case complexity is still exponential.

Knapsack Problem Variations

Bounded and Unbounded Knapsacks

Bounded knapsack: each item has a limited number of copies available (say, at most $b_i$ copies of item $i$ ). You modify the 0/1 algorithm to consider taking 0, 1, 2, ... up to $b_i$ copies. One common optimization is to use binary representation of the bounds to reduce the number of "virtual items."

Unbounded knapsack: you can take as many copies of each item as you want. The DP formulation simplifies to a 1D array:

$dp[w] = \max(dp[w],\ dp[w - w_i] + v_i) \text{ for each item } i$

The classic coin change problem is an instance of the unbounded knapsack: given coin denominations, find the combination that reaches a target value using the fewest (or most valuable) coins.

Problem Definition and Characteristics, Category:Knapsack problem - Wikimedia Commons

Fractional and Multi-dimensional Knapsacks

Fractional knapsack: you can take a fraction of an item (e.g., 60% of item 3). This version is not solved with DP. A greedy algorithm works perfectly here: sort items by value-to-weight ratio, take as much of the best-ratio item as possible, then move to the next. This runs in $O(n \log n)$ due to sorting.

Multi-dimensional knapsack: instead of just a weight limit, you have multiple constraints (weight and volume, for example). The DP table gains an extra dimension for each constraint, which makes the problem significantly harder. A shipping container with both weight and volume limits is a typical example.

Advanced Variations

Multiple knapsack: distribute items among several knapsacks, each potentially with a different capacity. Think of assigning tasks to workers who each have limited time. This is NP-hard and typically requires approximation algorithms or heuristics.
Stochastic knapsack: item weights or values are uncertain (modeled as random variables). Portfolio optimization with uncertain returns is a natural example.
Online knapsack: items arrive one at a time, and you must decide immediately whether to take each one without knowing what comes next. Real-time ad placement with limited slots fits this model.

Complexity of Knapsack Solutions

Time Complexity

Variant	Time Complexity	Notes
Naive recursive (0/1)	$O(2^n)$	Tries every subset
DP (0/1)	$O(nW)$	Pseudo-polynomial
DP (unbounded)	$O(nW)$	1D table
DP (bounded)	$O(nWB)$	$B$ = max bound on copies
FPTAS approximation	$O(n^3 / \varepsilon)$	Guarantees $(1-\varepsilon)$ -optimal

The $O(nW)$ complexity is called pseudo-polynomial because it depends on the numeric value of $W$ , not the number of bits needed to represent it. If $W$ is astronomically large, this can still be slow.

Space Complexity

Standard 2D DP: $O(nW)$
Space-optimized 1D DP: $O(W)$
For large-scale problems (millions of items or huge capacities), the space-optimized version becomes essential

Complexity of Variations

Multi-dimensional knapsack: complexity grows exponentially with the number of constraint dimensions
Multiple knapsack: NP-hard, typically handled with approximation algorithms
Online knapsack: evaluated using competitive ratio analysis. No deterministic algorithm can achieve a competitive ratio better than 2 for arbitrary input sequences, meaning the best online algorithm might produce a solution worth at most half the offline optimum in the worst case.