📐Subset Sum

Question

Given a list of positive integers, return the number of subsets that sum to the target value.

Examples

I: nums = [1, 2, 3, 4, 5, 6], target = 8
O: 4 (because [1, 2, 5], [1, 3, 4], [2, 6], [3, 5]) 

I: nums = [1, 2, 3, 4, 5, 6], target = 22
O: 0 

I: nums = [7, 8, 10, 3, 5], target = 20
O: 2

Analysis

The brute force solution is just generating all subsets recursively then checking the sum and updating a count. The key observation is that some subproblems overlap. For example, search_subset(target=5) can overlap search_subset(target=4) with and search_subset(target=2). We can give DP a shot.

DP

1. Define by prefix 
dp[i][s] = number of subsets that sum to value s using prefix i elements nums[0:i]
(similar to the subset recursive search definition)

2. Solve be prefix 
for the current ith element 
    dp[i][s] = AND(dp[i-1][s], dp[i-1][s-nums[i]])
    (not use the current element, use the current element)

3. Base cases 
dp[0][0] = 1 (empty subset sums to 0)

4. Topo order 
for s from 0 to target

5. Final answer 
dp[n][target]

Code

def subset_sum(nums: List[int], target: int) -> int:
    n = len(nums)

    dp: List[List[int]] = [[0] * (target + 1) for _ in range(n + 1)]
    dp[0][0] = 1

    for i in range(n):
        for s in range(target + 1):
            if dp[i][s] == 0:
                continue
            # not use item i
            dp[i + 1][s] += dp[i][s]
            # use item i
            if s + nums[i] > target:
                continue
            dp[i + 1][s + nums[i]] += dp[i][s]

    return dp[n][target]

Time Complexity

time O(n * target) space O(n * target)

State Graph

0: [1, 0, 0, 0, 0, 0, 0, 0, 0]
1: [1, 1, 0, 0, 0, 0, 0, 0, 0]
2: [1, 1, 1, 1, 0, 0, 0, 0, 0]
3: [1, 1, 1, 2, 1, 1, 1, 0, 0]
4: [1, 1, 1, 2, 2, 2, 2, 2, 1]
5: [1, 1, 1, 2, 2, 3, 3, 3, 3]
6: [1, 1, 1, 2, 2, 3, 4, 4, 4]
Subset Sum DAG

Space Optimization

The value keys can be sparse. For example, for nums = [1, 100, 1001, 2002], target = 3003 , we don't have to store all the values between 1 to 3003. We only have to store the valid subset sum values in the memoization table. To optimize the data structure for memoization, we can use a nested dictionary rather than a 2D array.

Optimized Code

def subset_sum(nums: List[int], target: int) -> int:
    dp: Dict[int, Dict[int, int]] = defaultdict(lambda: defaultdict(int))
    dp[0][0] = 1

    n = len(nums)
    for i in range(n):
        for k in dp[i]:
            # not use item i
            dp[i + 1][k] += dp[i][k]
            # use item i
            if k + nums[i] > target:
                continue
            dp[i + 1][k + nums[i]] += dp[i][k]

    return dp[n][target]

Time Complexity

min_target = min(2^n, target)

time O(n * min_target) space O(n * min_target)

More Space Optimization

We only have optimized space from the target value dimension so far. We can also optimize the ith element index dimension.

We can see from the recursive step

dp[i][s] = AND(dp[i-1][s], dp[i-1][s-nums[i]])

only depends on the previous index of the state.

We can mod the ith dimension to only store 2 layers of state at a time.

Optimized Complexity

time O(n * min_target) space O(min_target)

Last updated