Longest Substring with At Most K Distinct Characters

Rework later. Don't write code when you travel.

Given a string, find the longest substring that contains at most K distinct characters.

Example

I: s = "eceba", k = 2
O: 3 because "ece"
I: s = "eceba", k = 3
O: 4 bacause "eceb"

Analysis

The question asks for a longest substring with a constraint. Typically, this is a substring/window problem.

Brute Force Approach

From the start of every string, have a hashset, stops when size exceeds k. Store an int to keep track of the longest substring under this constraint.

Here is the pseudo-code:

func longestSubKD(str, k)
    if s == null || len(s) == 0
        return 0

    int maxLenght = 0
    Set<Character> charSet = new HashSet

    for i from 0 to len(s)
        charSet.clear()
        for j from i to len(s) and charSet.size() <= k // comment
            charSet.add(str.charAt(j))
        end for
        maxLength = Math.max(maxLength, j-i) 
        // since the for loop breaks after the condition becomes false
        // (one step beyond the constraint), j-i+1 the +1 is omitted
    end for

    return maxLength
end func

The pseudo-code looks good but there's one logical error.

How For Loop Works

What is the output for i, j, and the other j? They should all output 5 right? Noooo ʕº̫͡ºʔ In fact, i = 5, j = 4!, the other j = 5. How so? Let's examine how a for loop actually works

You see we get to increment i = 5, then the condition becomes false and the loop ends. We never get to the conditional code in this case.

Brute Force Code

Very brute force O(n^3)

brute force O(n^2)

The time complexity is O(n^2). A relatively long string will result in Time Limit Exceeded.

Sliding Window Approach

Observe the following:

Moving j back to i is not necessary. Just extend the window as we go.

Set is not enough. We need a map for the frequency of each character.

Sliding Window Code

Complexity

Only slide the window one way so O(n) time. O(k) space.

Follow Up

Data stream

Last updated