# Longest Substring with At Most K Distinct Characters

Rework later. Don't write code when you travel.

## Question ([LC.340](https://leetcode.com/problems/longest-substring-with-at-most-k-distinct-characters/#/description))

> Given a string, find the longest substring that contains at most K distinct characters.

## Example

```
I: s = "eceba", k = 2
O: 3 because "ece"
I: s = "eceba", k = 3
O: 4 bacause "eceb"
```

## Analysis

The question asks for a longest substring with a constraint. Typically, this is a substring/window problem.

## Brute Force Approach

From the start of every string, have a hashset, stops when size exceeds k. Store an int to keep track of the longest substring under this constraint.

Here is the pseudo-code:

```
func longestSubKD(str, k)
    if s == null || len(s) == 0
        return 0

    int maxLenght = 0
    Set<Character> charSet = new HashSet

    for i from 0 to len(s)
        charSet.clear()
        for j from i to len(s) and charSet.size() <= k // comment
            charSet.add(str.charAt(j))
        end for
        maxLength = Math.max(maxLength, j-i) 
        // since the for loop breaks after the condition becomes false
        // (one step beyond the constraint), j-i+1 the +1 is omitted
    end for

    return maxLength
end func
```

The pseudo-code looks good but there's one logical error.

## How For Loop Works

```java
int i = 0;
int j = i;

for (i = 0; i < 5 && (j - 1) <= 3; i++) {
    j = i;
}
System.out.println("j: " + j + " i: " + i);

for (j = 0; j - 1 <= 3; j++) {
}
System.out.println("the other j: " + j);
```

What is the output for `i`, `j`, and `the other j`? They should all output `5` right? Noooo ʕº̫͡ºʔ\
In fact, `i = 5, j = 4!, the other j = 5`. How so? Let's examine how a for loop actually works

```
Init: i = 0, j = i
Condition: i < 5 && (j - 1) <= 3
Conditional Code: j = i
Increment: i++

             --F-> end loop
             |
Init ----> Condition --T-> ConCode ----> Increment
             /                               |
             |                               /
             --------------------------------
```

You see we get to increment `i = 5`, then the condition becomes `false` and the loop ends. We never get to the conditional code in this case.

## Brute Force Code

Very brute force O(n^3)

```python


# generate every substring 

# then calcualte the longest substring that satisfies the condition 


def lengthOfLongestSubstringKDistinct(self, s: str, k: int) -> int:
    
    n = len(s)
    
    max_length = 0 
    
    if n == 0 or k <= 0:
        return max_length
    
    
    for i in range(n):
        for j in range(i+1, n+1):
            current_substr = s[i:j]
            current_set = set(current_substr)
            if len(current_set) > k:
                break
            if len(current_substr) > max_length:
                max_length = len(current_substr)
    
    return max_length 
    
```

brute force O(n^2)&#x20;

```python
def lengthOfLongestSubstringKDistinct(self, s: str, k: int) -> int:
    
    n = len(s)
    
    max_size = 0 
    
    if n == 0 or k <= 0:
        return max_size
    

    
    for i in range(n):
        
        current_set = set()
        
        for j in range(i, n):
            
            current_set.add(s[j])
            
            if len(current_set) > k:
                break
            
            current_window_size = j - i + 1 
            
            if current_window_size > max_size:
                max_size = current_window_size
    
    return max_size 
    
```

```java
public int lengthOfLongestSubstringKDistinct(String str, int k) {
    if (k <= 0 || str == null || str.length() == 0) {
        return 0;
    }

    int maxLength = 0;
    Set<Character> charSet = new HashSet<>();

    int i, j;
    for (i = 0; i < str.length(); i++) {
        charSet.clear();
        for (j = i; j < str.length(); j++) {
            charSet.add(str.charAt(j));
            if (charSet.size() > k) {
                break;
            }
        }
        maxLength = Math.max(maxLength, j-i);
    }

    return maxLength;
}
```

The time complexity is `O(n^2)`. A relatively long string will result in Time Limit Exceeded.

## Sliding Window Approach

Observe the following:

```
I: s = "eceba", k = 3
S  e c e b a
L1 i---->j
L2   i---->j
L3     i-->j
L4       i->j
```

Moving `j` back to `i` is not necessary. Just extend the window as we go.

Set is not enough. We need a map for the frequency of each character.

## Sliding Window Code

```
Extend - add new key to map or increment the value
Maintain - check the current window is the max length
Shrink - decrement the value or evict the key if value is 0
```

```java
public int lengthOfLongestSubstringKDistinct(String s, int k) {
    // validate input
    if (s == null || k < 0) {
        return -1;
    }
    // init map, key is character in s, value is its occurence
    Map<Character, Integer> charMap = new HashMap<>();
    // sliding window
    int maxLength = 0;
    int j = 0;
    for (int i = 0; i < s.length(); i++) {
        // extend
        char curChar;
        while (j < s.length()) {
            curChar = s.charAt(j);
            if (charMap.containsKey(curChar)) {
                charMap.put(curChar, charMap.get(curChar) + 1);
            } else {
                if (charMap.size() < k) {
                    charMap.put(curChar, 1);
                } else {
                    break; // and condition in the while loop
                }
            }
            j++;
        }
        // update
        maxLength = Math.max(maxLength, j - i);
        // shrink
        char removeChar = s.charAt(i);
        if (charMap.get(removeChar) != null && charMap.get(removeChar) > 1) {
            charMap.put(removeChar, charMap.get(removeChar) - 1);
        } else {
            charMap.remove(removeChar);
        }
    }
    return maxLength;
}
```

```python
# because you have to extend to get a longer answer 
# compute the substrings in between is not going to yield a better solution  

def lengthOfLongestSubstringKDistinct(self, s: str, k: int) -> int:
    
    n = len(s)
    
    max_size = 0 
    
    if n == 0 or k <= 0:
        return max_size
    
    # char is the just str with len 1 
    letter_dict: Dict[str, int] = {}
    j = 0
    
    for i in range(n):
        
        # extend 
        # exit when there is one over 
        while j < n:
            current_letter = s[j]
            if current_letter in letter_dict:
                letter_dict[current_letter] = letter_dict[current_letter] + 1
                j += 1                    
            else:
                if len(letter_dict) == k:
                    break
                
                letter_dict[current_letter] = 1
                j += 1
        
        # check
        current_window_size = j - i
        max_size = max(max_size, current_window_size)
        
        # shrink 
        first_letter = s[i]
        if letter_dict[first_letter] > 1:
            letter_dict[first_letter] = letter_dict[first_letter] - 1
        else:
            letter_dict.pop(first_letter)
    
    return max_size 
    
```

## Complexity

Only slide the window one way so O(n) time. O(k) space.

## Follow Up

Data stream


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://zedive.gitbook.io/project-l/part-1/basic_data_structure/array_and_string/two-pointers/window/longest-substring-with-at-most-k-distinct-characters.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
