Group Anagrams

Definition

A word, phrase, or name formed by rearranging the letters of another, such as cinema, formed from iceman.

Valid Anagram (LC.242)

Given two strings, determine if they are anagrams.

Example

("anagram", "nagaram") => true
("rat", "car") => false

Approach 1

let N = len(word1) + len(word2) Sort then compare O(Nlog(N)) + O(N)

public boolean isAnagram(String word1, String word2) {
    if (word1.length() != word2.length()) {
        return false;
    }
    char[] arr1 = word1.toCharArray();
    char[] arr2 = word2.toCharArray();
    Arrays.sort(arr1);
    Arrays.sort(arr2);
    return Arrays.equals(arr1, arr2);
}

Approach 2

Use a hashtable then count O(N) + O(N)

public boolean isAnagram(String word1, String word2) {
    if (word1.length() != word2.length()) {
        return false;
    }
    int[] countTable = new int[26];
    // count
    for (int i = 0; i < word1.length(); i++) {
        countTable[word1.charAt(i) - 'a']++;
    }
    // decrease
    for (int i = 0; i < word2.length(); i++) {
        countTable[word2.charAt(i) - 'a']--;
    }
    // check
    for (int i = 0; i < 26; i++) {
        if (countTable[i] != 0) return false;
    }
    return true;
}

Group Anagrams (LC.49)

Given an array of words, return a list of lists of anagrams.

Example

I: ["eat", "tea", "tan", "ate", "nat", "bat"]
O: [ ["ate", "eat","tea"], ["nat","tan"], ["bat"] ]

Brute Force Approach

For each word in the list, we try to find its group by checking with the rest of the list. Keep a visited array so we don't visit the word more than once.

public List<List<String>> groupAnagrams(String[] words) {
    List<List<String>> anagramList = new ArrayList<>();
    boolean[] visited = new boolean[words.length];
    // check null
    if (words == null || words.length == 0) {
        return anagramList;
    }
    // search all word in words
    for (int i = 0; i < words.length; i++) {
        List<String> group = new ArrayList<>();
        if (visited[i]) {
            continue;
        }
        visited[i] = true;
        group.add(words[i]);
        for (int j = i + 1; j < words.length; j++) {
            if (visited[j]) {
                continue;
            }
            if (isAnagram(words[i], words[j])) {
                group.add(words[j]);
                visited[j] = true;
            }
        }
        anagramList.add(group);
    }
    return anagramList;
}

Let N be the number of words, K be the maximum length of a word. The time complexity is O(N) (outer loop) * O(N) (inner loop worst case compare against the rest) * O(2K) (hashtable + compare) = O(KN^2). This will pass 100 / 101 test cases on LC. How can we do better?

Sorting and Hashing

We can optimized to the inner loop to O(K) by hashing the sorted word. If the hashmap contains the word then we add it to the result list. If not, we want to put (sortedWord, sortedWord) to the map.

public List<List<String>> groupAnagrams(String[] words) {
    List<List<String>> groups = new ArrayList<>();
    if (words == null || words.length == 0) {
        return groups;
    }
    Map<String, List<String>> hashmap = new HashMap<>();
    for (String word : words) {
        char[] arr = word.toCharArray();
        Arrays.sort(arr);
        String key = String.valueOf(arr);
        // O(K) instead of inner loop
        if (hashmap.containsKey(key)) {
            hashmap.get(key).add(word);
        } else {
            hashmap.put(key, new ArrayList<String>());
            hashmap.get(key).add(word);
        }
    }
    for (String key : hashmap.keySet()) {
        groups.add(hashmap.get(key));
    }
    return groups;
}
# brute force 

# map[key] => value 

# key = sorted str, value is the list of acutal words 

# in the end return all the values 


def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
    
    n = len(strs)
    
    results = []
    
    if n == 0:
        return results
    
    # change this list to a set if dup is an issue  
    stem_dict: Dict[str, List[str]] = {}
    
    for word in strs:
        # sorted() returns a copy 
        # list.sort() modifies in place 
        stem = ''.join(sorted(word))
        
        if stem in stem_dict:
            stem_dict[stem].append(word)
        else:
            stem_dict[stem] = [word]
        
    
    return stem_dict.values()

Time complexity O(NKlogK) Space complexity O(NK)

Last updated