We can consider trie as a space efficient hash table for words. It supports add(word) and contains(word) in O(len(word)) time. Trie saves space because it stores words with common prefixes under the same path.
Implement Trie ()
Implement a trie with insert, search, and startsWith methods.
insert - Inserts a word into the trie
search - Returns if the word is in the trie
startsWith - Returns if there is any word in the trie that starts with the given prefix
Example
Trie trie = new Trie();
trie.insert("lol");
if (trie.search(word)) freqMap.get(word)++;
if (trie.startsWith(prefix)) freqMap.get(prefix)++;
Analysis
Trie is a n-ary tree depending on the number of available characters. Each node should hold its children map, (and its own key?), and isWord boolean. Internally, we want to store a dummyRoot to signify the empty string prefix.
Code
public class Trie {
private class TrieNode {
Map<Character, TrieNode> children;
Boolean isWord;
public TrieNode() {
children = new HashMap<Character, TrieNode>();
isWord = false;
}
}
TrieNode dummyRoot;
public Trie() {
dummyRoot = new TrieNode();
}
/**
* Inserts a word into the trie.
*/
public void insert(String word) {
TrieNode current = dummyRoot;
for (int i = 0; i < word.length(); i++) {
char letter = word.charAt(i);
if (!current.children.containsKey(letter)) {
current.children.put(letter, new TrieNode());
}
current = current.children.get(letter);
}
current.isWord = true;
}
/**
* Returns if the word is in the trie.
*/
public boolean search(String word) {
return searchPrefix(word) != null && searchPrefix(word).isWord;
}
/**
* Returns if there is any word in the trie that starts with the given prefix.
*/
public boolean startsWith(String prefix) {
return searchPrefix(prefix) != null;
}
/**
* Search the prefix tree for the word
* @return the node that matches the last character in word, null if no match
*/
private TrieNode searchPrefix(String word) {
TrieNode travel = dummyRoot;
for (int i = 0; i < word.length(); i++) {
char letter = word.charAt(i);
if (travel.children.containsKey(letter)) {
travel = travel.children.get(letter);
} else {
return null;
}
}
return travel;
}
}
The search query can contain . which matches any character.
Before, we use a travel pointer to travel down a path (just like traversing a linked list). Now, in order to support ., we need to explore an entire subtree to find a match. Preform a DFS on all the possible children at that level and then figure out one path that matches with the given word.
Code
/**
* Returns if the word is in the data structure.
* A word could contain the dot character '.' to represent any one letter.
*/
public boolean search(String word) {
return searchWord(word, dummyRoot, 0);
}
private boolean searchWord(String word, TrieNode curNode, int curIndex) {
// base case
if (word.length() == curIndex) {
return curNode.isWord;
}
// search
char curLetter = word.charAt(curIndex);
if (curLetter == '.') {
for (Character match : curNode.children.keySet()) {
if (searchWord(word, curNode.children.get(match), curIndex + 1)) {
return true;
}
}
} else {
if (curNode.children.containsKey(curLetter)) {
return searchWord(word, curNode.children.get(curLetter), curIndex + 1);
} else {
return false;
}
}
return false;
}
Follow Up #2
The search query can contain * which can match 0 or more previous character.