String editing can also be viewed as an implicit graph where the vertex is a word and the edge is the letter transformation.
Word Ladder I ()
Given a startWord, an endWord, and a wordList, find the length of the shortest transformation path from start to end.
Constraints:
Only one letter can be changed at a time
each transformation must be a word in the list
Return 0 if no possible transformation path. All words have the same length. Lower case letters only.
Example
I:
startWord = "hit"
endWord = "cog"
wordList = ["hot","dot","dog","lot","log","cog"]
draw the word graph
O:
5 because hit -> hot -> dot -> dog -> cog or hit -> hot -> lot -> log -> cog
I:
beginWord = "hit"
endWord = "cog"
wordList = ["hot","dot","dog","lot","log"]
O:
0 because endWord does not exit in the dictionary
Level BFS
shortest transformation path = shortest path in a word graph
we have to construct this graph and search at the same time
stop if we visited every possible node
Code
public int ladderLength(String startWord,
String endWord,
List<String> wordList) {
if (startWord.equals(endWord)) return 0;
// preprogess
Set<String> wordSet = new HashSet<>();
for (String word : wordList) {
wordSet.add(word);
}
// init queue
Queue<String> queue = new LinkedList<>();
Set<String> visited = new HashSet<>();
queue.offer(startWord);
visited.add(startWord);
// level bfs
int levels = 1;
String current;
while (!queue.isEmpty()) {
int queueSize = queue.size();
for (int i = 0; i < queueSize; i++) {
current = queue.poll();
// search all neighbors
for (String neighbor : getNeighbors(current, wordSet)) {
if (visited.contains(neighbor)) {
continue;
}
if (neighbor.equals(endWord)) {
return levels + 1;
}
queue.offer(neighbor);
visited.add(neighbor);
}
}
levels++;
}
return 0; // explored the whole graph but cannot reach end goal
}
private List<String> getNeighbors(String currentWord, Set<String> wordSet) {
List<String> neighbors = new ArrayList<>();
for (int i = 0; i < currentWord.length(); i++) {
for (int l = 'a'; l < 'z'; l++) {
// if same letter continue;
char[] wordArr = currentWord.toCharArray();
wordArr[i] = (char) l;
String nextWord = new String(wordArr);
if (wordSet.contains(nextWord)) {
neighbors.add(nextWord);
}
}
}
return neighbors;
}
from collections import deque
# string.ascii_lowercase
LOWER_CASE_LETTERS = [
'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm',
'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z'
]
class Solution:
def _next_word(self, current_word: str) -> Iterable[str]:
for i in range(len(current_word)):
for l in LOWER_CASE_LETTERS:
yield current_word[:i] + l + current_word[i+1:]
def ladderLength(self, beginWord: str, endWord: str, wordList: List[str]) -> int:
if beginWord == endWord:
return 1
word_set = set(wordList)
if endWord not in word_set:
return 0
# init bfs
queue: Deque[str] = deque()
visited: Set[str] = set()
queue.append(beginWord)
visited.add(beginWord)
depth = 0
# start searching and creating the word graph
while len(queue) > 0:
breadth = len(queue)
for i in range(breadth):
# visit current word
current_word = queue.popleft()
if current_word == endWord:
return depth + 1
# expanding current node to adjacent nodes
# check visited in pre visit
for next_word in self._next_word(current_word):
if next_word not in word_set:
continue
if next_word in visited:
continue
queue.append(next_word)
visited.add(next_word)
depth += 1
return 0
Optimized BFS
The time complexity of a raw BFS is O(V+E). In this case, the number vertices are the valid words in the word list and the number edges are the brute force computed substitution for each letter for each word. The end result is something like O(n+n×26×l)where n is the length of the word list and lis the longest word length.
The idea to optimize is to shrink the implicit graph that we are constructing. One way to do it to precompute a dictionary graph with only valid edges in the beginning (i.e. "bot" will not be an edge to visit only "hot", "dot", "lot" are edges). This will shrink the brute force factor 26 significantly with the cost of a little more memory.
Precomputing is alway a good idea when the dictionary size is small in a search problem.
The dictionary graph looks like this for the first example.
from collections import defaultdict, deque
class Solution:
def _next_key(self, current_word: str) -> Iterable[str]:
for i in range(len(current_word)):
yield current_word[:i] + '*' + current_word[i+1:]
def _construct_dict_graph(self, word_list: List[str]) -> Dict[str, List[str]]:
dict_graph: Dict[str, List[str]] = defaultdict(list)
for word in word_list:
for search_key in self._next_key(word):
dict_graph[search_key].append(word)
return dict_graph
def ladderLength(self, beginWord: str, endWord: str, wordList: List[str]) -> int:
if beginWord == endWord:
return 1
word_set = set(wordList)
if endWord not in word_set:
return 0
dict_graph = self._construct_dict_graph(wordList)
# init bfs
queue: Deque[str] = deque()
visited: Set[str] = set()
queue.append(beginWord)
visited.add(beginWord)
depth = 0
# start searching and creating the word graph
while len(queue) > 0:
breadth = len(queue)
for i in range(breadth):
# visit current word
current_word = queue.popleft()
if current_word == endWord:
return depth + 1
# pre visit adjacent nodes
# check visited in pre visit
for search_key in self._next_key(current_word):
for next_word in dict_graph[search_key]:
if next_word in visited:
continue
queue.append(next_word)
visited.add(next_word)
depth += 1
return 0
Approach
Time
Space
Brute force BFS
516ms
15.3mb
Optimized BFS
176ms
17.5mb
Bidirectional BFS
A bidirectional search is a lot more efficient when start and end are well defined. One big triangle to two much smaller triangles. The less than / less or equal than condition always worth going through an example by hand.
TODO: Write brute force bidirectional BFS then refactor it
def ladderLength(self, beginWord: str, endWord: str, wordList: List[str]) -> int:
if beginWord == endWord:
return 1
word_set = set(wordList)
if endWord not in word_set:
return 0
dict_graph = self._construct_dict_graph(wordList)
# init bidirectional bfs
q1: Deque[str] = deque()
visited_1: Set[str] = set()
q1.append(beginWord)
visited_1.add(beginWord)
q2: Deque[str] = deque()
visited_2: Set[str] = set()
q2.append(endWord)
visited_2.add(endWord)
depth = 0
while len(q1) > 0 and len(q2) > 0:
if len(q1) <= len(q2):
q = q1
visited = visited_1
other_visited = visited_2
target = endWord
else:
q = q2
visited = visited_2
other_visited = visited_1
target = beginWord
breadth = len(q)
for i in range(breadth):
current_word = q.popleft()
if current_word in other_visited:
return depth + 1
if current_word == target:
return depth + 1
for search_key in self._next_key(current_word):
for next_word in dict_graph[search_key]:
if next_word in visited:
continue
q.append(next_word)
visited.add(next_word)
depth += 1
return 0
TODO: Figure out a way to write bidirectional BFS in a concurrent way. It should be possible as long as maintain the visited set as a dictionary that maps word to depth. When one go routine finds the words in another dictionary, it returns that? How does the other go routine terminates?
Time Complexity
The worst case for a bidirectional BFS happens when the start and end cannot meet somewhere in the middle. The search graph and the time complexity then are the same as a regular BFS. But on average, the performance for a bidirectional BFS is a lot better. The search depth will be reduced by a factor 2. That is a huge optimization because the number of nodes will be reduced exponentially.