Word Ladder

Intro

String editing can also be viewed as an implicit graph where the vertex is a word and the edge is the letter transformation.

Word Ladder I (LC.127)

Given a startWord, an endWord, and a wordList, find the length of the shortest transformation path from start to end.

Constraints:

  1. Only one letter can be changed at a time

  2. each transformation must be a word in the list

Return 0 if no possible transformation path. All words have the same length. Lower case letters only.

Example

I:

startWord = "hit"
endWord = "cog"
wordList = ["hot","dot","dog","lot","log","cog"]

draw the word graph

O: 

5 because hit -> hot -> dot -> dog -> cog or hit -> hot -> lot -> log -> cog 

I: 

beginWord = "hit"
endWord = "cog" 
wordList = ["hot","dot","dog","lot","log"]

O: 

0 because endWord does not exit in the dictionary 

Level BFS

shortest transformation path = shortest path in a word graph

we have to construct this graph and search at the same time

stop if we visited every possible node

Code

Optimized BFS

The time complexity of a raw BFS is O(V+E)O(V+E). In this case, the number vertices are the valid words in the word list and the number edges are the brute force computed substitution for each letter for each word. The end result is something like O(n+n×26×l)O(n + n \times 26 \times l )where nn is the length of the word list and llis the longest word length.

The idea to optimize is to shrink the implicit graph that we are constructing. One way to do it to precompute a dictionary graph with only valid edges in the beginning (i.e. "bot" will not be an edge to visit only "hot", "dot", "lot" are edges). This will shrink the brute force factor 26 significantly with the cost of a little more memory.

Precomputing is alway a good idea when the dictionary size is small in a search problem.

The dictionary graph looks like this for the first example.

Optimized Code

Approach

Time

Space

Brute force BFS

516ms

15.3mb

Optimized BFS

176ms

17.5mb

Bidirectional BFS

A bidirectional search is a lot more efficient when start and end are well defined. One big triangle to two much smaller triangles. The less than / less or equal than condition always worth going through an example by hand.

TODO: Write brute force bidirectional BFS then refactor it

TODO: Figure out a way to write bidirectional BFS in a concurrent way. It should be possible as long as maintain the visited set as a dictionary that maps word to depth. When one go routine finds the words in another dictionary, it returns that? How does the other go routine terminates?

Time Complexity

The worst case for a bidirectional BFS happens when the start and end cannot meet somewhere in the middle. The search graph and the time complexity then are the same as a regular BFS. But on average, the performance for a bidirectional BFS is a lot better. The search depth will be reduced by a factor 2. That is a huge optimization because the number of nodes will be reduced exponentially.

Approach

Time

Space

Optimized BFS

176ms

17.5mb

Bidirectional BFS

108ms

17.6mb

Reading

Last updated