Alien Dictionary

Question (LC.269)

The new alien language uses [a-z]. But the order of the letter is unknown.

Given a list of nonempty words from the alien dictionary (where words are sorted), derive the order of letters in this alien language.

Example

Input: ["z", "x"]
Output: "zx"

There may be multiple valid order of letters, return any one of them is fine.

Input: ["wrt", "wrf", "er", "ett", "rftt"]
Output: "wertf"

If the order is invalid, return an empty string.

Input: ["z", "x", "z"]
Output: ""

You may assume that if a is a prefix of b, then a must appear before b in the given dictionary.

Input: ["z", "an"]
Output: "zan" (n cannot be a prefix of z by this rule)

Analysis

We can think of this as a way to linearize the dependencies.

Two challenges:

  1. How to infer the dependency? (draw edges)

    1. It turns out a one-to-one comparison is enough

    2. One-to-many comparison is an overkill because letters at the same position can form a chain. Otherwise, the different position does not produce the same dependency info.

  2. How to store the graph structure? (adj list)

    1. acyclic graph: {'w': ['e'], 'r': ['t'], 't': ['f'], 'f': [], 'e': ['r']}

    2. cyclic graph: {'z': ['x'], 'x': ['z']}

Code

class Solution:

    def alienOrder(self, words: List[str]) -> str:
        if len(words) == 0:
            return ""

        if len(words) == 1:
            return "".join(set(words[0]))

        graph = self._build_graph(words)
        if graph is None:
            return ""

        order = self._top_sort(graph)

        return order


    def _top_sort(self, graph: Dict[str, List[str]]) -> str:
        # post order traversal then reverse the log order
        log = []
        visited = set()
        path_dep = set()
        for node in graph:
            if self._post_order(node, graph, path_dep, visited, log):
                return ""

        log.reverse()

        return "".join(log)

    def _post_order(
            self,
            current_node: str,
            graph: Dict[str, List[str]],
            path_dep: Set[str],
            visited: Set[str],
            log: List[str]
    ) -> bool:
        # how to detect a back edge?
        # because forward edges are fine
        # just a new dependency
        # but back edge means there is a cycle
        if current_node in path_dep:
            return True

        if current_node in visited:
            return False

        # pre visit
        visited.add(current_node)
        path_dep.add(current_node)

        # visit neighbors
        for neighbor in graph[current_node]:
            if self._post_order(neighbor, graph, path_dep, visited, log):
                return True

        # post visit
        log.append(current_node)
        path_dep.remove(current_node)
        return False

    def _build_graph(self, words: List[str]) -> Optional[Dict[str, List[str]]]:
        adj_list: Dict[str, List[str]] = {}

        # build nodes
        for word in words:
            for letter in word:
                if letter not in adj_list:
                    adj_list[letter] = []

        # build edges
        for i in range(len(words) - 1):
            word_i = words[i]
            word_j = words[i+1]
            word_len = min(len(word_i), len(word_j))

            same_prefix = True

            for l in range(word_len):
                if word_i[l] != word_j[l]:
                    same_prefix = False
                    if word_j[l] not in adj_list[word_i[l]]:
                        adj_list[word_i[l]].append(word_j[l])

            if same_prefix and len(word_i) > len(word_j):
                return None

        return adj_list
        

Time Complexity

n = number of words

l = the length of the longest word

time:

building graph = O(l * n + l * l * n) + topo sort = O(V + E)

space:

graph = O(V + E) + visited = O(V) + on stack = O(V)

Reference

Alien Dictionary by Ethan Li

Last updated