Topological Sort
Intro
Linearize the dependencies. Or the right the sequence to take courses. DFS (the highest finish time should go first).
BFS (Khan's Algorithm)
I personally prefer to think in terms of this approach because people naturally think will perform the topological sort manually this way. That might be partly the reason why this is the first algorithm (1962) for topo sort.
add vertices with 0 incoming edge to queue Q
while !Q.isEmpty
current = Q.pop, add current to the end of list
for each neighbor of current
delete edge current → neighbor
if neighbor has 0 incoming edge, add neighbor to queueBFS Code
We don't want to modify the original graph (aka deleting edges). So instead we count the indegree for each node.
public ArrayList<DirectedGraphNode> topologicalSort(ArrayList<DirectedGraphNode> graph) {
ArrayList<DirectedGraphNode> topoOrder = new ArrayList<>();
if (graph == null || graph.size() == 0)
return topoOrder;
// Step 1 count indegree (incoming edges)
Map<DirectedGraphNode, Integer> indegree = countIndegree(graph);
// Step 2 topological sort with bfs
topoSort(graph, indegree, topoOrder);
return topoOrder;
}Another way is to define queue as an ArrayList in the beginning. Then for (int i = 0; i < queue.size(); i++) to get the ith element in queue as current. This way we don't have to add current to topoOrder every time.
Follow Up #1
What if there is no guarantee that the graph contains a valid topological order? So in this case, the graph is no longer acyclic. Will code above run into an infinite loop? NO. That's why we count indegree for each node. When a cycle exists, the starting node in that cycle will have an indegree greater than 0 and therefre will not be added to the queue.
Before we return the result, we can add a simple check if (topoOrder.size() == graph.size()). If yes then we done, if no then there is no topological ordering in this graph.
Follow Up #2
How about finding all possible topological orderings? (ex. all possible course sequences to finish a degree)
Find all orderings/permutations/combinations, we need to use backtracking. We want to backtrack when we delete edges. To try all possibilities.
DFS
DFS on the root. The topo sort has the leaf nodes start first. So we have to reverse the finish time.
I wouldn't say this approach is counter intuitive but we don't do sort dependenties this way manually. This algorithm was introduced at a later time (1976).
DFS Code
Summary
In most graph questions, we can use either BFS or DFS. BFS is easier to think and maybe longer to write. While DFS is shorter to write, the thought process behind it can be lengthy. Also, the size of the system call stack is something we need to keep in mind. Unlike using a queue (in the heap space), the system stack is limited on how the JVM allocates it. Stack overflow can happen if the call stack is too deep.
Reference
StackOverflow - Computing method call stack size for checking StackOverflowException
Wikipedia - Topological Sorting
GeekforGeeks - All Topological Sorts
Last updated