1 DS|A / day: Topological sort

Tue Nov 21 2023

  • Time complexity:
    • O(v+e)
  • Space complexity:
    • O(v)
  • Uses
    • Sorting a graph
    • Finding the longest paths in a graph
    • Finding cycles in a graph
    • Converting a graph into a tree
    • Doing DP on a graph
  • Resources

Given a directed acyclic graph, topological sort orders the nodes, so that every edge leads from a node with a small index to a node with a larger one.

small-pic Topological sort

This can only happen when there are no cycles in the graph (graph must be acyclic).

Most usefully, if we topologically sort a graph, then we can perform dynamic programming on the graph because the topological order guarantees for each edge u->v we process u before v. We can save the state of each node and use it for its children.

small-pic Topological sort

For example, for finding the longest path in a graph. The longest path ending at some node v is the longest path ending at u (+1) if an edge from u->v exists. The base case is 1 if v has no incoming edges.

Here for node 3, there are two parents (0, 1). Longest path ending at node 3 = max(longest[0], longest[1]). Which we can only do once we know node 3's parents have already been fully processed.

BFS

Find all the indegrees of all the nodes. Add all the nodes with indegrees 0 to a queue. Then while there are items in the queue, get the next node off the queue, and decrease the indegree of its children. This is equivalent to removing those edges. If any children reach indegree=0, add the child to the queue.

For example:

tiny-pic Topological sort

tiny-pic Topological sort

tiny-pic Topological sort

tiny-pic Topological sort

If there are still any nodes with >0 indegree, there is a cycle.

At each step, we are only processing nodes with no remaining incoming edges. So the order we process the nodes will be a topological sort.

Topological orderings are not unique. Above, we could have visited 3 instead of 4 first, to produce a different valid ordering.

DFS

We'll need a stack for DFS. We call DFS for every node in the graph that hasn't been seen yet.

DFS will try to traverse all edges outgoing from a node not yet seen. So by the time a call dfs(u) has finished, all vertices reachable from u will already have been visited. Then we put u onto the stack. Any nodes reachable from u will be below it (already) in the stack.

This is kind of like a postordering, representing a reversed topological order.

tiny-pic Topological sort

tiny-pic Topological sort

tiny-pic Topological sort

tiny-pic Topological sort

tiny-pic Topological sort

Afterwards, we can pop nodes off the stack for a forward topological ordering.

To detect cycles, we can colour nodes, and mark them as white (unvisited), grey (visiting - currently exploring descendants), or black (visited - done). Mark a node as visiting at the start of the DFS, and visited at the end of the DFS. If we see another visiting node during the DFS, then we know there is a cycle.

Sample questions