1 DS|A / day: Topological sort
Tue Nov 21 2023
- Time complexity:
O(v+e)
- Space complexity:
O(v)
- Uses
- Sorting a graph
- Finding the longest paths in a graph
- Finding cycles in a graph
- Converting a graph into a tree
- Doing DP on a graph
- Resources
Given a directed acyclic graph, topological sort orders the nodes, so that every edge leads from a node with a small index to a node with a larger one.
This can only happen when there are no cycles in the graph (graph must be acyclic).
Most usefully, if we topologically sort a graph, then we can perform dynamic programming on the graph because the topological
order guarantees for each edge u->v
we process u
before v
. We can save the state of each node and use it for its children.
For example, for finding the longest path in a graph. The longest path ending at some node v
is the longest path ending at u
(+1) if an edge from u->v
exists.
The base case is 1 if v
has no incoming edges.
Here for node 3, there are two parents (0, 1). Longest path ending at node 3 = max(longest[0], longest[1])
. Which
we can only do once we know node 3's parents have already been fully processed.
BFS
Find all the indegrees of all the nodes. Add all the nodes with indegrees 0 to a queue. Then while there are items in the queue, get the next node off the queue, and decrease the indegree of its children. This is equivalent to removing those edges. If any children reach indegree=0, add the child to the queue.
For example:
If there are still any nodes with >0 indegree, there is a cycle.
At each step, we are only processing nodes with no remaining incoming edges. So the order we process the nodes will be a topological sort.
Topological orderings are not unique. Above, we could have visited 3 instead of 4 first, to produce a different valid ordering.
DFS
We'll need a stack for DFS. We call DFS for every node in the graph that hasn't been seen yet.
DFS will try to traverse all edges outgoing from a node not yet seen. So by the time a call dfs(u)
has finished, all vertices reachable
from u
will already have been visited. Then we put u
onto the stack. Any nodes reachable from u
will be below it (already) in the stack.
This is kind of like a postordering, representing a reversed topological order.
Afterwards, we can pop nodes off the stack for a forward topological ordering.
To detect cycles, we can colour nodes, and mark them as white (unvisited), grey (visiting - currently exploring descendants), or black (visited - done).
Mark a node as visiting
at the start of the DFS, and visited
at the end of the DFS. If we see another visiting
node
during the DFS, then we know there is a cycle.