Define the indegree of a vertex as the number of edges pointing to it.
Repeating this process yields a topological sort.
Here is the code for a topological sort.
Let's consider it a step at a time.
In step 1, we get the indegrees of all the vertices, putting them into a map whose key type is
As we do this, we also add any vertices whose indegree is zero into a queue,
In step 2, we repeatedly remove vertices from the queue and add them to the sorted list output,
We then look at the outgoing edges of each vertex, and reduce the
Finally, in step 3, we check to see if all the vertices have been “removed” from the graph and placed into the sorted list. If so, we have successfully found a topological sort. If not, then no topological sort is possible (the graph must have a cycle).
Try running a topological sort.
- A node of indegree 0 can be placed at the start of the sorted order, since there is clearly nothing that must precede it.
- There must be at least one vertex of indegree 0.
- If not, the graph has a cycle and no topological sort exists.
- Once we have placed all nodes of indegree 0 in the list, we can then add all nodes whose indegree would be zero except for edges from the nodes already placed.
Let's consider it a step at a time.
Vertex
and whose associated data isunsigned
. I've chosen to use a hash_map
for this code. hash_map
is a hash-table implementation of the standard map interface.hash_map
isn't actually part of the standard library, but it is a commonly available addition. If I wanted to stay entirely within the standard library, I could just use map
, but at the cost of somewhat slower access. Alternatively, I could (and have, in the past) implemented a vector-based implementation of the map interface that simply uses the Vertex::id()
function to index into the vector, which would give me true O(1) access time.As we do this, we also add any vertices whose indegree is zero into a queue,
q
.sorted
. We can do this because we know that there is nothing in the graph that needs to come before these vertices.We then look at the outgoing edges of each vertex, and reduce the
inDegree
values of the neighboring vertices to simulate having removed v
from the graph. If doing this causes any of their (simulated) indegrees to become zero, we add them to the queue, because we know that there is nothing remaining in the graph that needs to come before these vertices.Try running a topological sort.
inDegree
map is implemented using a vector-like structure.In step 1, therefore, we see that everything in the loop body is O(1). The loop itself goes around once for every vertex in the graph. Following our definition that says that a graph G=(V,E) is a set V of vertices and a set E of edges, we can say that the number of iterations of this loop is |V|, the number of vertices.
(We could also write this as
O(g.numVertices())
, but |V| is shorter and is the usual way that people describe graphs.)But, let's just stop and think about what's going on here.
- Each vertex goes into the queue at most once
- So, in a successful sort, the outer loop will execute |V| times, once for each vertex.
- The inner loop simply visits the edges emanating from the vertex being visited by the outer loop.
- So if the outer loop visits every vertex, and the inner one visits every edge leaving that vertex, over the course of all the outer loop iterations, the inner loop will visit every edge in the graph exactly once.
sorted
list (done when we can't find a solution). Since this list is actually a list of vertices that we have successfully sorted, it contains at most |V| elements, and so the clear
operation is O(|V|).That makes the worst case for the step 3 “if” statement O(|V|) as well.
(Note: this does not mean that topological sorts are faster than conventional sorts --- the number of edges can be as high as |V|2, so this is actually more comparable to the slowest of our conventional sorting algorithms. That's the penalty we pay for working with partial orders. We don't write the cost of this algorithm as O(|V|2), though, because the number of edges varies widely in practical problems, and we may sometimes know that |E| will be far less than that maximum, so O(|V| + |E|) is a more accurate portrayal of the behavior.
No comments:
Post a Comment