38 Topological Sorting as an Application of DFS

T.V. Geetha

epgp books

 

 

 

Welcome to the e-PG Pathshala Lecture Series on Data Structures. In this module we will discuss about Topological Sorting as an Application of DFS

 

 

Learning Objectives

 

The learning objectives of the module are as follows:

 

•  To understand the concept of Topological Ordering as an Application of DFS

•  To discuss the complexity of Topological Sorting

•  To discuss the Proof of Correctness

 

38.1 Topological Sort

 

Topological sort is a method of arranging the vertices in a directed acyclic graph (DAG), as a sequence, such that no vertex appears in the sequence before its predecessor. A directed graph G is a dag (directed acyclic graph) if it does not have any cycle. A directed acyclic graph or dag is a directed graph with no directed cycles. Any vertex in a dag that has no incoming vertices is called a source; any vertex with no outgoing edges is called a sink. Every dag has at least one source and one sink, but may have more than one of each.

 

We have a set of tasks and a set of dependencies (precedence constraints) of form “task A must be done before task B”. Topological sort: An ordering of the tasks that conforms with the given dependencies

 

Examples:  Scheduling:  When  scheduling  task  graphs  in  distributed  systems,usually we first need to sort the tasks topologically and then assign them to resources (the most efficient scheduling is an NP-complete problem). Another example is during compilation to order modules/libraries.

 

The goal of a Topological Sort:

 

When given a list of items with dependencies we produce an ordering of the items that satisfies the given constraints. In order for the problem to be solvable, there can’t be a cyclic set of constraints. A directed acyclic graph (DAG) is a digraph that has no directed cycles. A topological ordering of a digraph is a numbering

v1 , …, v

of the vertices such that for every edge (vi , vj), we have i < j

In order for the problem to be solvable, there can’t be a cyclic set of constraints. In the example of a directed acyclic graph given in Figure 38.1 the vertices A and B have to be processed before vertex C can be processed. Similarly vertex B and C have to be processed before vertex D can be processed and D has to be processed before vertex E can be processed. Therefore one topological sorting of DAG is A,B,C,D and E shown here as numbered vertices (Figure .38.1).

 

 

Basically topological sort can be viewed as a modified version of DFS where we run a DFS but add the fact that at the end of the recursive function, add the node the DFS was called with to the end of the topological sort.

 

Let us argue that a directed graph G is acyclic iff a DFS of G yields no back edges which is implied by the fact that a back edge implies a cycle. The contrapositive is that if G has a cycle Þ that is there exists a back edge. Now let v be the vertex on the cycle first discovered, and u be the predecessor of v on the cycle. When v is discovered, whole cycle is visited since all vertices reachable from v must be visited before returning from DFS-Visit(). Therefore the path from u®v is a back edge.

 

Let us have the small discussion on DFS and DAGs. Argue that a directed graph G is acyclic iff a DFS of G yields no back edges and Forward: if G is acyclic, will be no back edges. Of course a back edge implies a cycle and backward: if no back edges, G is acyclic. In other words if G has a cycle Þ $ a back edge

 

38.2 Depth First Search Algorithm and Edge Classification

 

In this module we will be discussing about Topological sorting, but as an application of Depth First Search(DFS). In digraphs we traverse edges only along their direction. In the directed DFS algorithm, we can distinguish four types of edges as discovery edges, back edges, forward edges and cross edges (Figure 38.3). A directed DFS starting a vertex s determines the vertices reachable from s.

 

 

Edge (u,v) of G is classified as a:

 

(1)  Tree edge iff u discovers v during the DFS: Predecessor [v] = u If (u,v) is NOT a tree edge then it is a:

(2) Forward edge iff u is an ancestor of v in the DFS tree

(3) Back edge iff u is a descendant of v in the DFS tree

(4) Cross edge iff u is neither an ancestor nor a descendant of v

 

In the example given in Figure 38.4, the tree edges in green correspond to the sequence of nodes in the DFS path, forward edges are in purple and correspond to descendents in the DFS path that are not tree edges. Back edges are in red and correspond to backward ancestors in the DFS path. All other edges that do not form part of the DFS path are called cross edges and are given in orange. Note that the edge classification depends on the particular path of the DFS tree.

 

 

38.3 Topological Sorting Algorithm using DFS

 

G is called a Directed Acyclic Graph, or just a DAG,

 

TOPOLOGICAL-SORT(G):

 

1) call DFS(G) to compute finishing times f[v] for each vertex v

2) as each vertex is finished, insert it onto the front of a linked list

3) return the linked list of vertices

 

Note that the result is just a list of vertices in order of decreasing finish times f[].

 

We can use a DFS in our algorithm to determine a topological sort. The idea is: When you do a DFS on a directed acyclic graph, eventually you will reach a node with no outgoing edges. This is because if this never happened, you hit a cycle, because the number of nodes is finite. This node that you reach in a DFS is “safe” to place at the end of the topological sort. Think of the getting ready example. Now what we find is that if we have added each of the vertices “below” a vertex into our topological sort, it is safe then to add this one in. If we added in leaving for work at the end, then we can surely add taking a shower.

 

First call DFS(G) to compute finishing times f[v] for each vertex v and then as each vertex is finished, insert it onto the front of a linked list and return the linked list of vertices. We have to note that the result is just a list of vertices in order of decreasing finish times f[]. Now let us look at how we classify the edge using DFS, Consider an edge E(u,v) of G which has to be classified as,

 

Tree edge iff u discovers v during the DFS: P[v] = u. If (u,v) is NOT a tree edge then it is a: Forward edge iff u is an ancestor of v in the DFS tree, Back edge iff u is a descendant of v in the DFS tree and Cross edge iff u is neither an ancestor nor a descendant of v.

 

38.3.2 Time stamp of a Vertex

 

Each vertex v has two time-stamps namely d[v] records when v is first discovered and f[v] records when the search finishes examining its adjacency list. For every vertex u d[u] < f[u] .

 

38.4 Walkthrough of the Topological Algorithm using DFS

 

The example shown in Figure 38.5 (a) has been taken from ww.cs.utoronto.ca/~tabrown/csc263/2014W/week9.ppt

 

38.4.1 Time stamp of a Vertex

 

Each vertex v has two time-stamps namely d[v] records when v is first discovered and f[v] records when the search finishes examining its adjacency list. For every vertex u d[u] < f[u] .

 

We first initialize the time stamps of d and v values of every node of the graph to ¥. Call DFS(G) to compute the finishing times f[v]. Let’s say we start the DFS from the vertex c. At this point the d[c] is set = 1, the time stamp at which node c is first discovered as shown in Figure 38.5 (a).

 

As each vertex is finished, we insert it onto the front of a linked list. We follow the path from starting vertex c, next we discover vertx d at time stamp 2 i.e d[d]=2, then at next time stamp discover vertexf so d[f]=3 and f is done hence f[f]=4,since f is finished, and is output. We then move back to d (Figure 38.5 (b)).

 

At timestamp 5 d is done hence f[d] =5 and this is output (Figure 38.5 (c)).

We then move back to c (Figure 38.5 (d)).

 

 

Next that is at time stamp 6 we discover the vertex e so d[e] =6. Now e is done since both edges e are back edges. Hence at timestamp 7 e is output and we move back to (Figure 38.5 (e)).

Now c is done as well and it is output. Note that if there was (c,f) edge in the graph, it would be classified as a forward edge (in this particular DFS run) (Figure 38.5 (f)).

 

 

 

The f[c] = 8. Let’s now call DFS visit from the vertex a at time stamp 9 and hence d[a]=9. At this time stamp we discover the vertex c, but c was already processed => (a,c) is a cross edge. Next at time stamp 10 we discover vertex b (Figure 35.8 (g)).

 

Now we set d[b] = 10 and at time stamp 11 we find b is done as (b,d) is a cross edge and hence f[b] = 11 and we output it (Figure 35.8 (h)). Now we go back to c.

 

 

At time stamp 12 we find a also is done (Figure 38.5 (i)).

 

We now set f[a] = 12 and output it. WE HAVE THE RESULT! – return the linked list of vertices (Figure 38.5 (j)).

 

 

The linked list is sorted in decreasing order of finishing times f[] (Figure 38.5 (k)). This is true for any different vertex order for DFS visit. Please note that if we redraw the graph so that all vertices are in a line ordered by a valid topological sort, then all edges point from left to right.

 

 

38.5 Time Complexity

 

Running time of topological sort is Θ(n + m) where n=|V| and m=|E|. This is because Depth first search takes Θ(n + m) time in the worst case, and inserting into the front of a linked list takes Θ(1) time

 

We never visited a vertex more than one time . For each vertex, we had to examine all outgoing edges that is Σ outdegree(v) = m. This is summed over all vertices, not per vertex so, our running time is exactly O(n + m).

you can view video on Topological Sorting as an Application of DFS

Summary

  • Discussed the concept of Topological Ordering as an Application of DFS
  • Explained the complexity of Topological Sorting
  • Discussed the Proof of Correctness