37 Topological sorting

T.V. Geetha

epgp books

 

 

 

Welcome to the e-PG Pathshala Lecture Series on Data Structures. In this module we will discuss about Topological Sorting.

 

Learning Objectives

 

• To understand the concept of Topological Sorting

• To discuss some applications of Topological Sorting

• To Illustrate Topological Sorting using a walkthrough example

 

37.1 Topological Sort

 

We have a set of tasks and a set of dependencies (precedence constraints) of form “task A must be done before task B”.

 

Topological sort: An ordering of the tasks that conforms to the given dependencies.

 

Goal: Find a topological sort of the tasks or decide that there is no such ordering

 

37.2 Recap of Graph Fundamentals

 

37.2.1 Digraphs

 

A digraph is a graph whose edges are all directed. It is a shortened representation of “directed graph”. Some of the applications of digraphs are representation of the following One-way streets, Flights, Task Scheduling as graphs.

 

37.2.2 Digraph Application

 

Scheduling is a typical example of a digraph. Edge (a,b) means that task a must be completed before task b can start. Consider the following graph given in Figure 37.1. The first task that has to be done is ics21 as it does not have any predecessors. Next, we can proceed in sequence to ics22, ics23 and then proceed onto ics161 and end with good life. Or, after ics23, we can do ics52 and ics53 in parallel. Please note that, in order to do ics53, we have to do ics21, ics22 as well as ics23. Similarly, ics51 can be done after ics21. It does not have to wait for ics22 or ics23 as there are no edges from these two to ics51. Another order is ics131 after ics21 as there is an edge from ics21 to ics131. Similarly ics151 can be done after ics51. But ics53 ends there. There are various possible moves from ics21 such as ics131 or ics51 or icsics22. But to reach ics53, ics21, ics22, ics23 and ics51 have to be finished.

 

37.2.3 Reachability

 

Consider the example given in Figure 37.2. The list of vertices reachable from any node via directed paths is called reachability.

 

 

Consider node C. The nodes reachable from C are E, A and D. That is, from C we can go to E directly or to A and D through E (Figure 37.2).

 

Similarly, from B, the nodes reachable are: C, E, A, D and F. While B has direct connectivity to A, C, D and F, E can be reached through C. This is shown in Figure 37.3.

 

37.3 DAGs and Topological Ordering

 

A directed acyclic graph (DAG) is a digraph that has no directed cycles. Figure 37.4 is an example of a DAG and its topological ordering.

 

 

A  topological ordering of a digraph is a numbering v1 , …, vn of the vertices such that for every edge (vi , vj), we have i < j. That is i has to be done before j. Example: In a task scheduling digraph, a topological ordering is a task sequence that satisfies the precedence constraints. A digraph admits a topological ordering if and only if it is a DAG. One of the topological ordering for the DAG G, is shown in Figure 37.4.

 

Initial node can be A or B. Only after A and B are done can we move to C. Only after C and B are done can we move to D. After D is done we can move to E. So, the ordering here is: A,B,C,D and E.

 

37.4 Topological Sorting

 

Now let us consider an example of Topological sorting for a typical student day graph which is given in Figure 37.5.

The first node to be traversed is wake-up. After that, study computer sci.. Then , we can either do eat or nap (in any order). Only after eat and nap, we can do more c.s. There are 3 things to be done before “write a c.s. program”: play, more c.s. and work out. Play and work out can be done only after “more c.s.”. But, after “more c.s.”, play and work out can be done in any order. Once “write a c.s. program is over, “make cookies for professors”, sleep and then “dream about graphs” can be done in that order. For a directed acyclic graph (DAG) G = (V,E), a topological sort is a linear ordering of all of G’s vertices v1, v2, …, vn such that:

 

Formally: for every edge (vi,vk) in E, i<k.

Visually: all arrows are pointing to the right.

A real-world example for this is getting dressed.

 

37.4.1 Examples

 

One of the major applications of topological sorting is scheduling where graphs are used to represent the order of tasks. When scheduling task graphs in distributed systems, usually we first need to sort the tasks topologically and then assign them to resources (the most efficient scheduling is an NP-complete problem). Another application is during compilation to order the modules/libraries. Consider the graph given in Figure 37.6. In the graph given at the side, we can either start with d or c. Let us assume that we start with c. After c, we can proceed to d. After d, we can do a, g and f. After this, we can do b followed by e.

 

 

Let us consider another example of getting dressed. The dependencies and the topological ordering is shown in Figure 37.7.

 

 

 

The topological ordering may be as follows: we start with Socks, Underwear, Pants, Shoes, then, Shirt, Belt, Tie and Jacket and then Watch. Note that here, the node Watch is independent.

 

37.5 Topological sorting for cyclic graphs?

The topological sorting for cyclic graphs is impossible. Let us consider two vertices v and w on a cycle, there exists paths from v to w and from w to v. Consider the cyclic graph (Figure 37.8). To do 1, we have to do 2. To do 2, we have to do 3. To do 3, we have to do 1. Any ordering will contradict one of these paths .

 

37.6 Topological sort more formally

 

Is it possible to execute all the tasks in G in an order that respects all the precedence requirements given by the graph edges? The answer is “yesif and only if the directed graph G has no cycle! (Otherwise we have a deadlock).

 

37.6.1 Topological sort

 

Topological sort can be defined as the linearly ordering of the vertices so that the lineorder respects the ordering relations implied by the arcs.

 

 

 

 

Considering the graph shown in Figure 37.9, the following topological orderings are possible:0,1,2,5,9, and 0,4,5,9 , but 0,6,3,7 is not possible because there is no edge between 3 and 7. There are often many possible topological sorts of a given DAG. Consider the following DAG (Figure 37.10)

 

 

 

The Topological orders for this DAG are as follows:1,2,5,4,3,6,7 2,1,5,4,7,3,6,2,5,1,4,7,3,6 etc.. Each topological order is a feasible schedule.

Let us consider the following problem of Getting Ready: A – waking up, B – taking a shower, C – eating breakfast and D leaving for work

 

The constraints would be A before B, A before C, B before D, and C before D

 

We can model dependencies (or constraints) like these using a Directed Acyclic Graph. Given a set of items and constraints, we create the corresponding graph. In this graph each item corresponds to a vertex in the graph and for each constraint where item A must finish before item B, we place a directed edge A à B. Following these rules, the DAG would be as shown in

 

 

A topological sort would give A, B, C, D.

 

Another example for topological sort is items of clothing to wear. The items are shown in Figure 37.12.

There is no exact one order to put these items on, but we must adhere to certain restrictions:

  • Socks must be put on before Shoes
  • Undergarments must be put on before Slacks and Shirt
  • Slacks must be put on before Belt
  • Slacks must be put on before Shoes

Another example considers CS classes and their prerequisites (Figure 37.13) :

 

 

The goal of a topological sort would be to find an ordering of these classes that you can take.

 

37.7 Goal of a Topological Sort:

 

When given a list of items with dependencies (i.e. item 5 must be completed before item 3, etc.), produce an ordering of the items that satisfies the given constraints. In order for the problem to be solvable, there can’t be a cyclic set of constraints. We can’t have item 5 must be completed before item 3, item 3 must be completed before item 7, and item 7 must be completed before item 5. But however

 

37.8 Topological sort algorithm

 

Assume that the in-degree is stored with each node (in-degree is the number of incoming edges to the node). Now

Repeat until no nodes remain:

 

•  Choose a root and output it.

•  Remove the root and all its edges.

 

The performance of the algorithm is of the order O(V²+E), if linear search is used to find the root.

The algorithm is as follows:

 

–Scan all nodes, pushing roots onto a stack.

–Repeat until stack is empty:

 

1. Pop a root r from the stack and output it.

2.For all nodes n such that (r,n) is an edge, decrement n’s in-degree. If the in degree of the node is 0 then push it onto the stack.

 

The performance of the algorithm is O(V+E), so still O(V²) in worst case, but better for sparse graphs.

Since D and G have in-degree 0, we can start with them. The order would be: D,G,A,B,F,H,J,E,I,C.

 

This is shown in Figure 37.15 given below:

 

 

The details of the topological sort algorithm is given below:

 

Starting point must have zero in-degree

 

• If it doesn’t exist, the graph would not be acyclic

 

Algorithm

 

1. A vertex with zero in-degree is a task that can start right away. So we can output it first in the linear order

2.If a vertex i is output, then its outgoing arcs (i, j) are no longer useful, since tasks j does not need to wait for i anymore- so remove all i’s outgoing arcs

3. With vertex i removed, the new graph is still a directed acyclic graph. So, repeat step 1-2 until no vertex is left.

 

37.8.1 Topological sort algorithm explained with Example

 

The algorithm is implemented as a traversal method that visits the vertices in a topological sort order. An array of length |V| is used to record the in-degrees of the vertices. Hence there no need to remove vertices or edges. A priority queue is used to keep track of vertices with in-degree zero that are not yet visited.

 

37.9 Walk Through Example

 

Considering the example directed graph shown in Figure 37.16 (a) we will illustrate the steps of the algorithm discussed above. Two data structures are associated with the graph. One is the adjacency list where every vertex stored in an array points to a list of neighbours, it points to that is the neighbours for which this node is the .source node. Another array stores all nodes and their in-degree. Now in this example of topological sorting we start with 0 node (Figure 37.16 (a)). The queue initially has the start node 0 and the output is also 0.

 

Now we remove the element from the front of the queue (here 0), remove arcs (edges) from 0 and need to adjust the in-degrees of 0’s neighbours that is the in-degrees of nodes 1,4,6 needs to be decremented ((Figure 37.16 (b)).

 

The in-degree of the nodes (6,1,4) are updated and the node 0 is marked in the in-degree table. Now since the in-degree of nodes 6, 1, 4 after updation is 0 that is they all can be start points, they are enqueued into the priority queue (Figure 37.16 (c)).

Now we dequeue 6, and output it. Then we remove arcs (edges) from 6 and need to adjust in-degree of its neighbours (2,3) (Figure 37.16 (d)).

 

 

The in-degree of the nodes (2,3) are updated and the node 6 is marked in the in-degree table. Now since the in-degree of nodes 1, 4 and 3 after updation is 0 that is they all can be start points. While nodes 1,4 have already been placed in the priority queue now we enqueue 3., the new start node. Note that the in-degree of node 2 after updation is not zero and hence cannot be a start node at this stage and hence is not enqueued (Figure 37.16 (e)).

 

Now we dequeue 1, and output it. Then we remove arcs (edges) from 1and need to adjust in-degree of its neighbours (2) (Figure 37.16 (f)).

 

 

The in-degree of the node 2 is updated and the node 1 is marked in the in-degree table. Now since the in-degree of node 2 after updation is 0 that is it can be a start point. While nodes 1,4,3 have already been placed in the priority queue now we enqueue the new start point 2. (Figure 37.16 (g).

 

 

Now we dequeue 4, and output it. Then we remove arcs (edges) from 4 and need to adjust in-degree of its neighbours (5) (Figure 37.16 (h)).

 

 

The in-degree of the node 5 is updated and the node 4 is marked in the in-degree table. Now due to this updation node 5’s in-degree becomes 1 only and no new node’s in-degree becomes 0, that is no new start point is formed. Hence the queue remains unchanged. (Figure 37.16 (i)).

 

Now we dequeue 3, and output it. Then we remove arcs (edges) from 3 and need to adjust in-degree of its neighbours (8) (Figure 37.16 (j)).

 

 

The in-degree of the node 8 is updated and the node 3 is marked in the in-degree table. Now due to this updation node 8’s in-degree becomes 1 only and no new node’s in-degree becomes 0, that is no new start point is formed. Hence the queue remains unchanged. (Figure 37.16 (k)).

 

Now we dequeue 2, and output it. Then we remove arcs (edges) from 2 and need to adjust in-degree of its neighbours (7,5) (Figure 37.16 (l)). Note that after this dequeueing the queue is empty.

 

 

The in-degree of the nodes (5,7) are updated and the node 2 is marked in the in-degree table. Now since the in-degree of nodes 5 & 7 after updation is 0 that is they all can be start points. Now we enqueue 5,7, the new start nodes. (Figure 37.16 (m)).

 

 

Now we dequeue 5, and output it. Then we remove arcs (edges) from 5 and need to adjust in-degree of its neighbours (9) (Figure 37.16 (n).

 

The in-degree of the node 9 is updated and the node 5 is marked in the in-degree table.Now since the in-degree of nodes  9 after updation is not 0, no new start point is found.(Figure 37.16 (o)).

 

Now we dequeue 7, and output it. Then we remove arcs (edges) from 7 and need to adjust in-degree of its neighbours (8) (Figure 37.16 (p)). Note that after this dequeueing the queue is empty.

 

 

The in-degree of the node 8 is updated and the node 7 is marked in the in-degree table. Now since the in-degree of nodes 8 after updation is 0, we enqueue it. (Figure 37.16 (q)).

 

 

Now we dequeue 8, and output it. Then we remove arcs (edges) from 8 and need to adjust in-degree of its neighbour (9) (Figure 37.16 (r)). Note that after this dequeueing the queue is empty.

 

The in-degree of the node 9 is updated and the node 8 is marked in the in-degree table. Now since the in-degree of node 9 after updation is 0, we enqueue it. Now we dequeue 9 and output it. Since it has no neighbours process stops (Figure 37.16 (s)).

 

 

The final topological sorting of the graph is shown in Figure 37.16 (t). Now we can check whether the topological ordering is correct. 

We start with 0 – (no dependency since in-degree is 0.

Next is 6 – in-degree is 1 and dependent on 0 which has already been processed

Next is 1– in-degree is 1 and dependent on 0 which has already been processed

Next is 4 – in-degree is 1 and dependent on 0 which has already been processed

Next is 3  – in-degree is 1 and dependent on 6 which has already been processed

Next is 2 – in-degree is 2 and dependent on 1& 6 both of which has already been processed

Next is 5- in-degree is 2 and dependent on 2 &4 which has already been processed

Next is 7– in-degree is 1 and dependent on 2 which has already been processed

Next is 8 – in-degree is 2 and dependent on 3 & 7 both of which has already been processed

Next is 9– in-degree is 2 and dependent on 8 & 5 both of which has already been processed

Hence we see that the topological ordering has been followed.

 

Summary

  • Explained the concept of Topological Sorting
  • Discussed some applications of Topological Sorting
  • Illustrated Topological Sorting using a walkthrough example
you can view video on Topological sorting