13 The A* Algorithm
Bhushan Trivedi
Introduction
We have seen quite a few algorithms so far. An interesting observation from the discussion of the branch and bound algorithm is that one can keep two measures during the search and optimize dynamically over both components: the distance from the root node, which reflects the effort spent so far, and the estimate of reaching the goal node from the current node. We have also seen best first search in module 5. In best first search, the unexplored node with the best heuristic value is explored next; we do not check how far that node is from the root. Let us take an example. Suppose we have two unexplored nodes, A and B. A is 15 levels deep while B is 20 levels deep. The heuristic value of A is 7 while that of B is 6, so best first search explores B, expecting the goal node to be nearer. Is this the right choice? Obviously not if we want the path with the least distance between the root and the goal node: A's combined measure (15 + 7 = 22) is better than B's (20 + 6 = 26). The A* algorithm provides a solution that considers both components and picks the next node based on that combined measure.
A* has found its use in many solutions to path finding and graph traversal problems, and it performs better than most other approaches except in some cases where the graph can be pre-processed. A* basically finds a least cost path from a starting node to an end node in a graph. Let us see how it works.
Prerequisites for A*
When we apply A* to a graph (or a tree), there are a few prerequisites for it to work and return an optimal result. Here is the list.
1. The solution is confined to one node. There may be multiple solutions, but a single solution consists of only one node. We will explore this further in the next module.
2. The heuristic function that we use here, when applied to the nodes or problem states, must satisfy the admissibility property for the algorithm to return an optimal result. We will explore that too in the next module.
3. The best first search algorithm is applied to the problem states and the best node so far is always picked, but based on two things: a value g, which is the cost of reaching the node being expanded, and h', a heuristic value telling us how far the goal node is from there. We must have some method to obtain both measures in an unambiguous way. The process is simplified if the g value represents the level of the tree: the root has g value 0, all first level nodes have g value 1, all second level nodes have g value 2, and so on, until all nodes at level n have g value n. The summation of g and h' is commonly denoted as f'. If we search a graph this way, every edge adds 1 to the total cost g, so the g value at any point indicates the number of edges travelled so far from the root node. A small sketch of this combined measure follows this list.
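As a small illustration, here is a minimal sketch in Python of the combined measure f' = g + h'. The helper name f_prime and the numbers are only illustrative; the values are the hypothetical ones from the introduction, not taken from any figure.

```python
def f_prime(g, h_estimate):
    """Merit of an unexplored node: cost from the root so far (g)
    plus the heuristic estimate of the remaining cost to the goal (h')."""
    return g + h_estimate

# A node 15 levels deep with estimate 7 scores 22, while a node 20 levels
# deep with estimate 6 scores 26, so A* prefers the first even though its
# heuristic value alone looks worse.
print(f_prime(15, 7))  # 22
print(f_prime(20, 6))  # 26
```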
The Graph Exploration using A*
To understand how A* is used, look at the graph depicted in figure 13.1. Suppose we start from W and would like to reach B. Let us see how A* proceeds from W and finds the path to B.
Figure 13.1 The graph problem and solution by A* algorithm
In the case of the network shown in figure 13.1, W is the root node for us and the only unexplored node in the beginning. The two values shown in the brackets are g and h'. We can associate the g and h' values either with a specific arc or with a node. When we associate them with an arc, the g value includes that arc and the h' value is the estimate of reaching the goal node after that edge. When we associate them with a node, g is the cost of reaching that node and h' is the estimated cost of reaching the goal node from there.
Now let us begin from W. The value of g at W is 0 (as we are at the root node) and h' is the estimate of how far the goal node B is; we assume that the heuristic function returns the value 15. The total of g and h' indicates how far the goal node is expected to be from the current node, which comes to 15. Once W is expanded, it becomes part of the explored nodes and both of its children become part of the unexplored nodes. After expansion, W has two children, S and V, so at this point the list of explored nodes is {W} and the list of unexplored nodes is {V, S}. Keeping the g value of each child as 1 indicates that we are one hop away from the start node. As we assume the g value to be one for every hop for now, longer or shorter, the problem is simplified; otherwise it would have to be calculated every time, and the g + h' value recomputed, to check the merit of each node.
At this point we have two unexplored nodes, S and V. S (g + h' = 14) is more promising than V (g + h' = 15), so the next step is to expand S, and the explored list becomes {W, S}. S has three children of interest, V, N and H (the fourth neighbour is W, which we do not consider as it is already explored). Next we check whether any of these children has been generated before, and we find that V was generated earlier. The new path to V must be longer, because it is at least one hop longer than the older path (one hop longer if the node was generated in the immediately preceding cycle, more otherwise). If the g value were not the same for every hop, such regenerated nodes would have to be checked for a better path: we would store the best path so far, compare it with the current path, and accept the current path only if it is better. Keeping the g value of every arc fixed at 1 saves us from that trouble, so we do not need to consider the new path. Such paths are shown as dotted lines in the figure. The unexplored list is now {V, H, N}; N, with a g + h' value of 11, is the most promising, so we explore it in the next iteration, and the explored list becomes {W, S, N}. The expansion of N yields two children, M and V. V is already part of the unexplored list, so no further processing is needed. M, when processed, yields 8 + 3 = 11, which is still better than the other nodes in the unexplored list {V, M, H}.
So now we expand M, move it to the list of explored nodes, and add its children to the unexplored list. M has three children, L, G and H. H is already part of the unexplored list, so we need not bother about it. The unexplored list is now {V, H, L, G}, out of which G seems most promising with a g + h' value of 10.
The process continues in this manner until the destination node B is picked, at which point the g value indicates the actual distance covered and the h' value is 0, since we have reached the destination and no longer need an estimate.
Let us depict the process using a table.
Current node | Children | Explored list | Unexplored list | Best node | g |
W | S, V | W | S, V | S | 1 |
S | V, N, H | W, S | V, N, H | N | 2 |
N | V, M | W, S, N | V, H, M | M | 3 |
M | L, G, H | W, S, N, M | V, H, L, G | G | 4 |
G | J, E, F | W, S, N, M, G | V, H, L, J, E, F | J | 5 |
J | C, I | W, S, N, M, G, J | V, H, L, E, F, C, I | C | 6 |
C | D, K | W, S, N, M, G, J, C | V, H, L, E, F, I, D, K | D | 7 |
D | B, A | W, S, N, M, G, J, C, D | V, H, L, E, F, I, K, B, A | B | 8 |
The process can now be described in a simpler way. In the beginning we have the root node W, which has two children. At this point W is the only explored node and both of its children become part of the unexplored list. The best node in the unexplored list (based on the g + h' value) is S, so we expand S next. All three of its children are generated, S is moved from the unexplored list to the explored list, the children are added to the unexplored list, and the best of them is found; it comes out to be N. The same process continues until we reach the destination node.
The process can be written in the form of an algorithm as follows; a small code sketch follows the steps.
A* algorithm version 1
1. Initially the root node is expanded: its g value is set to 0, its h' value is found using the heuristic, the root node is moved to the explored list, and all of its children are moved to the unexplored list after calculating their g and h' values.
2. Pick the best node from the unexplored list. If the best node is the goal node, quit.
3. Otherwise, move the best node to the explored list and make it the current node for the next cycle.
4. Expand the best node and generate its children.
5. If any child is already part of the explored list, ignore it.
6. If any child is already part of the unexplored list, ignore it as well; with unit arc costs the newer path is always longer.
7. Process all other children to generate their h' values. The g value is just one more than the g value of the previous cycle. Add all such children to the unexplored list.
8. Go to step 2.
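The following is a minimal Python sketch of version 1, assuming unit arc costs. The graph, the heuristic values and the names used here are illustrative assumptions and are not taken from figure 13.1.

```python
import heapq

def a_star_v1(graph, h_estimate, start, goal):
    """Version 1 sketch: every arc costs 1, so g is simply the hop count.
    A node that has already been generated, explored or not, is never
    reconsidered, because any new path to it is at least as long (steps 5-6)."""
    explored = set()
    generated = {start}                            # every node seen so far
    unexplored = [(h_estimate[start], 0, start)]   # priority queue ordered by f' = g + h'
    while unexplored:
        f, g, node = heapq.heappop(unexplored)     # step 2: pick the best node
        if node == goal:
            return g                               # g is the number of hops to the goal
        explored.add(node)                         # step 3
        for child in graph[node]:                  # step 4
            if child in generated:                 # steps 5 and 6: ignore repeats
                continue
            generated.add(child)
            # step 7: g grows by 1 per hop, h' comes from the heuristic
            heapq.heappush(unexplored, (g + 1 + h_estimate[child], g + 1, child))
    return None                                    # the unexplored list is exhausted

# Hypothetical graph and heuristic values, only to show the call:
graph = {'W': ['S', 'V'], 'S': ['W', 'V', 'N', 'H'], 'V': ['W', 'S'],
         'N': ['S', 'M'], 'H': ['S'], 'M': ['N', 'B'], 'B': ['M']}
h = {'W': 4, 'S': 3, 'V': 4, 'N': 2, 'H': 3, 'M': 1, 'B': 0}
print(a_star_v1(graph, h, 'W', 'B'))               # 4 hops: W -> S -> N -> M -> B
```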
What if the g value is not identical for the entire path? How are paths explored?
Our algorithm so far has assumed that the value of g is 1 for every arc. That simplified the algorithm, as we could simply ignore new paths to the same node. What if that is not so, and the value of g is different for every arc? Let us examine the same graph, but now we relax the restriction that the g value is always 1 per arc.
Figure 13.2 Graph search where the g value is not fixed to 1. The g value now indicates the actual distance.
Now carefully look at the new traversal. As in the earlier case, we begin with W with (0, 15). W has two children, but their g values now differ: V has a higher g value than S, so we explore S next, as it is the cheapest option so far. As before, we expand S and then its best child N, which regenerates V along with M. This time, after calculating each child's g value from the actual cost of its arc, we find that the remaining children have a higher g + h' value than V, so the next node to be expanded is not one of the freshly generated nodes but V, now reached through a cheaper path than the direct arc from W.
This is a significant difference from the earlier traversal. Earlier we simply ignored unexplored nodes that had already been generated. Now they may come back into contention and may turn out to be better when reached through another path. This is quite common in graph traversal: if the first path explored is expensive and a later path is cheaper, the later path can be used.
The search propagates further without much of a problem till K, where a node that was already generated gets reconsidered. Earlier we had generated I as a child of O, but the O-I link is expensive, so it was not considered at the time and K was expanded instead. The children of K, however, turn out to be no better, and the road to I through O now sounds the most promising, so we move to I as the best node.
This is another deviation from the earlier traversal. It is like reaching a local minimum: both children are more expensive than the parent. Fortunately, we have a better node elsewhere, and we progress by picking that node rather than any of the children. This is the power of best first search: we can pick any unexplored node, not just one of the current node's children.
From there on, the search proceeds normally, as before.
The process can be depicted using a tabular format as below.
Current node | Children | Explored list | Unexplored list | Path from | Best node | g |
W | S, V | W | S, V | – | S | 1 |
S | V, N, H | W, S | V, N, H | W | N | 2 |
N | V, M | W, S, N | V, H, M | S | V | 3 |
V | Z | W, S, N, V | M, H, Z | N | Z | 4 |
Z | AA, Y | W, S, N, V, Z | M, H, AA, Y | V | AA | 5 |
AA | X | W, S, N, V, Z, AA | M, H, Y, X | Z | X | 6 |
X | U, O | W, S, N, V, Z, AA, X | M, H, Y, U, O | AA | O | 7 |
O | K, I | W, S, N, V, Z, AA, X, O | M, H, Y, U, K, I | X | K | 8 |
K | I, A, C | W, S, N, V, Z, AA, X, O, K | M, H, Y, U, I, A, C | O | I | 9 |
I | P, D | W, S, N, V, Z, AA, X, O, K, I | M, H, Y, U, A, C, P, D | O | D | 10 |
D | C, B | W, S, N, V, Z, AA, X, O, K, I, D | M, H, Y, U, A, C, P, B | I | B | 11 |
We start and work the same way as in the previous case, except that the g value of a child is now the g value of its parent plus the actual cost of the connecting arc, so it need not grow by 1 for every hop. An interesting move happens when N is explored. One of the children of N, namely V, has already been generated. In the previous case we ignored it, but here we check and find that V now has the least g + h' value: though there is a direct path from W to V, it is more expensive than this route. We proceed with exploring V. Another interesting move happens when we expand O, which has two children, I and K. K looks better, so we explore it next. The children of K turn out to be more expensive than the O-I route, so we backtrack and take the O-I route. From then on the route becomes straightforward, as the best node is always among the children of the current node for all subsequent moves. Look carefully at the tabular representation, in particular the row where I becomes the current node: you can clearly see that the best node there is reached from a node (O) which was expanded earlier.
The two situations described above differ from each other. In the first, the original path to V was expensive, so we did not consider it; when a better (shorter or cheaper) path to V appears, we move to V, but through the newer path. In the second, the node K looked promising, but after exploring it we learn that our estimate was wrong: its children are more expensive than the route from O, so the old path now sounds better and we backtrack, continuing through the O-I route. This second case, where the children turn out to be more expensive than an alternative elsewhere, is a kind of local minimum.
Thus the answer to the question "What if all the children have a larger value than some other node?" is: backtrack if there is a better node elsewhere in the unexplored list, otherwise choose the child with the least value among the unexplored nodes.
A* algorithm version 2
Now it is time to write a more formal algorithm that takes the possibility of backtracking into account. A small code sketch follows the steps.
1. Initiate the process by adding the root node (W) to the explored list. Set its g value to 0 and its h' value to whatever the heuristic returns.
2. Current node = root node, path = root node.
3. If the current node is a goal node, return success along with the path.
4. Expand the current node and generate its children.
5. If any child is already in the explored list, ignore it.
6. Add all other children to the unexplored list.
7. Calculate the g value of each child as the g value of the current node plus the cost of reaching that child.
8. For each child, also calculate the h' value.
9. Add g and h' to obtain the estimated cost of the solution through that node.
10. If any child was already generated and is part of the unexplored list, recalculate its g value and update it if the new value is less than the earlier one. If the old path was better, we need to do nothing.
11. Pick the best node from the unexplored list. If no node is left in the unexplored list, return failure.
12. If the best node is a child of the current node, set path = path -> current node and make the best node the current node.
13. Else, update the path: remove from the path all segments beyond the parent of the best node on its cheapest path, so that the path again ends at that parent, and make the best node the current node.
14. Go to step 3.
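Below is a minimal Python sketch of version 2, under assumptions: the graph (now a mapping from each node to its children with arc costs) and the heuristic values are hypothetical, and the path bookkeeping is done with parent pointers, which realises steps 12 and 13 implicitly, since the returned path is rebuilt from whichever parent gave each node its cheapest g.

```python
import heapq

def a_star_v2(graph, h_estimate, start, goal):
    """Version 2 sketch: arcs have arbitrary costs. When a node is regenerated
    we keep whichever path gives the smaller g (step 10), and the best node is
    always picked globally from the unexplored list, so backtracking is automatic.
    graph[node] is a dict mapping each child to the cost of the arc to it."""
    g = {start: 0}
    parent = {start: None}
    explored = set()
    unexplored = [(h_estimate[start], start)]      # ordered by f' = g + h'
    while unexplored:
        f, node = heapq.heappop(unexplored)        # step 11: best unexplored node
        if node in explored:
            continue                               # stale entry for an old, worse path
        if node == goal:                           # step 3: goal reached
            path = []
            while node is not None:                # rebuild the path from parent pointers
                path.append(node)
                node = parent[node]
            return g[goal], path[::-1]
        explored.add(node)
        for child, cost in graph[node].items():    # step 4
            if child in explored:
                continue                           # step 5
            new_g = g[node] + cost                 # step 7: parent's g plus arc cost
            if child not in g or new_g < g[child]: # step 10: keep the cheaper path
                g[child] = new_g
                parent[child] = node
                heapq.heappush(unexplored, (new_g + h_estimate[child], child))  # step 9
    return None                                    # step 11: unexplored list exhausted

# Hypothetical arc costs and heuristic values, only to show the call:
graph = {'W': {'S': 2, 'V': 9}, 'S': {'V': 3, 'N': 2, 'H': 6}, 'V': {'Z': 4},
         'N': {'M': 5}, 'H': {}, 'Z': {'B': 7}, 'M': {'B': 9}, 'B': {}}
h = {'W': 12, 'S': 11, 'V': 8, 'N': 10, 'H': 9, 'Z': 6, 'M': 8, 'B': 0}
print(a_star_v2(graph, h, 'W', 'B'))               # (16, ['W', 'S', 'V', 'Z', 'B'])
```

Note that with this bookkeeping the backtracking of step 13 needs no special code: because the best node is always taken from the whole unexplored list, a node such as V above can be reached through a later, cheaper path without any explicit path surgery.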
The changes are in steps 10 and 13. Step 10 handles the case where a new path to an already generated node is better than the old one and is therefore adopted. Step 13 is about backtracking and picking up a better path. In fact, step 13 is a special case of a more general issue, which is discussed in the next segment.
The back propagation of estimates
Though we have already seen two versions of A*, we are not yet done. An important addition is still to be made to the algorithm.
There is still a catch, which you have probably noticed. The h' estimates must be updated once the children are explored. In fact, every time a node's g value is calculated, the information must be propagated back. For example, we start with W with an estimate of 15. When we explore it and find that the best path through S is estimated to be 11, the estimate of W must be changed to 11. In this case we need not do anything further, as we have reached the start node.
What if W is an intermediate node and there are a few branches connecting to it? We must then propagate the corrected value of W back. To understand the point, assume that AB is a parent of W and AC is another child of AB, as depicted in figure 13.3. AB, upon looking at its two children, thought that AC was better. When we explore both children further, S comes out with a better answer and W's estimate changes to (n, 10). Now the path through AC is more expensive than this path, so AB -> AC must be replaced with AB -> W in the best path. You might have noticed that we have replaced 0 by n: W is no longer the start node, so it must have some g value, which we assume to be n.
You can see that when a child's value causes the parent to revise its estimate, the best path itself may change.
Figure 13.3 Why backward propagation is necessary.
Another point. What if the AC’s h’ value is 9? That means though the revised measurement of W is better than before, it is not the best so we will not have to do anything further. In fact in the case of W being better than AC, the AB estimate also is to be updated. Thus for a parent being P1, the best path, which was earlier through C2, changes to AB. Thus such back propagation influences the path to change to better ones. The path to solution dynamically changes the priority due to this.
Thus, the general rule is: whenever we get an update that makes an old path better, we must propagate the information back until we reach a point where no better path is possible.
Another point: we have seen a seemingly bad path becoming better, but it is equally possible for a good path to become worse. What if the current best path passes through W and the estimate of W is updated? If the updated value is better than the previous one, as in the example above, we do not need to do anything.
What if W's original estimate is increased? Say it becomes (n, 20) because both S and V are found to be further away than assumed. Now if the alternate path through AC, say (n, 17), is better, we need to update that at AB. The old AB value of 15 (the best of W and AC) now becomes 17 (again the best of W and AC), and any path passing through AB under the assumption of 15 must be reassessed. If there is an alternate path with an estimate of 16, that one is to be chosen. Look at figure 13.3: P1 had chosen AB as the best node because it assumed the distance through AB to be 15. Now that the estimate is revised to 17, that path is no longer optimal and the path through C2 is better. You can see that this is just an extension of the example in figure 13.2, where O is revisited and a new path is explored.
What is the moral of this story? The algorithm we have discussed must be augmented with an important addition. After the g and h' values of the children are calculated, they must be propagated back, and the values of the affected explored nodes updated, as long as the change can alter which path through them is best. The propagation might result in a change of the best path, and it may well go back all the way to the root node. This back propagation is essential for obtaining the best path. Heuristics are what they are: estimates. They may be wrong, or at least may not tell us the distance to the goal node precisely. When we learn the exact value, we must propagate it back and close any gap created by the difference between the estimated and actual values. Once the values are recalculated and come closer to reality, they better represent the state of the problem, and it becomes more probable for A* to find an optimal solution.
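Here is a small Python sketch of the back propagation idea alone, not the full third version of the algorithm; the data structures and the example values are assumptions made for illustration.

```python
from collections import deque

def propagate_back(estimate, parents, children, arc_cost, changed_node):
    """Push a revised estimate back toward the root.

    estimate[n]      : current estimated cost of reaching the goal from n
    parents[n]       : nodes from which n was generated
    children[n]      : nodes generated from n
    arc_cost[(p, c)] : cost of the arc from p to c

    A parent's estimate is the best (arc cost + child estimate) over its
    children; whenever that value changes, the parent's own parents must be
    re-examined, so the propagation may run all the way back to the root."""
    queue = deque([changed_node])
    while queue:
        node = queue.popleft()
        for p in parents.get(node, []):
            best = min(arc_cost[(p, c)] + estimate[c] for c in children[p])
            if best != estimate[p]:
                estimate[p] = best           # revised estimate; the best path may change
                queue.append(p)              # push the change one more level back

# Hypothetical fragment of figure 13.3: AB has the two children W and AC.
estimate = {'W': 10, 'AC': 12, 'AB': 13}
parents = {'W': ['AB'], 'AC': ['AB'], 'AB': []}
children = {'AB': ['W', 'AC']}
arc_cost = {('AB', 'W'): 1, ('AB', 'AC'): 1}
propagate_back(estimate, parents, children, arc_cost, 'W')
print(estimate['AB'])   # 11: AB now prefers the route through W over the one through AC
```

As noted below, a cycle in the graph would make this simple version keep propagating unnecessarily, which is one of the details a complete third version has to handle.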
In fact, even this back propagation can run into problems; for example, if the path contains a cycle, the propagation can continue going back unnecessarily. We are not going to discuss those details here. Writing the third version of the A* algorithm is left as an exercise. The books mentioned in earlier modules, Rich and Knight as well as Khemani, both contain the final version of the algorithm. In the end, let us answer one query.
Why h’ and not h?
We have used h' instead of h. In the case of A*, the convention is that h denotes the exact distance from the current node to the goal node, while h' is an estimate of it. If h' = h, the heuristic gives the correct distance to the goal node and we would not need any search; we would walk straight to the goal without roaming around. Figure 13.1 is actually an example of such a case: we never need to pick anything other than a child of the node just expanded, and the search is straightforward. Usually h' is not that accurate, and we may need to come back, as in figure 13.2, where we find that our earlier estimates for V and I were wrong and need to be updated.
As h' is just an estimate, what is the guarantee that the result we receive is an optimal solution, or that the path is really the shortest one? Good question. Wait till the next module, where we discuss that and some other related points.