4 Unguided Search methods

Bhushan Trivedi


 

Introduction

 

We have already learned that AI is a discipline that tries to solve ‘difficult’ problems. We have also seen that the first step in solving an AI problem is to convert the problem’s domain knowledge into some form of state space, and that building a state space alone is not enough. For any real-world problem, it is imperative to have some strategy to guide the journey from the start state to a final state, maneuvering through the intermediate states of the state space.

 

This maneuvering process is known as search in AI parlance. It is a vital component of any AI solution, especially when the solution has to wade through an enormous number of possible states. A typical chess playing program, for example, might need to evaluate a million states during a serious game.

 

We will look at two types of search, guided and unguided.

 

Guided and unguided search

Let us go back to the previous chapter and look at the 8-5-3 gallon milk jug problem. Suppose we have no clue about how to reach the solution. We may start randomly and try applying any rule that is applicable at the current state. We may produce a sequence different from the one that led to the solution in the previous chapter, but given enough tries we are quite likely to reach a solution, barring one important point: the solution path should not contain any cycles.

 

For example, consider the following sequences:

8-0-0, 3-5-0, 8-0-0, …

8-0-0, 5-0-3, 0-5-3, 3-5-0, 8-0-0, …

8-0-0, 5-0-3, 5-3-0, 2-3-3, 0-5-3, 5-0-3

 

Such a process may continue forever without ever reaching a final state, because states keep getting repeated.

 

So we may form a rule for the search: a state that has already been generated should not be generated again. This requirement is sometimes described as being systematic.

 

Even if we follow that rule, you can clearly see that many futile states may be visited before the search embarks on the right path. Even for such a simple problem, we may end up traversing a very long path before finding a solution.

 

Such a search, which operates blindly, applying arbitrary moves in the hope of reaching a final state, is known as unguided or blind search.

 

On the contrary, it is usually possible to use some domain knowledge to learn that certain types of moves are better than others. Trying them before the others usually leads to a solution faster. Such domain knowledge is called a heuristic, and a search that operates using heuristics is called guided search.

 

In this module, we will look at some unguided search methods. Though they are less efficient than guided search methods, they are useful when there is not enough domain knowledge, and also when a guided search is applied in conjunction with them.

 

Generate and test

 

One of the simplest ways to search without guidance is to take any node and test it for being a solution. This is not going to work for the problems we saw in the previous chapter, so let us take the example of an expert system called Dendral. It was designed to help chemists determine the structure of a chemical compound. The information about the compound is taken from mass spectrometer and nuclear magnetic resonance readings. Looking at the outputs from both devices, candidate structures are calculated. There can be many candidate structures (sometimes running into millions), and only the few that satisfy the required constraints are to be presented. Working together with another search type (called constraint satisfaction search), generate and test solves this problem quite efficiently. It looks at each generated candidate (shortlisted by the constraint satisfaction algorithm), compares it with real-world compounds and checks whether there is a match (a solution). It quits when a solution is found.

 

Let us see how the generate and test algorithm works. It uses a simple procedure:

 

1. Generate a random node from the state space

2. Test if it is a solution

3. If it is, quit; otherwise go back to the first step
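A minimal sketch of this loop in Python follows (an illustration, not the text's own code). The `generate` and `is_goal` helpers are assumptions standing in for whatever produces candidate states and recognises a solution; a cap on the number of tries keeps the loop from running forever on a large state space.

```python
import random

def generate_and_test(generate, is_goal, max_tries=100_000):
    """Generate a candidate (step 1), test it (step 2), quit on success
    (step 3); otherwise loop. Gives up after max_tries attempts."""
    for _ in range(max_tries):
        candidate = generate()
        if is_goal(candidate):
            return candidate
    return None

# Toy usage: look for an integer whose square ends in 21.
found = generate_and_test(lambda: random.randint(1, 1000),
                          lambda n: (n * n) % 100 == 21)
```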

 

This method is quite familiar to us. For example, when I am leaving for my college and cannot find my car keys in the place I usually keep them, I try looking for them in my table drawer, in the showcase, on my study table and so on, perhaps finally looking at the place where my daughter is playing. What am I doing? I am generating states one after another and checking whether each is a solution state; if it is, I quit, otherwise I try some other option.

 

This simple method is useless if the state space is too large. For example, if I have to search my entire house for the keys, it is impractical in real time. That is why generate and test is applied only when the state space is small enough, or when some other technique is used to reduce the number of states to a manageable level.

 

An additional refinement of generate and test is called hierarchical generate and test (HGT). One example of an HGT application is a fingerprint matching program. A fingerprint might be captured from a crime scene. HGT is first applied to test whether the fingerprint is of a curved type, a round type or some other type. Once the type is confirmed, the same process is carried out for the subtypes of that type. When the final subtype is chosen, ordinary generate and test is applied to compare the obtained fingerprint with the criminals’ fingerprints of that particular subtype already stored in the database.

Figure 4.1 example of HGT
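A hedged sketch of this idea is given below; every helper named here (the two classifiers, the database mapping and the matcher) is hypothetical and merely stands in for the components the text describes.

```python
def hierarchical_generate_and_test(sample, classify_type, classify_subtype,
                                   database, matches):
    """Narrow the candidates level by level, then run plain generate and
    test on the small group that remains."""
    fp_type = classify_type(sample)                  # first level: broad type
    fp_subtype = classify_subtype(sample, fp_type)   # second level: subtype
    for stored in database.get((fp_type, fp_subtype), []):
        if matches(sample, stored):                  # ordinary generate and test
            return stored
    return None
```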

 

Breadth first search (BFS)

 

Another unguided but quite useful search method is known as breadth first search. Let us take our 8-5-3 gallon milk jug problem to understand this.

 

Figure 4.2 shows a breadth first search graph. Its root node is 8,0,0, the start node. The search applies the two applicable rules and generates the two states reachable from 8,0,0, namely 3,5,0 and 5,0,3. The same process is carried out for the next few levels.

 

Figure 4.2 a graph showing breadth first search.

 

Look closely at figure 4.2. Each node describes a state. Each node has a few children, generated by applying all applicable rules to that state. A node that is repeated is shown in a dark shade. Each node can actually generate its own parent as one of its children; this is not shown except at the second level, where 8,0,0, the parent node, is generated again. It is omitted elsewhere to keep the graph less cluttered. Most other repeated nodes are likewise not shown, to simplify viewing.

 

This exploration process is carried out as follows:

  1. Start with the start state as the root node.
  2. If a node is repeated or is the goal state, do not explore it further. The process quits successfully if the goal state is reached.
  3. Apply every applicable rule (based on the preconditions) to the state, one by one, to generate all children of that node. Once all rules have been applied and all children generated, the next level of the search graph is said to be generated. (It looks like a tree but is actually a graph, as some children have multiple parents.)
  4. Pick up the leftmost child of the previous level and explore it from step 2. Then explore the next node from the left, and so on. Once every child of the parent node has been explored, this level too is said to be completed.
  5. When there is no way forward, the search stops. Otherwise it continues from step 2 for each child.
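The procedure can be sketched in Python as below (an illustration, not the text's own code). It assumes a `successors(state)` helper that applies every applicable rule to a state and returns a list of the resulting states; for the 8-5-3 jug problem a state could be a tuple such as (8, 0, 0).

```python
from collections import deque

def breadth_first_search(start, is_goal, successors):
    """Explore the graph level by level; repeated states are never re-explored."""
    frontier = deque([[start]])          # queue of paths, expanded oldest first
    visited = {start}                    # being systematic: no repeats
    while frontier:
        path = frontier.popleft()
        state = path[-1]
        if is_goal(state):
            return path                  # the first solution found is the shallowest
        for child in successors(state):  # apply every applicable rule
            if child not in visited:
                visited.add(child)
                frontier.append(path + [child])
    return None                          # state space exhausted without a solution
```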

 

There are a few important advantages of this search method. Let us look at the two most important ones.

1.  This search method finds all possible solutions.

2.  We are guaranteed to get the optimum solution when the number of moves (the length of the solution path) is to be minimized.

 

Unfortunately, the amount of memory required grows exponentially with each level. The number of children each node generates for the next level is called the branching factor. In our trivial problem the branching factor is about 3 on average, ignoring the parent node. In a game like chess it is as large as 35 on average. As mentioned in the book “A First Course in Artificial Intelligence” by Deepak Khemani, it becomes impossible to explore the search graph beyond a few levels, especially when the branching factor is high. This single problem prohibits the use of the method in its raw form.
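As a rough, back-of-the-envelope illustration (assuming a uniform branching factor $b$), the number of nodes BFS must remember grows as

\[
\text{nodes at depth } d = b^{d}, \qquad \text{total nodes up to depth } d = \frac{b^{d+1}-1}{b-1}.
\]

With $b = 35$ and $d = 6$, a single level already holds $35^{6} \approx 1.8$ billion nodes.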

 

Depth first search (DFS) 

 

Figure 4.3 the depth first search, first few moves

 

Figure 4.3 illustrates the depth first search process. Depth first, unlike breadth first, explores a single branch (usually the leftmost) until it reaches a dead end (a state where no rules are applicable), a repeated node or a goal node. In the case of a dead end or a repeated node, it backtracks to the parent, takes another branch (usually the next leftmost) and continues. It stops when it reaches a goal node. The process can be described as follows.

 

1. Begin with the root node as the start node.

2. If this node is a goal node, quit successfully.

3. Explore the leftmost child (by applying the first rule whose preconditions match the current state).

4. If this node is a repeated node or a dead end, return to the parent node and explore its next child; if the parent has no further children, consider its parent, and so on. If the root node is reached and there are no further children to explore, the search ends unsuccessfully.

5. Go back to step 2.
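Using the same assumed `successors(state)` helper as before, a minimal sketch only needs to swap the queue for a stack:

```python
def depth_first_search(start, is_goal, successors):
    """Follow one branch as deep as possible before backtracking."""
    stack = [[start]]                    # stack of paths: newest path expanded first
    visited = {start}                    # skip repeated states
    while stack:
        path = stack.pop()
        state = path[-1]
        if is_goal(state):
            return path
        # push children in reverse so the leftmost child is explored first
        for child in reversed(successors(state)):
            if child not in visited:
                visited.add(child)
                stack.append(path + [child])
    return None
```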

 

You can see that this search method does not demand the amount of memory that BFS demands. That is the reason it is used in many search programs. One of the AI languages, PROLOG, uses DFS as its default search method.

 

Depth bounded DFS (DBDFS)

 

Depth bounded DFS is a combination of DFS and BFS: the advantages of both can be obtained by designing the DBDFS carefully. The problem with DFS is that if the search paths are very long and the solution lies somewhere in a branch on the right, it takes a huge amount of time. BFS, in turn, may take too much time (and memory) if the solution lies very deep.

 

Depth bounded (sometimes called depth limited) search assumes a fixed depth as the limit and starts like a normal DFS. When it reaches that boundary, it treats the node as a dead end and tries alternate paths. Figure 4.4 explores the search graph using a depth of 2. Once the graph reaches the complete stage shown in the final part of figure 4.4, it can be extended further, as shown in figure 4.5, for another cycle of the same length. The requirement, of course, is to remember the complete tree so as to resume from its leftmost node. If memory is the constraint, the search has to start all over again.

Figure 4.4 DBDFS search for two levels

 

Figure 4.5 DBDFS may be extended by the same length once one cycle ends unsuccessfully.
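A recursive sketch of the depth bounded variant (again assuming the `successors(state)` helper, with `bound` as the chosen depth limit):

```python
def depth_bounded_dfs(start, is_goal, successors, bound):
    """DFS that treats any node at depth `bound` as a dead end."""
    def descend(path):
        state = path[-1]
        if is_goal(state):
            return path
        if len(path) - 1 >= bound:        # depth limit reached: pretend dead end
            return None
        for child in successors(state):
            if child not in path:         # avoid cycles along the current path
                result = descend(path + [child])
                if result is not None:
                    return result
        return None                       # subtree exhausted within the bound
    return descend([start])
```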

 

Depth first iterative deepening (DFID)

An interesting and successful extension of DBDFS is DFID, where the depth value starts at 1 and is incremented after each cycle. That means a graph is constructed with depth = 1 first, then depth = 2, and so on. One interesting thing about DFID is that it starts all over again for the next cycle; the memory problem prohibits the algorithm from remembering the previous nodes. The advantage is that it can find the optimum solution, like BFS, with the storage requirement of DFS. The cost it pays is the extra CPU time spent on regenerating the entire tree every time. Though this operation seems wasteful, it can actually lead to a far more optimized search if augmented with heuristic knowledge.
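Reusing the depth bounded sketch above, iterative deepening is just a loop over increasing bounds (the `max_depth` cut-off is an assumption added so the sketch terminates even when no solution exists):

```python
def depth_first_iterative_deepening(start, is_goal, successors, max_depth=50):
    """Run depth bounded DFS with bound = 1, 2, 3, ..., regenerating the
    tree from scratch in every cycle, as DFID does."""
    for bound in range(1, max_depth + 1):
        result = depth_bounded_dfs(start, is_goal, successors, bound)
        if result is not None:
            return result                 # shallowest solution, as BFS would find
    return None
```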

 

We will look at how that is actually done later, but a short explanation is in order. When one pass is over, the algorithm can apply a heuristic function to all leaves of the graph, find the most promising nodes and explore them before the others in the next pass. Moreover, nodes that do not lead to promising paths are not explored further. All in all, it can achieve a much better result, so much so that it is a popular choice as the search algorithm of many chess playing programs.

 

Comparison

 

Let us end this module with a brief comparison. Both BFS and DFS will eventually find a solution if one exists, as they tend to explore every node of the state space, provided the search space is finite.

 

It is possible for a search graph to be infinite, or virtually infinite. Suppose we assume that some string or some integer has a property X and we are interested in checking for it. The search graph, in that case, explores all integers one after another, or all strings one after another. If the property does not hold, that is, there is no integer or string for which our assumption is correct, the search process never stops. For example, suppose we are looking for an integer (other than one) whose square and cube have the same value. We may start with (4, 8), (9, 27), …. We will continue forever without finding any N for which N² = N³.
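A tiny illustration (not from the text): checked without a bound, this search loops forever, whereas giving it a limit guarantees that it stops and reports failure.

```python
def find_square_equals_cube(limit):
    """Bounded version of the hopeless search: try n = 2 .. limit, then give up
    instead of running forever (no integer n > 1 satisfies n**2 == n**3)."""
    for n in range(2, limit + 1):
        if n ** 2 == n ** 3:
            return n
    return None                           # reports "not found up to limit" rather than hanging

print(find_square_equals_cube(1_000_000))  # prints None
```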

 

Depth limited or depth bounded search is better in such cases: when the search is bounded, it is guaranteed to stop. Real-world problems are more often of the virtually infinite kind. For example, take the case of a system that allows passwords of at most 10 characters. A password cracker program would take an inordinate amount of time to try all possible 10-character sequences. A better approach is to use a DBDFS or DFID search; in case some user has chosen a short password, it can then be cracked in real time.

 

One more issue is that of the optimal solution. If optimality means the minimum number of moves from the start node, BFS will always give us the optimum solution, while DFS does not guarantee it. DBDFS may find a non-optimal solution, but it cannot be much worse than the optimal one. For example, if the depth bound is 4, the optimal solution may lie immediately after the start node but on the right side, while a non-optimal solution at some deeper level (at most the fourth) on the left side may be discovered first; the difference in length is then guaranteed to be 3 or less.

 

One more important issue is producing the best solution when a response must be given within a specific time. For example, a chess program often works under a heavy time constraint and must respond in time. DFID, under such a constraint, might quit after the 5th or 4th (or some other) iteration and respond with the best node found so far.

 

One can easily understand the need to reduce the amount of work by constraining the search process. Unguided search processes may be augmented with heuristic-based methods to make sure the problem is solved in real time. We will look at heuristic-based searches in the next module.
