18 Problems with GSP and introduction to Plan Space Planning

Bhushan Trivedi


 

Introduction

 

In this module we will see the problems with Goal Stack Planning. We will also see how Plan Space Planning works and solves some problems that cannot be handled by GSP.

 

Goal stack planning is a simple method which works in a recursive way, but it is hard to implement it in a manner that produces an optimal solution. Whenever a goal is achieved component by component, solving the current component might disrupt previously solved components if the components have some inter-dependency. Sussman has shown that even for a very trivial blocks world problem we may end up doing unnecessary stacking and un-stacking.

 

Plan space planning is a method of implementing planning in a nonlinear way. Actions can be added at any part of the plan. Actions are ordered when one action is a cause of another, or when some predicate is an effect of one action and a precondition of another. When an action can break such a causal link, or deletes or modifies the preconditions of another action, the ordering is constrained to make sure that the break does not take place. Though much more complicated, plan space planning can achieve a far better optimized plan compared to goal stack planning.

 

Problems with GSP

 

When a goal state is a combination of a few sub-goals, for example the goal state described below, GSP might generate a suboptimal solution.

 

On(A,_) ᴧ On(B,A) ᴧ On(C,D) ᴧ On(D,_) ᴧ Free(B) ᴧ Free(C) contains six different sub-goals.
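
A conjunctive goal like this maps directly onto code. Here is a minimal sketch, assuming predicates are encoded as plain strings (an encoding chosen for illustration, not taken from the module):

```python
# A goal is a set of predicate strings; it is satisfied exactly when
# every sub-goal is present in the current state.
goal = {"On(A,_)", "On(B,A)", "On(C,D)", "On(D,_)", "Free(B)", "Free(C)"}

state = {"On(A,_)", "On(B,A)", "On(C,D)", "On(D,_)", "Free(B)"}

print(goal <= state)   # False: one sub-goal is unsatisfied
print(goal - state)    # {'Free(C)'}: the component still to be solved
```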

 

GSP requires solving one sub-goal after another. Some other planning methods also work in a similar fashion; that type of planning is called linear planning. Linear planning works only when the sub-goals are independent of each other. In the above case the sub-goals are not independent, and thus linear planning might fail.

 

Another issue is: what if the preconditions do not hold in the current state? Let us take the same example we used in the previous module. We picked up On(A,_) and found the prerequisite Free(A) to be true, so we went straight ahead. What if Free(A) is not true? The process gets a little longer. We must achieve Free(A), and for that, put whatever is sitting on top of A, say some block X, on the floor. If something is resting on X, say Y, we must place Y on the floor before that.

 

Thus, instead of the goal state, we now pick up the previous state (from where we can reach the final state by applying a specific action; the one that achieves Free(A) in our case). We do the same thing for it: choose an action to apply (PutOnTable(X) may be the case if X is resting on top of A), and see if its preconditions are matched in the start state. If so, we apply that action and the plan is complete! If not, we take one more step back. Eventually we reach the start state using a recursive process.

 

Now we have a slightly better solution which recursively goes back and checks whether it connects to the start state by checking which preconditions already hold there. It looks like: final state <- some action <- previous state <- previous to previous state <- ... <- start state.
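
The recursive idea can be sketched in a few lines. The following is a simplified illustration, not the module's exact algorithm: operators are assumed to be fully ground, encoded as (name, preconditions, add effects, delete effects) over string predicates, and the regression step ignores delete-list conflicts:

```python
def plan_backward(goal, start, actions, depth=8):
    """Work backwards from `goal` until the start state supports everything."""
    if goal <= start:                        # goal already holds in start state
        return []
    if depth == 0:                           # cut off very deep regressions
        return None
    for name, pre, add, delete in actions:
        if not (add & goal):                 # action must achieve part of goal
            continue
        subgoal = (goal - add) | pre         # what must hold before the action
        prefix = plan_backward(subgoal, start, actions, depth - 1)
        if prefix is not None:
            return prefix + [name]
    return None

# Hypothetical operator: X rests on A, so putting X on the table frees A.
actions = [("PutOnTable(X)",
            frozenset({"On(X,A)", "Free(X)"}),      # preconditions
            frozenset({"On(X,_)", "Free(A)"}),      # add effects
            frozenset({"On(X,A)"}))]                # delete effects
start = frozenset({"On(X,A)", "Free(X)"})
print(plan_backward(frozenset({"Free(A)"}), start, actions))  # ['PutOnTable(X)']
```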

 

Are we done? There are still two problems to address:

 

1. What if there are multiple actions resulting in the final state from various previous states? This may also be true for earlier states. We have to check each one of them and generate multiple stacks, exactly as we did for the alternate orders in which components can be chosen.

 

2. What if the goal contains multiple components? We have to pick up each component and solve it individually. This might not be as easy as it sounds: it is possible that solving one component disturbs the plan for some earlier solved component. The solution that we produce must address this issue.

 

The second point needs a little more elaboration. The situation described there happens when the goal components are inter-dependent, so changing one changes another. We pick up each component and solve them linearly, one after another. In case of inter-dependency, solving a component might un-solve an already solved component, and thus a final test to see whether the complete goal is achieved is a must. Once we complete all components, we need to check once again that the complete goal description is true. This might look redundant but it is not. In fact goal stack planning pushes not only each component but also the complete goal description onto the stack. Thus when all components are solved, the complete description is also tested. If it does not match, the disturbed component is again added to the goal stack and the process is resumed.

 

Let us try to understand the last point using an example depicted in figure 18.1.

 

Start state              End state

  B    C                   B    A
  A    D                   C    D

Figure 18.1 The problem

 

We have two goal components, On(B,C) and On(A,D). We have two options: trying On(B,C) first or On(A,D) first. Suppose we take On(B,C) first. The complete operation is depicted in figure 18.2. On(B,C) is achieved in one step, as shown in 18.2 (b). Now we plan the second component, On(A,D). For that we need both A and D to be free, so we must break apart the three-block structure, which undoes the On(B,C) we achieved in the first step. The rest is simple to understand. When we achieve On(A,D) in 18.2 (e), we check the complete goal and find that On(B,C) is false. We push it on the stack and solve it in 18.2 (f).

 

Now look at the other alternative, carried out in figure 18.3: we pick On(A,D) first and then On(B,C). You can clearly see that we do not need to undo anything in this process. The plan we generate is more compact and optimized than the plan in 18.2.

Figure 18.3 Solution begins with solving On(A,D) first

 

One must understand that the ordering of actions makes a huge difference, as it did in this case. Finding a good order is one of the most important things associated with GSP. GSP tries all possible combinations and comes out with the best amongst them.

 

Unfortunately it is not always possible to reorder actions so that no undoing is needed. Let us take another example to illustrate a problem known as Sussman's anomaly, a trivial case with a two-component goal state. One can take the first component and then the second, or vice versa. In both cases, we need to undo whatever was done earlier.

 

Sussman’s Anomaly

 

We now have a different problem, depicted in figure 18.4. This is a version similar to what Sussman showed in his famous paper.

Figure 18.4 The setup for Sussman's anomaly case

 

The start state and goal state have Free(B) ᴧ On(D,_) in common, so those predicates need not worry us. The only points of concern are the two goal components, On(B,C) ᴧ On(C,D). We can try achieving the goal state by picking one before the other and see what happens.

 

Let us take On(C,D) first. The prerequisite for placing C on top of D is that both of them must be free. Both Free(C) and Free(D) are true in the current state, so in a single action we can achieve On(C,D). Figure 18.5 depicts the same.

Figure 18.5 Achieving On(C,D)

 

Once On(C,D) is achieved, let us try to achieve On(B,C). We need both Free(C) and Free(B) for that. Of the two, Free(C) is true but Free(B) is not, so we need to free B. For that we need to execute PlaceOnTable(D) first. But D itself is not free, so we need to execute PlaceOnTable(C) first to free D. Thus we need to execute them in the order mentioned in figure 18.6.

 

After completing both steps, if we check whether the complete goal state is achieved, we find that it is not. We now have to place On(C,D) on the stack again and complete the job as depicted in figure 18.5.

 

Now we have On(C,D) true, but when we check the complete goal once again we realize that On(B,C) is disturbed. Fortunately, when we again insert On(B,C) on the stack, both preconditions Free(B) and Free(C) are satisfied, so we can achieve it. Now we test for On(C,D); we have that too, and the problem is solved.

 

Kindly note that we had to undo our own work twice during this process. Let us see what happens if we take an alternate route: trying to achieve On(B,C) before On(C,D).

Another route

 

Now we pick the other component, On(B,C), first and see if it makes any difference. One of its preconditions, Free(B), is not true, so we need to execute PlaceOnTable(D) first and reach the state depicted in figure 18.8. Now we can achieve On(B,C), as depicted in figure 18.9.

 

Figure 18.9 On(B,C) requires us to break the structure now

 

Now we check and find that On(C,D) is not true, so we have to insert On(C,D) on the stack once again. One of its preconditions, Free(C), is not true, so we need to execute PlaceOnTable(B) for that and reach the state depicted in figure 18.10.

 

 

Figure 18.10 On(C,D) requires us to break the structure now

 

Thus we get On(C,D), but it disturbed On(B,C), which is again inserted on the stack. To achieve it, we look at the prerequisites: we need Free(B), which is true, and Free(C), which is also true. So we go ahead and get On(B,C).

 

Figure 18.11 Achieving it finally

 

Now if we check the final goal, we fortunately find it satisfied.

 

Now you can see what the problem is. Here is a typical, seemingly innocuous problem which demands only that we solve two goal components. We faithfully pick a linear order, trying one component before the other, and what do we get? Neither order works perfectly.

 

The problem is that when we pick up the next goal component and try solving it, the previous one is disturbed, irrespective of the order in which we solve them. This adds unnecessary steps and complicates the plan. The plan gets longer than required and is obviously far from optimal.

 

This problem is called Sussman's anomaly. In such cases GSP takes unnecessary steps, doing and undoing work, irrespective of the order in which the components are solved.

 

Sussman proved that even for the trivial problem we have discussed here, it is impossible to find an order of solving the components that yields a linear solution. We need to undo what we do first in both cases.

 

Before we proceed further and throw some light on an alternate, nonlinear way of planning, there is another point to look at: why is the word stack used here? GSP is basically a recursive process, and the plan requires us to build the solution from the beginning. We push the goal state first, then the previous state and the action which converts that previous state into this one, and so on until we reach the start state. Now we can start popping out states and actions: we write our plan starting from the start state, pick the next item from the stack, add it to the plan, and so on. To manage the case of one component's solution disturbing another already solved component, we also check the complete goal state at the end. If it is not fulfilled, we push the disturbed components (already solved earlier but disturbed later) back on the stack and restart the process.
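
The push/pop discipline above can be made concrete in a short sketch. The following is a simplified illustration, not the module's exact algorithm: predicates are plain strings, operators are fully ground, the first achiever found is always chosen (and assumed to exist), and the step limit exists because, as the next paragraph discusses, some goal combinations make the loop run forever:

```python
from collections import namedtuple

# A ground STRIPS-style operator: preconditions, add list, delete list.
Op = namedtuple("Op", "name pre add delete")

def gsp(start, goal, ops, limit=100):
    """A toy goal-stack planner over sets of predicate strings."""
    state, plan = set(start), []
    # Push the complete goal first, then its individual components.
    stack = [("conj", frozenset(goal))] + [("pred", g) for g in goal]
    while stack:
        if len(plan) > limit:
            return None                   # e.g. circular goals never finish
        kind, item = stack.pop()
        if kind == "pred" and item not in state:
            op = next(o for o in ops if item in o.add)   # pick an achiever
            stack += [("op", op), ("conj", op.pre)]
            stack += [("pred", p) for p in op.pre]
        elif kind == "conj":
            unmet = [p for p in item if p not in state]
            if unmet:                     # re-push disturbed components
                stack += [("conj", item)] + [("pred", p) for p in unmet]
        elif kind == "op":
            state = (state - set(item.delete)) | set(item.add)
            plan.append(item.name)
    return plan

# Freeing A when C rests on it, as in the earlier example.
ops = [Op("PutOnTable(C)", frozenset({"On(C,A)", "Free(C)"}),
          frozenset({"On(C,_)", "Free(A)"}), frozenset({"On(C,A)"}))]
print(gsp({"On(C,A)", "Free(C)", "On(A,_)"}, {"Free(A)"}, ops))
# -> ['PutOnTable(C)']
```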

 

Though it may take longer, we can see that goal stack planning can actually solve the problem. That means we get a valid plan, though it is not guaranteed to be optimal. Unfortunately, if the goal state is impossible to achieve, GSP can run into trouble: whatever is left unsatisfied is placed on the goal stack, solving it undoes something done earlier, and that continues forever. What if we are given a goal of On(D,E) ᴧ On(E,F) ᴧ On(F,D)? This is a kind of circular dependency. We can only satisfy any two of these three at any given point of time. The algorithm will keep trying to solve the third component, un-solving the first one; solving the first again will un-solve the second, and so on. The process will never end.

1 Gerald Jay Sussman was the Panasonic Professor of Electrical Engineering at the Massachusetts Institute of Technology (MIT). He described the problem that we discussed here.

 

Plan space planning

 

If you remember our earlier discussion on solution spaces, you can understand the concept of plan space planning easily. The plan space contains all possible plans, and plan space planning searches this space for a correct plan. It moves from one plan to another by adding actions at any part of the plan; each such addition effectively picks another plan from the plan space.

 

In the (linear) planning we have seen so far, the system looks at the current state and the goal, and terminates when the state connects to the goal state. It acts linearly, adding one action at a time, looking at the preconditions and focusing on the current state. The process also adds the state resulting from that action. GSP is the reverse process but logically the same: we begin from the goal state, find an action which can convert a previous state into this state, check the preconditions, and so on. The idea is to keep continuous focus on the current state, deciding the next move without looking at the overall plan.

 

Unlike GSP and a few other earlier planning methods, where the current state is the focus of attention and a state is added only as the next or previous state, plan space planning allows actions to be added at any place in the plan; this is why the method is sometimes called nonlinear planning. Plan space planning does not work on a strict ordering of states: the actions listed in the plan do not follow a strict order. Some actions need to be done in order and some do not, which is why this process is also known as partial order planning.

The process begins with a small set of actions in a plan and adds actions at suitable places. It looks at quite a few things at a time. It looks at actions and their prerequisites (in plan space planning they are known as preconditions) to decide whether they are applicable at a specific place in the plan. It also checks whether an order is a must for some actions. For example, unless you pick up a block resting on another block, you cannot stack anything on the lower one; so PlaceOnTop(X,Y) can only happen after PlaceOnTable(Z) if Z is on top of Y right now. These two actions must be arranged in order. Sometimes actions can be done without ordering: if we want to place A over B and C over D when both B and D are resting on the table, the two actions can be done in either sequence.

Another thing plan space planning takes into account is the priority of what to do next. Unlike goal stack planning, where one picks and solves one component and only then another, PSP picks action after action and sees if some other action needs to be executed in the middle. For example, while solving On(B,A), if it finds an action threatening to undo On(B,A) later, it might schedule that action much later, after the complete effect of On(B,A) is over, or it might schedule it before, so that it cannot interfere with placing B over A or disturb it later.
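
The "no strict ordering" point can be made concrete. A minimal sketch, assuming actions are plain strings and ordering constraints are (before, after) pairs; with no constraint between the two stacking actions, both interleavings are valid:

```python
from itertools import permutations

actions = ["PlaceOnTop(A,B)", "PlaceOnTop(C,D)"]
orderings = set()      # B and D are both on the table: no constraint needed

def linearizations(acts, before):
    """Yield every total order of `acts` consistent with the `before` pairs."""
    for perm in permutations(acts):
        pos = {a: i for i, a in enumerate(perm)}
        if all(pos[x] < pos[y] for x, y in before):
            yield perm

print(list(linearizations(actions, orderings)))   # both orders appear
```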

 

PSP treats actions which can be a potential threat to other actions, or which might change or delete the preconditions of actions to follow, as flaws. It works by eliminating flaws as it proceeds. Once all flaws are eliminated, PSP guarantees a valid plan, and unlike GSP it can arrive at an optimal one.

 

This method has the same advantage that global heuristic functions have over local heuristic functions. PSP looks at the complete plan rather than a single state at a time and can escape problems associated with local exploration. Linear planning methods like GSP look at the local (current) state and decide the next or previous move irrespective of other moves. That means the current move might invalidate something achieved earlier, or prohibit some other move that is a must for achieving the goal, forcing the current move to be undone at a later stage. Such complications can be avoided when plan space planning is used. We have seen that in some cases reordering gets us an optimized solution, but in other cases it does not; Sussman's anomaly plagues cases where we cannot solve the problem well with GSP no matter how we order the components.

 

Unlike GSP, PSP looks at the entire plan and can add or remove actions anywhere in it. This gives the decision maker a global vision, and thus a view of the overall impact of adding or removing an action. As plan space planning does not focus continuously on any one component, it can avoid the problems which linear methods like GSP introduce. It is possible to shift attention in the middle of the process to avoid problems like Sussman's anomaly.

 

PSP is about generating sequences of actions which represent a correct plan. Each action has two things associated with it. The first is its prerequisites, or preconditions: without the preconditions being true, the action cannot be performed. The other part is called the effect of the action: the effect is the set of changes the action makes to the current state.
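
This pairing of preconditions and effects maps naturally onto a small data structure. A sketch, assuming string-encoded predicates and effects split into added and deleted predicates (names are illustrative):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    preconditions: frozenset  # must all hold before the action runs
    adds: frozenset           # predicates the action makes true
    deletes: frozenset        # predicates the action makes false

# Putting C (currently on A) on the table frees A.
place_on_table_C = Action(
    "PlaceOnTable(C)",
    preconditions=frozenset({"On(C,A)", "Free(C)"}),
    adds=frozenset({"On(C,_)", "Free(A)"}),
    deletes=frozenset({"On(C,A)"}),
)
print(place_on_table_C.adds)
```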

 

The initial node in plan space planning is characterized by two special actions, A0 and A∞. These two actions are part of every plan and indicate the initial and final actions. A0 produces the start state as its effect and has no preconditions itself; thus every plan begins with it. A∞ has no effects, and its precondition is the goal state; this action describes the final state. The initial node (start node) thus contains both the start state and a description of what the goal looks like. It is the first plan, which we will keep modifying until we get a final valid plan. Thus we begin with the simplest plan, A0 -> A∞, and check whether it is valid, that is, whether all preconditions of A∞ are matched by the effects of A0. If so, we are done.
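
A minimal sketch of this trivial initial plan, assuming the same string-predicate encoding as before (the particular states are invented for illustration):

```python
# A0 "produces" the start state and needs nothing; Ainf "consumes" the
# goal and produces nothing. The plan A0 -> Ainf is valid only if the
# start state already satisfies every goal predicate.
start_state = frozenset({"On(C,A)", "Free(C)", "Free(B)"})
goal_state = frozenset({"Free(A)", "Free(B)"})

A0 = {"name": "A0", "pre": frozenset(), "effects": start_state}
Ainf = {"name": "Ainf", "pre": goal_state, "effects": frozenset()}

print(Ainf["pre"] <= A0["effects"])   # False: Free(A) is still an open goal
```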

 

Obviously we cannot solve problems using only these two actions except in trivial cases; we need other actions. Thus, as planning proceeds, we add more actions. As mentioned before, this is a non-linear process, so we can add actions wherever we deem fit, irrespective of any order.

 

PSP allows us to add links between actions, apart from the actions themselves. The symbol we have used (->) indicates a link; sometimes the symbol ≺ is also used, and we use both interchangeably here. Thus the idea of order changes a bit. One type of link is called an ordering link, which describes order: if we say An ≺ Am, then An must occur before Am. One example of an ordering link is A0 ≺ A∞, the default link which indicates that the first action must occur before the last one. Another type of link is also possible. When a link is written as (An, P, Am), it indicates a causal link: P is a precondition of action Am which is produced as an effect of action An. In a way, we are saying that An produces a prerequisite for Am. Here P is known as a predicate, An is known as the producer of P, and Am is known as the consumer of P. A causal link does require An to come before Am, but it does not require Am to immediately follow An; other actions may appear between the two, as long as the link is honored.
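
The two link types can be carried around as simple data. A sketch, assuming actions are referenced by name and using the blocks-world example that follows:

```python
# Ordering links: pairs (before, after). Causal links: triples
# (producer, predicate, consumer); here PlaceOnTable(C) produces
# Free(A) for PlaceOnTable(A).
ordering_links = {("A0", "Ainf")}
causal_links = {("PlaceOnTable(C)", "Free(A)", "PlaceOnTable(A)")}

# A causal link also implies that the producer precedes the consumer.
ordering_links |= {(p, c) for p, _, c in causal_links}
print(sorted(ordering_links))
```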

 

Another important characteristic of a link is whether it is established or not. An established link (An, P, Am) is a guarantee that action An supports the precondition P of action Am. That means the link must remain true for the plan, and the algorithm must make sure that all established links are honored during the planning process. When a link is not honored, it is said to be clobbered. If any link is clobbered during planning, the algorithm must declobber it: it must change the plan so that the link becomes established again.

 

A causal link may be clobbered by a threat. If (An, P, Am) is a causal link, we might have an action Ax in the plan which can delete P; one way is for negative P (¬P) to appear in the effects of Ax. If Ax executes after An but before Am, P is invalidated before Am can use it. That means once An has executed, we cannot allow Ax into that part of the plan until Am has executed; if we do, Ax clobbers the link (An, P, Am).

 

For example, assume C is placed over A and A is placed over B. Now Free(A) is a precondition for the action PlaceOnTable(A). The effect of PlaceOnTable(C) is Free(A). Thus the causal link is (PlaceOnTable(C), Free(A), PlaceOnTable(A)).

 

Now suppose we have an action PlaceOnTop(B,A) in the plan, which deletes Free(A). (How do we know that? Because Free(A) is in the delete list of that action.) If PlaceOnTop(B,A) executes after Free(A) is ensured but before PlaceOnTable(A), then PlaceOnTable(A) cannot be executed; we would need to undo PlaceOnTop(B,A) to get Free(A) back, and only then could PlaceOnTable(A) execute. We say that PlaceOnTop(B,A) is a threat to the causal link (PlaceOnTable(C), Free(A), PlaceOnTable(A)).
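
A threat check of this kind is easy to sketch. This simplified version, using the tuple encodings from the sketches above, only tests whether an action deletes the link's predicate; a full planner would also check whether the action can actually be ordered between producer and consumer:

```python
def threatens(action_deletes, link):
    """True if the action deletes the predicate the link protects."""
    producer, predicate, consumer = link
    return predicate in action_deletes

link = ("PlaceOnTable(C)", "Free(A)", "PlaceOnTable(A)")

# Stacking B on A removes Free(A) and removes B's place on the table.
place_on_top_B_A_deletes = {"Free(A)", "On(B,_)"}

print(threatens(place_on_top_B_A_deletes, link))   # True: it is a threat
```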

 

The preconditions which are linked to another action are called causal preconditions. If there is no action associated with a precondition, we call it an open precondition. A solution plan cannot have any open preconditions or threats. For example, if PlaceOnTop(A,B) demands Free(B) and we have no action which produces Free(B), then Free(B) is an open precondition. Like a threat, an open precondition invalidates the plan. Open preconditions and threats together are known as flaws. In fact, it is possible to prove that if a partial plan has no flaws, it is a valid plan.

 

The PSP algorithm uses a late binding strategy: when it has to proceed without really knowing the value of a variable, it does not instantiate the variable unless it has to. For example, suppose it has to free A while some block is resting on A. It can assume some block ?x (variables are written ?x, ?y, and so on) such that On(?x,A), and plan PutOnTable(?x) to achieve Free(A), without associating any value with ?x right now. That makes the plan correct for any block that might be resting on A. If instead we commit to some specific block, say B, and plan PutOnTable(B), the plan fails if C is resting on A when PutOnTable actually executes.
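
Late binding rests on unification: matching a predicate containing variables against a ground one, binding variables only when forced to. A tiny sketch, assuming variables are strings beginning with "?":

```python
def unify(pattern, fact, subst=None):
    """Unify predicate tuples, e.g. ("On", "?x", "A") with ("On", "C", "A")."""
    if len(pattern) != len(fact):
        return None
    subst = dict(subst or {})
    for u, v in zip(pattern, fact):
        u = subst.get(u, u)                 # apply any existing bindings
        v = subst.get(v, v)
        if u == v:
            continue
        if u.startswith("?"):
            subst[u] = v                    # bind the variable late
        elif v.startswith("?"):
            subst[v] = u
        else:
            return None                     # two different constants clash
    return subst

print(unify(("On", "?x", "A"), ("On", "C", "A")))   # {'?x': 'C'}
```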

 

There are a few more things that PSP introduces to make sure that no flaws remain in the plan, i.e. that no action can disrupt another. Let us briefly describe three strategies used by PSP; a small code sketch of promotion and demotion follows the definitions.

 

Separation: this is a process of binding variables in a way that they do not assume values which lead to disruption. For example, On(B,C) should not happen while C is not resting on the table (for our goal). So if we make sure that in that case On(?x,?y) cannot unify with On(B,C), it serves the purpose. What is the meaning of this condition? It says that you can plan to put anything on anything else except B on C, and still have a valid plan.

 

Promotion: if an action is a threat to a causal link, promote that action to happen before the causal link comes into effect. The action then finishes before the link is active and cannot disturb it.

 

Demotion: this is exactly the opposite of promotion. The threatening action is held off until the actions involved in the causal link are over. When the action is finally applied, the effect of the causal link is already complete, so there is no harm in executing it.
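
Here is the promised sketch of promotion and demotion as ordering constraints, following this module's definitions (note that some textbooks use the two names the other way around); the encodings are the same illustrative tuples used earlier:

```python
# A causal link (producer, P, consumer) and a threatening action are
# reconciled by adding a single ordering constraint.
def promote(orderings, threat, link):
    producer, _, _ = link
    return orderings | {(threat, producer)}   # threat finishes first

def demote(orderings, threat, link):
    _, _, consumer = link
    return orderings | {(consumer, threat)}   # threat waits until after

link = ("PlaceOnTable(C)", "Free(A)", "PlaceOnTable(A)")
orderings = {("PlaceOnTable(C)", "PlaceOnTable(A)")}

print(promote(orderings, "PlaceOnTop(B,A)", link))
print(demote(orderings, "PlaceOnTop(B,A)", link))
```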

 

In fact, PSP is quite complex, involving variable binding and instantiation, unifying one variable with another, matching against partial constructs, and so on, which we do not elaborate further.

 

Solving Sussman’s anomaly

 

PSP (or nonlinear planning) can actually solve Sussman's anomaly in an optimized way. We will not go into the detail of how exactly it does that, but we will show one typical sequence in which Sussman's anomaly can be solved using plan space planning.

 

We again start with two goal components, On(B,C) and On(C,D), and begin with the placement of C over D. We realize that this will produce a threat to placing B over C, so we change the plan and decide to start with the other component: place B over C. For that we need to free B, so we place D on the table. Now we realize that placing B over C will disrupt placing C over D (PlaceOnTop(B,C) is a threat to PlaceOnTop(C,D)), so we again shift focus and promote PlaceOnTop(C,D) instead, so that it cannot be disrupted by PlaceOnTop(B,C) later. Now the threat is gone, we can safely place B over C, and the job is done. You can see that this time we get an optimal answer.
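
The walk-through ends with a three-action plan. A small sketch that records it together with the ordering the promotion introduced (the A0/Ainf bookends and the assertion are illustrative):

```python
# Final plan: free B, then build C-on-D before B-on-C, so the threat
# PlaceOnTop(B,C) poses to Free(C) never materializes.
plan = ["A0", "PlaceOnTable(D)", "PlaceOnTop(C,D)", "PlaceOnTop(B,C)", "Ainf"]
pos = {a: i for i, a in enumerate(plan)}

# PlaceOnTop(C,D) was promoted: it must finish before the threat runs.
assert pos["PlaceOnTop(C,D)"] < pos["PlaceOnTop(B,C)"]
print("valid ordering:", plan[1:-1])
```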

 

Summary

 

In this module we have seen how the goal stack planning solution depends on the order in which goal components are solved. We looked at Sussman's anomaly, which shows that for a simple three-block problem it is impossible to find an order that yields an optimal plan. At the end we saw plan space planning, which is a non-linear way to plan. It can recognize that an action is a threat to a link and schedule actions so that they do not disrupt one another. An action can be inserted at any place in plan space planning, so we actually move from one plan to another with more and more actions until all actions are linked into a valid plan. We have also seen how PSP can solve Sussman's anomaly.
