17 Progression, Regression and Goal Stack Planning
Bhushan Trivedi
Introduction
In this module we will learn about progression and regression, which help guarantee that a planner moves in the right direction, whether it plans forward or backward. Progression is moving further in the direction of the goal state during forward planning. Similarly, regression is moving back from the goal state towards the start state during backward planning. Both progression and regression can be relevant or otherwise. A relevant progression finds a next state nearer to the goal state, while a relevant regression finds a previous state nearer to the start state.
The Planning Domain Definition Language, or PDDL, describes progression from the current state to the next state in terms of the characteristics of the state and the effect of the action applied. The new state is described as the current state (whose preconditions enable the action), plus the additions provided by the action applied, minus the deletions provided by the action applied. It is possible to test whether a progression is relevant, that is, whether the newly generated state is actually nearer to the goal state. Similarly, one can test whether a regression is relevant.
Goal stack planning extends our discussion of progression and regression. Goal stack planning is based on inserting the differences between the start and goal states on a stack; those are the differences we would like to eliminate. We begin by pushing the goal components generated from those differences (the ones yet to be achieved) onto the stack. The next task is to pick up and solve each sub-goal (or component of the goal stack) one by one from the stack top. Once the topmost component is picked up, an action that achieves that component is chosen, and the previous state needed for that action to apply and produce the current state is determined. To solve each sub-goal, we add the action that achieves it on top of the stack, which might result in the insertion of further goal components; these we again solve recursively using the same method. Once the previous state is determined, its preconditions are also determined and placed on the stack. We can see that we are threading back. At any point of time, if we find a state which is achievable from the start state, our job is done and we return the current thread as a plan. The plan generated by GSP can be tested for validity by a simple process: pick up the actions, apply each one, and see if we get the goal state. The validation begins from the start state; each action in the thread is applied after checking that its preconditions match, and after the last action of the plan we check that the state generated is a goal state.
Progression
While we are forward planning, for some cases it is easier to consider the goal state to be made up of multiple independent components which can be achieved independently of each other¹. In that case, the process of getting a solution is about changing the current state component by component to make sure it eventually reaches the goal state. When we find how much of the resultant state is also part of the goal state, we can measure how near the resultant state is to the goal state; the more the resemblance, the nearer the goal. If the resemblance to the goal state increases with a move, we can state that the move is towards the goal state. If the resemblance decreases, we are moving away from the goal state.
1. Independent means achieving one component does not hinder achieving another. This might not be true for a given problem. We will consider such cases while discussing problems with goal stack planning in the next module.
It is important to note that this is different from using heuristics. We just check, after the move, how much more the new state looks like the goal state to decide whether the move is good. We can state that there are two possible effects of every action: positive effects make the next state more like the goal state, and negative effects do exactly the opposite. We are not using any domain specific knowledge to assess the possibility of the move reaching a goal state.
Looking at the description we have so far, planning to reach from a start state to a goal state can be described as follows. Find the similarities and differences between the start and goal states, and decide what is to be added to, as well as deleted from, the start state to obtain the goal state. Now choose moves which proceed in the direction of the goal state, in the sense that they must increase the resemblance to the goal state after every move. The actions which result in such moves must add components which are required in the goal state but missing in the start state, and remove components which exist in the start state but are not part of the goal state.
The production rules that we used to write in earlier modules can be rewritten in a different form using PDDL. PDDL, the Planning Domain Definition Language, was specifically designed for planning². It describes each action in the following form: Action(parameters): preconditions for the action, additions to the current state, and deletions from the current state.
For example, consider an action called PlaceOnTable. It picks up some block A (which is going to be a parameter) and places it on the table. The precondition is that A must be free. If A was resting on top of some block X, X is now free too, and On(A,X) is to be deleted as it is no longer true. Let us write that rule in PDDL form³.
PlaceOnTable(A)
Preconditions: On(A,X) ∧ Free(A)⁴
Additions: On(A,_) ∧ Free(X)
Deletions: On(A,X)
The rule states that when A is free and resting on top of X (preconditions), placing A on the table gives us On(A,_) (A is on the table) and makes X free (additions to the previous state); On(A,X) is no longer true and so is part of the delete list.
2. PDDL was developed in 1998. It allows the current state of the world to be described, as well as the actions that can be applied to that state.
3. PDDL has its own syntax, based on the well-known AI language LISP (an abbreviation of List Processing); we are not showing that syntax here.
4. X indicates a variable, which can assume any valid value. That means while block A is resting on B, X becomes B; while it is resting on C, X becomes C; and so on. That makes this rule applicable whatever block A is resting on.
Please also note that anything which does not change is not mentioned in the additions and deletions. For example, A remains free in the above example, and it is stated in neither the additions nor the deletions. Also, in a given state where A is resting on top of some block X, it is quite possible that block B is resting on top of some other block Y. As the action does not mention anything about that, it is assumed to be part of the new state. This seems logical, as placing A on the table should not have any side effects on any other part of the state. All such things which are part of the previous state are assumed to be part of the new state. For example, the state might additionally have D on top of E; that does not change at all after applying the earlier move. We do not need to mention such things which do not change during the state change, and thus the rules look crisper and less cluttered. At the same time, one must be aware of the side effects of applying rules; remember our discussion of the frame problem in module 7. The rule writer must be aware of the side effects of any action and incorporate every possible change in the description of that action.

Thus, when the action PlaceOnTable is applied to a state S, the new state nS = (S − PlaceOnTable.deletions) ∪ (PlaceOnTable.additions); i.e. remove everything which is part of the deletions from the state S and add all the additions. The notation action.deletions and action.additions means the deletions and additions when the action is applied. Thus when we write PlaceOnTable.deletions, it indicates the deletions from the state when PlaceOnTable is applied to it.
We can now define progression as the generation of a next state nS from a current state cS when an action a is applied to it. This is nothing but applying a rule and generating a new state. Once we have a clear idea of how progression is achieved for a given problem, i.e. how a new state can be generated from an old state by applying a typical action, we can apply our search algorithms to that problem. Similarly, regression is the generation of a previous state pS from a current state cS such that when action a is applied to pS, it results in cS.
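To make this concrete, here is a minimal Python sketch of progression, assuming a state is a set of ground literals written as plain strings; the Action class and the literal spellings are illustrative, not STRIPS or PDDL syntax.

from dataclasses import dataclass

@dataclass(frozen=True)
class Action:
    name: str
    preconditions: frozenset  # literals that must hold before the action applies
    additions: frozenset      # literals the action adds to the state
    deletions: frozenset      # literals the action removes from the state

def progress(action: Action, current: frozenset) -> frozenset:
    """Progression: nS = (cS - a.deletions) | a.additions, provided the
    preconditions of a hold in cS."""
    if not action.preconditions <= current:
        raise ValueError(f"preconditions of {action.name} do not hold")
    return (current - action.deletions) | action.additions

# PlaceOnTable(A), with the variable X bound to B
place_A_on_table = Action(
    name="PlaceOnTable(A)",
    preconditions=frozenset({"On(A,B)", "Free(A)"}),
    additions=frozenset({"On(A,_)", "Free(B)"}),
    deletions=frozenset({"On(A,B)"}),
)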
Relevant and non-relevant actions
Whether we are heading in the right direction, i.e. whether the progression is relevant to the goal state, can be found using a simple method. Some part of whatever is added in the new state must be in the goal state, and whatever is removed from the current state must not be in the goal state. These two conditions can be written as follows, considering the goal state gS and action a. When both hold, the action a is called relevant.
- a.additions ∩ gS ≠ Φ // something added is part of the goal state
- a.deletions ∩ gS = Φ // nothing which is removed is part of the goal state
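These two tests translate directly into code. A minimal sketch, written to work with the Action objects from the sketch above (any object carrying additions and deletions sets will do):

def is_relevant(action, goal):
    """An action a is relevant toward goal gS iff
    a.additions ∩ gS ≠ Φ (it adds something the goal needs) and
    a.deletions ∩ gS = Φ (it removes nothing the goal needs)."""
    return bool(action.additions & goal) and not (action.deletions & goal)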
For goal directed reasoning, it is possible to go back to a previous state pS from a current state cS under action a if pS = (cS − a.additions) ∪ a.preconditions. That means pS can be a parent state of cS under action a only if pS has all the preconditions for applying a, and when the additions of a are removed from cS we get pS. Note that we are not discussing deletions: pS might contain a few things which are prerequisites but are removed when the action is applied; that is implicitly covered by our description. So, to get pS from cS, we remove the additions of a and add the preconditions of a. This is called the process of regression.
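As a sketch, the regression formula is one line in the same set-based representation:

def regress(action, current):
    """Regression: pS = (cS - a.additions) | a.preconditions.
    Literals deleted by a may also belong to pS, but the formula
    need not mention them explicitly."""
    return (current - action.additions) | action.preconditions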
To understand the processes of progression and regression, let us take an example.
The action PlaceOnTable(A) is applied to the state described below.

Current state cS = Free(A) ∧ On(A,B) ∧ On(B,C) ∧ On(C,D) ∧ On(D,_)

The application of the action results in

New state nS = Free(B) ∧ On(A,_) ∧ On(B,C) ∧ On(C,D) ∧ On(D,_)

Additions(PlaceOnTable): Free(B) ∧ On(A,_)
Deletions(PlaceOnTable): On(A,B)
Assume the goal state gS = On(A,_) ∧ On(B,A) ∧ On(C,D) ∧ On(D,_). Is the action PlaceOnTable(A) relevant? Let us check both conditions, representing PlaceOnTable(A) as a.

a.additions ∩ gS = (Free(B) ∧ On(A,_)) ∩ gS = On(A,_) ≠ Φ // condition 1 holds
a.deletions ∩ gS = On(A,B) ∩ gS = Φ // condition 2 holds

Both conditions are satisfied, so PlaceOnTable(A) is a relevant action.
What about progression? Have a close look at figure 17.2. The new state is visibly closer to the goal state depicted in figure 17.1, which confirms our conclusion that PlaceOnTable(A) is a relevant action here.
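Running the relevance test from the sketches above on this example (the literal strings are just the illustrative encoding used earlier) confirms the same answer:

gS = frozenset({"On(A,_)", "On(B,A)", "On(C,D)", "On(D,_)"})

additions = {"Free(B)", "On(A,_)"}   # additions of PlaceOnTable(A)
deletions = {"On(A,B)"}              # deletions of PlaceOnTable(A)

print(additions & gS)  # {'On(A,_)'} -- non-empty, condition 1 holds
print(deletions & gS)  # set()       -- empty, condition 2 holds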
Regression for goal directed reasoning
Can we do the same thing for goal directed reasoning? We will use the word regression to describe progress in the backward direction. Let us take the example of a goal state and a previous state in figure 17.3.
Figure 17.3 Regression: moving from a goal state to a previous state
We have taken a previous state from which we can reach the goal state. In other words, it is possible to move back from this goal state to that previous state. Is this state relevant? Let us see.
We will now introduce one more action, PlaceOnTop(X,Y), as follows.

PlaceOnTop(X,Y)
Preconditions: Free(X) ∧ Free(Y)
Additions: On(X,Y) // now X is resting on top of Y
Deletions: Free(Y)
Now we are trying to see whether this previous state is relevant. Let us apply the formula, with a = PlaceOnTop(C,D):

pS = (cS − a.additions) ∪ a.preconditions
= (On(A,_) ∧ On(B,A) ∧ On(C,D) ∧ On(D,_) − On(C,D)) ∪ (Free(C) ∧ Free(D))
= On(A,_) ∧ On(B,A) ∧ On(D,_) ∧ Free(C) ∧ Free(D)
What we get is the previous state shown in the figure. But this previous state is not relevant; you can see that yourself. There is no point placing C on the table, as that takes us away from the initial state. It is better to break the other stack (B over A). Let us check that using the same formula, with a = PlaceOnTop(B,A):
pS = (cS − a.additions) ∪ a.preconditions
= (On(A,_) ∧ On(B,A) ∧ On(C,D) ∧ On(D,_) − On(B,A)) ∪ (Free(A) ∧ Free(B))
= On(A,_) ∧ On(C,D) ∧ On(D,_) ∧ Free(A) ∧ Free(B)
We can check the relevance of a backward move using the same kind of rule, just replacing the goal state in our formula with the start state and swapping the roles of additions and deletions. Let us write the formula first.
For action a and start state sS
1. a.additions ∩ sS = Φ // nothing added by a is part of the start state; the additions were not part of the previous state, so they are removed when we move from the current state to the previous state
2. a.deletions ∩ sS ≠ Φ // something deleted by a is part of the start state; whatever is deleted was part of the previous state but not the current state
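In the same sketch style as before, the backward relevance test simply swaps the two conditions:

def is_relevant_regression(action, start):
    """A backward move by action a is relevant toward start state sS iff
    a.additions ∩ sS = Φ and a.deletions ∩ sS ≠ Φ."""
    return not (action.additions & start) and bool(action.deletions & start)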
For the second move, a = PlaceOnTop(B,A):

a.additions ∩ sS = On(B,A) ∩ (Free(A) ∧ On(A,B) ∧ On(B,C) ∧ On(C,D) ∧ On(D,_)) = Φ

a.deletions ∩ sS = Free(A) ∩ (Free(A) ∧ On(A,B) ∧ On(B,C) ∧ On(C,D) ∧ On(D,_)) = Free(A) ≠ Φ
That means this backward move is relevant.
Now let us take the first case, a = PlaceOnTop(C,D):

a.additions ∩ sS = On(C,D) ∩ (Free(A) ∧ On(A,B) ∧ On(B,C) ∧ On(C,D) ∧ On(D,_)) = On(C,D) ≠ Φ

Now the deletions:

a.deletions ∩ sS = Free(D) ∩ (Free(A) ∧ On(A,B) ∧ On(B,C) ∧ On(C,D) ∧ On(D,_)) = Φ

Both conditions fail, so the first move was not relevant.
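Checking both candidate moves with the sketch above gives the same verdicts (the set intersections are shown explicitly):

sS = {"Free(A)", "On(A,B)", "On(B,C)", "On(C,D)", "On(D,_)"}

# a = PlaceOnTop(B,A): additions = {On(B,A)}, deletions = {Free(A)}
print({"On(B,A)"} & sS)  # set()       -- condition 1 holds
print({"Free(A)"} & sS)  # {'Free(A)'} -- condition 2 holds: relevant

# a = PlaceOnTop(C,D): additions = {On(C,D)}, deletions = {Free(D)}
print({"On(C,D)"} & sS)  # {'On(C,D)'} -- condition 1 fails
print({"Free(D)"} & sS)  # set()       -- condition 2 fails: not relevant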
Now let us move on and see the significance of what we have discussed so far.
We have seen that progression happens while we move from the start state to the goal state, and regression while we move back from the goal state to the start state, and that it is possible to test the relevance of a move. Relevant moves help us move in the direction of the goal state. We have seen earlier that most search algorithms have no sense of direction; here we have looked at a simple method to improve that sense without using any domain knowledge. The bottom line is that, out of all possible moves, we would like to choose relevant moves for planning.
Goal Stack Planning (GSP)
Goal stack planning is one of the simplest methods of planning. As the name suggests, it finds the difference between the start state and the goal state first. The idea is to bridge that difference, and thereby reach the goal state, by providing a sequence of actions which does so. It adds all components which differ between the start and goal states to the goal stack one after another. That means the complete difference between these two states is recorded: whatever is left to achieve in the goal from the start state is inserted in the goal stack.
One may wonder in which order the components or sub-goals of the goal state are inserted on the stack. In the case of multiple components, GSP generates a stack for each possible order. That means if we have components C1 & C2 & C3 which are not in the start state and need to be in the goal state, we can achieve them in the following different orders (see the sketch after this list):
- C1, C2 and C3
- C1, C3 and C2
- C2, C1 and C3
- C2, C3 and C1
- C3, C1 and C2
- C3, C2 and C1
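These orders are exactly the permutations of the unachieved components, so enumerating them is a one-liner; a minimal sketch:

from itertools import permutations

components = ["C1", "C2", "C3"]       # unachieved goal components
for order in permutations(components):
    print(order)                      # GSP would build one goal stack per order
# ('C1', 'C2', 'C3'), ('C1', 'C3', 'C2'), ... six orders in total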
GSP will pick up each order one after another and test whether it generates a valid plan. After finding all the differences and inserting the components in the goal stack, GSP tries to reduce that difference by choosing appropriate actions. It picks up each goal component from the stack and decides on an operator or action to reduce that difference. When the complete difference cannot be bridged by one action, it looks at the new goals one must achieve to satisfy the prerequisites of the action; the operator and the new goals are then pushed onto the stack after removing the original goal. Even then, if the action cannot be executed from the start state, one more action is chosen and added to the list. For that, it picks up the topmost items from the goal stack and tries to reduce the difference by providing appropriate actions and operators. It continues until it finds a connection from the start state to the end state, describing a complete plan to achieve the end state from the start state. While generating sub-goals or components from the goal components, it may have to try all possible orders.
The algorithm places the current goals on the stack and finds out the preconditions, or goals, for achieving them. It uses a single stack for storing two types of members:
- Goal components or sub-goals
- Operators or actions that are used to reach the goal components
The goal stack planning method was used in a program called STRIPS, which works on the blocks world domain we have already discussed. In our examples we have used a much simpler representation than STRIPS. In STRIPS, actions are defined in a little more elaborate manner than we have done in this module. For example, STRIPS has ARMEMPTY, which indicates that the robot arm is free, UNSTACK(A,B) for removing A from the top of B, and so on.
Goal stack planning works like this. It picks up the goal state description. Whichever components are already part of the start state are already achieved, so we concentrate only on the remaining part. Now we need to find the link from the start node to each goal component. What we do is find out which state component can result in the goal component by applying which action. If the action's prerequisites are matched in the current state, we can see that by applying that action we can reach that goal component. If all goal components are reached, we are done!
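Below is a deliberately simplified sketch of this loop for the two actions used in this module. It is not STRIPS: there is no variable binding, no protection of already-achieved goals, and the toy action definitions ignore side effects (for instance, moving B onto A does not delete On(B,C)), exactly the simplification the module itself uses.

# Each action maps to (preconditions, additions, deletions), with
# literals as plain strings; ground instances only, for simplicity.
ACTIONS = {
    "PlaceOnTable(A)": ({"On(A,B)", "Free(A)"},
                        {"On(A,_)", "Free(B)"},
                        {"On(A,B)"}),
    "PlaceOnTop(B,A)": ({"Free(B)", "Free(A)"},
                        {"On(B,A)"},
                        {"Free(A)"}),
}

def gsp(start, goal):
    """Goal stack planning: the complete goal construct sits at the
    bottom of the stack, unachieved components above it; goals are
    popped and replaced by an achieving action plus its unmet
    preconditions, until only an executable plan remains."""
    state, plan = set(start), []
    stack = [set(goal)]                        # complete goal at the bottom
    for g in reversed([g for g in goal if g not in state]):
        stack.append(g)                        # goal[0] ends up on top
    while stack:
        top = stack.pop()
        if isinstance(top, set):               # compound goal: recheck all parts
            unmet = top - state
            if unmet:
                stack.append(top)
                stack.extend(unmet)
        elif top in ACTIONS:                   # an action: execute it
            pre, add, delete = ACTIONS[top]
            state = (state - delete) | add     # progression
            plan.append(top)
        elif top not in state:                 # unmet goal: pick an achiever
            name = next(a for a, (p, add, d) in ACTIONS.items() if top in add)
            pre, _, _ = ACTIONS[name]
            stack.append(name)                 # the action, below ...
            stack.extend(p for p in pre if p not in state)  # ... its preconditions
        # goals already true in the current state are simply popped
    return plan

start = {"Free(A)", "On(A,B)", "On(B,C)", "On(C,D)", "On(D,_)"}
goal = ["On(A,_)", "On(B,A)", "On(C,D)", "On(D,_)"]
print(gsp(start, goal))  # ['PlaceOnTable(A)', 'PlaceOnTop(B,A)']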
GSP example
Here is a simple example to illustrate how GSP works. Let us pick up the start and goal states mentioned in figure 17.1. The goal state is described as
On(A,_) ∧ On(B,A) ∧ On(C,D) ∧ On(D,_)
The goal state has four components, of which two, On(C,D) and On(D,_), are true even in the start state (depicted with a gray background to differentiate them), so we only need to achieve the other two, On(A,_) and On(B,A); they are pushed onto the stack. We may push them in any order, but pushing On(B,A) first is better, as On(A,_) is a prerequisite for On(B,A): if A is resting elsewhere and we place B on top of it, we will need to undo On(B,A).
We can write our plan as the stack depicted in figure 17.5 (the goal stack).
On(A,_)   ← top of stack
On(B,A)
On(A,_) ∧ On(B,A) ∧ On(C,D) ∧ On(D,_)   ← bottom of stack

Figure 17.5 Goal stack for the problem
We will first have A on the table, then B on top of A, and we will get the solution in the third step. The goal stack always contains every component in this fashion: each component ordered one way or the other, with the complete construct at the bottom. How is it done?
First the complete goal construct is inserted on the stack. Then the components are pushed onto the stack in some order. Sometimes the order is based on some logic, as we have discussed here, and sometimes it is random.
To solve the problem, we pick up the first goal, On(A,_), which is not true in the start state. To achieve On(A,_), we need Free(A), which is true in the current state, so we add the action PlaceOnTable(A), which is applicable in the current state; the plan can therefore be executed from the current state. The next item on the stack top is On(B,A). The prerequisite for On(B,A) is that both A and B are free, which is also true once PlaceOnTable(A) is executed. Now if we check the complete construct at the bottom, it too is found to be true, and we are done. The final plan is depicted in figure 17.6. The plan stored in the stack is already a valid plan and we do not need to do anything further.
PlaceOnTable(A)   ← top of stack
On(A,_)
PlaceOnTop(B,A)
On(B,A)
On(A,_) ∧ On(B,A) ∧ On(C,D) ∧ On(D,_)   ← bottom of stack

Figure 17.6 Final plan
Now let us take the other route and pick up On(B,A) first (which we know is not as good, but let us see what happens). The goal stack is depicted in figure 17.7.
On(B,A)   ← top of stack
On(A,_)
On(A,_) ∧ On(B,A) ∧ On(C,D) ∧ On(D,_)   ← bottom of stack

Figure 17.7 Goal stack when we pick up On(B,A) as the first component
On(B,A)'s preconditions are Free(B) and Free(A). We have Free(A) but not Free(B), so the goal stack becomes the one depicted in figure 17.8. To achieve Free(B), we need to unstack A from B, and thus we have to execute the action PlaceOnTable(A). That is exactly what we had in the first case: placing A on the table. Thus the second ordering eventually falls back to the previous solution.
Free(B)   ← top of stack
On(B,A)
On(A,_)
On(A,_) ∧ On(B,A) ∧ On(C,D) ∧ On(D,_)   ← bottom of stack

Figure 17.8 Goal stack after pushing the precondition Free(B)
PlaceOnTable(A)   ← top of stack
Free(B)
On(B,A)
On(A,_)
On(A,_) ∧ On(B,A) ∧ On(C,D) ∧ On(D,_)   ← bottom of stack

Figure 17.9 Goal stack after adding the action PlaceOnTable(A)
PlaceOnTable(A)   ← top of stack
Free(B)
PlaceOnTop(B,A)
On(B,A)
On(A,_)
On(A,_) ∧ On(B,A) ∧ On(C,D) ∧ On(D,_)   ← bottom of stack

Figure 17.10 Complete goal stack for this ordering
The plan depicted in figures 17.9 and 17.10 is essentially the same, just arrived at in a longer way. The example we have seen is very trivial, so the goal stacks are small, but you can see that the plan being developed depends a lot on the order of the sub-goals or components we choose to explore. It might seem that if we choose the right order the plan becomes short and optimal, and otherwise not. Unfortunately, in some cases, whatever order we choose, the process becomes long and non-optimal. We will see a small example illustrating that point in the next module.
Testing the validity of a plan
Suppose we have some problem for which we need a plan as a solution. What we need is a sequence of actions which converts the start state into the final state, gradually converging to it state by state as the actions are applied one by one.
For example, suppose we have a start state sS and a goal state gS, and a sequence of actions a1, a2, …, an such that when a1 is applied to sS it progresses to state S1, when a2 is applied to S1 it progresses to state S2, and so on, until an is applied to Sn−1 and progresses to the goal state gS.
Such a sequence of actions is called a valid plan. Once we have a sequence of actions as a plan, we can proceed as follows to verify whether the plan is valid, i.e. whether it yields the goal state. Here is a formal algorithm.
1. Pick up the plan as a list, the start state sS and the goal state gS
2. Current state cS = start state sS
3. Pick up the next action from the plan list
4. If no action is left:
   - if cS == gS, exit with success
   - otherwise exit with failure
5. Next state nS = progress(action, cS) // assuming a function progress exists
6. cS = nS; go to 3
An interesting component of the algorithm above is the function progress. This function checks the action's preconditions and applies it to the current state to generate the next state.
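Here is a minimal sketch of this validation algorithm, assuming the same set-of-literals representation and (preconditions, additions, deletions) action triples used in the earlier sketches:

def validate_plan(plan, start, goal, actions):
    """Return True iff applying the plan's actions in sequence from the
    start state satisfies every precondition along the way and ends in
    a state that contains the goal."""
    state = set(start)
    for name in plan:
        pre, add, delete = actions[name]
        if not pre <= state:             # precondition mismatch: invalid plan
            return False
        state = (state - delete) | add   # the progress step
    return set(goal) <= state            # success iff the goal holds at the end

With the definitions from the GSP sketch earlier, validate_plan(["PlaceOnTable(A)", "PlaceOnTop(B,A)"], start, goal, ACTIONS) returns True.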
In fact, we may also want another function, relevant(), which tests whether a newly generated state is relevant. Testing a plan is simple, but generating one requires more effort. We will continue discussing GSP in the next module and see what other problems it has. Some of them are addressable while some are not. We will also see how plan space planning solves the problems that are impossible for GSP to solve.
Summary
In this module we have seen how progression and regression can be checked for relevance. A relevant progression or regression tells us that the movement is in the right direction: if an addition produced by an action is present in the goal state and no deletion is, the progression is relevant, and similarly for regression. Goal stack planning moves back from the goal state to the start state; it picks up each goal component and solves them one after another. The order in which the sub-goals or components are solved has some bearing on the optimality of the plan produced.