31 Scripts

Bhushan Trivedi

Introduction

In the previous module we saw that humans commonly refer to complete scenarios when they speak. We took the case of watching a match; there are many other such cases. For example, when I say “We dined at Osho’s Eatery when we went to Mandvi”, you probably infer many other things from that statement. You might conclude that Mandvi is not my hometown, that we went to Osho’s Eatery, looked at the menu, ordered items, were served, ate, paid the bill and then returned. Humans commonly reason using such expectancy-driven methods. When we go to dine, we expect the place to be a restaurant of some type; we expect events such as reading the menu, ordering, eating and paying the bill to happen; and we expect people such as waiters, cashiers, restaurant owners and those who take orders to be involved. We also expect some objects to be there, such as a menu, food items, dishes, spoons, bowls, tablecloths, decorative items and so on. A script is about having a similar representation in the system. Whenever we get a hint that a typical script applies, for example when we say “We went to JungleBhookh yesterday, we enjoyed the food” without any mention of the word dining or restaurant, a restaurant script is invoked and any query based on it can be answered.

Scripts

Schank and his colleagues, who developed CD, also invented scripts. They were working on programs that understand stories (each consisting of multiple simple statements) and answer questions based on the content of the story.

Scripts are designed to describe stereotyped events such as going to a restaurant or going to watch a movie. Schank felt the need for scripts because of a limitation of CD. CD describes individual events (such as watching a match), but such events hardly ever occur in isolation. When an event is mentioned in a statement, some common sequence of events is assumed along with it. For example, when I say “I went to Indroda Park (a typical natural zoo) with the kids”, you may assume the complete sequence: buying tickets, parking the car in a slot, taking the picnic items out of the car, moving around, watching and appreciating the animals, taking a rest at lunch time (or at snack time if it is evening) and enjoying the food we brought with us, returning to the car and then leaving the zoo. A script is a mechanism for representing knowledge about such sequences of events and for drawing inferences from them.

A typical script structure is shown in Table 31.1, which is a very simplified representation of watching a match. When a statement refers to watching a match, the complete script is invoked and activated. That means all these scenes and events are inserted into the representation of that particular statement. Names and other items are filled in as and when needed. For example, when we say that Jay went to watch a match, Jay replaces P. If you recall our description of frames, you can see some similarity. There are places which resemble slots, and there is default information about events. For example, it is assumed that the person enters the stadium after obtaining tickets from the ticket counter outside. That is default information: when no other information is available, it is assumed to be true.
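The slot-filling step, in which Jay replaces P, can be sketched in Python. This is only an illustration and not how Schank's programs were written: the event strings are taken from Scene 2 of Table 31.1, while the names `BUYING_TICKET` and `bind_roles` are hypothetical and chosen for this example.

```python
import re

# Events of one scene in semi-CD form; P and BC are role variables.
BUYING_TICKET = [
    "P PTRANS P to ticket counter",
    "P MTRANS ticket requirement to BC",
    "BC ATRANS ticket to P",
]

def bind_roles(events, bindings):
    """Replace role variables (whole words only) with the names in bindings."""
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, bindings)) + r")\b")
    return [pattern.sub(lambda m: bindings[m.group(1)], e) for e in events]

# "Jay went to watch a match": Jay fills the role P, BC stays unfilled.
bound = bind_roles(BUYING_TICKET, {"P": "Jay"})
for event in bound:
    print(event)
```

Roles for which the story gives no name, such as BC here, simply remain as variables until a later statement supplies a value.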

We do not always pick up the default values for scenes. The tickets may have been purchased online, in which case that step is eliminated (and must be removed from the copy of the script inserted where the statement is represented). It is also possible that, due to tight security, some belongings must be kept in a locker room; we have not included this step here, but it is possible (and it must then be added to whatever default the script provides). The point is that what we have described is a default assumption which might change. In fact, it is possible to assume multiple paths for a given event and write all of them in a script. For example, a spectator might not stay until the match ends; if he finds that his team is surely going to lose, he might leave early. Similarly, in a restaurant script the last scene is about paying the bill. If the customer dislikes the food or the service, he follows the non-payment path. For example, suppose we encounter the statement “Jay asked for an Italian pizza with double cheese. He waited for 15 minutes, got angry and went away.” You can easily understand that he has followed the alternate path, so neither the eating part nor the payment part applies. Thus default values hold only until we have evidence to the contrary; when such information is provided, values other than the defaults are taken.
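The idea of a default path with alternate paths can be sketched as follows. This is a minimal sketch under stated assumptions: the event names, the `RESTAURANT_PATHS` table and the `select_path` helper are all hypothetical, and a real script would hold CD structures rather than plain strings.

```python
# A hypothetical restaurant script with a default path and one alternate path.
RESTAURANT_PATHS = {
    "default": ["enter", "order", "food served", "eat", "pay", "leave"],
    "walk_out": ["enter", "order", "wait too long", "get angry", "leave"],
}

def select_path(observed, paths):
    """Return the first path that contains all observed events in order.

    The default path is tried first; an alternate path is chosen only
    when the story gives evidence that contradicts the default.
    """
    def fits(path):
        position = 0
        for event in observed:
            try:
                position = path.index(event, position) + 1
            except ValueError:
                return False
        return True

    for name in ["default"] + [n for n in paths if n != "default"]:
        if fits(paths[name]):
            return name
    return None

# Jay ordered, waited for 15 minutes, got angry and went away.
story = ["order", "wait too long", "get angry", "leave"]
chosen = select_path(story, RESTAURANT_PATHS)
print(chosen)
print("pay" in RESTAURANT_PATHS[chosen])
```

Because the chosen path has no eating or payment event, the system has no reason to assume that either of them happened.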

However, scripts are quite different from frames. Different objects of the same class, represented by a frame, differ only in the values of their attributes; the attributes themselves are all the same. Take the case of two different student objects. Any student object will have the same set of attributes, such as name and address; the objects differ only in the values of those attributes. In scripts, by contrast, entire paths can change, so a part that exists in one case may not exist in another. Scripts also allow some events to take place in parallel, which is not the case with frames.

Scripts have a typical structure, as our example in Table 31.1 shows.

Table 31.1 A script for watching a cricket match

Script : watching a match

Various Scenes

Track: Cricket match

Props:

• Tickets

• Seat

• Match

Roles:

• Person (who wants to Watch a match) – P

• Booking Clerk – BC

• Security personnel – SP

• Ticket Checker – TC

Entry Conditions:

• P wants to watch match

• P has money

Results:

• P saw a match

• P has less money

• P is happy (if his team has won) or unhappy (if his team has lost or there was some other problem at the stadium)

Scene 1: Going to a stadium

• P PTRANS P to the stadium

• P ATTEND eyes to ticket counter

Scene 2: Buying ticket

• P PTRANS P to ticket counter

• P MTRANS ticket requirement to BC

• P MTRANS stand information to BC

• BC ATRANS ticket to P

Scene 3: Going inside stadium and sitting on a seat

• P PTRANS P into Stadium

• SP MOVE security check P

• TC ATTEND eyes on ticket POSS-by P

• TC MOVE to tear part of ticket

• TC MTRANS (showed seat) to P

• P PTRANS P to seat

• P MOVE P to sitting position

Scene 4: Watching a match

• P ATTEND eyes on match

• P MBUILD (moments) from the match

Scene 5: Exiting

• P PTRANS P out of Stadium

The script in Table 31.1 has five scenes, two entry conditions, three results, three props, four roles and a typical track. Every script contains these components (described in Table 31.2).

Entry conditions must hold before the script can be entered. Results are the outcomes of the script. Props and roles are the items and people involved in the process, while scenes describe the sequence of events.

Table 31.2 Components of a script

*Entry Conditions:*	Conditions that must be satisfied for the script to execute. Whenever a script is referred to, one can assume the preconditions to be true. For example, when we read the statement “Jay went to watch a cricket match”, we also conclude that he has money to buy tickets and that he is a cricket fan. Even when not explicitly mentioned, entry conditions can safely be assumed to be true.
*Results:*	The conditions that will be true after exit. This is a general (and thus default) assumption, which might be false under exceptional circumstances. For example, when it rains and the match is abandoned, watching does not happen. “P is happy” may also be false.
*Props:*	Objects involved in the script. Tickets, seats and so on are the objects that the person deals with while watching a match. There are variations here too; for example, with a pavilion ticket he might also receive a food coupon.
*Roles:*	Persons involved in the script. Again, this is a general assumption. We have not mentioned fellow spectators, umpires or players in the simplified version of the script that we have drawn. A professionally written script might involve all of them. A more detailed script might also include events such as the coin toss between the rival captains, information about innings, scorecards and so on.
*Track:*	A specialization of the script derived from a general pattern. Multiple tracks may inherit from one general pattern. Such tracks look quite similar, but each has its own scenes or other items. For example, watching a football match might involve referees, linesmen and so on, while a detailed cricket match might have a wicketkeeper, umpires, a third umpire and so on.
*Scenes:*	The sequence of events following a general default path. Events are represented in CD form; here they are written in a semi-CD form, giving only the ACT in CD notation and the rest in English for pedagogical purposes.
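The components listed in Table 31.2 map naturally onto a small data structure. The following Python sketch is one possible encoding and not a standard one: the class names `Scene` and `Script` and the method `default_path` are assumptions made for this example, and the scenes are abbreviated versions of those in Table 31.1.

```python
from dataclasses import dataclass, field
from typing import Dict, List

@dataclass
class Scene:
    name: str
    events: List[str]  # events in the semi-CD form used in Table 31.1

@dataclass
class Script:
    name: str
    track: str
    props: List[str]
    roles: Dict[str, str]  # role variable -> who it stands for
    entry_conditions: List[str]
    results: List[str]
    scenes: List[Scene] = field(default_factory=list)

    def default_path(self) -> List[str]:
        """Flatten all scenes into the default sequence of events."""
        return [event for scene in self.scenes for event in scene.events]

watching_match = Script(
    name="watching a match",
    track="cricket match",
    props=["tickets", "seat", "match"],
    roles={"P": "person", "BC": "booking clerk",
           "SP": "security personnel", "TC": "ticket checker"},
    entry_conditions=["P wants to watch match", "P has money"],
    results=["P saw a match", "P has less money", "P is happy or not"],
    scenes=[
        Scene("going to a stadium", ["P PTRANS P to the stadium",
                                     "P ATTEND eyes to ticket counter"]),
        Scene("buying ticket", ["P PTRANS P to ticket counter",
                                "BC ATRANS ticket to P"]),
        Scene("going inside and sitting", ["P PTRANS P into stadium",
                                           "P PTRANS P to seat"]),
        Scene("watching a match", ["P ATTEND eyes on match"]),
        Scene("exiting", ["P PTRANS P out of stadium"]),
    ],
)

print(len(watching_match.scenes), "scenes")
print(watching_match.default_path()[-1])
```

Keeping scenes as an ordered list preserves the default sequence, while entry conditions and results stay separate so that they can later be compared with those of other scripts.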

In real life, agents act on things in a sequential fashion to carry out a typical task. Scripts are useful because they represent real-life events in that form. In a way, scripts represent causal relationships between events. For example, if we encounter the statement “Jay went to watch a match” followed by “he was very happy”, we can conclude that his favourite team won. We can thus find the cause of something. The events described in a scene are also connected to each other by cause-effect relationships. In fact, the entry conditions and results generally connect scripts to other scripts. When we encounter the statement “Jay watched the match and came back home by bus”, we must conclude that after coming out of the stadium (the result of the first script), Jay went to a nearby bus station to catch a bus home. Traveling by bus has the entry condition that you are at the bus station. You might encounter a set of statements as follows.

Jay went to watch an ODI. He was late. When he occupied his seat, he asked a fellow spectator, “Who won the toss?” The spectator answered, “India won the toss and invited South Africa to bat.” Now suppose the question is asked: why did Jay ask who won the toss? Two possible answers are “he wanted to know which team was invited to bat” and “as he was late, he could not see the toss”.

Both answers are obtained by traversing the cause-effect chain, in one direction or the other. If we travel forward to find the effect of the question, we learn that Jay wanted to know which team would bat first. If we travel back to find the cause of the question, we learn that he was late and therefore missed the toss, which is the first scene of the script (our simplified version does not have it).

Thus events in a script are connected by cause-effect relationships, and that helps in determining the reason for, or the consequence of, something.
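Travelling along the cause-effect chain in either direction can be sketched as below. The chain for the toss story is written by hand here, which is an assumption of this example; in a script-based system it would come from the activated script. The names `CAUSAL_CHAIN` and `explain` are hypothetical.

```python
# A hand-written causal chain for the toss story: each event causes the next.
CAUSAL_CHAIN = [
    "Jay arrived late",
    "Jay missed the toss",
    "Jay asked who won the toss",
    "Jay learned which team bats first",
]

def explain(event, chain):
    """Return (cause, effect) of an event by stepping back and forward."""
    i = chain.index(event)
    cause = chain[i - 1] if i > 0 else None
    effect = chain[i + 1] if i + 1 < len(chain) else None
    return cause, effect

cause, effect = explain("Jay asked who won the toss", CAUSAL_CHAIN)
print("Because:", cause)
print("So that:", effect)
```

The backward step gives the answer based on the cause (he missed the toss), and the forward step gives the answer based on the intended effect (he wanted to know who bats first).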

A script is considered appropriate if it matches the description. That process is known as matching. Merely checking whether ‘watch the match’ or ‘went to watch the match’ is part of the description, and invoking the script if it is, runs into trouble. There may be thousands of scripts in the memory of a problem solver, and when a statement is encountered it is not straightforward to determine which script is (or is not) to be invoked. This problem is quite similar to picking an index to fetch a record; the harder part is matching the index value. For example, suppose we encounter the statement “Jay went to watch the match; he received the news of the sad demise of his grandfather and returned.” Should the script ‘watch a cricket match’ be invoked? Has he really watched the match? No; thus the script should not be invoked.

Another example: “Jay went to a friend’s house near the cricket stadium. He enjoyed playing cricket with his friend there.” Both playing cricket and a cricket stadium are mentioned in the text. Should we invoke the watching-a-match script? No.

Sometimes even a human reader cannot determine the meaning. For example: “After the match was over, Jay went to his friend’s house.”

This statement does not offer conclusive evidence that Jay watched the match. Such statements are sometimes said to refer to fleeting scripts: scripts which are mentioned in the statement but are not central to the discussion. What the designer might do is keep a reference and invoke the script later if some other statement also refers to it. If you observe closely, humans do the same thing.
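Keeping a reference and invoking the script only when it is referred to again might look like the sketch below. The class `ScriptTracker` and its threshold of two references are assumptions made for illustration; the text does not prescribe a particular mechanism.

```python
class ScriptTracker:
    """Hypothetical helper: defer invoking a script until it is referenced twice."""

    def __init__(self, threshold=2):
        self.threshold = threshold
        self.references = {}  # script name -> number of references seen so far
        self.invoked = set()

    def note_reference(self, script_name):
        """Record one reference; return True once the script has been invoked."""
        self.references[script_name] = self.references.get(script_name, 0) + 1
        if self.references[script_name] >= self.threshold:
            self.invoked.add(script_name)
        return script_name in self.invoked

tracker = ScriptTracker()
# "After the match was over, Jay went to his friend's house."
first = tracker.note_reference("watching a match")   # fleeting so far
# A later statement, for example one mentioning Jay's seat in the stadium.
second = tracker.note_reference("watching a match")  # now worth invoking
print(first, second)
```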

On the other hand, when a script is really appropriate, it adds immense value to the understanding of a statement. All events not explicitly mentioned can be reasoned about, all people not explicitly mentioned can be brought in, and all the props involved in the process can be considered. The script can also help the system understand the sequence of events (for example, during a cricket match, lunch occurs before tea, and the toss occurs before the match commences). An important job after matching a script and finding it appropriate is to look at the slots and fill in the names of people, the stadium, the playing teams and so on. That is known as activating the script.

Usually a script is considered appropriate based on the information provided about entry conditions, locations, people involved, objects involved and other related things. These are known as script headers. It is usually preferable to invoke and activate a script only if more than one header matches. However good the matching process is, it is quite likely that a spurious script will sometimes be invoked and executed. One therefore also needs a method to check, at later stages, whether the activated script is really appropriate.
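The rule of invoking a script only when more than one header matches can be sketched with simple word overlap. This is a deliberately crude sketch: the header sets in `SCRIPT_HEADERS` and the function `candidate_scripts` are hypothetical, and plain keyword matching is exactly the kind of approach that can still invoke spurious scripts.

```python
import re

# Hypothetical header words for two scripts.
SCRIPT_HEADERS = {
    "watching a match": {"match", "stadium", "ticket", "seat", "spectator"},
    "restaurant": {"restaurant", "menu", "waiter", "food", "bill"},
}

def candidate_scripts(statement, headers, min_matches=2):
    """Return the scripts with at least min_matches header words in the statement."""
    words = set(re.findall(r"[a-z]+", statement.lower()))
    scores = {name: len(words & hdrs) for name, hdrs in headers.items()}
    return [name for name, score in scores.items() if score >= min_matches]

print(candidate_scripts("Jay bought a ticket and found his seat in the stadium",
                        SCRIPT_HEADERS))
print(candidate_scripts("Jay passed the stadium on his way home",
                        SCRIPT_HEADERS))
```

The first statement matches three headers of the match script and so makes it a candidate; the second matches only one and invokes nothing.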

When a script is really appropriate, it can be put to many uses. For example, consider the following story: Manoj had a chocolate. Rajan wanted it. Manoj refused. Rajan told Manoj that if he did not give him the chocolate, he would not let him play with his friends. Manoj gave him the chocolate.

Now suppose a question is asked: “Why did Manoj give Rajan the chocolate?” A response can be “Because he was threatened by Rajan.” [1]

How could the program understand and get the idea of being threatened? There are no words in the story that indicate this bullying. It is possible only if the script contains such information.

When a script is invoked, it is also possible to predict something which is not mentioned in the statement. For example, consider the following statement.

Jay went to Epicurean (the name of a restaurant) with his office colleagues. As it started raining when he was paying, he called a cab service to return home.

Now if Jay’s automated cook asks the system whether she will have to cook for Jay, how should the system respond if the above statement has already been fed into it? The system has to decide whether Jay had his dinner. In fact, two things related to the restaurant script are mentioned: first, he went to a restaurant and second, he paid the bill. That is enough for the system to invoke the complete restaurant script and conclude that Jay had his dinner. Whenever a script is activated, all scenes are expected to have occurred in their stated sequence, and any reasoning based on that can be made. Thus scripts add predictability to the system: the system can predict unmentioned events. In a way, frames could also do so by providing slots and default values for slots. An important attribute of a knowledge structure is its ability to predict obvious and expected behaviour from a description even when that behaviour is not mentioned explicitly.
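Predicting unmentioned events from an activated script can be sketched as follows. The scene names in `RESTAURANT_SCENES` and the helper `infer_events` are assumptions for this example; the sketch assumes every scene up to the last mentioned one took place.

```python
# Hypothetical default sequence of scenes in a restaurant script.
RESTAURANT_SCENES = ["enter", "read menu", "order", "eat", "pay", "leave"]

def infer_events(mentioned, scenes):
    """Assume every scene up to the last mentioned one occurred, in script order."""
    last = max(scenes.index(event) for event in mentioned)
    return scenes[: last + 1]

# The story mentions only going to the restaurant and paying.
inferred = infer_events(["enter", "pay"], RESTAURANT_SCENES)
print(inferred)

# The cook's question: does dinner still have to be cooked for Jay?
needs_cooking = "eat" not in inferred
print(needs_cooking)
```

Since eating lies between the two mentioned scenes, the system concludes that Jay has had his dinner even though the story never says so.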

A diligent reader might again ask how the system comes to know that Epicurean is the name of a restaurant. There are a few possible solutions. The simplest is a database of all such restaurants. The second is to keep it as a question mark, read a few more statements and reason about what it could be. The third is to take a logical route: get the dictionary meaning of epicurean and relate it to a restaurant. The fourth is to ask the user what ‘the Epicurean’ is whenever he enters this statement. However strange it may look, the fourth option is the most human-like. When we hear that statement and have no idea what the Epicurean is, we do the same thing. We can even combine multiple approaches and store all such answers in the database; only if the item is not in the database do we ask the user.
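Combining the database approach with asking the user might be sketched like this. The function `resolve_entity` and the stand-in `fake_user` are hypothetical; an interactive system could use a real prompt in place of `fake_user`.

```python
def resolve_entity(name, known, ask):
    """Look the name up first; ask only when it is unknown, then remember the answer."""
    if name not in known:
        known[name] = ask(name)
    return known[name]

questions = []

def fake_user(name):
    """Stand-in for a real prompt such as input(); records what was asked."""
    questions.append(name)
    return "restaurant"

known_entities = {"Osho's Eatery": "restaurant"}
print(resolve_entity("Epicurean", known_entities, fake_user))  # asks the user
print(resolve_entity("Epicurean", known_entities, fake_user))  # answered from the database
print(questions)
```

The user is asked only once; the second lookup is answered from the stored entry.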

[1] A program based on scripts and CD, developed by Schank, actually did reason like this. That program was named SAM (Script Applier Mechanism) and was part of the Yale AI project. It was popularly known as the ‘Story Understander’, as the inputs to SAM are stories of this type and its responses are more human-like than usual.

One more thing, which we have already mentioned, is that scripts help the system check for unusual events. Suppose we encounter the story “Jay went to watch the match. The security person asked him to stand in a queue. There was a heavy rush at the ticket counter and the queues were very long. Jay stayed in the queue for an hour or so, got bored and returned home.”

Do you conclude that Jay watched the match? Some elements of the story are not part of the conventional script for watching a match, for example the mention of long queues. That Jay got bored before buying a ticket or entering the stadium clearly indicates a deviation from the normal sequence of events. This unusual sequence of events indicates not only that Jay did not see the match, but also a few more things: Jay likes cricket matches but dislikes standing in long queues or waiting too long (you might additionally conclude that he does not like cricket to that extent!).
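Detecting such a deviation amounts to comparing the story with the expected sequence of scenes. In the sketch below, the list `EXPECTED`, the story encoding and the helper `find_deviation` are all assumptions made for illustration.

```python
# Hypothetical default scene sequence for watching a match.
EXPECTED = ["go to stadium", "buy ticket", "enter stadium", "watch match", "exit"]

def find_deviation(observed, expected):
    """Return (scenes completed before any deviation, first unexpected event)."""
    completed = []
    for event in observed:
        if event in expected:
            completed.append(event)
        else:
            return completed, event
    return completed, None

story = ["go to stadium", "stand in long queue", "get bored", "return home"]
completed, deviation = find_deviation(story, EXPECTED)
print(completed)
print(deviation)
print("watch match" in completed)
```

The first event outside the script marks where the story left the normal path, and the scenes completed before it show that the match was never watched.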

In fact, scripts can also be used for cause-effect reasoning of the kind we discussed earlier. If we take the same story and ask “Why did Jay get bored?”, the answer is not that the security person asked him to stand in a queue, but that the queue was too long and was taking a lot of time. Such reasoning is possible if we have provided enough if-else paths in the script and described possible reasons for somebody getting bored. A good designer can anticipate and provide as many logical options as possible to make the script extremely useful.

Some other similar attempts

A script stands alone and is not connected directly with other scripts. As an AI problem solver accumulates more and more such scripts from real-world experience, it becomes harder to keep them organized collectively. MOPs (memory organization packets) were introduced by the same group of researchers. A MOP allows multiple scripts to be collected into a single large structure, so that the system reasons with that structure rather than with individual scripts. The idea of a goal enters the picture when we discuss MOPs. A MOP has multiple scenes which direct the problem solver towards achieving that goal. MOPs also bring in the idea of reusability; for example, in a script for watching a movie, some scenes, such as buying the ticket, may be similar to those for watching a match.

What a MOP does is combine such sequences of scenes to enable better, non-isolated reasoning. It thus acts on a single structure with logical connections, which is easier to manage than a collection of multiple scripts. Another important feature of MOPs is that they store two different types of information separately. The first type is known as ontological. This is information about objects which are related to other objects and have a typical contextual meaning. The ontology defines the context and the placement of the object in that context. For example, in a financial context the word bank invokes the idea of a financial institution, while in a river context the same word invokes the idea of a riverbank.

The other type of memory used by MOPs is known as episodic memory. Episodic memory stores the knowledge of the agent’s experiences. It is quite similar to scripts, but it can be modified and improved, and its episodes can be related to other such episodes. Episodic memory consists of events and is more or less static, unlike the semantic part, which continues to evolve [2]. One important property of MOPs is that they can learn from experience: script-like structures can be built from the knowledge learned through repeated experiences of such events, exactly as humans learn. When we go to a restaurant for the first time, we have little idea of how it works, but after a few experiences we start expecting menus, waiters, tables and typical sequences of events.

[2] Another answer to the question of how the system learned that the Epicurean is a restaurant is that it is mentioned somewhere in the ontology. The database that we referred to can be designed in many ways; one of the most useful is to hold it in ontological form.

One more important idea stressed in the design of MOPs is noticing similarities and differences between similar-looking experiences. For example, a program based on MOPs (called Swale) could relate two distinct stories. One was about the death of a very successful and fit racehorse. The other was about a spouse killing a spouse for insurance money. The program could suggest that the horse was killed for similar reasons. Another interesting attempt was a program called PAM (Plan Applier Mechanism). We might come across statements such as the following.

Abdul decided to earn more money. He called Rajan.

Normally we cannot relate these two statements unless we know what Abdul’s plans are. Rajan may be a businessman who had offered him a job at a faraway place, where Abdul did not initially want to go for family reasons; now he decides in favour of it. Rajan may be a smuggler, and Abdul is good with motorboats, so Rajan wants Abdul to work for him. Another possibility is that Rajan is a placement consultant and Abdul is a qualified techie who is looking for a job.

In short, the connection can be determined only if we know a few things: what Abdul’s goals are, what his plans are, and how his actions relate to his plans and goals. Unless plans and goals are understood, such statements are hard to understand. PAM was designed to work with this type of reasoning.

SAM, which we mentioned earlier, was an attempt to understand stories. There was another attempt, called TALE-SPIN, to generate a story from given facts. The idea was to look at what is currently believed to be true and generate further statements from it using logical inference (as we did earlier in our journey through predicate logic), while also creating other script elements (roles, props and so on).

Another attempt was the BORIS program, which could work with goals, assertions and, especially, situations in which a plan failed to work as expected. One typical example given in the paper by the inventor of this program (Prof. Dyer) is a case where the central character in the story went to a restaurant and the waitress spilled a glass of coke over him. The problem solver can correctly answer questions such as why he refused to pay the bill, even though the story mentions no relation between these two events.

There have also been a few attempts to improve semantic knowledge storage. DL, or description logic, is a branch of knowledge representation for storing information in a hierarchical way, in which complex knowledge is represented as a collection of simpler knowledge structures.

Another attempt is to organize world knowledge in semantic form using OWL, the Web Ontology Language. OWL has many similarities with frames: it is about storing classes and their attributes. XML is usually used to write OWL, but the standard does not make it compulsory.

Summary

Scripts were designed by the same research team that designed CD. A script describes the default scenes, entry conditions, people involved and items involved in a typical situation such as watching a match, watching a movie or going to a restaurant. Each script contains multiple scenes, which describe what happens in that script in the form of a CD representation. It is possible to have if-else paths and optional scenes in a script. Whenever a statement is encountered, it is checked to see whether it contains a reference to a known script from the database. If a reference is found, that script is invoked and activated. This helps the problem solver reason about things not explicitly mentioned in the statement. The matching process is tested when fleeting references are encountered; in that case the scripts are not invoked but kept as references. Deviations from the expected sequence help the problem solver learn about unusual events. There have been many other attempts to extract better meaning from natural-language statements, such as MOPs, PAM, SAM and BORIS, and to represent statements with better contextual understanding, such as DL and OWL.
