32 Introduction to Expert Systems

Bhushan Trivedi

Introduction

We have encountered the term Expert System (ES) a few times so far. This module introduces expert systems and provides an overview of them. We will look at the terminology used in the domain and at the integral parts of an ES.

Expert systems are designed to mimic human experts. They address tasks that are considered extremely complicated and that only human experts can otherwise manage. Examples of such tasks include a doctor diagnosing a disease and prescribing medicine for a patient; a civil engineer who studies the user’s requirements and produces the design of a house or another civil structure such as a bridge or a theatre; a computer scientist who studies the user’s requirements and produces UML diagrams to represent a computerised system that meets them; and a security expert who examines network traffic and decides whether an attacker is active. Expert systems are designed to attain the expertise needed to deal with such problems and to provide automated solutions for them.

Building such systems is not as easy as it sounds, as you have probably guessed from our discussion of AI topics so far. In fact, in most real cases it is hard to distinguish a true expert system from other systems, because most systems contain some ES component. Whenever a system tries to add humanlike features, it runs into the same issues that arise in a typical expert system development process. Examples include navigation by speech, handwritten character recognition, understanding customer behaviour, and suggesting to users what is good for them. One hardly finds a ‘pure’ expert system today that only mimics a human expert and does nothing else; most commercial systems are instead augmented with expert system components. As our discussion from here on applies to both cases, a pure ES and ES components that augment general commercial systems, we will not differentiate between a standalone expert system and one that is part of some other system.

Interestingly, most commercial systems have already reached a stage where all the features they can provide in the normal sense have been provided. Business competition drives vendors to look for new ideas to add spice to their products, and ES gives them an option. For example, many systems are augmented with the ability to reason with Big Data (data coming from a variety of sources, especially social networking sites, past records, or sensors, which is very large in volume, too varied to be stored as it is in conventional databases, and changing rapidly). Many systems, especially those running on smartphones, accept input other than from a keyboard, such as a stylus with handwritten character recognition or voice commands instead of typed commands.

Designing such systems is not like designing conventional systems and is more difficult than one may assume at first sight. This module and the subsequent modules will throw some light on the issue.

ES Tasks

An ES attempts to perform complicated tasks that could otherwise be performed only by human experts. That means an ES targets problems that are not a common person’s cup of tea. This is where ES deviates from the usual AI goal. None of the examples in the introduction is something a common person can do. For example, determining whether given network traffic is malicious is not possible even for a computer expert in a language such as Java or C++, or for a database expert, let alone a common person with little knowledge of computers.

However hard such problems look, they are definitely more structured than seemingly simple mundane tasks, which are actually harder to code. Expert systems deal with problems that are clearly defined and whose solutions are well documented. There are experts who can monitor, comment on, or rate the performance and can state whether a particular decision is right or wrong, and why. For example, if one would like to detect intrusion, a lot of data is available for detection and monitoring. If one would like to diagnose a disease, ample data can be provided as well. There are rules that define things such as a denial-of-service attack or a malicious packet. Programs designed to mimic experts have met with more success than the other AI problems we have mentioned so far, simply because they address problems that are more structured and can be cross-checked by experts. Some researchers go so far as to claim that ES is the only part of AI that is truly successful. Several areas have been addressed successfully by ES. The following are examples of pure expert systems although, as we have seen earlier, the trend is to have ES-like features in common systems. Here is a quick rundown.

ES have been built for many purposes: Hearsay for speech recognition; PROSPECTOR for helping geologists in mineral exploration; several expert systems for risk assessment in preterm birth; quite a few, in addition to Mycin, for diagnosis and suggestion of medicine; Dendral for determining molecular structure; MPAUV (Mission Planning for Autonomous Underwater Vehicle) for planning; systems assisting operators in the diagnosis and treatment of nuclear reactor accidents; SAINT and FROSH for solving student assignments in mathematics and other topics; crisis management systems (for example Toxic Spill Crisis Management); SMH.PAL for helping teachers teach disabled children more effectively; INCO (Integrated Communication Officer) for mission-critical control; and many others.

The study of AI in general, and ES in particular, not only gives us powerful technical solutions to many problems but also helps us learn how human experts decide and work to find solutions to extremely complicated problems.

What ES entails

As mentioned before, an ES tries to solve complicated problems as human experts do. It is basically a computer application, but one with specific requirements and therefore specific parts. Let us try to understand what an ES should have.

1. Domain knowledge: This is the most critical part of any expert system. It is the knowledge of the domain in which the ES works. For example, an intrusion detection system should know how computers function, how databases work, how an OS works and how these are related, in addition to detailed knowledge of how networks and the TCP/IP stack function, how attackers work, what the attacks are, what the detection techniques are, how to choose a detection technique for a given case, and so on. As another example, an ES for premature birth risk assessment needs knowledge of the many medical conditions that lead to such risk, over and above knowledge of the complete human body and its functioning, including the problems related to premature birth in mothers and the problems that arise in the infant’s body for the same reason. Thus every ES must have extensive knowledge of the domain it is designed for.

2. Real-time searching ability: Domain knowledge is of no use unless it can be searched properly and in real time. The most critical problem with expert knowledge is that it is not easy to index, and it often has to be searched by association. For example, when a mathematical theorem is to be proven, one has to choose the best methods based on the characteristics of the problem. Similarly, a medical database has to be searched based on the patient’s symptoms. Another example is to look at packets, find symptoms of intrusion, and use them to figure out the type of attack.

3. Heuristics: Real-time searching is impossible in most cases without heuristics. Fortunately, in most such cases the experts have worked out ‘rules of thumb’ for their domain. For example, security experts do not inspect all packets; they look for typical signs that indicate intrusion and examine only packets containing those ‘signatures’. A doctor does not consider all possible diseases or symptoms, only a handful of them based on the context. An educator does not consider all types of learners when deciding teaching strategies, only the few types associated with learning disabilities.

4. Inference: Time and again we have seen that it is not enough to compile knowledge into a knowledge base; it is also important to provide a means of inference from the existing knowledge. Many methods of representing domain knowledge exist, some of which we have looked at. We have also seen that inference is closely tied to the way we decide to represent our knowledge. Any expert system that adopts a particular knowledge representation (KR) scheme is constrained by the inference abilities and methods available for it. So the knowledge representation of the domain and inference go hand in hand.

5. Ability to process symbol structures: The knowledge representations we have seen so far do not use conventional methods for processing. We have seen that intelligent action requires processing symbols, so the system must be able to deal with symbols and symbol structures. From predicate logic, NMRS, and fuzzy systems to CD, scripts, and OWL, every KR system uses symbols to represent real-world entities and the relations between them. This clearly implies that an ES must have some form of symbolic processing ability. Sometimes ready-made programs with the ability to process symbols are provided; these are known as ES shells. They are better in the sense that they speed up ES development. Unfortunately, the current trend of mixing conventional systems with ES features demands coding in conventional languages, so ES shells are not much in demand.

6. Explanation facility: We have mentioned this a few times during our journey. A human expert can tune the explanation of any decision to the person receiving it. A doctor informs the patient about the disease and the decision taken in terms the patient understands. (For example, the doctor might say that you have mosquito bites and are feeling cold, so it is quite likely that you have malaria; or that the stomach pain indicates a stomach infection; or that stomach pain, nausea, and other symptoms indicate gastroenteritis.) An investment advisor explains the financial details in a way the client understands. In many cases the client does not accept the decision without due explanation. The explanation facility is therefore an essential component of any ES that deals with humans in one way or another.
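To make the parts listed above concrete, here is a minimal sketch in Python of a toy expert system with a small knowledge base, forward-chaining inference over symbols, and a simple explanation facility. The rule names, the fact names, and the two rules themselves are hypothetical illustrations loosely based on the malaria example; they are not medical knowledge or the design of any real ES shell.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    conditions: frozenset  # symbols that must all be known for the rule to fire
    conclusion: str        # symbol added to the known facts when the rule fires


# Hypothetical toy knowledge base (illustrative only).
RULES = [
    Rule("R1", frozenset({"fever", "chills"}), "suspect_malaria"),
    Rule("R2", frozenset({"suspect_malaria", "mosquito_bites"}), "order_blood_test"),
]


def forward_chain(facts, rules):
    """Fire rules until no new fact appears; remember which rule produced each fact."""
    known = set(facts)
    explanation = {}
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.conditions <= known and rule.conclusion not in known:
                known.add(rule.conclusion)
                explanation[rule.conclusion] = rule
                changed = True
    return known, explanation


def explain(fact, explanation):
    """Return a human-readable reason for a fact."""
    rule = explanation.get(fact)
    if rule is None:
        return f"{fact}: given as input"
    reasons = ", ".join(sorted(rule.conditions))
    return f"{fact}: concluded by {rule.name} because of {reasons}"


known, why = forward_chain({"fever", "chills", "mosquito_bites"}, RULES)
print(explain("order_blood_test", why))
```

Keeping the rules as data, separate from the inference loop, is what lets the same loop serve a different domain once the knowledge base is replaced.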

The ES Problem Solving

How do expert systems solve their problems? Let us try to understand through an example. Suppose a security expert receives a warning from the IDS (intrusion detection system) that there is an impending attack from a particular IP address (the unique address given to every computer system, smartphone, or similar device connected to a network or the Internet). Upon receiving such an indication, the security expert (usually the admin himself) may plan the following.

1. Check whether the warning is a false positive, i.e. a signal that indicates an attack when there actually is none. This happens when an attack-like signal is generated by genuine activity. (For example, a mail containing a discussion of “The Mumbai bomb blast case” might raise a warning because it contains the phrase “bomb blast”.)

2. If the warning is a false positive, it is ignored and normal network operation is resumed. In most cases, though, such events are logged in a security audit table with information such as the IP address of the machine and the port number of the process involved. Such an event also prompts changes in the security settings to make sure that such false positives do not occur again.

3. If the warning is valid but the infection is confined to one part of the network, usually a server or a demilitarized zone (the area containing public servers and a device known as the external firewall), cut that part off from the production network by giving specific commands to the internal firewall (which connects the production network to the demilitarized zone). Once that is done, the production network’s normal operations are allowed to continue.

4. If the warning is not a false positive and the intrusion has spread to the production network as well, a state of emergency is invoked, major servers are shut down, and appropriate actions are taken to safeguard the network from further spread or subsequent actions by the intruder. Such actions include detaching the network from the Internet and other networks by instructing the firewall; stopping infected processes and nodes; running anti-virus and other diagnostic tools, and cleaning software, on infected hosts and servers; and, if important files are infected and have to be deleted, fetching the most recent backup and restoring from it.

5. Another typical case is an attack without a warning. If no warning is received for an attack, it is called a false negative. Such events are reported by humans; for example, somebody might complain that his host is running slower than usual, that some server is not accessible, or that known programs are behaving unusually. In such a case, in addition to everything mentioned in step 4, the IDS must be tuned to catch such problems in future.
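The five steps above can be sketched as a small decision function. This is only an illustration of one possible plan, written in Python; the function name `plan_response`, its three boolean parameters, and the action strings are assumptions made for the sketch, not part of any real IDS or firewall interface.

```python
def plan_response(warning_received, attack_is_real, reached_production):
    """Return the administrator's actions, in order, for the given situation."""
    emergency = [
        "invoke state of emergency and shut down major servers",
        "detach the network from the Internet via the firewall",
        "stop infected processes and nodes",
        "run anti-virus and cleaning tools on infected hosts",
        "restore deleted files from the most recent backup",
    ]
    if not attack_is_real:
        if not warning_received:
            return ["continue normal operation"]
        # Steps 1 and 2: the warning is a false positive.
        return [
            "log the event in the security audit table",
            "adjust security settings to avoid this false positive",
            "resume normal operation",
        ]
    if not warning_received:
        # Step 5: false negative, reported by users rather than by the IDS.
        return emergency + ["tune the IDS to catch this attack in future"]
    if reached_production:
        # Step 4: the intrusion has spread to the production network.
        return emergency
    # Step 3: the infection is confined to a server or the DMZ.
    return [
        "cut off the affected server or DMZ via the internal firewall",
        "continue normal production operation",
    ]


for action in plan_response(warning_received=True, attack_is_real=True,
                            reached_production=False):
    print(action)
```

In practice the hard part is deciding the value of `attack_is_real`, which is exactly where the expert's domain knowledge and heuristics come in.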

Let me clarify that the discussion above is just one typical plan. Most administrators have their own plans, which may be drastically different. The idea is only to demonstrate one plan that an administrator might follow when a warning is received. We want to analyse the behaviour of the expert to learn what he is doing and the procedures involved in expert behaviour.

The expert begins his investigation by verifying the warning signal. For example, he might receive a warning that a process is trying to access a system file (one not normally accessed by anyone other than administrators). The expert tries to see what exactly that process is trying to do. The warning may be a false alarm. Here are a few cases illustrating the point. The administrator (most admins also have a normal user account) may actually be logged in with his normal account but trying to execute admin-level programs. Another example is a program that checks whether a mail contains the words ‘bomb’ and ‘blow’ and reports such suspicious mails to the admin. A mail containing “a laughter bomb that will blow you away” will also be picked up as suspicious. Yet another example is a site advisor that looks for pornographic sites and blocks them; it may also block a site about breast cancer.
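The mail example shows how a purely keyword-based check produces false positives. A minimal Python sketch follows; the function name `is_flagged` and the exact matching policy (both words must appear, case ignored) are assumptions chosen to mirror the example, not a description of any real mail filter.

```python
import re

SUSPICIOUS_WORDS = {"bomb", "blow"}  # the two words from the example above


def is_flagged(mail_text):
    """Flag a mail when it contains every suspicious word, ignoring case."""
    words = set(re.findall(r"[a-z]+", mail_text.lower()))
    return SUSPICIOUS_WORDS <= words


print(is_flagged("A laughter bomb that will blow you away"))  # True: a false positive
print(is_flagged("Minutes of the weekly staff meeting"))      # False
```

The filter has no notion of context, so a harmless joke is flagged in the same way as a genuine threat; the expert's verification step exists to catch exactly such cases.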

The expert also looks at the general condition of the network. He checks whether a computer system or access to a server has slowed down; whether a system is doing something strange (asking for a username and password when it should not, or denying access even when a genuine username and password are provided, and so on); whether a system file has been modified without any real purpose; whether access has been granted to an outsider without due process (this is done by changing a password file and rights); whether user rights have been escalated (a normal user’s rights changed to those of an advanced user or administrator); and so on. To confirm what the warning is about, he may base his decision on these observations about the network in addition to the warning itself.

The expert might observe that the system has really slowed down, that the particular server under consideration is not responding to commands, and that the process under consideration is really doing suspicious things, and conclude that the warning is valid (and thus not a false positive). Interestingly, he is using a computer system connected to the very network that is likely to be compromised. The server itself might signal that nothing is wrong while other observations suggest that it is compromised and that the malware affecting it is falsely reporting that everything is all right. This makes the expert’s task more challenging.

Once the expert concludes that there is some truth in the warning, he tries to see how far the attack has spread, which machines are affected, and which are not yet compromised. He might look at a few critical files and the processes running on those machines, and check for typical symptoms of intrusion.

Most computer attacks begin with exploring the target’s network, servers, and hosts. The attacker then tries to figure out whether any node is vulnerable to some known problem. He may find a known vulnerability in a particular OS version, a database, or the browser the user is using. Once he finds that vulnerability, he tries to see whether any exploit is possible. An exploit is a way to devise an attack that takes advantage of a given vulnerability. Once the attacker has the exploit, he runs it, compromises the node, and starts looking for other nodes or extracting more information from that node. Thus an attack is a step-by-step process in which the attacker gradually takes control of the network (see footnote 1).

As a next step, the security expert tries to assess the stage the attack is in. He might decide that the attack is in its first stage, so no real harm has been done so far, or that the attacker has already exploited the vulnerability of a particular machine and stolen data or installed spyware (software that watches the activities of the user and gleans confidential information such as bank account details and passwords) or a backdoor (a remote login program that allows the attacker to log in to the node from a remote place). If he finds that no harm has been done, he might terminate the attacking process first, let the system continue to run, and change the login and other credentials that are likely to have been compromised. If harm has been done, he might take specific steps to stem further spread of the attack and take remedial actions such as quarantining or terminating the process, or blocking the sender.

It is quite possible that the expert cannot confirm the attack. In that case too, he will run the system in safe mode until the attack is confirmed or ruled out. He uses a general rule that, in an uncertain situation, it is better to run the network in safe mode than to allow it to continue normally.

Interestingly, security experts have to deal with two different sets of objectives. One is what they perceive as the security goals for the users; the other is the set of demands arising from the production network’s requirements. For example, a security expert might want every user to keep a 20-character password and change it every week, for what he perceives as the best security. The user community cannot accept that. When suspicious activity is reported, the security expert would like to shut down the servers and not restart them until the attack is confirmed or ruled out. Many installations do not allow such a shutdown (take the case of a web server, a stock market server, or a cricket scores server). The plan that the expert designs or implements must satisfy such conflicting objectives. So the expert has to balance the inconvenience that system unavailability causes to clients against the system’s safety objectives in his decision-making process.

1 According to an interview with an intruder that the author read, the intruder took around six months to compromise a very secure network. At the end of it, he claimed that he knew that network better than even its network admin.

Two different types of ES knowledge

The knowledge that the expert exhibited in the problem-solving process described in the previous section can be categorized into two types. The first is about finding his way through the maze: looking at symptoms, assuming a typical attack and testing for it, confirming vague indications of an attack through further analysis, and using his experience to derive rules of thumb and apply them to incoming packets and the values of their parameters in order to suggest responses. If you pick some other expert, for example a doctor, you will find that the process is almost the same. He too looks at the patient’s symptoms, assumes a typical disease and tests for it by asking questions or ordering lab tests and studying their results, confirms vague indications of a disease through further investigation, and uses his experience to derive disease-related rules and apply them to the information provided by patients. You can see that both experts possess knowledge about this process. In fact, this part is common to the behaviour of many experts.

Even though the security expert and the doctor do a similar job and use similar tactics to find problems and devise solutions, a doctor cannot act as a security expert, nor can a security expert diagnose a patient. Why?

Though both these experts, and many others, exhibit the ability to solve problems, they also possess extremely powerful domain knowledge. A security expert knows a lot about operating systems, databases, browsers, computer languages, protocols such as TCP/IP, network behaviour, and typical attacks and their symptoms. He knows nothing about types of diseases, their symptoms, or their remedies, let alone names of medicines or other things commonly known to most doctors and sometimes to patients too.

In fact, the expert’s expertise is gauged by his ability to grasp and use such domain knowledge in problem solving. The more knowledge he has about the domain, the more valuable he is as an expert. Domain-specific knowledge determines the response the expert generates from the input. For example, take the case of intrusion detection. There is a tool called Wireshark which displays the traffic flowing in the network. It is very difficult for others to look at the traffic shown by Wireshark and determine whether there is a problem in the network. A computer expert (other than a security expert) will hardly be able to differentiate between normal and malicious traffic. A normal user might not even be able to decipher what he is viewing. That is where the expertise of the security expert comes into the picture. Security experts gain their expertise from their experience of attacks, their observations so far, and the rules of thumb they have developed over the years for identifying patterns of attack and possible solutions. Other experts (for example a doctor) have learned their own domain and have also learned how to find faults and arrive at solutions, but they cannot work in this domain because they lack the domain knowledge to which the problem-solving knowledge can be applied.

What is the bottom line? It is one of the most important principles of expert system design: the expert’s power (the level of expertise) is directly related to the domain knowledge he possesses. The expert is also good at navigating through the information and knowledge he has about the domain to solve its problems, but the domain knowledge itself is the more important of the two.

Another part of expert knowledge concerns the speed with which the expert can diagnose and solve problems. A doctor who diagnoses the patient’s disease in 2 minutes is considered far better than one who takes 10 minutes. An important part of the expert’s knowledge base consists of methods that provide shortcuts to the conventional processes. Based on their knowledge, doctors skip many steps and so diagnose the disease faster than others. This information, sometimes called knowledge about knowledge or meta-knowledge, is an important part of any expert system design. Thus the expert’s power is derived from both the domain knowledge and the meta-knowledge about that knowledge.

Types of domain knowledge

There are many ways to categorize domain knowledge. One way is to differentiate between superficial and detailed knowledge. Superficial knowledge can solve simpler problems, but complicated problems require detailed knowledge. Another way is to ask whether the knowledge is static or dynamic. Static knowledge does not change much over time, whereas dynamic knowledge changes frequently. Big Data introduces three more dimensions to the knowledge we possess: the volume of information, its velocity (the rate at which it arrives and changes), and its variety. Consider an expert system that gauges what people think about the next election and predicts which party has the better chance of winning by analysing Twitter comments and Facebook data for a huge sample of users. The amount of data (the volume) is very high. The rate at which new data arrives (the velocity) is equally high. The data comes in many forms: comments, likes, stories that users share, pages that they visit, friends that they keep in their accounts, and so on. Thus there is a lot of variety as well. In fact, an important operation for Big Data is cleaning and normalizing it before analysis. The data may have come from a source that is not very trustworthy. For example, an excerpt from the Facebook page of a party spokesperson cannot be taken as the public view on a topic. Sometimes the information comes from a secondary source and therefore requires validation. Duplication and inconsistencies in the data also need to be addressed. This type of domain knowledge processing demands an unprecedented level of expertise.

From the point of view of an AI researcher, we can classify domain knowledge into four categories. Let us see what those types are.

Our discussion so far yields a simple answer to the query, “What types of knowledge does one need to solve a problem?” The answer is “Two: facts and rules”.

Both facts and rules can be of two different types. One type of fact is always true, and one need not worry about its truthfulness. The other type is a fact whose truthfulness might change over time. We will call facts of the second type default assumptions.

Here are a few examples

Facts

1. A land attack has the same source and destination address.

2. Malaria is caused by a parasitic infection (Plasmodium), transmitted by mosquito bites.

3. A TCP packet header contains port numbers.

4. A red light is an indicator to stop.

5. A visual learner prefers a picture over a statement describing a concept.

Default assumptions

1. Slower than usual network operation indicates attack

2. There are more chances of malaria in rainy season

3. Faster vehicles are more prone to accidents

4. Application level attacks are most common

5. A student looking for a real world example is most likely to be a sensor
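The difference between the two kinds of facts can be sketched in Python as a knowledge base that never removes a fact but allows a default assumption to be withdrawn when new evidence contradicts it. The class `KnowledgeBase`, its method names, and the backup scenario are hypothetical and serve only to illustrate the idea.

```python
class KnowledgeBase:
    """Toy store separating permanent facts from revisable default assumptions."""

    def __init__(self):
        self.facts = set()     # always true
        self.defaults = set()  # assumed true until contradicted

    def add_fact(self, statement):
        self.facts.add(statement)

    def add_default(self, statement):
        self.defaults.add(statement)

    def retract_default(self, statement):
        # Only defaults can be withdrawn; facts are never removed.
        self.defaults.discard(statement)

    def holds(self, statement):
        return statement in self.facts or statement in self.defaults


kb = KnowledgeBase()
kb.add_fact("TCP packet header contains port number")
kb.add_default("slow network indicates attack")
print(kb.holds("slow network indicates attack"))  # True while nothing contradicts it

# Hypothetical new evidence: a scheduled backup explains the slowdown.
kb.retract_default("slow network indicates attack")
print(kb.holds("slow network indicates attack"))           # False after retraction
print(kb.holds("TCP packet header contains port number"))  # True, facts stay
```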

Similarly, rules are also of two types. The first type consists of well-defined, structured, clear-cut rules that do not change over time; they describe fundamental properties of the domain knowledge, sequences of events, and relations among the components of the domain knowledge. They are quite procedural in nature, so we call them procedural rules.

The other type is the heuristic rule, which we have described in earlier modules. Heuristic rules are rules of thumb, designed and derived by experts from their experience with the problem domain, and they contain the shortcuts that help experts diagnose a problem faster than others. Of all four types of knowledge, this is the one that determines the power of an expert. Interestingly, not all experts have the same set of such rules.

Here are some examples

Procedural (normal) rules

1. If the More Fragments (MF) bit of an IP packet is set, another fragment is coming from the channel.

2. If the patient has malaria, his red blood cells are being attacked.

3. If there is congestion, the traffic on that road experiences additional delay.

4. When the firewall allows only port 80 traffic, the only possible attacks are web server attacks.

5. Active learners learn better when activities are given in class.

Heuristic rules

1. If the packet contains more than 5 ‘/’ characters, it is likely to be a packet crafted for a slash attack.

2. If the patient has a fever and also feels cold in the rainy season, check for malaria first.

3. If a road that is 10% longer is 20% less congested than a shorter road, you can reach the destination faster by choosing the longer road.

4. When an IPS (see footnote 2) reduces the efficiency of the system by more than 20%, it is better to have an alternate traffic route for intrusion-related information.

5. If you are teaching a graduate engineering class, it is very likely that almost 50% of them are intuitors.

Thus every expert has to deal with all four types of domain knowledge, the last one being the most crucial and important for problem solving.
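The contrast between a procedural rule and a heuristic rule can be sketched with the first example from each list. In the Python sketch below, the fragment rule reads the More Fragments flag of the 16-bit flags and fragment-offset field of an IPv4 header, which holds by protocol definition, while the slash rule is only a rule of thumb with an adjustable threshold. The function names and the sample payloads are assumptions made for illustration.

```python
MF_MASK = 0x2000  # More Fragments flag in the IPv4 flags/fragment-offset field


def more_fragments_follow(flags_and_offset):
    """Procedural rule: if the MF flag is set, another fragment follows."""
    return bool(flags_and_offset & MF_MASK)


def looks_like_slash_attack(payload, threshold=5):
    """Heuristic rule: more than `threshold` '/' characters is suspicious."""
    return payload.count("/") > threshold


print(more_fragments_follow(0x2000))  # True: MF flag set
print(more_fragments_follow(0x4000))  # False: only the Don't Fragment flag is set

print(looks_like_slash_attack("GET /a/b/c/d/e/f HTTP/1.1"))  # True: 7 slashes
print(looks_like_slash_attack("GET /index.html HTTP/1.1"))   # False: 2 slashes
```

The first request may well be a legitimate deep URL, which is the nature of a heuristic: it is fast and usually useful, but it can be wrong, and different experts may choose different thresholds.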

Summary

We have introduced expert systems and discussed the common terms used with them. ES address the tasks that human experts are good at. We have seen examples of some well-known ES solutions. Most ES solutions today are not standalone systems but are offered as part of conventional, commercially available systems. Expert systems require domain knowledge represented with symbol structures, inference, searching ability, heuristics, and an explanation facility. ES problem solving involves the general problem-solving ability of an expert as well as the ability to derive solutions using extensive domain knowledge and the rules of thumb developed over years of experience. All experts exhibit the former type of knowledge, but their actual power is derived from their domain knowledge, and especially their heuristic knowledge. Experts deal with two conflicting sets of objectives, one improving the security of the system and the other improving user convenience. Domain knowledge has two types of facts, those that are always true and those whose truthfulness may change over time. Similarly, it has two types of rules, procedural and heuristic.

2 An IPS (Intrusion Prevention System) can drop malicious packets directly, unlike an IDS, which only informs the administrator about suspected attacks and lets him decide the further course of action. Because of this, an IDS can, and usually does, work on a copy of the packets, while an IPS has to work on the original packets, which introduces delay.
