20 Data Mining for Decision Support

Dr. Ashish Saihjpal


 

 

 

1.  Learning Outcome

After completing this module, students will be able to:

  • Understand the fundamentals of Data Mining (DM).
  • Understand the scope of Data Mining.
  • Understand the DM Architecture.
  • List various methods of Data Mining.
  • Get an overview of how Data Mining aids Decision Support.
  • Understand the industry-wide applications of Data Mining.

 

2.  Introduction

 

Data Mining (DM) is a process undertaken by enterprises to turn large volumes of raw data into relevant information. It uses software to look for patterns in large batches of data. With it, businesses can learn more about their customers and develop strategies, in line with business goals, that have a positive impact on sales. Data mining depends on effective data collection, warehousing and computer processing.

Data Mining helps establish relationships between the micro and macro environment factors that affect key performance indicators such as sales, profitability and revenue across industries like retail, hospitality, education and communication. For example, it may enable us to analyze micro factors such as the 4 P’s of Marketing (Price, Product, Promotion and Place) with respect to macro factors like technology, economic conditions and taxation laws in order to forecast sales trends. Hence, Data Mining offers the capability to organize transactional data and deduce relationships that enable business decision making. This can be understood from the Rubik’s Cube diagram in Exhibit 1, in which a scrambled (unstructured) cube is reorganized into an ordered, systematic arrangement.

 

Data mining tools can find solutions to business problems that have traditionally been difficult to resolve. They can drill down into databases and look for patterns, trends and other predictive information that experts may overlook as unimportant to the business.

 

Data mining tools help predict behavioral patterns, allowing businesses to make prompt, analytical decisions. Decision Support Systems can be integrated with Data Mining tools on existing hardware platforms, and the tools can be installed on the operating systems already in place. They are interoperable and scalable, so they can interface with new products and systems as business needs change.

 

2.1  How Data Mining Works

 

Data Mining works on the principle of Data Modeling, which translates data into a form that supports business processes. The data are mined by system experts and users of Information Systems.

 

Data mining consists of five major elements:

  • Loading of transaction data into the data warehouse.
  • Database management and data archival.
  • Data access for IS experts and analysts.
  • Data analysis using Data Mining software tools.
  • Graphical user interface for interpretation.

 

Data Mining is the bridge between transaction processing and analytical systems. Based on user queries, the mining software searches the database for patterns and relationships. The analytical tools it uses are of various types, such as neural networks and clustering.

 

Exhibit 3: How data mining works

Image Source: http://technologyspace.weebly.com/uploads/1/1/5/2/11524599/2505000.jpg?426

 

Exhibit 3 showcases the sequence of tasks that form the Data Mining process, from capturing information from multiple sources to pattern analysis and evaluation for business decision making.

 

Generally, any of four types of relationships are sought, as shown in Fig. 1.

 

  • Classes: Classification uses machine learning to assign data to pre-defined groups. Linear programming and decision trees are among the techniques used here. Based on the user's query, the DM tool searches the data and segregates it into separate classes. E.g., based on past trends, which sales executives are likely to resign within the first year of their employment?
  • Clusters: Clustering differs from classification in that the classes are not pre-defined; the technique itself defines the groups and places the objects in each. This can be seen in Exhibit 4, where customers have been grouped according to their spending patterns. Library management software that groups books by subject or author works in this way.
  • Associations: Data is mined to identify associations between items, as shown in Exhibit 5. This is used for market basket analysis. People who buy frozen snacks are likely to try variants of sauces and ketchup. People buying dresses are likely to match them with accessories or shoes. These associations are studied using historical data.
  • Sequential patterns: Data is mined to understand and interpret behavior patterns and trends over a period of time. This is exemplified in Exhibit 6. The likelihood that bed linen and curtains are purchased following the purchase of a new bed may help home furnishers understand patterns of consumer purchases.
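
The association relationship described above is usually quantified with three measures: support, confidence and lift. The following is a minimal sketch of that arithmetic, not a full market basket algorithm. The baskets and the helper name `rule_metrics` are hypothetical and invented purely for illustration.

```python
# Hypothetical shopping baskets, invented for illustration only.
baskets = [
    {"frozen snacks", "ketchup"},
    {"frozen snacks", "ketchup", "bread"},
    {"frozen snacks", "bread"},
    {"ketchup", "bread"},
    {"frozen snacks", "ketchup"},
]


def rule_metrics(baskets, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    n = len(baskets)
    has_a = sum(1 for b in baskets if antecedent in b)
    has_c = sum(1 for b in baskets if consequent in b)
    has_both = sum(1 for b in baskets if antecedent in b and consequent in b)

    support = has_both / n                             # share of baskets with both items
    confidence = has_both / has_a if has_a else 0.0    # P(consequent | antecedent)
    lift = confidence / (has_c / n) if has_c else 0.0  # confidence relative to baseline
    return support, confidence, lift


support, confidence, lift = rule_metrics(baskets, "frozen snacks", "ketchup")
print(support, confidence, lift)
```

A lift above 1 suggests the two items are bought together more often than chance would predict; a lift below 1, as in this toy data, suggests the opposite.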

3.  Scope of Data Mining

 

Data mining is an important part of the knowledge discovery process: it analyzes enormous sets of data and yields previously unknown, hidden and useful information and knowledge. Data Mining finds applications in multiple fields such as healthcare and medicine, transportation, insurance, hospitality and government. It can discover new correlations, patterns and trends in vast amounts of business data stored in data warehouses. Data mining software uses advanced pattern recognition algorithms together with mathematical and statistical techniques to sift through mountains of data and extract previously unknown strategic business information.

 

Many companies use data mining to:

  • Perform market basket analysis to identify new product bundles.
  • Find the root cause of quality or manufacturing problems.
  • Prevent high rates of customer churn and acquire new customers.
  • Cross-sell to existing customers.
  • Profile and segment customers.
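
Customer segmentation, the last use listed above, can be illustrated with a small clustering sketch. The text does not name a specific algorithm, so the example below assumes a plain one-dimensional k-means loop as a stand-in. The spend figures, the function name `kmeans_1d` and the choice of two segments are hypothetical.

```python
def kmeans_1d(values, k, iterations=20):
    """Group numbers into k clusters using a basic k-means loop."""
    ordered = sorted(values)
    # Deterministic start: take initial centres spread across the sorted values.
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    groups = [[] for _ in range(k)]

    for _ in range(iterations):
        # Assignment step: attach each value to its nearest centre.
        groups = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centres[i]))
            groups[nearest].append(v)

        # Update step: move each centre to the mean of its group.
        new_centres = [
            sum(g) / len(g) if g else centres[i] for i, g in enumerate(groups)
        ]
        if new_centres == centres:
            break  # assignments have stabilised
        centres = new_centres

    return centres, groups


# Hypothetical monthly spend per customer, invented for illustration.
monthly_spend = [120, 150, 135, 900, 950, 1020]
centres, groups = kmeans_1d(monthly_spend, k=2)
print(centres)
print(groups)
```

With this toy data the loop separates low spenders from high spenders without the segments being defined in advance, which is the distinction the text draws between clustering and classification.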

 

Data Mining tools enable quick data analysis for business-oriented queries. Because the process is automated, the cumbersome work of analyzing large data repositories is reduced considerably. Campaign management, message broadcasting to user lists and digital marketing are areas where data mining tools can provide business-oriented results with minimum investment.

 

Data Mining (DM) tools offer drill-down capabilities in huge data warehouses. Users can fetch business-critical information at the click of a mouse. The value proposition lies in the fact that DM software can be installed on existing hardware and software platforms, which maximizes the return on investment.

4.  The Data Mining Architecture

 

Data mining is a very important process in which business-critical, unexplored information is extracted from large volumes of data. A number of components are involved in the data mining process.

 

The following components make up the Data Mining Architecture:

 

  • Data Sources – These are the activities and touch points that lead to the accumulation of data: social media, cloud-based applications, activity-generated data, public information, legacy platforms and points of sale (PoS), e.g., Apple App Store, iTunes, Google Maps, Ola, banking apps, email clients, Twitter etc. Data from these multiple access points is unstructured. It needs to be organized, filtered and structured before it is passed into the data warehouse server. Multiple techniques are used for cleaning and integrating the data.
  • Database / Data Warehouse Server – The database or data warehouse server holds the store of data. The server is responsible for retrieving the relevant data based on the user's data mining request.
  • OLAP Server – The Online Analytical Processing (OLAP) server is part of Data and Business Analytics and brings together relational databases, data mining and reporting capability. Applications of OLAP include sales funnel reporting and analysis, daily MIS reports, financial reporting etc. OLAP tools enable analysis of multidimensional data from various perspectives. Multiple queries can be run and databases navigated with rapid execution times.
  • Data Mining Engine – It lies at the core of the data mining infrastructure. It is responsible for carrying out data mining techniques such as clustering, association and correlation, through which information is categorized.
  • Pattern Evaluation Modules – These measure how interesting a discovered pattern is by comparing its characteristics against a threshold value. They interact with the data mining engine to focus the search on interesting patterns.
  • Graphical User Interface – It provides the interface between the data mining engine and the end user. It makes the reporting easy to understand and interpret, displaying results using pictorial tools and diagrams.
  • Knowledge Base – It provides input to the data mining engine. The pattern evaluation modules interact with the Knowledge Base on a constant basis. It facilitates knowledge discovery and supports the extraction of interesting patterns from the data sources.
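
The flow between the components above can be sketched as a toy pipeline. This is only an illustrative simplification under stated assumptions: the records, the function names and the 0.5 threshold are all hypothetical, and the "mining engine" here does nothing more than count purchases. The OLAP server, knowledge base and graphical interface are left out.

```python
from collections import Counter

# Hypothetical raw records from two data sources; None marks a missing value.
pos_records = [("Alice", "Shoes"), ("bob", "dress"), ("Carol", None)]
app_records = [("dave", "SHOES"), ("Erin", "Dress"), ("Frank", "shoes")]


def clean(records):
    """Data cleaning: drop incomplete rows and normalise text case."""
    return [(c.lower(), item.lower()) for c, item in records if c and item]


def load_warehouse(*sources):
    """Integration: merge cleaned records from every source into one store."""
    warehouse = []
    for source in sources:
        warehouse.extend(clean(source))
    return warehouse


def mining_engine(warehouse):
    """Mining engine (simplified): count how often each item is bought."""
    return Counter(item for _, item in warehouse)


def pattern_evaluation(counts, threshold):
    """Pattern evaluation: keep only items whose share meets the threshold."""
    total = sum(counts.values())
    return {item: n / total for item, n in counts.items() if n / total >= threshold}


warehouse = load_warehouse(pos_records, app_records)
patterns = pattern_evaluation(mining_engine(warehouse), threshold=0.5)
print(patterns)
```

Keeping each stage as a separate function mirrors the architecture: the cleaning step could be swapped out, or the threshold tightened, without touching the mining step.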

 

4.1  Data Mining Methods

 

A comprehensive list of data mining techniques is given in Exhibit 9. These methods can be classified on the basis of Statistics, Artificial Intelligence, Operations Research Methods, Neural Networks, Stochastic Search Methods and the synergy between various Statistical Techniques.

 

From the gamut of DM tools and techniques, the most popular and commonly used ones are:

  • Neural Networks – This technique is loosely modeled on the way the human brain works. It is based on a collection of connected fundamental units called artificial neurons, analogous to the neurons of a biological brain.
  • Decision Trees – At its core, a decision tree comprises a root node, branches and leaf nodes. It helps to break down groups of data into multiple subsets. This predictive modeling approach has multiple uses in data mining. Chi-square Automatic Interaction Detection (CHAID) is a popular method.
  • Time Series Analysis – It refers to the analysis of data sets arranged in chronological order. It is applied in pattern recognition, weather and meteorological forecasting, seismology, astronomy, communications and control systems, to name a few.
  • Rule Induction – It is a technique in which general rules are extracted from a set of observed data. The rules may represent local patterns in the data or a complete scientific model of the data.
  • Regression – It helps to predict continuous numeric values from a data set. Regression analysis has applications across various industries; examples include financial forecasting, sales budget planning and environmental analysis.
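
As an illustration of the regression method listed above, the following minimal sketch fits a straight line by ordinary least squares and uses it for a forecast. The advertising and sales figures are hypothetical and chosen to lie exactly on a line so the result is easy to check by hand; the function name `fit_line` is likewise an assumption.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-like and variance-like sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: advertising spend (x) against sales (y), both in arbitrary units.
ad_spend = [1, 2, 3, 4, 5]
sales = [12, 14, 16, 18, 20]

slope, intercept = fit_line(ad_spend, sales)
forecast = intercept + slope * 6  # predicted sales at a spend of 6 units
print(slope, intercept, forecast)
```

Real data would not fall exactly on a line, and a production analysis would also report how well the line fits before relying on the forecast.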

The Card Protection Company – Whichever industry one looks at, each customer is unique and behaves in a unique way. How, then, can a machine predict the likely behavior or buying pattern of any single customer?

This question is answered by Data Mining, based on the underlying principle of market segmentation. Accumulating data about each user and maintaining a profile makes detailed analysis possible. Compiling user attributes such as demographics, past purchases, and purchase schedules, patterns and frequency can help answer a number of questions. It helps to segment customers with similar characteristics and present them with the best offers.

A similar exercise was carried out by the Card Protection Company. With its huge database of nearly 7 million customers, it employed DataInsight to carry out customer modeling. The unstructured data was organized with demographic segmentation profiles. Nearly 300 customer characteristics were narrowed down to 30. Decision Trees and CHAID analysis helped reveal the best categories for predicting customer behavior, such as frequent respondents, those who would respond within a month, those who would shop during sale periods and those who made purchases once a year. The company also narrowed down the age segments that were most responsive to promotional offers and coupons. The exercise helped the Card Protection Company tremendously.

Source: http://www.campaignlive.co.uk/article/172622/technique—using-data-mining-market-segmentation

5.  Data Mining for Decision Support

 

Decision support systems (DSS) are interactive application systems intended to help decision makers use data and models to identify and solve problems and make decisions. They incorporate both data and models and are designed to assist decision makers in the decision-making process. They support decision making but do not replace it. Exhibit 10 lists various models and techniques used for this purpose.

 

The mission of decision support systems is to improve the effectiveness, rather than the efficiency, of decisions. A decision support system can take many different forms; each is developed for a specific objective and is based on a particular decision process and set of methods, techniques and approaches. A DSS is designed in accordance with the decision-making process and the decision problems it is going to support.

 

The objective of data mining is to analyze data in order to discover implicit but potentially useful information and to uncover previously unknown patterns, relationships and knowledge hidden in the data.

 

Data mining is an interdisciplinary field which encompasses statistical, pattern recognition, and machine learning tools to support the analysis of data and discovery of principles that lie within the data.

 

Integrating data mining with decision support enhances the capability of the DSS, which can then handle more complex problems than before. Moreover, it can significantly improve current approaches to problem solving and create new ones by enabling the fusion of knowledge from experts with knowledge extracted from data.

 

Detecting Fraud Using Data Mining

 

 

As technology brings the world closer together, information is now available at the click of a mouse. However, this ease of availability and accessibility needs to be restricted to users who actually need the information. Unethical use of business-related information, unauthorized access and unsecured modes of transmission can lead to the collapse of an enterprise.

Data Mining has proven its ability to detect instances of fraud across various industries. Online transactions, e-commerce, internet banking and payments are platforms vulnerable to fraud and money laundering if not monitored critically. The two commonly used approaches to fraud detection, Machine Learning and Artificial Intelligence, both find application in Data Mining: they make it possible to recognize instances of fraud and raise an alarm.

Data mining techniques such as profiling, clustering, classification and time series analysis examine data sets to study transaction patterns. Unusual login information, mismatches in typing patterns and inappropriate account activity are instances where genuine users can be alerted. Data Mining also assists forensic analytics and biometric and retina analysis where logging into secure networks requires user validation.
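
A very simple form of the profiling idea above is to compare each new transaction with an account's past behavior and flag large deviations. The sketch below assumes a z-score rule with a cut-off of 3 standard deviations. That rule, the amounts and the function name `flag_unusual` are illustrative assumptions, not a method prescribed by the text.

```python
from statistics import mean, stdev


def flag_unusual(history, new_amounts, z_threshold=3.0):
    """Return the new amounts that deviate strongly from the account's history."""
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation of past amounts
    if sigma == 0:
        # No variation in the history: treat any different amount as unusual.
        return [a for a in new_amounts if a != mu]

    flagged = []
    for amount in new_amounts:
        z = (amount - mu) / sigma  # distance from the mean, in standard deviations
        if abs(z) > z_threshold:
            flagged.append(amount)
    return flagged


# Hypothetical transaction amounts for one account.
history = [40, 55, 38, 60, 45, 52, 48, 50]
incoming = [47, 58, 900]

flagged = flag_unusual(history, incoming)
print(flagged)
```

In practice a flagged transaction would prompt an alert or a further check rather than be treated as proof of fraud, since a genuine user can also make an unusually large purchase.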

6.  Applications of Data Mining

 

Data Mining has numerous applications in almost all sectors of industry and academia. It works on large volumes of data and can help uncover hidden patterns that can be used for business-critical use cases. Data Mining closely studies transactional data and organizes it into batches using statistical and computing tools to establish relationships and patterns.

 

Some of these applications are elaborated below:

 

  • The field of Bioinformatics uses data mining to study structural patterns in proteomics data and databases. In healthcare, data mining plays an essential role in data visualization and soft computing. It helps in forecasting trends and ensuring that resources are available to meet patient demand in various segments. Healthcare requires precision and accuracy. The use of Electronic Health Records to store patient records is a common trend. Data mining tools can compare symptoms, causes and treatments and provide suggestions in line with clinical best practices. Data Mining techniques study the data sources and use the analysis to build predictive models.

  • Today is the era of converging web technologies, 4G and high-speed connectivity. Establishing communication networks that transmit voluminous information in a secure, encrypted way to authorized users is therefore key. Data Mining is used extensively in the telecommunications industry. From office automation solutions to networks that enable voice and data transmission, data mining helps to understand business dynamics better and make decisions. DM helps with intrusion detection, tracing trespassing and fraud, and maintaining quality of service.

 

  • A bank can leverage data on credit card users to predict which customers are frequent users and would be keen to take up a card with new features. Using a small test mailing, the characteristics of customers with an affinity for the product can be identified. Recent projects have indicated more than a 20-fold decrease in costs for targeted mailing campaigns over conventional approaches.
  • Data mining can be applied to check how customer groups react to a promotion, how effective promotions are with respect to cost and benefit, which marketing channels have been successful for different campaigns in the past and so on. By analyzing this kind of information, retailers can create advertisements and design promotional activities. Exhibit 15 showcases several use cases for a Sales and Revenue Management system that is integrated with a data warehouse and customer profile data from retail stores.
  • Optimizing the price of every product is a difficult task. A number of factors pertaining to customer demand are considered before pricing. Normally, a price increase leads to lower sales and to customers adopting alternative products. Data mining can outline the demand for products and show how a price change for a particular product affects the sales of other products.
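
The pricing relationship in the last point above is commonly summarized as price elasticity: the percentage change in quantity sold divided by the percentage change in price. The sketch below works through that arithmetic for a product's own sales and for a substitute product. All figures and the function name `elasticity` are hypothetical.

```python
def elasticity(qty_before, qty_after, price_before, price_after):
    """Percentage change in quantity divided by percentage change in price."""
    pct_qty = (qty_after - qty_before) / qty_before
    pct_price = (price_after - price_before) / price_before
    return pct_qty / pct_price


# Hypothetical scenario: the price of product A rises from 100 to 110.
own = elasticity(500, 450, 100, 110)    # A's own unit sales fall from 500 to 450
cross = elasticity(200, 220, 100, 110)  # substitute B's unit sales rise from 200 to 220
print(own, cross)
```

A negative own-price value reflects sales falling as price rises, while a positive cross-price value indicates that customers are switching to the alternative product.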

7.  Summary

 

As business operations in a firm multiply, human expertise alone is no longer sufficient: it involves time, effort and considerable compensation. The same tasks can be processed by information systems designed to support business processes. Not only can they work round the clock, they can also tirelessly generate results again and again.

 

Data mining is the process of analyzing data in order to find information useful to the business. It involves selecting, exploring and modeling large amounts of data to uncover previously unknown patterns and ultimately arrive at comprehensible information from large databases. Today, multinational companies and large organizations have operations in many places around the world, and each place of operation may generate large volumes of data. Decision makers require access to all such sources in order to take strategic decisions. Data mining uses a number of techniques, including statistical analysis, decision trees, neural networks, rule induction and refinement, and graphic visualization. The combination of business acumen with the power of data mining techniques can help organizations gain a strategic advantage in their efforts to optimize customer management.

