8 Data & Information Management

Dr. Ashish Saihjpal


 

 

 

1. Learning Outcome:

 

After completing this module the students will be able to:

  • Define data and explain how it differs from information.
  • Understand the stages involved in conversion of data to information.
  • Understand the various stages of Data Processing Cycle and kinds of Data Processing.
  • Classify types of Data and Information.
  • Understand methods of Data Collection.
  • Understand the concept of Data and Information Management and Value of Information.

 

2.  Introduction

 

Data, in its simplest form, refers to the set of values a variable may take. These values may be numeric, alphanumeric or special characters. In a computer, each character corresponds to a code such as the ‘American Standard Code for Information Interchange’ (ASCII), a 7-bit code that is usually stored in one 8-bit byte. This binary format (a sequence of 0s and 1s) is what the computer works with. Data, when processed, yields information. Computer data may take the form of text documents, images, audio clips, software programs, or other types of data. It is processed by the computer’s Central Processing Unit (CPU) and is stored in files and folders on the computer’s hard disk.
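
The mapping from characters to binary codes can be illustrated with a minimal sketch. The function name `to_ascii_bits` and the sample string are assumptions made for this illustration; `ord` and `format` are Python built-ins.

```python
def to_ascii_bits(text: str) -> list[str]:
    """Return each character's ASCII code as an 8-bit binary string."""
    bits = []
    for ch in text:
        code = ord(ch)  # numeric code of the character
        if code > 127:  # ASCII defines codes 0..127 only
            raise ValueError(f"{ch!r} is not an ASCII character")
        bits.append(format(code, "08b"))  # pad to one 8-bit byte
    return bits


# A letter, a digit and a special character (illustrative sample).
print(to_ascii_bits("A1$"))  # ['01000001', '00110001', '00100100']
```

The leading 0 in every result reflects that ASCII itself needs only 7 bits; the eighth bit is padding when a character is stored in a byte.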

 

Data

 

Data can be likened to the ‘oxygen’ or ‘blood’ of the human body. Data is at the center of various systems and databases. Organizations function more efficiently when data flows freely between systems, processes and departments. The highest performing organizations pay close attention to data. Making diligent use of data helps an organization gain a competitive advantage over its rivals in the marketplace.

 

In today’s world it is equally critical for organizations to back up and store historical data for retrieval and safety. Data management is just as important as the means to obtain it. Competitive organizations ensure that the data captured is coherent with processes and accessible to individuals who need it well within the necessary timelines.

 

Information

 

Information refers to a collection of facts from which inferences may be drawn. Information is data that has been extracted, processed or transformed and presented so that insights can be drawn from it. Information today includes both electronic and physical forms (data, paper documents, electronic documents, audio, video, etc.).

 

Referring to the DIK Pyramid shown in Figure 1, data finds its place at the bottom of the pyramid. When the data is organized, processed and filtered it yields information and accumulation of this information leads to knowledge. Information helps improve the representation of an entity as it aids in decision making and reduces uncertainty in the organization.

 

Knowledge

 

Knowledge refers to know-how and expertise. The knowledge possessed by an individual is a product of his or her background and experience. As per the definition given by Gamble and Blackwell (2001), “Knowledge is convergence of values, contextual information, expertise, and grounded intuition that provides an environment and framework for evaluating and incorporating new experiences and information. It originates and is applied in the mind of the knower. In organizations it often becomes embedded not only in documents or repositories, but also in organizational routines, practices and norms.”

3.  Data Processing: Converting Data to Information

 

Data processing refers to the derivation of useful information from raw data. Figure 2 shows numeric data without any context. Giving such data a context, i.e. stating that it represents a date, a time or a monetary value, turns it into meaningful information.

 

 

 

Other examples of data processing are given below:

  • The examination branch hands over examination answer sheets to the concerned teacher. The teacher marks the answer sheets and enters the corresponding scores on a score sheet subject-wise. The teacher calculates the total and average score for each student. A report card is prepared for each student, and a master report sheet covering all students is prepared and kept on record.
  • An organization compiles the daily check-in and check-out times of all employees from biometric sensors connected to the Human Resource Database. The device maps the unique fingerprints of each employee to his or her employee ID. The employee ID is in turn linked to the employee profile, which holds all relevant information about the employee. The check-in and check-out times are used to calculate the number of hours worked. These details are then fed into the employee database against the employee’s compensation slab. At the end of the month, the salary is computed by multiplying the total hours worked by the compensation per hour, and the amount is credited.
  • The Library Management Systems of schools and colleges provide faster issuance and record keeping. Each book carries a unique bar code which is mapped to the database. The database is a repository of total books and those issued with the unique identification of the reader. It carries data of past details of issue and return dates. The optical reader reads the bar code and feeds the relevant information at the time of issuance and due date of returning the book. Sensors can detect any infringement.
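
The attendance example above can be sketched in a few lines. This is only an illustration under stated assumptions: the timestamp format, the sample attendance rows and the hourly rate are made up, and the function names `hours_worked` and `monthly_salary` are hypothetical.

```python
from datetime import datetime

TIME_FORMAT = "%Y-%m-%d %H:%M"  # assumed format of the sensor timestamps


def hours_worked(check_in: str, check_out: str) -> float:
    """Hours between one check-in and the matching check-out."""
    start = datetime.strptime(check_in, TIME_FORMAT)
    end = datetime.strptime(check_out, TIME_FORMAT)
    return (end - start).total_seconds() / 3600


def monthly_salary(attendance: list[tuple[str, str]], hourly_rate: float) -> float:
    """Total hours worked multiplied by the compensation per hour."""
    total_hours = sum(hours_worked(i, o) for i, o in attendance)
    return round(total_hours * hourly_rate, 2)


# Raw data: check-in/check-out pairs for one employee (illustrative values).
attendance = [
    ("2024-01-01 09:00", "2024-01-01 17:30"),
    ("2024-01-02 09:15", "2024-01-02 17:45"),
]
print(monthly_salary(attendance, hourly_rate=20.0))
```

The timestamps on their own are data; the salary figure derived from them is the information the payroll department acts on.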

 

3.1  Stages of Data Processing Cycle

 

Different stages of a data processing cycle are represented in Figure 3 and explained below:

 

  1. Collection of Data – Collection of data is a very important step in the process, since the quality of the data collected largely determines the quality of the output. A well-defined problem gives a better estimate of the data required, including what data and how much of it is to be captured. The collection process needs to ensure that the data gathered is both well defined and accurate.
  2. Preparation Stage – Raw data cannot be used for processing as it is. The preparation stage converts the data into a format that the computer understands. This includes classifying, coding and rearranging the edited raw data.
  3. Input of Data – In this stage the data is converted to a machine readable form. Data is fed in via input devices such as a keyboard or scanner, or entered from an existing source. Because data entry is complex and costly, the data needs to follow a pre-defined format. Many businesses outsource this stage to third parties.
  4. Processing Stage – The data is manipulated by sorting, calculating, updating etc., usually by following a set of procedures or instructions. Numerous software programs are available for processing large volumes of data within very short periods.
  5. Storage – In this stage the usable information is stored and backed up for future reference. Storage also allows data to be fetched as per business needs and passed on to the next stage directly. Every computer uses storage to hold system and application software.
  6. Output and Interpretation – This is the final stage, where the user receives the processed information. The output may be viewed on the computer monitor or delivered as a report, possibly accompanied by an audio or video file. The output is the processed information that guides future decisions and meets the business needs.
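
The six stages can be traced in a small sketch built on the score sheet example from earlier in this section. Everything here is illustrative: the student names and marks are made up, the function names are hypothetical, and storage is simulated with a JSON string rather than a real database.

```python
import json
from statistics import mean

# 1. Collection: raw marks as they might be copied from a score sheet.
raw_rows = ["Asha, Maths, 78", "Asha, Physics, 85",
            "Ravi, Maths, 64", "Ravi, Physics, 71"]


def prepare(rows):
    """2. Preparation: clean the raw text and code it into uniform records."""
    records = []
    for row in rows:
        name, subject, score = (field.strip() for field in row.split(","))
        records.append({"student": name, "subject": subject, "score": int(score)})
    return records


def process(records):
    """4. Processing: compute the total and average score per student."""
    scores = {}
    for rec in records:
        scores.setdefault(rec["student"], []).append(rec["score"])
    return {s: {"total": sum(v), "average": mean(v)} for s, v in scores.items()}


def store(report):
    """5. Storage: serialize the result (a stand-in for a file or database)."""
    return json.dumps(report, sort_keys=True)


def output(stored):
    """6. Output: present the processed information to the user."""
    for student, result in json.loads(stored).items():
        print(f"{student}: total={result['total']}, average={result['average']}")


# 3. Input: the prepared records are handed to the program.
records = prepare(raw_rows)
output(store(process(records)))
```

Each function corresponds to one stage, so a defect in the final report can be traced back to the stage that introduced it.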

 

3.2 Kinds of Data Processing

 

Figure 4 highlights the three distinct techniques of data processing. These are:

 

  1. Manual Data Processing – Data is processed solely by people, without the use of automated tools. All calculations on the data are performed by hand. This method is slow and time consuming, and the chances of error are much higher.
  2. Mechanical Data Processing – Mechanical processing is faster and more reliable than manual processing. Mechanical devices such as a calculator or typewriter help to perform calculations faster; a mechanical billing system generates a bill faster than a corresponding manual one. This leads to fewer errors in computation.
  3. Electronic Data Processing – This is the fastest and most accurate mode of data processing today, in which a computer executes a set of programmed instructions. Electronic data processing has industry-wide applications. Enterprises, public and private offices, and banking, financial and academic institutions commonly use this method.

 

Case: Domino’s: Digitizing the Pizza Slice

 

Domino’s, a household name among fast food restaurants, serves nearly 85 markets across the globe. Domino’s AnyWare (refer Exhibit 1) was an ambitious initiative to let customers place orders from multiple access points. These include text messages, Twitter, Amazon Echo, a car, TV or smart-watch app, and a voice-activated mobile app, to name a few. The application enabled Domino’s to capture a large amount of data that can be used to enhance customer service and generate revenue.

 

The data captured is fed into the Domino’s Information Management Framework. It is mapped to geocode information that derives the coordinates of customers and is combined with postal service, demographic and competitor data to allow in-depth customer segmentation and profiling. The “zero click ordering” option offered by Domino’s works on similar know-how (refer Exhibit 2).

 

Information is collected through multiple touch points and accumulates in both structured and unstructured formats, pouring into the system every day. Domino’s is a good example of how enterprises can leverage customer data and turn it into information that enhances cross-selling and up-selling. This storehouse of information allows Domino’s to offer a basket of options unique to each customer’s buying history and choices.

4.  Classification of Types of Data and Information

 

4.1  Classification of Types of Data

At the highest level, two main types of data exist namely Qualitative and Quantitative Data, as shown in Figure 5.

 

Quantitative data is data that can be stated and measured numerically, such as weight, length, height etc. Qualitative data, on the other hand, comprises those parameters that cannot easily be measured, such as suitability, color, applicability etc.

  • Quantitative Data: It can be further classified as discrete and continuous data.
  • Discrete data takes only whole-number values and cannot be expressed in decimals or fractions. E.g., the number of patients in a hospital is discrete data, since we count whole, indivisible entities; the count cannot be 2.5 or 1.6.
  • Continuous data, on the other hand, can be measured to ever finer levels. For example, fetal growth during pregnancy is measured by ultrasound as the length (inches) and weight (grams) of the baby. Height, weight, length, width etc. are therefore examples of continuous data.
  • Qualitative Data and Scales of Measurement: Nominal, Ordinal, Interval and Ratio Data

While classifying or categorizing something, we create qualitative or attribute data. Data is commonly described on four scales of measurement, as shown in Figure 7. Strictly speaking, only the nominal and ordinal scales describe qualitative data; the interval and ratio scales carry numeric meaning and describe quantitative data.

Nominal Data – The word ‘Nominal’ is derived from the Latin nomen, meaning ‘name’. Nominal data therefore assigns a label or name to a variable. The labels carry no rank; they serve only for identification. Nominal data is, in essence, categorical data. For example, the jersey numbers printed on the back of players’ shirts merely identify the players on the field. Another example is the labeling of items in a grocery store that customers purchase only once in two months. Exhibit 4 illustrates a few other examples.

 

 

Ordinal Data – Ordinal data is not only placed in an order but also carries a rank by which it is identified. Although ordinal data has a natural ordering, the difference between two values cannot be stated precisely. Percentile scores of students and grades assigned for their performance are examples of the ordinal scale. Exhibit 5 gives another example of ordinal data.

Interval Data – Interval data not only states the ranked order but also specifies the exact difference between values, so that the data can be computed and stated clearly. There is no natural zero point; the position of zero is arbitrary. For example, in the twelve-hour or twenty-four-hour clock format shown in Exhibit 6, the difference between any two times can be stated in exact terms.

 

Ratio Data – The ratio scale is the most informative level of measurement. Its additional property is an absolute zero, which indicates the absence of the quantity being measured. An example of a ratio scale is the amount of money in a bank account. Money is measured on a ratio scale because, in addition to having the properties of an interval scale, it has a true zero point: zero money means the absence of money. Other common examples are distance travelled, speed, elapsed time and weight, as shown in Exhibit 7.
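
One practical consequence of the four scales is that each supports a different set of summary statistics. The sketch below encodes a commonly used rule of thumb (mode for nominal, median added for ordinal, mean added for interval and ratio). The table `ALLOWED`, the function `summarize` and the sample values are assumptions made for this illustration.

```python
from statistics import mean, median_low, mode

# Rule of thumb: which statistics are meaningful on each scale.
ALLOWED = {
    "nominal": {"mode"},
    "ordinal": {"mode", "median"},
    "interval": {"mode", "median", "mean"},
    "ratio": {"mode", "median", "mean"},
}


def summarize(values, scale):
    """Compute only the statistics that are meaningful for the given scale."""
    allowed = ALLOWED[scale]
    summary = {}
    if "mode" in allowed:
        summary["mode"] = mode(values)
    if "median" in allowed:
        # median_low always returns a value that occurs in the data,
        # which suits ranked categories.
        summary["median"] = median_low(values)
    if "mean" in allowed:
        summary["mean"] = mean(values)
    return summary


jersey_numbers = [10, 7, 10, 23]         # nominal: labels only
grade_ranks = [1, 2, 2, 3]               # ordinal: 1 = best grade
account_balances = [100, 200, 200, 500]  # ratio: true zero exists

print(summarize(jersey_numbers, "nominal"))
print(summarize(grade_ranks, "ordinal"))
print(summarize(account_balances, "ratio"))
```

Averaging jersey numbers would run without error but would mean nothing, which is why the scale has to be known before any statistic is chosen.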

 

 

4.2  Classification of Types of Information

 

Information can be classified in multiple ways for a better understanding. As per John Dearden of Harvard University, information can be classified as follows:

 

Action vs. No-Action Information: Information that prompts an action is called action information. E.g., a “non reconciliation of accounts” report calls for an action to be initiated and is therefore action information. Information that only communicates a status is no-action information. The monthly statement is an example of no-action information.

 

Internal and External Information: Information compiled from sources within the organization is termed internal information, while information drawn from secondary data sources such as government reports, industry surveys etc. is termed external information.

 

Recurring vs. Non-Recurring Information: Information compiled at periodic intervals is recurring information. The sales funnel, quarterly revenues, profits, stock statements etc. are recurring information. A financial analysis or a report on a market research study, on the other hand, is a good example of non-recurring information.

 

Further, on the basis of application, information can be categorized as:

 

Planning Information: Information that forms part of, or the foundation for, the strategic or operational planning of any activity is termed planning information, e.g. specification sheets, product manuals, time standards etc.

 

Control Information: Control as a management function means setting standards, measuring actual performance and taking corrective measures accordingly. Hence, when the status of an activity is reported through a feedback mechanism, it is called control information. When such information shows a deviation from the goal or objective, it induces a decision or an action leading to control, e.g. security, tracking control systems, server logs etc.

 

Knowledge Information: A collection of information gathered from library records and research studies to build up a knowledge base is known as knowledge information.

 

Organization Information: When the information is for wide circulation and is used by everybody in the organization, it is called Organization Information. Product manuals, notifications, work orders are used by a number of people in an organization.

 

Functional/Operational Information: Information used by specific business departments is called functional/operational information, e.g. logistics information, delivery schedules. This information is mostly internal to the organization.

 

Database Information: Information that has multiple uses and applications is called database information, e.g. address directories, supplier information.

 

On the basis of management hierarchy, information can be categorized as:

 

Supervisory Level – The supervisory level of information relates to operational tasks and is required at each level of the organization. E.g., business correspondence, circulars, reports and spreadsheets. To ensure smooth flow of information from bottom to top level, a clear and easy to use management information system should be provided so as to fulfill operational, business and decision making goals.

 

Middle Level – Such information sharing occurs within teams, divisions, business units, etc. This information may be critical to the day-to-day activities of the group. E.g., project documentation, business unit specific content, meeting minutes, etc.

 

Top Level – At the top level, corporate information is useful for the whole organization, as it defines the direction of the business. This information is generally made available through the corporate intranet. Examples of corporate information include policies and procedures, HR information, online forms, the phone directory, etc.

 

It is important to note that an information management solution must be provided to staff at each of the three levels; otherwise uniformity in problem handling and cohesion among teams cannot be established.

 

The flowchart in Figure 8 gives an overview of the categories of information discussed above.

 

5.  Methods of Collecting Data

 

The data collection methods used by analysts are called fact-finding techniques, and they affect the quality of information. The design of the data collection method likewise determines the quality of the data and the information derived from it. The methods of data collection and processing become a part of the MIS. It is therefore essential for an MIS expert to understand the potential problems of bias, accuracy and precision in the various methods, along with their applications. The various data collection techniques are illustrated in the table in Figure 9, along with suitable examples.

 

6.  Data Management vs. Information Management

 

Data management is a subset of information management. Its essentials are given below:

  • Data is managed as a valued resource since capturing data requires both time and effort.
  • Data management is an end-to-end process comprising the acquisition, accumulation, processing, security, documentation and archival of data as per business utility and needs.
  • It further includes practices for creating metadata and documentation for the long term.
  • Therefore, the underlying aim of data management is to ensure that the data is valid, accurate, complete and secure, and to develop and execute data architectures and procedures that manage the full data lifecycle.
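
The requirement that data be valid and complete can be made concrete with a simple record check. This is a minimal sketch: the field names, the 0 to 24 hour range and the function `validate` are all hypothetical and chosen only for illustration.

```python
# Hypothetical fields every attendance record is expected to carry.
REQUIRED_FIELDS = ("employee_id", "name", "hours_worked")


def validate(record: dict) -> list[str]:
    """Return a list of problems found in one record (empty if none)."""
    problems = []
    # Completeness: every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            problems.append(f"missing {field}")
    # Validity: hours worked in a day must lie between 0 and 24.
    hours = record.get("hours_worked")
    if isinstance(hours, (int, float)) and not 0 <= hours <= 24:
        problems.append("hours_worked out of range")
    return problems


print(validate({"employee_id": "E1", "name": "Asha", "hours_worked": 8}))
print(validate({"employee_id": "E2", "name": "", "hours_worked": 30}))
```

Running such checks at the point of capture is cheaper than correcting faulty information after it has been used in a decision.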

Misplacing information is a serious problem, but holding incorrect information is a far bigger one. Information that is needed frequently should be well organized and easy to access. Having the right information and being able to get it in real time ensures higher productivity. Once information is captured, it must be stored so that people can use it and make changes as needed. Information management involves both a physical and a logical view. The physical view deals with how information is stored on storage devices, while the logical view deals with how the information is arranged for those working with it. This gives rise to the need for Information Management in organizations.

 

Information Management can therefore be defined as:

  • The concept of managing processes, the technology and the people who use them to control the usage of information required for management and business intelligence purposes.
  • Formulating the organizational structure for handling flow of information and its timely delivery.

 

6.1  Value of Information

 

Information has a cost of acquisition and maintenance. Thus, before a particular piece of information is acquired, the decision maker must know its value from an economic, business, technical and risk perspective (refer Figure 10). Information has a perceived value in terms of decision making: when deciding under uncertainty or risk, the decision maker feels more confident on receiving additional information.

 

 

  1. Economic Dimension: This covers the cost of acquiring information and the benefits derived from it. The total cost of information is the sum of the costs of data acquisition, data maintenance, information processing and communication. A system that must respond quickly costs more; the cost also depends on accuracy, speed of generation etc. Value of information = Benefit derived from information – Cost to get information
  2. Business Dimension: Each level of management has different information needs as the managers have different responsibilities and functions to perform in an organization. Therefore, the value of information is perceived on the basis of its functionality.
  3. Technical Dimension: This dimension refers to the technical aspects such as database queries, turnaround time to respond, security, validity, deriving relationships etc.
  4. Risk Dimension: Handling large amounts of information brings its own major concerns. Confidentiality and secrecy are of paramount importance to ensure that information is not misused. This calls for multiple levels of security and continuous monitoring via hardware and software.
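
The economic dimension can be worked through numerically. The sketch below takes the value of information to be the benefit derived from it minus the total cost of obtaining it, with the total cost built from the four components named under the economic dimension. The class `InformationCost`, the function `net_value` and all figures are assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass
class InformationCost:
    """The four cost components listed under the economic dimension."""
    acquisition: float
    maintenance: float
    processing: float
    communication: float

    def total(self) -> float:
        return (self.acquisition + self.maintenance
                + self.processing + self.communication)


def net_value(benefit: float, cost: InformationCost) -> float:
    """Value of information taken as benefit minus total cost."""
    return benefit - cost.total()


# Illustrative figures in arbitrary currency units.
cost = InformationCost(acquisition=400.0, maintenance=150.0,
                       processing=250.0, communication=50.0)
print(cost.total())
print(net_value(benefit=1200.0, cost=cost))
```

A negative result would indicate that the information costs more to obtain than the decision it supports is worth.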

 

7.  Summary

 

In any organization, data is the single most critical entity on which the knowledge blocks are built. Extensive effort goes into acquiring data from multiple access points to derive information. The data is then analyzed, stored, interpreted and verified against innumerable data sources. To support this, a formal and structured framework needs to be put in place to handle large volumes of data. Such a framework not only facilitates authentication but also enables quick information search and transaction processing for business applications. Information systems and their infrastructure form the platform that makes it possible to store, analyze and retrieve the necessary information and to make quick decisions in industrial and academic applications.
