25 Basic Principles of Experimental Design

Dr. Harmanpreet Singh Kapoor

Module 31: Basic Principles of Experimental Design

 

  • Learning Objectives
  • Introduction
  • Applications of Experimental Design
  • Basic Principles of Design of Experiment
  • Steps for the Construction of Experimental Design
  • Use of Statistical Techniques in Experimentation
  • Summary
  • Suggested Readings
  1. Learning Objectives

 

In this module, our main objective is to introduce the basic principles of experimental design. We discuss the three basic principles in detail because of their importance and their wide application in areas such as agriculture, social science, the service sector, biology and manufacturing. Examples are included to aid understanding.

 

  2. Introduction

 

In statistics, data play a vital role, and it is essential to have a clear plan for collecting and analyzing them. Such a plan, covering every stage from the statement of the objective to the final conclusion, is known as a statistical design. In practice, one is interested in drawing inferences from experiments, because experiments help us to understand a system and to suggest improvements to it. Some experiments are conducted in a laboratory under controlled conditions, while others are conducted in open, uncontrolled environments such as agricultural fields. The theory and methods used to plan and analyze experiments are termed experimental design. It is a collection of techniques that guides the experimenter on how to minimize cost, how to choose the experimental units, which model is most appropriate for the given objective, and how to analyze the data and draw conclusions.

 

In general, our main interest is to understand a system or process in detail while it is operating and to identify cause-and-effect relationships. To do so, we either change the input variables or conduct the same experiment under different conditions. This leads to statements or theories about the system, and the main objective of the experiments is to determine whether these statements are supported by statistically significant evidence.

 

An experiment can be regarded as a test, or a series of runs of the same technique or method, in which changes are made to the input variables so that the resulting changes in the output values can be observed. Our main interest is to find the factors or variables responsible for significant changes in the output response and to develop a model for the response variable in terms of the significant input factors. This model can then be used to improve the system, to support further decisions and to forecast future values of the response for given values of the input factors.

 

 

In this module, we briefly discuss how to make a good plan, how to conduct an experiment, and how to analyze the data using inferential statistics so that valid conclusions can be drawn about the objective.

 

 

Experiments are widely applied in areas such as science and industry. In many situations scientific laws apply so well that, simply by understanding them, one can develop a mathematical relationship between the factors and then check its significance using statistical techniques. In science and production, experiments on the available units help us to understand how the process works. It is therefore important to choose an appropriate experimentation technique, because the conclusions about the system depend on it. Conclusions are reached step by step: first we select the units or factors of interest, then we use statistical methods to collect the information, then we analyze it, and finally we state the conclusions about the objective.

 

In general, a process is considered as a combination of resources that transforms input factors into output responses. In the production sector, the input factors are people, raw materials and other things that are needed at the initial stage to start the process, whereas the output responses are what one obtains at the final stage. Some of the input factors or variables are controllable (assignable), meaning that their properties can be modified as required, whereas others are uncontrollable (non-assignable). The purposes of conducting an experiment may include the following:

 

(a) To find out which variables have the most influence on the response variable.

(b) To see the impact of the influential variables on the value of the response variable.

(c) To check whether a small change in an influential variable leads to a significant change in the response variable.

(d) To see how to minimize the effect of the uncontrollable variables on the response variable.
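
Purpose (a) can be illustrated with a small sketch. It assumes a two-level experiment in which every factor is set either low (-1) or high (+1), and it estimates the main effect of each factor as the difference between the average response at the high level and at the low level. The factor names and the response values are hypothetical and serve only as an illustration; they do not come from any real experiment.

```python
def main_effects(runs):
    """Estimate the main effect of each factor in a two-level experiment.

    `runs` maps a tuple of coded levels (-1 or +1, one per factor)
    to the response observed for that combination of levels.
    """
    n_factors = len(next(iter(runs)))
    effects = []
    for k in range(n_factors):
        high = [y for levels, y in runs.items() if levels[k] == +1]
        low = [y for levels, y in runs.items() if levels[k] == -1]
        # Main effect = mean response at high level - mean response at low level
        effects.append(sum(high) / len(high) - sum(low) / len(low))
    return effects


# Hypothetical crop yields (illustrative numbers only).
# Factor 0: fertilizer dose, factor 1: seed variety.
runs = {
    (-1, -1): 20.0,
    (+1, -1): 30.0,
    (-1, +1): 22.0,
    (+1, +1): 34.0,
}

effects = main_effects(runs)
# Index of the factor with the largest absolute main effect
most_influential = max(range(len(effects)), key=lambda k: abs(effects[k]))
print(effects, most_influential)
```

A large main effect only suggests that a factor is influential; deciding whether it is statistically significant requires an estimate of the experimental error, which is obtained from replication.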

 

Hence an experiment involves several factors, and the objective of conducting it is to find the factors that influence the response variable of the system. The whole process of planning and conducting the experiment is termed the experimental strategy. The experimenter may change the strategy with time or because of a structural change in the system. In most areas it is advisable to start with a simple strategy and then move to a more complex one.

 

In the next section, we discuss some applications of experimental design that help to show its importance.

 

  3. Applications of Experimental Design

 

In general, we come to understand a process through a series of experiments: we make assumptions about the process, collect data, and reach conclusions after statistical analysis. Based on these conclusions we form new assumptions, which lead to new experiments, and so on. Hence experimentation is an iterative process that repeats itself over time.

 

The use of experimental design in a process can lead to

 

(i) an improvement in the process outcome;

(ii) reduced variability in the response variable;

(iii) reduced overall cost;

(iv) reduced development time.

 

Experimentation also helps to compare the alternatives available for the process in terms of raw materials, machines, human resources, etc. It also helps to understand the relationships between the factors, which in turn explain the characteristics of the output variables. For example, in the manufacturing industry one can check the robustness of a product to changes in the environmental conditions or input variables. This helps the experimenter to draw conclusions about the quality of the product before making it available to the end user.

 

Experimentation also helps in determining the key parameters and in formulating new products. Hence the use of experimental design helps in developing products that are easier to manufacture, more reliable and better optimized, with lower production cost, shorter development time and simpler product design. Because experimental designs are applied so extensively in the service industry, the manufacturing sector, software development and market research, some knowledge of this topic is now essential.

 

In the next section, we discuss the three basic principles of design of experiments.

 

  4. Basic Principles of Design of Experiment

 

Design of experiments is the procedure of planning how to conduct an experiment, which method to use to collect the data, and how to analyze the data so as to reach a conclusion about the objective. Statistical methods are essential for drawing meaningful conclusions from the data, and they are the appropriate approach whenever the data are subject to experimental error. Hence any experimental problem has two aspects: the design of the experiment and the statistical analysis of the data. These two aspects are closely related, because the method of analysis depends on the design employed. To understand this relationship in more depth, one has to understand the basic principles of design of experiments. These are

 

(a) Randomization

(b) Replication

(c) Local control or blocking

 

Some statisticians also consider the factorial principle to be a fourth principle, but the three principles above are the ones accepted without ambiguity in the literature. They are explained in detail below:

 

  • (a) Randomization: It is considered the first step in applying statistical techniques to experimental design. Randomization means that both the allocation of the experimental material to the units and the order in which the individual runs of the experiment are performed are determined at random. This principle is essential for removing any bias that may arise in the allocation of the material to the units or in the repetitions of the experiment. Statistical methods require the observations to be independently distributed, and randomization usually makes this assumption valid.

More precisely, it is the errors in the observations that are required to be independently distributed. Randomization also serves to average out the effects of extraneous factors that may be present. For example, suppose there are differences in the quality of a product. These differences could be due to differences in the raw material or to differences between machine operators. If the raw materials are assigned to the operators at random, the operator effect is not systematically tied to any one raw material, so on average it balances out of the comparison.

 

Nowadays software is widely used to assist the experimenter in selecting and constructing experimental designs. A program can present the runs of the experiment in random order by using a random number generator. Sometimes it is difficult for the experimenter to assign the material to the units at random because of constraints such as location or differences in experience among the operators. Complete randomization is then impractical because it would increase time and cost. Statistics offers design methods that handle such restrictions on randomization while still giving valid results.
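
As a minimal sketch of how a random number generator can produce a randomized plan, the code below assigns raw materials to operators at random and then shuffles the run order. It assumes a completely randomized design with equal replication. The function names, the operator and material labels and the seeds are hypothetical and chosen only for illustration.

```python
import random


def randomize_assignment(treatments, units, seed=None):
    """Randomly assign treatments to units, each treatment equally often.

    Assumes len(units) is a multiple of len(treatments).
    """
    if len(units) % len(treatments) != 0:
        raise ValueError("number of units must be a multiple of the number of treatments")
    rng = random.Random(seed)  # a fixed seed makes the plan reproducible
    labels = list(treatments) * (len(units) // len(treatments))
    rng.shuffle(labels)
    return dict(zip(units, labels))


def randomize_run_order(units, seed=None):
    """Return the units in a random order for running the experiment."""
    rng = random.Random(seed)
    order = list(units)
    rng.shuffle(order)
    return order


# Hypothetical labels, for illustration only
operators = [f"operator_{i}" for i in range(1, 7)]
materials = ["material_A", "material_B", "material_C"]

assignment = randomize_assignment(materials, operators, seed=42)
run_order = randomize_run_order(operators, seed=7)

for operator in run_order:
    print(operator, "->", assignment[operator])
```

Recording the seed alongside the results lets the same plan be regenerated later, which is useful when the experiment has to be audited or repeated.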

  • (b) Replication: It means an independent repetition of the run for each factor combination. For example, suppose we have five treatments (types of seed) and five plots of land, and we want to compare the productivity of the crop across the types of seed. Assigning each seed to one plot at random gives only one observation per seed. To replicate the experiment, we repeat the whole assignment on further sets of plots, randomizing afresh each time, so that every type of seed is grown on more than one plot. Hence replication is the process of independently repeating the experiment more than once to gain a better understanding of the system.

 

Replication has two important properties. First, it allows the experimenter to estimate the experimental error. This estimate is the yardstick for deciding whether observed differences in the data are statistically significant. Second, it allows the parameters of the model to be estimated more precisely. For example, when the sample mean is used to estimate the population mean, its variance is σ²/n, where σ² is the variance of a single observation and n is the number of replicates, so more replicates give a more precise estimate.

 

The term repeated measurement is sometimes treated as a synonym of replication, but the two have different meanings. Repeated measurements are several measurements taken on the same unit within a single run, whereas in replication the whole run is repeated independently. Hence repeated measurements reflect only the variability within a run, while replication reflects the sources of variability both between runs and within runs.
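
The two roles of replication can be sketched in a few lines. The code below assumes that the replicate observations of one treatment are independent, estimates the experimental error by the sample standard deviation, and measures the precision of the treatment mean by its standard error. The yield values are hypothetical.

```python
import math
import statistics


def replicate_summary(observations):
    """Return (mean, standard deviation, standard error) of replicate observations.

    At least two replicates are needed; with a single observation the
    experimental error cannot be estimated at all.
    """
    n = len(observations)
    mean = statistics.mean(observations)
    sd = statistics.stdev(observations)  # estimate of the experimental error
    se = sd / math.sqrt(n)               # precision of the estimated mean
    return mean, sd, se


# Hypothetical yields from four replicate plots of one seed variety
yields = [4.0, 5.0, 6.0, 5.0]
mean, sd, se = replicate_summary(yields)
print(f"mean={mean:.2f}  sd={sd:.3f}  se={se:.3f}")
```

Because the standard error is the standard deviation divided by the square root of the number of replicates, quadrupling the number of replicates halves the standard error.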

 

  • (c) Blocking: It is the third principle of design of experiments and is used to improve the precision with which the factors of interest are compared. Blocking is used to reduce or remove the variability caused by nuisance factors, that is, factors that may influence the response variable but are not of direct interest. For example, in a field experiment comparing the yield of different varieties of seed, differences in yield may be due to differences in the fertility of the land rather than to the seeds, so fertility must be taken into account when comparing the varieties. A block is a set of relatively homogeneous units: a field of uniform fertility can be treated as one block, and another field with a different fertility as another block. The variability among the units within a block is expected to be smaller than the variability between blocks.
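
The seed example can be laid out as a randomized complete block design, sketched below under the assumption that every block has exactly as many plots as there are varieties. Each variety then appears once in every block, and the order of the varieties is randomized separately within each block. The variety and field names are hypothetical.

```python
import random


def randomized_block_design(treatments, blocks, seed=None):
    """Return {block: [treatment on plot 1, plot 2, ...]}.

    Every treatment appears exactly once per block, in an order
    shuffled independently for each block.
    """
    rng = random.Random(seed)
    layout = {}
    for block in blocks:
        order = list(treatments)
        rng.shuffle(order)  # randomization is restricted to within the block
        layout[block] = order
    return layout


# Hypothetical seed varieties and fields of differing fertility
varieties = ["V1", "V2", "V3", "V4"]
fields = ["high_fertility", "medium_fertility", "low_fertility"]

layout = randomized_block_design(varieties, fields, seed=1)
for field, plots in layout.items():
    print(field, plots)
```

Since all varieties are compared inside each field, differences in fertility between fields affect every variety equally and do not distort the comparison.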

In this section, we discussed the three basic principles of design of experiments: randomization, replication and local control or blocking. In the next section, we discuss the guidelines that one should follow while planning an experimental design.

 

  5. Steps for the Construction of Experimental Design

 

Before constructing an experimental design, the experimenter must have a clear idea of what the objective is, how the data will be collected and how they will be analyzed. In this section, we discuss this in the steps given below:

 

  • (a) Recognize and state the problem. It is essential first to understand the problem and to make a clear statement of the objective. This may seem a simple task, but it is not: it often requires many brainstorming sessions to develop a clear objective for the experiment. It is often necessary to take input from everyone concerned with the system, such as engineers, marketing, management, customers and operators. Hence designing an experiment requires the collective effort of a team.

 

It is helpful to prepare a list of the specific questions and problems that the experiment is to address. A clear statement of the objective leads to a better understanding of the system and to the solution of the problem.

 

It is also essential to keep the overall objective in mind while performing an experiment. There are many reasons for conducting experiments, and each type of experiment generates its own list of questions. Some reasons for conducting an experiment are:

 

  • (i) To determine which factors have the most influence on the response variable. The process of finding the important factors is called factor screening.
  • (ii) After the important factors have been determined, to find the levels of those factors that give the optimal value of the response. For example, in an experiment on crop yield it may be determined that the quality of seeds, the fertility of the land and the fertilizers used are the main factors. The next stage is to find the combination of these that gives the best result in terms of increased profit or minimized cost.
  • (iii) After the levels of the factors have been determined, to verify whether the system still operates in the same manner as before or whether its structure has changed with time.
  • (iv) To see the impact of new resources on the system. For example, in manufacturing, experiments are conducted with new resources that may increase the quality as well as the reliability of the response. Successful experimentation of this type can increase the demand for the product and the overall profit.
  • (v) To test the robustness of the response variable, that is, to determine the conditions under which its value deviates from the target value.
  • (b) The next step is to select the response variable. The experimenter should be certain that this variable provides useful information about the system under study.
  • (c) After identifying the factors, the next important task is to classify them into design factors, held-constant factors and allowed-to-vary factors. Design factors are the factors selected for study in the experiment. Held-constant factors may have some effect on the response but are not of interest in the current experiment, so they are kept at a fixed level. Allowed-to-vary factors arise from inhomogeneity, for example when the experimental units or materials are not completely homogeneous. By using the principle of randomization, one can balance out the effect of this type of variability on the response variable.

    Some factors may have a large effect on the response variable even though they are not of interest in the experiment. These are called nuisance factors. Nuisance factors are classified as controllable (assignable) or uncontrollable (non-assignable). A controllable nuisance factor is one whose levels can be set by the experimenter, so it can be handled by blocking. If a nuisance factor is uncontrollable but can still be measured, its effect can be accounted for by the analysis of covariance.

  • (d) Once the factors have been classified, the next step is to choose the design according to the objective. This involves determining the sample size (number of replicates) and deciding whether or not blocking is needed. Many statistical software packages are available to help select the design and fit a model to the data. Given the number of factors and their levels, such software can recommend a particular design. It can also compare models on the basis of the coefficient of determination and provide diagnostic information about the performance of each model. This helps the experimenter to determine a suitable model for the objective. One can start with a very simple model and add factors or variables as the objective requires until a satisfactory model is reached.

     

  • (e) The most important phase of a designed experiment is the running of the experiment. It is important to monitor the process so that everything is performed according to plan. If it is not, the consequences can be serious, such as wastage of material, time and human resources. It is therefore advisable to perform a pilot study or a few trial runs before implementing the experiment on a large scale. These runs help to check the consistency of the experimental design. If the design is not working properly, all the previous stages should be rechecked and the pilot study repeated.

     

  • (f) After running the experiment, the next step is to collect the data and apply statistical methods to them. Many software packages are available to assist in the analysis. Descriptive statistics play an important role at the initial stage of data analysis, but most studies require methods of inferential statistics such as estimation of model coefficients, hypothesis testing and confidence intervals. Residual analysis and model adequacy checking are also important analysis techniques.
  • (g) After analyzing the data with appropriate statistical methods, the next stage is to draw appropriate conclusions from the results. One can also test the validity of the conclusions by performing confirmation testing.
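
The hypothesis-testing part of step (f) can be sketched with a one-way analysis of variance, one common way of comparing treatment means. This is an illustration under stated assumptions, not the only possible analysis: it assumes a completely randomized design with independent, normally distributed errors of equal variance. The function name and the data are hypothetical.

```python
def one_way_anova(groups):
    """Return (F statistic, df between, df within) for a one-way layout.

    `groups` is a list of lists, one list of responses per treatment.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variability between treatment means
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variability of the replicates within treatments (experimental error)
    ss_within = sum((y - m) ** 2
                    for g, m in zip(groups, group_means) for y in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical yields for three seed varieties, three replicates each
groups = [
    [10.0, 12.0, 11.0],
    [14.0, 15.0, 16.0],
    [20.0, 19.0, 21.0],
]
f_stat, df_between, df_within = one_way_anova(groups)
print(f_stat, df_between, df_within)
```

The F statistic is then compared with the F distribution on the stated degrees of freedom; a statistical package, for instance the `scipy.stats.f` distribution if SciPy is available, can supply the corresponding p-value.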

Hence we have seen the importance of experimentation and the steps needed to implement it. One has to work carefully from the initial stage of determining the factors through to the conclusion in order to meet the objective.

 

  6. Use of Statistical Techniques in Experimentation

 

The basic purpose of using statistical techniques in experimentation is to increase the efficiency of the experiments and to provide valid conclusions. The experimenter should keep the following points in mind while applying statistical techniques:

 

  • (a) Most importantly, experimenters should have strong knowledge of their own field. This knowledge helps in determining the factors that are most influential for the response variable and in designing the model.
  • (b) The experimenter should start with simple statistical techniques and keep the design as simple as possible. With a simple and reasonable design, the analysis is straightforward and the results are easy to interpret.
  • (c) Statistical results must be interpreted in the light of the practical situation. For example, if two qualities of seed give a statistically significant difference in crop yield, one should weigh that difference against other factors such as the extra cost of the new seed over the old one and the total benefit obtained from using it. If the resulting gain in profit is negligible, the difference is statistically significant but of no practical importance.
  • (d) To draw a valid conclusion about the system, one may have to repeat the experimentation many times. Each experiment yields some information, which can be used to obtain better results in further experiments.
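
Point (c) comes down to a short calculation, sketched below. It assumes that the yield gain of the new seed has already been found statistically significant, and it compares the value of that gain with the extra cost of the seed. All numbers, the function name and the threshold are hypothetical.

```python
def net_benefit(yield_gain, price_per_unit, extra_seed_cost):
    """Extra profit per unit area from switching to the new seed."""
    return yield_gain * price_per_unit - extra_seed_cost


# Hypothetical figures per hectare, for illustration only
gain = net_benefit(yield_gain=0.5, price_per_unit=200.0, extra_seed_cost=95.0)

# Smallest gain the grower would regard as worth the change (an assumption)
minimum_worthwhile_gain = 50.0
worth_switching = gain > minimum_worthwhile_gain
print(gain, worth_switching)
```

Here the extra revenue barely covers the extra seed cost, so the difference, although statistically significant by assumption, would not justify changing seed.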

In this section, we discussed the points that one should keep in mind before applying any statistical analysis to an experimental design.

 

 

  7. Summary

 

In this module, we first gave a basic introduction to the design of experiments. The three basic principles, randomization, replication and local control, were then discussed in depth. We also discussed the important steps that one should follow when planning the design of an experiment. Finally, we gave some guidance on applying statistical techniques in experimentation.

 

  8. Suggested Readings

 

  • Chakrabarti, M. C., Mathematics of Design and Analysis of Experiments, Asia Publishing House, 1970.
  • Cochran, W. G. and G. M. Cox, Experimental Designs, Wiley, 1992.
  • Das, M. N. and N. C. Giri, Design and Analysis of Experiments, New Age International Publishers, 1986.
  • Kempthorne, O., Design and Analysis of Experiments Vol I-II, Wiley, 2007.
  • Montgomery, D. C., Design and Analysis of Experiments, Wiley, 2004.
  • Raghavarao, D., Construction and Combinatorial Problems in Design of Experiments, Wiley, 1971.

Suggested Books for reading

 

  • Agresti, A. and B. Finlay, Statistical Methods for the Social Sciences, 3rd Edition, Prentice Hall, 1997.
  • Daniel, W. W. and C. L. Cross, Biostatistics: A Foundation for Analysis in the Health Sciences, 10th Edition, John Wiley & Sons, 2013.
  • Hogg, R. V., J. Mckean and A. Craig, Introduction to Mathematical Statistics, Macmillan Pub. Co. Inc., 1978.
  • Meyer, P. L., Introductory Probability and Statistical Applications, Oxford & IBH Pub, 1975.
  • Stephens, L. J., Schaum’s Series Outline: Beginning Statistics, 2nd Edition, McGraw Hill, 2006.
  • Triola, M. F., Elementary Statistics, 13th  Edition, Pearson, 2017.
  • Weiss, N. A., Introductory Statistics, 10th Edition, Pearson, 2017.