21 Introduction to Sampling

R. Saratha

1. Introduction:

The purpose of research is basically to explore relationships between variables, which may relate to human behavior, materials, actions, etc. For example, one may try to find out the relationship between intelligence and performance, and in such a case the researcher may define the research problem as “The effect of human intelligence on academic performance.” In a study of the deterioration of a material, the researcher may try to find out the influence of environmental factors on the extent of deterioration; here too the research is trying to find the relationship between two variables. In summary, the purpose of research is to establish the nature and extent of the relationship between variables. In establishing such a relationship, the sample used for the study is vital, and the selection procedure adopted by the researcher is of paramount importance. The results of the study can be generalized to the population effectively only when the right type of sample is used. Therefore, great attention should be given to sample selection. This chapter deals with the concepts of census, sampling methods, the sampling frame, advantages and limitations of sampling, sampling and non-sampling errors, etc.

2. Learning objectives:

At the end of the session you will be able to:

Understand ‘Census’, its features, advantages and limitations.
Comprehend the importance of sampling in research and know about its principles.
Explain what a sampling frame is.
Know about the various sampling and non-sampling errors.

3. CENSUS

A study of all items in the ‘population’ is known as a census. It might be assumed that in a census, since all items are covered, no element of chance is left and the highest accuracy is obtained. But in practice this may not be true. Even a minute bias in such an inquiry gets larger as the number of observations increases. Moreover, there is no way of checking the element of bias or its extent except through a resurvey. Besides, a census consumes a great deal of time, money and energy. Therefore, when the population is large, this method becomes difficult to adopt because of the resources involved. At times, this method is not feasible for ordinary researchers, and often only the Government is in a position to carry out a complete enumeration. The Government adopts it in cases such as the population census, conducted once every ten years. Further, it is not always possible to study every item in the population, and sometimes sufficiently accurate results can be obtained by studying only a part of the total population. In such cases a census survey has little utility.

A census can provide detailed information on all or most elements in the population, thereby enabling totals for rare population groups or small geographic areas. A census and a sample survey have many features in common, such as the use of a questionnaire to collect information, the need to process and edit the data, and the susceptibility to various sources of error.

The census method can be applied in a situation where separate data for every unit in the population must be collected so that separate action can be taken for each. For example, the preparation of the voters’ list for election purposes, income tax assessment, recruitment of personnel, etc. are some of the areas where the census method is adopted. This method can also be used where the population is composed of heterogeneous items, i.e. items with different characteristics.

3.1 Advantages of Census Method:

1. Concentrated study: the researcher can make an in-depth study of a problem and can gather a lot of information about the entire population.

2. Accuracy: census data are more likely to be highly accurate than data from other methods of data collection. The smaller the population, the higher the degree of accuracy.

3. Adapts to heterogeneous groups: since every unit is covered, the census method is well suited to a population which is heterogeneous in nature.

3.2 Disadvantages of census method:

1. Inconvenient: it is a highly inconvenient method because it involves a lot of manual effort.

2. Time and cost consuming: besides being inconvenient, the census method demands a lot of time and money to collect the data.

3. Not suitable for every study: the census method suits only a few circumstances, where the population is limited and the study does not cover a vast area.

4. SAMPLING

Sampling refers to the process of selecting a few representatives from the whole group of items in any field of inquiry. The whole group is referred to as the ‘population’ or ‘universe’. Hence sampling means choosing a few items of the population to serve as a representative group that can be used to estimate or predict unknown information about the population. Sampling becomes a prerequisite when the population is large and studying it in full would be impractical in terms of time, cost and energy.

E.g. Suppose you want to estimate the average age of the students in your class. There are two ways of doing this. The first method is to contact all students in the class, find out their ages, add them up and then divide this by the number of students (the procedure for calculating an average). The second method is to select a few students from the class (the small group of students selected is called the sample), ask them their ages, add them up and then divide by the number of students you have asked. From this you can make an estimate of the average age of the class.
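The two methods in this example can be sketched in a few lines of Python. This is only an illustration: the class size of 40, the age range of 18 to 24, the sample size of 10 and the random seed are all assumptions made for the sketch, not figures from the lesson.

```python
import random
from statistics import mean

# Hypothetical data: ages of a class of 40 students (assumed values).
rng = random.Random(7)
class_ages = [rng.randint(18, 24) for _ in range(40)]

# Method 1: contact every student in the class (a census of the class).
census_mean = mean(class_ages)

# Method 2: ask only a few students chosen at random (a sample).
sample_ages = rng.sample(class_ages, 10)
sample_mean = mean(sample_ages)

print(f"Average age from all {len(class_ages)} students: {census_mean:.2f}")
print(f"Estimate from a sample of {len(sample_ages)} students: {sample_mean:.2f}")
```

The second figure is only an estimate of the first; how close it comes depends on how the sample was chosen and how large it is.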

4.1 PRINCIPLES OF SAMPLING:

There are two important principles on which the sampling theory works:

4.1.1 PRINCIPLE OF STATISTICAL REGULARITY

The principle of statistical regularity is derived from the theory of probability in mathematics. According to this principle, when a large number of items are selected at random from the universe, they are likely to possess the same characteristics as the entire population.

This principle assumes that the sample selection is random, i.e. every item has an equal chance of being selected. A sample selected randomly, and not deliberately, is believed to act as a true representative of the population. Thus, this principle is characterized by a large sample size and the random selection of a representative sample.

4.1.2 PRINCIPLE OF INERTIA OF LARGE NUMBERS

The principle of inertia of large numbers states that the larger the size of the sample, the more accurate the conclusion is likely to be. This principle is based on the notion that large numbers are more stable in their characteristics than small numbers, and that the variation in the aggregate of large numbers is insignificant. It does not mean that there is no variation in large numbers; there is, but it is less than in smaller numbers.
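The stability of large numbers can be seen in a small simulation. The sketch below is an illustration under stated assumptions: the population of 10,000 normally distributed values, the sample sizes 10, 100 and 1000, and the 200 repetitions are all hypothetical choices, and the helper name `spread_of_sample_means` is made up for this example.

```python
import random
from statistics import fmean, pstdev

rng = random.Random(42)

# Hypothetical population: 10,000 values with mean 50 and standard deviation 10.
population = [rng.gauss(50, 10) for _ in range(10_000)]

def spread_of_sample_means(n, repeats=200):
    """Draw `repeats` random samples of size n and return how much
    their means vary (standard deviation of the sample means)."""
    means = [fmean(rng.sample(population, n)) for _ in range(repeats)]
    return pstdev(means)

spreads = {n: spread_of_sample_means(n) for n in (10, 100, 1000)}

for n, spread in spreads.items():
    print(f"sample size {n:>4}: spread of sample means = {spread:.3f}")
```

The spread does not vanish for the largest samples, but it is clearly smaller than for the small ones, which is what the principle claims.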

4.2 SAMPLING METHOD:

A sample is expected to be a true representative of the population. The data gathered from the sample is expected to be applicable to the entire population, but such generalizations are often questioned by citing flaws in the sampling method. For example, suppose the researcher wants to select 200 girls from a group of 2000 for research on the nutritional value of a newly developed product, or to evaluate the usage of cloth of a particular material (cotton, synthetic, etc.); in this case the sample is 10% of the population. What is the guarantee that the 200 persons selected truly represent the characteristics of the 2000 persons? This may be possible if the sampling method applied ensures that every person in the population has a known, non-zero chance of being selected as one of the 200. Such samples are called probability samples, indicating that every person in the population has a chance of getting selected. The results obtained from probability samples are mostly objective in nature and their generalizability is fairly high. It is also possible that the 200 girls represent only a particular geographical area, or in some cases are volunteers, and therefore they will not be true representatives of the population. Such samples are called non-probability samples. The results emerging from non-probability samples are sometimes subjective and cannot be reliably generalized to the population. The different types of probability and non-probability samples are enumerated in the following sections of this chapter.
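As a minimal sketch of a probability sample for the example above, the code below draws 200 persons from a group of 2000 so that each person has the same chance of selection. The serial numbers 1 to 2000 and the random seed are assumptions made for illustration; in practice the identifiers would come from the actual list of the population.

```python
import random

rng = random.Random(2024)

# Hypothetical identifiers: serial numbers 1..2000 for the 2000 girls.
population_ids = list(range(1, 2001))

# Simple random sample without replacement: every person has the same
# chance (200 out of 2000) of being selected.
sample_ids = rng.sample(population_ids, 200)

sampling_fraction = len(sample_ids) / len(population_ids)
print(f"Selected {len(sample_ids)} of {len(population_ids)} "
      f"(sampling fraction {sampling_fraction:.0%})")
```

A sample made up of volunteers, or of girls from one locality only, could not be produced by this procedure, which is why it would count as a non-probability sample.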

SAMPLING METHODS

Probability Sampling is further classified as follows:

Simple random sampling
Systematic sampling
Stratified random sampling
Multistage sampling
Cluster sampling

Non-probability Sampling is further classified as follows:

Convenience sampling
Judgmental sampling
Quota sampling
Snowball sampling
Accidental sampling

4.3 Advantages of Sampling:

Every unit in the population has a chance of getting selected, as far as probability sampling is concerned.
As a probability sample is mostly a true representative of the population, the generalizability of the results is high.
Direct and reliable data can be acquired, as each and every unit in the sample is dealt with.
Accuracy of data is high.
Sampling involves less cost and is convenient.
Organizational problems are fewer when a sample of the population is taken for study.
Facilitates better rapport between the researcher and the respondents.
Intensive and exhaustive data can be collected.

4.4 Limitations:

1. Less suitable when the sample is drawn from a homogeneous group, as the generalizability factor gets affected.

E.g. suppose a researcher wants to study the attitude of students towards the curriculum and the students sampled belong to a school which admits only meritorious students. The group is then largely homogeneous, and the generalizability of their responses to the entire student population may be low.

2. Requires a lot of controls in sample selection, which costs time and resources. Also, ensuring a truly representative sample is difficult.

3. There are chances of bias.

4. Sampling is difficult in case the population is too small or too heterogeneous.

5. SAMPLING FRAME:

A sampling frame is the list of all the units of a population from which the sample is selected for the study. The definition of the population also determines the sampling frame. For example, if a researcher wants to study the papers published by the Home Science teachers of a University, then all the Home Science teachers together become the population for the study, each teacher is a sampling unit, and the list of those teachers is the sampling frame. Simply stated, the sampling frame lists all the individual units of the defined population. If the Chief Educational Officer wants to gather information on the availability of laboratories in secondary schools in a district, then every secondary school in that district is a sampling unit and the complete list of those secondary schools is the sampling frame. A sampling unit need not be a human being, as in a census of people; it can be any entity that is sampled for the research. Companies which carry out market research on the sale of products, for instance, may use a particular product as the sampling unit.

A good sampling frame will:

Include all the individuals that belong to the target population.
Exclude all the individuals that do not belong to the target population.
Include accurate information about each unit of the population.
List all the units in an order, each with a numerical identifier.
Avoid repetition of units and thereby the possible errors.

5.1 PROBLEMS WITH SAMPLING FRAME

Missing elements: Some members of the population are not included in the frame.
Foreign elements: The non-members of the population are included in the frame.
Duplicate entries: A member of the population is listed more than once and may therefore be surveyed more than once.
Groups or clusters: The frame lists clusters instead of individuals.
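The first three problems listed above can be checked mechanically when both the frame and the target population are available as lists of identifiers. The sketch below assumes exactly that; the identifiers and the helper name `check_frame` are hypothetical and only serve the illustration.

```python
from collections import Counter

# Hypothetical identifiers for the target population and for a flawed frame.
target_population = {"S01", "S02", "S03", "S04", "S05"}
frame = ["S01", "S02", "S02", "S04", "X99"]

def check_frame(frame, target_population):
    """Report missing elements, foreign elements and duplicate entries."""
    counts = Counter(frame)
    listed = set(frame)
    return {
        # Members of the population that the frame leaves out.
        "missing": sorted(target_population - listed),
        # Entries in the frame that are not members of the population.
        "foreign": sorted(listed - target_population),
        # Units that appear more than once in the frame.
        "duplicates": sorted(u for u, c in counts.items() if c > 1),
    }

report = check_frame(frame, target_population)
print(report)
```

In real studies the full target population is rarely known in this form, so such a check is usually made against an independent list or a partial recount.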

6. SAMPLING AND NON-SAMPLING ERRORS

In research, error may result from a flaw in the selection of the sample. Unless the sample is a true representation of the population, the results are bound to be suspect, and errors arising in this way are sampling errors. The researcher should minimize the sampling error, and therefore a probability sampling procedure is advisable.

Research may also suffer from non-sampling errors, which emerge from faulty methodology. The use of measurement tools, the forming of the right hypothesis, the use of the right statistical applications, the interpretation of data, etc., also determine the quality and outcomes of the research, and any error in these areas may make the research outcomes suspect. Some of these potential non-sampling errors are enumerated as follows:

6.1 Errors due to the measurement tools:

In research, the researcher develops a tool to collect data, or else an existing tool may be used. When the researcher develops the tool, whether it is a questionnaire, a rating scale, an inventory and so on, its reliability and validity should also be established in order to gather accurate data. Reliability refers to the stability (consistency) of the tool, whereas validity indicates whether it truly measures what it is meant to measure. Sometimes a test may be reliable but not valid, and therefore the researcher has to use appropriate methods to ensure high reliability and validity of the tools used to collect data.

This can be explained through a simple example. Suppose one wants to buy 5 meters of cloth and the shopkeeper measures it out 5 times in front of the customer using a measuring scale which the customer takes at face value to be one meter long. As the measurement is repeated five times using that scale, the customer assumes that the cloth is five meters long. In this exercise, the procedure of measurement is reliable because the task is repeated consistently 5 times. However, the customer later finds that the actual length is only 4.9 meters and, on further verification, finds that the actual length of the measuring scale is only 98 centimeters. This indicates that the tool is not a valid one, as it does not measure a true meter. Therefore, a good sampling technique combined with a poor tool may not bring out an objective outcome of the research, thus resulting in a non-sampling error.
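The arithmetic of the cloth example can be written out as follows. The figures (a scale assumed to be 100 cm, actually 98 cm, used 5 times) come from the example; the variable names are only illustrative, and lengths are kept in whole centimeters to avoid rounding issues.

```python
# Figures from the cloth example, in centimeters.
assumed_scale_cm = 100   # what the customer believes the scale measures
actual_scale_cm = 98     # what the scale really measures
repetitions = 5

# Every repetition lays out the same length, so the procedure is consistent.
measurements = [actual_scale_cm] * repetitions
is_reliable = len(set(measurements)) == 1

# The tool is valid only if it really measures one meter.
is_valid = actual_scale_cm == assumed_scale_cm

believed_length_cm = assumed_scale_cm * repetitions
actual_length_cm = sum(measurements)
shortfall_cm = believed_length_cm - actual_length_cm

print(f"Believed length: {believed_length_cm / 100} m")
print(f"Actual length:   {actual_length_cm / 100} m")
print(f"Shortfall:       {shortfall_cm} cm")
print(f"Reliable: {is_reliable}, valid: {is_valid}")
```

The shortfall is the same every time the purchase is repeated, which is why repetition alone (reliability) cannot reveal a tool that lacks validity.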

6.2 Errors due to the application of the wrong research design:

The broad classification of research designs may be enumerated as follows:

1. Descriptive studies – A research design which simply describes the phenomenon or explains the “what it is” of the research.

2. Causal Comparative Studies – This approach is used to explore the relationship between variables but is not able to establish a cause and effect relationship between them.

3. Correlation studies – This approach helps in measuring the magnitude of the relationship between variables.

4. Experimental research – This research tries to establish the cause and effect relationship between variables.

Even when the sampling is good, the researcher may not use the right design for the research, which may also bring faulty outcomes. Applying an experimental design when there are no proper controls, trying to establish a cause and effect relationship between variables through a causal-comparative study, etc., may contribute to non-sampling errors.

6.3 Errors due to wrong selection of variables:

Researchers sometimes commit mistakes in determining the variables of the research study. The following descriptions help in understanding different variables in order to know where the researcher can err.

Independent variable: A variable that can be manipulated as per the objective of the study.
Dependent variable: The variable that is normally treated as the criterion or effect in the research study.
Antecedent variables: Variables which are believed to have an effect on the results of the overall study, particularly in explaining the variance.
Extraneous variables: Variables other than the independent variables used in the study and any undetected antecedent variables, such as the test itself or the interest level in taking the test, which affect the generalization of the results.

In order to get the real impact of the independent variables on the criterion of the study, it is preferable to control the effect of the extraneous variables. Lack of such control may also result in non-sampling error of the study.

6.4 Errors due to application of inappropriate statistical procedures:

Some research studies use statistical applications, and the use of an inappropriate technique may result in errors in the study even though the sampling procedure is correct. Two broad procedures in statistics and their usage are described below:

6.4.1 Parametric statistical procedures

Parametric statistical procedures make the following assumptions:

The distributions compared are normal.
The variances of the distributions are equal.
The subjects of the distributions are independent of each other.

Analysis of Variance: Procedure used to compare the means when more than two groups are used in the study. Use of ANOVA reduces the error that accumulates when many pair-wise comparisons are made. ANOVA uses the F test.

Analysis of Covariance: Adjusting the post-test scores on the basis of one or more additional variables (covariates) and then comparing the difference between the experimental group and the control group.

Canonical Correlation: When the researcher is interested in computing correlation between a set of independent variables and a set of dependent variables, the canonical correlation technique is used.

Multiple Regression: The process of using more than one predictor in order to explain the variance in the criterion is called multiple regression.
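To make the F test used in analysis of variance more concrete, the sketch below computes the F ratio for a one-way design by hand: the variation between group means divided by the variation within groups. The three groups of scores and the function name `one_way_anova_f` are hypothetical, and the sketch stops at the F ratio; judging its significance would need the F distribution from a statistical package or table.

```python
from statistics import fmean

def one_way_anova_f(groups):
    """Return the F ratio (between-group mean square divided by
    within-group mean square) for a list of groups of scores."""
    all_scores = [x for group in groups for x in group]
    grand_mean = fmean(all_scores)
    k = len(groups)        # number of groups
    n = len(all_scores)    # total number of scores

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (fmean(g) - grand_mean) ** 2 for g in groups)
    # Variation of the scores around their own group mean.
    ss_within = sum((x - fmean(g)) ** 2 for g in groups for x in g)

    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n - k)
    return ms_between / ms_within

# Hypothetical scores for three groups.
scores = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]
f_ratio = one_way_anova_f(scores)
print(f"F = {f_ratio:.2f}")
```

A single F ratio compares all the groups at once, which is the sense in which ANOVA avoids running a separate test for every pair of groups.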

6.4.2 Non-parametric statistical procedures

Non-parametric statistical procedures are used when the distributions do not satisfy the assumptions required by parametric procedures.

The Chi-square and Mann-Whitney tests are some of the commonly used non-parametric statistical procedures; Biserial correlation and Point biserial correlation are often listed alongside them.

6.5 Errors due to inappropriate hypothesis:

A hypothesis in a study is nothing but an educated guess. A null hypothesis is non-directional and is usually used in exploratory studies. A researcher may use a directional hypothesis when there is better control over the variables; directional hypotheses work better in the case of confirmatory studies. Hypothesis testing involves the application of statistics too. Sometimes the researcher may reject the null hypothesis when it is in fact true, and this error is called a Type I error. A Type II error occurs when the researcher retains the null hypothesis when it is false. A good research study tries to control both types of error. Even though the researcher may adopt the right sampling procedure, error may result from the wrong application of hypotheses too.
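The two types of error can be illustrated with a small simulation. The sketch below makes several assumptions that are not part of the lesson: a two-sided z test at the 5% level with a known standard deviation of 1, samples of 30, a null hypothesis that the population mean is 0, and a true mean of 0.5 for the case where the null hypothesis is false.

```python
import random
from math import sqrt
from statistics import fmean

rng = random.Random(1)
CRITICAL_Z = 1.96  # two-sided test at the 5% level (assumed)

def rejects_null(true_mean, n=30):
    """Draw one sample and test H0: population mean = 0 (sigma known = 1)."""
    sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
    z = fmean(sample) * sqrt(n)
    return abs(z) > CRITICAL_Z

trials = 2000

# Type I error: the null hypothesis is true (mean 0) but is rejected.
type_1_rate = sum(rejects_null(0.0) for _ in range(trials)) / trials

# Type II error: the null hypothesis is false (mean 0.5) but is retained.
type_2_rate = sum(not rejects_null(0.5) for _ in range(trials)) / trials

print(f"Type I error rate:  {type_1_rate:.3f}")
print(f"Type II error rate: {type_2_rate:.3f}")
```

The Type I rate stays near the chosen significance level, while the Type II rate depends on the sample size and on how far the truth lies from the null hypothesis.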

Summary

In this lesson we have discussed the concepts of census, sampling methods, the sampling frame, advantages and limitations of sampling, probability and non-probability sampling, and sampling errors. We have described the errors that may occur in research due to other factors such as research design, tools for the study, analysis techniques, etc., and how care must be taken in the selection of the sample considering these factors. Sample selection is not an independent process in research but an inter-dependent aspect that makes the research study more effective. However, among all procedures used in research, selection of the sample is a very important feature, and we should try to reduce the sampling error to make the research more accurate. The time and energy spent on research will go to waste if the results of the study cannot be applied to the population, and therefore utmost care is necessary in sample selection. A review of related literature on sampling procedures will also help you to choose an appropriate sampling technique for your research. I also urge you to go through the research studies conducted in home science and other disciplines and see how the researchers have selected samples. The knowledge gained through this lesson will give you insights in reviewing the sampling techniques adopted by other researchers for various research designs.

Web links

http://psychology.ucdavis.edu/rainbow/html/fact_sample.html
http://www.analytictech.com/mb313/sampling.htm
http://zebu.uoregon.edu/1996/es202/l1.html
https://link.springer.com/chapter/10.1007/978-1-4612-9998-1_3
https://en.wikipedia.org/wiki/Sampling_(statistics)

Suggested References

C.R. Kothari (2004), Research Methodology: Methods & Techniques, second revised edition, New Delhi, India: New Age Publishing Company, pp. 55-67
Ranjit Kumar (2011), Research Methodology: A Step-by-Step Guide for Beginners, third edition, New Delhi, India: Sage Publications, pp. 175-189
Santhosh Gupta (2001), Research Methodology and Statistical Technique, New Delhi, India: Deep & Deep Publications, ISBN 81-7100-501-2
G.R. Basotia & K.K. Sharma (2002), Research Methodology, Jaipur, India: Mangal Deep Publications, ISBN 81-7594-090-5
P. Saravanavel (2007), Research Methodology, Allahabad, India: Kitab Mahal Publications, ISBN 81-2225-0010-2
R. Panneerselvam (2004), New Delhi, India: PHI Learning Private Limited, ISBN 978-81-203-2452-7
Walter R. Borg & Meredith D. Gall, Educational Research: An Introduction, fourth edition, New York & London: Longman Publications, ISBN 0-582-28246-2