17 Sampling

C. Raghava Reddy


 

1. Introduction

 

Social researchers draw inferences and make knowledge claims about social phenomena on the basis of data. The validity of such claims, inferences and explanations about social reality is, however, contingent upon the scientific rigour of the data. Generally, data collected through established research procedures are considered valid for arriving at explanations. Any critical analysis of a new knowledge claim, inference or explanation therefore subjects the data collection process to intense scrutiny for its methodological soundness. Any shortfall, compromise or deviation in the procedure undermines the credibility of the research findings. Thus, although research is a creative process, it follows established procedures. These procedures guide the research process from the formulation of the research problem to the final stage where conclusions are drawn.

 

One of the key stages of a research process is data collection. Data collection refers not merely to the means and instruments of data collection but also to issues related to the selection of respondents and of the field site (the geographical area from which data are collected). It is important to know not just from whom and what data are collected; how the respondents are identified or shortlisted is also critical. This is of utmost importance in social research for the simple reason that the selection of respondents should be free of subjectivity and bias. If a researcher selects respondents as s/he wishes, without adhering to a justifiable procedure, the findings of the study cannot be considered a contribution to scientific knowledge[1]. They will be a journalistic account, not a scientific account, of social reality. How some respondents are chosen from a large number is precisely what sampling is concerned with.

 

Sampling is an important procedure in social research which deals with the selection of elements or units from a large population. The researcher collects data from this set of elements or units. The aim of this module is to familiarize you with the

 

a. meaning and purpose of sampling,

b. advantages and limitations of sampling, and

c. relevant terms used in sampling.

 

Before we go into the details of sampling, let us first make clear the meanings and usage of terms such as population and elements. The term ‘elements’ refers to the units of study from which data are collected. The elements referred to in sampling could be individual respondents, households, villages, organizations, or even states in a country. The term ‘population’ refers to the collection of those units or elements[2] which possess similar characteristics. It consists of a number of elements, which may range from hundreds to lakhs. For example, the number of Kathak dancers in India is in hundreds, whereas the number of farmers in the country is in crores.

[1] However, this statement pertains more to survey research than to ethnographic or constructivist studies, where the researcher has the flexibility and liberty to choose the sample of study.

[2] Although in a general sense the terms elements and units are used interchangeably, there is a subtle difference between the two. An element is a single member of the population, whereas a unit could be a single element or a set of elements considered for selection in sampling. When a sample is selected at a single stage, the units of the sample are nothing but elements. However, in multi-stage sampling, where sampling takes place at different levels (for example, states at the first level, cities at the next, wards at the third and households at the final level), the units of sampling up to the ward level are called units, whereas at the final, household level they could be referred to as elements. In this module the two terms are used with the same meaning.

Population is also often referred to as universe. The difference between the two is that universe is a conceptual category while population is an operational category. The universe is the aggregate of all units possessing certain characteristics. A sample is supposed to represent the universe. For example, in a study on ‘farm crisis at small and marginal farmer level’, the conceptual category (the universe) is all the small and marginal farmers of the country. The operational category (the population) may be the small and marginal farmers located in a particular district or region. A universe can be either finite or infinite. A finite universe has an identifiable number of elements, for example, the voters of a country. In an infinite universe, the number of elements is indeterminable or the elements cannot be identified, for example, the persons contacted through email in a day[3]. It is similar to the stars in the sky or the leaves on a Neem tree, which are difficult to count. In a population, by contrast, the elements are identifiable and reachable. In most cases the term population refers to a specific geographic area. The sampling frame is prepared considering the population of the study.

[3] It may be possible to arrive at a figure; however, it is difficult to identify the individuals for the purpose of sampling.

2. Sampling

Sampling is the process of systematic selection of elements from a population of interest so that, by studying the sample, a researcher can fairly generalize the results to the population.

The size of a population ranges from a few individuals (for example, nuclear scientists in the country) to a very large number (for example, school-going children in the country). In the first example, it is relatively easy for a researcher to identify the population for the study, as the number of scientists specialised in nuclear science in the country is small. Given the resources and time, a researcher might sometimes collect data from the entire population. Operational, technical and material constraints of research may, however, demand that data be collected from a set of elements drawn from the population instead. If data are collected from all the elements of a population, they are referred to as census data. If data are collected from a few selected respondents, they are referred to as sample data. The important issue here is how the researcher arrives at generalizations or explanations about the population based on data collected from a sample.

Sampling involves the selection of some or all elements of a population with the intention of explaining the properties of the population. Sampling thus pertains to the selection of certain elements from a large or small population. The elements selected for systematic observation or data collection through various methods are referred to as the sample. A sample is a finite part of a population whose properties are studied to explain the whole. The size of the sample need not be in proportion to the size of the population.

 

Studying all the elements of a very large population is usually not possible for any researcher. Moreover, social research does not necessarily demand data collection from the entire population; in fact, a complete enumeration is often considered unnecessary. When it is not useful, is impractical, or is impossible to deal with all the elements in the population, sampling is undertaken.

 

Sampling is not entirely new or unknown to us. In our day-to-day life we practise sampling in various contexts. Take the example of buying rice or wheat from a grocery store. How do we judge the quality of the grains in a bag of rice? Do we check the entire bag? We just take a handful of grains from the bag and judge the quality of the rice in that bag. Here, the rice in the bag can be considered the population and the few grains we take from it the sample. Is sampling such a simple exercise? It is not that simple, because how do we know that the handful of grains represents the grains in the entire bag? If there are ten bags of rice, can a handful from one bag be used to determine the quality of the rice in all ten? The important question that emerges here is whether the sample we choose represents the entire population, because a sample is subjected to measurement on the assumption that it does. A ‘representative sample’ is one that represents the entire population.

What do we do with a sample study? From the data collected from the sample we arrive at an aggregate value called a statistic. Using the statistic we aim to estimate a population parameter. For example, the average age of students in a college may be calculated from a sample. The average we calculate is used to make observations about the age of students in the college as a whole. Such statements based on sample results are probability statements, because the average we arrive at is not a fact. If we collected data on age from all the students in the college, we would arrive at a factual figure. Thus, while the census value is a fact, the sample value is an estimate. We call the sample value (statistic) an estimate because it is not based on information from all the students in the college.

An estimate also raises the issue of precision: how close does the estimate take us to the population value? Closeness to the correct population value (the fact) is referred to as accuracy. Since the population value is seldom known to the researcher, the sample value gives us only probable accuracy, which is otherwise called precision.
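The difference between a census value (parameter) and a sample value (statistic) can be illustrated with a short Python sketch. The ages below are invented for illustration, and the sample of 50 students is an arbitrary choice; the point is only that the statistic is close to, but usually not identical with, the parameter.

```python
import random
from statistics import mean

random.seed(42)  # fixed seed so the illustration is repeatable

# Hypothetical population: ages of 1,000 college students (invented data).
population_ages = [random.randint(17, 25) for _ in range(1000)]

# Census value (parameter): uses every student, so it is a fact.
parameter = mean(population_ages)

# Sample value (statistic): uses only 50 randomly chosen students.
sample_ages = random.sample(population_ages, 50)
statistic = mean(sample_ages)

print(f"Parameter (census mean age): {parameter:.2f}")
print(f"Statistic (sample mean age): {statistic:.2f}")
print(f"Difference (unknown in practice): {statistic - parameter:+.2f}")
```

In an actual study only the statistic is available; the last line can be printed here only because the whole population was simulated.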

 

Self Check Exercise – 1

 

a.       What is sampling?

 

Sampling is the process of selecting some or all elements of a population with the intention of explaining the properties of the population. In research, a sample of elements is selected from a large number of units following standard procedures.

 

b.      What is the importance of following sampling procedures?

 

The findings of a research study are based on data collected from a finite (limited) set of elements. However, the conclusions are attributed to the whole population (universe). Hence, the elements included in the sample should represent the features of all the elements of the population. By adopting sampling procedures the researcher can offer justifiable explanations. It should be noted that when the simple random sampling technique is used with a complete and authentic sampling frame, the validity of the findings can be increased considerably. Nevertheless, the explanations offered in a sample study are probabilistic: they are the most plausible explanations, not certainties.

 

3. Normal distribution and its importance in sampling

 

In statistics, the use of a sample value to estimate a population value is supported by the ‘normal distribution’. Suppose we collect data on marks in a particular subject from a sample of 100 students drawn from a population of 1000 students and calculate the mean; the mean mark obtained from the sample is used to estimate the mean mark for the population. This value derived from the sample is known as a ‘statistic’[4]. To what extent is the mean mark calculated for 100 students an accurate estimate of the mean mark for 1000 students? Generally, the statistic we arrive at is close to the population value, i.e. the parameter[5]. Normally, the mean mark for the 100 students will be slightly higher or lower than the mean mark of the population. However, in many cases we do not know the mean value of the population, for example, the average income of farmers in the country. Hence, the statistic we arrive at from the sample is always referred to as an estimate.

[4] A statistic could be the mean, median, mode, or any other statistical measure.

[5] The population average is referred to as a parameter.

The next question that arises is how many samples should be drawn from a population to arrive at a statistic which is close to the parameter. Is it sufficient to draw one sample? Theoretically, any number of samples can be drawn from a population. In practice, however, a researcher does not draw several samples and instead confines the study to a single sample. Even if more than one sample is drawn, different samples would seldom produce the same estimate. However, the average of the different sample estimates is likely to be very close to the population average (this is explained with an example under sampling distribution). Imagine that we take several samples of the same size from a population and calculate the average of each, and then calculate the average of these averages. The average of averages is likely to be equivalent to the population mean. This is explained in statistics by the normal distribution. Let us look again at the example of students’ marks. Suppose we draw several samples of a particular size, for example 100, out of 1000 students, estimate the mean marks of each sample and plot the averages on a bar graph. We can find from Figure 1 that most of the averages converge on the same central value and fewer averages are located far above or below it. In other words, the bar graph produces a ‘bell curve’, which is an indication of normal distribution.

 

 

 

Figure 1. The Normal curve

 

The normal curve is a smooth, unimodal curve that is perfectly symmetrical. It is a bell-shaped curve in which the mean, median and mode coincide at the peak. One half of the curve is the mirror image of the other. The area under each half can be divided into smaller portions. The centre of the normal curve is taken as the zero point. Moving right or left from the centre, we can mark off equal intervals at +1, +2 and +3 and, similarly, at -1, -2 and -3. These points are known as standard deviate points.

What are these standard deviate points, and what is the percentage of area under the normal curve? The standard deviate points indicate the spread of scores around the average (the central point of the bell curve) in a single sample. If we calculate the average marks in a test for 100 students, we get one single value, the mean. In the normal curve, the mean is located at the centre, at zero standard deviate points. Obviously, many students will have scored higher or lower than the mean. How far these marks lie above or below the mean is indicated by standard deviate points. Statistically, using a particular score (say, a student’s marks) and the mean marks for the sample, we can say how many students scored between the mean and that score. This explanation goes beyond the single value of the mean, which by itself tells us only the average marks of the students. To explain how many students scored between, for example, a mean of 58 marks and 75 marks, we need to calculate the standard deviate units. This is calculated as the raw score minus the mean, divided by the standard deviation. The standard deviation for a given sample can be calculated from the scores, the mean and the number of cases: first subtract the mean from every test score, then square each difference, sum the squares, divide by the number of cases and finally take the square root (Eckhardt and Erman, 1977).
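The calculation described above can be sketched in Python as follows. The ten marks are invented so that their mean is 58, matching the example in the text; the final step, which converts the standard deviate into a share of students, rests on the assumption that the marks are normally distributed.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical marks of ten students (invented for illustration).
marks = [40, 45, 50, 55, 58, 60, 62, 65, 70, 75]

n = len(marks)
mean_mark = sum(marks) / n  # 58.0

# Standard deviation, following the steps in the text:
# subtract the mean, square, sum, divide by the number of cases, take the root.
squared_diffs = [(x - mean_mark) ** 2 for x in marks]
std_dev = sqrt(sum(squared_diffs) / n)

def standard_deviate(raw_score: float) -> float:
    """Raw score minus the mean, divided by the standard deviation."""
    return (raw_score - mean_mark) / std_dev

z_75 = standard_deviate(75)

# Share of cases expected between the mean and 75 marks,
# ASSUMING the marks follow a normal distribution.
share_between = NormalDist().cdf(z_75) - 0.5

print(f"Mean = {mean_mark}, standard deviation = {std_dev:.2f}")
print(f"Standard deviate for 75 marks = {z_75:.2f}")
print(f"Expected share between mean and 75 = {share_between:.1%}")
```

With only ten cases the normal assumption is a rough one; the sketch is meant to show the arithmetic, not to justify the assumption.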

 

Using the normal curve we can infer population values from sample statistics. If we repeatedly draw samples of a fixed size and calculate their means, the means cluster around the mean of the population. As stated earlier, the mean of the sample means will equal the mean of the population. It is often assumed here that the population itself has a normal distribution; however, the population does not always have one. According to the central limit theorem, the distribution of sample means approximates a normal curve when the sample size is reasonably large (a common rule of thumb is more than 30), even if the population is not normally distributed. Thus, sample size matters a lot in making statistically valid claims about the population. As the sample size increases, the scope for sampling error decreases. A low sampling error means less variability in the sampling distribution. Greater variation in the sampling distribution means a higher standard deviation of the sample estimates; the higher this standard deviation, the greater the sampling error. If we collect data from the entire population, we need not worry about sampling error, because we are dealing with all the cases in the population. Sampling error occurs only when we select some elements from the population as a sample. Statisticians suggest that to overcome the problem of sampling error, a) the sample size must be large and b) the sample should be drawn using random (probability) sampling procedures. This takes us to the principle of sampling.
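A small simulation can make this behaviour visible. The sketch below assumes a hypothetical population of 1,000 marks that is deliberately not normal, draws repeated random samples of different sizes, and reports how tightly the sample means cluster around the population mean. The population, the sample sizes and the number of draws are all illustrative choices.

```python
import random
from statistics import mean, pstdev

random.seed(7)  # fixed seed so the illustration is repeatable

# Hypothetical, deliberately NON-normal population: 1,000 marks spread
# evenly between 0 and 100 (invented data).
population = [random.uniform(0, 100) for _ in range(1000)]
population_mean = mean(population)

def sample_means(sample_size: int, draws: int = 2000) -> list:
    """Draw `draws` random samples of `sample_size` and return their means."""
    return [mean(random.sample(population, sample_size)) for _ in range(draws)]

for size in (5, 30, 100):
    means = sample_means(size)
    print(
        f"n={size:>3}  mean of sample means={mean(means):6.2f}  "
        f"spread of sample means={pstdev(means):5.2f}  "
        f"(population mean={population_mean:.2f})"
    )
```

The mean of the sample means should stay close to the population mean at every sample size, while the spread of the sample means should shrink as the sample size grows.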

 

Sampling is based on the principle that when a moderately large number of elements is chosen for a sample following random sampling procedures, the sample is likely to possess the characteristics of the population. Inherent in this principle are two conditions. 1) The sample should be drawn randomly. The principle of randomness states that every unit in the population has an equal, known, non-zero probability of being selected into the sample. Human involvement in the selection of sample elements is minimal in random sampling procedures. 2) The sample should be representative of the population. The principle of large numbers suggests that large numbers have greater stability, steadiness and consistency than smaller ones. It states that the larger the sample, the more evenly the units in it are likely to be spread. It rests on the idea that the movements of individual components of a population tend to cancel each other out: while some move in one direction, others move in the opposite direction. Therefore, if the sample is large enough, the average is likely to be representative of both tendencies.

 

The normal distribution explains the fact that the majority of the mean values we obtain from different samples are more or less the same, with small variation between them. Very high or very low values are few. That is why it is stated that the average of averages is equivalent to the population mean. The normal distribution is symmetrical about its mean, with a large number of small deviations from the mean and few large ones. The normal distribution has many day-to-day applications. For example, a body-mass index chart indicates the ideal weight range of a person based on her/his height. A large majority of people fall within the range of weight given in the chart; however, we also find people whose weight is either too high or too low.

 

Self Check Exercise – 2

 

c.       What is a normal curve?

 

In general, we use the term ‘normal’ to refer to a notion arrived at through observations over a period. For instance, we have expectations about others’ behaviour during social interaction. A person who deviates from the expected behavioural pattern is called abnormal. Similarly, when we see things which deviate from the expected range, we tend to consider them not normal. The underlying assumption is that there is something called ‘normal’ in our understanding of things or of people’s behaviour. Statisticians likewise use the normal curve to refer to a pattern in which the majority of cases (for example, in sample surveys) fall in the middle and few cases fall above or below the normal range. The midpoint of the curve is equal to the average, and the cases at the two extreme ends are considered deviations. In sampling, the normal curve is used to explain the extent of the spread of cases at different points along the curve.

 

 

5. Sampling distribution

 

Sampling distribution refers to the distribution of the mean values of different samples drawn from the same population around the mean of these means (the average of averages). It may be said that the mean of the means of all possible samples is equal to the true population mean. Consider the following example.

 

From a population of 4 students, samples of two students each were drawn. In all, six different samples of this size can be drawn from the population.

 

 

Population – students    Age in years
A                        15
B                        17
C                        18
D                        22

 

Based on the six samples of different combinations, we may draw the following table:

 

Samples            Respondents   Age in years   Mean age of sample
Sample 1           A & B         15 & 17        16.0
Sample 2           A & C         15 & 18        16.5
Sample 3           A & D         15 & 22        18.5
Sample 4           B & C         17 & 18        17.5
Sample 5           B & D         17 & 22        19.5
Sample 6           C & D         18 & 22        20.0
Total                                           108.0
Average of means                                18.0

 

Here, the mean age of the population is 18 years ((15 + 17 + 18 + 22)/4 = 18). When we calculate the mean of the mean ages, it is also 18 (108/6 = 18). This example shows that the average of averages is equal to the population average. Coming to the precision of the estimate, samples 3 and 4 come closest to the population mean, each differing from it by 0.5 years. By taking a number of samples and calculating the mean of means, one can arrive at a statistic which is closer to the population parameter. However, in practice a researcher does not draw several samples and instead tries to estimate the population parameter on the basis of only one sample.
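The example can be reproduced by listing every possible sample of two students and averaging the sample means. The following Python sketch uses the ages given in the example; the variable names are illustrative.

```python
from itertools import combinations
from statistics import mean

# Ages of the four students in the example above.
ages = {"A": 15, "B": 17, "C": 18, "D": 22}

population_mean = mean(ages.values())

# Every possible sample of two students, with its mean age.
sample_means = {
    pair: mean(ages[s] for s in pair)
    for pair in combinations(ages, 2)
}

for pair, m in sample_means.items():
    print(" & ".join(pair), m)

mean_of_means = mean(sample_means.values())
print("Population mean:", population_mean)
print("Mean of sample means:", mean_of_means)
```

Because all six possible samples are included, the mean of the sample means matches the population mean exactly; with only some of the samples it would merely be close.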

 

Reasonably large samples give us sampling distributions which approximate the normal distribution. If a particular distribution approximates the normal distribution, we can say that about 68.27 percent of the sample estimates will lie within one standard deviate point on either side of its mean, about 95.45 percent within two standard deviate points and about 99.73 percent within three standard deviate points.
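The share of cases lying within a given number of standard deviate points of the mean can be checked with Python's standard library. This is only a sketch; the helper name `share_within` is illustrative, and figures quoted in textbooks may differ slightly in the second decimal place because of rounding.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k: float) -> float:
    """Proportion of a normal distribution lying within k standard deviate
    points on either side of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within +/-{k}: {share_within(k) * 100:.2f} percent")
```

Note that these shares refer to the area on both sides of the mean; the area between the mean and a single standard deviate point on one side is half of the corresponding figure.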

 

6. Sample size

 

How big should the sample be? Statisticians suggest that a larger sample reduces the error in the estimate: the larger the sample size, the lower the sampling error. However, sample size is also determined by constraints such as budget, time and manpower. The size of the sample depends on the characteristics of the population. If the population is homogeneous, a sample of one element is sufficient (for example, a single drop of blood in a blood test). The size of the sample is also influenced by the precision of results desired by the researcher. If the researcher decides on an intensive examination of population properties, a large sample may be necessary. A researcher goes for a small sample when the errors associated with small-sample studies do not undermine the findings of the study. Finally, sample size depends upon the level of confidence the researcher chooses for the estimates. A higher level of confidence demands a larger sample size and vice versa. Of course, the size of the sample also varies with the method of data collection. Survey researchers place great emphasis on the calculation of sample size to make their claims valid. They adopt mathematical calculations to arrive at a number for the sample size (for details, see http://www.unc.edu/~rls/s151-2010/class23.pdf or http://en.wikipedia.org/wiki/Sample_size_determination). In survey research, the researcher has to answer certain questions: how accurate the result should be, the level of confidence at which the survey results are to be explained, and what is known or expected about the population mean. In survey research, sample size is calculated before data collection begins, using a formula (available in the two web sources above).
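The module does not state a specific formula, so the following Python sketch uses one widely taught approximation for estimating a proportion from a simple random sample of a large population, n = z² p (1 − p) / e². The function name, the default values and the choice of this particular formula are assumptions made for illustration; other formulas apply when the quantity of interest is a mean or when the sampling design is more complex.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(margin_of_error: float,
                               confidence: float = 0.95,
                               expected_proportion: float = 0.5) -> int:
    """Sample size n = z^2 * p * (1 - p) / e^2, rounded up.

    This is a widely used textbook approximation for estimating a
    proportion from a simple random sample of a large population; it is
    an assumption of this sketch, not a formula stated in the module.
    """
    # z is the standard deviate point for the chosen confidence level
    # (about 1.96 for 95 percent).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    p = expected_proportion
    return ceil(z ** 2 * p * (1 - p) / margin_of_error ** 2)

for level in (0.90, 0.95, 0.99):
    print(f"confidence={level:.2f}  n={sample_size_for_proportion(0.05, level)}")
```

The loop shows the point made in the text: for the same margin of error, a higher level of confidence demands a larger sample.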

 

As far as the size of the sample in qualitative research is concerned, there are no fixed rules to guide the selection. It depends on the objectives of the research project, the nature of the research questions, the resources at the researcher’s disposal, the researcher’s familiarity with the field, and time. We come across sociological studies of social phenomena conducted with great intensity on a small sample. In what is often referred to as purposeful sampling, the sample is judged on the purpose and rationale of the study rather than on statistical procedure. The validity of the insights of such research rests on the richness of the data collected, the analytical capability of the researcher and the theoretical rendering of the findings.

 

Self Check Exercise – 3

 

d.      What is the importance of sample size?

 

The size of the sample plays an important role in survey research. When we collect data from all the elements of a population, it is called a census survey; if we choose only some elements from the population, it is called a sample survey. The question that arises here is: how many elements should be selected into the sample? The size of the sample is related to, but need not be proportional to, the size of the population. There are statistical methods by which one can arrive at the sample size. In simple terms, the larger the sample size, the lower the sampling error; in other words, if the sample is large, the scope for error is less. Similarly, the smaller the population, the larger the sample has to be as a proportion of the population. This is why opinion polls and exit polls during elections rely on samples that are small relative to the electorate. In statistical terms, the required sample size does not keep increasing beyond a particular size of population.
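The levelling-off of sample size can be illustrated with the finite population correction, n = n0 / (1 + (n0 − 1) / N), a standard adjustment that is assumed here rather than taken from the module. The base size of 385 is a hypothetical figure of the kind obtained for a very large population at 95 percent confidence and a 5 percent margin of error.

```python
from math import ceil

def adjusted_sample_size(base_size: int, population_size: int) -> int:
    """Finite population correction: n = n0 / (1 + (n0 - 1) / N)."""
    return ceil(base_size / (1 + (base_size - 1) / population_size))

BASE_SIZE = 385  # hypothetical size for a very large population

for N in (500, 5_000, 50_000, 5_000_000):
    n = adjusted_sample_size(BASE_SIZE, N)
    print(f"N={N:>9,}  n={n:>3}  share of population={n / N:.2%}")
```

In this sketch a population of 500 needs a sample that is a large share of the population, while for populations in the lakhs and beyond the required sample hardly changes.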

 

 

7. Advantages of sampling

 

Sampling enables the researcher to collect data from a few elements or units of the population with less effort and expenditure. Collecting data from a few elements does not, however, prevent generalization of the findings to the population, as the National Sample Surveys and exit polls show. When the population is large and data are collected from the entire population, problems of inadequacy and incompleteness in the data occur (these are called non-sampling errors). By focusing on a sample, that is, on a few elements or units, data can be collected with more rigour and intensity. Sampling also reduces the burden of data analysis, as information is collected from only a few elements.

 

Sampling provides quicker results. As the time consumed by data collection and analysis is less, the findings of the research study can be arrived at in less time. This comparison is made in relation to a research exercise that attempts to collect data from a large or entire population. For example, the census collects data from all the households in the country, whereas the National Sample Surveys (NSS) focus on a few households in the country.

 

If the population is infinite, sampling is the only procedure possible. For example, in studies related to customers’ feedback or consumer behaviour or public opinion, the population is infinite.

 

8. Limitations of sampling

 

Sampling is not free from limitations. These limitations are important to consider when making generalisations about the population. The basic limitations are as follows:

 

The researcher is expected to possess adequate knowledge of sampling methods and procedures. Inaccurate sampling results in incorrect and often misleading generalisations.

 

It is not easy to ensure the representativeness of the sample. Therefore, sampling always carries some risk of sampling error.

 

9. Terms used in sampling

 

9.1. Population: The term population refers to a group of elements or units related to the problem of research. These elements or units can be individuals, households, organizations, villages, states, nations, etc. Identification of the relevant population is guided by factors like the geographical area of the study and the operational definition of the study. For example, in a study on university students in the country, all the students enrolled in different universities are considered the population. Sometimes the population can be enumerated, that is, its elements or units can be identified and listed. For example, the details of voters in a constituency are available in the form of a voter list. In certain cases, it is not possible to enumerate the elements. The population elements that the researcher can access, in terms of geographical accessibility, are called the ‘accessible population’. When the findings of a study based on a sample drawn from the accessible population are generalized beyond it, the wider population is referred to as the ‘theoretical population’. For example, persons with disabilities in a district form the accessible population, while persons with disabilities in the state or country form the theoretical population. Population size is denoted by the letter ‘N’.

 

9.2. Sample: A sample is a finite part of a population whose properties are studied to explain the whole. A sample consists of elements or units drawn from the population. The elements can be individual persons, households, organizations, villages, states, nations, etc. It is important to note that the elements in the sample must possess the properties of the population. A sample also indicates the target elements from which data for the research study are collected. Sample size is denoted by the letter ‘n’.

 

9.3. Sampling error: The error which arises because only a part of the total population, i.e. a sample, is studied is called sampling error. When a sample is drawn from the population, only this part of the population is subjected to data collection and measurement, on the assumption that the elements in the sample represent the entire population. From the sample the relevant statistic, for example the average age, is calculated, and this statistic is used as an estimate of the population parameter. However, owing to factors like natural variation among the elements or units of the population, incorrect sampling procedure, inadequate sample size and non-representativeness of the sample, the sample may not give a statistic that is equal to the population value. The degree of variation of sample values is measured by their standard deviation, which is known as the standard error of the statistic concerned. Such an error is referred to as sampling error. As the sample size increases, the magnitude of the error decreases: sample size and sampling error are inversely related.
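For a sample mean drawn by simple random sampling, the standard error is commonly estimated as the sample standard deviation divided by the square root of the sample size. That formula is an assumption of the sketch below rather than something stated in the module, and the ages are invented for illustration.

```python
from math import sqrt
from statistics import stdev

# Hypothetical ages of a sample of 16 students (invented data).
sample_ages = [18, 19, 19, 20, 20, 20, 21, 21, 21, 21, 22, 22, 22, 23, 23, 24]

def standard_error(sample_sd: float, sample_size: int) -> float:
    """Standard error of the mean: sample standard deviation / sqrt(n)."""
    return sample_sd / sqrt(sample_size)

sd = stdev(sample_ages)  # sample standard deviation (divides by n - 1)
print(f"n = {len(sample_ages)}, SE = {standard_error(sd, len(sample_ages)):.3f}")

# Holding the spread fixed, quadrupling the sample size halves the error.
for n in (16, 64, 256):
    print(f"n = {n:>3}, SE = {standard_error(sd, n):.3f}")
```

The second loop keeps the standard deviation fixed purely to isolate the effect of sample size; in a real study a larger sample would also give a slightly different standard deviation.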

 

9.4. Statistic: A statistic is the summary value of a variable calculated from a sample. The value may be the average (mean), median, mode or any other statistical value, for example, the average age of the students in a sample.

 

9.5. Parameter: A parameter is the summary value of a variable in the population that the researcher is trying to estimate. Again, the value may be the average (mean), median, mode or any other statistical value, for example, the average age of all the students in a school.

 

9.6. Estimate: An estimate is the value obtained by applying a method of estimation to a specific sample. For example, the mean age of the students in the sample (statistic) is an estimator of the mean age of students in a school. If the expected value of the estimator (its average over all possible samples) is equal to the population value (parameter), the estimator is called unbiased. If not, it is called biased. The difference between the expected value and the true population value is termed bias.
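Bias can be illustrated by computing the expected value of an estimator over every possible sample from a very small population. The sketch below assumes a hypothetical population of four students and compares two estimators: the sample mean as an estimator of the population mean, and the sample maximum as an estimator of the population maximum. The helper name `bias` is illustrative.

```python
from itertools import combinations
from statistics import mean

# Hypothetical population: ages of four students (illustrative data).
ages = [15, 17, 18, 22]
samples = list(combinations(ages, 2))  # all six samples of two students

def bias(estimator, parameter) -> float:
    """Expected value of the estimator over all samples, minus the parameter."""
    expected_value = mean(estimator(s) for s in samples)
    return expected_value - parameter

bias_of_mean = bias(mean, mean(ages))  # sample mean estimating population mean
bias_of_max = bias(max, max(ages))     # sample maximum estimating population maximum

print("Bias of sample mean:", bias_of_mean)
print("Bias of sample maximum:", round(bias_of_max, 2))
```

The sample mean has no bias here, whereas the sample maximum systematically underestimates the oldest age in the population, so it is a biased estimator.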

 

9.7. Sampling Frame: The sampling frame contains all the population elements or units. It is the list of population elements from which the sample is drawn. In some cases the sampling frame is available or can be procured directly from different sources. For example, the voter list of a polling booth is a sampling frame from which the researcher draws a sample of voters to be interviewed. In many cases, however, the researcher has to develop the sampling frame. For example, in a study on persons with disabilities in a district, the researcher is expected to develop the sampling frame from sources like the census, village records, etc., which give their number and spread in the district.
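Once a sampling frame is available, drawing a simple random sample from it is straightforward. The Python sketch below assumes a hypothetical voter list of 500 entries; the identifiers, the sample size of 25 and the function name are all illustrative.

```python
import random

# Hypothetical sampling frame: a voter list with 500 serial numbers.
sampling_frame = [f"Voter-{serial:03d}" for serial in range(1, 501)]

def draw_simple_random_sample(frame: list, sample_size: int, seed=None) -> list:
    """Select `sample_size` distinct elements, each with equal probability."""
    rng = random.Random(seed)  # seed only to make the draw reproducible
    return rng.sample(frame, sample_size)

respondents = draw_simple_random_sample(sampling_frame, 25, seed=2024)
print(respondents[:5])
```

Selection is left entirely to the random number generator, which keeps the researcher's own preferences out of the choice of respondents. The quality of the sample still depends on how complete and accurate the frame is.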

 

9.8. Sampling bias: Sampling bias is a systematic tendency to favour the selection of units with particular characteristics into the sample. Sampling bias is different from sampling error. Sampling error can be reduced by increasing the sample size, but sampling bias cannot be reduced in this way. Unlike sampling error, sampling bias generally cannot be measured from the sample itself. Sampling bias arises from non-adherence to random sampling procedures, omission of specific sub-groups within a population, an inaccurate sampling frame, problems with the questionnaire or interview schedule used to collect data, and non-response by a specific sub-group in the sample.
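
The difference between sampling error and sampling bias can be seen in a small simulation. The sketch below rests on assumptions that are not in the original module: a made-up population with two sub-groups that differ on the variable of interest, and a hypothetical `biased_frame` that omits one sub-group entirely. It compares the error of the sample mean when samples are drawn from the complete frame and from the biased frame, for increasing sample sizes.

```python
import random
import statistics

# Hypothetical population with two sub-groups (made-up income figures).
_rng = random.Random(7)
group_a = [_rng.gauss(100, 15) for _ in range(8000)]
group_b = [_rng.gauss(160, 15) for _ in range(2000)]
population = group_a + group_b
true_mean = statistics.fmean(population)

complete_frame = population   # lists every unit in the population
biased_frame = group_a        # omits sub-group B entirely

def estimation_error(frame, sample_size, seed=0):
    """Sample mean from the given frame minus the true population mean."""
    rng = random.Random(seed)
    return statistics.fmean(rng.sample(frame, sample_size)) - true_mean

for n in (50, 500, 5000):
    print(
        n,
        round(estimation_error(complete_frame, n), 2),
        round(estimation_error(biased_frame, n), 2),
    )
```

With the complete frame the error should shrink towards zero as the sample grows, whereas with the biased frame it should stay at about the same negative value however large the sample becomes.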

 

10. Universe of study

 

As discussed in the initial part of this module, a sample is drawn from a population located in a particular geographical setting. How to delimit the geographical area for a research study has been a major challenge for social science researchers. Gideon Sjoberg and Roger Nett (1992: 129) observe that,

 

‘most lay observers fail to appreciate many of the technical procedures involved in sampling, for some of these depart markedly from common-sense thinking. Perhaps, the greatest difficulty the scientist experiences in effectively utilizing the material collected by lay observers results from the failure of the latter to specify just how informants are chosen. For the more clearly the researcher envisions his universe and the more carefully he selects its component parts, the more likely is his research to be successful and the more readily can others verify his findings’.

 

These observations clearly bring out the significance of the selection of the universe and its units (elements) in any social research. Before we go into the criteria adopted in the selection of a universe, the terms working universe (also referred to as special universe) and general universe need to be clarified.

 

The ‘working universe’ is specific and amenable to empirical observation; it is the universe from which the researcher draws the units for the study. The working universe is equivalent to the population in survey research. The general universe, on the other hand, is the abstract universe to which the findings from the selected units will apply. For example, the working universe could be a particular community from which respondents are drawn. Findings from the data collected from the respondents are generalized to the community in general, which may range from a particular community located in a socio-cultural context to the global community. For example, women’s studies and disability studies, though they study women or persons with disabilities in a particular context, generalise the findings to women and persons with disabilities in general.

 

 

10.1. Selection of a working universe

 

Sjoberg and Nett (1992) suggest that the following factors influence the selection of a working universe:

 

i)     Theoretical commitment: The researcher’s original theoretical commitment determines the selection of the universe.

ii)    Availability of data: Generally, existing data on a subject attract new research studies. Because background data are available, researchers are able to raise new questions for empirical study.

iii)   Resources and convenience: The resources at the disposal of the researcher, such as time, money and manpower, influence the choice of a working universe.

 

Self Check Exercise – 4

 

e.       What is the relationship between sampling bias, sampling error and sample size?

 

Sampling bias occurs when some elements in the population are systematically more likely than others to be selected into the sample. Sampling error refers to the extent to which the statistic calculated from the sample fails to reflect the population value exactly. Sample size is the number of elements included in the sample. The relationship between sample size and sampling error is straightforward: the larger the sample size, the lower the sampling error. By increasing the sample size one can reduce the sampling error. Sampling bias, however, can occur irrespective of sample size. Whether the study uses a small sample or a large one, sampling bias can occur if the elements in the sample are not drawn through random sampling procedures. Sampling error can be calculated, but sampling bias generally cannot. Hence, researchers are advised to be careful in the selection of sampling units or elements.

 

11.  Summary

 

Systematic data collection is an essential part of any research study. Social science researchers deal with research problems that demand data collection from the field. Data collected from a set of elements chosen from a large population through scientifically established procedures form the core of scientific studies. Sampling is concerned with choosing these elements in a scientific manner. This module explains the statistical and logical basis of sampling, which has been shown to be scientifically valid. It examines the assumptions of sampling through a discussion of the normal distribution and the normal curve. The module also explains the different terms associated with sampling and provides insights into its statistical significance.

References
  • Argyrous, George. Statistics for Social Research. London: Macmillan Press Ltd. 1997.
  • Eckhardt, Kenneth W. and M. David Ermann. Social Research Methods: Perspective, Theory and Analysis. New York: Random House. 1977.
  • Moser, Claus and G. Kalton. Survey Methods in Social Investigation. New Delhi: Heinemann. 1976.
  • Sjoberg, Gideon and Roger Nett. A Methodology for Social Research. New Delhi: Rawat Publications. 1992.

 

Web sources

http://www.unc.edu/~rls/s151-2010/class23.pdf

http://en.wikipedia.org/wiki/Sample_size_determination

http://www.socialresearchmethods.net/kb/concimp.php