16 Issues of Validity and Reliability

C. Raghava Reddy

  1. Introduction

 

One of the most important concerns of social science researchers has been the issue of reliability and validity. It has been a challenge for social science researchers to claim that the findings of their studies are as scientific as those of the natural sciences. Social phenomena are complex, and any attempt at understanding them is challenging. Deciphering the underlying reasons for patterns of social relations is as difficult as understanding natural phenomena is for natural scientists. What is important to note here is that the natural sciences witnessed significant advancements in instrumentation, which helped them overcome the problems of reliability and validity. The social sciences, however, study social phenomena, which are dynamic, fluid, and difficult to predict, and have therefore been unable to develop instruments that allow researchers to make precise, accurate claims of knowledge. Knowledge claims in social science research are plausible explanations. Hence, a social science researcher faces the challenging task of arriving at reliable and valid data and conclusions.

  2. Reliability

When an instrument of data collection used by a researcher yields a particular set of data, another researcher should be able to derive similar data using the same instrument. Likewise, the same researcher should be able to derive similar data using the same instrument at another point of time. This is the notion of repeatability and consistency, which is closely associated with reliability. Such repeatability and consistency may be possible to a great extent with the instruments used in the natural sciences. In social science research, however, there are inherent limitations, and it is difficult to talk about reliability in the same sense as in the natural sciences. Nevertheless, over time there has been a steady advancement of tests that assess the reliability of instruments such as the questionnaire or interview schedule used in social research.

  3. Meaning

Reliability in the general sense refers to consistency or repeatability. Consistency or repeatability of results depends on the instrument used in data collection, the methodology adopted in the study and the research design. ‘Reliability is the degree to which a variable or test yields the same results when administered to the same people, under the same circumstances’ (Weller 1998). A research instrument is considered reliable when the results obtained using it are consistent over time and space. A research study may be said to be reliable if its results can be reproduced using the same methodology.

 

Within the field of social science research two broad streams exist. One deals with numerical data and is concerned with quantification; the other follows the constructivist approach and primarily engages with textual data in the form of narratives, observations, etc. The issue of reliability and validity is equally important in both streams. However, the approaches to examining reliability and validity in the two streams are markedly different. The discussion of reliability and validity in the rest of the module proceeds along these two streams.

 

  4. Reliability in quantitative research

 

In quantitative research reliability of the instrument used in data collection is the most important concern. It is not just that the instrument should measure what it is supposed to measure but it should measure similarly across space and time. Thus, stability of the instrument for its repeatability and consistency becomes central. The degree to which the instrument, when repeated, produces similar results over a period of time reflects the extent of reliability of the instrument.

 

Self check exercise – 1

  1. What is the importance of reliability and validity in social research?

 

Social science research aims at understanding phenomena in a naturally occurring setting. It deals with the social actors who are part of the phenomena to be understood. These social actors are dynamic agents who create and recreate the social setting in their everyday life. Observing their activities in order to measure and explain them is a challenging task. Given these challenges and constraints, social science researchers have to make their data, findings and conclusions reliable and valid. As the aim of every research study is to contribute to the existing pool of knowledge, conclusions drawn from unreliable and invalid techniques or procedures are of no use. It is important to note that social phenomena can be studied using quantitative and qualitative research techniques. Quantitative research employs standardised instruments whose reliability and validity can be assessed with relative ease. The most challenging part is dealing with the issues of reliability and validity in qualitative research. By adhering to certain established practices, a qualitative researcher can enhance the consistency and validity of the data and the research findings.

  5. Types of reliability tests

Reliability of an instrument used in measuring the properties of objects can be tested. However, we can only estimate the reliability of an instrument; we cannot calculate it exactly. Different types of reliability tests have been developed to estimate the extent of reliability of the instrument used in data collection. (Please note that the most commonly used instruments of data collection in quantitative research are the questionnaire, the interview schedule and, sometimes, attitudinal scales. In our discussion, therefore, the term instrument refers to a questionnaire or an interview schedule.) Reliability can be estimated by the correlation between two sets of scores.

 

5.1         Test-retest Method

 

In the test-retest method, the instrument is administered twice, at two different points of time. Important points to remember are: a) the instrument (questionnaire, interview schedule, or scale) remains the same on both occasions, and b) the instrument is administered to the same set of respondents. If the scores from the two administrations are highly correlated, the instrument is said to be reliable. What becomes critical in this method is the time interval between the two tests. If the gap between the two tests is short, there is a problem of cueing, i.e. the respondents may remember the earlier test and answer in the same manner, leading to an inflated correlation. If the gap is too long, there is the problem of maturation, which means that respondents may change their opinion or their understanding of a question in the questionnaire or interview schedule as time progresses.
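The correlation step can be illustrated with a minimal sketch. It assumes that each respondent's answers have already been summed into a single total score per administration; the scores and the helper name `pearson_r` are hypothetical, and Pearson's coefficient is only one possible choice of correlation measure.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical total scores of the same six respondents on two occasions.
test_scores = [12, 15, 9, 20, 17, 11]
retest_scores = [13, 14, 10, 19, 18, 10]

test_retest_r = pearson_r(test_scores, retest_scores)
print(f"test-retest correlation: {test_retest_r:.2f}")
```

A value close to 1 suggests that respondents kept roughly the same relative position across the two administrations. How high the value must be before the instrument is treated as reliable is a judgement the researcher has to make and justify.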

 

5.2         Parallel Forms Method

 

As the name suggests, in this method two sets of the instrument are prepared. The two sets contain questions that carry the same meaning. The researcher has to generate multiple questions aimed at measuring the same variable. So, in the parallel forms reliability test, instead of generating questions for a single questionnaire, a sufficient number of questions is generated so that they can be divided into two questionnaires. The questions that address the same construct are randomly divided into two sets. Both sets are administered to the same set of respondents, and the correlation between the two sets of scores is estimated. Because the two questionnaires are equivalent measures, this test is called the parallel forms method. Compared with the test-retest method, it has the advantage of a smaller cueing effect. However, it is demanding, as the researcher has to develop multiple questions that are equivalent.
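The random division of questions and the correlation of the two forms can be sketched as follows. This is an illustration under stated assumptions: every respondent has answered all questions in the pool on a 1 to 5 scale, the response data are invented, and the helper names `split_into_forms` and `pearson_r` are hypothetical.

```python
import random
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def split_into_forms(n_items, seed=0):
    """Randomly divide item indices 0..n_items-1 into two equal-sized forms."""
    items = list(range(n_items))
    random.Random(seed).shuffle(items)  # seeded so the split is repeatable
    half = n_items // 2
    return sorted(items[:half]), sorted(items[half:])

# Hypothetical data: one row per respondent, one column per question,
# all eight questions intended to measure the same construct.
responses = [
    [4, 5, 4, 4, 5, 4, 5, 4],
    [2, 1, 2, 2, 1, 2, 2, 1],
    [3, 3, 4, 3, 3, 3, 4, 3],
    [5, 5, 5, 4, 5, 5, 4, 5],
    [1, 2, 1, 2, 1, 1, 2, 2],
]

form_a, form_b = split_into_forms(len(responses[0]))
scores_a = [sum(row[i] for i in form_a) for row in responses]
scores_b = [sum(row[i] for i in form_b) for row in responses]

parallel_forms_r = pearson_r(scores_a, scores_b)
print("form A items:", form_a, "form B items:", form_b)
print(f"parallel forms correlation: {parallel_forms_r:.2f}")
```

Fixing the seed only makes the illustration repeatable; in practice the point of the random division is that neither form is systematically easier or differently worded than the other.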

 

5.3         The Split-Halves Method

 

In this method, the instrument is administered once to the respondents. The responses are then tested for consistency by splitting the instrument (questionnaire) into two halves, each consisting of questions similar to those in the other half, and calculating the correlation between the two halves. This method differs from the earlier ones in that only one questionnaire is administered, at one point of time. It therefore avoids the problem of cueing and the burden of preparing two separate questionnaires. The difficult aspect of this method is splitting the questions into two equivalent halves without compromising the validity of the questions.
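A minimal sketch of the split-halves calculation is given below, assuming an odd/even split of the question positions and invented responses on a 1 to 5 scale. The Spearman-Brown step at the end is not mentioned in the text above; it is a commonly used adjustment, added here as an assumption, that estimates the reliability of the full-length instrument from the correlation between its two halves.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def spearman_brown(r_half):
    """Step up a half-test correlation to a full-length reliability estimate."""
    return 2 * r_half / (1 + r_half)

# Hypothetical data: one row per respondent, six questions on one construct.
responses = [
    [4, 5, 4, 4, 5, 4],
    [2, 1, 2, 2, 1, 2],
    [3, 3, 4, 3, 3, 3],
    [5, 5, 5, 4, 5, 5],
    [1, 2, 1, 2, 1, 1],
]

# Odd/even split: questions 1, 3, 5 form one half, questions 2, 4, 6 the other.
half_a = [sum(row[0::2]) for row in responses]
half_b = [sum(row[1::2]) for row in responses]

split_half_r = pearson_r(half_a, half_b)
full_length_estimate = spearman_brown(split_half_r)
print(f"correlation between halves: {split_half_r:.2f}")
print(f"Spearman-Brown estimate:    {full_length_estimate:.2f}")
```

The odd/even split is only one way of forming the halves. Whatever rule is used, the two halves should cover the concept equally well, which is the difficulty noted above.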

 

5.4         Internal consistency method

 

In this method, reliability is estimated by grouping questions in the questionnaire that measure the same concept. Instead of generating one question to measure the concept, the researcher has to develop two groups of questions, each consisting of three or more questions that measure the concept. The instrument is administered only once, and the responses to the questions in the two groups are correlated. This enables the researcher to estimate the reliability of the instrument by checking the consistency between the two groups of questions. It should be noted that the method is not limited to two groups: the researcher can generate as many questions as possible and group them so that the correlation between the groups can be calculated. It is also possible to examine the correlation between individual questions by calculating inter-item correlations. Internal consistency is also assessed using Cronbach’s Alpha, a statistic that measures the consistency between the items (questions) used in the questionnaire.
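Cronbach's Alpha can be computed from the item variances and the variance of the respondents' total scores, using the standard formula alpha = k/(k-1) × (1 − sum of item variances / variance of total scores), where k is the number of items. The sketch below applies that formula to invented responses; the function name `cronbach_alpha` and the data are hypothetical.

```python
from statistics import variance

def cronbach_alpha(rows):
    """Cronbach's alpha for a table with one row per respondent
    and one column per item (all items measuring the same concept)."""
    k = len(rows[0])
    if k < 2:
        raise ValueError("alpha needs at least two items")
    item_columns = list(zip(*rows))              # transpose: one tuple per item
    sum_item_variances = sum(variance(col) for col in item_columns)
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)

# Hypothetical responses of five respondents to four items (1 to 5 scale).
responses = [
    [4, 5, 4, 4],
    [2, 1, 2, 2],
    [3, 3, 4, 3],
    [5, 5, 5, 4],
    [1, 2, 1, 2],
]

alpha = cronbach_alpha(responses)
print(f"Cronbach's alpha: {alpha:.2f}")
```

When respondents answer all items in a similar way, the variance of the total scores is large relative to the item variances and alpha approaches 1. When the items do not move together, alpha falls towards 0.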

 

Self check exercise – 2

 

  1. What are reliability tests and why are they used in quantitative research?

 

Different types of reliability tests have been developed to estimate the extent of reliability of the instrument used in quantitative data collection. They are the test-retest method, the parallel forms method, the split-halves method and the internal consistency method. Reliability of an instrument used in measuring the properties of objects can be tested. The higher the reliability of an instrument, the greater the credibility of the research study. However, we can only estimate the reliability of an instrument; we cannot calculate it exactly.

 

  6. Reliability in qualitative research

In qualitative research, the findings of the study are based on direct or indirect observation of social phenomena that occur naturally, and inferences are drawn without relying on statistical procedures or other means of quantification. The emphasis in qualitative research is on understanding phenomena in an intensive manner. Golafshani (2003: 601) observes that ‘the terms reliability and validity are essential criterion for quality in quantitative paradigms, in qualitative paradigms the terms credibility, neutrality or confirmability, consistency or dependability and applicability or transferability are the essential criteria for quality’. Quantitative research is guided by the objectives of verifying causal relationships, prediction and generalization. Thus, the instruments used for data collection in the two approaches differ greatly. Quantitative researchers use instruments such as questionnaires or scales in order to measure a property and quantify it. Qualitative researchers, however, use techniques such as observation (participant or non-participant), ethnography, interview, etc. In fact, it is said that the researcher himself/herself is the instrument of data collection.

 

While the credibility in quantitative research depends on instrument construction, in qualitative research, “the researcher is the instrument” (Golafshani 2003: 600).

 

Some argue that the issue of reliability concerns quantitative research alone, as it is overwhelmingly dependent on instruments to measure the properties of objects. Since qualitative research does not lay emphasis on measurement, the argument goes, the issue of reliability is of no relevance (Stenbacka 2001). However, the issue of reliability in qualitative research is never ignored. In fact, reliability in qualitative research stands for consistency. Qualitative research is considered reliable if the research findings can be replicated by another researcher. Thus, qualitative researchers face a challenge of reliability of greater magnitude than quantitative researchers do.

 

Sjoberg and Nett (1992: 300) observe that ‘reliability is a function of the scientist’s theoretical system, the social order being studied, and the use to which the instrument is to be put’. Another notion, the ‘trustworthiness’ of a research report, is also discussed in the same sense as reliability in quantitative research. The researcher is expected to provide accurate observation notes or records. At the same time, the notes or records should not be oversimplified or misinterpreted. If multiple observers are engaged in the research, they must be trained to record the same observations in a similar manner.
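Whether multiple observers are in fact recording the same events in a similar manner can be checked numerically when their notes are coded into categories. The text does not name a statistic for this; the sketch below uses simple percent agreement and Cohen's kappa (agreement corrected for chance), which are assumptions introduced here for illustration. The category labels and codes are invented.

```python
from collections import Counter

def percent_agreement(codes_a, codes_b):
    """Share of events that both observers coded into the same category."""
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)

def cohen_kappa(codes_a, codes_b):
    """Cohen's kappa: observed agreement corrected for chance agreement."""
    n = len(codes_a)
    observed = percent_agreement(codes_a, codes_b)
    counts_a, counts_b = Counter(codes_a), Counter(codes_b)
    categories = set(counts_a) | set(counts_b)
    # Agreement expected if each observer coded independently of the other.
    expected = sum(counts_a[c] * counts_b[c] for c in categories) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical codes assigned by two observers to the same ten interactions.
observer_a = ["coop", "conflict", "coop", "neutral", "coop",
              "conflict", "neutral", "coop", "coop", "conflict"]
observer_b = ["coop", "conflict", "coop", "coop", "coop",
              "conflict", "neutral", "coop", "neutral", "conflict"]

agreement = percent_agreement(observer_a, observer_b)
kappa = cohen_kappa(observer_a, observer_b)
print(f"percent agreement: {agreement:.2f}")
print(f"Cohen's kappa:     {kappa:.2f}")
```

Low agreement would suggest that the observers need further training or that the coding categories are ambiguous. Such a figure supplements, and does not replace, the careful reading of the field notes themselves.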

 

Some important considerations for enhancing reliability in qualitative research are as follows. If more than one researcher is working on the project, it is imperative that all are trained to observe events, record them, and conduct interviews in an identical manner. Lewis (2009), discussing the issues concerning reliability of observations, suggests that the researcher must keep changing the time and place of observations. This technique is similar to the test-retest method used in quantitative research. The researcher can also go back to the same respondent at different points of time and seek again the information gathered earlier. This can establish the accuracy of the information provided by the respondent. The researcher can also increase the reliability of the research process by seeking information from respondents on the same question posed in different ways. If the responses are similar, the information may be considered reliable.

 

To overcome the problems of information gathering in participant observation, M N Srinivas (2009: 563) suggests that the researcher must focus on building rapport with the members of the community or group before embarking on data collection. Quoting Evans-Pritchard, he observes that ‘data collected in the first few weeks, that is, before the establishment of rapport, should be discarded as it is usually not very reliable. The fieldworker must make himself (/herself) liked and trusted by the people, for only when will they part with true information’. Srinivas (2009: 565) also urges researchers to know the people and their practices better.

 

The villagers were surprised at the range and depth of my ignorance regarding agriculture and rural life. The entire village took a hand in educating me and this included some boys and girls, and even the headman’s bonded labourers, who used to sleep in the verandahs opening out from my, and my cook’s rooms. As I got to know the villagers better, I learnt that they had valuable time-tested knowledge about agriculture, fertility of soil, weather patterns, flora and fauna. It is essential for all developers (in our context, researchers) to know this, for I am convinced that however well-intentioned they might be, their efforts are bound to fail if they are not willing to learn from the local people. One has to learn in order to be able to teach.

 

Regarding the question of the generalizability of fieldworkers’ observations, Srinivas (2009: 565) maintains that the obsession with generalizability is political or bureaucratic. In his view, intensive studies of villages (referring to the Indian context) are good enough to describe the pan-Indian pattern. Although the issue of unity as a pan-Indian feature is debatable, Srinivas’s argument in favour of intensive village studies using qualitative methods supports the idea that such studies are reliable.

 

Qualitative research marked by intensive fieldwork faces the problem of ensuring the reliability of its findings. Emerson (1981: 361), quoting Becker, suggests two considerations for assessing the reliability of field data. First, the presence of the observer should not constrain the actions of the observed. Second, the observations must be about interactions between members of the group rather than between the researcher and the researched. He also favours the argument that multiple observers enhance the reliability of field data. Replication or repeatability, the hallmark of reliability in quantitative research, is possible in qualitative research only in a loose way (ibid). He suggests that two researchers’ observations of the same setting may differ because of their theoretical and conceptual understanding of the phenomena or actions. He recommends ‘identifying explicitly the procedures, analytic assumptions, and interpretive devices used to collect, make sense of, and communicate field reports’ to make repeatability possible to some extent in fieldwork-based research (ibid: 362).

 

  7. Validity

 

Validity of the findings, the data collected, the instrument used in data collection and the research design is an important concern in social research. As with reliability, the issue of validity transcends methodological boundaries. In quantitative research validity refers to the ability of the instrument to measure what it is supposed to measure, whereas in qualitative research the issue of validity goes beyond the data and extends to the research design adopted, the techniques used in data collection (for example, observation, ethnography, interviews and narratives) and the findings discussed in the research study.

 

  8. Validity in quantitative research

 

Numerical data obtained using instruments is the subject of scrutiny in quantitative research. This is because the instrument used to measure a particular concept or construct must measure what it is devised for. If it does not, the data obtained using the instrument becomes irrelevant or inappropriate. For example, if an instrument meant to measure empowerment in fact measures development, it may be considered not valid, because empowerment and development are two different concepts.

 

Following are some of the tests developed to check the validity of instruments used in quantitative research.

 

  9. Types of validity tests

 

9.1         Face validity

 

The most commonly used validity test is the face validity test. The instrument (for example, a questionnaire or a scale) is accepted as valid if it appears valid to the researcher. Here the researcher, as a professional, makes a judgement about the validity of the instrument. It is a casual review of the questions or items incorporated in the instrument. Sometimes, face validity is assessed by individuals who may not have any professional training or formal knowledge. It is the simplest and easiest method of checking the validity of a scale or a questionnaire.

 

9.2         Content validity

 

Instruments such as scales are developed to make predictions. For example, the entrance test conducted to select candidates for admission into the IITs is an instrument used to make predictions about the academic ability of the candidates. The items (questions) incorporated in the instrument must reflect the larger goal of the instrument. Hence, in content validity, the items are subjected to review by those who are formally trained and have expertise in the subject under study. Usually, individuals with considerable domain knowledge are asked to review whether the items measure the intended property. Their consensus opinion is considered in finalizing the instrument. This type of validity test is widely used by researchers.

 

9.3         Criterion validity

 

Criterion validity is assessed by checking the instrument against the criteria set in the study. It comprises two types of tests: the concurrent validity test and the predictive validity test. The concurrent validity test measures the extent to which the items of the instrument correlate with the available ‘gold standard’. Generally, standardized, established instruments are used as references to check the validity of the instrument being tested. The predictive validity test measures the extent to which the instrument predicts the expected future observation. For example, an instrument developed to measure IQ must help in predicting the IQ levels of the respondents.

 

9.4         Construct validity

 

‘Construct validity involves relating one’s measuring instrument to the overall theoretical structure in order to determine whether the instrument is logically tied to the concepts and theoretical assumptions that are employed’ (Sjoberg and Nett 1992: 303). Thus, this test refers to the theoretical assumptions and the way concepts are operationalised in the research process. The items (questions) placed in the scale or questionnaire reflect the definition adopted for a concept and the theoretical standpoint of the research study. For example, the concept of family may be defined differently by a functionalist scholar and a feminist scholar. It may be said that construct validity is closely linked to the theoretical assumptions of the study. Thus, we find that construct validity is conducted to test the concepts and their relationships with the empirical reality. This is done at different levels. At one level, the causal relationship between the concept and the questions used to measure the concept is tested. At another level, the causal relationship between the theoretical definition of the concept and its operational definition is tested.

 

Self check exercise – 3

 

  1. What are validity tests and what is their relevance in quantitative research?

 

The different types of validity tests are face validity, content validity, criterion validity and construct validity. These are used to assess whether the instrument measures what it is intended to measure. If the instrument measures what is intended, the validity of the research is enhanced.

 

  10. Validity in qualitative research

 

Issues of validity in qualitative research are complex and varied. Thus, the concept of validity is understood differently by different scholars. A wide range of terms are used to define validity in qualitative studies. Validity is not a single, fixed or universal concept in qualitative research. Rather it is a contingent construct influenced by the research methodology, theoretical assumptions, and the research design of the particular study.

 

Validity in qualitative research is affected by factors related to the researcher. The validity of a study is contingent upon how observations are described, how they are interpreted, and whether the researcher manipulates data (knowingly or unknowingly) to fit theory. The most important factor that can influence validity is the researcher’s inherent bias. The other major issue is the presence of the researcher. As the researcher’s presence can affect the nature of interaction among members of the group being studied, the validity of a qualitative study becomes a critical concern.

 

Challenges to validity in qualitative research are manifold. Following are some of the potential sources of threats to validity (a large part of this discussion draws on Lewis’s 2009 work cited in the reference section).

 

10.1      Descriptive validity

 

This concerns the recording of observations by the researcher. It is often noted that researchers do not provide a detailed description of the observation setting. An accurate description, in appropriate words, of the site of observation and the process of interaction is of great importance in enhancing the validity of the research.

 

10.2      Interpretation validity

 

A threat to interpretation validity arises when the researcher interprets actions or events from her/his own perspective without paying much attention to how the actors perceive them. To overcome the problem of wrong or invalid interpretations, the researcher must collect elaborate information.

 

10.3      Theory validity

 

The researcher enters the field site with a theoretical framework. In most cases it is found that researchers attempt to fit the data into the theory adopted for the study. In some cases researchers ignore data that does not fit the theory or that goes against their theoretical convictions. Researchers are advised to record and collect data without discarding any of it on theoretical grounds.

 

10.4      Researcher bias

 

This is the biggest threat in qualitative research. As mentioned earlier, in qualitative research when the researcher becomes the instrument of data collection, the potential for bias in recording the observation is enormous. Researcher’s personal factors (religious, economic, cultural, gender, etc.), theoretical assumptions, political affiliations, etc. influence the collection of data and interpretation of data.

 

10.5      Reactivity

 

As suggested, researcher’s presence in the field site sometimes affects the situation. Researchers, as outsiders, knowingly or unknowingly influence the site of observation. To overcome this problem, the researchers must be conscious of the influence of their presence.

  11. How to enhance validity

To enhance the validity, researchers must use certain checklists. ‘A validity checklist assists the researcher in establishing techniques that will be used to strengthen validity issues’ (Lewis 2009: 10). Some of the validity checks are discussed below.

 

11.1      Triangulation

 

The most important technique adopted by researchers in qualitative and quantitative research is triangulation. It involves the collection of data from multiple sources. Interviews with key informants and members of the groups observed must be supplemented by data from other sources such as non-group members and other informants. Secondary sources like reports, government documents, and earlier research studies may be used to supplement the information gathered first hand. This effort strengthens the validity of the research observations and findings. Similarly, data collected through an interview schedule or questionnaire may be supplemented by focus group discussion, observation or case study (for details, read Module RMS 7).

 

11.2      Negative cases, discrepant data, or disconfirming evidence

 

One technique recommended to strengthen validity is to focus on negative cases, discrepant data or disconfirming evidence. It is often observed that researchers tend to collect data that proves their theory or hypothesis. In the process they avoid negative cases or sources of data that they find inconvenient.

 

11.3      Bias or researcher reflexivity

 

Researcher bias is the most obvious threat to validity in qualitative research. Hence, the researcher must state her/his assumptions, beliefs, values, etc. in the study outcome. The researcher must also state how s/he identified these threats and the methods employed to overcome them.

 

11.4      Member checking

 

Member checking refers to the process of involving those who were the sources of data. The researcher’s recordings of observations, interpretations and conclusions are tested by sharing them with the people who were observed. The members of the group or community are shown these for their opinions, reactions, and suggestions. This exercise gives the researcher an opportunity to correct errors, misinterpretations, lacunae, etc. It also establishes the credibility of the research and strengthens its validity.

 

11.5      Prolonged engagement in the field

 

One of the means of overcoming bias or personal factors influencing the research process is to stay in the field site for a long time. A prolonged stay enhances the researcher’s ability to observe the setting as it unfolds naturally.

 

The researcher learns the norms, language, and habits of those being studied and can better predict and interpret the meaning of events. The researcher also can build trust that can lead to identifying different sources for information and who has access to certain information, both of which would enhance the research and the triangulation of data (Lewis 2009: 12).

 

11.6      Thick, rich description

 

Unlike a quantitative researcher, who is confined to reporting facts, a qualitative researcher has the responsibility of describing the research setting, the participants, etc. in detail. While providing the description, the researcher must make an effort to transport the reader to the research situation. The researcher should not merely describe the setting and the people involved but also describe their emotions, feelings, and experiences.

 

It may be said that establishing validity in qualitative research is challenging, but not impossible. By using a combination of the techniques discussed above, the researcher can enhance the validity of the research.

  12. Reliability and validity: relationship

 

Reliability and validity are related to each other. The relationship between the two is better understood with an example. We know that reliability refers to consistency while validity deals with measurement of the intended property. If a shopkeeper’s weighing machine shows 1 kg every time it is loaded with what is actually 950 grams of sugar, the machine is said to be reliable, because it weighs consistently in repeated operations. However, there is an error in measurement: every customer who buys 1 kg receives only 950 grams. Thus, what is important to note is that reliability tests consider the repeatability and consistency of a scale or an instrument. They do not tell us whether there is an error in measurement. In other words, reliability tests do not test whether the instrument is measuring what it is intended to measure.

 

To know whether the instrument (or a scale, research design, research findings, etc.) is measuring what it is intended to measure, we use validity tests. This can be explained using the same example. If the shopkeeper’s weighing machine weighs 1000 grams for every 1 kg, it is said to be valid. If it weighs 1000 grams for 1 kg of sugar every time (repeatedly), the shopkeeper’s weighing machine is said to be both valid and reliable. Likewise, if an instrument used in data collection measures what is intended in repeated usage, the instrument is said to be reliable and valid.
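The weighing machine example can be expressed numerically. In the sketch below, consistency (reliability) is summarised by the spread of repeated readings and accuracy (validity) by the average distance of the readings from the true weight. The readings, the 5 gram tolerance and the function name `describe_scale` are hypothetical choices made only for illustration.

```python
from statistics import mean, pstdev

TRUE_WEIGHT_G = 1000  # the packet really contains 1 kg of sugar

def describe_scale(readings, true_value=TRUE_WEIGHT_G, tolerance=5.0):
    """Summarise repeated readings of the same known weight.

    'consistent' stands in for reliability (readings agree with each other);
    'on_target' stands in for validity (readings agree with the true value).
    The tolerance is an arbitrary threshold assumed for this illustration.
    """
    spread = pstdev(readings)
    bias = mean(readings) - true_value
    return {
        "spread": spread,
        "bias": bias,
        "consistent": spread <= tolerance,
        "on_target": abs(bias) <= tolerance,
    }

# Hypothetical repeated readings of the same 1 kg packet on two machines.
faulty_machine = [950, 951, 949, 950, 950]      # steady, but 50 g off
sound_machine = [1000, 999, 1001, 1000, 1000]   # steady and correct

faulty = describe_scale(faulty_machine)
sound = describe_scale(sound_machine)
print("faulty machine:", faulty)
print("sound machine: ", sound)
```

The faulty machine is consistent but off target, which corresponds to an instrument that is reliable without being valid. The sound machine is both consistent and on target.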

  13. Summary

 

This module discussed the issues related to reliability and validity. The efforts of social science researchers, who investigate social phenomena, are aimed at achieving credibility for their findings. Quantitative researchers employ instruments like scales, questionnaires or interview schedules, which are standardized to a great extent. Thus, they derive greater credibility than their counterparts handling qualitative data. The reliability and validity of quantitative data, and of the instruments used in their collection, are verifiable through tests established through standard procedures. The reliability of quantitative data and instruments can be estimated. Qualitative researchers, on the other hand, face challenges of a different kind regarding the repeatability or consistency and the validity of data and instruments. As the researchers themselves are the instruments of data collection, problems associated with their personal as well as professional lives crop up in qualitative research. The researcher’s bias, the effect of the researcher on the setting, and theoretical assumptions influence the process and direction of research. These issues pose challenges to the reliability and validity of the data, interpretations and conclusions. However, techniques to overcome such challenges have been developed and are available to the researcher. It is the responsibility of the researcher to consider these suggestions and practise them during the course of research in order to enhance reliability and validity.

 

  14. References
  • Golafshani, Nahid. 2003. ‘Understanding Reliability and Validity in Qualitative Research’, The Qualitative Report, 8 (4), pp. 597-607, http://www.nova.edu/ssss/QR/QR8-4/golafshani.pdf
  • Srinivas, M.N. 2009. The Oxford India Srinivas, New Delhi: Oxford University Press
  • Stenbacka, C. 2001. ‘Qualitative research requires quality concepts of its own’, Management Decision, 39 (7), pp. 551-555
  • Sjoberg, Gideon and Roger Nett. 1992. A Methodology for Social Research, New Delhi: Rawat Publications
  • Lewis, John. 2009. ‘Redefining Qualitative Methods: Believability in the Fifth Moment’, International Journal of Qualitative Methods, 8 (2)
  • Emerson, Robert M. 1981. ‘Observational field work’, Annual Review of Sociology, 7, pp. 351-378