What is Validity?

Assessing the validity of a test

Internal and External Validity

Face validity is simply whether the test appears, at face value, to measure what it claims to. Accordingly, tests whose purpose is unclear have low face validity (Nevo). A direct measurement of face validity is obtained by asking people to rate the validity of a test as it appears to them. Raters could use a Likert scale to assess face validity. It is important to select suitable people to rate a test. For example, individuals who actually take the test would be well placed to judge its face validity.

People who work with the test could also offer their opinion. Finally, the researcher could use members of the general public with an interest in the test. The face validity of a test can be considered a robust construct only if a reasonable level of agreement exists among raters.
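
The rating procedure described above can be sketched in a few lines of Python. This is only an illustration: the 1-5 scale, the example ratings, the function name `summarize_face_validity` and the "within one point of the median" agreement rule are all assumptions made for the sketch, not an established standard.

```python
from statistics import mean, median

def summarize_face_validity(ratings, tolerance=1):
    """Summarize 1-5 Likert ratings of how valid a test appears to raters."""
    centre = median(ratings)
    # Share of raters whose rating lies within `tolerance` points of the median
    # (an assumed, deliberately simple stand-in for "level of agreement").
    agreement = sum(abs(r - centre) <= tolerance for r in ratings) / len(ratings)
    return {"mean": mean(ratings), "median": centre, "agreement": agreement}

# Hypothetical ratings from eight test takers
# (1 = does not look valid at all, 5 = clearly looks valid).
ratings = [4, 5, 4, 4, 3, 5, 4, 2]
summary = summarize_face_validity(ratings)
print(summary)
```

A high mean together with a high agreement share would suggest that raters broadly see the test as measuring what it claims to. A formal study would more likely report an established inter-rater agreement statistic.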

It should be noted that the term face validity should be avoided when the rating is done by "experts", as content validity is more appropriate. Having face validity does not mean that a test really measures what the researcher intends to measure, only that in the judgment of raters it appears to do so. Consequently it is a crude and basic measure of validity. A test item such as 'I have recently thought of killing myself' has obvious face validity as an item measuring suicidal cognitions, and may be useful when measuring symptoms of depression.

However, one implication of items with clear face validity is that they are more vulnerable to social desirability bias. Individuals may manipulate their responses to deny or hide problems, or exaggerate behaviors to present a positive image of themselves.

It is possible for a test item to lack face validity but still have general validity and measure what it claims to measure. This can be an advantage because it reduces demand characteristics and makes it harder for respondents to manipulate their answers. For example, the test item 'I believe in the second coming of Christ' would lack face validity as a measure of depression, as the purpose of the item is unclear. Because most of the original normative sample of the MMPI were good Christians, only a depressed Christian would think Christ is not coming back.

Thus, for this particular religious sample the item does have general validity, but not face validity.

Construct validity was introduced by Cronbach and Meehl. This type of validity refers to the extent to which a test captures a specific theoretical construct or trait, and it overlaps with some of the other aspects of validity. Construct validity does not concern the simple, factual question of whether a test measures an attribute. To test for construct validity it must be demonstrated that the phenomenon being measured actually exists.

So, the construct validity of a test for intelligence, for example, is dependent on a model or theory of intelligence.

Construct validity entails demonstrating the power of such a construct to explain a network of research findings and to predict further relationships. The more evidence a researcher can demonstrate for a test's construct validity the better.

Research can be thought of as involving two realms. The first, on the top, is the land of theory. It is what goes on inside our heads as researchers. It is where we keep our theories about how the world operates. The second, on the bottom, is the land of observations. It is the real world into which we translate our ideas -- our programs, treatments, measures and observations. When we conduct research, we are continually flitting back and forth between these two realms, between what we think about the world and what is going on in it.

When we are investigating a cause-effect relationship, we have a theory (implicit or otherwise) of what the cause is (the cause construct). For instance, if we are testing a new educational program, we have an idea of what it would look like ideally.

Similarly, on the effect side, we have an idea of what we are ideally trying to affect and measure (the effect construct). But each of these, the cause and the effect, has to be translated into real things, into a program or treatment and a measure or observational method. We use the term operationalization to describe the act of translating a construct into its manifestation.

In effect, we take our idea and describe it as a series of operations or procedures. Now, instead of it only being an idea in our minds, it becomes a public entity that anyone can look at and examine for themselves. It is one thing, for instance, for you to say that you would like to measure self-esteem (a construct). But when you show a ten-item paper-and-pencil self-esteem measure that you developed for that purpose, others can look at it and understand more clearly what you intend by the term self-esteem.
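
To make "a series of operations or procedures" concrete, here is a minimal sketch of how a ten-item self-esteem measure might be scored. Everything specific in it is assumed for illustration: the 1-4 response scale, the choice of which items are negatively worded (`REVERSE_SCORED`), and the function name `score_self_esteem` do not come from any particular published instrument.

```python
# Hypothetical ten-item measure, each item rated from 1 (strongly disagree)
# to 4 (strongly agree). Which items are negatively worded is an assumption
# made purely for this sketch.
REVERSE_SCORED = {3, 5, 8, 9, 10}
SCALE_MIN, SCALE_MAX = 1, 4

def score_self_esteem(responses):
    """Turn ten item responses into one total score (higher = more self-esteem)."""
    if len(responses) != 10:
        raise ValueError("expected exactly ten item responses")
    total = 0
    for item_number, response in enumerate(responses, start=1):
        if not SCALE_MIN <= response <= SCALE_MAX:
            raise ValueError(f"item {item_number}: response out of range")
        if item_number in REVERSE_SCORED:
            # Flip negatively worded items so that a high value always
            # means higher self-esteem.
            response = SCALE_MIN + SCALE_MAX - response
        total += response
    return total

responses = [4, 3, 1, 4, 2, 3, 4, 1, 2, 1]
print(score_self_esteem(responses))
```

Because the procedure is written down explicitly, anyone can inspect it and question whether these operations really capture what is meant by self-esteem.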

Now, back to explaining the four validity types. They build on one another, with two of them (conclusion and internal) referring to the land of observation on the bottom of the figure, one of them (construct) emphasizing the linkages between the bottom and the top, and the last (external) being primarily concerned about the range of our theory on the top.

Assume that we took these two constructs, the cause construct (the WWW site) and the effect construct (understanding), and operationalized them -- turned them into realities by constructing the WWW site and a measure of knowledge of the course material. Here are the four validity types and the question each addresses.

Conclusion validity: In this study, is there a relationship between the two variables?

In the context of the example we're considering, the question is whether use of the WWW site is related to knowledge of the course material. There are several conclusions or inferences we might draw to answer such a question. We could, for example, conclude that there is a relationship. We might conclude that there is a positive relationship. We might infer that there is no relationship.
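
As a rough sketch of how such a relationship might be examined, the snippet below computes a Pearson correlation between hours spent on the WWW site and scores on the knowledge measure. The data are invented for illustration and the helper `pearson_r` is written out by hand. The coefficient only describes the direction and strength of a linear relationship in the sample, so a real analysis would also need a significance test and a check of its assumptions.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up data: hours spent on the WWW site and score on the knowledge measure.
site_hours = [0, 1, 2, 3, 4, 5, 6, 7]
knowledge = [52, 55, 61, 58, 66, 70, 69, 75]

r = pearson_r(site_hours, knowledge)
print(round(r, 2))
```

A value near +1 would be consistent with a positive relationship, a value near 0 with no linear relationship, and a value near -1 with a negative one.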

We can assess the conclusion validity of each of these conclusions or inferences.

Internal validity: Assuming that there is a relationship in this study, is the relationship a causal one?

Just because we find that use of the WWW site and knowledge are correlated, we can't necessarily assume that WWW site use causes the knowledge.
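
A small sketch with invented numbers shows how this can happen. Here a hypothetical third variable, whether a student has more resources, drives both site use and knowledge. The data are constructed so that site use and knowledge are strongly correlated overall yet unrelated within each resource group. The records and the helper names `pearson_r` and `correlation` are assumptions made for this illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up records: (has_more_resources, hours_on_site, knowledge_score).
students = [
    (False, 1, 50), (False, 2, 50), (False, 1, 54), (False, 2, 54),
    (True, 5, 70), (True, 6, 70), (True, 5, 74), (True, 6, 74),
]

def correlation(rows):
    """Correlation between site hours and knowledge score for the given rows."""
    hours = [h for _, h, _ in rows]
    scores = [s for _, _, s in rows]
    return pearson_r(hours, scores)

overall = correlation(students)
within_low = correlation([s for s in students if not s[0]])
within_high = correlation([s for s in students if s[0]])
print(round(overall, 2), within_low, within_high)
```

The overall correlation is high, but once students are compared only with others who have the same level of resources, it vanishes. In this constructed example the apparent link between site use and knowledge is entirely due to the shared factor.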

Both could, for example, be caused by the same factor. For instance, it may be that wealthier students, who have greater resources, would be more likely to have access to and use a WWW site and would also excel on objective tests. When we want to make a claim that our program or treatment caused the outcomes in our study, we can consider the internal validity of our causal claim.

Construct validity: Assuming that there is a causal relationship in this study, can we claim that the program reflected well our construct of the program and that our measure reflected well our idea of the construct of the measure?

In simpler terms, did we implement the program we intended to implement and did we measure the outcome we wanted to measure? In yet other terms, did we operationalize well the ideas of the cause and the effect? When our research is over, we would like to be able to conclude that we did a credible job of operationalizing our constructs -- we can assess the construct validity of this conclusion.

External validity: Assuming that there is a causal relationship in this study between the constructs of the cause and the effect, can we generalize this effect to other persons, places or times?

We are likely to make some claims that our research findings have implications for other groups and individuals in other settings and at other times.

Construct validity

Construct validity is the approximate truth of the conclusion that your operationalization accurately reflects its construct. All of the other validity terms address this general issue in different ways. I also make a distinction between two broad types: translation validity and criterion-related validity.

External validity is about generalization: to what extent can an effect found in research be generalized to other populations, settings, treatment variables, and measurement variables? External validity is usually split into two distinct types, population validity and ecological validity, and both are essential elements in judging the strength of an experimental design.

Face validity is the most basic type of validity, and it is associated with the highest level of subjectivity because it is not based on any scientific approach. In other words, a researcher may declare a test valid simply because it seems valid, without an in-depth scientific justification. In social research there are several types of validity.

Face validity ascertains that the measure appears to be assessing the intended construct under study. Stakeholders can easily assess face validity.