Introduction
Quantitative nursing research often relies on the measurement of phenomena. Some studies measure objective data such as height, weight and heart rate. In others, researchers may have to develop bespoke measurement tools or scales that seek to quantify more abstract phenomena. For example, nurse researchers have previously developed tools that measure health professionals’ capacity for compassion1 or the stigma of COVID-19 infection in healthcare workers.2 In these cases, the ideas at the heart of the studies—compassion and stigma—are constructs: terms we use to describe concepts within broader theories.
When researching constructs such as compassion or stigma, any measurement tools we use need to demonstrate high levels of validity—that is, they need to measure what they are designed to measure. A failure to ensure—or insufficient evidence of—the validity of measurement tools is a common source of bias in published research.3
The more subjective and abstract a construct, the more challenging it is to ensure the validity of the tools and scales used to measure it. For example, in the studies cited above, one of the challenges the researchers faced was ensuring that the tools they designed to collect data were valid measures of compassion and stigma.
There are many different types of validity that can be considered by researchers (eg, face validity, content validity and criterion validity). However, this paper will explore one type of validity that is a crucial element of any study that uses measurement tools and scales: construct validity.
Construct validity history, development and definition
Heale and Twycross defined construct validity as ‘the extent to which a research instrument (or tool) measures the intended construct’.4 For some constructs, such as stigma and compassion, it may be impossible to determine all the properties of a concept completely, so Barten et al provide a more nuanced definition, suggesting that ‘construct validity is an estimate of the extent to which variance in the measure reflects variance in the underlying construct’.5 The concept has its roots in psychology: Lee Cronbach and Paul Meehl first introduced it in their 1955 article ‘Construct Validity in Psychological Tests’, prompting decades of debate that have progressively clarified and refined its definition.6
Enhancing construct validity
So, how can researchers enhance the construct validity of the measures they use within their studies? Clark and Watson provided a list of practical steps that researchers can take to support construct validity when developing measurement instruments6 (figure 1).
The first of these requires researchers to define, understand and consider all aspects of the construct under examination. This understanding can be developed through approaches such as literature reviews, primary qualitative research and the use of concept maps.
For example, a researcher developing a tool to measure preoperative anxiety would need a clear, evidence-based insight into what the construct is and how it may present. This would include consideration of what patients may say about anxiety, as well as the physical, behavioural and other manifestations of the construct. It is also important to understand that the manifestations of constructs may vary depending on the context; for example, anxiety may manifest differently across people from different cultures.7 The complexity inherent in some constructs underscores the importance of closely examining every facet of designing and developing a research instrument.
Having this clear understanding of the construct then allows researchers to generate items that encompass the construct in its entirety. By including items that align closely with the construct and capture its different elements, they can enhance their tool’s validity. At this point, researchers can also use other approaches to enhance validity, such as gathering insight from experts in the area of the construct. Once items are produced, researchers can test them, allowing the scale to be refined ahead of its use for data collection.
Testing construct validity
One approach to testing is the application of statistical tests that consider two subelements of construct validity—convergent and discriminant validity.
Convergent validity represents the association between the new measurement tool and other established assessments that measure the same or connected constructs. The strength of this association can be assessed through a correlation coefficient identified through statistical analysis, such as Pearson’s r. This value lies between −1 and 1: 1 represents a perfect positive correlation (ie, as one variable increases or decreases, the associated variable moves in the same direction), 0 represents no correlation at all and −1 represents a perfect negative correlation (as one variable increases, the associated variable decreases, and vice versa).
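To illustrate how such a coefficient might be obtained in practice, the short sketch below uses Python with simulated data. The scale names, participant scores and the scipy-based calculation are illustrative assumptions only and are not drawn from any of the studies cited here.

```python
# Illustrative sketch: estimating convergent validity with Pearson's r.
# The scores below are simulated; in practice they would be participants'
# totals on a newly developed scale and an established comparator scale.
import numpy as np
from scipy.stats import pearsonr

rng = np.random.default_rng(seed=42)

# Simulated totals on a hypothetical established measure for 100 participants
established_scale = rng.normal(loc=50, scale=10, size=100)

# Simulated totals on a hypothetical new scale, constructed so that they
# track the established measure plus some noise (ie, a related construct)
new_scale = 0.8 * established_scale + rng.normal(loc=0, scale=6, size=100)

r, p_value = pearsonr(new_scale, established_scale)
print(f"Pearson's r = {r:.2f} (p = {p_value:.3f})")
# An r close to +1 would support convergent validity; an r close to 0
# would suggest the two instruments are not measuring related constructs.
```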
For example, Hara et al developed and tested a tool to evaluate the beliefs, principles and standards that guide nurses’ work (the Nurses’ Work Values Scale (NWVS)). To test convergent validity, the authors calculated correlation coefficients between the NWVS and two other validated scales that explored similar concepts.8
Discriminant validity takes the contrary view, assessing the extent to which the new scale is related—or, ideally, not related—to indicators of different constructs.6 As a standalone measure, discriminant validity may be of limited value; however, viewed in partnership with convergent validity, it can provide a valuable indicator of construct validity.
Another set of statistical tests often reported in studies when testing construct validity is factor analysis. These tests allow researchers to explore the relationships between different items within a scale and identify what factors—‘hidden’ variables that influence findings—might explain these relationships. For example, in the study that developed the ‘Capacity for Compassion’ scale, factor analysis identified that the structure of the scale was composed of four factors: ‘motivation/commitment’, ‘presence’, ‘shared humanity’ and ‘self-compassion’.1
There are two main types of factor analysis often reported in studies. Exploratory factor analysis (EFA) is a statistical method often used early in a measurement tool’s development to identify the underlying structure of, and relationships between, variables. In this approach, the hidden factors that researchers aim to uncover are not based on any pre-existing theories; instead, EFA supports the generation of ideas and hypotheses about these factors. Dabbagh et al used EFA to identify the factors measured by the Home and Family Work Roles Questionnaire, allowing them to link questions from the tool to ‘caregiving roles’, ‘traditionally feminine roles’ and ‘traditionally masculine roles’.9
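As a rough illustration of what an EFA might look like in practice, the sketch below simulates responses to six hypothetical questionnaire items and fits a two-factor model using the open-source factor_analyzer package in Python. The item names, sample size and factor structure are invented for the example and are not taken from the cited studies.

```python
# Illustrative sketch of exploratory factor analysis (EFA) using the
# third-party factor_analyzer package; item names and data are invented.
import numpy as np
import pandas as pd
from factor_analyzer import FactorAnalyzer

rng = np.random.default_rng(seed=1)
n = 300

# Simulate two latent factors and six items that load on them (three each)
factor_a = rng.normal(size=n)   # eg, a hypothetical 'physical symptoms' factor
factor_b = rng.normal(size=n)   # eg, a hypothetical 'worry/cognition' factor
items = pd.DataFrame({
    f"item{i + 1}": loading * factor + rng.normal(scale=0.5, size=n)
    for i, (loading, factor) in enumerate(
        [(0.9, factor_a), (0.8, factor_a), (0.7, factor_a),
         (0.9, factor_b), (0.8, factor_b), (0.7, factor_b)]
    )
})

# Fit a two-factor EFA with varimax rotation and inspect the loadings
efa = FactorAnalyzer(n_factors=2, rotation="varimax")
efa.fit(items)
loadings = pd.DataFrame(efa.loadings_, index=items.columns,
                        columns=["Factor 1", "Factor 2"])
print(loadings.round(2))
# Items 1-3 should load mainly on one factor and items 4-6 on the other,
# suggesting the simulated questionnaire has a two-factor structure.
```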
Conversely, and as the name suggests, confirmatory factor analysis (CFA) is a method employed to test factors and relationships that have been previously identified or hypothesised. EFA and CFA can therefore be used together to help develop items and test their construct validity. For example, when developing their COVID-19 stigma scale, Al Houri et al carried out both EFA and CFA to help develop, test and refine an 11-item scale split into two subscales: ‘harmfulness and inferiority’ and ‘avoidance’.2
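For comparison, the sketch below shows one possible way of specifying a confirmatory model in Python using the open-source semopy package, which accepts lavaan-style model syntax. The simulated data, item names and two-factor specification are assumptions for illustration only and do not reproduce the scale developed by Al Houri et al.

```python
# Illustrative sketch of confirmatory factor analysis (CFA) with the
# third-party semopy package (lavaan-style syntax); the data and item
# names are simulated and do not reproduce any published scale.
import numpy as np
import pandas as pd
import semopy

rng = np.random.default_rng(seed=7)
n = 400

# Simulate responses to six items driven by two correlated latent factors
latent = rng.multivariate_normal([0, 0], [[1.0, 0.4], [0.4, 1.0]], size=n)
data = pd.DataFrame({
    "harm1": 0.8 * latent[:, 0] + rng.normal(scale=0.5, size=n),
    "harm2": 0.7 * latent[:, 0] + rng.normal(scale=0.5, size=n),
    "harm3": 0.6 * latent[:, 0] + rng.normal(scale=0.5, size=n),
    "avoid1": 0.8 * latent[:, 1] + rng.normal(scale=0.5, size=n),
    "avoid2": 0.7 * latent[:, 1] + rng.normal(scale=0.5, size=n),
    "avoid3": 0.6 * latent[:, 1] + rng.normal(scale=0.5, size=n),
})

# Hypothesised measurement model: each item loads only on its own factor
model_spec = """
harmfulness =~ harm1 + harm2 + harm3
avoidance =~ avoid1 + avoid2 + avoid3
"""

model = semopy.Model(model_spec)
model.fit(data)
print(model.inspect())              # estimated loadings and factor covariance
print(semopy.calc_stats(model).T)   # fit indices such as CFI and RMSEA
```

A well-fitting CFA of this kind would support the hypothesised factor structure; a poorly fitting one would prompt researchers to revisit the items or the underlying model.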
Conclusion
Construct validity is an important consideration when developing and appraising the soundness of measurement instruments in research. When reading nursing research that includes measurement scales, there should be evidence that study authors have enhanced construct validity by clearly defining the construct being studied, exploring all its nuances and characteristics, developing a scale comprising evidence-based items, and using statistical tests—such as factor analysis—to assess the level of validity. By taking these steps, researchers can enhance the validity of their work and, in turn, its value to the development of evidence-based nursing.
Ethics approval
Not applicable.
Footnotes
X @RasoulTabari, @barrett1972
Contributors RT-K: conceptualisation and design of the study, searching for documents, analysis and drafting the manuscript. DB: provided critical revisions and guidance throughout the study, and assisted in data interpretation and manuscript editing. RT-K is the guarantor of this manuscript.
Funding The authors have not declared a specific grant for this research from any funding agency in the public, commercial or not-for-profit sectors.
Competing interests None declared.
Provenance and peer review Not commissioned; internally peer reviewed.