Wednesday, December 11, 2024

Applied #psychometrics 101: Strong programs of #constructvalidity: the #theory - #measurement framework with emphasis on #substantive & #structural validity - #WJIV #WJV #schoolpsychology #psychology

The validity of psychological tests “is an overall judgment of the degree to which empirical evidence and theoretical rationales support the adequacy and appropriateness of inferences and actions based on test scores or other modes of assessment” (Messick, 1995, p. 741).

The ability to draw valid inferences regarding theoretical constructs from observable or manifest measures (e.g., test or composite scores) depends on how well the underlying program of validity research attends to both the theoretical and measurement domains of the focal constructs (Benson, 1998; Benson & Hagtvet, 1996; Cronbach, 1971; Cronbach & Meehl, 1955; Loevinger, 1957; Messick, 1995; Nunnally, 1978).


The theoretical-measurement domain framework that has driven the revisions of the WJ test batteries, from the WJ-R to the forthcoming WJ V cognitive and achievement test batteries (Q1, 2025; COI disclosure: I am a coauthor of the current WJ IV and forthcoming WJ V), is represented in the figures below.


The goal of this post is to provide visual-graphic (Gv) images that, if studied carefully by the reader (and if I did a decent job), convey the basic concepts of the substantive component (and, to a more limited extent, the structural component) of a strong program of construct validity. In particular, the images illustrate the theoretical-measurement domain mapping framework used from the WJ-R to the forthcoming WJ V. The external stage of construct validity is not covered in this post. The goal is conceptual understanding…thus the absence of empirical data, etc.


For those who want written background information, the most succinct conceptual overview of a “strong program of construct validation” is Benson (1998; click to download and read).


Otherwise…sit back and enjoy the Gv presentation…where five images equal at least one chapter in a technical manual :).


Be sure to click on each image to enlarge (and make readable)


The figure below was first published in a book on the CHC theoretical (then known as Gf-Gc) interpretation of the Wechsler intelligence test batteries (Flanagan, McGrew, & Ortiz, 2000).









Yeah…I know. The following figure uses Gv as the sample cognitive ability construct domain and not Gf as in the prior figure. I first crafted the figure below in 2005 and don’t have the time (nor attentional control focus bandwidth) to make a new version. Consider the switch from the Gf domain (above) to Gv (below) as a test of your understanding of the material…and your ability to generalize what you have learned. And yes, I do see there is a spelling error (“on” for “one”)…but it is an old image file and I don’t have time to “clean it up” as noted above. The primary new feature is the concept of developing developmentally (difficulty) ordered sets of test items for the underlying ability trait scales of manifest indicator tests C and D, which fall under the CHC narrow ability domain of spatial scanning within the broad ability domain of Gv. This is where IRT (Rasch model) item scaling is involved.
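For readers who prefer a concrete handle on the Rasch idea mentioned above, here is a minimal sketch in Python. It assumes the standard dichotomous Rasch (one-parameter logistic) model, where the probability of passing an item depends only on the difference between person ability and item difficulty. The item names and difficulty values are made up for illustration; they are not WJ items or WJ calibration data, and this is not the WJ scaling procedure.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model.

    Both arguments are on the same logit scale.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# Hypothetical item difficulties in logits (illustrative values only).
item_difficulties = {
    "item_a": 1.2,
    "item_b": -0.8,
    "item_c": 0.3,
    "item_d": -2.0,
}

# A difficulty-ordered item set: easiest item first, hardest item last.
ordered_items = sorted(item_difficulties, key=item_difficulties.get)


def expected_raw_score(ability: float, difficulties) -> float:
    """Expected number of items passed by a person of the given ability."""
    return sum(rasch_probability(ability, b) for b in difficulties)


if __name__ == "__main__":
    for name in ordered_items:
        p = rasch_probability(0.0, item_difficulties[name])
        print(f"{name}: difficulty={item_difficulties[name]:+.1f}, "
              f"P(correct | ability=0) = {p:.2f}")
```

When ability equals difficulty, the probability of success is exactly .50, which is what lets items and persons be placed on one common scale and lets items be ordered from easy to hard.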




The following figure is drawn from the WJ IV technical manual (McGrew, LaForte, & Schrank, 2014) and illustrates the three-stage structural validity process used in the WJ IV. The same process, with slightly different age groups and the addition of exploratory hierarchical psychometric network analysis during stage 2A (see the exciting and ground-breaking work of Dr. Hudson Golino and colleagues), will be presented in the WJ V technical manual (LaForte, Dailey, & McGrew, Q1-2025).