Thursday, January 17, 2013

Journal Alert: Journal of Educational Measurement, 49(4), 2012

Journal Name:   JOURNAL OF EDUCATIONAL MEASUREMENT (ISSN: 0022-0655)
Issue:          Vol. 49 No. 4, 2012
IDS#:           060WQ
Alert Expires:  10 JAN 2014
Number of Articles in Issue:  9 (9 included in this e-mail)
Organization ID:  c4f3d919329a46768459d3e35b8102e6
========================================================================
Note:  Instructions on how to purchase the full text of an article and Thomson Reuters Science Contact information are at the end of the e-mail.
========================================================================


*Pages: 339-361 (Article)
*View Full Record: http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=Alerting&SrcApp=Alerting&DestApp=CCC&DestLinkType=FullRecord;KeyUT=CCC:000312809500001
*Order Full Text [ ]

Title:
Psychometric Equivalence of Ratings for Repeat Examinees on a Performance Assessment for Physician Licensure

Authors:
Raymond, MR; Swygert, KA; Kahraman, N

Source:
*JOURNAL OF EDUCATIONAL MEASUREMENT*, 49 (4):339-361; WIN 2012 

Abstract:
Although a few studies report sizable score gains for examinees who
repeat performance-based assessments, research has not yet addressed the
reliability and validity of inferences based on ratings of repeat
examinees on such tests. This study analyzed scores for 8,457
single-take examinees and 4,030 repeat examinees who completed a 6-hour
clinical skills assessment required for physician licensure. Each
examinee was rated in four skill domains: data gathering,
communication-interpersonal skills, spoken English proficiency, and
documentation proficiency. Conditional standard errors of measurement
computed for single-take and multiple-take examinees indicated that
ratings were of comparable precision for the two groups within each of
the four skill domains; however, conditional errors were larger for
low-scoring examinees regardless of retest status. In addition,
multiple-take examinees exhibited less score consistency across the
skill domains on their first attempt, but their scores became more
consistent on the second attempt. Further, the median correlation
between scores on the four clinical skill domains and three external
measures was .15 for multiple-take examinees on their first attempt but
increased to .27 on their second attempt, a value comparable to the
median correlation of .26 for single-take examinees. The findings
support the validity of inferences based on scores from the second
attempt.
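
As a rough illustration of the external-correlation summary described
above, the sketch below computes the median of the domain-by-external
correlations separately for single-take and multiple-take examinees. It
is a minimal Python sketch, not the authors' analysis; the data frame,
column names, and grouping flag are hypothetical stand-ins.

```python
# Minimal sketch (not the authors' analysis): median correlation between
# hypothetical domain scores and external measures, by examinee group.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 500  # hypothetical sample size

# Hypothetical data: four skill-domain ratings, three external measures,
# and a flag marking repeat examinees.
cols = ["data_gathering", "communication", "english", "documentation",
        "ext1", "ext2", "ext3"]
df = pd.DataFrame(rng.normal(size=(n, len(cols))), columns=cols)
df["repeat_examinee"] = rng.integers(0, 2, size=n).astype(bool)

domains, externals = cols[:4], cols[4:]

def median_external_correlation(sub):
    # Median of the 4 x 3 = 12 domain-by-external correlations.
    corrs = [sub[d].corr(sub[e]) for d in domains for e in externals]
    return float(np.median(corrs))

for is_repeat, sub in df.groupby("repeat_examinee"):
    label = "multiple-take" if is_repeat else "single-take"
    print(label, round(median_external_correlation(sub), 2))
```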

========================================================================


*Pages: 362-379 (Article)
*View Full Record: http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=Alerting&SrcApp=Alerting&DestApp=CCC&DestLinkType=FullRecord;KeyUT=CCC:000312809500002
*Order Full Text [ ]

Title:
Investigating the Effect of Item Position in Computer-Based Tests

Authors:
Li, FM; Cohen, A; Shen, LJ

Source:
*JOURNAL OF EDUCATIONAL MEASUREMENT*, 49 (4):362-379; WIN 2012 

Abstract:
Computer-based tests (CBTs) often use random ordering of items in order
to minimize item exposure and reduce the potential for answer copying.
Little research has been done, however, to examine item position effects
for these tests. In this study, different versions of a Rasch model and
different response time models were examined and applied to data from a
CBT administration of a medical licensure examination. Specifically,
the models were used to investigate whether item position affected
estimates of item difficulty and item intensity. Results indicated that the
position effect was negligible.
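
The abstract does not spell out the models, but one way to picture an
item-position effect in a Rasch framework is to let effective difficulty
drift with serial position. The sketch below simulates randomly ordered
items with an assumed linear drift and then compares each item's
proportion correct when it appears early versus late in the test; it is
a simulation sketch under assumed parameter values, not the Rasch or
response time models fit in the article.

```python
# Simulation sketch (assumed values, not the article's models): a Rasch-type
# model in which effective item difficulty drifts linearly with position.
import numpy as np

rng = np.random.default_rng(1)
n_persons, n_items = 2000, 40
theta = rng.normal(0.0, 1.0, n_persons)   # person abilities
b = rng.normal(0.0, 1.0, n_items)         # base item difficulties
gamma = 0.02                              # assumed per-position drift in difficulty

# Each examinee sees the items in an independent random order.
orders = np.array([rng.permutation(n_items) for _ in range(n_persons)])
positions = np.empty_like(orders)
for p in range(n_persons):
    positions[p, orders[p]] = np.arange(n_items)  # position of each item for person p

# P(correct) = logistic(theta - (b + gamma * position))
logits = theta[:, None] - (b[None, :] + gamma * positions)
responses = rng.random((n_persons, n_items)) < 1.0 / (1.0 + np.exp(-logits))

# Crude descriptive check: per-item proportion correct when the item is
# administered in the first half of the test versus the second half.
early = positions < n_items // 2
p_early = (responses & early).sum(axis=0) / early.sum(axis=0)
p_late = (responses & ~early).sum(axis=0) / (~early).sum(axis=0)
print("mean drop in proportion correct (early -> late):",
      round(float((p_early - p_late).mean()), 3))
```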

========================================================================


*Pages: 380-398 (Article)
*View Full Record: http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=Alerting&SrcApp=Alerting&DestApp=CCC&DestLinkType=FullRecord;KeyUT=CCC:000312809500003
*Order Full Text [ ]

Title:
Relationships of Measurement Error and Prediction Error in Observed-Score Regression

Authors:
Moses, T

Source:
*JOURNAL OF EDUCATIONAL MEASUREMENT*, 49 (4):380-398; WIN 2012 

Abstract:
The focus of this paper is assessing the impact of measurement errors on
the prediction error of an observed-score regression. Measures are
presented and described for decomposing the linear regression's
prediction error variance into parts attributable to the true score
variance and the error variances of the dependent variable and the
predictor variable(s). These measures are demonstrated for regression
situations reflecting a range of true score correlations and
reliabilities and using one and two predictors. Simulation results are
also presented showing that the measures of prediction error variance
and its parts are generally well estimated for the considered ranges of
true score correlations and reliabilities and for homoscedastic and
heteroscedastic data. The final discussion considers how the
decomposition might be useful for addressing additional questions about
regression functions' prediction error variances.
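
Under classical test theory, the pieces of such a decomposition can be
written out directly. The sketch below works through one plausible
single-predictor decomposition under assumed reliabilities and an
assumed true-score correlation; it is an illustration, not necessarily
the measures derived in the article.

```python
# Worked numeric sketch (assumed values; one plausible classical-test-theory
# decomposition, not necessarily the article's measures).
var_y = 100.0     # observed variance of the criterion Y
rel_y = 0.85      # reliability of Y
rel_x = 0.80      # reliability of the single predictor X
rho_t = 0.60      # correlation between the true scores of X and Y

var_ty = rel_y * var_y            # true-score variance of Y
var_ey = (1 - rel_y) * var_y      # error variance of Y

# The observed correlation is attenuated by both reliabilities.
rho_xy2 = rho_t**2 * rel_x * rel_y

pev = var_y * (1 - rho_xy2)       # prediction error variance of Y regressed on X

# Parts: criterion measurement error, irreducible true-score prediction error,
# and extra error contributed by the unreliable predictor.
part_ey = var_ey
part_true = var_ty * (1 - rho_t**2)
part_ex = var_ty * rho_t**2 * (1 - rel_x)

print(pev, part_ey + part_true + part_ex)   # the three parts sum to the total
```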

========================================================================


*Pages: 399-418 (Article)
*View Full Record: http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=Alerting&SrcApp=Alerting&DestApp=CCC&DestLinkType=FullRecord;KeyUT=CCC:000312809500004
*Order Full Text [ ]

Title:
Comparison of the One- and Bi-Direction Chained Equipercentile Equating

Authors:
Oh, H; Moses, T

Source:
*JOURNAL OF EDUCATIONAL MEASUREMENT*, 49 (4):399-418; WIN 2012 

Abstract:
This study investigated differences between two approaches to chained
equipercentile (CE) equating (one- and bi-direction CE equating) in
nearly equal groups and relatively unequal groups. In one-direction CE
equating, the new form is linked to the anchor in one sample of
examinees and the anchor is linked to the reference form in the other
sample. In bi-direction CE equating, the anchor is linked to the new
form in one sample of examinees and to the reference form in the other
sample. The two approaches were evaluated in comparison to a criterion
equating function (i.e., equivalent groups equating) using indexes such
as root expected squared difference, bias, standard error of equating,
root mean squared error, and number of gaps and bumps. The overall
results across the equating situations suggested that the two CE
equating approaches produced very similar results, although the
bi-direction results were slightly less erratic, smoother (i.e., fewer
gaps and bumps), usually closer to the criterion function, and less
variable.
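
For readers unfamiliar with the chaining step, the sketch below
implements a bare-bones one-direction chain on simulated scores: the new
form is linked to the anchor in one sample and the anchor is linked to
the reference form in the other, using simple percentile-rank
interpolation. It is a schematic illustration without presmoothing, not
the procedures compared in the study.

```python
# Schematic sketch of one-direction chained equipercentile equating on
# simulated scores (no presmoothing; not the procedures compared in the study).
import numpy as np

rng = np.random.default_rng(2)

def equipercentile_link(from_scores, to_scores):
    """Map a 'from' score to the 'to' score with the same percentile rank."""
    from_sorted = np.sort(from_scores)
    to_sorted = np.sort(to_scores)
    def link(x):
        pr = np.searchsorted(from_sorted, x, side="right") / from_sorted.size
        return float(np.quantile(to_sorted, np.clip(pr, 0.0, 1.0)))
    return link

# Sample 1 takes the new form X and the anchor V; sample 2 takes the
# reference form Y and the anchor V (the groups differ slightly in ability).
n = 3000
ability1, ability2 = rng.normal(0.0, 1.0, n), rng.normal(0.2, 1.0, n)
x_new = 30 + 5 * ability1 + rng.normal(0, 2, n)
v_sample1 = 15 + 3 * ability1 + rng.normal(0, 1.5, n)
y_ref = 32 + 5 * ability2 + rng.normal(0, 2, n)
v_sample2 = 15 + 3 * ability2 + rng.normal(0, 1.5, n)

x_to_v = equipercentile_link(x_new, v_sample1)   # new form -> anchor (sample 1)
v_to_y = equipercentile_link(v_sample2, y_ref)   # anchor -> reference (sample 2)

for score in (20.0, 30.0, 40.0):
    print(f"new-form {score} -> reference {v_to_y(x_to_v(score)):.2f}")
```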

========================================================================


*Pages: 419-445 (Article)
*View Full Record: http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=Alerting&SrcApp=Alerting&DestApp=CCC&DestLinkType=FullRecord;KeyUT=CCC:000312809500005
*Order Full Text [ ]

Title:
Item Response Models for Examinee-Selected Items

Authors:
Wang, WC; Jin, KY; Qiu, XL; Wang, L

Source:
*JOURNAL OF EDUCATIONAL MEASUREMENT*, 49 (4):419-445; WIN 2012 

Abstract:
In some tests, examinees are required to choose a fixed number of items
to answer from a given set. This practice creates a challenge
to standard item response models, because more capable examinees may
have an advantage by making wiser choices. In this study, we developed a
new class of item response models to account for the choice effect of
examinee-selected items. The results of a series of simulation studies
showed that (1) the parameters of the new models were recovered well;
(2) the parameter estimates were almost unbiased when the new models
were fit to data simulated from standard item response models;
(3) failing to consider the choice effect yielded shrunken parameter
estimates for examinee-selected items; and (4) even when the missingness
mechanism in examinee-selected items did not follow the item response
functions specified in the new models, the new models still yielded a
better fit than standard item response models. An empirical example
of a college entrance examination supported the use of the new models:
in general, the higher the examinee's ability, the better his or her
choice of items.
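
To see why examinee choice is awkward for standard models, consider the
small simulation below: when abler examinees tend to pick the item they
are more likely to get right, the proportion correct observed among an
item's choosers differs from what the full population would have
produced, so the missingness is not ignorable. The item parameters and
choice rule are assumptions for illustration, not the new models
proposed in the article.

```python
# Illustration with assumed values (not the article's models): examinee choice
# between two items makes the observed responses missing-not-at-random.
import numpy as np

rng = np.random.default_rng(3)
n = 50000
theta = rng.normal(0, 1, n)

# Assumed 2PL items whose response curves cross, so the "wiser" pick depends
# on ability: item 0 is safer for low ability, item 1 pays off for high ability.
a = np.array([0.8, 1.8])
b = np.array([-0.5, 0.3])
p = 1.0 / (1.0 + np.exp(-a[None, :] * (theta[:, None] - b[None, :])))

# Choice rule for illustration: abler examinees are more likely to spot and
# pick the item they have the better chance of answering correctly.
wise = rng.random(n) < 1.0 / (1.0 + np.exp(-theta))
chosen = np.where(wise, p.argmax(axis=1), rng.integers(0, 2, n))

responses = rng.random(n) < p[np.arange(n), chosen]

for item in (0, 1):
    observed = responses[chosen == item].mean()   # proportion correct among choosers
    population = p[:, item].mean()                # if everyone had answered the item
    print(f"item {item}: choosers {observed:.3f} vs everyone {population:.3f}")
```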

========================================================================


*Pages: 446-465 (Article)
*View Full Record: http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=Alerting&SrcApp=Alerting&DestApp=CCC&DestLinkType=FullRecord;KeyUT=CCC:000312809500006
*Order Full Text [ ]

Title:
Measurement Error Adjustment Using the SIMEX Method: An Application to Student Growth Percentiles

Authors:
Shang, Y

Source:
*JOURNAL OF EDUCATIONAL MEASUREMENT*, 49 (4):446-465; WIN 2012 

Abstract:
Growth models are used extensively in the context of educational
accountability to evaluate student-, class-, and school-level growth.
However, when error-prone test scores are used as independent variables
or right-hand-side controls, the estimation of such growth models can be
substantially biased. This article introduces a simulation-extrapolation
(SIMEX) method that corrects measurement-error-induced bias. The SIMEX
method is applied to quantile regression, which is the basis of Student
Growth Percentile, a descriptive growth model adopted in a number of
states to diagnose and project student growth. A simulation study is
conducted to demonstrate the performance of the SIMEX method in reducing
bias and mean squared error in quantile regression with a mismeasured
predictor. One of the simulation cases is based on longitudinal state
assessment data. The analysis shows that measurement error
differentially biases growth percentile results for students at
different achievement levels and that the SIMEX method corrects such
biases and closely reproduces conditional distributions of current test
scores given past true scores. The potential applications and
limitations of the method are discussed at the end of this paper with
suggestions for further studies.
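
The SIMEX idea itself is easy to sketch: add extra simulated error to
the mismeasured predictor at several known multiples of its error
variance, track how the quantile-regression coefficient degrades, and
extrapolate the trend back to the zero-error case. The sketch below does
this for a median regression with one error-prone predictor; the data,
error variance, and quadratic extrapolant are assumptions for
illustration, not the study's setup.

```python
# SIMEX sketch for a median regression with one error-prone predictor
# (assumed data, known error variance, quadratic extrapolation; not the
# study's setup).
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(4)
n = 2000
true_x = rng.normal(0, 1, n)
y = 1.0 + 2.0 * true_x + rng.normal(0, 1, n)     # true slope = 2.0
sigma_u2 = 0.5                                   # assumed known error variance
x_obs = true_x + rng.normal(0, np.sqrt(sigma_u2), n)

def median_slope(x, y):
    res = sm.QuantReg(y, sm.add_constant(x)).fit(q=0.5)
    return float(np.asarray(res.params)[1])

# Simulation step: add extra error at levels lambda, averaging over replicates.
lambdas = np.array([0.0, 0.5, 1.0, 1.5, 2.0])
slopes = []
for lam in lambdas:
    reps = [median_slope(x_obs + rng.normal(0, np.sqrt(lam * sigma_u2), n), y)
            for _ in range(10)]
    slopes.append(np.mean(reps))

# Extrapolation step: fit a quadratic in lambda and evaluate it at lambda = -1,
# i.e., the hypothetical error-free case.
coefs = np.polyfit(lambdas, slopes, deg=2)
print("naive slope:", round(slopes[0], 3))
print("SIMEX slope:", round(float(np.polyval(coefs, -1.0)), 3))
```

The quadratic extrapolant used here is a common default for SIMEX;
other extrapolation functions trade off bias against variance.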

========================================================================


*Pages: 466-466 (Correction)
*View Full Record: http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=Alerting&SrcApp=Alerting&DestApp=CCC&DestLinkType=FullRecord;KeyUT=CCC:000312809500007
*Order Full Text [ ]

Title:
A paradox in the study of the benefits of test-item review (vol 48, pg 380, 2011)

Authors:
van der Linden, WJ; Jeon, M; Ferrara, S

Source:
*JOURNAL OF EDUCATIONAL MEASUREMENT*, 49 (4):466-466; WIN 2012 

========================================================================


*Pages: 467-468 (Article)
*View Full Record: http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=Alerting&SrcApp=Alerting&DestApp=CCC&DestLinkType=FullRecord;KeyUT=CCC:000312809500008
*Order Full Text [ ]

Title:
A Simple Answer to a Simple Question on Changing Answers

Authors:
Bridgeman, B

Source:
*JOURNAL OF EDUCATIONAL MEASUREMENT*, 49 (4):467-468; WIN 2012 

Abstract:
In an article in the Winter 2011 issue of the Journal of Educational
Measurement, van der Linden, Jeon, and Ferrara suggested that test
takers should trust their initial instincts and retain their initial
responses when they have the opportunity to review test items. They
presented a complex IRT model that appeared to show that students would
be worse off by changing answers. As noted in a subsequent erratum, this
conclusion was based on flawed data, and the correct data could not
be analyzed by their method because the model failed to converge. This
left their basic question on the value of answer changing unanswered. A
much more direct approach is to simply count the number of examinees
whose scores after an opportunity to change answers are higher, lower,
or the same as their initial scores. In the same data set used in the
original article, an overwhelming majority of the students received
higher scores after the opportunity to change answers.
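
The counting approach described above takes only a few lines. The
sketch below tallies examinees whose final scores are higher than, lower
than, or equal to their initial scores; the score arrays are
hypothetical stand-ins, not the data analyzed in the paper.

```python
# Minimal sketch of the direct counting approach (hypothetical score arrays,
# not the data analyzed in the paper).
import numpy as np

rng = np.random.default_rng(5)
initial = rng.binomial(60, 0.7, size=1000)                  # scores before review
change = rng.choice([-1, 0, 1, 2], size=1000,
                    p=[0.05, 0.45, 0.35, 0.15])             # assumed effect of review
final = initial + change

print("higher:", int((final > initial).sum()))
print("lower :", int((final < initial).sum()))
print("same  :", int((final == initial).sum()))
```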

========================================================================


*Pages: 469-475 (Correction)
*View Full Record: http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=Alerting&SrcApp=Alerting&DestApp=CCC&DestLinkType=FullRecord;KeyUT=CCC:000312809500009
*Order Full Text [ ]

Title:
Efficiency balanced information criterion for item selection in computerized adaptive testing (vol 49, pg 225, 2012)

Authors:
Han, KT

Source:
*JOURNAL OF EDUCATIONAL MEASUREMENT*, 49 (4):469-475; WIN 2012 

========================================================================
*Order Full Text* 
All Customers
--------------
   Please contact your library administrator, or person(s) responsible for
   document delivery, to find out more about your organization's policy for
   obtaining the full text of the above articles.  If your organization does not
   have a current document delivery provider, you can order the document from our
   document delivery service TS Doc.  To order a copy of the article(s) you wish
   to receive, please go to www.contentscm.com and enter the citation information for
   each document.  A price quote for each item will be given and you will need a
   credit card to complete your order request.

 TS Doc Customers
--------------
   TS Doc customers can purchase the full text of an article using their 
   TS Doc account.  Go to www.contentscm.com and login using your TS Doc logon ID 
   and password.  Copy & paste the citation into the parser (Order by Citation) 
   or enter the citation information above on the web order form (Order by Form).
   A quote will be given for each item and your company will be invoiced as 
   specified in your TS Doc agreement.


If you would like contact information for TS Doc, here is the updated info:
   Product name:  TS Doc 
   Customer Service: customerservice@infotrieve.com  or (800) 603-4367


========================================================================
*Support Contact Information*
If you have any questions, please open a support ticket at http://ip-science.thomsonreuters.com/techsupport/
Telephone numbers for your local support team are also available here.
========================================================================