Out-of-level testing refers to the practice of assessing a student with a test that is intended for students at a higher or lower grade level. Although the appropriateness of out-of-level testing for accountability purposes has been questioned by educators and policymakers, incorporating out-of-level items in formative assessments for accurate feedback is recommended. This study made use of a commercial item bank with vertically scaled items across grades and simulated student responses in a computerized adaptive testing (CAT) environment. Results of the study suggested that administration of out-of-level items improved measurement accuracy and test efficiency for students who perform significantly above or below their grade-level peers. This study has direct implications for the relevance, applicability, and benefits of using out-of-level items in CAT.
In this article, the effect of the upper and lower asymptotes in item response theory models on computerized adaptive testing is shown analytically. This is done by deriving the step size between adjacent latent trait estimates under the four-parameter logistic model (4PLM) and two models it subsumes, the usual three-parameter logistic model (3PLM) and the 3PLM with upper asymptote (3PLMU). The authors show analytically that the large effect of the discrimination parameter on the step size holds true for the 4PLM and the two models it subsumes under both the maximum information method and the b-matching method for item selection. Furthermore, the lower asymptote helps reduce the positive bias of ability estimates associated with early guessing, and the upper asymptote helps reduce the negative bias induced by early slipping. Relative step size between modeling versus not modeling the upper or lower asymptote under the maximum Fisher information method (MI) and the b-matching method is also derived. It is also shown analytically why the gain from early guessing is smaller than the loss from early slipping when the lower asymptote is modeled, and vice versa when the upper asymptote is modeled. The benefit to loss ratio is quantified under both the MI and the b-matching method. Implications of the analytical results are discussed.
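As a concrete illustration of the asymptotes discussed above, the following is a minimal sketch of the 4PLM response function and the two submodels it subsumes (function and parameter names are illustrative, not taken from the article):

```python
import numpy as np

def p_4pl(theta, a, b, c, d):
    """Probability of a correct response under the 4PLM.

    c is the lower asymptote (guessing) and d the upper asymptote
    (slipping). Setting d = 1 recovers the usual 3PLM; setting c = 0
    with d < 1 gives the 3PLM with an upper asymptote (3PLMU).
    """
    return c + (d - c) / (1.0 + np.exp(-a * (theta - b)))

# With c > 0, a lucky guess at low theta is partly absorbed by the
# lower asymptote; with d < 1, an early slip at high theta is
# likewise absorbed by the upper asymptote.
p_guess = p_4pl(-2.0, a=1.5, b=0.0, c=0.2, d=1.0)   # 3PLM case
p_slip  = p_4pl( 2.0, a=1.5, b=0.0, c=0.0, d=0.85)  # 3PLMU case
```

Note that the response probability is bounded between c and d, which is what dampens the influence of a single aberrant early response on the ability estimate.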
This article introduces two new item selection methods, the modified posterior-weighted Kullback-Leibler index (MPWKL) and the generalized deterministic inputs, noisy and gate (G-DINA) model discrimination index (GDI), that can be used in cognitive diagnosis computerized adaptive testing. The efficiency of the new methods is compared with the posterior-weighted Kullback-Leibler (PWKL) item selection index using a simulation study in the context of the G-DINA model. The impact of item quality, generating models, and test termination rules on attribute classification accuracy or test length is also investigated. The results of the study show that the MPWKL and GDI perform very similarly, and have higher correct attribute classification rates or shorter mean test lengths compared with the PWKL. In addition, the GDI has the shortest implementation time among the three indices. The proportion of item usage with respect to the required attributes across the different conditions is also tracked and discussed.
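For orientation, here is a minimal sketch of the posterior-weighted Kullback-Leibler (PWKL) idea that the proposed indices build on; the names and the simplified two-outcome setting are our assumptions, not the article's exact formulation:

```python
import numpy as np

def pwkl_index(posterior, p_item):
    """Posterior-weighted KL index for one candidate dichotomous item.

    posterior : posterior probability of each attribute pattern
    p_item    : P(correct | pattern) for this item under a diagnostic
                model (e.g., G-DINA success probabilities)
    """
    posterior = np.asarray(posterior, dtype=float)
    p_item = np.asarray(p_item, dtype=float)
    # Current point estimate: the most probable attribute pattern.
    c_hat = int(np.argmax(posterior))
    p0 = p_item[c_hat]
    # KL divergence between the response distribution at the point
    # estimate and at each candidate pattern, weighted by the posterior.
    eps = 1e-12
    kl = (p0 * np.log((p0 + eps) / (p_item + eps))
          + (1 - p0) * np.log((1 - p0 + eps) / (1 - p_item + eps)))
    return float(np.sum(posterior * kl))
```

An item whose success probabilities differ sharply across plausible patterns receives a large index and would be selected next; an item that cannot separate the patterns scores near zero.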
The alignment between a test and the content domain it measures represents key evidence for the validation of test score inferences. Although procedures have been developed for evaluating the content alignment of linear tests, these procedures are not readily applicable to computerized adaptive tests (CATs), which require large item pools and do not use fixed test forms. This article describes the decisions made in the development of CATs that influence and might threaten content alignment. It outlines a process for evaluating alignment that is sensitive to these threats and gives an empirical example of the process.
This article reviews the software package SimuMCAT, which simulates unidimensional and multidimensional computerized adaptive testing with various item types (dichotomous/polytomous) and loading structures (simple-/complex-structured). In addition, the software allows users to choose from five item selection procedures, two stopping rules for variable-length tests, and test constraints that satisfy a test blueprint and limit item exposure.
This article discusses four item selection rules for designing efficient individualized tests for the random weights linear logistic test model (RWLLTM): minimum posterior-weighted D-error (D-B), minimum expected posterior-weighted D-error (EDB), maximum expected Kullback-Leibler divergence between subsequent posteriors (KLP), and maximum mutual information (MUI). The RWLLTM decomposes test items into a set of subtasks or cognitive features and assumes individual-specific effects of the features on the difficulty of the items. The model extends and improves the well-known linear logistic test model, in which feature effects are only estimated at the aggregate level. Simulations show that the efficiencies of the designs obtained with the different criteria appear to be equivalent. However, KLP and MUI are given preference over D-B and EDB because of their lower complexity, which significantly reduces the computational burden.
ISBN: (print) 9783319091501; 9783319091495
In the digital world, any conceptual assessment framework faces two main challenges: (a) the complexity of the knowledge, capacities, and skills to be assessed; (b) the increasing usability of web-based assessments, which requires innovative approaches to the development, delivery, and scoring of tests. Statistical methods play a central role in such a framework. Item response models have been the most common statistical methods used to address these measurement challenges, and they underpin computer-based adaptive tests, which select items adaptively from an item pool according to the person's ability during test administration. The test is tailored to each student. In this paper we conduct a simulation study based on the minimum error-variance criterion, varying the item exposure rate (0.1, 0.3, 0.5) and the maximum test length (18, 27, 36). The comparison is made by examining the absolute bias, the root mean square error, and the correlation. Hypothesis tests are applied to compare the true and estimated distributions. The results suggest a considerable reduction in bias as the number of items administered increases, the occurrence of a ceiling effect in very short tests, and full agreement between the true and empirical distributions for computerized tests shorter than the paper-and-pencil tests.
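For the 2PL model, the minimum error-variance criterion amounts to selecting the item with maximum Fisher information at the provisional ability estimate. The following is a minimal sketch of such selection combined with a simple running exposure cap; the function names and the cap mechanism are illustrative assumptions, not the paper's exact design:

```python
import numpy as np

def fisher_info(theta, a, b):
    """Fisher information of 2PL items at ability theta; maximizing
    this minimizes the error variance of the provisional estimate."""
    p = 1.0 / (1.0 + np.exp(-a * (theta - b)))
    return a ** 2 * p * (1.0 - p)

def select_item(theta_hat, a, b, used, exposure, n_seen, max_rate=0.3):
    """Pick the most informative unadministered item whose running
    exposure rate stays below max_rate.

    used     : boolean mask of items already given to this examinee
    exposure : how many examinees have seen each item so far
    n_seen   : number of examinees processed so far
    """
    info = fisher_info(theta_hat, a, b)
    rate = exposure / max(n_seen, 1)
    info[used | (rate >= max_rate)] = -np.inf  # rule out ineligible items
    return int(np.argmax(info))
```

With the cap at 0.3, an otherwise optimal item is skipped once three in ten examinees have seen it, which trades a little information per step for more even pool usage, mirroring the exposure-rate conditions varied in the study.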
This paper presents an original method for evaluating the outcomes of adaptive testing under a multilevel testing strategy. The set of testing outcomes consists of atypical elements of different dimensionality. The paper defines criteria for comparing these elements, describes principles for ordering the outcome set, and derives a final score. The ordering method is presented for multistage testing (MST) with a 1-3-3 design and is used to estimate the results of computerized adaptive testing (CAT). The method is not tied to a specific testing procedure; its application to the 1-3-3 design described in the paper confirms this. To order the set of testing outcomes, the function-criteria described in the initial article are used, and a comparative analysis of the obtained results is performed. The ordering criterion for testing outcomes need not be unique; the paper illustrates this through a comparative discussion of two samples. An original testing procedure is used to present the essence of the method. This procedure is intended to be illustrative, because the described assessment method can be applied to similar strategies. The ordered outcome set is scored on a hundred-point scale according to the normal distribution. The applied results of this research are implemented as the "Adaptester" portal, available at https://***.
Several forced-choice (FC) computerized adaptive tests (CATs) have emerged in the field of organizational psychology, all of them employing ideal-point items. However, although most items developed historically follow dominance response models, research on FC CAT using dominance items is limited, dominated by simulation studies, and lacking in empirical deployment. This empirical study trialed a FC CAT with dominance items described by the Thurstonian Item Response Theory model with research participants. It investigated important practical issues such as the implications of adaptive item selection and social desirability balancing criteria for score distributions, measurement accuracy, and participant perceptions. Moreover, nonadaptive but optimal tests of similar design were trialed alongside the CATs to provide a baseline for comparison, helping to quantify the return on investment when converting an otherwise-optimized static assessment into an adaptive one. Although the benefit of adaptive item selection in improving measurement precision was confirmed, results also indicated that at shorter test lengths CAT had no notable advantage over optimal static tests. Taking a holistic view incorporating both psychometric and operational considerations, implications for the design and deployment of FC assessments in research and practice are discussed.
The article presents adaptive testing strategies for polytomously scored technology-enhanced innovative items. We investigate item selection methods that match examinees' ability levels in location and explore ways to leverage test-taking speeds during item selection. Existing approaches to selecting polytomous items are mostly based on information measures and tend to suffer from an item pool usage problem. In this study, we introduce location indices for polytomous items and show that location-matched item selection significantly alleviates the usage problem and achieves more diverse item sampling. We also consider matching items' time intensities so that testing times can be regulated across examinees. A numerical experiment using Monte Carlo simulation suggests that location-matched item selection achieves significantly better and more balanced item pool usage. Leveraging working speed in item selection distinctly reduced both the average testing time and its variation across examinees. Both procedures incurred only a marginal measurement cost (e.g., in precision and efficiency) yet showed significant improvements in administrative outcomes. The experiment in two test settings also suggested that the procedures can lead to different administrative gains depending on the test design.
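A minimal sketch of what location-matched selection for polytomous items could look like; the location definition (mean of step difficulties) and all names are illustrative assumptions, and the article's indices may differ:

```python
import numpy as np

def item_location(step_difficulties):
    """A simple location index for a polytomous item: the mean of its
    step difficulties (one of several possible definitions)."""
    return float(np.mean(step_difficulties))

def location_matched_pick(theta_hat, locations, used):
    """Select the unadministered item whose location is closest to the
    current ability estimate. Unlike information-based selection, this
    tends to spread usage across the pool rather than repeatedly
    choosing the few most informative items."""
    dist = np.abs(np.asarray(locations, dtype=float) - theta_hat)
    dist[used] = np.inf  # skip items already administered
    return int(np.argmin(dist))
```

Because the matching target moves with the provisional ability estimate, different examinees are routed to different regions of the pool, which is the mechanism behind the more balanced usage reported above.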