CAT Field-Test Item Calibration Sample Size: How Large is Large under the Rasch Model?

Wei He
Northwest Evaluation Association

GJHSS Volume 15 Issue G1

ResearchID: N6E64


This study provides guidelines for practitioners regarding the minimum calibration sample size needed for pretest item estimation in computerized adaptive testing (CAT) under WINSTEPS when the fixed-person-parameter estimation method is applied to derive pretest item parameter estimates. The field-testing design discussed here is a form of seeding design commonly used in large-scale CAT programs. Under such a seeding design, field-test (FT) items are stored in an FT item pool, and a predetermined number of them are randomly chosen from the pool and administered to each individual examinee. Because the FT response data are sparse, this study recommends focusing on the number of valid cases (VCs) each item may end up with given a certain calibration sample size, and it introduces a simple strategy for identifying the relationship between VCs and calibration sample size. From a practical viewpoint, once the minimum number of valid cases reaches 250, item parameters are recovered quite well across a wide range of the scale. Implications of the results are also discussed.
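The seeding design described above can be illustrated with a small simulation: each examinee receives a fixed number of FT items drawn at random from the pool, and the valid cases per item accumulate accordingly. This is only a sketch; the pool size, number of seeded items, and sample size below are hypothetical values chosen for illustration, not figures from the study.

```python
import random

def simulate_valid_cases(n_examinees, pool_size, items_per_examinee, seed=0):
    """Simulate a CAT seeding design: each examinee is administered a
    random subset of field-test (FT) items, and we count the valid
    cases (VCs) each item accumulates."""
    rng = random.Random(seed)
    vcs = [0] * pool_size
    for _ in range(n_examinees):
        # Randomly choose which FT items this examinee sees.
        for item in rng.sample(range(pool_size), items_per_examinee):
            vcs[item] += 1
    return vcs

# Hypothetical scenario: 3000 examinees, 120 FT items, 5 seeded per test.
# On average each item collects 3000 * 5 / 120 = 125 VCs, so a larger
# examinee sample would be needed before every item reaches 250 VCs.
vcs = simulate_valid_cases(3000, 120, 5)
print(min(vcs), sum(vcs) / len(vcs), max(vcs))
```

Under simple random seeding, expected VCs per item equal (examinees × items per examinee) / pool size, but the minimum across items lags the average, which is why the study's focus on per-item VCs rather than total sample size matters in practice.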

Cited by 17 articles

References

  1. J.-C. Ban, B. Hanson, T. Wang, Q. Yi, D. Harris (2001). A comparative study of on-line pretest item calibration/scaling methods in computerized adaptive testing.
  2. R. Hambleton, H. Swaminathan, H. Rogers (1991). Fundamentals of Item Response Theory.
  3. C. Glas (2003). Quality control of online calibration in computerized assessment.
  4. K. Haynie, W. Way (1995). An investigation of item calibration procedures for a computerized licensure examination.
  5. Y. Hsu, T. Thompson, W.-H. Chen (1998). CAT item calibration.
  6. P. Jansen, A. van den Wollenberg, F. Wierda (1988). Correcting unconditional parameter.
  7. G. Kingsbury (2009). Adaptive item calibration: A process for estimating item parameters within a computerized adaptive test.
  8. J. Linacre (2001). WINSTEPS Rasch measurement computer program.
  9. H. Meng, S. Steinkamp (2009). A comparison study of CAT pretest item linking designs.
  10. C. Parshall (1998). Item development and pretesting in a CBT environment.
  11. M. Stocking (1988). Scale drift in on-line calibration.
  12. M. Stocking (1990). Specifying optimum examinees for item parameter estimation in item response theory.
  13. A. van den Wollenberg, F. Wierda, P. Jansen (1988). Consistency of Rasch model parameter estimation: A simulation study.
  14. W.-C. Wang, C.-T. Chen (2005). Item parameter recovery, standard error estimates, and fit statistics of the WINSTEPS program for the family of Rasch models.
  15. B. Wright, G. Douglas (1977). Best procedures for sample-free item analysis.
  16. B. Wright, M. Stone (1979). Measurement, Evaluation.
  17. M. Zimowski, E. Muraki, R. Mislevy, R. Bock (1999). BILOG-MG: Multiple-group IRT analysis and test maintenance for binary items.

Funding

No external funding was declared for this work.

Conflict of Interest

The authors declare no conflict of interest.

Ethical Approval

No ethics committee approval was required for this article type.

Data Availability

Not applicable for this article.

Wei He. 2015. "CAT Field-Test Item Calibration Sample Size: How Large is Large under the Rasch Model?". Global Journal of Human-Social Science - G: Linguistics & Education, Volume 15, Issue G1, pp. 73-79.


Pp. 73-79
Journal Specifications

Crossref Journal DOI 10.17406/GJHSS

Print ISSN 0975-587X

e-ISSN 2249-460X

Classification
GJHSS-G Classification: FOR Code: 139999, 200499
Version of record

v1.2

Issue date

February 19, 2015

Language

English


Article Metrics
Total Views: 4293
Total Downloads: 2247
