Blueprinting the Future: Automatic Item Categorisation using Hierarchical Zero-Shot and Few-Shot Classifiers

Author(s)

Wang, Ting; Stelter, Keith L; O’Neill, Thomas R; Hendrix, Nathaniel; Bazemore, Andrew W; and Newton, Warren P

Topic(s)

Family Medicine Certification

Keyword(s)

Psychometrics

Volume

Journal of Applied Testing Technology

Precise item categorisation is essential for aligning exam questions with the content domains outlined in assessment blueprints. Traditional methods, such as manual classification or supervised machine learning, are often time-consuming, error-prone, or limited by the need for large training datasets. This study presents a novel approach using zero-shot and few-shot Generative Pretrained Transformer (GPT) models for hierarchical item categorisation. By leveraging human-readable language descriptions within a structured Python dictionary, the model navigates complex blueprint hierarchies without requiring extensive training data. An initial simulation with synthetic items demonstrated the method’s effectiveness, achieving an average F1 score of 92.91%. The approach was then applied to 200 real exam items from the 2022 In-Training Examination (ITE) by the American Board of Family Medicine (ABFM), reclassifying them according to a newly developed blueprint in just 15 minutes, a process that would typically take several days of expert review. This technique offers rapid, consistent, and scalable item categorisation, minimises human bias, and allows for iterative refinement through simple adjustments to category definitions, enhancing both efficiency and sustainability in assessment design.
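
The idea of walking a blueprint hierarchy stored as a Python dictionary can be illustrated with a minimal sketch. This is not the authors' implementation. The category names, descriptions, and function names (`BLUEPRINT`, `describe`, `build_prompt`, `keyword_chooser`, `classify`) are hypothetical, and the GPT call is replaced by a simple word-overlap stand-in so that the example runs without any external service. In practice, the `choose` argument would presumably send a prompt such as the one built by `build_prompt` to a language model and return the category name it answers with.

```python
import re
from typing import Any, Callable, Dict, List

# Hypothetical two-level blueprint (NOT the ABFM blueprint):
# inner nodes are dicts, leaves are human-readable descriptions.
BLUEPRINT: Dict[str, Any] = {
    "Acute Care": {
        "Respiratory": "cough, asthma attack, pneumonia, dyspnea",
        "Injury": "fracture, sprain, laceration, wound",
    },
    "Chronic Care": {
        "Diabetes": "insulin, glucose control, HbA1c, metformin",
        "Hypertension": "blood pressure control, antihypertensive medication",
    },
}


def describe(name: str, node: Any) -> str:
    """Flatten a node and its descendants into one readable description."""
    if isinstance(node, str):
        return f"{name}: {node}"
    return f"{name}: " + "; ".join(describe(k, v) for k, v in node.items())


def build_prompt(item: str, options: Dict[str, str]) -> str:
    """Assumed zero-shot prompt wording; few-shot examples could be prepended."""
    listing = "\n".join(f"- {desc}" for desc in options.values())
    return (
        "Assign the exam item to exactly one category.\n"
        f"Item: {item}\nCategories:\n{listing}\n"
        "Answer with the category name only."
    )


def keyword_chooser(item: str, options: Dict[str, str]) -> str:
    """Stand-in for the GPT call: pick the option sharing the most words."""
    item_words = set(re.findall(r"[a-z0-9]+", item.lower()))

    def overlap(name: str) -> int:
        option_words = set(re.findall(r"[a-z0-9]+", options[name].lower()))
        return len(item_words & option_words)

    return max(options, key=overlap)


def classify(
    item: str,
    blueprint: Dict[str, Any],
    choose: Callable[[str, Dict[str, str]], str],
) -> List[str]:
    """Descend the hierarchy one level at a time until a leaf is reached."""
    path: List[str] = []
    node: Any = blueprint
    while isinstance(node, dict):
        options = {name: describe(name, child) for name, child in node.items()}
        picked = choose(item, options)
        if picked not in options:
            raise ValueError(f"Unknown category returned: {picked!r}")
        path.append(picked)
        node = node[picked]
    return path


item = "A patient with type 2 diabetes has an HbA1c of 9% despite metformin."
print(classify(item, BLUEPRINT, keyword_chooser))
```

Because each level is decided separately, revising a category only requires editing its description string in the dictionary, which is the kind of iterative refinement the abstract describes.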
