Hello! My name is Hyunjoo Kim, and I am a Ph.D. student in the Quantitative Psychology program at the University of Illinois Urbana-Champaign. I am passionate about advancing psychometrics with modern computing and data science methods. My research interests span cognitive diagnosis models, network analysis, and latent variable modeling, with the goal of driving methodological innovation in the field. Future collaborations are always welcome!
Advisor: Dr. Hans-Friedrich Koehn (UIUC)
Funded by Accessible Teaching, Learning, and Assessment Systems (ATLAS), University of Kansas
Research with Dr. Hans-Friedrich Koehn (UIUC), Dr. Justin L. Kern (UIUC)
Kim, H., Koehn, H.F., and Chiu, C.-Y. (in press) Identifiability Conditions in Cognitive Diagnosis: Implications for Q-matrix Estimation Algorithms.
British Journal of Mathematical and Statistical Psychology
https://doi.org/10.1111/bmsp.70020
The Q-matrix of a cognitively diagnostic assessment (CDA), documenting the item-attribute associations, is a key component of any CDA. However, the true Q-matrix underlying a CDA is never known and must be estimated, typically by content experts. Because human judgment is fallible, misspecifications of the Q-matrix may occur, resulting in the misclassification of examinees. In response to this challenge, algorithms have been developed to estimate the Q-matrix from item responses. Some algorithms impose identifiability conditions while others do not. The debate about which is “right” is ongoing, especially since these conditions are sufficient but not necessary, which means viable alternative Q-matrix estimates may be ignored. In this study, the performance of Q-matrix estimation algorithms that impose identifiability conditions on the Q-matrix estimate was compared with that of algorithms that do not. Large-scale simulations examined the impact of factors such as sample size, test length, number of attributes, and error levels. The estimated Q-matrices were evaluated for whether they met identifiability conditions and for their accuracy in classifying examinees. The simulation results showed that, for the estimation algorithms studied here, imposing identifiability conditions on Q-matrix estimation did not change outcomes with respect to identifiability or examinee classification.
Kim, H. (2025) Validating Generative AI Scoring of Constructed Responses with Cognitive Diagnosis.
In Proceedings of the Artificial Intelligence in Measurement and Education Conference (AIME-Con)
https://aclanthology.org/2025.aimecon-wip.20/
Generative AI has been investigated as a tool for scoring constructed responses (CRs). Although generative AI can provide both numeric scores and qualitative feedback on written tasks effectively and efficiently, the lack of transparency in its output makes it challenging to build a strong validity argument. Validity evidence for outputs from generative AI scoring is evaluated mainly through expert reviews or statistical concordance measures with human raters. As additional validity evidence for CR scores produced by generative AI, particularly for essay-type tasks, this research examines the feasibility of applying the Cognitive Diagnosis (CD) framework in psychometrics. The results of the study indicate that the classification information of CRs and item-parameter estimates from cognitive diagnosis models (CDMs) could provide a new perspective as additional validity evidence for CR scores and feedback from generative AI with less human oversight.
Koehn, H.F., Chiu, C.-Y., Oluwalana, O., Kim, H., and Wang, J. (2024) A Two-Step Q-Matrix Estimation Algorithm
Applied Psychological Measurement
https://doi.org/10.1177/01466216241284418
Cognitive Diagnosis Models in educational measurement are restricted latent class models that describe ability in a knowledge domain as a composite of latent skills an examinee may or may not have mastered. Different combinations of skills define distinct latent proficiency classes to which examinees are assigned based on test performance. Items of cognitively diagnostic assessments are characterized by skill profiles specifying which skills are required for a correct item response. The item-skill profiles of a test form its Q-matrix. The validity of cognitive diagnosis depends crucially on the correct specification of the Q-matrix. Typically, Q-matrices are determined by curricular experts. However, expert judgment is fallible. Data-driven estimation methods have been developed with the promise of greater accuracy in identifying the Q-matrix of a test. Yet, many of the extant methods encounter computational feasibility issues in the form of either excessive CPU time or inadmissible estimates. In this article, a two-step algorithm for estimating the Q-matrix is proposed that can be used with any cognitive diagnosis model. Simulations showed that the new method outperformed extant estimation algorithms and was computationally more efficient. It was also applied to Tatsuoka’s famous fraction-subtraction data. The paper concludes with a discussion of theoretical and practical implications of the findings.
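As a rough illustration of how a Q-matrix links skills to items, here is a minimal Python sketch. It is not the two-step algorithm from the paper. The 4-item, 3-skill Q-matrix is made up, and the conjunctive rule (an item is answered correctly only if every required skill is mastered, as in DINA-type models) is an assumption chosen for simplicity; the function names are hypothetical.

```python
from itertools import product

# Hypothetical Q-matrix: 4 items (rows) by 3 skills (columns).
# Q[j][k] == 1 means item j requires skill k.
Q = [
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 1, 0],
]

def ideal_response(alpha, q_row):
    """Return 1 if skill profile alpha covers every skill the item requires.

    Assumes a conjunctive (DINA-type) rule; other models combine skills differently.
    """
    return int(all(a >= q for a, q in zip(alpha, q_row)))

def ideal_response_patterns(q_matrix):
    """Map each of the 2^K skill profiles to its error-free response pattern."""
    n_skills = len(q_matrix[0])
    return {
        alpha: tuple(ideal_response(alpha, row) for row in q_matrix)
        for alpha in product((0, 1), repeat=n_skills)
    }

patterns = ideal_response_patterns(Q)
for alpha, pattern in sorted(patterns.items()):
    print(alpha, "->", pattern)
```

In this toy example each skill has an item that measures it alone, so every proficiency class produces a different ideal response pattern. Changing an entry of Q changes which classes can be told apart, which gives some intuition for why a misspecified Q-matrix can lead to misclassified examinees.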
Jin, I.H., Yun, J.H., Kim, H.J., and Jeon, M.J. (2023) Latent Space Accumulator Model for Analyzing Bipartite Networks with Its Connection Time and Its Applications to Item Response Data with Response Time
Stat
https://doi.org/10.1002/sta4.632
Response time has attracted increased interest in educational and psychological assessment for, e.g., measuring test takers' processing speed, improving the measurement accuracy of ability, and understanding aberrant response behavior. Most models for response time analysis are based on a parametric assumption about the response time distribution. The Cox proportional hazard model has been utilized for response time analysis for the advantages of not requiring a distributional assumption of response time and enabling meaningful interpretations with respect to response processes. In this paper, we present a new version of the proportional hazard model, called a latent space accumulator model, for cognitive assessment data based on accumulators for two competing response outcomes, such as correct vs. incorrect responses. The proposed model extends a previous accumulator model by capturing dependencies between respondents and test items across accumulators in the form of distances in a two-dimensional Euclidean space. A fully Bayesian approach is developed to estimate the proposed model. The utilities of the proposed model are illustrated with two real data examples.
Kim, H.J., Jeon, Y.J., Kim, H.C., Jin, I.H., and Jung, S.J. (2022) Application of latent space item response model to clustering stressful life events and the Beck Depression Inventory-II: results from Korean epidemiological survey data
Epidemiology and Health
https://doi.org/10.4178/epih.e2022093
OBJECTIVES: According to previous findings, stressful life events (SLEs) and their subtypes are associated with depressive symptoms. However, few studies have explored potential models for these events and incidental symptoms of depression.
METHODS: Participants (3,966 men; 5,709 women) were recruited from the Cardiovascular and Metabolic Diseases Etiology Research Center cohort. SLEs were measured using a 47-item Life Experiences Survey (LES) with a standardized protocol. Depressive symptoms were assessed using the Beck Depression Inventory-II (BDI-II). Joint latent space item response models were applied by gender and age group (<50 vs. ≥50 years old).
RESULTS: Among the LES items, death or illness of close relatives, legal problems, sexual difficulties, family relationships, and social relationships shared latent positions with major depressive symptoms regardless of gender or age. We also observed a gender-specific domain: occupational and family-related items.
CONCLUSIONS: By projecting LES and BDI-II data onto the same interaction map for each subgroup, we could specify the associations between specific LES items and depressive symptoms.