Heterogeneity of diagnosis and documentation of post-COVID conditions in primary care: A machine learning analysis

Author(s)

Hendrix, Nathaniel, Parikh, Rishi V, Taskier, Madeline, Walter, Grace, Rochlin, Ilia, Saydah, Sharon, Koumans, Emilia H, Rincón-Guevara, Oscar, Rehkopf, David H, and Phillips, Robert L

Topic(s)

Role of Primary Care, and Achieving Health System Goals

Keyword(s)

Population Health, and Quality of Care

Source

PLOS One

Abstract

Background: Post-COVID conditions (PCC) have proven difficult to diagnose. In this retrospective observational study, we aimed to characterize the variation in PCC diagnoses across clinicians from several methodological angles and to determine whether natural language classifiers trained on clinical notes can reconcile differences in diagnostic definitions.

Methods: We used data from 519 primary care clinics across the United States that participated in the American Family Cohort registry between October 1, 2021 (when the ICD-10 code for PCC was activated), and November 1, 2023. There were 6,116 patients with a diagnostic code for PCC (U09.9) and 5,020 with diagnostic codes for both PCC and COVID-19. We explored these data using four outcomes: (1) the time between COVID-19 and PCC diagnostic codes; (2) the count of patients with PCC diagnostic codes per clinician; (3) the patient-specific probability of a PCC diagnostic code based on patient and clinician characteristics; and (4) the performance of a natural language classifier trained on notes from 5,000 patients annotated by two physicians to indicate probable PCC.

Results: Of patients with diagnostic codes for both PCC and COVID-19, 61.3% were diagnosed with PCC less than 12 weeks after their initial recorded COVID-19 diagnosis. Clinicians in the top 1% of diagnostic propensity accounted for more than a third of all PCC diagnoses (35.8%). Comparing LASSO logistic regressions predicting documentation of a PCC diagnosis, a log-likelihood test showed significantly better fit when clinician and practice site indicators were included (p < 0.0001). Inter-rater agreement between the physician annotators on PCC diagnosis was moderate (Cohen's kappa: 0.60), and performance of the natural language classifiers was marginal (best AUC: 0.724, 95% credible interval: 0.555–0.878).

Conclusion: We found evidence of substantial disagreement among clinicians on diagnostic criteria for PCC. The variation in diagnostic rates across clinicians points to the possibility of both under- and over-diagnosis of patients.
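The evaluation workflow described in the abstract (physician annotation of notes, Cohen's kappa for inter-rater agreement, and AUC for classifier discrimination) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function evaluate_pcc_classifier, its inputs (notes, labels_annotator_a, labels_annotator_b), and the TF-IDF plus logistic-regression model are hypothetical stand-ins for the natural language classifier the study actually used.

    # Illustrative sketch only; not the study's model. Inputs are hypothetical:
    # notes: list of clinical note texts; labels_annotator_a/b: 0/1 probable-PCC
    # labels assigned by the two physician annotators.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import cohen_kappa_score, roc_auc_score
    from sklearn.model_selection import train_test_split

    def evaluate_pcc_classifier(notes, labels_annotator_a, labels_annotator_b):
        # Inter-rater agreement between the two physician annotators.
        kappa = cohen_kappa_score(labels_annotator_a, labels_annotator_b)

        # Train a simple note classifier on one annotator's labels.
        X_train, X_test, y_train, y_test = train_test_split(
            notes, labels_annotator_a, test_size=0.2,
            random_state=0, stratify=labels_annotator_a,
        )
        vectorizer = TfidfVectorizer(max_features=20_000, ngram_range=(1, 2))
        clf = LogisticRegression(max_iter=1000)
        clf.fit(vectorizer.fit_transform(X_train), y_train)

        # Discrimination of the classifier on held-out notes.
        auc = roc_auc_score(
            y_test, clf.predict_proba(vectorizer.transform(X_test))[:, 1]
        )
        return kappa, auc

A kappa around 0.6, as reported in the abstract, indicates moderate agreement between annotators, which in turn bounds how cleanly any classifier trained on those labels can separate probable PCC from non-PCC notes.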