Incorporating machine learning and social determinants of health indicators into prospective risk adjustment for health plan payments

2020

Author(s)
Irvin, Jeremy A, Kondrich, Andrew A, Ko, Michael, Rajpurkar, Pranav, Haghgoo, Behzad, Landon, Bruce E, Phillips, Robert L, Petterson, Stephen M, Ng, Andrew Y, and Basu, Sanjay

Topic(s)
Role of Primary Care

Keyword(s)
Payment

Volume
BMC Public Health

Source
BMC Public Health

Background
Risk adjustment models are employed to prevent adverse selection, anticipate budgetary reserve needs, and offer care management services to high-risk individuals. We aimed to address two unknowns about risk adjustment: whether machine learning (ML) and the inclusion of social determinants of health (SDH) indicators improve prospective risk adjustment for health plan payments.

Methods
We employed a 2-by-2 factorial design comparing: (i) linear regression versus ML (gradient boosting) and (ii) demographics and diagnostic codes alone versus additional ZIP code-level SDH indicators. Healthcare claims from privately insured US adults (2016–2017) and Census data were used for analysis. Data from 1.02 million adults were used for derivation, and data from 0.26 million were used to assess performance. Model performance was measured using the coefficient of determination (R²), discrimination (C-statistic), and mean absolute error (MAE) for the overall population, and the predictive ratio and net compensation for vulnerable subgroups. We provide 95% confidence intervals (CI) around each performance measure.

Results
Linear regression without SDH indicators achieved moderate determination (R² 0.327; 95% CI: 0.300, 0.353), error ($6992; 95% CI: $6889, $7094), and discrimination (C-statistic 0.703; 95% CI: 0.701, 0.705). ML without SDH indicators improved all metrics (R² 0.388; 95% CI: 0.357, 0.420; error $6637; 95% CI: $6539, $6735; C-statistic 0.717; 95% CI: 0.715, 0.718), reducing misestimation of cost by $3.5 M per 10,000 members. Among people living in areas with high poverty, high wealth inequality, or high prevalence of uninsured, SDH indicators reduced underestimation of cost, improving the predictive ratio by 3% (~$200/person/year).

Conclusions
ML improved risk adjustment models, and the incorporation of SDH indicators reduced underpayment in several vulnerable populations.
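The performance measures named in the abstract can be illustrated with a minimal Python sketch. The abstract does not give formulas, so the definitions below are commonly used ones and are assumptions: the predictive ratio is taken as total predicted cost divided by total actual cost within a subgroup (values below 1 indicate underestimation), and net compensation as the mean of predicted minus actual cost. The function names and the toy cost values are hypothetical and do not come from the study data. The last lines assume that the reported $3.5 M figure reflects the difference in MAE between the two models scaled to 10,000 members; the rounded MAE values give about $3.55 M under that reading.

```python
from statistics import mean


def r_squared(actual, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    avg = mean(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - avg) ** 2 for a in actual)
    return 1 - ss_res / ss_tot


def mean_absolute_error(actual, predicted):
    """Average absolute gap between predicted and actual cost."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


def predictive_ratio(actual, predicted):
    """Assumed definition: total predicted cost / total actual cost."""
    return sum(predicted) / sum(actual)


def net_compensation(actual, predicted):
    """Assumed definition: mean of (predicted - actual) cost per person."""
    return mean(p - a for a, p in zip(actual, predicted))


# Hypothetical annual costs (USD) for a four-person subgroup.
toy_actual = [1000.0, 5000.0, 20000.0, 4000.0]
toy_predicted = [1500.0, 4000.0, 15000.0, 3500.0]

print("R^2:", round(r_squared(toy_actual, toy_predicted), 4))
print("MAE:", mean_absolute_error(toy_actual, toy_predicted))
print("Predictive ratio:", predictive_ratio(toy_actual, toy_predicted))
print("Net compensation:", net_compensation(toy_actual, toy_predicted))

# MAE values reported in the abstract (USD per person).
mae_linear = 6992
mae_ml = 6637
# Assumed reading of the "$3.5 M per 10,000 members" figure.
mae_reduction_per_10k = (mae_linear - mae_ml) * 10_000
print("MAE reduction per 10,000 members:", mae_reduction_per_10k)
```

On the toy subgroup, the predictive ratio is 0.8 and the net compensation is -1500, which is the pattern of underestimation that the abstract reports SDH indicators reducing in vulnerable populations. The C-statistic is left out because the abstract does not state which binary outcome it was computed against.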
ABFM Research
1987 Pilot study using ‘dangerous answers’ as scoring technique on certifying examinations
2011 Variation over time in preventable hospitalization rates across counties
2025 Heterogeneity of diagnosis and documentation of post-COVID conditions in primary care: A machine learning analysis
2025 Methods for measuring comprehensiveness in primary care: a narrative review