Special topic: Cultivating new quality productive forces to advance high-level self-reliance in science and technology

Frontier advances in dynamic disease risk prediction modeling methods and precision prevention

  • SONG Yuxin,
  • YE Qian,
  • ZHAO Mengsheng,
  • ZHANG Longyao,
  • WEI Yongyue
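  • 1. Center for Public Health and Epidemic Preparedness & Response, Peking University, Beijing 100191, China;
    2. Department of Biostatistics, School of Public Health, Nanjing Medical University, Nanjing 211166, China;
    3. Department of Epidemiology and Biostatistics, School of Public Health, Peking University, Beijing 100191, China;
    4. Key Laboratory of Epidemiology of Major Diseases (Peking University), Ministry of Education, Beijing 100191, China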
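SONG Yuxin, doctoral candidate; research interest: statistical methods for dynamic disease risk prediction models; e-mail: songpku2023@bjmu.edu.cn. YE Qian (co-first author), master's student; research interest: statistical analysis methods for non-independent data; e-mail: yeqian@stu.njmu.edu.cn. WEI Yongyue (corresponding author), researcher; research interests: theory, methods, and applications of statistical analysis for health and medical big data, and theory, methods, and applications of disease risk and prognosis prediction models; e-mail: ywei@pku.edu.cn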

Received date: 2024-04-17

  Revised date: 2024-06-07

  Online published: 2024-07-09

Funding

General Program of the National Natural Science Foundation of China (81973142)

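Abstract

Dynamic disease risk prediction models will be at the core of precision prevention strategies. Over the past 20 years, research on disease risk prediction models aimed at precision prevention has grown rapidly. However, the models now in wide use are static: they do not fully account for how changes in predictors over time affect disease risk, so calibration drift is inevitable. This paper reviews modeling methods for dynamic risk prediction and reaches the following conclusions. As healthcare big data become increasingly interconnected and shared, and as new statistical and artificial intelligence methods continue to emerge, the key directions for future methodological research on prediction models will be to uncover richer predictors, to identify their modes of action more accurately, and to develop interpretable disease risk prediction models that fit the biomedical context and real-world scenarios. Such models would enable joint prevention of comorbid diseases and shared prevention across different diseases, and ultimately achieve individualized precision prevention across the spectrum of diseases.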

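Cite this article

SONG Yuxin, YE Qian, ZHAO Mengsheng, ZHANG Longyao, WEI Yongyue. Frontier advances in dynamic disease risk prediction modeling methods and precision prevention[J]. Science & Technology Review (科技导报), 2024, 42(12): 75-91. DOI: 10.3981/j.issn.1000-7857.2024.05.00543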
