Prediction of Breast Cancer Treatment-Induced Fatigue by Machine Learning Using Genome Wide Association Data
Based on a machine learning analysis of genomic data from 2,799 survivors of early-stage breast cancer, this cohort study identifies factors predictive of fatigue related to anticancer treatments.
Background: We aimed to predict fatigue after breast cancer (BC) treatment, using machine learning on clinical covariates and germline genome-wide data. Methods: We accessed germline genome-wide data of 2799 early-stage BC patients from CANTO (NCT01993498). The primary endpoint was defined as scoring zero at diagnosis and higher than quartile 3 at 1 year after primary treatment completion, on the EORTC-QLQ-C30 for overall fatigue and on the multidimensional EORTC-QLQ-FA12 for physical, emotional and cognitive fatigue. First, we tested univariate associations of each endpoint with clinical variables and genome-wide variants. Then, using pre-selected clinical (false discovery rate < 0.05) and genomic (p < 0.001) variables, a multivariate preconditioned random forest regression model was built and validated on a hold-out subset to predict fatigue. Gene set enrichment analysis identified key biological correlates (MetaCore). All statistical tests were two-sided. Results: Statistically significant clinical associations were found only with emotional and cognitive fatigue, including receipt of chemotherapy, anxiety and pain. Some SNPs showed some degree of association (p < 0.001) with the different fatigue endpoints, although there were no genome-wide significant (p < 5.00 × 10⁻⁸) associations. Only for cognitive fatigue was the predictive ability of the genomic multivariate model statistically significantly better than random (area under the curve [AUC] = 0.59, p = 0.01), and it improved marginally when clinical variables were added (AUC = 0.60, p = 0.005). SNPs found to be associated (p < 0.001) with cognitive fatigue belonged to genes linked to inflammation (false discovery rate adjusted p = 0.03), cognitive disorders (p = 1.51 × 10⁻¹²) and synaptic transmission (p = 6.28 × 10⁻⁸). Conclusions: Genomic analyses in this large cohort of BC survivors suggest a possible genetic role for severe cognitive fatigue that warrants further exploration.
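The three quantitative steps named in the Methods (the binary endpoint, the false discovery rate filter on clinical variables, and the AUC used to judge predictions) can be illustrated with a minimal Python sketch. It is not the authors' pipeline. All function names are hypothetical, the quartile-3 threshold is passed in because the abstract does not say how it was derived, and the Benjamini-Hochberg procedure is an assumption, since the abstract names only "false discovery rate" without specifying the method.

```python
from typing import List, Sequence


def severe_fatigue_endpoint(baseline: float, followup: float,
                            q3_threshold: float) -> bool:
    """Endpoint as worded in the abstract: a score of zero at diagnosis
    and a score above quartile 3 at 1 year after treatment completion.

    q3_threshold is supplied by the caller (assumption: the abstract does
    not state which distribution the quartile was taken from).
    """
    return baseline == 0 and followup > q3_threshold


def benjamini_hochberg(pvalues: Sequence[float], alpha: float = 0.05) -> List[int]:
    """Indices selected by the Benjamini-Hochberg step-up procedure at level alpha.

    Assumption: BH is used here only as a common way to control the
    false discovery rate; the study's exact method is not given.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Largest rank k whose sorted p-value satisfies p_(k) <= (k / m) * alpha.
    cutoff = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank / m * alpha:
            cutoff = rank
    # Every hypothesis ranked at or below the cutoff is selected.
    return sorted(order[:cutoff])


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUC as the probability that a randomly chosen case receives a higher
    score than a randomly chosen control, with ties counted as one half."""
    cases = [s for s, y in zip(scores, labels) if y]
    controls = [s for s, y in zip(scores, labels) if not y]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))
```

Under this reading, an AUC of 0.5 corresponds to random ranking, which is why the reported values of 0.59 and 0.60 indicate modest discrimination even though they differ significantly from chance.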