MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) scientists, in collaboration with Limor Appelbaum, a scientist in the Department of Radiation Oncology at Beth Israel Deaconess Medical Center (BIDMC), have developed two machine-learning models for the early detection of pancreatic cancer. The two models, the PRISM neural network (PrismNN) and a logistic regression model (PrismLR), both outperformed current diagnostic methods. While standard screening detects about 10% of pancreatic cancer cases, PRISM can detect 35%.
The team developed these models using data from over 5 million patients, drawn from institutions across the United States. To overcome geographical differences in patient data, the team partnered with a federated network company, which helped ensure the models’ reliability and applicability across a diverse range of populations, geographical locations, and demographic groups.
“The key advancement of the PRISM models is their application and validation on an extensive database, which is greater than the scope of most prior research in the field,” explained Kai Jia, an MIT PhD student in electrical engineering and computer science, MIT CSAIL affiliate, and first author on an open-access paper in eBioMedicine.
David Avigan, a Harvard Medical School professor, hailed the study as a huge step towards redefining the approach to identifying risk profiles for cancer. This approach may enable the identification of high-risk patients who could benefit from early intervention, potentially leading to new preventative strategies against cancer.
Development of the PRISM models began six years ago, motivated by firsthand experience with the limitations of current diagnostic procedures. The team collaborated closely with Appelbaum to better understand both the medical and the machine-learning aspects of the problem and to build a highly accurate and transparent model.
Both models, PrismNN and PrismLR, assess the risk of pancreatic ductal adenocarcinoma (PDAC) by analysing electronic health record (EHR) data such as patient demographics, diagnoses, medications, and lab results. “The hypothesis was that these records contained hidden clues, subtle signs and symptoms that could act as early warning signals of pancreatic cancer,” says Appelbaum.
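For readers curious about what a logistic regression risk score over health-record features can look like in practice, here is a minimal Python sketch. It is not the team's model: the feature names, weights, baseline, and threshold below are all invented for illustration, and the real models are trained on far richer data.

```python
import math

# Hypothetical binary features and weights, invented for illustration only.
# These are NOT the features or coefficients of PrismLR.
WEIGHTS = {
    "age_over_60": 0.8,             # demographics
    "new_onset_diabetes": 1.1,      # diagnoses
    "pancreatitis_diagnosis": 1.4,  # diagnoses
    "abnormal_liver_panel": 0.6,    # lab results
}
BIAS = -6.0  # assumed low baseline risk


def risk_score(features: dict[str, float]) -> float:
    """Map a patient's features to a score in (0, 1) with the logistic function."""
    z = BIAS + sum(w * features.get(name, 0.0) for name, w in WEIGHTS.items())
    return 1.0 / (1.0 + math.exp(-z))


def flag_high_risk(patients: dict[str, dict[str, float]], threshold: float) -> list[str]:
    """Return the sorted IDs of patients whose score meets or exceeds the threshold."""
    return sorted(
        pid for pid, feats in patients.items() if risk_score(feats) >= threshold
    )


# Three made-up patient records keyed by an arbitrary ID.
patients = {
    "A": {
        "age_over_60": 1,
        "new_onset_diabetes": 1,
        "pancreatitis_diagnosis": 1,
        "abnormal_liver_panel": 1,
    },
    "B": {"age_over_60": 1},
    "C": {},
}

flagged = flag_high_risk(patients, threshold=0.05)
print(flagged)  # ['A']
```

In this toy setup only patient A, who has every hypothetical risk feature, crosses the assumed 5% threshold. A linear model of this kind is easy to inspect, since each weight shows how much a feature moves the score.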
Despite showing promise, the PRISM models have some limitations. They are currently suited mainly to U.S. datasets and will require adjustment and testing before global use. The next step for the team involves extending the models’ applicability to international datasets and incorporating additional biomarkers for a more refined risk assessment.
“A future goal for us is to facilitate the models’ incorporation in routine healthcare settings. The vision is for these models to function unobtrusively within healthcare systems, analysing patient data automatically and alerting physicians to high-risk cases without increasing their workload,” says Jia. “An integrated machine-learning model with the EHR system could equip physicians with early alerts for high-risk patients, potentially leading to interventions before symptoms appear. We are keen to apply our techniques in the real world to aid people in leading longer, healthier lives.”