

AI Machine Learning Predicts Alzheimer’s Disease Risk

AI algorithm uses genetic, non-genetic, and health records to predict AD risk.


The most common cause of dementia worldwide is Alzheimer’s disease (AD), a neurodegenerative disorder with no known cure. A new study published in Scientific Reports uses artificial intelligence (AI) machine learning (ML) and data from electronic health records (EHRs) to identify the important predictors for Alzheimer’s disease and finds that a person’s genetics outperforms age as a predictor for individuals who are 65 years of age and older.

“Machine learning (ML) methods provide an attractive and effective alternative to traditional statistical regression models, especially in situations where one has a large number of features or predictors,” wrote the authors of the National Institutes of Health (NIH) funded study led by Xiaoyi Raymond Gao at The Ohio State University College of Medicine, with Ohio State researchers Marion Chiariglione, Ke Qin and Douglas Scharre; the University of Miami researchers Karen Nuytemans and Eden Martin; and Yi-Ju Li at Duke University.

Globally, Alzheimer’s disease accounts for an estimated 60-70 percent of the over 55 million people with dementia and affects women disproportionately according to the World Health Organization (WHO).

In the U.S., 6.7 million people aged 65 and older are currently living with AD, almost two-thirds of whom are women, and that figure is projected to rise to an estimated 12.7 million Americans by 2050, according to the Alzheimer’s Association.

Alzheimer’s disease was first identified in 1906 by German psychiatrist and neurologist Alois Alzheimer. He had discovered abnormal clumps and tangled bundles of fibers in the brain tissue of his female patient, Auguste Deter, who died at 51 years of age. Alois Alzheimer was treating her at a Frankfurt psychiatric hospital for memory loss, irrational behavior, and communication problems. Today, those abnormal clumps are known as amyloid plaques and the tangled bundles of fibers as neurofibrillary or tau tangles.

In addition to issues with memory, other AD symptoms include issues in thinking, reasoning, decision-making, judgment, and performing daily routine tasks according to the Mayo Clinic. Alzheimer’s disease may cause changes in personality and behavior with symptoms that include depression, delusions, changes in sleep habits, loss of inhibitions, mood swings, anger, aggression, loss of interest in activities, social withdrawal, and wandering. Although there is no cure, the progression of AD symptoms may be slowed with medication per the Mayo Clinic.

Changes in the brain due to Alzheimer’s disease may occur a decade or more before there are any symptoms according to the National Institute on Aging. Early detection of the disease enables AD patients and their caregivers to plan for future care services as well as provide an opportunity for treatment of the symptoms that may help improve the quality of life.

For this new study, the researchers aimed to create an explainable AI model by using a popular machine learning library called eXtreme Gradient Boosting (XGBoost) and Shapley Additive exPlanations (SHAP), a state-of-the-art algorithm for machine learning explainability that reverse-engineers the output of a predictive algorithm based on game-theoretically optimal Shapley values. SHAP computes the contribution of each feature to the prediction, which makes it a useful tool for visualizing and interpreting a model’s output. The researchers used over 11,000 features and predictors.
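To make the idea of a per-feature contribution concrete, here is a minimal sketch of how Shapley values split one prediction among its features. It is not the study's code and not how the SHAP library handles tree models such as XGBoost, which rely on a faster tree-specific algorithm. The function `toy_risk`, its coefficients, the feature names, and the baseline values are all hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one prediction.

    A feature that is "absent" from a coalition takes its baseline value.
    This brute-force version is only practical for a handful of features.
    """
    names = list(x)
    n = len(names)
    contributions = {}
    for name in names:
        others = [f for f in names if f != name]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without = {f: (x[f] if f in subset else baseline[f]) for f in names}
                with_feature = dict(without)
                with_feature[name] = x[name]
                # Marginal effect of adding this feature to the coalition
                total += weight * (predict(with_feature) - predict(without))
        contributions[name] = total
    return contributions


def toy_risk(features):
    """Hypothetical risk score; the coefficients are invented for illustration."""
    score = 0.03 * (features["age"] - 60)
    score += 0.5 * features["prs"]
    score += 0.2 * features["family_history"]
    if features["age"] >= 65:
        # Invented interaction: genetic risk counts for more at older ages
        score += 0.4 * features["prs"]
    return score


# Hypothetical individual and hypothetical reference ("baseline") individual
person = {"age": 72, "prs": 1.5, "family_history": 1}
baseline = {"age": 60, "prs": 0.0, "family_history": 0}

values = shapley_values(toy_risk, person, baseline)
for feature, value in sorted(values.items(), key=lambda kv: -abs(kv[1])):
    print(f"{feature}: {value:+.2f}")
```

The contributions add up to the difference between this person's score and the baseline score, which is the property that lets each prediction be read as a sum of signed feature effects. In this toy setup the interaction between age and genetic risk is shared equally between the two features.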

“The combination of XGBoost and SHAP can be used as an explainable ML model, which maintains the accuracy of ML models while providing the distribution of the effects with direction for each variable to enhance the interpretability of the results,” wrote the scientists.

The researchers developed polygenic risk scores (PRSs) for Alzheimer’s disease from the Alzheimer Disease Genetics Consortium database and age-at-onset (AAO) for AD using the UK Biobank database.

“The Apolipoprotein-E gene (APOE) is the most well-known genetic risk factor for AD, but genome-wide association studies (GWASs) have identified more than 40 genetic loci to date for AD,” the researchers shared. “In recent years, polygenic risk scores (PRSs) have been proposed to aggregate genetic effects, from small to large, across the genome into a single measure of risk for each individual.”
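In its most common form, a polygenic risk score is a weighted sum: each variant's risk-allele count (0, 1, or 2) is multiplied by an effect size estimated in a genome-wide association study, and the products are added up. The sketch below shows that arithmetic only. The article does not describe how the study selected or weighted variants, so the variant names, effect sizes, and genotype counts here are hypothetical.

```python
def polygenic_risk_score(effect_sizes, dosages):
    """Weighted sum of risk-allele counts using per-variant effect sizes.

    Variants missing from `dosages` are treated as zero copies, which is a
    simplifying assumption made for this sketch.
    """
    return sum(beta * dosages.get(variant, 0) for variant, beta in effect_sizes.items())


# Hypothetical effect sizes: one large effect and two small ones
effect_sizes = {"variant_a": 1.2, "variant_b": 0.15, "variant_c": -0.08}

# Hypothetical genotype for one individual (copies of the risk allele)
dosages = {"variant_a": 1, "variant_b": 2, "variant_c": 0}

score = polygenic_risk_score(effect_sizes, dosages)
print(f"PRS: {score:.2f}")
```

A single number like this can then be fed into a predictive model alongside non-genetic features such as age.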

AI machine learning models for predicting Alzheimer’s were developed using International Classification of Diseases Tenth Revision (ICD-10) codes from electronic health records (EHRs) and genetic data in large-scale biorepositories.

“To our knowledge, this is the first report to develop predictive models for AD using genetic, non-genetic information, and ICD-10 codes from EHR in a large-scale cohort study using a modern explainable ML framework,” the researchers wrote.

The researchers found that age, income, and polygenic risk scores were the top AD risk factors that improved prediction accuracy.

Other important risk factors include a family history of AD/dementia, hearing issues, diabetes, and blood pressure (higher systolic and lower diastolic). Interestingly, they found that being underweight, not obese, increased AD risk and may be a useful pre-clinical biomarker.

An important discovery was that the electronic health record can provide key data for predicting Alzheimer’s disease. The AI model identified the top 20 features for both the 40+ and 65+ age groups. The scientists point out that feature importance does not indicate a causal relationship.

The AI machine learning model revealed that age ranks first among all features in the age 40+ group and that the genetic effects reflected in the polygenic risk scores become more important than age for individuals aged 65 and older for predicting Alzheimer’s disease.

Copyright © 2023 Cami Rosso All rights reserved.
