Using Artificial Intelligence to Identify Diseases via Health Records

A study probes disease phenotyping using electronic health records.

Source: Doctor-a/Pixabay

Clinical research evaluates the safety and efficacy of medications, devices, diagnostics, and treatments for the diseases and conditions that affect human health and longevity. Last week, researchers at the Icahn School of Medicine at Mount Sinai published a study in Patterns that shows how an artificial intelligence (AI) deep learning algorithm called Phe2vec can help accelerate clinical research by learning to identify disease phenotypes from patient electronic health records (EHR).

“Phe2vec aims to contribute to the next generation of clinical systems that use machine learning to effectively support clinicians in their activities,” the researchers wrote. “These systems capable of scaling to a large number of diseases, patients, and health data promise to offer a more holistic way to examine disease complexity and to improve clinical practice and medical research.”

In genetics, phenotypes are the observable physical characteristics of an organism. The combination of phenotype data with genetic information enables a more precise diagnosis of hereditary diseases and conditions. Patient electronic health records are an important source of phenotype data for clinical research.

Currently, phenotyping algorithms are manually hard-coded by researchers and require advanced knowledge of the target phenotype or disease. Furthermore, their results need to be validated, which is a time-consuming process.

To improve on the existing process, the researchers used machine learning, a branch of artificial intelligence, so that phenotype definitions are learned from training data instead of being hard-coded. Phe2vec is a scalable artificial neural network that uses an unsupervised learning framework for EHR-based phenotyping.

“Phe2vec derives vector-based representations, i.e., embeddings, of medical concepts to define disease phenotypes using the semantic closeness in the embedding space to a seed concept (e.g., an ICD code),” wrote the scientists. “Embeddings are then aggregated at the patient-level to identify populations related to a specific disease based on distance from the phenotype in the embedding space.”
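The two steps in that description can be illustrated with a toy sketch. This is not the researchers' implementation: the two-dimensional vectors below are invented, the concept labels are illustrative, the helper names are hypothetical, and cosine similarity with simple averaging is assumed as the measure of "closeness" and the way of aggregating, which the quote does not specify.

```python
import math

# Toy 2-D embeddings of medical concepts. All values are invented for
# illustration; a real system would learn them from EHR data.
EMBEDDINGS = {
    "ICD:E11": [0.90, 0.10],             # seed concept: a type 2 diabetes code
    "lab:HbA1c": [0.85, 0.15],
    "med:metformin": [0.80, 0.20],
    "ICD:G35": [0.10, 0.90],             # a multiple sclerosis code
    "med:interferon_beta": [0.20, 0.80],
}

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def mean_vector(vectors):
    """Element-wise average of a list of vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]

def build_phenotype(seed, embeddings, k=3):
    """Return the k concepts closest to the seed (the seed itself included)."""
    seed_vec = embeddings[seed]
    ranked = sorted(embeddings,
                    key=lambda concept: cosine(seed_vec, embeddings[concept]),
                    reverse=True)
    return ranked[:k]

def patient_score(patient_concepts, phenotype_concepts, embeddings):
    """Similarity between a patient's averaged concepts and the phenotype."""
    patient_vec = mean_vector([embeddings[c] for c in patient_concepts])
    phenotype_vec = mean_vector([embeddings[c] for c in phenotype_concepts])
    return cosine(patient_vec, phenotype_vec)

phenotype = build_phenotype("ICD:E11", EMBEDDINGS)
score_a = patient_score(["med:metformin", "lab:HbA1c"], phenotype, EMBEDDINGS)
score_b = patient_score(["ICD:G35", "med:interferon_beta"], phenotype, EMBEDDINGS)
print(phenotype, score_a > score_b)
```

In this toy setting, the patient whose record contains diabetes-related concepts scores higher against the diabetes phenotype than the patient whose record contains multiple sclerosis concepts. A cohort could then be selected by thresholding the score, with the threshold being a further assumption.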

For the study, the researchers de-identified the electronic health records of over 1.9 million patients from the Mount Sinai Health System database and aggregated a range of information for each patient, such as vital signs, lab tests, medications, procedure codes, diagnoses, and clinical notes.

“This method showed performance comparable or superior to that of other widely adopted EHR phenotyping approaches,” reported the researchers.

The researchers compared Phe2vec with the existing gold standard, PheKB, for ten conditions: abdominal aortic aneurysm, atrial fibrillation, attention deficit hyperactivity disorder (ADHD), autism (ASD), Crohn disease, dementia, herpes zoster, multiple sclerosis, sickle cell disease, and type 2 diabetes mellitus (T2D).

According to the researchers, Phe2vec performed as well as or better than the PheKB algorithms, with a higher overall positive predictive value (PPV).

“Phe2vec obtained better PPV in nine diseases, with highest improvements for herpes zoster and T2D, showing qualitative performances on par with manual phenotypes,” the researchers reported. “Overall, Phe2vec and PheKB achieved an average PPV of 0.94 and 0.82, respectively.”
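For readers unfamiliar with the metric, positive predictive value is the share of patients flagged by an algorithm who truly have the condition: true positives divided by the sum of true positives and false positives. A minimal sketch follows; the function name and the counts are hypothetical and are not taken from the study.

```python
def positive_predictive_value(true_positives, false_positives):
    """PPV = TP / (TP + FP): the fraction of flagged patients who are true cases."""
    flagged = true_positives + false_positives
    if flagged == 0:
        raise ValueError("PPV is undefined when no patients are flagged")
    return true_positives / flagged

# Hypothetical chart review: 10 flagged patients, 9 confirmed as true cases.
example_ppv = positive_predictive_value(true_positives=9, false_positives=1)
print(example_ppv)
```

A PPV closer to 1 means fewer of the patients placed in a cohort turn out not to have the disease on review.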

By applying deep learning to derive disease phenotypes from patient electronic health records, the researchers have developed an unsupervised method that can identify cohorts for any target disease and that performs on par with or better than existing methods.

“Phe2vec aims to contribute to the next generation of clinical systems that use machine learning to effectively support clinicians in their activities,” the researchers concluded. “These systems capable of scaling to a large number of diseases, patients, and health data promise to offer a more holistic way to examine disease complexity and to improve clinical practice and medical research.”

Copyright © 2021 Cami Rosso All rights reserved.