AI can help physicians predict medical events

Artificial intelligence will never replace a doctor.

However, researchers at the Department of Energy’s Pacific Northwest National Laboratory have taken a big step toward the day when AI can help physicians predict medical events.

A new approach developed by PNNL scientists improves the accuracy of patient diagnosis by up to 20 percent compared to other embedding approaches.

The PNNL approach seeks to capture and re-create the types of connections physicians make naturally when they apply a lifetime of learning and knowledge to the patient in front of them in the exam room.

The goal: Use the laboratory’s robust AI capabilities in machine learning and deep learning to improve patient care and save lives.

PNNL scientists recently discussed their new approach in a paper presented at the Data Science for Healthcare workshop at the SIGKDD Conference on Knowledge Discovery and Data Mining.

At the heart of the development is a data set, created by PNNL in collaboration with Stanford University, of more than 300,000 medical concepts defined by SNOMED Clinical Terms, a collection of standard medical terms, codes, synonyms and definitions used by medical researchers and practitioners.

PNNL developed a graph-based learning method grounded in these terms that outperformed current models. The code is available as an open-source download.

“If you think it’s hard translating doctors’ handwriting, try translating their medical knowledge into computer speak,” observes Robert Rallo, a computer scientist at PNNL who leads the PNNL team applying artificial intelligence to health care.

“The tough part is combining multiple types of data. Computer-friendly data like blood work numbers or diagnosis codes are easier than unstructured data like chart notes or images from X-rays or MRIs.” 

Rallo and the rest of the PNNL team are creating ways to fuse the many different types of health care data with an AI tool known as a knowledge graph as part of the PNNL-funded project Deep Care.

“A knowledge graph is what doctors have in their minds when they are diagnosing you,” said Rallo. “Doctors see relationships based on years of training and experience.

“This is their mental model that creates links between symptoms and diseases. We are translating a symbolic representation of medical knowledge like that into something we can feed to machine learning algorithms together with patient data.”
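To make the idea concrete, here is a minimal sketch of a knowledge graph and of one common way to turn it into input for machine learning: sampling random walks over the graph so that each walk can be treated as a "sentence" of related concepts by an embedding model. The concept names, links and function names below are made up for illustration and are not taken from SNOMED Clinical Terms or from the PNNL code.

```python
import random

# Toy knowledge graph: each concept maps to the concepts it is linked to.
# The concepts and links are hypothetical, chosen only for illustration.
GRAPH = {
    "fever": ["infection", "neutropenia"],
    "cough": ["infection"],
    "infection": ["fever", "cough", "antibiotic"],
    "neutropenia": ["fever", "chemotherapy"],
    "chemotherapy": ["neutropenia"],
    "antibiotic": ["infection"],
}


def random_walk(graph, start, length, rng):
    """Return a walk of up to `length` concepts that follows links in `graph`."""
    walk = [start]
    while len(walk) < length:
        neighbours = graph[walk[-1]]
        if not neighbours:  # dead end: stop early
            break
        walk.append(rng.choice(neighbours))
    return walk


def build_corpus(graph, walks_per_node, length, seed=0):
    """Generate several walks starting from every concept in the graph."""
    rng = random.Random(seed)
    return [
        random_walk(graph, node, length, rng)
        for node in graph
        for _ in range(walks_per_node)
    ]


corpus = build_corpus(GRAPH, walks_per_node=2, length=5)
for walk in corpus[:3]:
    print(" -> ".join(walk))
```

Concepts that often appear close together in such walks end up with similar embeddings, which is how links between symptoms and diseases can be passed to a learning algorithm alongside patient data.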

PNNL computer scientist Khushbu Agarwal stresses AI will not replace doctors.

Instead, AI will be a decision support tool.

The models will have access to more data and more connections than can be stored in any human brain.

Far more than a database, the models may even detect connections a doctor observing a set of random symptoms may not consider initially. But doctors shouldn’t be expected to take the output of a model at face value.

Sutanay Choudhury, a computer scientist at PNNL, is focused on the interpretability of these models. He is working to build a tool that can explain its reasoning, predictions and recommendations using understandable examples that doctors can interpret. Such explanations increase trust in the model, which the PNNL team envisions will someday be deployed at medical clinics.

In the next phase of its research, the PNNL team is working with a new data set through a collaboration between the Veterans Administration and the Department of Energy. The VA-DOE Big Data Science Initiative created a secure computing environment for analysing medical data and includes new approaches to studying suicide, cardiovascular disease and prostate cancer.


Artificial intelligence (AI) gives computer systems the ability to perform tasks usually carried out by the human brain, across many aspects of everyday life. Most clinical physicians are sceptical about the help that AI may provide in their current medical practice.

In this commentary, we aim to give readers insight into our experience, including the benefits and pitfalls, since implementing an AI programme in an infectious disease setting at our hospital. Our aim for this programme was to create a set of tools to support more objective and accurate clinical decision-making.

Clinical scenarios for predictions

Our journey in exploring AI for clinical decision-making started when we reported the high frequency of inappropriate empiric antibiotic treatments in neutropenic patients with bloodstream infection, despite high adherence (87%) to current clinical guidelines [1].

These inappropriate empiric antibiotic treatments had a direct impact on mortality. Offering broad antibiotic treatment to all patients is not a suitable option due to its association with further selection of antimicrobial resistance, unnecessary toxicity and/or increases in healthcare costs.

Against this background, we created our AI programme to test whether, using data retrieved directly from electronic health records (EHRs), we could predict which haematological patients with febrile neutropenia would have multidrug-resistant Gram-negative bacilli (MDR-GNB) infections.

Currently, physicians’ decisions regarding the use of empiric antibiotics are based on only a few studies, which describe risk factors for bacterial multidrug resistance [2,3,4,5].

In these studies, researchers entered data manually and used multivariate logistic regression models to evaluate a limited set of 5–10 variables. Moreover, the target population usually comprises only those patients in whom an infection is eventually documented, even though >30% of febrile neutropenia episodes are due to non-infectious causes.

Now, by training machine learning (ML) or neural network (NN) models on large amounts of data, it is possible to predict the results that cultures will yield at febrile neutropenia onset. This new reality rests on two developments. First, large amounts of data can be retrieved from EHRs in real time. Second, advances in computational performance allow the extensive mathematical operations needed to train ML and NN models on big data.

Availability of data in EHRs

Almost all medical research has been conducted with manually collected data uploaded to statistical programmes. This approach has several weaknesses, such as the large amount of time spent collecting data, analyses limited to small sets of variables and the lack of real-time data.

Yet, as most hospitals have begun transforming their patient care processes by integrating electronic medical records, healthcare data available from computers has reached an annual growth rate that exceeds that of data from any other medium [6,7].

One of the concerns we faced was that EHRs were stored in a complex structure, from different sources and with limited accessibility to build large datasets.

In our experience, building a main dataset that integrates all the EHRs first required the hard task of creating a dictionary to translate the system codes into clinical variables readable by physicians. To accomplish this data mining process, it is imperative that a multidisciplinary team of motivated clinicians and computer scientists work together.

If a clinical institution can gather all its available data, the potential utility of that information for research and AI applications is enormous. The availability of millions of data points on a single topic could revolutionise descriptive and epidemiological studies. Moreover, these data can be used to train ML, NN or deep learning algorithms to predict clinical situations and help clinicians in clinical decision-making.

What should a clinician know about machine learning and neural networks?

The focus of this paper is to explain to physicians the great potential of AI to improve clinical decision-making processes.

It is important to highlight that whilst it is not our objective to detail the mathematical basis of ML or NN, it is our aim to underscore how such a mathematical approach could constructively impact medicine.

Medical research has widely used logistic regression as a statistical model to study different endpoints. Linear prediction divides events into possible or not possible. However, as real life has shown, most events are not so black and white. If we return to the clinical problem presented at the start of this commentary (predicting which patients will have an infection caused by multidrug-resistant bacteria), what would one of the points in Fig. 1, which mimics a regression model, mean?

Fig. 1. Subjective interpretation of the multidimensional problem.

Most may agree that the risk of multidrug-resistant infection depends on antibiotic use, hospital environment and host microbiota status. But these factors in turn depend on others, such as clinical severity, comorbidities, the team’s experience with antibiotic use and hospital characteristics. By the same argument, it stands to reason that those factors depend on still others, and so on. Depending on the weight of each of the factors at play, a particular patient could have a greater or lesser risk of contracting a multidrug-resistant infection. As can be seen here, the theory is rather simple. The practice, however, is not.
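The idea of weighting factors can be sketched with a small logistic model, in which each factor adds its weight to a score that is then mapped to a probability between 0 and 1. The factor names, weights and intercept below are hypothetical, chosen only to illustrate the mechanics; they are not estimates from any study.

```python
import math

# Hypothetical weights, for illustration only (not fitted to real data).
WEIGHTS = {
    "recent_antibiotic_use": 1.2,
    "prior_mdr_colonisation": 1.8,
    "long_hospital_stay": 0.6,
}
INTERCEPT = -3.0


def mdr_risk(patient):
    """Logistic model: weighted sum of binary factors mapped to the 0..1 range."""
    score = INTERCEPT + sum(
        weight * patient.get(name, 0) for name, weight in WEIGHTS.items()
    )
    return 1.0 / (1.0 + math.exp(-score))


low = mdr_risk({})
high = mdr_risk({"recent_antibiotic_use": 1, "prior_mdr_colonisation": 1})
print(f"no risk factors:  {low:.2f}")
print(f"two risk factors: {high:.2f}")
```

With only three factors the calculation is easy to follow by hand. The difficulty described here comes from the number of factors and their dependence on one another, which a simple weighted sum like this one does not capture.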

The sheer number of factors involved in a patient’s care is too great for the human brain to grasp. It is impossible for the mind to predict precisely how a change in any of these factors will affect the risk of a multidrug-resistant infection along the various axes of this complex model. Whilst accurate predictions in complex models could in principle be made using machine learning [8,9], mathematicians found that computers worked extremely slowly and needed too much time. Today, supercomputers can analyse vast amounts of data across many dimensions in seconds.

For our initial data mining process, we used computer clusters to obtain enough computing power. Once the data were organised, a standard computer was used to perform clinical predictions. Our first project trained ML algorithms. We achieved an AUC close to 0.80 for predicting multidrug-resistant infections.

More recently, our team has also been working with NN. NN comprise a series of algorithms linked by consecutive results to recognise specific patterns. This process mimics the way in which the human brain operates. In our experience, this mathematical approach achieved significantly better accuracy when training a model to predict the risk of multidrug-resistant infection at febrile neutropenia onset. Supplementary Table 1 summarises the most important recently published ML and NN approaches in medicine.
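For readers unfamiliar with the AUC (area under the ROC curve), it can be read as the probability that the model gives a higher score to a randomly chosen patient with the infection than to a randomly chosen patient without it. The sketch below computes it directly from that definition; the labels and scores are made up for illustration and the function name is our own.

```python
def auc(labels, scores):
    """Probability that a random positive case outranks a random negative one.

    Ties count as half a win. This is the pairwise (Mann-Whitney) form of
    the area under the ROC curve.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up labels (1 = multidrug-resistant infection) and model scores.
labels = [0, 0, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80]
print(auc(labels, scores))  # 0.75: three of the four positive/negative pairs are ranked correctly
```

An AUC of 0.5 corresponds to chance and 1.0 to perfect ranking, so a value close to 0.80 means that roughly four out of five such pairs are ordered correctly.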

Artificial intelligence tools to help in clinical decision-making processes

The AI smart support system created and implemented in our hospital to facilitate clinical decision-making rests on four tenets: computer identification of a clinical problem; rapid data collection from EHRs and algorithm evaluation; real-time prediction; and links with clinical recommendations. We presented this novel AI tool at the 29th European Congress of Clinical Microbiology and Infectious Diseases (ECCMID) in Amsterdam [10].

This AI tool strives to overcome the challenges discussed earlier, and has been shown to identify patients at risk of multidrug-resistant infections at febrile neutropenia onset and to provide recommendations according to its predictions.

The system searched hospital EHRs for neutropenic patients with fever every 4 min. When a febrile neutropenia event was documented, data from EHRs were retrieved by the AI algorithm.
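The search step can be pictured as a simple polling loop. The following is a minimal sketch under stated assumptions: the screening thresholds, the record fields and the function names are ours, chosen for illustration, since the criteria and interfaces of the hospital system are not described here.

```python
from collections import namedtuple

POLL_INTERVAL_SECONDS = 4 * 60  # "every 4 min", as stated in the text

# Assumed screening thresholds, for illustration only.
FEVER_CELSIUS = 38.0
NEUTROPENIA_CELLS_PER_UL = 500

# Hypothetical record layout; a real EHR query would return richer data.
Observation = namedtuple(
    "Observation", ["patient_id", "temperature_c", "neutrophils_per_ul"]
)


def febrile_neutropenia_cases(observations, already_flagged):
    """Return ids of newly detected patients with both fever and neutropenia."""
    new_cases = []
    for obs in observations:
        is_case = (
            obs.temperature_c >= FEVER_CELSIUS
            and obs.neutrophils_per_ul < NEUTROPENIA_CELLS_PER_UL
        )
        if is_case and obs.patient_id not in already_flagged:
            new_cases.append(obs.patient_id)
            already_flagged.add(obs.patient_id)
    return new_cases


def run(fetch_observations, handle_case, sleep, cycles):
    """Poll `cycles` times; `fetch_observations` stands in for the EHR query."""
    flagged = set()
    for _ in range(cycles):
        for patient_id in febrile_neutropenia_cases(fetch_observations(), flagged):
            handle_case(patient_id)
        sleep(POLL_INTERVAL_SECONDS)


# Example with in-memory data and a no-op sleep so it finishes immediately.
sample = [
    Observation("A", 38.6, 300),
    Observation("B", 37.0, 200),
    Observation("C", 39.1, 2500),
]
detected = []
run(lambda: sample, detected.append, sleep=lambda seconds: None, cycles=2)
print(detected)
```

Passing the sleep function in as a parameter keeps the loop testable: in production it would be `time.sleep`, while the example uses a no-op. Tracking already flagged patients means each episode triggers the prediction step only once.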

In detail, three algorithms were constructed to answer the following questions:

1) Is our patient going to have a multidrug-resistant P. aeruginosa infection?

2) Is our patient going to have a multidrug-resistant Enterobacterales infection?

3) Is our patient going to have neither of these infections?

Remarkably, all algorithms created by our team achieved an accuracy of more than 95% in their predictions. Based on the algorithms’ predictions, the computer assigned patients to different multidrug-resistant infection groups and provided empiric antibiotic recommendations to help physicians make treatment decisions. This objective tool was available 24 h a day, 7 days a week.
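How three predictions might be combined into one group assignment can be sketched as follows. Choosing the group with the highest predicted probability is an assumption made for this example, not a description of the rule used in our system, and the recommendation strings are neutral placeholders rather than clinical guidance.

```python
# Group names follow the three questions above. The recommendation texts
# are placeholders for illustration, not clinical guidance.
RECOMMENDATIONS = {
    "mdr_p_aeruginosa": "placeholder recommendation A",
    "mdr_enterobacterales": "placeholder recommendation B",
    "neither": "placeholder recommendation C",
}


def assign_group(probabilities):
    """Pick the group whose model reports the highest probability (assumed rule)."""
    missing = set(RECOMMENDATIONS) - set(probabilities)
    if missing:
        raise ValueError(f"missing predictions for: {sorted(missing)}")
    group = max(RECOMMENDATIONS, key=lambda name: probabilities[name])
    return group, RECOMMENDATIONS[group]


# Made-up model outputs for one patient.
group, advice = assign_group(
    {"mdr_p_aeruginosa": 0.10, "mdr_enterobacterales": 0.70, "neither": 0.20}
)
print(group, "->", advice)
```

In practice the mapping from predictions to recommendations would be defined and reviewed by clinicians, and the output remains a suggestion for the treating physician.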


More information: Snomed2Vec: Random Walk and Poincaré Embeddings of a Clinical Knowledge Base for Healthcare Analytics, arXiv:1907.08650 [cs.LG] arxiv.org/abs/1907.08650

Provided by Pacific Northwest National Laboratory
