
Artificial intelligence is able to predict who is most likely to die from the coronavirus

February 6, 2021


Artificial intelligence is able to predict who is most likely to die from the coronavirus. In doing so, it can also help decide who should be at the front of the line for the precious vaccines now being administered across Denmark.

The result is from a newly published study by researchers at the University of Copenhagen’s Department of Computer Science. Since the COVID-19 pandemic’s first wave, researchers have been working to develop computer models that can predict, based on disease history and health data, how badly people will be affected by COVID-19.

Based on patient data from the Capital Region of Denmark and Region Zealand, the results of the study demonstrate that artificial intelligence can, with up to 90 percent certainty, determine whether a person who is not yet infected will die of COVID-19 if they are unfortunate enough to become infected.

Once a person is admitted to the hospital with COVID-19, the computer can predict with 80 percent accuracy whether that person will need a respirator.

“We began working on the models to assist hospitals, as during the first wave, they feared that they did not have enough respirators for intensive care patients. Our new findings could also be used to carefully identify who needs a vaccine,” explains Professor Mads Nielsen of the University of Copenhagen’s Department of Computer Science.

Older men with high blood pressure are at highest risk

The researchers fed a computer program with health data from 3,944 Danish COVID-19 patients. This trained the computer to recognize patterns and correlations in both patients’ prior illnesses and in their bouts against COVID-19.

“Our results demonstrate, unsurprisingly, that age and BMI are the most decisive parameters for how severely a person will be affected by COVID-19. But the likelihood of dying or ending up on a respirator is also heightened if you are male, have high blood pressure or a neurological disease,” explains Mads Nielsen.

The diseases and health factors that, according to the study, have the most influence on whether a patient ends up on a respirator after being infected with COVID-19 are in order of priority: BMI, age, high blood pressure, being male, neurological diseases, COPD, asthma, diabetes and heart disease.

“For those affected by one or more of these parameters, we have found that it may make sense to move them up in the vaccine queue, to avoid any risk of them becoming infected and eventually ending up on a respirator,” says Nielsen.

Predicting respiratory needs is a must

Researchers are currently working with the Capital Region of Denmark to take advantage of this fresh batch of results in practice. They hope that artificial intelligence will soon be able to help the country’s hospitals by continuously predicting the need for respirators.

“We are working towards a goal that we should be able to predict the need for respirators five days ahead by giving the computer access to health data on all COVID positives in the region,” says Mads Nielsen, adding:

“The computer will never be able to replace a doctor’s assessment, but it can help doctors and hospitals see many COVID-19 infected patients at once and set ongoing priorities.”

However, technical work is still pending to make health data from the region available for the computer and thereafter to calculate the risk to the infected patients. The research was carried out in collaboration with Rigshospitalet and Bispebjerg and Frederiksberg Hospital.

The field of machine learning has made tremendous progress over the past decade. Improved deep learning algorithms coupled with increased computational capacity catalyzed the growth of the field into the stratosphere. As a result, machine learning has been used in a diverse array of applications. Arguably the most crucial application of machine learning has been in the fight against the COVID-19 pandemic.

Researchers have aggressively, and often successfully, pursued a number of different avenues using machine learning to battle COVID-19. A range of machine learning applications have been developed to tackle various issues related to the virus. In this paper, we present the latest results and achievements of the machine learning community in the battle against the global pandemic.

In contrast to other existing surveys on the subject, we provide a general overview that is nuanced enough to provide substantial insight. Our survey includes preprint works to ensure the most up-to-date coverage of the topics. The current applications of machine learning to COVID-19 can be divided into four groups:

forecasting
medical diagnostics
drug development
contact tracing

Deep learning algorithms have been successfully deployed to forecast the number of new infections. Recurrent neural networks have shown superior performance in time-series forecasting over traditional approaches such as ARIMA models. Researchers have used recurrent networks, and their variant long short-term memory networks, to successfully model the spread of the infection and predict the future number of infections in the population.

Arguably the most important application of machine learning is in the field of medical diagnostics, made possible by advances in computer vision. Machine learning has achieved near-human-level accuracy in many image recognition tasks. Therefore, it is no surprise that image recognition software is successfully being used to detect signs of COVID-19 in patient chest X-ray images. In many parts of the world where an effective clinical testing procedure is unavailable or unaffordable, chest X-ray images and CT scans provide the only option to diagnose the virus. Studies have shown that deep learning approaches can diagnose COVID-19 based on chest X-ray images with over 99% accuracy.

Smart contact tracing using artificial intelligence has helped authorities locate potentially infected persons. A number of software solutions based on artificial intelligence are currently in use to trace the spread of the virus.

Machine learning has been used to help guide researchers to new discoveries in pharmacology. In particular, variational autoencoders have the ability to analyze perturbations in chemical composition that can lead to possible new medicines. Applying autoencoders to the existing flu vaccines can help identify potential avenues to creating a COVID-19 vaccine.

The challenge of fighting the global pandemic and helping humanity has spurred researchers across disciplines. In an effort to accelerate scientific research on COVID-19, the publishing community has made all the related publications freely available to the public. As a result, we are able to access and assess all the current research and present our survey to the readers. Our goal is to provide a quick, but sufficiently detailed, overview of the current state of the art in machine learning research applied to COVID-19. We hope our survey will supply the reader with the necessary information to facilitate a deeper investigation into the topic.

The paper is structured as follows. In Section 2, we discuss the use of machine learning in forecasting the number of new infections. Section 3 discusses the use of deep learning in detection and diagnosis of the infection. Section 4 contains the information about the use of machine learning in drug discovery and development. Section 5 discusses the current research related to the application of machine learning for contact tracing. Finally, Section 6 concludes the paper with a few closing remarks.

FORECASTING

Forecasting the number of infections is critical for proper planning and allocation of resources. Modern machine learning (ML) algorithms such as long short-term memory (LSTM) networks have been shown to outperform traditional time series models such as ARIMA and GARCH. As a result, LSTMs have been used in various applications involving time series projections [17, 19]. Several countries employ ML-based software to estimate the number of future infections and the trajectory of the infected population.

In this subsection, we will provide an overview of the latest advances in ML related to forecasting the number of COVID-19 infections. The results of our survey are summarized in Table 1 and a more detailed discussion follows below.

A comparative study of ML-based algorithms for COVID-19 forecasting was done in [7]. The authors analyzed a number of evolutionary algorithms such as Genetic Algorithm, Particle Swarm Optimization, and Gray Wolf Optimizer as well as ML algorithms such as Multilayer Perceptron (MLP) and adaptive network-based fuzzy inference system (ANFIS). The models were evaluated on the basis of their accuracy for different prediction lead times. The authors employed data from 5 different countries in their study. Experiments revealed that the MLP and ANFIS algorithms produce the best results, achieving a correlation level of 0.999.

A novel approach from Google Research that combines temporal and spatial data is proposed in [21]. Using graph neural networks (GNN) and Google mobility data, the authors uncover the rich interactions between time and space that are often present in the spread of a pandemic. Numerical experiments demonstrate the power of mobility data with the GNN framework. In [31], the authors employ an ensemble neural network to predict the number of confirmed cases and deaths in Mexico.

The proposed ensemble network (MNNF) consists of 3 modules: nonlinear autoregressive and function fitting neural networks. The module predictions are combined via a fuzzy integrator, designed to handle uncertainty, into a single output. The method is tested on data from Mexico. The authors carried out experiments to predict the number of confirmed cases and deaths 10 days ahead. Results reveal that the MNNF method outperforms single neural network models. The authors in [9] test 3 LSTM-based models to forecast the number of infected individuals for 32 states in India. The tested models include stacked, convolutional, and bi-directional LSTM neural networks.

The predictions are made one day and one week ahead. The results show that the bi-directional LSTM produces the optimal results. Several ML models are compared in [42] to forecast confirmed cases in Brazil and the US. The models under consideration include Bayesian neural network, cubist regression, kNN, random forest, and SVR. In addition, variational mode decomposition (VMD) is applied as a preprocessing step.

The authors also consider exogenous variables such as temperature and precipitation. Numerical experiments produce mixed results with no clear favorite. It can only be noted that VMD improves model performance when the prediction horizon is 6 days ahead. The authors in [38] compare statistical and ML approaches to time series forecasting. In particular, they study autoregressive integrated moving average (ARIMA), support vector regression (SVR), and LSTM models to forecast the number of infections, deaths, and recoveries.

The model input consists of the data from the previous 110 days. The model is used to predict infections for the next 48 days. The study is based on data from 10 countries. The results show that LSTM models generally outperform ARIMA and SVR. Machine learning approaches do not always outperform traditional methods. In [37], the authors compare classic statistical methods to SVR to predict the number of positive cases, death rate, and recovery rate.

The study covers a large number of countries. Results show that statistical models outperform SVR. In [15], the authors apply deep learning to forecast the number of infections and deaths regionally and worldwide. The LSTM models use the last 3 days of observed data to forecast 10 days ahead. In their analysis the authors considered the Middle East, Europe, China, and worldwide data. The results show that the forecasts achieve 1.5% root mean square error (RMSE).
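The windowed setup that recurs in these studies, in which a model sees a fixed number of past days and predicts a fixed number of days ahead, can be sketched in a few lines. The following is a minimal illustration only and not the method of any cited study: the function names, the case counts, and the naive "repeat the last value" baseline are assumptions introduced here. A real study would put an LSTM, SVR, or ARIMA model in place of the baseline.

```python
import math


def make_windows(series, n_input, n_ahead):
    """Split a series into (input window, target window) pairs."""
    pairs = []
    last_start = len(series) - n_input - n_ahead
    for start in range(last_start + 1):
        inputs = series[start:start + n_input]
        targets = series[start + n_input:start + n_input + n_ahead]
        pairs.append((inputs, targets))
    return pairs


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def persistence_forecast(inputs, n_ahead):
    """Naive baseline (an assumption, not a cited model): repeat the last value."""
    return [inputs[-1]] * n_ahead


# Hypothetical daily case counts, invented purely for illustration.
cases = [10, 12, 15, 19, 24, 30, 37, 45, 54, 64, 75, 87, 100, 114, 129]

# 3 days of input and a 10-day horizon, mirroring the setup reported for [15].
pairs = make_windows(cases, n_input=3, n_ahead=10)
for inputs, targets in pairs:
    forecast = persistence_forecast(inputs, n_ahead=10)
    print(inputs, "RMSE of naive baseline:", round(rmse(targets, forecast), 2))
```

Any candidate model can be scored on the same windows, which makes comparisons such as LSTM versus ARIMA straightforward. Note that this RMSE is in absolute case counts, whereas the figure quoted above is expressed as a percentage.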

MEDICAL DIAGNOSTICS

Diagnosing COVID-19 infection is a key first step to fighting the virus. The rapid spread of the disease across the globe has made diagnosis of the disease at early stages not only important for the individual patient but also for preventing the community spread of the disease. Polymerase chain reaction (PCR) tests that are currently employed to detect the presence of the COVID-19 virus require time and capital to administer.

Despite recent improvements, PCR tests remain scarce and costly in developing countries and rural areas. PCR tests may further suffer from sample preparation and quality control issues, which can lead to insufficient sensitivity [41]. Therefore, developing alternative approaches to testing is a vital research area. At present there are several ML applications that support the diagnostic process.

Deep neural networks have demonstrated the capability to achieve high accuracy in image detection tasks. Consequently, applying deep learning and other ML techniques to X-ray and CT scan images has been one of the most intensely researched areas. In addition, detection approaches based on clinical data have also been tried and tested.

Artificial intelligence (AI) based methods augment the diagnosis process and accelerate the treatment of the disease. These models can assist physicians and healthcare professionals not only during testing and treatment but also in the planning and management of resources [27]. The results of our survey on the current AI/ML research for COVID-19 diagnostics are summarized in Tables 2 and 3.

Imaging techniques such as X-rays and CT scans are widely used as diagnosis tools for many lung diseases including tuberculosis, lung cancer, and pneumonia viruses. CT scan images provide fast and detailed information about the pathology and prognosis of diseases. As a result, ML techniques are being increasingly integrated with imaging and computer vision methods for applications in disease diagnosis.

The success of deep learning techniques in detecting and diagnosing various types of pneumonia has already been reported in the literature. The authors in [28] developed a robust model based on a 3-dimensional convolutional neural network (CNN) framework to extract features from CT scan images and distinguish COVID-19 from community-acquired pneumonia.

When diagnosing patients in early stages, AI models proved to be successful by integrating both CT scan imaging and clinical information [32]. By combining the output of a CNN model on CT scan images with the output of ML models such as SVM and Random Forests on clinical data, the accuracy of diagnosis reaches the level of human healthcare experts. CT scan imaging is the diagnostic tool predominantly used in treating pulmonary infections.

The same is employed during the current outbreak by many countries in diagnosing COVID-19 patients – particularly at early stages. Further progress was made by Zhou et al. [50] who identified the importance of segmentation and proposed deep learning based models to address these issues in ML based diagnosis of COVID-19.

Despite the promising research results, there is still a lot of room for growth for ML-based diagnostics. Production-ready applications that can be used in hospitals require further refinement. A great deal of research is yet to be conducted to improve their reliability.

The main challenge in deploying AI/ML models for COVID-19 is the generalization ability of these models, a problem that is also prevalent in AI-based models in other applications. Another major bottleneck in implementing AI/ML-based solutions in healthcare is the availability of patient data samples of the necessary size and quality to train the ML models. In some instances, even when the data is available, its format and structure pose another challenge. Integrating existing research solutions into practical applications and products is another challenge.

Finally, it is vital to ensure that the studies and investigations conducted and reported during this pandemic and these pressing times are technically, scientifically, and ethically correct.

A wide array of ML models has been deployed to try to diagnose instances of COVID-19. The list of models includes CNN, RNN, SVM, transfer learning, XGBoost, and others. Although these models demonstrate high performance and accuracy, they have limitations such as the lack of sufficient data to train the models and the inability to generalize the results [50].

Despite the ongoing efforts to apply ML/AI in COVID-19 diagnostics, some members of the radiologist community have raised concerns regarding possible pitfalls. Laghi [26] has cautioned that while AI/ML should be used for diagnosis of COVID-19, a more objective and precise quantification is required in understanding the lung involvement of the disease. Wynants [46] reviewed the validity and usefulness of the various models published in the literature on COVID-19 diagnosis, prognosis and risk prediction.

Their analysis of 145 models in 107 published documents showed that there exists a high risk of bias. The results of these models are probabilistic and hence are not recommended to be adopted for practical use. They call for more rigorous analysis of these models with proper methodological guidance and a description of the populations under study. They also warned that unreliable studies would lead to harmful effects in diagnosis and prognosis of the disease.
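One general reason headline accuracy figures deserve this kind of scrutiny is that accuracy alone can hide poor sensitivity when infected patients make up a small share of the evaluation sample. The sketch below illustrates the point with a hypothetical screening set. The labels, predictions, and function name are invented for illustration and are not taken from any of the reviewed models.

```python
def diagnostic_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity for binary labels (1 = infected)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    total = tp + tn + fp + fn
    return {
        "accuracy": (tp + tn) / total,
        # Share of infected patients the model actually catches.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Share of healthy patients the model correctly clears.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }


# Hypothetical set: 5 infected among 100 patients; the model finds only 2.
y_true = [1] * 5 + [0] * 95
y_pred = [1] * 2 + [0] * 3 + [0] * 95

metrics = diagnostic_metrics(y_true, y_pred)
print(metrics)
```

In this made-up case the model is right for 97 of 100 patients yet misses three of the five infections, so sensitivity and specificity are worth reporting alongside accuracy.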

Based on a careful review of the existing literature on ML-based diagnostics for COVID-19, we conclude that the proposed models have significant potential. The existing models can be used as stepping stones for building more robust and resilient models that would assist healthcare professionals in diagnosis and decision making. AI/ML researchers should learn from the experiences of this pandemic and focus on developing models in collaboration with healthcare professionals and medical experts. We note that the most important challenge is the availability of data to train the models as well as the treatment of the data. Resolving this issue can have a big impact on the robustness and generalization ability of models for practical applications.

DRUG DEVELOPMENT
Machine learning algorithms are increasingly being used to search for new chemical combinations that can lead to effective medicine. Artificial intelligence and machine learning techniques have become an integral part of the pharmaceutical world. Integrating these techniques into the complex drug developing pipeline has proven to be both cost-effective and less time-consuming. Machine learning techniques are particularly useful as they provide a set of tools that improve the process of drug discovery and development for specific situations with the help of available data that is reliable and of high quality.

As a result, a large effort has been under way to apply AI/ML-based solutions in pharmacology. A summary of the survey of the current literature in the field is provided in Table 5. Several pharmaceutical companies have employed ML-based algorithms such as artificial neural networks, Support Vector Machines (SVM), deep learning and many others to develop various drugs and vaccines [36].

The authors in [36] provide a review of recently developed algorithms to design automated drug development pipelines consisting of drug discovery, drug testing and drug re-purposing. In drug discovery, Generative Adversarial Networks (GAN), a deep learning technique, are used to identify DNA sequences associated with specific functions, and Bayesian Optimization (BO) is used to produce proteins of interest at lower cost.

In drug testing, sequential decision-making algorithms such as the Bayesian-based Multi-Armed Bandit (MAB) algorithms are used to test several drug candidates and determine the best treatments. In drug re-purposing, text mining methods and graph-based recommender systems are used to identify correlations and predict drug-disease interactions. The authors compiled a list of relevant data sets for drug development pipeline studies.
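To make the bandit idea concrete, the sketch below implements Thompson sampling, one common Bayesian MAB strategy, for candidates with a binary success or failure outcome. It is an illustrative assumption rather than the specific algorithm reviewed in [36]: the success rates are invented, outcomes are simulated, and the function name is hypothetical.

```python
import random


def thompson_sampling(true_success_rates, n_rounds, seed=0):
    """Simulate Beta-Bernoulli Thompson sampling over several candidates.

    true_success_rates are hidden from the algorithm and are used only
    to simulate the outcome of each trial.
    """
    rng = random.Random(seed)
    n_arms = len(true_success_rates)
    successes = [0] * n_arms
    failures = [0] * n_arms
    for _ in range(n_rounds):
        # Draw one plausible success rate per candidate from its
        # Beta(successes + 1, failures + 1) posterior (uniform prior).
        samples = [
            rng.betavariate(successes[i] + 1, failures[i] + 1)
            for i in range(n_arms)
        ]
        arm = samples.index(max(samples))
        # Simulate one trial of the chosen candidate and update its counts.
        if rng.random() < true_success_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures


# Three hypothetical candidates with made-up response rates.
successes, failures = thompson_sampling([0.2, 0.5, 0.8], n_rounds=2000, seed=1)
pulls = [s + f for s, f in zip(successes, failures)]
print("trials per candidate:", pulls)
```

As evidence accumulates, the posterior of the strongest candidate concentrates near its true rate and the algorithm directs most trials to it. This is the property that makes bandit designs attractive when each test is costly.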

In 2019, the National Institute of Allergy and Infectious Diseases sponsored the first U.S. clinical trial to develop a vaccine against SARS-CoV-2 using an AI-based model [5]. An AI program called synthetic chemist was created to generate trillions of synthetic compounds and another AI-based program called Search Algorithm for Ligands (SAM) was used to sift through the trillions of compounds and determine the most suitable candidates as vaccine adjuvants. With the fast spread of COVID-19, there has recently been a race in utilizing ML techniques and AI capabilities to develop an effective vaccine and antivirals.

The authors in [1] incorporated reverse vaccinology, bioinformatics, immunoinformatics and deep learning strategies to build a computational framework for identifying probable vaccine candidates and constructing an epitope-based vaccine against COVID-19. The screening of viral proteome sequences resulted in short listing of Spike protein or Surface Glycoprotein of SARS-CoV-2 as a potential protein target that can be used to design the vaccine.

The physicochemical properties of the protein were further examined using LSTMs and the results showed that the protein is primarily responsible for the pathophysiology of SARS-CoV-2. The authors proposed that their computational pipeline can be used to design an effective and safe vaccine against COVID-19. In [47], the authors used an in silico analysis to design a potent multi-epitope peptide vaccine against SARS-CoV-2. MLP and SVM algorithms were used to screen for potential epitopes.

The vaccine immunogenicity was enhanced using three potent adjuvants and its tertiary structure was predicted, refined and validated using appropriate strategies. The results showed that the vaccine can interact effectively with toll-like receptors (TLR) 3, 5, 8 and by using in silico cloning, it has demonstrated a high-quality structure, high stability and potential for expression in Escherichia coli.

The authors in [35] surveyed existing literature about COVID-19 and vaccine development. They used the Vaxign Reverse Vaccinology (VRV) tool and Vaxign-ML, a machine learning-based vaccine candidate prediction and analysis system, to predict and evaluate potential vaccine candidates for COVID-19. The results showed that in addition to the commonly used S protein, the non-structural protein (nsp3) was found to be second highest in protective antigenicity. After further investigating the sequence conservation and immunogenicity of the multi-domain nsp3 protein, the authors concluded that nsp3 can be an effective and safe vaccine target against COVID-19.

For the development of drug treatment for COVID-19, the authors in [10] used a pre-trained deep learning-based drug-target interaction model called Molecule Transformer-Drug Target Interaction (MT-DTI) to predict any commercially available antiviral drugs that could be effective against SARS-CoV-2. The model was compared to a CNN-based model called DeepDTA and two traditional machine learning-based algorithms, gradient boosting and a regularized least-squares model, using various data sets.

The MT-DTI showed the best performance in predicting the drug-target interactions and was able to identify various antiviral drugs such as remdesivir, dolutegravir, efavirenz and atazanavir which could potentially be used in the treatment of SARS-CoV-2 infection. In [23], the authors used deep neural networks (DNN) and established an AI platform to identify potential old drugs that could be used against SARS-CoV-2.

Different learning data sets consisting of compounds reported or proven active against SARS-CoV, SARS-CoV-2, human immunodeficiency virus (HIV), and influenza virus were generated and used to predict drugs potentially active against coronavirus out of the marketed drugs. The predicted drugs were then tested and verified to serve as feedback to the AI platform for relearning and thus to generate a modified AI model.

The implemented AI-based framework was able to identify eight drugs with activities against Feline Infectious Peritonitis (FIP) coronavirus. The authors suggested that with prior use experiences in patients, these identified old drugs can potentially be proven to have anti-SARS-CoV-2 activity and hence be applied for fighting the COVID-19 pandemic. The authors in [25] analyzed over 10 million compounds using a machine learning pipeline in order to predict chemicals that interfere with SARS-CoV-2 targets.

The pipeline involves selection of important physicochemical features for each target using recursive feature elimination algorithms, followed by fitting aggregated multiple support vector machines (SVM) models and regularized random forest algorithm (regRF) to improve generalizability and then evaluating model performance using various computational validation methods.

The authors concluded that their identified chemicals can accelerate testing of short-term and long-term treatment strategies for COVID-19. The importance of AI and machine learning (ML) techniques that can accelerate the discovery of a possible cure for COVID-19 is discussed in a recent review article by [8]. The review article by [20] focused on the recent advances of COVID-19 drug and vaccine development using artificial intelligence and discussed the potential of intelligent training for the discovery of COVID-19 therapeutics.

CONTACT TRACING
Effective contact tracing is a major factor in a virus containment strategy [2]. In conventional contact tracing, a health care professional interviews the infected patient to trace and discover other individuals who may potentially be infected through contact with the patient. The main challenge of the conventional approach is the difficulty for an individual to recall all of their contacts. In addition, the process requires the availability of specialized clinicians using their experience and other resources [13].

Recent technological improvements allowed the contact tracing process to be optimized with less human intervention in an intelligent approach known as digital proximity (DP) contact tracing. The DP approach utilizes network technologies to identify and locate individuals who could be potentially infected through contact.
With the widespread availability of computing networks and mobile applications, and their associated technologies including smartphones, smartwatches, and others, most of the technology-based contact tracing systems are built on mobile platforms [4, 30]. These systems, named digital contact tracing (DCT), enable a registered user’s exposure to be evaluated through wireless signals such as Bluetooth low energy.

Alternative technology-based tracing systems that are non-mobile and application-based utilize tracking information collected from a variety of sources such as banking transactions, security camera footage, GPS data from vehicles, mobile phones and others to estimate the proximity of an individual to an infected person.
Artificial intelligence and machine learning, in particular deep learning algorithms, have been successfully used in medical diagnosis and screening systems due to their exceptional learning capabilities.

In the context of DCT systems, these technologies can be incorporated to aid the decision-making process and improve the detection accuracy of contact tracing. Concretely, the data collected from registered users such as their daily tracks and geo locations in the DCT system are explored by the ML algorithm within digital platforms to provide medical professionals and government officials with useful insights.

Artificial intelligence and machine learning applications are currently utilized through the entire life cycle of COVID-19 starting from detection to mitigation [27]. In contact tracing, a virtual AI agent is an alternative to a health professional in the case of classical contact tracing. The virtual AI agent with natural language capabilities can collect the information previously gathered by a health professional.

In DCT systems, Bluetooth technology is widely employed as a proximity detector for COVID cases. However, the performance of Bluetooth-based contact tracing apps may be affected by changing signal intensity, which can be exhibited by different mobile devices, mobile positions, body positions, and physical barriers [51]. Generic wireless multipath effects and shadowing are persistent issues which can lead to false positive and false negative identification.

To improve the proximity detection accuracy in DCT systems, ML techniques can be used to analyze the Bluetooth signal and other phone sensors’ data.
Recently, a 2-stage classifier was proposed that utilizes a vanilla neural network to extract features from a signal emanating from different sources [18]. Employing a deep learning technique directly on a smartphone involves high computational cost and power consumption.

Therefore, during the first stage raw data from different sources is converted into fixed-length vectors and stored in the database. In the second stage, the vanilla deep learning algorithm is applied to detect proximity [18]. A similar project under the TC4TL challenge compares several deep learning models including Conv 1d [29], support vector machines [40], and decision tree-based algorithms [14] to evaluate the accuracy of Bluetooth-based distance measurement [39, 33]. The performance of different techniques is measured based on the lowest normalized decision cost function (NDCF), which represents proximity detection performance considering the combination of false negatives and false positives. The results show that the Conv 1d network has the lowest NDCF.
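A cost function of this kind can be sketched in its generic detection-cost form, which weights the miss rate and the false-alarm rate and then normalizes by the cost of the best trivial system. The weights and prior below (`c_miss`, `c_fa`, `p_target`) are assumed defaults chosen for illustration. The text does not state the settings used in the challenge, so this should be read as a hedged example rather than the official metric.

```python
def normalized_dcf(y_true, y_pred, c_miss=1.0, c_fa=1.0, p_target=0.5):
    """Normalized detection cost for binary labels (1 = within proximity).

    c_miss, c_fa and p_target are assumed illustrative values.
    """
    n_target = sum(1 for t in y_true if t == 1)
    n_nontarget = len(y_true) - n_target
    if n_target == 0 or n_nontarget == 0:
        raise ValueError("need at least one example of each class")

    misses = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    false_alarms = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)

    p_miss = misses / n_target        # false negative rate
    p_fa = false_alarms / n_nontarget  # false positive rate

    cost = c_miss * p_miss * p_target + c_fa * p_fa * (1.0 - p_target)
    # Normalize by the cost of always answering "yes" or always "no",
    # whichever is cheaper.
    best_trivial = min(c_miss * p_target, c_fa * (1.0 - p_target))
    return cost / best_trivial


# Hypothetical labels: one missed contact and one false alarm out of 8.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]
print(normalized_dcf(y_true, y_pred))
```

With equal weights and a prior of 0.5 the score reduces to the sum of the miss rate and the false-alarm rate, so a perfect detector scores 0 and a trivial always-yes or always-no detector scores 1.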

It is evident that the performance of classification algorithms varies widely based on proximity thresholds. For example, Song [43] reported that when considering two people six feet apart in classifying Bluetooth beacon RSSI values, a Gaussian support vector machine classifier yielded better accuracy than a decision tree classifier. For validation, each experiment was conducted by placing two Raspberry Pi devices six feet apart and measuring the RSSI values.
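For intuition about why RSSI can serve as a proximity signal at all, the sketch below converts an RSSI reading into a rough distance using the standard log-distance path loss model and compares it with a six-foot threshold. This is a simpler technique than the SVM and decision tree classifiers discussed above and is not the method of [43]. The calibration values (`rssi_at_1m_dbm`, `path_loss_exponent`) are assumptions that vary by device and environment.

```python
SIX_FEET_M = 1.8288  # six feet expressed in metres


def estimate_distance_m(rssi_dbm, rssi_at_1m_dbm=-59.0, path_loss_exponent=2.0):
    """Estimate distance from RSSI with the log-distance path loss model.

    rssi_at_1m_dbm and path_loss_exponent are assumed calibration values;
    in practice they must be measured for each device and setting.
    """
    return 10 ** ((rssi_at_1m_dbm - rssi_dbm) / (10.0 * path_loss_exponent))


def is_too_close(rssi_dbm, threshold_m=SIX_FEET_M,
                 rssi_at_1m_dbm=-59.0, path_loss_exponent=2.0):
    """Flag a reading whose estimated distance is within the threshold."""
    distance = estimate_distance_m(rssi_dbm, rssi_at_1m_dbm, path_loss_exponent)
    return distance <= threshold_m


# Hypothetical readings in dBm, stronger (less negative) means closer.
for reading in (-55.0, -62.0, -75.0):
    print(reading, round(estimate_distance_m(reading), 2), is_too_close(reading))
```

Because multipath and shadowing shift RSSI by several dB, a fixed rule like this misclassifies readings near the threshold, which is the motivation for the learned classifiers described above.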

An AI-based contact tracing app named COVI developed in Canada leverages probabilistic risk levels to profile an individual’s infection risk level [3]. COVI uses the advantages of ML algorithms to optimize and automate the integration of pseudonymized user data in assessing the risk levels. An a priori version of an epidemiological model-based simulated dataset is used to pre-train the ML models.

Upon collection of real data through an app, the simulator parameters are tuned to match with real data. The impact of ML in the COVI app is observed by using the ML predictor inside the simulator to influence the behavior of the agent in recommending the risk levels. The contact tracing application can be used to predict the lockdown area based on places visited by an infected patient. In [30] the authors proposed a K-Means clustering algorithm with DASV seeding to predict the lockdown area. The proposed method has been tested in Denver, USA and successfully identified the area to be locked down as users walking in the area approach each other very frequently. Despite the significant advantages of using DCT systems, there are issues related to data privacy and use. However, these are out of the scope of this review paper.
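The clustering step behind the lockdown-area prediction in [30] can be illustrated with plain K-Means on visited locations. The sketch below is a minimal version under stated assumptions: the DASV seeding used in [30] is not reproduced and initial centroids are simply passed in, the coordinates are invented planar points rather than real location data, and the function name is hypothetical.

```python
import math


def kmeans(points, initial_centroids, n_iterations=20):
    """Plain K-Means on tuples of coordinates with caller-supplied seeds."""
    centroids = [tuple(c) for c in initial_centroids]
    assignments = [0] * len(points)
    for _ in range(n_iterations):
        # Assignment step: attach every point to its nearest centroid.
        assignments = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == k]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                # Keep an empty cluster's centroid where it was.
                new_centroids.append(centroids[k])
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, assignments


# Hypothetical visited locations forming two groups.
visits = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(visits, initial_centroids=[(0, 0), (10, 10)])
print(centroids, labels)
```

A dense cluster around places visited by an infected person would then mark a candidate area. How such a cluster is turned into a lockdown boundary is not described in the text and is left out here.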

CONCLUSION
Machine learning has become a potent tool in many applications. In particular, it has recently been employed in the battle against COVID-19. There exists a growing body of literature that is dedicated to the subject. The decision by the major publishers to make all COVID-19 related research publicly available has improved information flow. In this paper, we attempt to provide an overview of the rapidly increasing corpus of research in machine learning related to COVID-19. We discuss the state-of-the-art research including the material on research archives.

In particular, we covered four major areas of ML research related to COVID-19: forecasting, medical diagnostics, drug development, and contact tracing.
Our survey revealed the following key observations. In forecasting, recurrent neural networks such as LSTMs have been used to predict future infection and death rates. Many studies are focused on the North American region, but other countries including Brazil and China are also covered. The best models achieve a correlation of 0.999. In medical diagnostics, deep learning models that have previously shown success in other domains are being deployed to detect the presence of the infection based on CT scans and X-rays. The best models achieve an accuracy rate of 99%. In drug discovery, a variety of algorithms are being used to develop new vaccines against the infection. However, the majority of the studies are still in the initial stage.

In contact tracing, AI-based applications are utilized to identify and locate potential virus carriers, though with limited success.
Despite the tremendous progress, the current machine learning approaches suffer from two major drawbacks. First, the underlying algorithms have not yet reached the level of human reasoning.

Deep learning models such as CNNs, LSTMs, Transformers, and others remain imperfect and cannot consistently outperform a human expert. Second, the lack of data hinders the training and development of the models. Patient data is notoriously difficult to obtain. Since deep learning models rely on an abundance of data, the lack thereof results in suboptimal generalization performance.

Our main recommendation based on the extensive survey of current literature is the involvement of government agencies to facilitate procurement of COVID-19 related data. Public institutions and government agencies can play a key role in obtaining and disseminating data from hospitals to researchers. Since machine learning algorithms rely heavily on large amounts of data, its availability can drastically improve results.

reference link: https://arxiv.org/pdf/2101.07824.pdf

More information: Espen Jimenez-Solem et al, Developing and validating COVID-19 adverse outcome risk prediction models from a bi-national European cohort of 5594 patients, Scientific Reports (2021). DOI: 10.1038/s41598-021-81844-x