Science-Watching: Forecasting New Diseases in Low-Data Settings Using Transfer Learning

[from the London Mathematical Laboratory]

by Kirstin Roster, Colm Connaughton & Francisco A. Rodrigues

Abstract

Recent infectious disease outbreaks, such as the COVID-19 pandemic and the Zika epidemic in Brazil, have demonstrated both the importance and difficulty of accurately forecasting novel infectious diseases. When new diseases first emerge, we have little knowledge of the transmission process, the level and duration of immunity to reinfection, or other parameters required to build realistic epidemiological models. Time series forecasts and machine learning, while less reliant on assumptions about the disease, require large amounts of data that are also not available in the early stages of an outbreak. In this study, we examine how knowledge of related diseases can help make predictions of new diseases in data-scarce environments using transfer learning. We implement both an empirical and a synthetic approach. Using data from Brazil, we compare how well different machine learning models transfer knowledge between two different dataset pairs: case counts of (i) dengue and Zika, and (ii) influenza and COVID-19. In the synthetic analysis, we generate data with an SIR model using different transmission and recovery rates, and then compare the effectiveness of different transfer learning methods. We find that transfer learning offers the potential to improve predictions, even beyond a model based on data from the target disease, though the appropriate source disease must be chosen carefully. While imperfect, these models offer an additional input for decision-makers in pandemic response.

Introduction

Epidemic models can be divided into two broad categories: data-driven models aim to fit an epidemic curve to past data in order to make predictions about the future; mechanistic models simulate scenarios based on different underlying assumptions, such as varying contact rates or vaccine effectiveness. Both model types aid in the public health response: forecasts serve as an early warning system for outbreaks in the near future, while mechanistic models help us better understand the causes of spread and potential remedial interventions to prevent further infections. Many different data-driven and mechanistic models were proposed during the early stages of the COVID-19 pandemic and informed decision-making with varying levels of success. This range of predictive performance underscores both the difficulty and importance of epidemic forecasting, especially early in an outbreak. Yet the COVID-19 pandemic also led to unprecedented levels of data-sharing and collaboration across disciplines, so that several novel approaches to epidemic forecasting continue to be explored, including models that incorporate machine learning and real-time big-data streams. In addition to the COVID-19 pandemic, recent infectious disease outbreaks include Zika virus in Brazil in 2015, Ebola virus in West Africa in 2014–16, Middle East respiratory syndrome (MERS) in 2012, and coronavirus associated with severe acute respiratory syndrome (SARS-CoV) in 2003. This recurrence of outbreaks suggests that further improvements to epidemic forecasting will be important for global public health. Exploring the value of new methodologies can help broaden the modeler’s toolkit to prepare for the next outbreak. In this study, we consider the role of transfer learning for pandemic response.

Transfer learning refers to a collection of techniques that apply knowledge from one prediction problem to solve another, often using machine learning and with many recent applications in domains such as computer vision and natural language processing. Transfer learning leverages a model trained to execute a particular task in a particular domain, in order to perform a different task or extrapolate to a different domain. This allows the model to learn the new task with less data than would normally be required, and is therefore well-suited to data-scarce prediction problems. The underlying idea is that skills developed in one task, for example the features that are relevant to recognize human faces in images, may be useful in other situations, such as classification of emotions from facial expressions. Similarly, there may be shared features in the patterns of observed cases among similar diseases.

The value of transfer learning for the study of infectious diseases is relatively under-explored. Most existing applications to disease remain in the domain of computer vision and leverage pre-trained neural networks to make diagnoses from medical images, for example of retinal diseases, dental diseases, or COVID-19. Coelho and colleagues (2020) explore the potential of transfer learning for disease forecasts. They train a Long Short-Term Memory (LSTM) neural network on dengue fever time series and make forecasts directly for two other mosquito-borne diseases, Zika and chikungunya, in two Brazilian cities. Even without any data on the two target diseases, their model achieves high prediction accuracy four weeks ahead. Gautam (2021) uses COVID-19 data from Italy and the USA to build an LSTM transfer model that predicts COVID-19 cases in countries that experienced a later pandemic onset.

These studies provide empirical evidence that transfer learning may be a valuable tool for epidemic forecasting in low-data situations, though research is still limited. In this study, we aim to contribute to this empirical literature not only by comparing different types of knowledge transfer and forecasting algorithms, but also by considering two different pairs of endemic and novel diseases observed in Brazilian cities, specifically (i) dengue and Zika, and (ii) influenza and COVID-19. With an additional analysis of simulated time series, we hope to provide theoretical guidance on the selection of appropriate disease pairs, by better understanding how different characteristics of the source and target diseases affect the viability of transfer learning.
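
To make the synthetic analysis concrete, the Python sketch below generates case curves from a basic SIR model for a source-like and a target-like disease that differ in their transmission and recovery rates. The Euler time step and all parameter values are illustrative assumptions, not the configuration used in the paper.

    # Minimal sketch: synthetic epidemic curves from an SIR model. The parameter
    # values and daily Euler steps are illustrative assumptions, not the paper's setup.
    import numpy as np

    def simulate_sir(beta, gamma, n_days=150, i0=1e-4):
        """Return daily new infections from a basic SIR model (Euler integration)."""
        s, i = 1.0 - i0, i0
        incidence = []
        for _ in range(n_days):
            new_infections = beta * s * i
            recoveries = gamma * i
            s -= new_infections
            i += new_infections - recoveries
            incidence.append(new_infections)
        return np.array(incidence)

    # A hypothetical source disease and a faster-spreading target disease.
    source_curve = simulate_sir(beta=0.30, gamma=0.10)
    target_curve = simulate_sir(beta=0.45, gamma=0.15)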

Zika and COVID-19 are two recent examples of novel emerging diseases. Brazil experienced a Zika epidemic in 2015–16, and the WHO declared a public health emergency of international concern in February 2016. Zika is caused by an arbovirus spread primarily by mosquitoes, though other transmission routes, including congenital and sexual transmission, have also been observed. The Zika virus is a flavivirus, and symptoms of infection share some commonalities with those of other mosquito-borne arboviral diseases, such as yellow fever, dengue fever, or chikungunya. Infection tends to be asymptomatic or mild but can lead to complications, including microcephaly and other brain defects in the case of congenital transmission.

Given the similarity of the pathogen and primary transmission route, dengue fever is an appropriate choice of source disease for Zika forecasting. Not only does the shared mosquito vector result in similar seasonal patterns of annual outbreaks, but consistent, geographically and temporally granular data on dengue cases are also publicly available via the Brazilian government’s open data initiative.

COVID-19 is an acute respiratory infection caused by the novel coronavirus SARS-CoV-2, which was first detected in Wuhan, China, in 2019. It is transmitted directly between humans via airborne respiratory droplets and particles. Symptoms range from mild to severe and may affect the respiratory tract and central nervous system. Several variants of the virus have emerged, which differ in their severity, transmissibility, and level of immune evasion.

Influenza is also a contagious respiratory disease that is spread primarily via respiratory droplets. Infection with the influenza virus likewise follows patterns of human contact and seasonality. Two types of influenza virus (A and B) cause seasonal epidemics in humans, and new strains of each emerge regularly. Given the similarity in transmission routes and, to a lesser extent, in clinical manifestations, influenza is chosen as the source disease for knowledge transfer to model COVID-19.

For each of these disease pairs, we collect time series data from Brazilian cities. Data on the target disease from half the cities is retained for testing. To ensure comparability, the test set is the same for all models. Using this empirical data, as well as the simulated time series, we implement the following transfer models to make predictions.

  • Random forest: First, we implement a random forest model that was recently found to capture the time-series characteristics of dengue in Brazil well. We use this model to make predictions for Zika without re-training. We also train a random forest model on influenza data to make predictions for COVID-19. This is a direct transfer method, in which models are trained only on data from the source disease (a minimal sketch of this setup appears after this list).
  • Random forest with TrAdaBoost: We then incorporate data from the target disease (i.e., Zika and COVID-19) using the TrAdaBoost algorithm together with the random forest model. This is an instance-based transfer learning method, which selects relevant examples from the source disease to improve predictions on the target disease (a simplified sketch of this reweighting idea follows the list).
  • Neural network: The second machine learning algorithm we deploy is a feed-forward neural network, which is first trained on data of the endemic disease (dengue/influenza) and applied directly to forecast the new disease.
  • Neural network with re-training and fine-tuning: We then retrain only the last layer of the neural network using data from the new disease and make predictions on the test set. Finally, we fine-tune the parameters of all layers using a small learning rate and a small number of epochs (see the fine-tuning sketch after this list). These models are examples of parameter-based transfer methods, since they leverage the weights learned by the source-disease model to accelerate and improve learning in the target-disease model.
  • Aspirational baseline: We compare these transfer methods to a model trained only on the target disease (Zika/COVID-19) without any data on the source disease. Specifically, we use half the cities in the target dataset for training and the other half for testing. This gives a benchmark of the performance in a large-data scenario, which would occur after a longer period of disease surveillance.
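
As referenced in the first bullet, the following Python sketch illustrates the direct-transfer setup: a random forest is trained on lagged case counts of the source disease only and then applied unchanged to the target disease. The lag window, forecast horizon, placeholder series, and hyperparameters are illustrative assumptions rather than the paper’s exact configuration.

    # Direct transfer sketch: train only on the source disease, predict the target.
    # Placeholder series stand in for weekly case counts; lags and horizon are assumptions.
    import numpy as np
    from sklearn.ensemble import RandomForestRegressor

    def make_lagged(series, n_lags=8, horizon=4):
        """Build (lag window, value h steps ahead) pairs from a 1-D case-count series."""
        X, y = [], []
        for t in range(n_lags, len(series) - horizon):
            X.append(series[t - n_lags:t])
            y.append(series[t + horizon])
        return np.array(X), np.array(y)

    rng = np.random.default_rng(0)
    source_series = rng.poisson(lam=30, size=300).astype(float)  # e.g. dengue (placeholder)
    target_series = rng.poisson(lam=10, size=100).astype(float)  # e.g. Zika (placeholder)

    X_src, y_src = make_lagged(source_series)
    X_tgt, y_tgt = make_lagged(target_series)

    rf = RandomForestRegressor(n_estimators=200, random_state=0)
    rf.fit(X_src, y_src)                 # trained only on the source disease
    target_forecast = rf.predict(X_tgt)  # applied directly to the target disease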
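
For the instance-based approach in the second bullet, the sketch below conveys the reweighting idea behind TrAdaBoost in a deliberately simplified form: source and target examples are pooled, and source examples that remain poorly predicted are progressively down-weighted. It reuses the lagged features from the previous sketch and is not a faithful implementation of the TrAdaBoost algorithm used in the paper.

    # Simplified instance-based transfer in the spirit of TrAdaBoost (illustration only).
    def instance_weighted_transfer(X_src, y_src, X_tgt, y_tgt, n_rounds=5, beta=0.8):
        """Pool source and target data; down-weight source examples with large errors."""
        X = np.vstack([X_src, X_tgt])
        y = np.concatenate([y_src, y_tgt])
        n_src = len(y_src)
        w = np.ones(len(y))
        model = None
        for _ in range(n_rounds):
            model = RandomForestRegressor(n_estimators=100, random_state=0)
            model.fit(X, y, sample_weight=w)
            errors = np.abs(model.predict(X) - y)
            rel_err = errors / (errors.max() + 1e-12)
            # Source examples that fit poorly get geometrically smaller weights.
            w[:n_src] *= beta ** rel_err[:n_src]
        return model

    transfer_model = instance_weighted_transfer(X_src, y_src, X_tgt, y_tgt)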
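
Finally, for the parameter-based methods in the neural-network bullets, the Keras sketch below trains a small feed-forward network on the source disease, retrains only its output layer on scarce target data, and then fine-tunes all layers with a reduced learning rate. The architecture, epoch counts, and placeholder data are assumptions for illustration, not the paper’s configuration.

    # Parameter-based transfer sketch: source pre-training, last-layer retraining,
    # then full fine-tuning with a small learning rate. Data and sizes are placeholders.
    import numpy as np
    import tensorflow as tf

    rng = np.random.default_rng(0)
    n_lags = 8
    X_source, y_source = rng.random((500, n_lags)), rng.random(500)  # source disease (placeholder)
    X_target, y_target = rng.random((40, n_lags)), rng.random(40)    # scarce target data (placeholder)

    model = tf.keras.Sequential([
        tf.keras.Input(shape=(n_lags,)),
        tf.keras.layers.Dense(32, activation="relu"),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(1),
    ])

    # 1) Train on the source disease.
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-3), loss="mse")
    model.fit(X_source, y_source, epochs=50, verbose=0)

    # 2) Retrain only the output layer on the target disease.
    for layer in model.layers[:-1]:
        layer.trainable = False
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-3), loss="mse")
    model.fit(X_target, y_target, epochs=30, verbose=0)

    # 3) Fine-tune all layers with a small learning rate and few epochs.
    for layer in model.layers:
        layer.trainable = True
    model.compile(optimizer=tf.keras.optimizers.Adam(1e-4), loss="mse")
    model.fit(X_target, y_target, epochs=10, verbose=0)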

The remainder of this paper is organized as follows. The models are described in more technical detail in Section 2. Section 3 shows the results of the synthetic and empirical predictions. Finally, Section 4 discusses practical implications of the analyses.

Access the full paper [via institutional access or paid download].

COVID-19 and “Naïve Probabilism”

[from the London Mathematical Laboratory]

In the early weeks of the 2020 U.S. COVID-19 outbreak, guidance from the scientific establishment and government agencies included a number of dubious claims: masks don't work, there is no evidence of human-to-human transmission, the risk to the public is low. These statements were backed by health authorities as well as public intellectuals, but were later disavowed or disproven, and the initial under-reaction was followed by an equally extreme overreaction and the imposition of draconian restrictions on human social activity.

In a recent paper, LML Fellow Harry Crane examines how these early missteps ultimately contributed to higher death tolls, prolonged lockdowns, and diminished trust in science and government leadership. Even so, the organizations and individuals most responsible for misleading the public suffered few or no consequences, and some even benefited from their mistakes. As he discusses, this perverse outcome can be seen as the result of authorities applying a formulaic procedure of “naïve probabilism” to highly uncertain and complex problems, largely assuming that decision-making under uncertainty boils down to probability calculations and statistical analysis.

This attitude, he suggests, might be captured in a few simple “axioms of naïve probabilism”:

Axiom 1: The more complex the problem, the more complicated the solution.

This idea is a hallmark of naïve decision-making. The COVID-19 outbreak was highly complex: a novel virus of uncertain origin spreading through an interconnected global society. But the potential usefulness of masks was not one of these complexities. The mask mistake was consequential not because masks were the antidote to COVID-19, but because they were a low-cost measure whose effect would be neutral at worst: wearing a mask cannot hurt efforts to reduce the spread of a virus.

Yet the experts neglected common sense in favor of a more “scientific response” based on rigorous peer review and sufficient data. Two months after the initial U.S. outbreak, a study confirmed the obvious, and masks went from being strongly discouraged to being mandated by law. Precious time had been wasted, many lives lost, and the economy stalled.

Crane also considers another rule of naïve probabilism:

Axiom 2: Until proven otherwise, assume that the future will resemble the past.

In the COVID-19 pandemic, of course, there was at first no data that masks work, no data that travel restrictions work, no data on human-to-human transmission. How could there be? Yet some naïve experts took this as a reason to maintain the status quo. Indeed, many universities refused to do anything in preparation until a few cases had been detected on campus, at which point they had some data, along with hundreds or thousands of other as-yet-undetected infections.

Crane touches on some of the more extreme examples of this kind of thinking, which assumes that whatever can't be explained in terms of something that happened in the past is speculative, non-scientific, and unjustifiable:

“This argument was put forward by John Ioannidis in mid-March 2020, as the pandemic outbreak was already spiralling out of control. Ioannidis wrote that COVID-19 wasn’t a ‘once-in-a-century pandemic,’ as many were saying, but rather a ‘once-in-a-century data-fiasco’. Ioannidis’s main argument was that we knew very little about the disease, its fatality rate, and the overall risks it poses to public health; and that in face of this uncertainty, we should seek data-driven policy decisions. Until the data was available, we should assume COVID-19 acts as a typical strain of the flu (a different disease entirely).”

Unfortunately, waiting for the data also means waiting too long if the virus turns out to be more serious. This is like waiting to hit the tree before accepting that the available data indeed support wearing a seatbelt. Moreover, in the pandemic example, this “lack of evidence” argument ignores other evidence from before the virus entered the United States. China had locked down a city of 10 million; Italy had locked down its entire northern region, with the entire country soon to follow. There was worldwide consensus that the virus was novel, that it was spreading fast, and that medical communities had no idea how to treat it. That's data, and plenty of information to act on.

Crane goes on to consider a third axiom of naïve probabilism, which aims to turn ignorance into a strength. Overall, he argues, these axioms, despite being widely used by prominent authorities and academic experts, capture a set of dangerous fallacies for action in the real world.

In reality, complex problems call for simple, actionable solutions; the past doesn’t repeat indefinitely (i.e., COVID-19 was never the flu); and ignorance is not a form of wisdom. The Naïve Probabilist’s primary objective is to be accurate with high probability rather than to protect against high-consequence, low-probability outcomes. This goes against common sense principles of decision making in uncertain environments with potentially very severe consequences.

Importantly, Crane emphasizes, the hallmark of Naïve Probabilism is naïveté, not ignorance, stupidity, crudeness, or other such base qualities. The typical Naïve Probabilist lacks not knowledge or refinement, but the experience and good judgment that come from making real decisions with real consequences in the real world. The most prominent naïve probabilists are recognized (academic) experts in mathematical probability or related fields such as statistics, physics, psychology, economics, epistemology, medicine, or the so-called decision sciences. Moreover, and worryingly, the best-known naïve probabilists are quite sophisticated, skilled in the art of influencing public policy decisions without suffering the risks those policies impose on the rest of society.

Read the paper. [Archived PDF]