
Comparing Trained and Untrained Probabilistic Ensemble Forecasts of COVID-19 Cases and Deaths in the United States

In recent work published in the International Journal of Forecasting, Professor Jacob Bien and his co-authors describe their efforts to evaluate the performance of different ensemble models to forecast cases and deaths as part of the US Covid-19 Forecast Hub, which was used by the CDC and various state health officials.

03.22.23

Accurate short-term forecasts of infectious disease indicators are critical to health officials and policymakers. During the Covid-19 pandemic, for example, health officials depended on forecasts of cases and deaths to plan and deploy healthcare measures and to distribute limited healthcare resources and supplies. Forecasting models differ in several respects: some use explicit or implicit representations of the disease transmission process, while others rely only on historical data. Prior work has shown that ensemble forecasting models, which combine predictions from many models, provide more accurate and robust forecasts than individual models.

In recent work published in the International Journal of Forecasting, Professor Jacob Bien and his co-authors describe their efforts to evaluate the performance of different ensemble models for forecasting cases and deaths as part of the US Covid-19 Forecast Hub, which was used by the CDC and various state health officials. The Hub created ensemble short-term forecasts of Covid-19 cases and deaths at the state and national level by combining forecasts submitted by a large and variable number of contributing teams using different modeling techniques and data sources. These ensembles provided a probabilistic distribution of the number of cases and deaths, rather than simply point predictions, since decision-makers need to plan for a range of scenarios.

A key finding from the study is that ensemble models based on the median of the individual forecasts consistently perform better than ensemble models based on the mean, because median-based ensembles are less sensitive to outlier forecasts. Another important finding is that ensemble models that give unequal weights to the individual forecasts, or that include only a subset of them (called trained ensemble models), outperformed models that weight every forecast equally (untrained ensemble models). This holds, however, only when some individual forecasting models show a consistent record of good performance over time, allowing the trained ensemble to assign those forecasts higher weights. When individual models do not perform consistently over time, the untrained models, which are simpler to deploy, are the better choice. The study also found that trained ensemble models were quite successful at forecasting deaths in the US during the Covid-19 pandemic but less successful at predicting cases.
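The intuition behind the median ensemble's robustness can be seen with a small numerical sketch. The numbers below are invented for illustration and are not from the paper or the Forecast Hub; the sketch simply shows how one extreme contributed forecast pulls an equally weighted mean far more than an equally weighted median:

```python
import numpy as np

# Hypothetical point forecasts of weekly deaths from five contributing
# models; the last model is an extreme outlier.
forecasts = np.array([1000.0, 1050.0, 980.0, 1020.0, 5000.0])

# Untrained (equally weighted) ensembles of the two kinds compared in the study:
mean_ensemble = forecasts.mean()        # pulled far upward by the outlier
median_ensemble = np.median(forecasts)  # stays near the consensus of the other models

print(mean_ensemble)    # 1810.0
print(median_ensemble)  # 1020.0
```

A trained ensemble would instead learn weights from each model's past performance, down-weighting models like the outlier; the study's point is that this extra machinery pays off only when past performance is a reliable guide to future performance.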

Overall, this study provides valuable guidance on how to construct ensemble models that deliver accurate and robust forecasts, which are of immense value to researchers and decision-makers predicting the spread of any infectious disease.

"Comparing trained and untrained probabilistic ensemble forecasts of COVID-19 cases and deaths in the United States"

Ray, Brooks, Bien, et al.

GO TO PAPER