Comparing Observed with Predicted Weekly Influenza-Like Illness Rates during the Winter Holiday Break, United States, 2004-2013

  • Hongjiang Gao,

    hgao@cdc.gov

    Affiliation Centers for Disease Control and Prevention, National Center for Disease Control and Prevention, Division of Global Migration and Quarantine, Atlanta, Georgia, United States of America

  • Karen K. Wong,

    Affiliation Centers for Disease Control and Prevention, National Center for Disease Control and Prevention, Division of Global Migration and Quarantine, Atlanta, Georgia, United States of America

  • Yenlik Zheteyeva,

    Affiliation Centers for Disease Control and Prevention, National Center for Disease Control and Prevention, Division of Global Migration and Quarantine, Atlanta, Georgia, United States of America

  • Jianrong Shi,

    Affiliation Chenega Time Solutions, Chesapeake, Virginia, United States of America

  • Amra Uzicanin,

    Affiliation Centers for Disease Control and Prevention, National Center for Disease Control and Prevention, Division of Global Migration and Quarantine, Atlanta, Georgia, United States of America

  • Jeanette J. Rainey

    Affiliation Centers for Disease Control and Prevention, National Center for Disease Control and Prevention, Division of Global Migration and Quarantine, Atlanta, Georgia, United States of America

Abstract

In the United States, influenza season typically begins in October or November, peaks in February, and tapers off in April. During the winter holiday break, from the end of December to the beginning of January, changes in social mixing patterns, healthcare-seeking behaviors, and surveillance reporting could affect influenza-like illness (ILI) rates. We compared predicted with observed weekly ILI to examine trends around the winter break period. We examined weekly rates of ILI by region in the United States from influenza season 2003–2004 to 2012–2013. We compared observed and predicted ILI rates from week 44 to week 8 of each influenza season using the auto-regressive integrated moving average (ARIMA) method. Of 1,530 region, week, and year combinations, 64 observed ILI rates were significantly higher than predicted by the model. Of these, 21 occurred during the typical winter holiday break period (weeks 51–52); 12 occurred during influenza season 2012–2013. There were 46 observed ILI rates that were significantly lower than predicted. Of these, 16 occurred after the typical holiday break during week 1, eight of which occurred during season 2012–2013. Of 90 (10 HHS regions x 9 seasons) predictions during the peak week, 78 predicted ILI rates were lower than observed. Out of 73 predictions for the post-peak week, 62 ILI rates were higher than observed. There were 53 out of 73 models that had lower peak and higher post-peak predicted ILI rates than were actually observed. While most regions had ILI rates higher than predicted during winter holiday break and lower than predicted after the break during the 2012–2013 season, overall there was not a consistent relationship between observed and predicted ILI around the winter holiday break during the other influenza seasons.

Introduction

In the United States, influenza season typically begins in October or November, peaks in February, and tapers off in April, although the timing and duration vary from year to year [1]. The U.S. Centers for Disease Control and Prevention (CDC) assesses influenza activity using the National Influenza Sentinel Surveillance System for influenza-like-illness (ILINet) [2]. More than 2,900 ILINet sentinel providers in all 50 states, Puerto Rico, the District of Columbia, and the U.S. Virgin Islands report weekly visits for influenza-like illnesses (ILI), defined as fever (≥100°F [≥37.8°C]), plus cough and/or sore throat, in the absence of another known cause of illness.

A number of studies [3–5] suggest that school closures, which temporarily change the social contact patterns of children, can reduce total illnesses and peak incidence of pandemic and seasonal influenza among school children. Among the general population, commonly observed holidays, such as the winter holiday break in late December to early January, also disrupt normal social mixing patterns. These periods provide unique opportunities to explore the relationship between ILI and temporary, atypical social patterns.

Assessing the impact of winter break on ILI is challenging. It is difficult to identify appropriate control series, and populations may not be similarly susceptible to influenza across regions. However, models that can be trained on longitudinal data to predict weekly ILI rates may be able to identify deviations from expected ILI patterns around a time period of interest.

The ARIMA method was first popularized by Box and Jenkins [6] for analyzing time-series data. Since then, ARIMA models have been widely applied in fields such as engineering, economics, agriculture, meteorology, and infectious diseases, including influenza [7–9]. Unlike most generalized linear regression models, in which predictions are restricted to the range of the predictors, ARIMA models forecast beyond the scope of model predictors using the recursive relationship between observations and error terms.

We apply the ARIMA method to compare observed and predicted weekly ILI around the winter holiday break period to explore patterns that may be associated with the holiday period.

Methods

Data Source

We obtained ILINet data for the 2003–2004 through 2012–2013 influenza seasons for each of 10 U.S. Health and Human Services (HHS) regions listed on the CDC website [2]. A U.S. HHS region is the geographic aggregation of 4–10 adjacent states or island areas. Each week, sentinel clinic providers in these regions report the number of clinic visits due to ILI, as well as the overall number of clinic visits [10]. The ILI rate (number of clinic visits due to ILI/total number of visits across the state’s sentinel clinics) is weighted according to the state population and aggregated to the regional level.
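The population weighting described above can be sketched as follows. This is a minimal illustration, not the ILINet implementation: the function name, the dictionary keys, and all visit and population figures are hypothetical, and the rate is expressed as a percentage by assumption.

```python
def regional_ili_rate(states):
    """Population-weighted regional ILI rate, in percent.

    Each state contributes its own ILI proportion (ILI visits / total visits),
    weighted by its share of the region's population.
    """
    total_population = sum(s["population"] for s in states)
    weighted = 0.0
    for s in states:
        state_rate = s["ili_visits"] / s["total_visits"]
        weighted += state_rate * s["population"] / total_population
    return 100.0 * weighted


# Hypothetical two-state region (all numbers invented for illustration).
example_states = [
    {"name": "State A", "ili_visits": 120, "total_visits": 6000, "population": 6_000_000},
    {"name": "State B", "ili_visits": 45, "total_visits": 1500, "population": 2_000_000},
]

rate = regional_ili_rate(example_states)
print(f"Weighted regional ILI rate: {rate:.2f}%")
```

Note that the weighted rate generally differs from the pooled rate (all ILI visits divided by all visits), because weighting by population prevents states with many reporting clinics from dominating the regional figure.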

Data Analysis

Winter Holiday Break.

Because the beginning and ending dates of the winter holiday break may vary by geography and year, we approximated this using website announcements made by 4,297 public schools in Tennessee and North Carolina during the 2012–2013 influenza season (data originally collected for a different project). We found that most school districts started the winter holiday break during the week of Christmas, or the Thursday and Friday of the previous week, and ended the break on the first Monday after the New Year. Therefore, we approximated the disruption to normal social mixing patterns and healthcare-seeking behavior during the holiday break as the 2-week period that covered New Year’s Day.

Defining ILINet Surveillance Weeks.

The ILI weeks in this research were defined in the same way as they are for the CDC’s Morbidity and Mortality Weekly Report (MMWR) [11]. The first day of any week is Sunday. Week 1 is the first week of the year having at least four days in the calendar year. Using this definition, years 2003 and 2008 consisted of 53 weeks in our analysis.
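The week definition above (weeks start on Sunday; week 1 is the first week with at least four days in the calendar year) can be expressed in a few lines. The sketch below uses only the Python standard library; the function names are our own and are not taken from any CDC tool.

```python
from datetime import date, timedelta


def mmwr_week1_start(year):
    """Return the Sunday that starts week 1 of the given surveillance year."""
    jan1 = date(year, 1, 1)
    # Python's weekday() is Monday=0 ... Sunday=6; shift so Sunday=0.
    days_since_sunday = (jan1.weekday() + 1) % 7
    week_start = jan1 - timedelta(days=days_since_sunday)
    # The week containing January 1 has (7 - days_since_sunday) days in `year`.
    if 7 - days_since_sunday >= 4:
        return week_start
    return week_start + timedelta(days=7)


def mmwr_week(d):
    """Return (surveillance_year, week_number) for a calendar date."""
    next_start = mmwr_week1_start(d.year + 1)
    if d >= next_start:
        # Late-December days that already belong to week 1 of the next year.
        return d.year + 1, 1
    start = mmwr_week1_start(d.year)
    year = d.year
    if d < start:
        # Early-January days that still belong to the previous year's last week.
        year -= 1
        start = mmwr_week1_start(year)
    return year, (d - start).days // 7 + 1


def weeks_in_mmwr_year(year):
    """Number of surveillance weeks (52 or 53) in the given year."""
    return (mmwr_week1_start(year + 1) - mmwr_week1_start(year)).days // 7


for y in (2003, 2004, 2008):
    print(y, weeks_in_mmwr_year(y))
```

Under this definition the sketch reproduces the 53-week years noted in the text (2003 and 2008).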

In addition to the last week and first week that encompass the New Year, we also compared observed and predicted ILI rates for week 44 to week 8 of each influenza season from 2004–2005 to 2012–2013 to determine whether similar interruptions of the ILI trend could be observed in these other influenza weeks. Our analysis therefore included a total of 1,530 unique predictions (17 weeks x 9 seasons x 10 regions). However, because years 2003 and 2008 had 53 weeks, the starting week for our data analysis in those years was week 45 rather than week 44. In addition, each 17-week period was aligned with the 17-week periods from all other seasons.
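As a check on the count of 1,530, the region, season, and week combinations can be enumerated directly. This is a small sketch under the assumptions stated in the text (17 weeks per season, a week-45 start in 53-week years); the function and variable names are illustrative only.

```python
# Years with 53 surveillance weeks, as stated in the text.
FIFTY_THREE_WEEK_YEARS = {2003, 2008}


def season_weeks(start_year):
    """Return the 17 (year, week) pairs analyzed for the season starting in start_year."""
    last_week = 53 if start_year in FIFTY_THREE_WEEK_YEARS else 52
    first_week = last_week - 8  # week 44 normally, week 45 in 53-week years
    autumn = [(start_year, w) for w in range(first_week, last_week + 1)]
    winter = [(start_year + 1, w) for w in range(1, 9)]  # weeks 1-8 of the next year
    return autumn + winter


seasons = range(2004, 2013)  # seasons 2004-2005 through 2012-2013
regions = range(1, 11)       # HHS regions 1-10

combos = [(r, y, w) for r in regions for s in seasons for (y, w) in season_weeks(s)]
print(len(combos))
```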

Auto-Regressive Integrated Moving Average Method.

To forecast ILI rates for each week-year-region combination, we used the ARIMA (p, d, q) method, in which p represents the number of autoregressive terms (i.e., the number of previous observations on which the current observation linearly depends), d is the order of differencing, and q is the number of lagged forecast errors in the prediction equation (i.e., the number of preceding estimation errors taken into account when estimating the next time-series value). For example, to forecast the ILI rate for week 52 in 2004, we used all the data points in 2003, plus weeks 1–51 in 2004, as a training set to determine the proper parameters in the ARIMA model using model identification, model diagnosis, and forecasting [12]. In the identification step, we selected parameters p, d, and q using the Bayesian information criterion (BIC), which assigns a penalty according to the number of parameters in the model. In the model-diagnosis step, we used the Ljung-Box statistical test and a quantile-quantile plot to check the time-series assumptions. We also applied the test outlined by Osborn et al. [13] to check whether seasonality terms should be included in the ARIMA model. Finally, in the forecasting step, we applied the model identified in the above steps to predict the weekly ILI rate and computed 95% prediction intervals by bootstrapping the model error 5,000 times.
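The forecasting step can be illustrated with a deliberately simplified sketch. It assumes an AR(1) model fitted by least squares as a stand-in for the BIC-selected ARIMA model, and a plain residual bootstrap for the interval; the article does not specify the bootstrap scheme, so both choices, the function names, and the example series are assumptions made for illustration.

```python
import random


def fit_ar1(series):
    """Least-squares fit of y[t] = c + phi * y[t-1] + e[t]; returns (c, phi, residuals)."""
    x, y = series[:-1], series[1:]
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    phi = sxy / sxx
    c = mean_y - phi * mean_x
    residuals = [yi - (c + phi * xi) for xi, yi in zip(x, y)]
    return c, phi, residuals


def bootstrap_interval(series, n_boot=5000, level=0.95, seed=0):
    """One-step-ahead forecast with a residual-bootstrap prediction interval."""
    c, phi, residuals = fit_ar1(series)
    point = c + phi * series[-1]
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    # Resample fitted errors and add them to the point forecast.
    sims = sorted(point + rng.choice(residuals) for _ in range(n_boot))
    tail = round((1 - level) / 2 * n_boot)  # draws cut from each tail
    return point, sims[tail], sims[n_boot - tail - 1]


# Hypothetical weekly ILI rates (percent) leading up to the week being forecast.
weekly_ili = [1.0, 1.2, 1.5, 1.9, 2.4, 2.8, 3.1, 3.0, 2.6, 2.2]
point, lower, upper = bootstrap_interval(weekly_ili)
print(f"forecast={point:.2f}, 95% PI=({lower:.2f}, {upper:.2f})")
```

In the analysis itself the order (p, d, q) was chosen by BIC and checked with the Ljung-Box test before forecasting; those steps are omitted here to keep the sketch short.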

We repeated these steps, as described above, for each combination of the 17 weeks (weeks 44 to week 8), nine influenza seasons, and 10 HHS regions. To generate the forecasting series for a specific week in one influenza season, we replaced the observed ILI rate for this week in all previous years with the predicted ILI rate to avoid the potential carryover effect by the actual ILI data in the prediction. For example, to forecast the ILI rate for week Y in influenza season 2005, we replaced the observed week Y ILI rates in 2003 and 2004 with ARIMA model-predicted values. Because there were not enough data points to have reliable predictions for the last week of 2003 and first week of 2004, these weeks are excluded from the results presented.
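The replacement of earlier observed values with model predictions can be sketched as below. The data layout (dictionaries keyed by year and week), the function name, and all numeric values are hypothetical; only the substitution rule follows the description above.

```python
def build_training_series(observed, predicted, target):
    """Build the chronological training series for forecasting `target`.

    observed, predicted: dicts mapping (year, week) -> ILI rate.
    target: the (year, week) to be forecast.
    For earlier years, the observed value at the target's week number is
    replaced by the previously predicted value when one is available.
    """
    _, target_week = target
    series = []
    for key in sorted(observed):  # (year, week) tuples sort chronologically
        if key >= target:
            break
        year, week = key
        if week == target_week and key in predicted:
            series.append(predicted[key])
        else:
            series.append(observed[key])
    return series


# Hypothetical data restricted to weeks 51 and 52 for brevity.
observed = {
    (2003, 51): 1.8, (2003, 52): 2.5,
    (2004, 51): 2.0, (2004, 52): 3.1,
    (2005, 51): 2.2, (2005, 52): 3.4,
}
predicted = {(2003, 52): 2.1, (2004, 52): 2.4}

training = build_training_series(observed, predicted, target=(2005, 52))
print(training)
```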

All analyses were performed using the forecast package, version 4.06 [14], in R 3.0.1 [15] and PROC ARIMA in SAS 9.3 for Windows 7 (SAS Institute, Cary, NC). Because the ILI surveillance data are publicly available and include summary data only, this research was not subject to CDC institutional review board (IRB) review.

Results

We present the last-week and first-week predictions in the main article, in Tables 1 and 2 and Tables 3 and 4, respectively. All other results from weeks 44–51 and weeks 2–8, plus model-fitting procedures, are reported in the supplemental material. We summarize our results into three categories in this section: 1) predicted ILI rates lower than observed; 2) predicted ILI rates higher than observed; and 3) model predictions at peak and after peak.

Table 1. Last Week ILI rate prediction for HHS regions 1–5 from influenza seasons 2004–2005 to 2012–2013.

https://doi.org/10.1371/journal.pone.0143791.t001

Table 2. Last Week ILI rate prediction for HHS regions 6–10 from influenza seasons 2004–2005 to 2012–2013.

https://doi.org/10.1371/journal.pone.0143791.t002

Table 3. Week 1 ILI rate prediction for HHS regions 1–5 from influenza seasons 2004–2005 to 2012–2013.

https://doi.org/10.1371/journal.pone.0143791.t003

Table 4. Week 1 ILI rate prediction for HHS regions 6–10 from influenza seasons 2004–2005 to 2012–2013.

https://doi.org/10.1371/journal.pone.0143791.t004

Predicted ILI Rates Lower than Observed

Of 1,530 model predictions from influenza seasons 2004–2005 to 2012–2013 for each HHS region between weeks 44 and week 8, 927 (61%) predicted weekly ILI rates were lower than observed (S2 Table). The percentage of predicted ILI rates that were lower than observed ranged from 44% in 2009–2010 to 67% in 2004–2005. The percentage of predicted ILI rates that were lower than observed was similar across regions (from 58% in Region 10 to 63% in Region 5).

There were 64 of 1,530 predicted ILI rates that were statistically significantly lower than observed (i.e., the observed weekly ILI rate was higher than the upper bound of 95% bootstrapped prediction interval). These 64 predicted values were almost evenly distributed by HHS region. However, seasons 2007–2008 and 2012–2013 contributed 18 and 17 predictions, respectively. Of the 64 predicted values that were significantly lower than observed, the most commonly identified week numbers were weeks 52, 51, 5, 7, and 4, which contributed 12, 9, 8, 7, and 7 values, respectively.

Fig 1A illustrates the number of predicted rates that were significantly lower than observed by influenza season and week. Weeks 51–52 form an apparent cluster, with a combined 21 (33% of 64) predicted values that were significantly lower than observed. Twelve of the 21 values in this cluster were from influenza season 2012–2013 and nine from week 52 (Tables 1 and 2).

Fig 1. a) Number of predicted ILI rates that are significantly lower than observed (the upper bound of the 95% prediction interval is lower than the observed rate), by influenza season and week; b) number of predicted ILI rates that are significantly higher than observed (the lower bound of the 95% prediction interval is higher than the observed rate), by influenza season and week.

https://doi.org/10.1371/journal.pone.0143791.g001

Predicted ILI Rates Higher than Observed

Of 1,530 predicted ILI rates, 38% (579) were higher than the corresponding observed values (S2 Table). Influenza season 2009–2010 had the most predicted ILI rates that were higher than observed (93/1,530; 6%) and influenza season 2010–2011 was the least represented (48/1,530; 3%). There was regional variation in the number of ILI predictions that were higher than observed (36% in region 5 and region 9 to 42% in region 10).

There were 46/1,530 (3%) predictions that were statistically significantly higher than observed (i.e., the observed weekly ILI rate was lower than the lower bound of 95% bootstrapped prediction interval). Influenza season 2009–2010 comprised most of these predictions (12/46; 26%), followed by 2012–2013 (10/46, 22%) and 2004–2005 (5/46, 11%).
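The two significance rules used in this section (observed above the upper bound, or observed below the lower bound, of the 95% prediction interval) amount to a simple classification. The sketch below is illustrative only; the labels, function name, and example values are our own.

```python
from collections import Counter


def classify_prediction(observed, lower, upper):
    """Compare an observed ILI rate with the bounds of its 95% prediction interval."""
    if observed > upper:
        return "predicted significantly lower than observed"
    if observed < lower:
        return "predicted significantly higher than observed"
    return "not significantly different"


# Hypothetical (observed, lower bound, upper bound) triples.
examples = [
    (5.8, 2.9, 4.6),  # observed above the interval
    (2.1, 2.6, 4.0),  # observed below the interval
    (3.0, 2.5, 3.8),  # observed inside the interval
]

counts = Counter(classify_prediction(obs, lo, hi) for obs, lo, hi in examples)
for label, n in counts.items():
    print(f"{label}: {n}")
```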

Fig 1B displays the 46 predicted ILI rates that are significantly higher than observed by influenza season and week. Most of these predicted values were for week 1 (16/46, 35%). A cluster that accounted for 11 predicted values was seen from weeks 44–46. Influenza season 2012–2013 dominated the contribution from week 1 (Tables 3 and 4), and influenza season 2009–2010 contributed the cluster of predictions from weeks 44–46.

Model Predictions at Peak and after Peak

While the timing of peak observed ILI activity often differed by HHS region, in some influenza seasons most regions peaked during the same week. For example, in influenza season 2009–2010, the peak observed weekly ILI rate occurred in week 44 in all 10 HHS regions. In influenza season 2012–2013, the peak occurred in week 52 in eight of the 10 HHS regions. In seven of the 10 HHS regions, the peak ILI rate occurred in week 7 during influenza seasons 2004–2005 and 2007–2008.

Of 90 (10 HHS regions x 9 seasons) predictions during the peak week ILI activity, 78 predicted ILI rates were lower than observed. In seasons in which the peak week ILI activity occurred in week 7 or earlier, 62 out of 73 predictions for the post-peak week (i.e., the week after the peak ILI week) were higher than observed. There were 53 out of 73 models that had lower peak and higher post-peak predicted ILI rates than were actually observed.
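The peak and post-peak comparison can be sketched for a single region and season as follows. The function name and the example series are hypothetical; the post-peak check is skipped when the peak falls in the last week of the window, mirroring the restriction to seasons that peaked in week 7 or earlier.

```python
def peak_and_post_peak(observed, predicted):
    """Compare predictions with observations at the peak week and the week after.

    observed, predicted: equal-length lists of weekly ILI rates for one
    region and season, ordered from week 44 to week 8.
    """
    peak = max(range(len(observed)), key=lambda i: observed[i])
    result = {
        "peak_index": peak,
        "peak_under_predicted": predicted[peak] < observed[peak],
        "post_peak_over_predicted": None,  # stays None if no post-peak week exists
    }
    if peak + 1 < len(observed):
        result["post_peak_over_predicted"] = predicted[peak + 1] > observed[peak + 1]
    return result


# Hypothetical observed and predicted rates for six consecutive weeks.
observed = [1.5, 2.4, 3.9, 5.8, 4.1, 3.0]
predicted = [1.6, 2.2, 3.3, 4.7, 5.2, 3.6]

summary = peak_and_post_peak(observed, predicted)
print(summary)
```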

Discussion

From week 44 to week 8 of the influenza seasons investigated, week 52 (during the typical winter holiday break) had the most predicted ILI rates that were significantly lower than observed and week 1 (after the typical winter holiday break) had the most predicted ILI rates that were significantly higher than observed. However, these findings were largely driven by a single influenza season (2012–2013). During 2009–2010, an unusual season due to the emergence of influenza A(H1N1)pdm09, weeks 44–46 demonstrated a similar phenomenon, with an unusually high number of predicted ILI rates significantly higher than observed.

Influenza seasons 2012–2013 and 2009–2010 share some common characteristics in terms of the weekly ILI rates reported by ILINet. First, both seasons had earlier starts than the other influenza seasons we investigated in this study. Second, both seasons had a very narrow peak across all U.S. HHS regions; this occurred in weeks 41–42 for influenza season 2009–2010 and week 52 for influenza season 2012–2013. This could be one explanation for our findings. For example, if we look at influenza season 2012–2013 alone, seven of 10 HHS regions reported peak ILI activity in week 52, and 13 out of 20 first-week and last-week predictions used an ARIMA(2,0,2), ARIMA(2,0,1), or ARIMA(1,0,2) model. When a time-series model is used to forecast peak values along the timeline, the predicted value will, in most cases, depend on the past week's value or a rolling average of the past two weeks’ values. As a result, the forecasted ILI rate at week 52 will be smaller than the highest observed ILI rate. However, when the model forecasts the data point next to the peak, the peak observation exerts influence on this prediction, which is then higher than the observed value.
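The lag effect described above can be made concrete with a toy calculation. The sketch assumes a first-order autoregressive forecast, which is simpler than the ARIMA(2,0,2)-type models actually fitted, and uses a hypothetical mean, coefficient, and series chosen only to show the direction of the errors around a sharp peak.

```python
def ar1_one_step_forecasts(series, mean, phi):
    """One-step-ahead AR(1) forecasts: f[t] = mean + phi * (y[t-1] - mean)."""
    forecasts = [None]  # no forecast is possible for the first observation
    for previous in series[:-1]:
        forecasts.append(mean + phi * (previous - mean))
    return forecasts


# Hypothetical weekly ILI rates with a sharp peak in the fourth week.
observed = [2.0, 3.0, 4.5, 6.0, 4.0, 3.0]
forecasts = ar1_one_step_forecasts(observed, mean=2.0, phi=0.9)

peak = observed.index(max(observed))
print(f"peak week:      observed={observed[peak]}, forecast={forecasts[peak]:.2f}")
print(f"post-peak week: observed={observed[peak + 1]}, forecast={forecasts[peak + 1]:.2f}")
```

Because each forecast is anchored to the preceding week, the forecast for the peak week trails the rise and falls short of the peak, while the forecast for the following week inherits the peak value and overshoots the decline.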

Changing daily routines or traveling to other parts of the country during the holiday break could influence the decision to seek care for ILI symptoms. During week 52, the total number of visits reported to ILINet decreases, while the number of ILI-related visits remains the same around the peak, resulting in an elevated proportion of ILI visits (Fig 2). Because ARIMA model predictions for week 52 are based on previous weeks when the total number of visits was higher, the observed ILI rate may be higher than expected. In early January, the observed proportion of ILI may decrease because of the increase in total patient visits. The model prediction in the week after the holiday break is based on previous ILI rates, including the holiday week with the lower denominator of visits; this can lead to predicted ILI rates that are higher than observed. In addition to the changes in the denominator of patient visits during the holiday break, ILI cases requiring medical attention may be more severe than ILI cases reported during non-holiday weeks because of differences in patient care-seeking behavior.
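The denominator effect described above is simple arithmetic: if ILI visits hold steady while total visits fall, the ILI proportion rises even though no more people are ill. The visit counts below are hypothetical and chosen only to make the effect visible.

```python
def ili_percentage(ili_visits, total_visits):
    """ILI visits as a percentage of all reported visits."""
    return 100.0 * ili_visits / total_visits


ILI_VISITS = 300  # assumed constant across both weeks

typical_week = ili_percentage(ILI_VISITS, total_visits=12_000)
holiday_week = ili_percentage(ILI_VISITS, total_visits=9_000)  # fewer total visits

print(f"typical week: {typical_week:.2f}%")
print(f"holiday week: {holiday_week:.2f}%")
```

With these numbers, a 25% drop in total visits raises the ILI percentage from 2.5% to about 3.3% with no change in the number of ILI visits.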

Fig 2. Average total number of patient visits (red line, scales on the left-side y-axis) and average total number of ILI visits (blue line, scales on the right-side y-axis) across all HHS regions between 2003–2004 and 2012–2013 influenza seasons.

https://doi.org/10.1371/journal.pone.0143791.g002

The variability we detected between predicted and observed ILI rates during the winter holiday break of certain influenza seasons may reflect several factors, including a change in social mixing patterns, healthcare-seeking behavior, specific characteristics of the circulating influenza strain, or artifacts of ILINet surveillance data. ILINet surveillance does not capture the full age-specific information; therefore, we could not explore whether our findings are related to age-specific ILI reports or other factors.

We were unable to assess the role of different influenza strains on the predicted ILI rates, although such differences can impact the epidemiology of influenza [12, 13]. Previous infection from one strain can provide full or partial immunity to the same strain, or similar influenza strains, during subsequent exposures. Additionally, certain strains are associated with more or less transmissibility and pathogenicity, affecting illness severity and use of healthcare services. Any of these virus-related factors, as well as co-circulation or serial circulation of different strains, could have influenced our findings.

Our time-series analysis was an ecological approach to describing the relationship between ILINet data and the winter holiday break. As a result, we cannot establish a causal relationship between winter holiday breaks and changes in ILI rates. ILINet provides weekly data on all people with ILI, and since these data do not rely on laboratory confirmation, reported ILI rates likely included people with influenza as well as other acute respiratory diseases. Repeating this analysis using confirmed influenza cases could be informative. Additionally, we divided the winter holiday break into two time frames in this analysis: the last week of the year and the first week of each year. This categorization was made based on our estimation of the typical duration of the winter holiday break, given limited information in the literature as well as the availability of surveillance data by week. Future analyses interpolating daily changes in ILI rates may be warranted. We must also note that our time-series analysis used only 51 and 52 data points for predicting ILI rates for the last week and the first week of the 2003–2004 influenza season, respectively. This could have limited the reliability of our predictions; more data points would have been beneficial.

Conclusion

In our analysis, we detected high variability in the temporal relationship between winter holiday break and weekly ILI rates across influenza seasons. While overall observed ILI rates during the last week in December were higher than predicted, and observed ILI rates during the first week of January were lower than predicted, these findings were mainly attributable to a single influenza season, 2012–2013. We demonstrated the use of the ARIMA method in conducting this time-series analysis. Additional analyses using this method and others to incorporate environmental factors or virological data may better clarify the relationship between influenza activity and the winter holiday break.

Supporting Information

S1 Fig. Model selection and assumption check.

The ARIMA model selection and assumption check for the last-week prediction of 2004 in HHS region 4: a) autocorrelation function indicated a wide range of lagged error terms (q term in ARIMA [p,d,q]); b) partial autocorrelation function indicated the number of auto-regressive terms (p term in ARIMA [p,d,q]) was either 1 or 2; c-d) based on Bayesian information criterion, ARIMA(2,0,2) was selected for the time-series fitting and histogram and quantile-quantile plot assessed the normality assumption for time-series residuals.

https://doi.org/10.1371/journal.pone.0143791.s001

(TIF)

S2 Fig. Last-week ILI rate prediction for HHS region 4.

Last-week ILI rate prediction for HHS region 4 was based on previously observed weekly ILI rates and previously fitted last-week ILI rates. The solid blue line represents weekly ILI rates reported by CDC ILINet, and the solid red line represents fitted weekly ILI rates. Blue dots and red dots represent the observed and predicted week-52 ILI rates, respectively, with ARIMA models for the following years: a) 2004 last-week prediction by ARIMA(2,0,2); b) 2006 last-week prediction by ARIMA(2,0,2); and c) 2008 last-week prediction by ARIMA(2,0,2).

https://doi.org/10.1371/journal.pone.0143791.s002

(TIF)

S1 Table. Bayesian information for the candidate ARIMA models to forecast the 2004 week-52 ILI rate.

https://doi.org/10.1371/journal.pone.0143791.s004

(DOCX)

S2 Table. ARIMA models and predictions for week 44 to week 8 from influenza season 2004–2005 to 2012–2013.

https://doi.org/10.1371/journal.pone.0143791.s005

(XLSX)

Acknowledgments

Disclaimer: The findings and conclusions in this report are those of the authors and do not necessarily represent the official position of the Centers for Disease Control and Prevention.

Author Contributions

Conceived and designed the experiments: HG JJR YZ AU. Performed the experiments: HG. Analyzed the data: HG JRS. Wrote the paper: HG JJR KKW.

References

  1. Textbook of Influenza. Second ed. John Wiley & Sons, Ltd, The Atrium, Southern Gate, Chichester, West Sussex, PO19 8SQ, UK: Wiley Blackwell; 2013.
  2. National and Regional Level Outpatient Illness and Viral Surveillance. US Centers for Disease Control and Prevention. Available: http://gis.cdc.gov/grasp/fluview/fluportaldashboard.html. Accessed 21 December 2014.
  3. Earn DJD, He D, Loeb MB, Fonseca K, Lee BE, Dushoff J. Effects of School Closure on Incidence of Pandemic Influenza in Alberta, Canada. Annals of Internal Medicine. 2012;156(3):173–81. pmid:22312137
  4. Cauchemez S, Valleron AJ, Boelle PY, Flahault A, Ferguson NM. Estimating the impact of school closure on influenza transmission from Sentinel data. Nature. 2008;452(7188):750–4. pmid:18401408
  5. Heymann AD, Hoch I, Valinsky L, Kokia E, Steinberg DM. School closure may be effective in reducing transmission of respiratory viruses in the community. Epidemiology and Infection. 2009;137(10):1369–76. pmid:19351434
  6. Box GEP, Jenkins GM. Time series analysis: forecasting and control. Holden-Day; 1976.
  7. Soebiyanto RP, Adimi F, Kiang RK. Modeling and predicting seasonal influenza transmission in warm regions using climatological parameters. PLOS One. 2010;5(3):e9450. pmid:20209164. PubMed Central PMCID: PMC2830480.
  8. Upshur RE, Knight K, Goel V. Time-series analysis of the relation between influenza virus and hospital admissions of the elderly in Ontario, Canada, for pneumonia, chronic lung disease, and congestive heart failure. Am J Epidemiol. 1999;149(1):85–92. pmid:9883797
  9. Kane MJ, Price N, Scotch M, Rabinowitz P. Comparison of ARIMA and Random Forest time series models for prediction of avian influenza H5N1 outbreaks. BMC Bioinformatics. 2014;15:276. pmid:25123979. PubMed Central PMCID: PMC4152592.
  10. Overview of Influenza Surveillance in the United States 2015. US Centers for Disease Control and Prevention. Available: http://www.cdc.gov/flu/weekly/overview.htm. Accessed 31 March 2015.
  11. MMWR Weeks. US Centers for Disease Control and Prevention. Available: http://wwwn.cdc.gov/nndss/document/MMWR_week_overview.pdf. Accessed 12 March 2015.
  12. Brocklebank JC, Dickey DA. SAS for Forecasting Time Series. 2nd ed. SAS Institute Inc.; 2003.
  13. Osborn DR, Chui APL, Smith JP, Birchenhall CR. Seasonality and the Order of Integration for Consumption. Oxford Bulletin of Economics and Statistics. 1988;50(4):361–77.
  14. Hyndman RJ, Athanasopoulos G, Razbash S, Schmidt D, Zhou Z, Khan Y, et al. forecast: Forecasting functions for time series and linear models. R package version 4.06. 2013.
  15. R Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2013.