Detecting Influenza Epidemics on Twitter

Abstract

This paper presents a predictive model for Influenza-Like Illness (ILI) based on Twitter traffic. We gather tweets matching a set of keywords drawn from the Influenza Wikipedia page and perform feature selection over all words used in three years' worth of tweets, using real ILI data from the Greek CDC. We select a small set of words that correlate highly with the ILI score and train a regression model to predict the ILI score from these word features. We deploy this model in a streaming application and feed the resulting time series to FluHMM, an existing prediction model for the phases of the epidemic. We find that Twitter traffic is a good source of information and can generate earlier warnings than the existing sentinel protocol, which relies on a network of associated physicians across Greece.
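
The select-then-regress idea described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the toy weekly word frequencies and ILI scores are invented, the helper names (`pearson`, `select_words`, `fit_line`) are hypothetical, and the regression is a single-feature least-squares line over the summed frequencies of the selected words. The abstract does not specify which correlation measure or regression model the authors use, so this should not be read as their method.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series carries no usable signal
    return cov / sqrt(var_x * var_y)


def select_words(word_freqs, ili, k):
    """Return the k words whose weekly frequency correlates most with ILI."""
    ranked = sorted(word_freqs, key=lambda w: pearson(word_freqs[w], ili), reverse=True)
    return ranked[:k]


def fit_line(xs, ys):
    """Ordinary least-squares fit of ys ~ slope * xs + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    return slope, mean_y - slope * mean_x


# Toy data, invented for illustration only (not taken from the paper).
ili = [1.0, 2.0, 4.0, 3.0, 1.5]            # weekly ILI scores
word_freqs = {                              # weekly frequency of each word
    "fever": [2.0, 4.0, 8.0, 6.0, 3.0],
    "cough": [1.0, 2.5, 3.5, 3.0, 2.0],
    "music": [5.0, 5.0, 4.0, 6.0, 5.0],
}

# Step 1: keep the words most correlated with the ILI score.
selected = select_words(word_freqs, ili, k=2)

# Step 2: build one feature per week (assumption: sum of selected frequencies).
feature = [sum(word_freqs[w][t] for w in selected) for t in range(len(ili))]

# Step 3: regress the ILI score on that feature and predict.
slope, intercept = fit_line(feature, ili)
predicted = [slope * x + intercept for x in feature]
```

In a streaming setting, the fitted `slope` and `intercept` would be applied to each new week's word counts, and the resulting series of predicted scores is the kind of time series the abstract describes passing on to FluHMM.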
