Authors: Ireneus Kagashe, Zhijun Yan, Imran Suheryani
DOI: 10.2196/JMIR.7393
Keywords: Survey methodology, Data mining, Latent Dirichlet allocation, Oseltamivir, Disease, Public health surveillance, Social media, Social desirability bias, Medicine, Environmental health, Flu season
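Abstract: Background: Uptake of medicinal drugs (preventive or treatment) is among the approaches used to control disease outbreaks, and therefore, it is of vital importance to be aware of the counts or frequencies of the most commonly used drugs and of trending topics about these drugs from consumers for successful implementation of control measures. Traditional survey methods would have accomplished this study, but they are too costly in terms of resources needed, and they are subject to social desirability bias for topic discovery. Hence, there is a need to use alternative efficient means such as Twitter data and machine learning (ML) techniques. Objective: Using Twitter data, the aim of the study was to (1) provide a methodological extension for efficiently extracting widely consumed drugs during seasonal influenza and (2) extract topics from the tweets of these drugs and infer how the insights provided by these topics can enhance seasonal influenza surveillance. Methods: From tweets collected during the 2012-13 flu season, we first identified tweets with mentions of drugs and then constructed an ML classifier using dependency words as features. The classifier was used to extract tweets that evidenced consumption of drugs, out of which we identified the mostly consumed drugs. Finally, we extracted trending topics from each of these widely used drugs’ tweets using latent Dirichlet allocation (LDA). Results: Our proposed classifier obtained an F1 score of 0.82, which significantly outperformed the two benchmark classifiers (ie, P<.001 with the lexicon-based and P=.048 with the 1-gram term frequency [TF]). The classifier extracted 40,428 tweets that evidenced consumption of drugs out of 50,828 tweets with mentions of drugs. The most widely consumed drugs were influenza virus vaccines, which had around a 76.95% (31,111/40,428) share of the total; other notable drugs were Theraflu, DayQuil, NyQuil, vitamins, acetaminophen, and oseltamivir. The topics of each of these drugs exhibited common themes or experiences from people who have consumed these drugs. Among these were the enabling and deterrent factors to influenza drug uptake, which are keys to mitigating the severity of seasonal influenza outbreaks. Conclusions: The study results showed the feasibility of using tweets of widely consumed drugs to enhance seasonal influenza surveillance in lieu of the traditional or conventional surveillance approaches. Public health officials and other stakeholders can benefit from the findings of this study, especially in enhancing strategies for mitigating the severity of seasonal influenza outbreaks. The proposed methods can be extended to the outbreaks of other diseases. [J Med Internet Res 2017;19(9):e315]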
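The proportions reported in the Results can be rechecked from the raw counts. The following is a minimal sketch, not the authors' code: the constant and function names are hypothetical, and `f1_score` is only the standard definition of F1 (harmonic mean of precision and recall), included because the abstract reports an F1 score without giving the underlying precision and recall.

```python
# Counts taken from the abstract (2012-13 flu season).
# All names below are illustrative assumptions, not from the paper.
TWEETS_WITH_DRUG_MENTIONS = 50_828
TWEETS_EVIDENCING_CONSUMPTION = 40_428
VACCINE_TWEETS = 31_111


def share(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to two decimals."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100.0 * part / whole, 2)


def f1_score(precision: float, recall: float) -> float:
    """Standard F1: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Share of consumption tweets that concern influenza virus vaccines.
vaccine_share = share(VACCINE_TWEETS, TWEETS_EVIDENCING_CONSUMPTION)

# Share of drug-mention tweets that the classifier kept as consumption.
consumption_share = share(TWEETS_EVIDENCING_CONSUMPTION,
                          TWEETS_WITH_DRUG_MENTIONS)

print(f"vaccine share of consumption tweets: {vaccine_share}%")
print(f"consumption tweets among drug mentions: {consumption_share}%")
```

The first figure reproduces the 76.95% stated in the abstract. The second is derived here from the two reported counts and does not appear in the abstract itself.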