Authors: Anastazia Zunic, Padraig Corcoran, Irena Spasic
DOI: 10.2196/16023
Keywords:
Abstract: Background: Sentiment analysis (SA) is a subfield of natural language processing whose aim is to automatically classify the sentiment expressed in free text. It has found practical applications across a wide range of societal contexts including marketing, economy, and politics. This review focuses specifically on applications related to health, which is defined as “a state of complete physical, mental, and social well-being and not merely the absence of disease or infirmity.” Objective: This study aimed to establish the state of the art in SA related to health and well-being by conducting a systematic review of the recent literature. To capture the perspective of those individuals whose health and well-being are affected, we focused specifically on spontaneously generated content and not necessarily that of health care professionals. Methods: Our methodology is based on the guidelines for performing systematic reviews. In January 2019, we used PubMed, a multifaceted interface, to perform a literature search against MEDLINE. We identified a total of 86 relevant studies and extracted data about the datasets analyzed, discourse topics, data creators, downstream applications, algorithms used, and their evaluation. Results: The majority of data were collected from social networking and Web-based retailing platforms. The primary purpose of online conversations is to exchange information and provide social support online. These communities tend to form around health conditions with high severity and chronicity rates. Different treatments and services discussed include medications, vaccination, surgery, orthodontic services, individual physicians, and health care services in general. We identified 5 roles with respect to health and well-being among the authors of the types of spontaneous narratives considered in this review: a sufferer, an addict, a patient, a carer, and a suicide victim. Out of 86 studies considered, only 4 reported the demographic characteristics. A wide range of methods were used to perform SA. Most common choices included support vector machines, naive Bayesian learning, decision trees, logistic regression, and adaptive boosting. In contrast with general trends in SA research, only 1 study used deep learning. The performance lags behind the state of the art achieved in other domains when measured by F-score, which was found to be below 60% on average. In the context of SA, the domain of health and well-being was found to be resource poor: few domain-specific corpora and lexica are shared publicly for research purposes. Conclusions: SA results in the area of health and well-being lag behind those in other domains. It is yet unclear if this is because of the intrinsic differences between the domains and their respective sublanguages, the size of training datasets, the lack of domain-specific sentiment lexica, or the choice of algorithms.