Authors: Prashant Khare, Grégoire Burel, Diana Maynard, Harith Alani
DOI: 10.1007/978-3-030-00671-6_36
Keywords:
Abstract: Many citizens nowadays flock to social media during crises to share or acquire the latest information about the event. Due to the sheer volume of data typically circulated during such events, it is necessary to be able to efficiently filter out irrelevant posts, thus focusing attention on the posts that are truly relevant to the crisis. Current methods for classifying the relevance of posts to a crisis or set of crises typically struggle to deal with posts in different languages, and it is not viable during rapidly evolving crisis situations to train new models for each language. In this paper we test statistical and semantic classification approaches on cross-lingual datasets from 30 crisis events, consisting of posts written mainly in English, Spanish, and Italian. We experiment with scenarios where the model is trained on one language and tested on another, and where the data is translated to a single language. We show that the addition of semantic features extracted from external knowledge bases improves accuracy over a purely statistical model.