Authors: Ravil I Mukhamediev, Kirill Yakunin, Rustam Mussabayev, Timur Buldybayev, Yan Kuchin
DOI: 10.3390/SYM12121945
Keywords:
Abstract: Mass media not only reflect the activities of state bodies but also shape the informational context, sentiment, depth, and significance level attributed to certain state initiatives and social events. A multilateral and quantitative (to the practicable extent) assessment of media activity is important for understanding their objectivity, role, focus, and, ultimately, the quality of society’s “fourth power”. The paper proposes a method for evaluating the media in several modalities (topics, evaluation criteria/properties, classes), combining topic modeling of text corpora with multiple-criteria decision making. The evaluation is based on an analysis of the corpora as follows: the conditional probability distribution of the media by topics, properties, and classes is calculated after the formation of the topic model of the corpora. Several approaches are used to obtain weights that describe how each topic relates to each evaluation criterion/property and to each class described in the paper, including manual high-level labeling, a multi-corpora approach, and an automatic approach. The proposed multi-corpora approach uses the topical asymmetry of the corpora to obtain weights describing each topic’s relationship to a criterion/property. These weights, combined with the topic model, can be applied to evaluate each document in the corpora according to the considered criteria and classes. The proposed method was applied to a corpus of 804,829 news publications from 40 Kazakhstani sources, published from 01 January 2018 to 31 December 2019, to classify negative information on socially significant topics. A BigARTM model was derived (200 topics) and the proposed method was applied, including to fill the table for the analytical hierarchical process (AHP) and to carry out all necessary high-level labeling procedures. Experiments confirm the general possibility of evaluating the media using the topic model of the text corpora, because an area under the receiver operating characteristics curve (ROC AUC) score of 0.81 was achieved in the classification task, which is comparable with the results obtained for the same task by applying the BERT (Bidirectional Encoder Representations from Transformers) model.
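
The abstract's core step, combining per-topic weights with a topic model to score each document against a criterion and then checking the scores with ROC AUC, can be illustrated with a minimal sketch. The abstract does not give the aggregation formula, so the weighted sum over topics used here is an assumption, as are the function names `score_documents` and `roc_auc` and all of the toy numbers. The document-topic matrix `theta` stands in for the output of a topic model such as BigARTM.

```python
import numpy as np


def score_documents(theta, weights):
    """Score documents against criteria (illustrative assumption, not the paper's formula).

    theta:   (n_docs, n_topics) array, row d holds P(topic | document d).
    weights: (n_topics, n_criteria) array, entry [t, c] says how strongly
             topic t relates to criterion c.
    Returns an (n_docs, n_criteria) array of weighted sums over topics.
    """
    theta = np.asarray(theta, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if theta.shape[1] != weights.shape[0]:
        raise ValueError("theta and weights disagree on the number of topics")
    return theta @ weights


def roc_auc(labels, scores):
    """ROC AUC as the probability that a random positive outranks a random negative.

    Ties between a positive and a negative count as half a win.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    pos, neg = scores[labels], scores[~labels]
    if pos.size == 0 or neg.size == 0:
        raise ValueError("need at least one positive and one negative label")
    diff = pos[:, None] - neg[None, :]
    wins = (diff > 0).sum() + 0.5 * (diff == 0).sum()
    return float(wins / (pos.size * neg.size))


# Toy data (hypothetical): 4 documents, 3 topics, 1 criterion ("negativity").
theta = [
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.2, 0.2, 0.6],
    [0.6, 0.3, 0.1],
]
weights = [[1.0], [0.0], [0.5]]  # topic 0 strongly negative, topic 2 partly
labels = [1, 0, 0, 1]            # hypothetical manual labels: 1 = negative news

scores = score_documents(theta, weights)
auc = roc_auc(labels, scores[:, 0])
```

In this sketch the weights could come from any of the three sources the abstract lists (manual labeling, the multi-corpora approach, or the automatic approach); only the scoring and evaluation steps are shown.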