Predicting Corporate Credit Ratings Using Content Analysis of Annual Reports – A Naïve Bayesian Network Approach

Authors: Petr Hajek, Vladimir Olej, Ondrej Prochazka

DOI: 10.1007/978-3-319-52764-2_4

Abstract: Corporate credit ratings are based on a variety of information, including financial statements, annual reports, and management interviews. Financial indicators are critical for evaluating corporate creditworthiness. However, little is known about how qualitative information hidden in firm-related documents manifests in the rating process. To address this issue, this study aims to develop a methodology for extracting topical content from firm-related documents using latent semantic analysis. This information is integrated with traditional financial indicators into a multi-class rating prediction model. Informative indicators are obtained using a correlation-based filter in the process of feature selection. We demonstrate that Naive Bayesian networks perform statistically equivalently to other machine learning methods in terms of classification performance. We further show that “red flag” values may indicate low credit quality (non-investment rating classes) of firms. These findings can be particularly important for investors, banks and market regulators.
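
The pipeline outlined in the abstract (topical content extracted with latent semantic analysis, combined with financial indicators, then fed to a Naive Bayes classifier) can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the toy report snippets, the `leverage` indicator, and the two-class labels are hypothetical, a Gaussian Naive Bayes classifier stands in for the Naive Bayesian network, and the correlation-based feature filter is omitted for brevity.

```python
import numpy as np

# Toy corpus: hypothetical report snippets, not data from the paper.
docs = [
    "strong growth profit dividend growth",
    "profit growth stable dividend",
    "loss debt default restructuring debt",
    "debt loss impairment default",
]
# Hypothetical financial indicator and rating class
# (1 = investment grade, 0 = non-investment grade).
leverage = np.array([0.2, 0.3, 0.8, 0.9])
labels = np.array([1, 1, 0, 0])

def tfidf(texts):
    """Document-term matrix: raw term frequency times a smoothed log IDF."""
    vocab = sorted({w for t in texts for w in t.split()})
    index = {w: j for j, w in enumerate(vocab)}
    tf = np.zeros((len(texts), len(vocab)))
    for i, t in enumerate(texts):
        for w in t.split():
            tf[i, index[w]] += 1.0
    df = (tf > 0).sum(axis=0)           # number of documents containing each term
    idf = np.log(len(texts) / df) + 1.0
    return tf * idf

def lsa(matrix, k):
    """Latent semantic analysis: document coordinates on the top-k singular directions."""
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]

def fit_gaussian_nb(x, y):
    """Per-class prior, feature means and variances (features treated as independent)."""
    params = {}
    for c in np.unique(y):
        xc = x[y == c]
        # A small constant keeps variances strictly positive.
        params[int(c)] = (len(xc) / len(x), xc.mean(axis=0), xc.var(axis=0) + 1e-6)
    return params

def predict_gaussian_nb(params, x):
    """Return the class with the highest log posterior for each row of x."""
    classes = sorted(params)
    scores = []
    for c in classes:
        prior, mean, var = params[c]
        log_lik = -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var).sum(axis=1)
        scores.append(np.log(prior) + log_lik)
    return np.array(classes)[np.argmax(np.vstack(scores), axis=0)]

topics = lsa(tfidf(docs), k=2)                    # topical content of each report
features = np.column_stack([topics, leverage])    # text features + financial indicator
model = fit_gaussian_nb(features, labels)
predictions = predict_gaussian_nb(model, features)
```

In a realistic setting the classifier would be evaluated on held-out firms and with more than two rating classes; predicting on the training rows here only shows how the pieces fit together.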

References (39)
Petr Hájek, Vladimír Olej. Evaluating Sentiment in Annual Reports for Financial Distress Prediction Using Neural Networks and Support Vector Machines. International Conference on Engineering Applications of Neural Networks, pp. 1-10 (2013). DOI: 10.1007/978-3-642-41016-1_1
Petr Hájek, Vladimír Olej. Predicting Firms' Credit Ratings Using Ensembles of Artificial Immune Systems and Machine Learning - An Over-Sampling Approach. Artificial Intelligence Applications and Innovations, pp. 29-38 (2014). DOI: 10.1007/978-3-662-44654-6_3
Stefan Feuerriegel, Antal Ratku, Dirk Neumann. Analysis of How Underlying Topics in Financial News Affect Stock Prices Using Latent Dirichlet Allocation. Hawaii International Conference on System Sciences, pp. 1072-1081 (2016). DOI: 10.1109/HICSS.2016.137
Ping-Feng Pai, Yi-Shien Tan, Ming-Fu Hsu. Credit Rating Analysis by the Decision-Tree Support Vector Machine with Ensemble Strategies. International Journal of Fuzzy Systems, vol. 17, pp. 521-530 (2015). DOI: 10.1007/S40815-015-0063-Y
Huan Liu, Lei Yu. Feature Selection for High-Dimensional Data: A Fast Correlation-Based Filter Solution. International Conference on Machine Learning, pp. 856-863 (2003).
Nir Friedman, Dan Geiger, Moises Goldszmidt. Bayesian Network Classifiers. Machine Learning, vol. 29, pp. 131-163 (1997). DOI: 10.1023/A:1007465528199
David J. Hand, Robert J. Till. A Simple Generalisation of the Area Under the ROC Curve for Multiple Class Classification Problems. Machine Learning, vol. 45, pp. 171-186 (2001). DOI: 10.1023/A:1010920819831
Petr Hájek, Vladimír Olej. Credit Rating Modelling by Kernel-Based Approaches with Supervised and Semi-Supervised Learning. Neural Computing and Applications, vol. 20, pp. 761-773 (2011). DOI: 10.1007/S00521-010-0495-0
Gerard Salton, Christopher Buckley. Term Weighting Approaches in Automatic Text Retrieval. Information Processing and Management, vol. 24, pp. 323-328 (1988). DOI: 10.1016/0306-4573(88)90021-0