Authors: Theodore B. Trafalis, Indra Adrianto, Michael B. Richman, S. Lakshmivarahan
DOI: 10.1007/S10287-013-0174-6
Keywords: Probabilistic logic, Rare events, Pattern recognition, Feature (machine learning), Support vector machine, Computer science, Artificial intelligence, Feature selection, Tornado, Severe weather, Machine learning, Random forest
Abstract: Learning from imbalanced data, where the number of observations in one class is significantly larger than in the other class, has gained considerable attention in the machine learning community. Assuming the difficulty of predicting each class is similar, most standard classifiers will tend to predict the majority class well. This study applies tornado data that are highly imbalanced, as tornadoes are rare events. The severe weather data used herein have thunderstorm circulations (mesocyclones) that produce tornadoes in approximately 6.7 % of the total number of observations. However, since tornadoes are high impact weather events, it is important to predict the minority class with high accuracy. In this study, we apply support vector machines (SVMs) and logistic regression without a midpoint threshold adjustment on the probabilistic outputs, random forest, and rotation forest for tornado prediction. Feature selection with SVM-recursive feature elimination was also performed to identify the important features or variables for predicting tornadoes. The results showed that the SVMs provided better performance compared to the other classifiers.
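The abstract mentions a midpoint threshold adjustment on probabilistic outputs but does not define it. The sketch below assumes one possible reading: instead of the default cutoff of 0.5, the decision threshold is placed at the midpoint between the mean predicted probability of the positive class and that of the negative class. The function names `midpoint_threshold` and `confusion_counts` and the toy probabilities are hypothetical, chosen only for illustration. They are not the authors' method or data.

```python
from statistics import mean


def midpoint_threshold(probs, labels):
    """Assumed rule: midpoint between the per-class mean predicted probabilities."""
    pos = [p for p, y in zip(probs, labels) if y == 1]
    neg = [p for p, y in zip(probs, labels) if y == 0]
    return (mean(pos) + mean(neg)) / 2


def confusion_counts(probs, labels, threshold):
    """Return (tp, fp, fn, tn), predicting class 1 when prob >= threshold."""
    tp = fp = fn = tn = 0
    for p, y in zip(probs, labels):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


# Hypothetical classifier outputs: 10 majority (0) cases, 3 minority (1) cases.
probs = [0.05, 0.10, 0.10, 0.15, 0.20, 0.20, 0.25,
         0.30, 0.35, 0.45, 0.30, 0.40, 0.60]
labels = [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1]

default_counts = confusion_counts(probs, labels, 0.5)       # (1, 0, 2, 10)
adjusted = midpoint_threshold(probs, labels)                # about 0.324
adjusted_counts = confusion_counts(probs, labels, adjusted)  # (2, 2, 1, 8)

print("threshold 0.5:", default_counts)
print(f"threshold {adjusted:.3f}:", adjusted_counts)
```

On this toy data, lowering the threshold detects two of the three minority cases instead of one, at the cost of two false alarms. That trade-off is the reason threshold adjustment is considered for rare-event prediction.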