Author: Antonia Kyriakopoulou
DOI: 10.5772/6083
Keywords:
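Abstract: Supervised and unsupervised learning have been the focus of critical research in the areas of machine learning and artificial intelligence. In the literature, these two streams flow independently of each other, despite their close conceptual and practical connections. In this work we deal exclusively with the scenario of text classification aided by clustering. This chapter provides a review and interpretation of the role of clustering in different fields of text classification, with an eye towards identifying important areas of research. Drawing upon the literature review and analysis, we discuss several issues surrounding text classification tasks and the role of clustering in support of these tasks. We define the problem, postulate a number of baseline methods, examine the techniques used, and classify them into meaningful categories. A standard research issue for text classification is the creation of compact representations of the feature space and the discovery of the complex relationships that exist between features, documents and classes. There are approaches that try to quantify the notion of information for the basic components of a text classification problem. Given the variables of interest, sources of information about these variables can be compressed while preserving their information. Clustering is one of the approaches used in this context. In this vein, an important area of research where clustering is used to aid text classification is dimensionality reduction. Clustering is used as a feature compression and/or extraction method: features are clustered into groups based on selected clustering criteria. Feature clustering methods create new, reduced-size event spaces by joining similar features into groups. They define a similarity measure between features and collapse similar features into single events that no longer distinguish among their constituent features. Typically, the parameters of a cluster become the weighted average of the parameters of its constituent features. Two types of clustering are studied: i) one-way clustering, i.e. feature clustering based on the distributions of features in the documents or classes, and ii) coclustering, i.e. clustering both features and documents. A second research area of text classification where clustering has a lot to offer is semi-supervised learning. Training data contain both labelled and unlabelled examples. Obtaining a fully labelled training set is a difficult task; labelling is usually done using human expertise, which is expensive, time consuming, and error prone. Obtaining unlabelled data is much easier since it involves collecting data that are known to belong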