Author: William S. Spangler
DOI:
Keywords:
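Abstract: A method for document categorization is presented, together with a storage medium that includes instructions for causing a computer to implement the method. The method includes identifying terms occurring in a collection of documents and determining a cohesion score for each of the terms. The cohesion score is a function of the cosine difference between each of the documents containing the term and the centroid of all the documents containing the term. The method further includes sorting the terms based on the cohesion scores. The method also includes creating categories based on the cohesion scores of the terms, wherein each cohesion score takes into account only (i) the documents containing a selected one of the terms and (ii) documents that have not already been assigned to a category. The method still further includes moving each of the documents to the category of the nearest centroid, thereby refining the categories.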
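
The steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation, and several details are assumptions the abstract does not state: documents are represented as raw term-count vectors, the cohesion score is taken to be the mean cosine similarity between each document containing a term and the centroid of those documents, and instead of sorting once, the sketch re-scores the terms in each round and picks the highest-scoring one. The names `cosine`, `centroid`, `cohesion`, `categorize`, `refine`, `max_categories`, and `min_docs` are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def centroid(vectors):
    """Component-wise mean of a non-empty list of sparse vectors."""
    total = Counter()
    for v in vectors:
        total.update(v)  # adds the counts term by term
    return {t: w / len(vectors) for t, w in total.items()}


def cohesion(term, vectors, candidates):
    """Assumed score: mean cosine similarity of the candidate documents
    containing `term` to the centroid of those documents."""
    members = [i for i in candidates if term in vectors[i]]
    if not members:
        return 0.0, members
    c = centroid([vectors[i] for i in members])
    score = sum(cosine(vectors[i], c) for i in members) / len(members)
    return score, members


def categorize(texts, max_categories=10, min_docs=2):
    """Greedily create one category per high-cohesion term, scoring each
    term only over documents not yet assigned to a category."""
    vectors = [Counter(t.lower().split()) for t in texts]
    terms = sorted({t for v in vectors for t in v})
    unassigned = set(range(len(vectors)))
    categories = {}  # term -> list of document indices
    while unassigned and len(categories) < max_categories:
        best = None
        for term in terms:
            if term in categories:
                continue
            score, members = cohesion(term, vectors, sorted(unassigned))
            # min_docs is an assumption: a single document always matches
            # its own centroid perfectly, so it is excluded here.
            if len(members) >= min_docs and (best is None or score > best[0]):
                best = (score, term, members)
        if best is None:
            break
        _, term, members = best
        categories[term] = members
        unassigned -= set(members)
    return vectors, categories


def refine(vectors, categories):
    """One refinement pass: move every document, including any left
    unassigned, to the category whose centroid is nearest."""
    if not categories:
        return {}
    centroids = {name: centroid([vectors[i] for i in docs])
                 for name, docs in categories.items()}
    refined = {name: [] for name in categories}
    for i, v in enumerate(vectors):
        nearest = max(centroids, key=lambda name: cosine(v, centroids[name]))
        refined[nearest].append(i)
    return refined


TEXTS = [
    "cat dog pet",
    "cat dog animal",
    "stock market price",
    "stock market trade",
    "pet animal food",
]

if __name__ == "__main__":
    vecs, cats = categorize(TEXTS)
    print(cats)
    print(refine(vecs, cats))
```

In this toy run the last document shares no term that occurs in two unassigned documents, so it gets no category in the creation step and is placed only during the nearest-centroid pass. A realistic version would likely use weighted vectors (for example TF-IDF) and repeat the refinement pass until assignments stop changing; both choices are outside what the abstract specifies.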