Authors: Caroline Privault, Jean-Michel Renders, Ludovic Menuge
DOI:
Keywords: Document clustering, Categorization, Cluster analysis, Ambiguity, Computer science, Similarity (network science), User input, Outlier, Class (biology), Information retrieval
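Abstract: Documents are clustered or categorized to generate a model associating documents with classes. Outlier measures are computed for the documents, indicative of how well each document fits into the model. Outlier documents are identified to the user based on the outlier measures and a user-selected outlier criterion. Ambiguity measures are computed for the documents, indicative of the number of classes with which each document has similarity under the model. If a document is annotated with a label class, a possible corrective label class is identified if the annotated document has higher similarity with the possible corrective label class under the model than with the annotated label class. The clustering or categorizing is repeated, adjusted based on received user input, to generate an updated model. Outlier and ambiguity measures are also calculated at runtime for new documents classified using the model.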
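
The three per-document checks described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the source: the model is reduced to one centroid vector per class, similarity is cosine similarity, the outlier measure is one minus the best similarity, and the names `review_document` and `sim_threshold` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def review_document(vec, centroids, label=None, sim_threshold=0.5):
    """Compute illustrative outlier, ambiguity, and corrective-label values.

    vec           -- feature vector of one document
    centroids     -- dict mapping class name to centroid vector (assumed model form)
    label         -- class the document is annotated with, or None
    sim_threshold -- assumed cutoff for "has similarity with a class"
    """
    sims = {c: cosine(vec, centroid) for c, centroid in centroids.items()}
    best = max(sims, key=sims.get)

    # Outlier measure: high when the document fits no class well.
    outlier = 1.0 - sims[best]

    # Ambiguity measure: number of classes the document is similar to.
    ambiguity = sum(1 for s in sims.values() if s >= sim_threshold)

    # Possible corrective label: only when another class is strictly
    # more similar than the annotated one.
    corrective = None
    if label is not None and sims[best] > sims[label]:
        corrective = best

    return {"outlier": outlier, "ambiguity": ambiguity, "corrective": corrective}


if __name__ == "__main__":
    centroids = {"sports": [1.0, 0.0], "finance": [0.0, 1.0]}
    print(review_document([1.0, 0.0], centroids, label="finance"))
    print(review_document([1.0, 1.0], centroids, label="sports"))
```

In an interactive loop of the kind the abstract describes, documents whose outlier value exceeds a user-selected criterion, or whose ambiguity count is greater than one, would be shown to the user, and the model would be rebuilt after the user's corrections.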