Pre-screening training data for classifiers

Authors: Vincent Devin, Mathieu Chuat

DOI:

Keywords:

Abstract: A system and method provide recommendations for refining training data that includes a training set of digital objects. A submitter labels the objects in the set with labels, which may indicate whether an object is considered positive, neutral, or negative with respect to each of a set of predefined classes. Score vectors are computed by a trained categorizer for the labeled set of objects. From the score vectors, various metrics are computed, such as a representative score vector and distances from that representative vector for a label group, a cluster, or a category of the categorizer. Based on the metrics, heuristics are applied and evaluated for recommendations to be made to the submitter, such as proposing that a mislabeled object be relabeled. The training set may include unlabeled objects, in which case the recommendations may include suggestions for labeling those objects.
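
The pipeline outlined in the abstract (score vectors, a representative vector per label group, distances, then a heuristic) can be illustrated with a minimal Python sketch. Everything specific here is an assumption made for illustration and is not taken from the source: the representative vector is taken to be the per-group mean, the distance is Euclidean, and the only heuristic is "suggest relabeling when an object's score vector is closer to another group's representative than to its own". The function names `representative_vectors` and `suggest_relabels` and the toy scores are hypothetical.

```python
from collections import defaultdict
from math import dist


def representative_vectors(scores, labels):
    """Mean score vector per label group (an assumed choice of 'representative')."""
    groups = defaultdict(list)
    for vec, label in zip(scores, labels):
        groups[label].append(vec)
    return {
        label: [sum(col) / len(vecs) for col in zip(*vecs)]
        for label, vecs in groups.items()
    }


def suggest_relabels(scores, labels):
    """Return (index, current_label, suggested_label) for every object whose
    score vector is nearer to another group's representative than to its own.
    This single heuristic is illustrative only."""
    reps = representative_vectors(scores, labels)
    suggestions = []
    for i, (vec, label) in enumerate(zip(scores, labels)):
        nearest = min(reps, key=lambda lab: dist(vec, reps[lab]))
        if nearest != label:
            suggestions.append((i, label, nearest))
    return suggestions


# Toy categorizer scores over two classes; the last object is labeled "cat"
# although its scores resemble the "dog" group.
scores = [
    [0.9, 0.1], [0.8, 0.2], [0.85, 0.1],
    [0.1, 0.9], [0.2, 0.8],
    [0.15, 0.9],
]
labels = ["cat", "cat", "cat", "dog", "dog", "cat"]

print(suggest_relabels(scores, labels))
```

In this sketch the mislabeled object also pulls its own group's mean toward the other class, so a more robust representative (for example a median) might be preferable in practice; that choice is left open here.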

References (24)
Florent Perronnin, Thomas Mensink, Jorge Sanchez. Large scale image classification (2010)
Caroline Privault, Jean-Michel Renders, Ludovic Menuge. Interactive cleaning for automatic document clustering and categorization (2007)
Florent C. Perronnin, Jorge Sanchez, Yan Liu. Training a classifier by dimension-wise embedding of training data (2009)
Cyril Goutte, Caroline Privault, Francois Pacull, Jean-Michel Renders, Agnes Guerraz, Eric Gaussier. Hierarchical clustering with real-time updating (2006)
Florent C. Perronnin, Yan Liu. Modeling images as mixtures of image models (2008)