Authors: Leon Bottou, Saleema A. Amershi, Patrice Y. Simard, David G. Grangier
DOI:
Keywords:
Abstract: A collection of data that is extremely large can be difficult to search and/or analyze. Relevance may be dramatically improved by automatically classifying queries and web pages in useful categories, and using these classification scores as relevance features. A thorough approach may require building a large number of classifiers, corresponding to the various types of information, activities, and products. Creation of classifiers and schematizers is provided on large data sets. Exercising the classifiers and schematizers on hundreds of millions of items may expose value that is inherent to the data by adding usable meta-data. Some aspects include active labeling exploration, automatic regularization and cold start, scaling with the number of items and the number of classifiers, active featuring, and segmentation and schematization.