A Theory of Term Importance in Automatic Text Analysis

Authors: G. Salton, C. S. Yang, C. T. Yu

Keywords: Information processing, Recall, Computer science, Automatic indexing, Word lists by frequency, Linear discriminant analysis, Information retrieval, Term (time), Artificial intelligence, Text mining, Search engine indexing, Content analysis, Natural language processing

Abstract: A good deal of work has been done over the years in an attempt to use statistical or probabilistic techniques as a basis for automatic indexing and content analysis.(1–10) Unfortunately, many of these methods are lacking in effectiveness, and the more refined procedures are computationally unattractive. A new technique, known as discrimination value analysis, ranks the text words in accordance with how well they are able to discriminate the documents of a collection from each other; that is, the value of a term …
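The abstract is cut off before it defines the value of a term, so the following is only a minimal sketch of one commonly described formulation of discrimination value, not necessarily the authors' exact procedure. It assumes that collection "density" is the average pairwise cosine similarity between documents, and that a term's value is the change in density when that term is removed. The function names (`cosine`, `density`, `discrimination_values`) and the toy matrix are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length term-frequency vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def density(docs):
    """Average cosine similarity over all distinct document pairs (assumed density measure)."""
    n = len(docs)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    return sum(cosine(docs[i], docs[j]) for i, j in pairs) / len(pairs)


def discrimination_values(docs):
    """Density without term k minus density with all terms, for each term k.

    A positive value means that removing the term makes the documents look
    more alike, i.e. the term helps to tell them apart.
    """
    base = density(docs)
    values = []
    for k in range(len(docs[0])):
        reduced = [[w for idx, w in enumerate(d) if idx != k] for d in docs]
        values.append(density(reduced) - base)
    return values


# Toy collection (made up): rows are documents, columns are terms.
# Term 0 occurs equally in every document; terms 1 and 2 vary.
docs = [
    [1, 3, 0],
    [1, 0, 3],
    [1, 2, 2],
]
dv = discrimination_values(docs)
ranking = sorted(range(len(dv)), key=lambda k: dv[k], reverse=True)
```

Under these assumptions, the term that appears uniformly in every document receives a negative value (it pulls documents together), while the terms whose frequency varies across documents receive positive values and rank higher.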

References (11)

G. Salton, M. E. Lesk. Computer Evaluation of Indexing and Text Processing. Journal of the ACM, vol. 15, pp. 8-36 (1968). doi:10.1145/321439.321441

Abraham Bookstein, Don R. Swanson. Probabilistic Models for Automatic Indexing. Journal of the Association for Information Science and Technology, vol. 25, pp. 312-316 (1974). doi:10.1002/ASI.4630250505

H. P. Luhn. A Statistical Approach to Mechanized Encoding and Searching of Literary Information. IBM Journal of Research and Development, vol. 1, pp. 309-317 (1957). doi:10.1147/RD.14.0309

Shyam Kumar. Semantic Clustering of Index Terms. Journal of the ACM, vol. 15, pp. 493-513 (1968). doi:10.1145/321479.321480

Lauren B. Doyle. Indexing and Abstracting by Association. American Documentation, vol. 13, pp. 378-390 (1962). doi:10.1002/ASI.5090130404

M. E. Maron. Automatic Indexing: An Experimental Inquiry. Journal of the ACM, vol. 8, pp. 404-417 (1961). doi:10.1145/321075.321084

Fred J. Damerau. An Experiment in Automatic Indexing. American Documentation, vol. 16, pp. 283-289 (1965). doi:10.1002/ASI.5090160403

G. Salton, C. S. Yang. On the Specification of Term Values in Automatic Indexing. Journal of Documentation, vol. 29, pp. 351-372 (1973). doi:10.1108/EB026562

M. E. Maron. On Relevance, Probabilistic Indexing and Information Retrieval. Journal of the ACM, vol. 7, pp. 216-244 (1960). doi:10.1145/321033.321035

10.

Karen Sparck Jones. A Statistical Interpretation of Term Specificity and Its Application in Retrieval. Journal of Documentation, vol. 28, pp. 11-21 (1972). doi:10.1108/EB026526
