Measures of Distributional Similarity

Author: Lillian Lee

DOI: 10.3115/1034678.1034693

Keywords: Range (statistics), Distributional similarity, Computer science, Data mining, Function (mathematics), Similarity (network science), Econometrics, Proxy (statistics)

Abstract: We study distributional similarity measures for the purpose of improving probability estimation for unseen cooccurrences. Our contributions are three-fold: an empirical comparison of a broad range of measures; a classification of similarity functions based on the information that they incorporate; and the introduction of a novel function that is superior at evaluating potential proxy distributions.
