Word Sense Discrimination by Clustering Contexts in Vector and Similarity Spaces

Authors: Amruta Purandare, Ted Pedersen

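Abstract: This paper systematically compares unsupervised word sense discrimination techniques that cluster instances of a target word that occur in raw text using both vector and similarity spaces. The context of each instance is represented as a vector in a high dimensional feature space. Discrimination is achieved by clustering these context vectors directly in vector space and also by finding pairwise similarities among the vectors and then clustering in similarity space. We employ two different representations of the context in which a target word occurs. First order context vectors represent the context of each instance of a target word as a vector of features that occur in that context. Second order context vectors are an indirect representation of the context based on the average of vectors that represent the words that occur in the context. We evaluate the discriminated clusters by carrying out experiments using sense-tagged instances of 24 SENSEVAL-2 words and the well known Line, Hard and Serve sense-tagged corpora.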
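
The two context representations named in the abstract can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the toy sentences for the target word "line", the stopword list, the binary features, the co-occurrence word vectors built from the same toy sentences, and the use of cosine as the pairwise similarity. Clustering itself is left out; the sketch only builds the vectors (vector space) and the pairwise similarity matrix (similarity space) that a clustering step would consume.

```python
import math

# Toy contexts of the ambiguous target word "line" (invented for illustration).
contexts = [
    "the phone line was busy all day",
    "she waited on the phone line for support",
    "customers stood in a long line at the store",
    "the line at the store moved slowly",
]
# Hypothetical stopword list; the target word itself is excluded as a feature.
STOPWORDS = {"the", "was", "all", "on", "for", "in", "a", "at", "she", "line"}


def tokens(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return [w for w in text.lower().split() if w not in STOPWORDS]


vocab = sorted({w for c in contexts for w in tokens(c)})
index = {w: i for i, w in enumerate(vocab)}


def first_order(text):
    """Binary vector marking which vocabulary features occur in the context."""
    present = set(tokens(text))
    return [1.0 if w in present else 0.0 for w in vocab]


def cooccurrence_vectors():
    """Word vectors: counts of how often each word shares a context with others."""
    vecs = {w: [0.0] * len(vocab) for w in vocab}
    for c in contexts:
        words = set(tokens(c))
        for w in words:
            for other in words:
                if other != w:
                    vecs[w][index[other]] += 1.0
    return vecs


def second_order(text, word_vecs):
    """Average of the word vectors of the words that occur in the context."""
    words = [w for w in tokens(text) if w in word_vecs]
    if not words:
        return [0.0] * len(vocab)
    return [sum(word_vecs[w][i] for w in words) / len(words)
            for i in range(len(vocab))]


def cosine(u, v):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def similarity_matrix(vectors):
    """Pairwise similarities among context vectors (the similarity space)."""
    return [[cosine(u, v) for v in vectors] for u in vectors]


word_vecs = cooccurrence_vectors()
first = [first_order(c) for c in contexts]               # vector space, first order
second = [second_order(c, word_vecs) for c in contexts]  # vector space, second order
sim_first = similarity_matrix(first)                     # similarity space
sim_second = similarity_matrix(second)
```

In this toy setup the two "phone" contexts are more similar to each other than to either "store" context under both representations, which is the kind of structure a clustering algorithm would then group into senses.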
