Method for document retrieval and for word sense disambiguation using neural networks

Author: Stephen I. Gallant

Keywords: Context (language use), Feature (machine learning), Computer science, Word stem, Word (computer architecture), Document retrieval, Centroid, k-nearest neighbors algorithm, Base (topology), Artificial intelligence, Pattern recognition

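Abstract: A method for storing and searching documents, also useful in disambiguating word senses, and a method for generating a dictionary of context vectors. The dictionary of context vectors provides a context vector for each word stem in the dictionary. A context vector is a fixed-length list of component values corresponding to a list of word-based features, each component value being an approximate measure of the conceptual relationship between the word stem and the word-based feature. Documents are stored by combining the context vectors of the words remaining in the document after uninteresting words are removed. The summary vector obtained by adding all of the context vectors of the remaining words is normalized. The normalized summary vector is stored for each document. The data base of normalized summary vectors is searched using a query vector and identifying the document whose vector is closest to that query vector. The normalized summary vectors of the documents can be stored using cluster trees according to a centroid-consistent algorithm to accelerate the searching process. Said searching process also gives an efficient way of finding nearest neighbors in high-dimensional spaces.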
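The storage and search steps described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration: the three-feature context vectors, the stopword list, the sample documents, and the helper names (`summary_vector`, `search`) are hypothetical and do not come from the patent. Words are treated as if they were already stems, closeness is taken to be the dot product of normalized vectors, and the search is a plain linear scan rather than the cluster-tree structure the abstract mentions.

```python
import math

# Hypothetical context vectors: one fixed-length list per word stem.
# Each component stands for one word-based feature (values are invented).
CONTEXT_VECTORS = {
    "bank":  [0.9, 0.1, 0.0],
    "money": [1.0, 0.0, 0.1],
    "loan":  [0.8, 0.0, 0.2],
    "river": [0.0, 1.0, 0.1],
    "water": [0.1, 0.9, 0.0],
    "fish":  [0.0, 0.8, 0.3],
}
DIM = 3

# Assumed list of "uninteresting" words removed before summing.
STOPWORDS = {"the", "a", "of", "and", "in", "by", "for"}


def normalize(vec):
    """Scale a vector to unit Euclidean length (zero vectors are returned as-is)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def summary_vector(text):
    """Sum the context vectors of the remaining words, then normalize."""
    total = [0.0] * DIM
    for word in text.lower().split():
        if word in STOPWORDS:
            continue
        vec = CONTEXT_VECTORS.get(word)  # words missing from the dictionary are skipped
        if vec is not None:
            total = [t + x for t, x in zip(total, vec)]
    return normalize(total)


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def search(query, index):
    """Return the id of the document whose summary vector is closest to the query vector."""
    q = summary_vector(query)
    return max(index, key=lambda doc_id: dot(q, index[doc_id]))


# Store one normalized summary vector per document.
DOCUMENTS = {
    "finance": "the bank approved a loan of money",
    "nature": "fish in the river water",
}
INDEX = {doc_id: summary_vector(text) for doc_id, text in DOCUMENTS.items()}

print(search("money loan", INDEX))
print(search("river fish", INDEX))
```

Because every stored vector has unit length, ranking by dot product is equivalent to ranking by cosine similarity. A linear scan costs one dot product per stored document, which is the cost the cluster-tree organization in the abstract is meant to reduce.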

References (11)
Patrick N. Lawrence, Arrays of machines such as computers (1978)
S. K. M. Wong, W. Ziarko, V. V. Raghavan, P. C. N. Wong, On modeling of information retrieval concepts in vector spaces, ACM Transactions on Database Systems, vol. 12, pp. 299-321 (1987), 10.1145/22952.22957
Peter G. Ossorio, Classification Space: A Multivariate Procedure for Automatic Document Indexing and Retrieval, Multivariate Behavioral Research, vol. 1, pp. 479-524 (1966), 10.1207/S15327906MBR0104_6
Richard K. Belew, Adaptive information retrieval: using a connectionist representation to retrieve and learn about documents, International ACM SIGIR Conference on Research and Development in Information Retrieval, vol. 23, pp. 11-20 (1989), 10.1145/3130348.3130359
Dario Lucarella, A document retrieval system based on nearest neighbour searching, Journal of Information Science, vol. 14, pp. 25-33 (1988), 10.1177/016555158801400104
Matthew B. Koll, WEIRD, ACM SIGIR Forum, vol. 13, pp. 32-50 (1979), 10.1145/1095366.1095368
David L. Waltz, Jordan B. Pollack, Massively Parallel Parsing: A Strongly Interactive Model of Natural Language Interpretation, Cognitive Science, vol. 9, pp. 51-74 (1985), 10.1207/S15516709COG0901_4
Scott Deerwester, Susan T. Dumais, George W. Furnas, Thomas K. Landauer, Richard Harshman, Indexing by Latent Semantic Analysis, Journal of the Association for Information Science and Technology, vol. 41, pp. 391-407 (1990), 10.1002/(SICI)1097-4571(199009)41:6<391::AID-ASI1>3.0.CO;2-9