Authors: S. K. M. Wong, Wojciech Ziarko, Patrick C. N. Wong
Keywords: Representation (mathematics), Theoretical computer science, Divergence-from-randomness model, Vector space, Term discrimination, Automatic indexing, Function space, Vector space model, Information retrieval, Generalized vector space model, Computer science
Abstract: In information retrieval, it is common to model index terms and documents as vectors in a suitably defined vector space. The main difficulty with this approach is that the explicit representation of term vectors is not known a priori. For this reason, the vector space model adopted by Salton for the SMART system treats the terms as a set of orthogonal vectors. In such a model, it is often necessary to adopt a separate, corrective procedure to take into account the correlations between terms. In this paper, we propose a systematic method (the generalized vector space model) to compute term correlations directly from an automatic indexing scheme. We also demonstrate how such correlations can be included with minimal modification in existing vector-based information retrieval systems. The preliminary experimental results obtained from the new model are very encouraging.
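
The contrast drawn in the abstract, between treating terms as orthogonal and accounting for term correlations, can be illustrated with a minimal sketch. It is not the paper's method: the abstract does not say how the correlations are computed, so the matrix `G` below is a hypothetical, hand-written term-term correlation matrix, and the function name `similarity` and the toy vectors are likewise assumptions made for illustration.

```python
import math

def similarity(query, doc, term_corr=None):
    """Cosine-style similarity between two term-weight vectors.

    term_corr is a hypothetical symmetric term-term correlation matrix G.
    When it is omitted, terms are treated as orthogonal (G = identity),
    which reduces to the ordinary cosine measure.
    """
    n = len(query)
    if term_corr is None:
        term_corr = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

    def inner(x, y):
        # Bilinear form x^T G y over all pairs of terms.
        return sum(x[i] * term_corr[i][j] * y[j] for i in range(n) for j in range(n))

    denom = math.sqrt(inner(query, query) * inner(doc, doc))
    return inner(query, doc) / denom if denom > 0 else 0.0

# Toy data, made up for illustration: terms 0 and 1 are assumed correlated.
query = [1.0, 0.0, 0.0]   # query uses only term 0
doc = [0.0, 1.0, 0.0]     # document uses only term 1
G = [[1.0, 0.8, 0.0],
     [0.8, 1.0, 0.0],
     [0.0, 0.0, 1.0]]

orthogonal_score = similarity(query, doc)      # no shared terms, so no match
correlated_score = similarity(query, doc, G)   # correlation lets the pair match
```

Under the orthogonality assumption the query and document share no term and score zero. With the assumed correlation between terms 0 and 1 they receive a nonzero score, which is the kind of effect a separate corrective procedure would otherwise have to supply.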