Authors: K. P. Pushpalatha, G. Raju
DOI: 10.1109/ICTEE.2012.6208623
Keywords:
Abstract: Generating meaningful or relevant keywords for information retrieval using Data Mining techniques is a highly relevant field. Term Discrimination Values (TDVs) are better measures than term frequency weights for selecting keywords: terms with high TDVs will generate good keywords. Hamdouchi, P. Willet and Carolyn J Crouch have developed various algorithms to compute TDVs. In earlier days, weighted term frequency was used to compute TDVs, but these simple frequencies are not enough for retrieving relevant documents. Here we use some new features, connected with the distribution of terms within the document, called distributional features. Distributional features such as First Appearance, Last Appearance, Compactness based on the number of parts, the distance between the first and last occurrence, and the variance of the positions of occurrences are pointers to the importance of terms in the document. Experiments have shown that the combination of these features gives much improved results over the individual features in Text Categorization. Through this work we could also show that the same holds for generating TDVs. The additional overhead in storage and time is compensated by the efficient output. This work will add a narrow light towards text document search in education, in both teaching and research.
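The distributional features named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `distributional_features`, the normalization by document length, the split into equal-sized parts, and the default of four parts are all assumptions made for illustration.

```python
from statistics import pvariance


def distributional_features(tokens, term, num_parts=4):
    """Compute illustrative distributional features of `term` in a tokenized document.

    Assumptions (not from the source): positions are token indices, the
    document is cut into `num_parts` equal-sized parts, and positional
    features are normalized by document length.
    """
    positions = [i for i, tok in enumerate(tokens) if tok == term]
    if not positions:
        return None  # the term does not occur in this document

    n = len(tokens)
    # Index of the part each occurrence falls into (integer arithmetic, always < num_parts).
    parts = {p * num_parts // n for p in positions}

    return {
        "first_appearance": positions[0] / n,
        "last_appearance": positions[-1] / n,
        # Compactness: how many parts of the document contain the term.
        "parts_with_term": len(parts),
        # Compactness: normalized distance between first and last occurrence.
        "first_last_distance": (positions[-1] - positions[0]) / n,
        # Compactness: population variance of the occurrence positions.
        "position_variance": pvariance(positions),
    }


doc = "term weights help retrieval and term weights help ranking of each term".split()
features = distributional_features(doc, "term")
print(features)
```

A term that occurs in many parts of a document, with a large first-to-last distance, is spread throughout the text, whereas a term confined to one part is likely a local mention. How these values are combined with frequency weights to derive TDVs is not specified in the abstract and is left out of the sketch.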