Hierarchical Document Clustering Using Frequent Itemsets

Authors: Benjamin C. M. Fung, Martin Ester, Ke Wang

DOI:

Keywords:

Abstract: … Since HFTC does not take the number of clusters as an input … 3% to 6%, for both HFTC and our algorithm in each data set … of HFTC is comparable with other algorithms in the first 5000 …

References (42)
Steven L. Salzberg, Alberto Segre, Programs for Machine Learning (1994)
Kenneth A. Ross, Divesh Srivastava, Fast Computation of Sparse Datacubes. Very Large Data Bases, pp. 116-125 (1997)
Oren Etzioni, Oren Zamir, Richard M. Karp, Omid Madani, Fast and Intuitive Clustering of Web Documents. Knowledge Discovery and Data Mining, pp. 287-290 (1997)
Mark T. Maybury, Gerald Kowalski, Information Storage and Retrieval Systems: Theory and Implementation. Kluwer Academic Publishers (2000)
Marko Grobelnik, Dunja Mladenic, Feature Selection for Unbalanced Class Distribution and Naive Bayes. International Conference on Machine Learning, pp. 258-267 (1999)
Richard R. Muntz, Jiong Yang, Wei Wang, STING: A Statistical Information Grid Approach to Spatial Data Mining. Very Large Data Bases, pp. 186-195 (1997)
Yu He, Ke Wang, Senqiang Zhou, Hierarchical Classification of Real Life Documents. SIAM International Conference on Data Mining, pp. 1-16 (2001)
Mehran Sahami, Daphne Koller, Hierarchically Classifying Documents Using Very Few Words. International Conference on Machine Learning, pp. 170-178 (1997)
George Karypis, Michael Steinbach, Vipin Kumar, A Comparison of Document Clustering Techniques (2000)
Hans-Peter Kriegel, Martin Ester, Jörg Sander, Xiaowei Xu, A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. Knowledge Discovery and Data Mining, pp. 226-231 (1996)