Authors: Ling Song, Po Yan, Li Lian, Dongmei Zhang, Jun Ma
Keywords: Fuzzy clustering, Semantic similarity, k-means clustering, Database, Computer science, Vector space model, Cluster analysis, Data mining, Similarity (network science), Rand index, Cosine similarity
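Abstract: Deep Web database clustering is a key operation in organizing Deep Web resources. Cosine similarity in the Vector Space Model (VSM) is traditionally used as the similarity computation. However, it cannot denote the semantic similarity between the contents of two databases. This paper discusses how to cluster Deep Web databases semantically. First, a fuzzy semantic measure is proposed, which integrates ontology and fuzzy set theory to compute the semantic similarity between the visible features of Deep Web forms. Then a hybrid Particle Swarm Optimization (PSO) algorithm is provided for clustering Deep Web databases. Finally, the clustering results are evaluated according to the Average Similarity of Document to the Cluster Centroid (ASDC) and the Rand Index (RI). Experiments show that: 1) the hybrid PSO approach has higher ASDC values than those based on K-Means approaches, which means it has the highest intra-cluster similarity and the lowest inter-cluster similarity; 2) clustering results based on the fuzzy semantic similarity have higher RI values than those based on cosine similarity. This supports the conclusion that the proposed fuzzy semantic similarity approach can explore latent semantics.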
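For readers unfamiliar with the measures named in the abstract, the following is a minimal Python sketch of the cosine similarity baseline and the two evaluation metrics. It is illustrative only and is not the authors' implementation: the function names are hypothetical, the fuzzy semantic measure and the hybrid PSO algorithm are not reproduced, and the ASDC formula used here (mean over clusters of the mean cosine similarity between each member and its cluster centroid) is an assumption, since the abstract does not define it.

```python
import math
from itertools import combinations


def cosine_similarity(u, v):
    """Cosine of the angle between two term-weight vectors (VSM baseline)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here: a zero vector is similar to nothing
    return dot / (norm_u * norm_v)


def rand_index(labels_true, labels_pred):
    """Fraction of item pairs on which the two partitions agree
    (both place the pair together, or both place it apart)."""
    pairs = list(combinations(range(len(labels_true)), 2))
    if not pairs:
        raise ValueError("need at least two items")
    agree = sum(
        (labels_true[i] == labels_true[j]) == (labels_pred[i] == labels_pred[j])
        for i, j in pairs
    )
    return agree / len(pairs)


def asdc(vectors, labels):
    """ASDC under the assumed definition: average, over clusters, of the
    mean cosine similarity between each member and its cluster centroid."""
    clusters = {}
    for vec, lab in zip(vectors, labels):
        clusters.setdefault(lab, []).append(vec)
    per_cluster = []
    for members in clusters.values():
        dim = len(members[0])
        centroid = [sum(m[d] for m in members) / len(members) for d in range(dim)]
        sims = [cosine_similarity(m, centroid) for m in members]
        per_cluster.append(sum(sims) / len(sims))
    return sum(per_cluster) / len(per_cluster)
```

Under these definitions, a higher ASDC indicates tighter clusters, and an RI closer to 1 indicates closer agreement between the clustering and a reference partition, which matches how the abstract interprets both metrics.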