Authors: Min Shi, Jianxun Liu, Buqing Cao, Yiping Wen, Xiangping Zhang
Keywords:
Abstract: The rapid growth in both the number and diversity of Web services creates new requirements for clustering techniques that facilitate service discovery, service repository management, etc. Existing methods for clustering Web services primarily rely on semantic distances between service features, e.g., topic vectors, mined from WSDL documents. However, high-quality topic vectors are hard to obtain because service description documents lack abundant textual information. In practice, prior knowledge drawn from how people have actually used Web services could help improve the accuracy of service clustering. From an analysis of the dataset of Mashups from ProgrammableWeb, we observe that services composed together in the same Mashup are highly likely to belong to different clusters, while services annotated with identical tags tend to fall within the same cluster. Based on these observations, this paper proposes an efficient approach for clustering Web services. It first uses a probabilistic topic model to elicit the latent topic vectors of service description documents. It then performs clustering based on the K-means++ algorithm, incorporating parameters that represent the above-mentioned prior knowledge. A comprehensive evaluation is conducted to validate the performance of the proposed approach on a ground-truth dataset crawled from ProgrammableWeb. Experimental comparisons with approaches that do not consider prior knowledge show that the proposed approach achieves a significant improvement in clustering accuracy.
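The two-stage idea in the abstract can be illustrated with a minimal sketch. It is not the authors' formulation: the abstract does not say how the prior-knowledge parameters enter the clustering, so the sketch assumes a soft-constraint scheme in the spirit of pairwise-constrained K-means. Topic vectors are taken as given (the topic-model stage is omitted), and the function names, the weights `alpha` and `beta`, and the toy data are all hypothetical.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two topic vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeanspp_seeds(points, k, rng):
    """K-means++ seeding: each further seed is drawn with probability
    proportional to its squared distance from the nearest chosen seed.
    Assumes the data contains at least k distinct points."""
    centers = [list(rng.choice(points))]
    while len(centers) < k:
        d2 = [min(sq_dist(p, c) for c in centers) for p in points]
        centers.append(list(rng.choices(points, weights=d2, k=1)[0]))
    return centers

def neighbours(pairs, n):
    """Turn a list of index pairs into a symmetric adjacency list."""
    adj = [set() for _ in range(n)]
    for i, j in pairs:
        adj[i].add(j)
        adj[j].add(i)
    return adj

def cluster_with_prior(points, k, co_mashup, same_tag,
                       alpha=0.3, beta=0.3, seed=0, max_iter=50):
    """Assumed cost of putting service i in cluster c:
    squared distance to the centre of c
    + alpha per co-Mashup partner currently in c   (hypothetical penalty)
    - beta per tag-sharing partner currently in c  (hypothetical reward)."""
    rng = random.Random(seed)
    n = len(points)
    centers = kmeanspp_seeds(points, k, rng)
    apart, together = neighbours(co_mashup, n), neighbours(same_tag, n)
    labels = [-1] * n  # -1 means "not assigned yet"
    for _ in range(max_iter):
        changed = False
        for i, p in enumerate(points):
            def cost(c):
                penalty = alpha * sum(labels[j] == c for j in apart[i])
                reward = beta * sum(labels[j] == c for j in together[i])
                return sq_dist(p, centers[c]) + penalty - reward
            best = min(range(k), key=cost)
            if best != labels[i]:
                labels[i], changed = best, True
        if not changed:
            break
        # Recompute each centre as the mean of its members; an empty
        # cluster keeps its previous centre.
        for c in range(k):
            members = [points[i] for i in range(n) if labels[i] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Toy data: two services near topic 0, two near topic 1, one ambiguous.
topic_vectors = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8], [0.5, 0.5]]
labels = cluster_with_prior(topic_vectors, k=2,
                            co_mashup=[(0, 4)],  # 0 and 4 share a Mashup
                            same_tag=[(2, 4)])   # 2 and 4 share a tag
```

In the toy data the fifth service is equidistant from both groups in topic space, so the topic vectors alone cannot place it. The assumed penalty and reward terms break the tie, putting it with its tag-sharing partner and away from its co-Mashup partner.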