Authors: Sriharsha Veeramachaneni, Conor Hayes, Paolo Avesani
DOI:
Keywords:
Abstract: The Web is experiencing an exponential growth in the use of weblogs or blogs, websites containing dated journal-style entries. Blog entries are generally organised using informally defined labels known as tags. Increasingly, tags are being proposed as a 'grassroots' alternative to Semantic Web standards. We demonstrate that tags by themselves are weak at partitioning blog data. We then show how tags may contribute useful, discriminating information. Using content-based clustering, we observe that frequently occurring tags in each cluster are usually good meta-labels for the cluster concept. We then introduce the Tr score, a score based on the proportion of high-frequency tags in a cluster, and demonstrate that it is strongly correlated with cluster strength. The Tr score enables the detection and removal of weak clusters. As such, the Tr score can be used as an independent means of verifying topic integrity in a cluster-based recommender system.