作者: Mingxi Zhang , Hao Hu , Zhenying He , Liping Gao , Liujie Sun
DOI: 10.1016/J.ESWA.2015.07.042
关键词: Similarity (network science) 、 Data mining 、 Mathematics 、 Recommender system 、 Pruning (decision trees) 、 Learning to rank 、 Reduction (complexity) 、 Nearest neighbor search 、 SimRank 、 Theoretical computer science 、 Network analysis
摘要: The pre-computation cost in the off-line stage is significantly reduced.The efficiency of query processing optimized by proposing a pruning algorithm.The accuracy loss algorithm controlled tuning threshold.The effectiveness returned result effective and acceptable. Similarity search web networks, aiming to find entities similar given entity, one core tasks network analysis. With proliferation applications, including recommendation system, SimRank has been well-known measure for evaluating entity similarity network. However, existing work computes iteratively over huge matrix, which expensive terms time space cannot efficiently support large networks. In this paper, we propose link-based method, WebSim, towards finding WebSim defines between as 2-hop SimRank. To reduce computation cost, divide process into two stages: on-line stage. stage, 1-hop similarities are computed, an designed unnecessary accumulation operations on zero similarities. developed fast through searching entries from partial sums index derived items that lower than threshold skipped space. Compared iterative computation, reduced, since maintains only matrix much smaller multi-hop. Experiments comparison with its algorithms demonstrate average 99.83% reduction 92.12% achieves 99.98% NDCG.