EPLogCleaner: Improving Data Quality of Enterprise Proxy Logs for Efficient Web Usage Mining

Authors: Hongzhou Sha, Tingwen Liu, Peng Qin, Yong Sun, Qingyun Liu

DOI: 10.1016/J.PROCS.2013.05.104

Abstract: Data cleaning is an important step performed in the preprocessing stage of web usage mining, and is widely used in many data mining systems. Despite many efforts on data cleaning for web server logs, it is still an open question for enterprise proxy logs. With unlimited accesses to websites, enterprise proxy logs trace web requests from multiple clients to multiple web servers, which makes them quite different from web server logs in both location and content. Therefore, many irrelevant items, such as software updating requests, cannot be filtered out by traditional data cleaning methods. In this paper, we propose the first method, named EPLogCleaner, that can filter out plenty of irrelevant items based on the common prefix of their URLs. We evaluate EPLogCleaner with a real network traffic trace captured from one enterprise proxy. Experimental results show that EPLogCleaner can improve the data quality of enterprise proxy logs by further filtering out more than 30% of URL requests compared with traditional data cleaning methods.
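The abstract only names the idea of filtering by a common URL prefix and gives no algorithmic detail, so the following is a minimal sketch of that general idea rather than the EPLogCleaner method itself. The prefix list, the `normalize` helper, and the function names are hypothetical and chosen only for illustration.

```python
from urllib.parse import urlsplit

# Hypothetical prefixes of irrelevant traffic (e.g. software updates).
# The paper's actual prefixes and how they are derived are not given here.
IRRELEVANT_PREFIXES = (
    "update.example.com/",
    "ads.example.net/banner/",
)


def normalize(url: str) -> str:
    """Drop the scheme and lowercase the host so prefixes compare consistently."""
    parts = urlsplit(url)
    return parts.netloc.lower() + parts.path


def is_irrelevant(url: str, prefixes=IRRELEVANT_PREFIXES) -> bool:
    """Return True if the normalized URL starts with any listed prefix."""
    return normalize(url).startswith(tuple(prefixes))


def clean(urls, prefixes=IRRELEVANT_PREFIXES):
    """Keep only the requests whose URL matches no irrelevant prefix."""
    return [u for u in urls if not is_irrelevant(u, prefixes)]


if __name__ == "__main__":
    log = [
        "http://update.example.com/v2/patch.bin",
        "http://news.example.org/story/1",
        "http://ads.example.net/banner/42.gif",
    ]
    for url in clean(log):
        print(url)
```

A plain tuple of prefixes keeps the sketch short; for a large prefix set, a trie or a sorted list with binary search would likely scale better.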


