Clustering e-commerce search engines based on their search interface pages using WISE-cluster

Authors: Yiyao Lu, Hai He, Qian Peng, Weiyi Meng, Clement Yu

DOI: 10.1016/J.DATAK.2006.01.010

Abstract: In this paper, we propose a new approach to clustering e-commerce search engines (ESEs) on the Web. Our approach utilizes the features available on the interface page of each ESE, including the label terms and value terms appearing in the search form, the number of images, and normalized price terms, as well as other terms. The experimental results, based on more than 400 ESEs, indicate that the proposed approach has good clustering accuracy. The importance of different types of features is analyzed, and the terms in the search form are the most important feature in obtaining quality clusters.
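
The abstract names several feature types taken from each interface page. As a rough illustration only, the sketch below shows one way such features could be combined into a single page-to-page similarity score. It is not the WISE-cluster algorithm itself: the weights, the dictionary layout, and the function names `cosine`, `count_similarity`, and `page_similarity` are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-frequency bags (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def count_similarity(x: int, y: int) -> float:
    """Ratio-based similarity between two non-negative counts."""
    if x == y:
        return 1.0  # also covers the case where both counts are zero
    return min(x, y) / max(x, y)


# Hypothetical weights, chosen only for illustration (not from the paper).
WEIGHTS = {"label": 0.4, "value": 0.2, "price": 0.1, "other": 0.2, "images": 0.1}


def page_similarity(p: dict, q: dict) -> float:
    """Weighted sum of per-feature similarities between two interface pages."""
    score = 0.0
    for kind in ("label", "value", "price", "other"):
        score += WEIGHTS[kind] * cosine(Counter(p[kind]), Counter(q[kind]))
    score += WEIGHTS["images"] * count_similarity(p["images"], q["images"])
    return score


# Two made-up interface pages, each described by the feature types in the abstract.
books = {"label": ["title", "author", "isbn"], "value": ["hardcover", "paperback"],
         "price": ["10-20"], "other": ["book", "search"], "images": 12}
cars = {"label": ["make", "model", "year"], "value": ["sedan", "suv"],
        "price": ["10000-20000"], "other": ["car", "dealer"], "images": 30}

print(page_similarity(books, books))
print(page_similarity(books, cars))
```

A score of this kind could then be fed to any standard clustering procedure. Giving label terms the largest weight here only mirrors the abstract's remark that form terms matter most; the actual values are placeholders.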

References (23)
H. L. Le Roy, L. Lecam, J. Neyman. Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability; Vol. IV. Revue de l'Institut International de Statistique / Review of the International Statistical Institute, vol. 37, pp. 230- (1969). DOI: 10.2307/1402306
Yiming Yang, Seán Slattery, Rayid Ghani. A Study of Approaches to Hypertext Categorization. Intelligent Information Systems, vol. 18, pp. 219-241 (2002). DOI: 10.1023/A:1013685612819
Mike Perkowitz, Robert B. Doorenbos, Oren Etzioni, Daniel S. Weld. Learning to Understand Information on the Internet: An Example-Based Approach. Next Generation Information Technologies and Systems, vol. 8, pp. 133-153 (1997). DOI: 10.1023/A:1008672508721
Hai He, Weiyi Meng, Clement Yu, Zonghuan Wu. Wise-integrator: an automatic integrator of web search interfaces for E-commerce. Very Large Data Bases, pp. 357-368 (2003). DOI: 10.1016/B978-012722442-8/50039-2
Hai He, Weiyi Meng, Clement Yu, Zonghuan Wu. Constructing interface schemas for search interfaces of web databases. Web Information Systems Engineering, pp. 29-42 (2005). DOI: 10.1007/11581062_3
Mehran Sahami, Daphne Koller. Hierarchically Classifying Documents Using Very Few Words. International Conference on Machine Learning, pp. 170-178 (1997).
Gerard Salton, Michael J. McGill. Introduction to Modern Information Retrieval (1983).
Bin He, Tao Tao, Kevin Chen-Chuan Chang. Organizing structured web sources by query schemas: a clustering approach. Conference on Information and Knowledge Management, pp. 22-31 (2004). DOI: 10.1145/1031171.1031178
Yiming Yang. A study of thresholding strategies for text categorization. International ACM SIGIR Conference on Research and Development in Information Retrieval, pp. 137-145 (2001). DOI: 10.1145/383952.383975
Robert B. Doorenbos, Oren Etzioni, Daniel S. Weld. A scalable comparison-shopping agent for the World-Wide Web. Adaptive Agents and Multi-Agents Systems, pp. 39-48 (1997). DOI: 10.1145/267658.267666