Towards logical hypertext structure

Authors: Alexander Mehler, Matthias Dehmer, Rüdiger Gleim

DOI: 10.1007/11553762_14

Abstract: Facing the retrieval problem posed by the overwhelming set of documents online, the adaptation of text categorization to web units has recently been pushed. The aim is to utilize categories of web sites and pages as an additional retrieval criterion. In this context, the bag-of-words model has been utilized, just as HTML tags and link structures have. In spite of promising results, this adaptation stays in the framework of IR-specific models, since it neglects the content-based structuring inherent to hypertext units. This paper approaches hypertext modelling from the perspective of graph theory. It presents an XML-based format for representing websites as hypergraphs. These hypergraphs are used to shed light on the relation of hypertext structure types and their web-based instances. We place emphasis on two characteristics of this relation. In terms of realizational ambiguity, we speak of functional equivalents to the manifestation of the same structure type. In terms of polymorphism, we speak of a single web unit which manifests different structure types. It is shown that polymorphism is a prevalent characteristic of web-based units. This is done by means of an experiment which analyses a corpus of content from conference websites. On this background, we plead for a revision of text representation models so that they are sensitive to the manifold structuring of web documents.

References (34)
Yiming Yang, Seán Slattery, Rayid Ghani. A Study of Approaches to Hypertext Categorization. Journal of Intelligent Information Systems, vol. 18, pp. 219-241 (2002). DOI: 10.1023/A:1013685612819
Andreas Winter, Bernt Kullbach, Volker Riediger. An Overview of the GXL Graph Exchange Language. Software Visualization, pp. 324-336 (2001). DOI: 10.1007/3-540-45875-1_25
Information Retrieval and HyperText. Kluwer Academic Publishers (1996). DOI: 10.1007/978-1-4613-1373-1
Jonathan Furner, David Ellis, Peter Willett. The Representation and Comparison of Hypertext Structures Using Graphs. Springer US, pp. 75-96 (1996). DOI: 10.1007/978-1-4613-1373-1_4
Rainer Kuhlen. Hypertext: ein nicht-lineares Medium zwischen Buch und Wissensbank. Berlin [u.a.]: Springer (1991)
Lada A. Adamic. The Small World Web. European Conference on Research and Advanced Technology for Digital Libraries, pp. 443-452 (1999). DOI: 10.1007/3-540-48155-9_27
G. Rehm. Towards Automatic Web Genre Identification. Hawaii International Conference on System Sciences, vol. 5, pp. 101-101 (2002). DOI: 10.1109/HICSS.2002.10046