Structured information retrieval in XML documents

Author: Evangelos Kotsakis

DOI: 10.1145/508791.508919

Abstract: Query languages that take advantage of the XML document structure already exist. However, the systems that have been developed to query XML data explore XML sources from a database perspective. This paper examines an XML collection from the viewpoint of Information Retrieval (IR). As such, we view XML documents as text documents with additional tags and attempt to adapt existing IR techniques to achieve more sophisticated search on XML documents. We employ a class of queries that support path expressions and suggest an efficient index, which extends the inverted file to search on XML documents. This is accomplished by integrating the XML structure in the inverted file, combining it with a path index. The proposed lexicographical index may be used for the evaluation of queries that involve path expressions. Moreover, this paper discusses a ranking scheme based on both the term distribution and the document structure. Some performance remarks are also presented.
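
The idea of combining an inverted file with path information can be illustrated with a minimal sketch. This is not the index proposed in the paper: the posting layout, the suffix-based path matching, and the names `build_index` and `search` are all assumptions made for illustration. Each term maps to postings of the form (document id, element path), so a query can restrict a term to a given path.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


def build_index(docs):
    """Map each term to a set of (doc_id, path) postings.

    `docs` is a dict of doc_id -> XML string. The path is the chain of
    element tags from the root down to the element whose text holds the term.
    (Hypothetical layout, not the paper's index structure.)
    """
    index = defaultdict(set)

    def walk(elem, prefix, doc_id):
        path = prefix + "/" + elem.tag
        # Text directly inside this element: its leading text plus the
        # tail text that follows each of its children.
        chunks = [elem.text or ""] + [child.tail or "" for child in elem]
        for term in re.findall(r"\w+", " ".join(chunks).lower()):
            index[term].add((doc_id, path))
        for child in elem:
            walk(child, path, doc_id)

    for doc_id, xml_text in docs.items():
        walk(ET.fromstring(xml_text), "", doc_id)
    return index


def search(index, term, path_suffix=""):
    """Return sorted doc ids where `term` occurs under a path ending in `path_suffix`."""
    hits = {
        doc_id
        for doc_id, path in index.get(term.lower(), ())
        if path.endswith(path_suffix)
    }
    return sorted(hits)


docs = {
    "d1": "<article><title>XML retrieval</title><body>inverted file</body></article>",
    "d2": "<article><title>Databases</title><body>XML storage</body></article>",
}
index = build_index(docs)
print(search(index, "xml"))            # term anywhere in the document
print(search(index, "xml", "/title"))  # term restricted to title elements
```

A ranking scheme of the kind the abstract mentions could then weight a posting by its path as well as by term frequency; that part is left out of the sketch.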

References (19)
S. Abiteboul, J. Widom, A. Rajaraman, J. McHugh, Q. Luo. Indexing Semistructured Data. Stanford (1998).
Dan Suciu, Daniela Florescu, David Maier, Alin Deutsch, Mary F. Fernández, Alon Y. Levy. Querying XML Data. IEEE Data(base) Engineering Bulletin, vol. 22, pp. 10-18 (1999).
Stefano Ceri, Piero Fraternali, Stefano Paraboschi. XML: Current Developments and Future Challenges for the Database Community. Extending Database Technology, pp. 3-17 (2000). DOI: 10.1007/3-540-46439-5_1
William B. Frakes, Ricardo Baeza-Yates. Information Retrieval: Data Structures and Algorithms (1992).
Gonzalo Navarro, Ricardo Baeza-Yates. Proximal nodes: a model to query document databases by content and structure. ACM Transactions on Information Systems, vol. 15, pp. 400-435 (1997). DOI: 10.1145/263479.263482
Angela Bonifati, Stefano Ceri. Comparative analysis of five XML query languages. International Conference on Management of Data, vol. 29, pp. 68-79 (2000). DOI: 10.1145/344788.344822
Dongwook Shin, Hyuncheol Jang, Honglan Jin. BUS: an effective indexing and retrieval scheme in structured documents. ACM International Conference on Digital Libraries, pp. 235-243 (1998). DOI: 10.1145/276675.276702
Yong Kyu Lee, Seong-Joon Yoo, Kyoungro Yoon, P. Bruce Berra. Index structures for structured documents. ACM International Conference on Digital Libraries, pp. 91-99 (1996). DOI: 10.1145/226931.226950
Hyunchi Jang, Youngil Kim, Dongwook Shin. An effective mechanism for index update in structured documents. Conference on Information and Knowledge Management, pp. 383-390 (1999). DOI: 10.1145/319950.320031
Chris Buckley, Alan F. Lewit. Optimization of inverted vector searches. Proceedings of the 8th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '85), pp. 97-110 (1985). DOI: 10.1145/253495.253515