Unsupervised Metadata Extraction in Scientific Digital Libraries Using A-Priori Domain-Specific Knowledge.

Authors: Alexander Ivanyukovich, Maurizio Marchese

Abstract: Information extraction from unstructured sources is a crucial step in the semantic annotation of content. The challenge is supporting a high-quality automatic (or at least semi-automatic) approach in order to sustain the scalability of the semantic-enabled services of the future. Unsupervised information extraction encompasses a number of underlying research problems, such as natural language processing, heterogeneous source integration, knowledge representation, and others that have been and remain under investigation. In this paper we concentrate on the problem of unsupervised metadata extraction in the Digital Libraries domain. We propose and present a novel approach focusing on improving metadata extraction quality without involving external information sources (oracles, manually prepared databases, etc.), relying instead on the information present in the document itself and in its corresponding context. More specifically, we focus on quality improvements in metadata extraction from scientific papers (mainly in the computer science domain) collected from various sources over the Internet. Finally, we compare the results of our approach with the state of the art in the domain and discuss future work.
