Deep Learning for Character-Based Information Extraction

Authors: Yanjun Qi, Sujatha G. Das, Ronan Collobert, Jason Weston

DOI: 10.1007/978-3-319-06028-6_74

Abstract: In this paper we introduce a deep neural network architecture to perform information extraction on character-based sequences, e.g. named-entity recognition on Chinese text or secondary-structure detection on protein sequences. With a task-independent architecture, the network relies only on simple features, which obviates the need for task-specific feature engineering. The proposed discriminative framework includes three important strategies: (1) a learning module mapping characters to vector representations is included to capture the semantic relationship between characters; (2) abundant unlabeled online sequences are utilized to improve the representation through semi-supervised learning; and (3) the constraints of spatial dependency among output labels are modeled explicitly in the architecture. Experiments on four benchmark datasets demonstrate that the proposed architecture consistently leads to state-of-the-art performance.
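
Strategies (1) and (3) can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`build_lookup`, `window_features`, `viterbi`), the random initialization, the window size, and the use of Viterbi decoding over a label-transition score matrix are all assumptions chosen for illustration. The network that turns window features into per-label scores is omitted, and the emission scores in the usage note are made up.

```python
import random


def build_lookup(alphabet, dim, seed=0):
    """Hypothetical lookup table: one small random vector per character.
    In a trained model these vectors would be learned, not random."""
    rng = random.Random(seed)
    return {ch: [rng.uniform(-0.1, 0.1) for _ in range(dim)] for ch in alphabet}


def window_features(seq, lookup, dim, half_width=1):
    """Concatenate the vectors of each character and its neighbours.
    Positions outside the sequence (and unknown characters) use zero padding."""
    pad = [0.0] * dim
    feats = []
    for i in range(len(seq)):
        vec = []
        for j in range(i - half_width, i + half_width + 1):
            vec.extend(lookup.get(seq[j], pad) if 0 <= j < len(seq) else pad)
        feats.append(vec)
    return feats


def viterbi(emissions, transitions):
    """Return the highest-scoring label path.
    emissions[t][k]: score of label k at position t.
    transitions[i][j]: score of moving from label i to label j."""
    n_labels = len(transitions)
    score = list(emissions[0])
    back = []  # back[t-1][j]: best previous label when position t has label j
    for t in range(1, len(emissions)):
        new_score, ptr = [], []
        for j in range(n_labels):
            best_i = max(range(n_labels), key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j] + emissions[t][j])
            ptr.append(best_i)
        score = new_score
        back.append(ptr)
    # Trace the best path backwards from the best final label.
    path = [max(range(n_labels), key=lambda k: score[k])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


# Usage with made-up scores: two labels, three positions.
emissions = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
free = [[0.0, 0.0], [0.0, 0.0]]        # no dependency between labels
sticky = [[0.0, -5.0], [-5.0, 0.0]]    # switching labels is penalized
path_free = viterbi(emissions, free)
path_sticky = viterbi(emissions, sticky)
```

With the `free` transitions each position simply takes its best-scoring label, while the `sticky` transitions make the decoder keep a single label across the sequence, which is the kind of dependency among output labels that strategy (3) refers to.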
