Deep Learning for Character-Based Information Extraction

Authors: Yanjun Qi, Sujatha G. Das, Ronan Collobert, Jason Weston

Abstract: In this paper we introduce a deep neural network architecture to perform information extraction on character-based sequences, e.g., named-entity recognition on Chinese text or secondary-structure detection on protein sequences. With its task-independent architecture, the network relies only on simple features, which obviates the need for task-specific feature engineering. The proposed discriminative framework includes three important strategies: (1) a learning module mapping characters to vector representations is included to capture the semantic relationship between characters; (2) abundant unlabeled online sequences are utilized to improve the representation through semi-supervised learning; and (3) the constraints of spatial dependency among output labels are modeled explicitly in the architecture. Experiments on four benchmark datasets demonstrate that the architecture consistently leads to state-of-the-art performance.
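
Strategies (1) and (3) can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: the function names (`build_vocab`, `window_features`, `viterbi`), the window size, and the toy scores are all assumptions made for illustration. In a real system the embedding table and the transition scores would be learned, and the per-character emission scores would come from the network rather than being written by hand.

```python
def build_vocab(sequences):
    """Map each distinct character to an integer id; id 0 is reserved for padding."""
    vocab = {"<pad>": 0}
    for seq in sequences:
        for ch in seq:
            vocab.setdefault(ch, len(vocab))
    return vocab


def window_features(seq, vocab, embeddings, half_width=1):
    """Strategy (1), sketched: look up each character's vector and
    concatenate it with the vectors of its neighbours in a fixed window."""
    ids = [0] * half_width + [vocab[ch] for ch in seq] + [0] * half_width
    feats = []
    for i in range(len(seq)):
        vec = []
        for j in ids[i:i + 2 * half_width + 1]:
            vec.extend(embeddings[j])
        feats.append(vec)
    return feats


def viterbi(emissions, transitions):
    """Strategy (3), sketched: pick the label sequence that maximises the sum
    of per-position emission scores and label-to-label transition scores."""
    n_labels = len(transitions)
    score = list(emissions[0])
    back = []
    for emit in emissions[1:]:
        prev_best, new_score = [], []
        for k in range(n_labels):
            best_j = max(range(n_labels), key=lambda j: score[j] + transitions[j][k])
            prev_best.append(best_j)
            new_score.append(score[best_j] + transitions[best_j][k] + emit[k])
        back.append(prev_best)
        score = new_score
    label = max(range(n_labels), key=lambda k: score[k])
    path = [label]
    for prev_best in reversed(back):
        label = prev_best[label]
        path.append(label)
    return path[::-1]


# Toy usage with hand-written (hypothetical) scores for two labels.
emissions = [[2, 0], [0, 1], [2, 0]]
sticky = [[0, -5], [-5, 0]]   # switching labels is penalised
free = [[0, 0], [0, 0]]       # no dependency between neighbouring labels
print(viterbi(emissions, sticky))  # [0, 0, 0]
print(viterbi(emissions, free))    # [0, 1, 0]
```

With the penalised transitions the decoder keeps a single label across the sequence, whereas with zero transitions it reduces to choosing the best label independently at each position. That difference is what modeling the dependency among output labels explicitly buys.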
