Named entities in judicial transcriptions: extended conditional random fields

Authors: Elisabetta Fersini, Enza Messina

DOI: 10.1007/978-3-642-37247-6_26

Abstract: The progressive deployment of ICT technologies in the courtroom is leading to the development of integrated multimedia folders where entire trial contents (documents, audio and video recordings) are available for online consultation via web-based platforms. The current amount of unstructured textual data in the judicial domain, especially related to hearing transcriptions, therefore highlights the need to automatically extract structured data from unstructured ones to improve the efficiency of consultation processes. In this paper we address the problem of extracting structured information from transcriptions generated using an ASR (Automatic Speech Recognition) system, by integrating Conditional Random Fields with background information. The computational experiments show promising results in structuring ASR outputs, enabling robust and efficient document consultation.
