Extracting information from unstructured data and mapping the information to a structured schema using the naïve bayesian probability model

Authors: Rajiv Subrahmanyam, Hector Aguilar-Macias

DOI:

Keywords:

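Abstract: An “unstructured event parser” analyzes an event that is in unstructured form and generates an event that is in structured form. A mapping phase determines, for a given event token, the possible fields of the structured event schema to which the token could be mapped and the probabilities that the token should be mapped to those fields. Particular tokens are then mapped to particular fields of the schema. By using the Naive Bayesian probability model, a “probabilistic mapper” determines, for a particular token and a particular field, the probability that the token maps to that field. The probabilistic mapper can also be used in a “regular expression creator” that generates a regex that matches an unstructured event, and in a “parameter file creator” that helps a user create a parameter file for use with a parameterized normalized event generator to generate a normalized event based on an unstructured event.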
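The mapping phase described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: the class name `ProbabilisticMapper`, the token features (`type`, `prev`), the schema field names, and the training data are all hypothetical, chosen only to show how a Naive Bayes model turns token features into per-field probabilities under the usual feature-independence assumption, with Laplace smoothing.

```python
import math
from collections import Counter, defaultdict


class ProbabilisticMapper:
    """Toy Naive Bayes mapper from token features to schema fields (illustrative only)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha                           # Laplace smoothing constant
        self.field_counts = Counter()                # field -> number of training tokens
        self.feature_counts = defaultdict(Counter)   # (field, feature name) -> value counts
        self.feature_values = defaultdict(set)       # feature name -> values seen in training

    def train(self, examples):
        """examples: iterable of (features dict, field name) pairs."""
        for features, field in examples:
            self.field_counts[field] += 1
            for name, value in features.items():
                self.feature_counts[(field, name)][value] += 1
                self.feature_values[name].add(value)

    def probabilities(self, features):
        """Return {field: P(field | features)}, assuming features are independent given the field."""
        total = sum(self.field_counts.values())
        log_scores = {}
        for field, count in self.field_counts.items():
            log_p = math.log(count / total)          # prior P(field)
            for name, value in features.items():
                seen = self.feature_counts[(field, name)][value]
                vocab = len(self.feature_values[name]) + 1   # one extra slot for unseen values
                log_p += math.log((seen + self.alpha) / (count + self.alpha * vocab))
            log_scores[field] = log_p
        # Normalize in log space so the probabilities sum to 1.
        top = max(log_scores.values())
        weights = {f: math.exp(s - top) for f, s in log_scores.items()}
        norm = sum(weights.values())
        return {f: w / norm for f, w in weights.items()}


# Hypothetical training data: token features paired with the field a user mapped them to.
TRAINING = [
    ({"type": "ip", "prev": "from"}, "sourceAddress"),
    ({"type": "ip", "prev": "from"}, "sourceAddress"),
    ({"type": "ip", "prev": "to"}, "destinationAddress"),
    ({"type": "ip", "prev": "to"}, "destinationAddress"),
    ({"type": "number", "prev": "port"}, "destinationPort"),
    ({"type": "word", "prev": "user"}, "sourceUserName"),
]

mapper = ProbabilisticMapper()
mapper.train(TRAINING)

# Candidate fields and probabilities for an IP-like token that follows the word "from".
probs = mapper.probabilities({"type": "ip", "prev": "from"})
best_field = max(probs, key=probs.get)
```

Here `probs` holds one probability per candidate field, matching the abstract's description of a mapping phase that lists possible fields with their probabilities; picking `best_field` corresponds to mapping the token to a particular field.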

References (8)
William M. Alexander, Rubin Jin, Hector Aguilar-Macias, Dhaval M. Shah, Specifying a Parser Using a Properties File, (2010)
John Churchill, Philip Bernstein, Sergey Melnik, Selective schema matching, (2006)
Bekim Demiroski, Michael E. Deem, Nigel R. Ellis, Michael J. Newman, Anil Kumar Nori, Gregory S. Friedman, Michael B. Taylor, Michael J. Pizzo, Sanjay Nagamangalam, Schema grammar and compilation, (2005)
Thomas H. Hargrove, Vladimir Schipunov, Rajeev Prasad, Data processing over very large databases, (2006)
Kamal Nigam, Andrew Kachites McCallum, Sebastian Thrun, Tom Mitchell, Text Classification from Labeled and Unlabeled Documents using EM, Machine Learning, vol. 39, pp. 103-134, (2000), 10.1023/A:1007692713085
Bekim Demiroski, Michael J. Newman, Anil Kumar Nori, Gregory S. Friedman, Jeffrey T. Pearce, Jason T. Hunter, Srinivasmurthy P. Acharya, Amit Shukla, Nigel R. Ellis, Mapping of a file system model to a database object, (2006)