On the Computation of Distances for Probabilistic Context-Free Grammars

作者: Colin de la Higuera , James Scicluna , Mark-Jan Nederhof

DOI:

关键词: Model checkingContext-sensitive grammarChebyshev distanceEquivalence (formal languages)Discrete mathematicsL-attributed grammarContext-free grammarUndecidable problemTree-adjoining grammarMathematics

摘要: Probabilistic context-free grammars (PCFGs) are used to define distributions over strings, and powerful modelling tools in a number of areas, including natural language processing, software engineering, model checking, bio-informatics, pattern recognition. A common important question is that comparing the generated or modelled by these grammars: this done through checking equivalence computing distances. Two PCFGs equivalent if every string has identical probability with both grammars. This also means distance (whichever norm used) null. It known problem interreducible multiple ambiguity for grammars, long-standing open question. In work, we prove distances corresponds solving undecidable questions: case L1, L2 norm, variation Kullback-Leibler divergence. more results less negative: 1. The most probable can be computed, and, 2. Chebyshev (where between two maximum difference probabilities all strings) problem.

参考文章(23)
R. C. Carrasco, Accurate computation of the relative entropy between stochastic regular grammars Theoretical Informatics and Applications. ,vol. 31, pp. 437- 444 ,(1997) , 10.1051/ITA/1997310504371
Vijay Balasubramanian, Equivalence and Reduction of Hidden Markov Models Massachusetts Institute of Technology. ,(1993) , 10.21236/ADA270762
John D. Lafferty, Frederick Jelinek, Computation of the probability of initial substring generation by stochastic context-free grammars Computational Linguistics. ,vol. 17, pp. 315- 323 ,(1991) , 10.5555/971764.971768
Michael A. Harrison, Introduction to formal language theory ,(1978)
F. Jelinek, J. D. Lafferty, R. L. Mercer, Basic Methods of Probabilistic Context Free Grammars Springer, Berlin, Heidelberg. pp. 345- 360 ,(1992) , 10.1007/978-3-642-76626-8_35
Arto Salomaa, Werner Kuich, Semirings, Automata, Languages ,(2011)
CORINNA CORTES, MEHRYAR MOHRI, ASHISH RASTOGI, Lp DISTANCE AND EQUIVALENCE OF PROBABILISTIC AUTOMATA International Journal of Foundations of Computer Science. ,vol. 18, pp. 761- 779 ,(2007) , 10.1142/S0129054107004966
CORINNA CORTES, MEHRYAR MOHRI, ASHISH RASTOGI, MICHAEL RILEY, On the Computation of the Relative Entropy of Probabilistic Automata International Journal of Foundations of Computer Science. ,vol. 19, pp. 219- 242 ,(2008) , 10.1142/S0129054108005644
Vojtěch Forejt, Petr Jančar, Stefan Kiefer, James Worrell, Language equivalence of probabilistic pushdown automata Information and Computation. ,vol. 237, pp. 1- 11 ,(2014) , 10.1016/J.IC.2014.04.003