Authors: Antonio Gonzalez-Pardo, David Camacho
Keywords:
Abstract: Regular expressions, or regexes, have traditionally been used as a pattern matching tool to search for structures in sets of objects, such as files, text documents, or folders. Pattern matching can be used to look for files whose name contains a given string, to look for files that contain a specific pattern within them, or simply to extract text from documents. It is very popular to apply regexes to detect and extract patterns that represent phone numbers, URLs, email addresses, etc. This kind of information is characterized by a well-defined structure. Nevertheless, regexes are not frequently used because of their high complexity in both syntax and grammatical rules, which makes them difficult to understand. For this reason, the development of programs able to automatically generate and evaluate regexes has become a valuable task. This work analyzes the performance of different evolutionary approaches to the automatic generation of URL patterns. Four types of grammars are evaluated: a context-free grammar, a context-free grammar with a penalized fitness function, an extensible context-free grammar, and a Christiansen grammar. For the considered problem, the experimental results show that the best system, measured as cumulative success rate, is achieved using Christiansen grammars.