Authors: Emanuele Giaquinta, Kimmo Fredriksson, Szymon Grabowski, Alexandru I. Tomescu, Esko Ukkonen
DOI: 10.1016/J.TCS.2014.06.032
Keywords: Variable length, Mathematics, Fixed length, Alphabet, Combinatorics, Motif (music)
Abstract: We consider the problem of matching a set \(\mathcal{P}\) of gapped patterns against a given text T of length n, where a gapped pattern is a sequence of strings (keywords), over a finite alphabet Σ of size σ, such that there is a gap of fixed length between each two consecutive strings. We assume the RAM model, with words of size w in bits. We are interested in computing the list of matching patterns for each position in the text. This problem is a specific instance of the Variable Length Gaps problem [2] (VLG problem) for multiple patterns and has applications in the discovery of transcription factor (TF) binding sites in DNA sequences when using generalized versions of the Position Weight Matrix (PWM) model to represent TF binding specificities. The paper [5] describes how a motif represented as a generalized PWM can be matched as a set of gapped patterns with unit-length keywords, and presents algorithms for the restricted case of unit-length keywords.
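
To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not one of the algorithms from the paper and does not use the word-RAM model. It only illustrates what it means for a gapped pattern to match. The representation of a pattern as a list of (gap, keyword) pairs, the function names matches_at and match_all, and the choice to report matches by starting position are all assumptions made for this illustration.

```python
from typing import List, Sequence, Tuple

# Assumed representation: a gapped pattern is a sequence of (gap, keyword)
# pairs, where `gap` is the fixed number of text characters skipped before
# `keyword`. The gap of the first keyword is 0.
GappedPattern = Sequence[Tuple[int, str]]


def matches_at(text: str, pattern: GappedPattern, start: int) -> bool:
    """Return True if `pattern` occurs in `text` beginning at index `start`."""
    pos = start
    for gap, keyword in pattern:
        pos += gap  # skip the fixed-length gap (any characters are allowed)
        if not text.startswith(keyword, pos):
            return False
        pos += len(keyword)
    return True


def match_all(text: str, patterns: Sequence[GappedPattern]) -> List[List[int]]:
    """For each text position, list the indices of the patterns starting there.

    Naive scan: every pattern is tested at every position.
    """
    result: List[List[int]] = [[] for _ in text]
    for i in range(len(text)):
        for k, pattern in enumerate(patterns):
            if matches_at(text, pattern, i):
                result[i].append(k)
    return result


# Hypothetical example data over the DNA alphabet.
text = "ACGTACGA"
patterns = [
    [(0, "AC"), (1, "T")],  # "AC", one arbitrary character, then "T"
    [(0, "C"), (2, "A")],   # "C", two arbitrary characters, then "A"
]
occurrences = match_all(text, patterns)
```

In this example pattern 0 matches at position 0 and pattern 1 matches at position 1. The scan costs time proportional to the text length times the total pattern length, which is the kind of baseline that more refined algorithms aim to improve on.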