Authors: Xiaodan Fan, Tong Liang, Shuo-Yen R. Li, Qiwei Li
DOI: 10.1093/BIOINFORMATICS/BTR287
Keywords: Structure (mathematical logic), Biological data, String searching algorithm, Repeated sequence, Mathematics, Algorithm, Statistical inference, Task (computing), Source code, Signal processing
Abstract: Motivation: Repeat detection problems are traditionally formulated as string matching or signal processing problems. These formulations cannot readily handle gaps between repeat units and are incapable of detecting patterns shared by multiple sequences. This study detects short adjacent repeats with interunit insertions from multiple sequences. For biological sequences, such studies can shed light on molecular structure, function, and evolution. Results: The task is formulated as a statistical inference problem using a probabilistic generative model. A Markov chain Monte Carlo algorithm is proposed to infer the parameters in a de novo fashion. Its application to synthetic and real data shows that the new method not only has a competitive edge over existing methods, but also provides a way to study the structure and evolution of repeat-containing genes. Availability: The related C++ source code and datasets are available at http://ihome.cuhk.edu.hk/%7Eb118998/share/BASARD.zip. Contact: xfan@sta.cuhk.edu.hk Supplementary information: Supplementary data are available at Bioinformatics online.