Authors: Yufeng Wu, Jingyang Gao, Lei Cai
DOI: 10.1109/BIBM49941.2020.9313484
Keywords:
Abstract: Single-cell data are sparse and have coverage fluctuations, making it more difficult to call single nucleotide variants (SNVs) and indels than with data obtained from next-generation sequencing (NGS). Furthermore, most existing methods are unable to effectively call whole-genome SNVs and indels in single cell sequencing (SCS) data. In this study, we propose a new method for the efficient identification of SNVs and indels in SCS data, called scSNVIndel. scSNVIndel uses bidirectional long short-term memory (Bi-LSTM) as its base and integrates natural language processing (NLP) technology. It automatically extracts features and accurately calls SNVs and indels when using SCS data, which is characterized by uneven and discontinuous coverage. Moreover, scSNVIndel operates directly on the sequence data, retaining valuable sequence information, and does not convert the sequence into an image like the DeepVariant method. The results show that scSNVIndel performs better in terms of accuracy and recall in calling variants compared with other methods. scSNVIndel is currently an open-source method, available at https://github.com/CSuperlei/scSNVIndel, and its usage is published on the following website: https://www.aiguqu.com/2020/06/18/scSNVIndel/.
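
The abstract contrasts feeding sequence data directly to a recurrent model with converting it into an image. As a rough illustration only, the sketch below turns a window of bases into a fixed-length list of integer tokens, the kind of input a sequence model such as a Bi-LSTM could consume. The vocabulary, the padding scheme, and the function name `encode_window` are all hypothetical; the abstract does not describe the encoding scSNVIndel actually uses.

```python
# Hypothetical base-level vocabulary (an assumption, not taken from scSNVIndel).
# "-" stands for a gap such as a deleted base; "<pad>" fills short windows.
VOCAB = {"<pad>": 0, "A": 1, "C": 2, "G": 3, "T": 4, "N": 5, "-": 6}


def encode_window(bases, length):
    """Map a string of bases to a fixed-length list of integer token ids."""
    # Truncate to the window length; unknown symbols fall back to "N".
    ids = [VOCAB.get(b, VOCAB["N"]) for b in bases.upper()[:length]]
    # Right-pad short windows so every example has the same length.
    return ids + [VOCAB["<pad>"]] * (length - len(ids))


if __name__ == "__main__":
    print(encode_window("ACG-T", 8))  # [1, 2, 3, 6, 4, 0, 0, 0]
```

In a sketch like this the sequence stays one-dimensional and ordered, so a bidirectional recurrent layer could read the tokens from both ends without an intermediate image representation.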