Filtering Next-Generation Sequencing of the Ig Gene Repertoire Data Using Antibody Structural Information

Authors: Aleksandr Kovaltsuk, Konrad Krawczyk, Sebastian Kelm, James Snowden, Charlotte M. Deane

DOI: 10.4049/JIMMUNOL.1800669

Keywords: Word error rate; Line (text file); Computer science; Repertoire; Drug discovery; Nucleic acid sequence; Computational biology; Sequence (medicine); DNA sequencing; Variable (computer science)

Abstract: Next-generation sequencing of the Ig gene repertoire (Ig-seq) produces large volumes of information at the nucleotide sequence level. Such data have improved our understanding of immune systems across numerous species and have already been successfully applied in vaccine development and drug discovery. However, the high-throughput nature of Ig-seq means that it is afflicted by high error rates. This has led to the development of error-correction approaches. Computational error-correction methods use sequence information alone, primarily designating sequences as likely to be correct if they are observed frequently. In this work, we describe an orthogonal method for filtering Ig-seq data, which considers the structural viability of each sequence. A typical natural Ab structure requires the presence of a disulfide bridge within each of its variable chains to maintain the fold. Our Ab Sequence Selector (ABOSS) uses the presence/absence of this bridge as a way of both identifying structurally viable sequences and estimating the sequencing error rate. On simulated Ig-seq datasets, ABOSS is able to identify more than 99% of structurally viable sequences. Applying our method to six independent Ig-seq datasets (one mouse and five human), we show that our error calculations are in line with previous experimental and computational error estimates. We also show how ABOSS is able to identify structurally impossible sequences missed by other error-correction methods.
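The filtering idea in the abstract can be illustrated with a minimal sketch. This is not the ABOSS implementation; it only shows the general principle of keeping sequences that carry both cysteines needed for the variable-domain disulfide bridge. It assumes each sequence has already been numbered in the IMGT scheme (where the conserved cysteines sit at positions 23 and 104) and is supplied as a plain mapping from position to residue. The names `has_disulfide_cysteines` and `filter_viable` are hypothetical, and the returned fraction of rejected sequences is only a crude proxy, not the error-rate estimator described in the paper.

```python
from typing import Dict, Iterable, List, Tuple

# Assumption of this sketch: sequences are IMGT-numbered, so the two
# conserved cysteines of the disulfide bridge are at positions 23 and 104.
CYS_POSITIONS = (23, 104)

# Hypothetical input type: IMGT position -> one-letter amino acid code.
Numbered = Dict[int, str]


def has_disulfide_cysteines(numbered: Numbered) -> bool:
    """Return True if both bridge positions are present and hold a cysteine."""
    return all(numbered.get(pos) == "C" for pos in CYS_POSITIONS)


def filter_viable(seqs: Iterable[Numbered]) -> Tuple[List[Numbered], float]:
    """Keep sequences with both cysteines; also report the fraction rejected."""
    all_seqs = list(seqs)
    viable = [s for s in all_seqs if has_disulfide_cysteines(s)]
    if not all_seqs:
        return viable, 0.0
    rejected_fraction = 1 - len(viable) / len(all_seqs)
    return viable, rejected_fraction


# Toy data: only the two positions of interest are shown.
example = [
    {23: "C", 104: "C"},
    {23: "C", 104: "C"},
    {23: "C", 104: "C"},
    {23: "C", 104: "Y"},  # second cysteine replaced, so the bridge cannot form
]
viable, rejected_fraction = filter_viable(example)
```

In this toy input, three of the four sequences pass the check. A real pipeline would first need a numbering step to produce the position-to-residue mappings, which is outside the scope of this sketch.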
