Authors: Jayanta Kumar Das, Antara Sengupta, Pabitra Pal Choudhury, Swarup Roy
DOI: 10.1016/J.GENE.2020.145096
Keywords:
Abstract: Phylogenetic analysis based on sequence similarity, targeted at real biological taxa, remains a major challenge. In this paper, we propose a novel alignment-free method, CoFASA (Codon Feature Amino acid Sequence Analyser), for nucleotide sequences. First, we assign numerical weights to the four nucleotides. We then calculate a score for each codon based on the numerical values of its constituent nucleotides, termed the degree of the codon. Accordingly, we obtain the degree of each amino acid from the degrees of the codons that code for that specific amino acid. Utilizing the degrees of the twenty amino acids and their relative abundance within a given sequence, we generate a 20-dimensional feature vector for every coding DNA or protein sequence. We use these features for phylogenetic analysis of sets of candidate sequences derived from Beta-globin (BG), NADH dehydrogenase subunit 5 (ND5), Transferrins (TFs), Xylanases, and low-identity (<40%) and high-identity (⩾40%) protein sequences (encompassing 533 and 1064 families) for experimental assessment. We compare our results with sixteen (16) well-known methods, including both alignment-based and alignment-free methods. Various assessment indices are used, such as the Pearson correlation coefficient, the Robinson-Foulds (RF) distance, and ROC scores for performance analysis. When compared with alignment-based methods (ClustalW, ClustalΩ, MAFFT, MUSCLE), CoFASA shows very similar results. Further, CoFASA performs better than alignment-free methods such as LZW-Kernel, jD2Stat, FFP, spaced, and AFKS-D2s in predicting the taxonomic relationships among taxa. Overall, we observe that the features derived by CoFASA are very useful in separating sequences according to their taxonomic labels. The method is cost-effective and, at the same time, produces consistent and satisfactory outcomes.
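The feature construction described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not give the actual nucleotide weights or the exact scoring formulas, so the weight table `NUCLEOTIDE_WEIGHT`, the use of a plain sum for the codon degree, the averaging over synonymous codons for the amino acid degree, and the Euclidean distance are all assumptions chosen only to make the pipeline concrete. The function names are likewise hypothetical.

```python
from collections import Counter
from itertools import product
import math

# ASSUMPTION: placeholder weights; the abstract does not state the real values.
NUCLEOTIDE_WEIGHT = {"A": 1, "C": 2, "G": 3, "T": 4}

# Standard genetic code, codons enumerated in TCAG order ("*" marks stop codons).
BASES = "TCAG"
CODE = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), CODE)}
AMINO_ACIDS = sorted(set(CODE) - {"*"})  # the twenty amino acids, alphabetical


def codon_degree(codon):
    """ASSUMPTION: codon degree is the sum of its nucleotide weights."""
    return sum(NUCLEOTIDE_WEIGHT[n] for n in codon)


def amino_acid_degrees():
    """ASSUMPTION: amino acid degree is the mean degree of its synonymous codons."""
    scores = {aa: [] for aa in AMINO_ACIDS}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            scores[aa].append(codon_degree(codon))
    return {aa: sum(v) / len(v) for aa, v in scores.items()}


def translate(dna):
    """Translate a coding DNA sequence in frame 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def features(protein):
    """20-dimensional vector: amino acid degree times its relative abundance."""
    degrees = amino_acid_degrees()
    counts = Counter(aa for aa in protein.upper() if aa in degrees)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [degrees[aa] * counts[aa] / total for aa in AMINO_ACIDS]


def distance(f1, f2):
    """ASSUMPTION: Euclidean distance between two feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f1, f2)))
```

A pairwise distance matrix built with `distance(features(p), features(q))` could then be passed to any standard tree-building routine; which clustering method the paper uses is not stated in the abstract.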