Authors: Wallison W. Guimarães, Cristiano L. N. Pinto, Cristiane N. Nobre, Luis E. Zárate
Keywords: Context (language use), Class (biology), Transduction (machine learning), Translation initiation sites, Caenorhabditis elegans, Upstream (networking), Computer science, Support vector machine, Inductive reasoning, Computational biology
Abstract: The prediction of the Translation Initiation Site (TIS) in an mRNA (messenger ribonucleic acid) is a relevant and latent problem in molecular biology, which has benefited from the evolution of computational techniques in machine learning (ML). There are some scenarios where the dataset either does not have enough classified sequences to train a precise model, or it lacks an upstream region, as is the case for Caenorhabditis elegans. In this article, we compare the inductive and transductive approaches for TIS prediction, using a methodology that disregards the upstream region. With the proposed methodology, we achieved 95% training accuracy using only 2.5% of the sequences belonging to the Caenorhabditis elegans class, for which many sequences are available, but 75% for Rattus norvegicus, for which fewer are available, with the transductive approach. Our results demonstrate the viability of the transductive approach when few classified sequences are available, a common situation for organisms with incomplete gene sequencing.