Authors: Paul Pavlidis, Christopher Tang, William Stafford Noble
DOI:
Keywords: Training set, Expression (mathematics), Statistical model, Probabilistic logic, Microarray, Class (biology), Sequence analysis, Pattern recognition, Gene, Computational biology, Computer science, Artificial intelligence, Supervised learning
Abstract: Microarray expression data provide a new method for classifying genes and gene products according to their expression profiles. Numerous unsupervised and supervised learning methods have been applied to the task of discovering and learning to recognize classes of co-expressed genes. Here we present a supervised learning method based upon techniques borrowed from biological sequence analysis. The expression profile of a gene class is summarized in a probabilistic model similar to a position-specific scoring matrix (PSSM). This model provides insight into the expression characteristics of the class, as well as accurate recognition performance. Because PSSM models are generative, they are particularly useful when a biologist can identify a class of genes a priori but is unable to identify a large collection of non-members to serve as a negative training set. We validate the technique using expression data from S. cerevisiae and C. elegans.
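The PSSM analogy in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say how expression values are modeled, so this example assumes that each value is discretized into bins, that each experiment plays the role of a sequence position, and that a gene is scored by its log-odds against a uniform background. The function names, bin edges, pseudocount, and toy data are all hypothetical.

```python
import math

def discretize(value, edges):
    """Return the index of the bin that `value` falls into (hypothetical binning)."""
    for i, edge in enumerate(edges):
        if value < edge:
            return i
    return len(edges)

def train_profile_model(profiles, edges, pseudocount=1.0):
    """Estimate per-experiment bin probabilities from class members only.

    Each experiment (column) is treated like a position in a PSSM, and each
    expression bin like a residue. No negative examples are required.
    """
    n_bins = len(edges) + 1
    n_experiments = len(profiles[0])
    model = []
    for j in range(n_experiments):
        counts = [pseudocount] * n_bins  # pseudocounts avoid zero probabilities
        for profile in profiles:
            counts[discretize(profile[j], edges)] += 1.0
        total = sum(counts)
        model.append([c / total for c in counts])
    return model

def log_odds_score(profile, model, edges, background=None):
    """Sum over experiments of log(P(bin | class) / P(bin | background))."""
    n_bins = len(edges) + 1
    if background is None:
        background = [1.0 / n_bins] * n_bins  # assumed uniform background
    score = 0.0
    for j, value in enumerate(profile):
        b = discretize(value, edges)
        score += math.log(model[j][b] / background[b])
    return score

# Toy log-ratio profiles for three genes assumed to belong to one class.
edges = [-0.5, 0.5]  # three bins: repressed, unchanged, induced
class_profiles = [
    [1.2, 0.9, -1.1, 0.0],
    [0.8, 1.5, -0.7, 0.1],
    [1.0, 0.7, -1.3, -0.2],
]
model = train_profile_model(class_profiles, edges)

member_like = log_odds_score([1.1, 1.0, -0.9, 0.0], model, edges)
non_member = log_odds_score([-1.0, -1.2, 1.0, 1.4], model, edges)
print(member_like > 0, non_member < 0)
```

Because the model is trained on class members alone, a candidate gene can be ranked by its score without assembling a negative training set, which is the situation the abstract highlights.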