Authors: Gene Yeo, Christopher B. Burge
Keywords:
Abstract: We propose a framework for modeling sequence motifs based on the maximum entropy principle (MEP). We recommend approximating short sequence motif distributions with the maximum entropy distribution (MED) consistent with low-order marginal constraints estimated from available data, which may include dependencies between nonadjacent as well as adjacent positions. Many maximum entropy models (MEMs) are specified by simply changing the set of constraints. Such models can be utilized to discriminate between signals and decoys. Classification performance using different MEMs gives insight into the relative importance of dependencies between different positions. We apply our framework to large datasets of RNA splicing signals. Our best models out-perform previous probabilistic models in the discrimination of human 5' (donor) and 3' (acceptor) splice sites from decoys. Finally, we discuss mechanistically motivated ways of comparing models.
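
The central idea of the abstract, a maximum entropy distribution constrained by low-order marginals, can be illustrated with a small sketch. The code below is not the authors' implementation: it assumes iterative proportional fitting as the fitting procedure, a pseudocount for smoothing, and a made-up set of 3-mers (`toy_sites`) in place of real splice site data. All function names are hypothetical. Starting from the uniform distribution, each update rescales the model so that one chosen marginal matches its empirical target; changing the `constraints` list specifies a different model.

```python
from collections import Counter, defaultdict
from itertools import product

ALPHABET = "ACGT"

def empirical_marginal(seqs, positions, pseudocount=0.5):
    """Smoothed frequency of each letter tuple observed at `positions`."""
    counts = Counter(tuple(s[i] for i in positions) for s in seqs)
    keys = list(product(ALPHABET, repeat=len(positions)))
    total = len(seqs) + pseudocount * len(keys)
    return {k: (counts[k] + pseudocount) / total for k in keys}

def model_marginal(p, positions):
    """Marginal of the model distribution `p` over `positions`."""
    m = defaultdict(float)
    for x, px in p.items():
        m[tuple(x[i] for i in positions)] += px
    return m

def fit_med(seqs, constraints, sweeps=500):
    """Iterative proportional fitting, started from the uniform distribution.

    Each entry of `constraints` is a tuple of positions whose joint
    marginal the fitted distribution should reproduce.
    """
    length = len(seqs[0])
    space = list(product(ALPHABET, repeat=length))
    p = {x: 1.0 / len(space) for x in space}
    targets = {c: empirical_marginal(seqs, c) for c in constraints}
    for _ in range(sweeps):
        for c, q in targets.items():
            m = model_marginal(p, c)
            for x in p:
                key = tuple(x[i] for i in c)
                p[x] *= q[key] / m[key]  # rescale to match this marginal
    return p

# Hypothetical toy data, not real splice sites.
toy_sites = ["GTA", "GTA", "GTG", "GTA", "GCA", "ATA", "GTT", "GTA"]

# Two adjacent pairs plus one nonadjacent pair (positions 0 and 2).
constraints = [(0, 1), (1, 2), (0, 2)]
med = fit_med(toy_sites, constraints)
```

To discriminate signals from decoys in this setting, one would fit a second distribution to decoy sequences in the same way and score a candidate by the log ratio of its probabilities under the two models.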