Authors: Byron J. Gao, Obi L. Griffith, Martin Ester, Steven J. M. Jones
Keywords:
Abstract: Order-preserving submatrixes (OPSMs) have been accepted as a biologically meaningful subspace cluster model, capturing the general tendency of gene expressions across a subset of conditions. In an OPSM, the expression levels of all genes induce the same linear ordering of the conditions. OPSM mining is reducible to a special case of the sequential pattern mining problem, in which a pattern and its supporting sequences uniquely specify an OPSM cluster. Small twig clusters, specified by long patterns with naturally low support, incur explosive computational costs and would be completely pruned off by most existing methods for massive datasets containing thousands of conditions and hundreds of thousands of genes, which are common in today's gene expression analysis. However, it is of particular interest to biologists to reveal such small groups of genes that are tightly coregulated under many conditions, and some pathways or processes might require only two genes to act in concert. In this paper, we introduce the KiWi mining framework for massive datasets, which exploits two parameters, k and w, to provide biased testing on a bounded number of candidates, substantially reducing the search space and problem scale and targeting highly promising seeds that lead to significant clusters and twig clusters. Extensive biological and computational evaluations on real datasets demonstrate that KiWi can effectively mine biologically meaningful OPSM subspace clusters with good efficiency and scalability.
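
The reduction from OPSM mining to sequential pattern mining described in the abstract can be illustrated with a minimal Python sketch. It is not the KiWi algorithm and does not use the parameters k and w. It only shows how each gene (row) can be turned into a sequence of condition indices ordered by expression level, and how a pattern together with its supporting rows specifies a cluster. The function names, the toy matrix, and the tie-breaking rule (ties resolved by column index) are assumptions made for illustration.

```python
from typing import List, Sequence


def row_to_sequence(row: Sequence[float]) -> List[int]:
    """Return the condition (column) indices sorted by increasing expression.

    Assumption: ties are broken by column index, because sorted() is stable.
    """
    return sorted(range(len(row)), key=lambda j: row[j])


def supports(sequence: Sequence[int], pattern: Sequence[int]) -> bool:
    """True if `pattern` is a (not necessarily contiguous) subsequence."""
    it = iter(sequence)
    # `col in it` advances the iterator, so matches must occur in order.
    return all(col in it for col in pattern)


def supporting_rows(matrix: Sequence[Sequence[float]],
                    pattern: Sequence[int]) -> List[int]:
    """Indices of the rows (genes) whose column ordering contains `pattern`."""
    return [i for i, row in enumerate(matrix)
            if supports(row_to_sequence(row), pattern)]


# Toy expression matrix: 3 genes (rows) x 4 conditions (columns).
matrix = [
    [0.2, 0.9, 0.5, 0.7],  # column order by value: 0, 2, 3, 1
    [1.0, 3.0, 2.0, 2.5],  # column order by value: 0, 2, 3, 1
    [0.8, 0.1, 0.6, 0.3],  # column order by value: 1, 3, 2, 0
]

# Candidate pattern: expression rises from condition 0 to 2 to 1.
pattern = [0, 2, 1]
genes = supporting_rows(matrix, pattern)
print(genes)  # [0, 1]
```

Here the pattern `[0, 2, 1]` and its supporting rows `[0, 1]` together specify one OPSM: within those two genes, expression increases across conditions 0, 2, 1 in the same order. A longer pattern generally has fewer supporting rows, which is why the twig clusters mentioned in the abstract have naturally low support.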