Query-based Biclustering using Formal Concept Analysis.

Authors: Joel S. Bader, Chandan K. Reddy, Rajul Anand, Faris Alqadah

DOI:

Keywords: Space (commercial competition), Scalability, Bag-of-words model, Formal concept analysis, Set (abstract data type), Context (language use), Data mining, Computer science, Biclustering, Quality (business)

Abstract: Biclustering methods have proven to be critical tools in the exploratory analysis of high-dimensional data, including information networks, microarray experiments, and bag-of-words data. However, most biclustering methods fail to answer specific questions of interest and do not incorporate prior knowledge and expertise from the user. To this end, query-based biclustering algorithms recently developed in the context of microarray data utilize a set of seed genes provided by the user, which are assumed to be tightly co-expressed or functionally related, to prune the search space and guide the biclustering algorithm. In this paper, a novel Query-Based Bi-Clustering algorithm, QBBC, is proposed via a new formulation that combines the advantages of low-variance biclustering techniques and Formal Concept Analysis. We prove that statistical dispersion measures that are order-preserving induce an ordering on the set of biclusters in the data. In turn, this ordering is exploited to form query-based biclusters in an efficient manner. Our approach provides a mechanism to generalize query-based biclustering to sparse high-dimensional data such as information networks and bag of words. Moreover, the proposed framework performs a local approach to query-based biclustering, as opposed to the global approaches that previous algorithms have employed. Experimental results indicate that the QBBC algorithm often produces higher-quality and more precise biclusters compared to state-of-the-art query-based methods. In addition, our performance evaluation illustrates the efficiency and scalability of QBBC compared to full biclustering approaches and other existing query-based approaches.
