Two-Step Verifications for Multi-instance Features Selection: A Machine Learning Approach

Authors: M. N. Y. Ali, S. F. Nimmy

DOI: 10.1007/978-3-319-65981-7_7

Abstract: Multi-instance feature measurement is an important step in identifying characteristics that are bound to various experimental events. In biological data processing, a set of critical factors is responsible for several diseases. Computational simulation will help design an optimal tool for cost-effective drug design. In this regard, the processing of big data is valuable for efficient simulation. Recent results generate huge amounts of related data. In the current work, noisy data have been treated with three filtering techniques: the cross-validated committees filter (CVCF), the iterative partitioning filter (IPF), and the ensemble filter (EF). A comparison was made of these approaches. The filtered datasets were normalized. The repeated application of normalization techniques removed irregularities from the structured datasets. Wide ranges of comparisons were made among the techniques. After being appropriately structured, the normalized datasets were transformed accordingly with different transformation processes: rank transformation, nominal-to-binary transformation, and Box-Cox transformation. To prevent false positive and false negative outcomes of the experiments, certain key aspects were considered: accuracy, sensitivity, specificity, and F-measures. Accuracy of the experiments relates to the level of precise detection of factors; specificity allows the selection of dominant factors; and F-measures relate to the ratio between training and testing datasets. Detailed analysis included the study of four classifiers on a deoxyribonucleic acid (DNA) dataset.
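
The three transformations named in the abstract (rank, nominal-to-binary, and Box-Cox) can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the function names, the average-rank rule for ties, and the example λ value are all assumptions made for illustration.

```python
import math


def rank_transform(values):
    """Replace each value by its 1-based rank; tied values share the average rank.

    The tie-handling rule is an assumption; the abstract does not specify one.
    """
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def nominal_to_binary(labels):
    """Encode nominal labels as one 0/1 indicator column per distinct category.

    Columns follow the sorted order of the categories.
    """
    categories = sorted(set(labels))
    return [[1 if label == c else 0 for c in categories] for label in labels]


def box_cox(values, lam):
    """Box-Cox power transform: (x**lam - 1) / lam, or log(x) when lam == 0.

    Defined only for strictly positive inputs.
    """
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1) / lam for v in values]


# Hypothetical toy inputs, not data from the chapter.
print(rank_transform([10, 30, 20, 30]))
print(nominal_to_binary(["A", "C", "G", "A"]))
print(box_cox([1.0, 4.0], 0.5))
```

In practice λ for Box-Cox is usually estimated from the data (for example by maximum likelihood) rather than fixed by hand; the fixed value here only keeps the sketch short.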
