Abstract: Feature selection is often applied to high-dimensional data prior to classification learning. Using the same training dataset in both selection and learning can result in so-called feature subset selection bias. This bias putatively can exacerbate data over-fitting and negatively affect classification performance. However, in current practice separate datasets are seldom employed for selection and learning, because dividing the training data into two datasets for feature selection and classifier learning respectively reduces the amount of data that can be used in either task. This work attempts to address this dilemma. We formalize selection bias for classification learning, analyze its statistical properties, and study factors that affect selection bias, as well as how the bias impacts classification learning via various experiments. This research endeavors to provide illustration and explanation of why the bias may not cause negative impact in classification as much as expected in regression.