Breast Cancer Biomarker Selection Using Multiple Offspring Sampling


Biomarkers are biochemical characteristics that can be used to measure different aspects of a disease. In recent years, there has been much interest in biomarkers of different cancer variants for predicting the future course of the disease. However, DNA biomarker selection is a difficult task because it involves a special type of dataset, the microarray, which consists of a large number of features and a small number of samples. This paper proposes a new approach to biomarker selection by means of an innovative parallel evolutionary algorithm that performs wrapper feature selection, reducing thousands of genes to a small set of the most relevant ones. To test our method, the well-known Van’t Veer breast cancer dataset has been considered. Preliminary results outperform those reported by Van’t Veer in both accuracy and the number of genes selected.
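To make the idea of wrapper feature selection concrete, the following minimal sketch may help. It is not the Multiple Offspring Sampling algorithm proposed in this paper: it assumes a plain, sequential, single-population genetic algorithm, a nearest-centroid classifier scored by leave-one-out accuracy, and small synthetic data standing in for a microarray. All names (make_data, loo_accuracy, fitness, select_features) and all parameter values are hypothetical and chosen only for illustration.

```python
import random

def make_data(rng, n_samples=30, n_features=40, informative=(0, 1, 2)):
    """Synthetic stand-in for a microarray: few samples, many features."""
    X, y = [], []
    for i in range(n_samples):
        label = i % 2
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 2.0 * label  # only these features carry class signal
        X.append(row)
        y.append(label)
    return X, y

def loo_accuracy(X, y, mask):
    """Leave-one-out accuracy of a nearest-centroid classifier on masked features."""
    cols = [j for j, keep in enumerate(mask) if keep]
    if not cols:
        return 0.0
    correct = 0
    for i in range(len(X)):
        sums = {0: [0.0] * len(cols), 1: [0.0] * len(cols)}
        counts = {0: 0, 1: 0}
        for k in range(len(X)):
            if k == i:
                continue  # hold out sample i
            counts[y[k]] += 1
            for c, j in enumerate(cols):
                sums[y[k]][c] += X[k][j]
        dists = {
            label: sum((X[i][j] - sums[label][c] / counts[label]) ** 2
                       for c, j in enumerate(cols))
            for label in (0, 1)
        }
        correct += min(dists, key=dists.get) == y[i]
    return correct / len(X)

def fitness(X, y, mask, penalty=0.01):
    """Wrapper fitness: classifier accuracy minus a small cost per selected feature."""
    return loo_accuracy(X, y, mask) - penalty * sum(mask)

def select_features(X, y, rng, pop_size=20, generations=15, mutation_rate=0.05):
    """Evolve boolean feature masks; the better half of each generation survives."""
    n = len(X[0])
    pop = [[rng.random() < 0.2 for _ in range(n)] for _ in range(pop_size)]
    for _ in range(generations):
        ranked = sorted(pop, key=lambda m: fitness(X, y, m), reverse=True)
        parents = ranked[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n)  # one-point crossover
            child = a[:cut] + b[cut:]
            child = [(not g) if rng.random() < mutation_rate else g for g in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=lambda m: fitness(X, y, m))

rng = random.Random(0)
X, y = make_data(rng)
best = select_features(X, y, rng)
selected = [j for j, keep in enumerate(best) if keep]
print("selected features:", selected)
print("leave-one-out accuracy:", loo_accuracy(X, y, best))
```

The per-feature penalty in the fitness function is one simple way to reward both accuracy and a small gene set, the two criteria on which the results are compared; the actual fitness definition used in this work may differ.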

Workshop Data Mining in Functional Genomics and Proteomics: Current Trends and Future Directions