A comparative review of the selection methods for discovering differentially expressed genes in microarray experiments for classification

Katarzyna Stąpor, Paweł Błaszczyk, Adrian Brückner


This paper compares feature selection methods for discovering differentially expressed genes in microarray experiments. The comparison covers both filter methods and optimal subset selection methods. Simulated and biological datasets serve as the microarray gene expression data, and the usefulness of the selected genes for classification is also evaluated.


feature selection; multiple hypothesis testing; microarray experiment; supervised learning



DOI: http://dx.doi.org/10.21936/si2006_v27.n4.563