UNBIASEDNESS OF FEATURE SELECTION BY HYBRID FILTERING

Authors

  • Wiesław Pietruszkiewicz

DOI:

https://doi.org/10.47839/ijc.10.1.735

Keywords:

Feature selection, hybrid algorithms, machine learning, classification.

Abstract

In this article we examine characteristics of feature selection algorithms that are important in practice. We focus on unbiasedness, analyse it, and investigate a robust hybrid method of feature selection, built as a composition of several feature filters, that can ensure unbiased selection results. By using parallel multiple measures and voting, we reduce the risk of selecting non-optimal features, a common situation when attributes are selected with a single evaluation criterion. To test this method we selected a personal bankruptcy dataset containing various types of attributes, as well as one of the popular machine learning benchmarks. The performed experiments demonstrate that a multi-evaluation approach to feature filtering can lead to effective and fast feature selection methods with an unbiased outcome.
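The abstract's core idea, several filter criteria scoring features in parallel and a vote deciding which features survive, can be sketched as follows. This is an illustrative sketch only: the measure functions (`pearson_score`, `fisher_score`) and the majority-vote rule are assumptions for demonstration, not the paper's exact filters or voting scheme.

```python
# Hypothetical sketch of hybrid filtering by parallel multi-measure voting.
# The two scoring functions and the majority-vote rule are illustrative
# stand-ins; the article's actual filter measures may differ.
import math

def pearson_score(xs, ys):
    # Absolute Pearson correlation between one feature column and the labels.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = math.sqrt(sum((a - mx) ** 2 for a in xs))
    vy = math.sqrt(sum((b - my) ** 2 for b in ys))
    return abs(cov / (vx * vy)) if vx and vy else 0.0

def fisher_score(xs, ys):
    # Ratio of between-class to within-class spread, for binary 0/1 labels.
    g0 = [a for a, b in zip(xs, ys) if b == 0]
    g1 = [a for a, b in zip(xs, ys) if b == 1]
    m0, m1 = sum(g0) / len(g0), sum(g1) / len(g1)
    v0 = sum((a - m0) ** 2 for a in g0) / len(g0)
    v1 = sum((a - m1) ** 2 for a in g1) / len(g1)
    return (m0 - m1) ** 2 / (v0 + v1 + 1e-12)

def hybrid_filter(X, y, filters, k):
    # X: list of samples (each a list of feature values); y: class labels.
    n_feat = len(X[0])
    columns = [[row[j] for row in X] for j in range(n_feat)]
    votes = [0] * n_feat
    for f in filters:
        scores = [f(col, y) for col in columns]
        ranked = sorted(range(n_feat), key=lambda j: scores[j], reverse=True)
        for j in ranked[:k]:      # each filter votes for its top-k features
            votes[j] += 1
    # Keep features endorsed by a majority of the filters.
    need = len(filters) // 2 + 1
    return [j for j in range(n_feat) if votes[j] >= need]
```

Because each filter only ranks features once and the vote is a linear pass, the combined method keeps the speed of single-criterion filtering while reducing the bias any one criterion would introduce on its own.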

References

S. Avidan. Joint feature-basis subset selection. Proceedings of the 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2004.

W. Duch, T. Wieczorek, J. Biesiada, M. Blachnik. Comparison of feature ranking methods based on information entropy. Proceedings of the International Joint Conference on Neural Networks, 2004.

A. Frank, A. Asuncion. UCI Machine Learning Repository, http://archive.ics.uci.edu/ml. Irvine, CA: University of California, School of Information and Computer Science, 2010.

M. Gonen. Analyzing Receiver Operating Characteristic Curves Using SAS. SAS Press, 2007.

P. E. Greenwood, M. S. Nikulin. A Guide to Chi-Squared Testing. John Wiley & Sons, New York, 1996.

I. Guyon, A. Elisseeff. An introduction to variable and feature selection. Journal of Machine Learning Research (3) (2003), pp. 1157-1182.

M. A. Hall, L. A. Smith. Feature selection for machine learning: Comparing a correlation-based filter approach to the wrapper. Proceedings of the Twelfth International FLAIRS Conference, 1999.

D. Koller, M. Sahami. Toward optimal feature selection. Proceedings of the Thirteenth International Conference on Machine Learning, 1996.

L. Kuncheva. Combining Pattern Classifiers: Methods and Algorithms. Wiley-IEEE, Hoboken, 2004.

H. Liu, R. Setiono. Incremental feature selection. Applied Intelligence (9) (1998), pp. 217-230.

J. R. Quinlan. C4.5: programs for machine learning. Morgan Kaufmann. San Francisco, 1993.

J. Ren, Z. Qiu, W. Fan, H. Cheng, P. S. Yu. Forward semi-supervised feature selection. Pacific-Asia Conference on Knowledge Discovery and Data Mining, 2008.

R. Rojas. Neural Networks – A Systematic Introduction. Springer-Verlag. Berlin, 1996.

L. Rozenberg, W. Pietruszkiewicz. The methodic of diagnosis and prognosis of household bankruptcy. Difin, Warszawa, 2008.

Y. Saeys, I. Inza, P. Larrañaga, D. J. Wren. A review of feature selection techniques in bioinformatics. Bioinformatics 23 (19) (2007).

D. Ververidis, C. Kotropoulos. Sequential forward feature selection with low computational cost. Proceedings of the European Signal Processing Conference, 2005.

I. H. Witten, E. Frank. Data Mining: Practical Machine Learning Tools and Techniques. Morgan Kaufmann, San Francisco, 2005.

Z. Zheng. Feature selection for text categorization on imbalanced data. ACM SIGKDD Explorations Newsletter archive 6 (1) (2004), pp. 80-89.

J. Zhou, D. P. Foster, R. A. Stine, L. H. Ungar. Streamwise feature selection. Journal of Machine Learning Research (7) (2006), pp. 1861-1885.

Published

2011-12-20

How to Cite

Pietruszkiewicz, W. (2011). UNBIASEDNESS OF FEATURE SELECTION BY HYBRID FILTERING. International Journal of Computing, 10(1), 42-49. https://doi.org/10.47839/ijc.10.1.735

Issue

Section

Articles