A SELF-ORGANIZING MAP FOR MIXED CONTINUOUS AND CATEGORICAL DATA

Authors

  • Nicoleta Rogovschi
  • Mustapha Lebbah
  • Younès Bennani

DOI:

https://doi.org/10.47839/ijc.10.1.733

Keywords:

Self-organizing map (SOM), unsupervised learning, continuous and categorical data.

Abstract

Most traditional clustering algorithms can handle only data sets that contain either continuous or categorical variables. However, data sets with mixed types of variables are common in data mining. In this paper we introduce a weighted self-organizing map for the clustering, analysis and visualization of mixed data (continuous/binary). The weights and prototypes are learned simultaneously, which ensures an optimized clustering of the data. The higher the weight of a variable, the more the clustering algorithm takes into account the information carried by that variable. The learning of these topological maps is combined with a weighting process over the variables, in which the computed weights influence the quality of the clustering. We illustrate the power of this method with data sets taken from a public data set repository: a handwritten digit data set, the Zoo data set and three other mixed data sets. The results show good topological ordering and homogeneous clustering.
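
The abstract does not give the distance or the weight update rule, so the following is only a minimal sketch of how per-variable weights might influence the assignment of a mixed sample to a map cell. It assumes a weighted squared Euclidean term for the continuous part plus a weighted Hamming term for the binary part; this form, the function names and the toy prototypes are illustrative assumptions, not the authors' formulation, and the learning of the weights themselves is not shown.

```python
from typing import List, Sequence, Tuple

Prototype = Tuple[Sequence[float], Sequence[int]]


def weighted_mixed_distance(
    x_cont: Sequence[float],
    x_bin: Sequence[int],
    proto_cont: Sequence[float],
    proto_bin: Sequence[int],
    w_cont: Sequence[float],
    w_bin: Sequence[float],
) -> float:
    """Weighted squared Euclidean part plus weighted Hamming part (assumed form)."""
    cont = sum(w * (a - b) ** 2 for w, a, b in zip(w_cont, x_cont, proto_cont))
    binary = sum(w * (a != b) for w, a, b in zip(w_bin, x_bin, proto_bin))
    return cont + binary


def best_matching_unit(
    x_cont: Sequence[float],
    x_bin: Sequence[int],
    prototypes: List[Prototype],
    w_cont: Sequence[float],
    w_bin: Sequence[float],
) -> int:
    """Index of the prototype closest to the sample under the weighted distance."""
    distances = [
        weighted_mixed_distance(x_cont, x_bin, p_cont, p_bin, w_cont, w_bin)
        for p_cont, p_bin in prototypes
    ]
    return min(range(len(distances)), key=distances.__getitem__)


# Two hypothetical map cells, each a (continuous part, binary part) prototype.
prototypes = [([0.0, 0.0], [0, 0, 1]), ([1.0, 1.0], [1, 1, 0])]
sample_cont, sample_bin = [0.9, 0.2], [0, 0, 1]

# Equal weights: the exact match on the binary part decides the assignment.
bmu_equal = best_matching_unit(
    sample_cont, sample_bin, prototypes, [1.0, 1.0], [1.0, 1.0, 1.0]
)
# Heavy weight on the first continuous variable, light weights on the binary ones.
bmu_skewed = best_matching_unit(
    sample_cont, sample_bin, prototypes, [10.0, 1.0], [0.1, 0.1, 0.1]
)
```

With equal weights the sample is assigned to the first cell, whereas the skewed weights move it to the second cell, which illustrates in a toy setting why a variable with a high weight carries more influence on the clustering.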

Published

2011-12-20

How to Cite

Rogovschi, N., Lebbah, M., & Bennani, Y. (2011). A SELF-ORGANIZING MAP FOR MIXED CONTINUOUS AND CATEGORICAL DATA. International Journal of Computing, 10(1), 24-32. https://doi.org/10.47839/ijc.10.1.733

Section

Articles