A SELF-ORGANIZING MAP FOR MIXED CONTINUOUS AND CATEGORICAL DATA
DOI:
https://doi.org/10.47839/ijc.10.1.733
Keywords:
Self-organizing map (SOM), unsupervised learning, continuous and categorical data.
Abstract
Most traditional clustering algorithms are limited to data sets that contain either continuous or categorical variables. However, data sets with mixed variable types are common in data mining. In this paper we introduce a weighted self-organizing map for the clustering, analysis, and visualization of mixed (continuous/binary) data. The weights and the prototypes are learned simultaneously, which yields an optimized clustering. The higher the weight of a variable, the more the clustering algorithm takes into account the information carried by that variable. The learning of these topological maps is thus combined with a variable-weighting process, in which the computed weights influence the quality of the clustering. We illustrate the power of this method on data sets taken from a public repository: a handwritten digit data set, the Zoo data set, and three other mixed data sets. The results show good topological ordering and homogeneous clustering.
License
International Journal of Computing is an open access journal. Authors who publish with this journal agree to the following terms:
• Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
• Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
• Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.