HUM-TO-CHORD CONVERSION USING CHROMA FEATURES AND HIDDEN MARKOV MODEL

Authors

  • Hariyanto Hariyanto
  • Suyanto Suyanto

DOI:

https://doi.org/10.47839/ijc.19.4.1988

Keywords:

chroma features, Hidden Markov Model, hum-to-chord conversion, music

Abstract

Music is, at its core, sound arranged to be harmonious and rhythmic. Its basis is the tone: a natural sound with a distinct frequency for each voice. Each constant sound represents a tone, and tones can also be represented as a chord. Humans are capable of creating a sound or imitating a tone from other human beings, but they are naturally unable to transcribe it into musical notation without a musical instrument. This research addresses a model of Hum-to-Chord (H2C) conversion that uses Chroma Features (CF) to extract the characteristics and a Hidden Markov Model (HMM) to classify them. Ten-fold cross-validation shows that the best model uses 55 chroma coefficients and an HMM with a codebook size of 16, which gives an average accuracy of 94.83%. Evaluation on a 30% testing set shows that the best model reaches a high accuracy of up to 97.78%. Most errors come from chords in both very high and very low octaves, since they are unstable. Compared to a similar model, called musical note classification (MNC), the proposed H2C model performs better in terms of both accuracy and complexity.
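The chroma-extraction step described above can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes a hum can be approximated by a pure tone and computes a single 12-bin chroma vector by folding FFT-bin energy onto pitch classes; the sampling rate, FFT size, and frequency range are illustrative choices.

```python
import numpy as np

def chroma_vector(signal, sr, n_fft=4096, fmin=55.0, fmax=2000.0):
    """Fold spectral energy onto the 12 pitch classes (C=0, ..., B=11)."""
    frame = signal[:n_fft] * np.hanning(n_fft)
    spec = np.abs(np.fft.rfft(frame, n_fft)) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / sr)
    chroma = np.zeros(12)
    for f, e in zip(freqs, spec):
        if fmin <= f <= fmax:
            # Map frequency to a MIDI note number, then reduce to a pitch class.
            pitch_class = int(round(69 + 12 * np.log2(f / 440.0))) % 12
            chroma[pitch_class] += e
    return chroma / (chroma.sum() + 1e-12)

# A 440 Hz sine stands in for a hummed A; its energy folds onto pitch class 9 (A).
sr = 16000
t = np.arange(sr) / sr
hum = np.sin(2 * np.pi * 440.0 * t)
print(int(np.argmax(chroma_vector(hum, sr))))  # prints 9
```

In the full H2C pipeline described in the abstract, such chroma vectors would then be quantized against a learned codebook (size 16 in the best model) to form the discrete observation sequence that the HMM classifies.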


Published

2020-12-30

How to Cite

Hariyanto, H., & Suyanto, S. (2020). HUM-TO-CHORD CONVERSION USING CHROMA FEATURES AND HIDDEN MARKOV MODEL. International Journal of Computing, 19(4), 555-560. https://doi.org/10.47839/ijc.19.4.1988
