Music Genre Classification Report

MMI 702 - Machine Learning for Multimedia Informatics
Ege Erdem
Project Phase 2 Report
Project Idea
The aim of this project is music genre classification. A genre is a style of music that shares
a tradition or set of conventions; in short, musical genres are categorical descriptions
used to characterize music. Genre classification matters in the music industry for descriptive
purposes, for understanding the audience, and for informing decision making. Doing
it manually is slow and expensive, so automatic genre classification systems have been
developed to assist or replace the human user (Mllers, 2015). Since most genre
classification work uses only audio data labeled with genre, the purpose of this project is to
classify genre using not only the music but also the lyrics, by analyzing how common
language usage differs between genres.
Literature Survey
Music genre classification became a point of interest when digital music libraries (Napster
etc.) appeared and MP3 players came into wide use. Since then, many classification
techniques have been implemented, such as k-nearest neighbors [1], support vector machines
[2], radial basis functions [3], linear discriminant classifiers [4], multiple feature vectors
with a pattern recognition ensemble approach [5], hidden Markov models [6], non-negative
tensor factorization [7], neural networks and decision trees. I focused on the paper Music
Genre Classification with the Million Song Dataset (D. Liang, H. Gu, B. O'Connor, Carnegie
Mellon University, 2011) to see how lyrics and audio information may be combined.
Data Set
For the first trials, the GTZAN Genre Collection (Tzanetakis and Cook, 2002) is used; it is the
dataset of the well-known paper by G. Tzanetakis and P. Cook (Musical Genre Classification of
Audio Signals [8]). It includes 100 songs for each of 10 different genres.
The more recent and broader Million Song Dataset, which includes 300GB of audio
features and metadata for a million contemporary popular music tracks, will be used for the
final stage of the project if the lyrics extraction part is successfully implemented.
Figure 1. Subgenres of jazz

Papers to Read
Musical Genre Classification of Audio Signals. G. Tzanetakis, P. Cook. IEEE Transactions on
Speech and Audio Processing, 10 (5), pg. 293-302, July 2002. : Combines segmentation and
classification
Music Genre Classification Using MIDI and Audio Features, Z. Cataltepe, Y. Yaslan and A.
Sonmez, Journal of Applied Signal Processing (ISSN: 1687-6172), vol. 2007 (January), Article ID
36409.
Music Genre Classification with the Million Song Dataset, D. Liang, H. Gu, B. O'Connor,
Carnegie Mellon University, 2011 : Lyrics also examined
Music Genre Classification and Variance Comparison on Number of Genres, Stanford
University, Tech. Rep., 2013
Music Genre Classification, M. Creme, C. Burlin, R. Lenain, CS229 Report, Stanford University,
2016 (December) : Neural Network, SVM, KNN with PCA, Decision Tree
Algorithms
Mel-frequency cepstral coefficients (MFCCs) are very commonly used in speech recognition
and music information retrieval, as discussed in Müller (2007) [9]. They provide at least 13
features, which necessitates dimensionality reduction in further steps.
For the learning part, K-Nearest Neighbors will be used, since it is a widely used and accessible
technique and was one of the first techniques applied to this problem. As noted above, the
extracted features are numerous, so PCA will be used to reduce dimensionality.
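The PCA and K-Nearest Neighbors combination described above can be sketched as follows. This is a minimal illustration, not the project's actual code: it assumes scikit-learn is used, the feature matrix is synthetic random data standing in for real MFCC-based features, and the helper names (make_synthetic_features, build_knn_pipeline) and the chosen numbers of components and neighbors are hypothetical.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def make_synthetic_features(n_per_class=40, n_features=26, n_classes=3, seed=0):
    """Build fake per-song feature vectors (NOT real MFCC data).

    Each class is a Gaussian cloud around its own random center.
    """
    rng = np.random.default_rng(seed)
    X_parts, y_parts = [], []
    for label in range(n_classes):
        center = rng.normal(scale=5.0, size=n_features)
        X_parts.append(center + rng.normal(size=(n_per_class, n_features)))
        y_parts.append(np.full(n_per_class, label))
    return np.vstack(X_parts), np.concatenate(y_parts)


def build_knn_pipeline(n_components=5, n_neighbors=5):
    """Standardize features, project them with PCA, then classify with KNN."""
    return make_pipeline(
        StandardScaler(),
        PCA(n_components=n_components),
        KNeighborsClassifier(n_neighbors=n_neighbors),
    )


X, y = make_synthetic_features()
model = build_knn_pipeline()

# Even-indexed rows train the model, odd-indexed rows evaluate it.
model.fit(X[::2], y[::2])
accuracy = model.score(X[1::2], y[1::2])
print("held-out accuracy on synthetic data:", accuracy)
```

Putting the scaler and PCA inside the pipeline means both are fitted only on the training rows, so no information from the evaluation rows leaks into the projection.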
Limitations
The artistic nature of music means that classifications are sometimes subjective and
controversial, and there are no strict rules or boundaries between genres. In addition, new
music genres keep appearing, which makes classification harder. There are also many
subgenres. For example, punk, indie, shoegaze, AOR, metal and many more all fall under the
top-level genre category rock. Similarly, jazz has tons of subgenres (Figure 1). This work aims
to include only the most fundamental genres: blues, classical, country, disco, hiphop, jazz,
metal, pop, rock.
The main limitation is the lyrics part of the project. Since it requires emotion analysis and
broad linguistic work, that part alone would be a separate project topic.
Code Implementation / Data Visualization
The code parts implemented so far include:
Converting au format to wav
Reading wav files
Applying FFT to wav files and writing the results as separate files
Reading FFT information from already saved files
Creating MFCC coefficients
Passing them to the training function
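The wav reading and FFT steps listed above might look roughly like the sketch below. It is an assumption-laden illustration rather than the project's code: it uses Python's standard wave module and NumPy, it generates a synthetic 440 Hz test tone instead of a real GTZAN clip, it assumes mono 16-bit audio, and the function names, the ".fft.npy" file suffix and the choice of keeping 1000 FFT bins are all hypothetical. The au to wav conversion step is not covered here.

```python
import os
import tempfile
import wave

import numpy as np


def write_sine_wav(path, freq_hz=440.0, sample_rate=22050, seconds=1.0):
    """Write a mono 16-bit test tone; a stand-in for a converted audio clip."""
    t = np.arange(int(sample_rate * seconds)) / sample_rate
    samples = (0.5 * 32767 * np.sin(2 * np.pi * freq_hz * t)).astype("<i2")
    with wave.open(path, "wb") as wav_file:
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)  # 2 bytes per sample = 16-bit audio
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(samples.tobytes())


def read_wav(path):
    """Return (sample_rate, samples) for a mono 16-bit wav file."""
    with wave.open(path, "rb") as wav_file:
        sample_rate = wav_file.getframerate()
        raw = wav_file.readframes(wav_file.getnframes())
    return sample_rate, np.frombuffer(raw, dtype="<i2")


def fft_features(samples, n_bins=1000):
    """Magnitudes of the first n_bins components of the real FFT."""
    return np.abs(np.fft.rfft(samples))[:n_bins]


def save_fft(wav_path):
    """Compute FFT features for one wav file and save them next to it."""
    _, samples = read_wav(wav_path)
    out_path = os.path.splitext(wav_path)[0] + ".fft.npy"
    np.save(out_path, fft_features(samples))
    return out_path


with tempfile.TemporaryDirectory() as tmp_dir:
    wav_path = os.path.join(tmp_dir, "tone.wav")
    write_sine_wav(wav_path)
    fft_path = save_fft(wav_path)
    # Later runs can skip the FFT and read the saved features directly.
    features = np.load(fft_path)
    peak_bin = int(np.argmax(features))
    print("feature vector length:", features.shape[0], "peak bin:", peak_bin)
```

Saving the FFT output once and reloading it later avoids recomputing the transform every time the training step is rerun.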
I am in the middle of the training procedure, which uses logistic regression together with a
ShuffleSplit function to create random, separate test subsets of the dataset. I am having
difficulties with the ShuffleSplit function and am trying to figure out its errors. Hopefully,
the music and audio part will be finished soon.
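A minimal sketch of how ShuffleSplit and logistic regression can be combined is given below, assuming the scikit-learn implementations (sklearn.model_selection.ShuffleSplit and sklearn.linear_model.LogisticRegression). The data is synthetic two-class data with 13 features per sample, not the project's features, and the split sizes are arbitrary choices. The report does not say which error occurred, so this only shows one working usage pattern.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import ShuffleSplit

rng = np.random.default_rng(0)

# Synthetic stand-in for per-song feature vectors: two well separated classes.
X = np.vstack([
    rng.normal(loc=-2.0, size=(50, 13)),
    rng.normal(loc=2.0, size=(50, 13)),
])
y = np.array([0] * 50 + [1] * 50)

# Five random 70/30 train/test partitions of the same dataset.
splitter = ShuffleSplit(n_splits=5, test_size=0.3, random_state=0)

scores = []
for train_idx, test_idx in splitter.split(X):
    # split() yields arrays of row indices, not the data itself,
    # so they must be used to index X and y.
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X[train_idx], y[train_idx])
    scores.append(clf.score(X[test_idx], y[test_idx]))

print("accuracy per split:", scores)
print("mean accuracy:", float(np.mean(scores)))
```

Two details are worth checking when ShuffleSplit raises errors: the splitter returns index arrays rather than data subsets, and older scikit-learn releases exposed a differently parameterized ShuffleSplit under sklearn.cross_validation, so examples written for that module do not run unchanged against sklearn.model_selection.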
Figure 2. Part of the code, from Anaconda Spyder and PyCharm

References
[1] T. Li, M. Ogihara, and Q. Li, A comparative study on content-based music genre
classification, in ACM SIGIR, 2003.
[2] C. Xu, N.C. Maddage, X. Shao, F. Cao, and Q. Tian, Musical genre classification using support
vector machines, in IEEE International Conference on Acoustics, Speech, and Signal Processing
(ICASSP), April 2003, vol. 5, pp. 429-432.
[3] D. Turnbull and C. Elkan, Fast recognition of musical genres using RBF networks, IEEE
Transactions on Knowledge and Data Engineering, vol. 17, no. 4, pp. 580-584, 2005.
[4] Z. Cataltepe, Y. Yaslan and A. Sonmez, Music Genre Classification Using MIDI and Audio
Features, Journal of Applied Signal Processing (ISSN: 1687-6172), vol. 2007 (January), Article
ID 36409.
[5] C. N. Silla Jr., A. L. Koerich, and C. A. A. Kaestner, A machine learning approach to
automatic music genre classification, Journal of the Brazilian Computer Society, 14(3),
pp. 7-18, 2008.
[6] W. Chai and B. Vercoe, Folk Music Classification Using Hidden Markov Models,
Proceedings of International Conference on Artificial Intelligence, June 2001.
[7] Y. Panagakis, C. Kotropoulos, G. R. Arce, Music Genre Classification Using Locality
Preserving Non-Negative Tensor Factorization and Sparse Representations,
10th International Society for Music Information Retrieval Conference (ISMIR 2009).
[8] G. Tzanetakis, P. Cook, Musical Genre Classification of Audio Signals, IEEE Transactions on
Speech and Audio Processing, 10 (5), pp. 293-302, July 2002.
[9] M. Müller, Information Retrieval for Music and Motion, Springer, 2007.
[10] H. McDonald (2017, October 25). Learn What a Music Genre Is. Retrieved from
https://www.thebalance.com/music-genre-what-is-it-and-why-does-it-matter-2460500
