MMI 702 - Machine Learning for Multimedia Informatics

Ege Erdem
Project Phase 2 Report

Project Idea

The aim of this project is music genre classification. A genre is a style of music that shares
a tradition or set of conventions; in short, musical genres are categorical labels used to
characterize music. Genre classification matters to the music industry for descriptive purposes,
for understanding the audience, and for informing decisions. Doing it manually is slow and
expensive, so automatic genre classification systems have been developed to assist or replace
the human user (Mllers, 2015). Since most genre classification work uses only genre labels and
audio data, this project aims to classify genre using not only the music but also the lyrics,
by analyzing how common language usage differs between genres.

Literature Survey

Music genre classification became a point of interest when digital music libraries (Napster
etc.) appeared and MP3 players came into wide use. Since then, many classification techniques
have been applied, such as k-nearest neighbors [1], support vector machines [2], radial basis
functions [3], linear discriminant classifiers [4], multiple feature vectors with a pattern
recognition ensemble approach [5], hidden Markov models [6], non-negative tensor factorization
[7], neural networks and decision trees. I focused on the paper Music Genre Classification with
the Million Song Dataset (D. Liang, H. Gu, B. O'Connor, Carnegie Mellon University, 2011) to
see how lyrics and audio information may be combined.

Data Set

For the first trials, the GTZAN Genre Collection (Tzanetakis and Cook, 2002) is used; it is the
dataset of the well-known paper by G. Tzanetakis, Musical Genre Classification of Audio Signals
[8]. It includes 100 songs for each of 10 different genres.

The Million Song Dataset, a more recent and broader dataset with 300 GB of audio features and
metadata for a million contemporary popular music tracks, will be used for the final stage of
the project if the lyrics extraction part is successfully implemented.

Figure 1. Subgenres of jazz


Papers to Read

Musical Genre Classification of Audio Signals. G. Tzanetakis, P. Cook. IEEE Transactions on
Speech and Audio Processing, 10 (5), pg. 293-302, July 2002: combines segmentation and
classification

Music Genre Classification Using MIDI and Audio Features, Z. Cataltepe, Y.Yaslan and A.
Sonmez, Journal of Applied Signal Processing (ISSN: 1687-6172), vol. 2007 (January), Article ID
36409.

Music Genre Classification with the Million Song Dataset, D. Liang, H. Gu, B. O'Connor,
Carnegie Mellon University, 2011: lyrics also examined

Music Genre Classification and Variance Comparison on Number of Genres, Stanford
University, Tech. Rep., 2013

Music Genre Classification, M. Creme, C. Burlin, R. Lenain, CS229 Report, Stanford University,
2016 (December): neural network, SVM, KNN with PCA, decision tree

Algorithms

Mel-frequency cepstral coefficients (MFCCs) are very commonly used in speech recognition
and music information retrieval, as discussed in Müller (2007) [9]. They provide at least 13
features per analysis frame, which necessitates dimensionality reduction in later steps.
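
The report does not say which MFCC implementation it uses, so the following is only a minimal, dependency-free sketch of the usual textbook pipeline for a single frame: Hamming window, power spectrum, triangular mel filterbank, logarithm, and DCT-II. The function names, the 26-filter and 13-coefficient settings, the 8000 Hz sample rate, and the synthetic sine frame are all assumptions made for illustration.

```python
import cmath
import math


def hz_to_mel(hz):
    """Map a frequency in Hz to the mel scale (common 2595*log10 variant)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def power_spectrum(frame):
    """Apply a Hamming window, then return |DFT|^2 for bins 0..N/2."""
    n = len(frame)
    windowed = [s * (0.54 - 0.46 * math.cos(2.0 * math.pi * i / (n - 1)))
                for i, s in enumerate(frame)]
    spectrum = []
    for k in range(n // 2 + 1):
        # Naive O(N^2) DFT: fine for one short frame, too slow for whole tracks.
        acc = sum(windowed[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc) ** 2)
    return spectrum


def mel_filterbank(num_filters, n_fft, sample_rate):
    """Build triangular filters whose centers are evenly spaced in mel."""
    top = hz_to_mel(sample_rate / 2.0)
    mel_points = [top * i / (num_filters + 1) for i in range(num_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate))
            for m in mel_points]
    bank = []
    for j in range(1, num_filters + 1):
        left, center, right = bins[j - 1], bins[j], bins[j + 1]
        filt = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):      # rising edge of the triangle
            filt[k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge of the triangle
            filt[k] = (right - k) / (right - center)
        bank.append(filt)
    return bank


def mfcc_frame(frame, sample_rate, num_filters=26, num_coeffs=13):
    """Return num_coeffs MFCCs for one frame of audio samples."""
    spectrum = power_spectrum(frame)
    bank = mel_filterbank(num_filters, len(frame), sample_rate)
    # The small constant avoids log(0) for filters that capture no energy.
    log_energies = [math.log(sum(f * p for f, p in zip(filt, spectrum)) + 1e-10)
                    for filt in bank]
    # DCT-II decorrelates the log filterbank energies.
    return [sum(e * math.cos(math.pi * c * (m + 0.5) / num_filters)
                for m, e in enumerate(log_energies))
            for c in range(num_coeffs)]


SAMPLE_RATE = 8000  # assumed rate for the synthetic demo signal
frame = [math.sin(2.0 * math.pi * 440.0 * t / SAMPLE_RATE) for t in range(256)]
coeffs = mfcc_frame(frame, SAMPLE_RATE)
print(len(coeffs))  # 13 coefficients for this single frame
```

A full track yields one such vector per frame, so the features are usually summarized (for example by averaging over frames) before being passed to a classifier.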

For the learning part, k-nearest neighbors (KNN) will be used, since it is a widely used and
accessible technique and was one of the first applied to this problem. Principal component
analysis (PCA) will be used for the dimensionality reduction mentioned above, since many
features are extracted.
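
How PCA and KNN fit together can be sketched without any libraries, as below. This is an illustration under stated assumptions, not the project's code: PCA is computed here by power iteration with deflation (a library routine would normally be used instead), the helper names are hypothetical, and the three-dimensional feature values and the two genre labels are made up for the demo.

```python
import math
from collections import Counter


def fit_pca(rows, num_components, iterations=200):
    """Return (feature means, leading principal directions) for the rows."""
    dims, n = len(rows[0]), len(rows)
    means = [sum(r[d] for r in rows) / n for d in range(dims)]
    centered = [[r[d] - means[d] for d in range(dims)] for r in rows]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(dims)]
           for i in range(dims)]
    components = []
    for _ in range(num_components):
        # Power iteration converges to the dominant eigenvector of cov.
        vec = [1.0 / (d + 1) for d in range(dims)]
        for _ in range(iterations):
            nxt = [sum(cov[i][j] * vec[j] for j in range(dims))
                   for i in range(dims)]
            norm = math.sqrt(sum(x * x for x in nxt))
            if norm == 0.0:
                break
            vec = [x / norm for x in nxt]
        eigenvalue = sum(vec[i] * cov[i][j] * vec[j]
                         for i in range(dims) for j in range(dims))
        components.append(vec)
        # Deflation removes the found direction so the next pass finds the next one.
        for i in range(dims):
            for j in range(dims):
                cov[i][j] -= eigenvalue * vec[i] * vec[j]
    return means, components


def project(row, means, components):
    """Express one feature vector in the reduced PCA coordinates."""
    centered = [x - m for x, m in zip(row, means)]
    return [sum(c * x for c, x in zip(comp, centered)) for comp in components]


def knn_predict(train_points, train_labels, query, k=3):
    """Majority vote among the k training points closest to the query."""
    def distance(point):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(point, query)))
    ranked = sorted(zip(train_points, train_labels),
                    key=lambda pair: distance(pair[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Made-up 3-D feature vectors standing in for per-track audio features.
train_rows = [[5.0, 1.0, 0.2], [5.5, 1.2, 0.1], [4.8, 0.9, 0.3],
              [1.0, 1.1, 0.2], [1.2, 0.8, 0.1], [0.9, 1.0, 0.3]]
train_labels = ["rock", "rock", "rock", "jazz", "jazz", "jazz"]

means, components = fit_pca(train_rows, num_components=2)
reduced = [project(row, means, components) for row in train_rows]

query = project([5.1, 1.0, 0.2], means, components)
print(knn_predict(reduced, train_labels, query))  # rock
```

The same means and components fitted on the training tracks are reused to project each query, so that training and test features live in the same reduced space.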

Limitations

The artistic nature of music means that classifications are sometimes subjective and
controversial, and there are no strict rules or boundaries between genres. In addition, new
music genres keep emerging, which makes classification much harder. There are also many
subgenres. For example, punk, indie, shoegaze, AOR, metal and many others all fall under the
top-level genre category rock. Similarly, jazz has tons of subgenres (Figure 1). This work aims
to include only the most fundamental genres: blues, classical, country, disco, hiphop, jazz,
metal, pop, rock.

The main limitation is the lyrics part of the project. Since it requires emotion analysis and
broad linguistic work, that part alone would be a separate project topic.

Code Implementation / Data Visualization

The code parts implemented so far include:

Converting au format to wav
Reading wav files
Applying FFT to wav files and writing the results as separate files
Reading FFT information from the already saved files
Creating MFCC coefficients
Passing them to the training function
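
The read, transform, save and reload steps listed above can be sketched with the Python standard library alone. This is a minimal illustration, not the project's code: it assumes mono 16-bit wav input, generates a synthetic sine tone in place of a real track, uses a textbook radix-2 FFT, and stores the spectrum as JSON; the file names, helper names, and the JSON format are assumptions. The au to wav conversion step is not covered.

```python
import cmath
import json
import math
import os
import struct
import tempfile
import wave


def write_sine_wav(path, freq_hz, sample_rate=8000, num_samples=512):
    """Create a small mono 16-bit test file (stand-in for a real track)."""
    samples = [int(20000 * math.sin(2.0 * math.pi * freq_hz * t / sample_rate))
               for t in range(num_samples)]
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        wav.writeframes(struct.pack("<%dh" % num_samples, *samples))


def read_wav(path):
    """Return (sample_rate, samples) for a mono 16-bit wav file."""
    with wave.open(path, "rb") as wav:
        sample_rate = wav.getframerate()
        count = wav.getnframes()
        raw = wav.readframes(count)
    return sample_rate, list(struct.unpack("<%dh" % count, raw))


def fft(values):
    """Recursive radix-2 FFT; the input length must be a power of two."""
    n = len(values)
    if n == 1:
        return [complex(values[0])]
    even = fft(values[0::2])
    odd = fft(values[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * math.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def save_spectrum(samples, out_path, n_fft=512):
    """Write FFT magnitudes of the first n_fft samples to a JSON file."""
    magnitudes = [abs(c) for c in fft(samples[:n_fft])[:n_fft // 2]]
    with open(out_path, "w") as handle:
        json.dump(magnitudes, handle)


def load_spectrum(path):
    """Read back a spectrum previously written by save_spectrum."""
    with open(path) as handle:
        return json.load(handle)


with tempfile.TemporaryDirectory() as workdir:
    wav_path = os.path.join(workdir, "tone.wav")
    fft_path = os.path.join(workdir, "tone.fft.json")
    write_sine_wav(wav_path, freq_hz=500.0)
    sample_rate, samples = read_wav(wav_path)
    save_spectrum(samples, fft_path)
    spectrum = load_spectrum(fft_path)

# The strongest bin should sit at the tone frequency.
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
peak_hz = peak_bin * sample_rate / 512
print(peak_hz)  # 500.0
```

Saving the spectra once and reloading them later avoids recomputing the transform every time the training step is rerun.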

I am in the middle of the training procedure, which uses logistic regression together with a
ShuffleSplit function to create random, separate test subsets of the dataset. I am having
difficulties with the ShuffleSplit function and am trying to figure out the errors. Hopefully,
the music and audio part will be finished soon.
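
The idea behind shuffle-split evaluation with logistic regression can be sketched as follows. The ShuffleSplit named in the report is presumably the scikit-learn class of that name; the sketch below does not call it but mirrors the concept in plain Python, so the function names, the learning rate, the epoch count, and the two-cluster toy data are all assumptions for illustration.

```python
import math
import random


def shuffle_split(num_samples, n_splits=3, test_size=0.25, seed=0):
    """Yield (train_indices, test_indices) from independent random shuffles."""
    rng = random.Random(seed)
    n_test = max(1, int(round(num_samples * test_size)))
    for _ in range(n_splits):
        order = list(range(num_samples))
        rng.shuffle(order)
        yield order[n_test:], order[:n_test]


def sigmoid(z):
    """Logistic function, with the input clamped to avoid overflow."""
    z = max(-60.0, min(60.0, z))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(features, labels, lr=0.1, epochs=500):
    """Fit binary logistic regression by stochastic gradient descent."""
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            score = sum(w * v for w, v in zip(weights, x)) + bias
            error = sigmoid(score) - y
            weights = [w - lr * error * v for w, v in zip(weights, x)]
            bias -= lr * error
    return weights, bias


def predict(weights, bias, x):
    """Return class 1 when the predicted probability is at least 0.5."""
    score = sum(w * v for w, v in zip(weights, x)) + bias
    return 1 if sigmoid(score) >= 0.5 else 0


# Toy data: two well-separated clusters standing in for two genres.
features = ([[0.1 * i, 0.5 - 0.05 * i] for i in range(10)]
            + [[3.0 + 0.1 * i, 3.0 - 0.05 * i] for i in range(10)])
labels = [0] * 10 + [1] * 10

accuracies = []
for train_idx, test_idx in shuffle_split(len(features), n_splits=3,
                                         test_size=0.25, seed=0):
    weights, bias = train_logistic([features[i] for i in train_idx],
                                   [labels[i] for i in train_idx])
    hits = sum(predict(weights, bias, features[i]) == labels[i]
               for i in test_idx)
    accuracies.append(hits / len(test_idx))

print(accuracies)  # one accuracy value per random split
```

Each split reshuffles the whole dataset, so a track may land in the test subset of more than one split; the train and test indices within a single split never overlap.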

Figure 2. Part of the code, from Anaconda Spyder and PyCharm


References

[1] T. Li, M. Ogihara, and Q. Li, "A comparative study on content-based music genre
classification", in ACM SIGIR, 2003.

[2] C. Xu, N.C. Maddage, X. Shao, F. Cao, and Q. Tian, "Musical genre classification using
support vector machines", in IEEE International Conference on Acoustics, Speech, and Signal
Processing (ICASSP), April 2003, vol. 5, pp. 429-432.

[3] D. Turnbull and C. Elkan, "Fast recognition of musical genres using RBF networks", IEEE
Transactions on Knowledge and Data Engineering, vol. 17, no. 4, pp. 580-584, 2005.

[4] Z. Cataltepe, Y.Yaslan and A. Sonmez, "Music Genre Classification Using MIDI and Audio
Features", Journal of Applied Signal Processing (ISSN: 1687-6172), vol. 2007 (January), Article
ID 36409.

[5] Silla Jr., Carlos N., Koerich, Alessandro L., & Kaestner, Celso A. A.. (2008). A machine learning
approach to automatic music genre classification. Journal of the Brazilian Computer
Society, 14(3), 7-18

[6] Chai, Wei and B. Vercoe. Folk Music Classification Using Hidden Markov Models.
Proceedings of International Conference on Artificial Intelligence, June 2001.

[7] Y. Panagakis, C. Kotropoulos, G. R. Arce. Music Genre Classification Using Locality
Preserving Non-Negative Tensor Factorization and Sparse Representations.
10th International Society for Music Information Retrieval Conference (ISMIR 2009)

[8] Musical Genre Classification of Audio Signals. G. Tzanetakis, P. Cook. IEEE Transactions on
Speech and Audio Processing, 10 (5), pg. 293-302, July 2002.

[9] M. Müller. Information Retrieval for Music and Motion. Springer, 2007.

[10] H. McDonald (2017, October 25). Learn What a Music Genre Is. Retrieved from
https://www.thebalance.com/music-genre-what-is-it-and-why-does-it-matter-2460500