MACHINE LEARNING TECHNIQUES APPLIED FOR SINGERS CLASSIFICATION

A. C. S. Pessotti¹

¹Laboratory of Phonetics and Psycholinguistics (Lafape), State University of Campinas (Unicamp), Campinas, Brazil
antoniopessotti@gmail.com

I. INTRODUCTION

Singers have different degrees of training, frequently linked to their own practice and schooling. Several studies compare soloists and choristers through acoustic measurements and the ensuing articulatory inferences. Among them, [2] is noteworthy for finding that the singer's formant is more prominent in soloists than in choristers; in addition, soloists sang the lyrics more distinctly than choristers did. Another study ([3]) showed a spectral prominence between 2-4 kHz in female soloists, while the choristers' articulation remained close to speech. Other studies ([4], [5]) point to singing strategies that differentiate singers. Together, these works suggest that there are patterns to be explored in order to distinguish these singer groups.

Pattern recognition seeks the types of patterns found in one class of signals that recur, with similar characteristic variability, in other signals of the same class. It is necessary to choose hidden variables that describe these patterns. Machine learning techniques are used in pattern recognition and have been applied in many areas and to many kinds of signals: images, written text, time series of prices or weather, sounds, and so on.

The aim of this study is to apply machine learning techniques to singer classification, based on segmented acoustic data.

II. METHODS

Material: Ten sopranos were selected: five soloists (SOL) and five choristers (CHOR). They recorded the Brazilian chamber song Conselhos, composed by Carlos Gomes (195 syllables). Recordings were made in a professional studio at a 96 kHz stereo sampling rate and afterwards converted to 16 kHz mono for analysis. Sung performances were recorded ten times: five with digitized accompaniment and five without; the lyrics were also read five times by each singer. Segmentation and statistical analysis were conducted with specific scripts created in Praat and R [1]. The digitized musical accompaniment was played to the informants through a headset.
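Among the dependent variables described in Methods, pitch is expressed in semitones relative to 440 Hz and the formants are also converted to Bark. A minimal sketch of these two conversions, assuming the common Traunmüller (1990) Bark formula (Praat's built-in Hertz-to-Bark conversion uses a different formula, so its values differ slightly):

```python
import math

def hz_to_semitones(f_hz, ref_hz=440.0):
    """Convert frequency in Hz to semitones relative to a reference (A4 = 440 Hz)."""
    return 12.0 * math.log2(f_hz / ref_hz)

def hz_to_bark(f_hz):
    """Convert frequency in Hz to the Bark scale (Traunmueller's formula)."""
    return 26.81 * f_hz / (1960.0 + f_hz) - 0.53

# An octave above A4 is +12 semitones
print(round(hz_to_semitones(880.0), 2))  # 12.0
# A frequency in a typical soprano F1 region
print(round(hz_to_bark(800.0), 2))       # 7.24
```

Expressing formants in Bark rather than Hertz brings the measurements closer to auditory spacing, which is often preferable when comparing vowel articulation across singers.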
It was created from the musical score with MuseScore 1.0.

Variables: The independent variable was the singer's group (SOL or CHOR). The dependent variables were: a) formants (F1, F2 and F3), in Hertz; b) the same formants converted to Bark; c) intonation (pitch) in semitones, with 440 Hz as reference; d) segment duration in milliseconds; e) intensity, in decibels (dB). Machine learning techniques were applied with the Waikato Environment for Knowledge Analysis (WEKA). The models employed were: Multilayer Perceptron (MP), Classification Via Clustering, Classification Via Regression, a rule-based inference learner (JRip), and Naive Bayes.

III. RESULTS

The JRip and MP models showed good results, with high statistical significance (p < 0). The rules produced by JRip and the MP model suggest minimal paths from which a signal classification can be built.

IV. DISCUSSION

The results point to a precise distinction between the two groups, in both sung and spoken productions. The pattern recognition techniques applied here (JRip and MP) allow the development of an automatic classification system as future work. The next step is to work on this system and to create a real-time version permitting direct analysis of the audio signal. Likewise, we hope this methodology can be applied to the classification of speech pathologies.
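The classification itself was run in WEKA, which is a Java toolkit. As an illustration of the same workflow outside WEKA, the sketch below trains a multilayer perceptron on a feature table shaped like the one described in Methods (F1-F3 in Bark, pitch in semitones, duration, intensity) and evaluates it by cross-validation. It is a hedged analogy, not the study's pipeline: scikit-learn's MLPClassifier stands in for WEKA's Multilayer Perceptron, and every numeric value here is synthetic, invented only to make the example run.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)

# Synthetic per-syllable features: F1, F2, F3 (Bark), pitch (semitones),
# duration (ms), intensity (dB). The SOL class is simulated with a small
# offset, standing in for effects such as a stronger singer's formant.
n = 100
chor = rng.normal(loc=[5.0, 11.0, 15.0, 3.0, 180.0, 65.0], scale=1.0, size=(n, 6))
sol = rng.normal(loc=[5.2, 11.5, 15.8, 3.5, 200.0, 70.0], scale=1.0, size=(n, 6))
X = np.vstack([chor, sol])
y = np.array(["CHOR"] * n + ["SOL"] * n)

# Standardize features, then fit a small multilayer perceptron,
# scoring by 5-fold cross-validated accuracy.
clf = make_pipeline(
    StandardScaler(),
    MLPClassifier(hidden_layer_sizes=(10,), max_iter=2000, random_state=0),
)
scores = cross_val_score(clf, X, y, cv=5)
print(round(scores.mean(), 2))
```

On real data, a rule learner such as JRip additionally yields human-readable if-then rules (e.g. thresholds on single features), which is what makes its output useful for interpreting how the two groups differ, not only for classifying them.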

V. REFERENCES

[1] R Development Core Team (2009). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0. http://www.R-project.org/
[2] Rossing, T. D., Sundberg, J. and Ternström, S. (1986). Acoustic comparison of voice use in solo and choir singing. Journal of the Acoustical Society of America, 79(6), 1975-1981.
[3] Rossing, T. D., Sundberg, J. and Ternström, S. (1987). Acoustic comparison of soprano solo and choir singing. Journal of the Acoustical Society of America, 82(3), 830-836.
[4] Garnier, M., Henrich, N., Smith, J. and Wolfe, J. (2010). Vocal tract adjustments in the high soprano range. Journal of the Acoustical Society of America, 127(6).
[5] Garnier, M., Wolfe, J., Henrich, N. and Smith, J. (2008). Interrelationship between vocal effort and vocal tract acoustics: a pilot study. In INTERSPEECH-2008, 2302-2305.