Multi-Instrument Based N-Grams for Composer Classification Task

Alexander Gelbukh, Daniel Alejandro Pérez Alvarez, Olga Kolesnikova, Liliana Chanona-Hernandez, Grigori Sidorov


In this research, we address the supervised composer classification problem from a Natural Language Processing perspective. Starting from digital symbolic music files, we build two representations: a class representation and another based on MIDI pitches. We use n-grams to build feature vectors of musical compositions based on their harmonic content. For this, we extract n-grams of size 1 to 4 in the harmonic direction, differentiating between all possible subsets of instruments. We populate a term-frequency matrix with the composition vectors and classify them with a Support Vector Machine (SVM) classifier. Different classification models are evaluated, e.g., using feature filters and varying hyperparameters such as the TF-IDF formula. The results show that n-grams based on MIDI pitches perform slightly better overall than n-grams based on the class representation, but the best result of each representation is identical. Some of our best models reach accuracies that exceed previous state-of-the-art results on a well-known dataset of string quartets by Haydn and Mozart.
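As an illustration of the feature extraction described above, the following minimal sketch counts harmonic n-grams of size 1 to 4 over every subset of instruments. It is not the authors' implementation. It rests on several assumptions that do not come from the paper: a composition is given as a list of time slices, each slice holds one MIDI pitch (or None for a rest) per instrument, the instrument labels and the function name harmonic_ngrams are hypothetical, and the "class representation" is taken to mean pitch class (MIDI pitch modulo 12).

```python
from collections import Counter
from itertools import combinations

# Hypothetical instrument labels for a string quartet (assumption).
INSTRUMENTS = ("violin1", "violin2", "viola", "cello")


def harmonic_ngrams(slices, max_n=4, pitch_class=False):
    """Count harmonic n-grams (sizes 1..max_n) for every instrument subset.

    slices: list of tuples, one MIDI pitch (or None for a rest) per instrument.
    pitch_class: if True, reduce each pitch modulo 12 (assumed reading of the
    "class representation"); otherwise keep the raw MIDI pitch.
    """
    counts = Counter()
    for time_slice in slices:
        # Keep only the instruments that sound in this slice.
        sounding = [
            (name, pitch)
            for name, pitch in zip(INSTRUMENTS, time_slice)
            if pitch is not None
        ]
        for n in range(1, max_n + 1):
            for subset in combinations(sounding, n):
                names = tuple(name for name, _ in subset)
                pitches = tuple(
                    pitch % 12 if pitch_class else pitch for _, pitch in subset
                )
                # The instrument subset is part of the feature key, so the
                # same pitches in different instruments are distinct terms.
                counts[(names, pitches)] += 1
    return counts


# Toy input: three simultaneous-pitch slices (made-up data, not from the dataset).
slices = [
    (72, 67, 64, 48),
    (74, 67, 65, 47),
    (72, 67, 64, 48),
]

tf = harmonic_ngrams(slices)                       # MIDI-pitch terms
tf_pc = harmonic_ngrams(slices, pitch_class=True)  # pitch-class terms
```

Each Counter corresponds to one row of a term-frequency matrix. Stacking the rows of all compositions over a shared vocabulary, optionally reweighting them with TF-IDF, would give the input for an SVM classifier.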


Composer classification, composer attribution, composer recognition, composer identification, composer style, n-grams, harmonic n-grams, string quartet, Mozart, Haydn
