Tutorial Advanced Sheet Music Part 3: Corpus Statistics

(version 2022_02_08)

This tutorial introduces simple computer-assisted statistical queries based on CAMAT (Computer-Assisted Music Analysis Tool), now extended to comparative queries across several pieces. The queries cover the distribution of pitch classes and the interval progressions.

Working through the tutorial should enable you to examine your own music examples with the computer-based methods presented here and to compare different pieces of music.

At the beginning you have to load various libraries with the following commands:

In the Statistics Tutorial (Part 2), we looked at the first movement of the String Quartet K. 171 by Wolfgang Amadeus Mozart. Now we want to compare the four movements of the string quartet.

Here are the URLs of the four files:

We can load the four files with the following command under the common name 'xml_files':

The following command ('df = mp.core.corpus.analyse_interval') creates a table (a 'dataframe' named 'df'), which can then be displayed in the browser with the command 'df' or saved as a CSV file with the command 'df.to_csv'.

Depending on the number and size of the files, the calculation can take from a few seconds to several minutes.

A number of parameters can be set beforehand.

The following parameters can be changed:

'xml_files' = name of the loaded note files (see above)

This parameter separates the individual voices. (The value 'False' is not possible here.)

The basic statistical data are displayed for all voices. If this is not desired, please set 'include_basic_stats=False'.

For all voices, a distribution of pitch classes is shown, numbered from 0 = C to 11 = B (H). If this is not desired, please set 'include_pitchclass=False'.
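To make the pitch-class numbering concrete, here is a minimal sketch in plain Python. It does not use CAMAT; the function name 'pitch_class_counts' and the example melody are assumptions made for illustration only. A pitch class is simply the MIDI pitch number modulo 12.

```python
from collections import Counter

# Hypothetical example data: MIDI pitch numbers of a short melody
# (invented for this sketch, not taken from K. 171).
midi_pitches = [63, 65, 67, 68, 70, 72, 74, 75, 70, 67, 63]


def pitch_class_counts(pitches):
    """Count how often each pitch class (0 = C ... 11 = B) occurs."""
    counts = Counter(p % 12 for p in pitches)
    # Report all twelve classes, including those that never occur.
    return {pc: counts.get(pc, 0) for pc in range(12)}


pc_counts = pitch_class_counts(midi_pitches)
print(pc_counts)
```

Pitch classes that never occur appear with the count 0, so tables for different voices stay directly comparable.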

With this parameter you can set which interval sizes are displayed. With [-6, 6], for example, the intervals from a tritone down (-6 semitones) to a tritone up (+6 semitones) are displayed; all larger intervals are collected in two common rest classes (<-6 or >6). With 'None', all intervals that occur are shown.
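The binning described above can be sketched in plain Python as follows. This is only an illustration of the idea, not CAMAT's own implementation; the function name 'interval_counts', the parameter name 'interval_range' and the example pitches are assumptions.

```python
from collections import Counter


def interval_counts(pitches, interval_range=None):
    """Count melodic intervals (in semitones) between successive notes.

    interval_range is a hypothetical parameter: [low, high] keeps one class
    per interval inside the range and collects everything outside it in two
    rest classes; None keeps every interval that occurs.
    """
    intervals = [b - a for a, b in zip(pitches, pitches[1:])]
    if interval_range is None:
        return dict(sorted(Counter(intervals).items()))

    low, high = interval_range
    below, above = f"<{low}", f">{high}"
    labels = [below] + [str(i) for i in range(low, high + 1)] + [above]
    counts = dict.fromkeys(labels, 0)
    for iv in intervals:
        if iv < low:
            counts[below] += 1
        elif iv > high:
            counts[above] += 1
        else:
            counts[str(iv)] += 1
    return counts


# Hypothetical MIDI pitches; the successive intervals are 2, 2, 8, -1, -11, 7.
example_pitches = [60, 62, 64, 72, 71, 60, 67]
binned = interval_counts(example_pitches, interval_range=[-6, 6])
unbinned = interval_counts(example_pitches)
print(binned)
print(unbinned)
```

With the range [-6, 6], the descending major seventh (-11) lands in the class '<-6', while the minor sixth (8) and the fifth (7) land in '>6'.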

With this parameter all intervals that occur are displayed.

The parameter 'get_in_percentage' lets you switch between absolute frequencies (i.e. the number of pitch classes or intervals that occur) with 'False' and relative frequencies (i.e. percentages) with 'True'.
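The difference between absolute and relative frequencies can be illustrated with a few lines of plain Python. The helper 'to_percentages' and the counts are hypothetical and serve only as a sketch.

```python
def to_percentages(counts, digits=2):
    """Convert absolute counts into percentages of the total."""
    total = sum(counts.values())
    if total == 0:
        # Avoid a division by zero for an empty voice.
        return {key: 0.0 for key in counts}
    return {key: round(100 * value / total, digits) for key, value in counts.items()}


# Hypothetical absolute frequencies of three pitch classes.
absolute = {"C": 6, "D": 3, "E": 3}
relative = to_percentages(absolute)
print(relative)
```

Percentages make voices of different lengths comparable, because a long part no longer dominates simply by containing more notes.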

When you have set all parameters, please add the simple command 'df' and click 'Run' to display the table.

Please compare the frequency of the pitch classes for different movements and voices. For an easier comparison, please select the relative proportions in percent (get_in_percentage=True).

How diatonic or chromatic are the individual movements and voices?
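One possible way to approach this question numerically is to measure how many notes belong to the major scale of a given tonic. The following sketch is only a suggestion: the function 'diatonic_share', the choice of the major scale as reference and the example counts are assumptions, and a movement in a minor key would need a different reference scale.

```python
# Semitone steps of a major scale above its tonic.
MAJOR_SCALE_STEPS = (0, 2, 4, 5, 7, 9, 11)


def diatonic_share(pc_counts, tonic_pc):
    """Return the percentage of notes that belong to the major scale on tonic_pc."""
    scale = {(tonic_pc + step) % 12 for step in MAJOR_SCALE_STEPS}
    total = sum(pc_counts.values())
    if total == 0:
        return 0.0
    inside = sum(n for pc, n in pc_counts.items() if pc % 12 in scale)
    return round(100 * inside / total, 1)


# Hypothetical pitch-class counts (0 = C ... 11 = B), invented for this sketch.
example_counts = {0: 10, 2: 8, 3: 20, 4: 2, 5: 12, 7: 18, 8: 9, 10: 19, 11: 2}
share = diatonic_share(example_counts, tonic_pc=3)  # 3 = E-flat
print(share)
```

A value close to 100 points to a mostly diatonic voice; lower values indicate more chromatic notes or a modulation away from the reference key.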

What can be said about the interval progressions in the individual voices? Which voice has the most or the largest leaps? In which voice do the (small) interval steps dominate?
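These questions can also be explored with a small helper that separates steps from leaps. In this sketch a step is assumed to be 1 or 2 semitones and a leap 3 semitones or more; this threshold, the function name 'step_leap_profile' and the two example voices are assumptions made for illustration.

```python
def step_leap_profile(pitches):
    """Summarise the melodic motion of one voice, ignoring direction."""
    intervals = [abs(b - a) for a, b in zip(pitches, pitches[1:])]
    return {
        "repetitions": sum(1 for iv in intervals if iv == 0),
        "steps": sum(1 for iv in intervals if 1 <= iv <= 2),
        "leaps": sum(1 for iv in intervals if iv >= 3),
        "largest_interval": max(intervals, default=0),
    }


# Hypothetical MIDI pitches for two voices (not taken from K. 171).
voices = {
    "upper": [72, 74, 76, 77, 76, 74, 72],
    "lower": [48, 55, 48, 43, 55, 48, 48],
}
profiles = {name: step_leap_profile(p) for name, p in voices.items()}
for name, profile in profiles.items():
    print(name, profile)
```

In this invented example the upper voice moves only by step, while the lower voice consists almost entirely of leaps, the largest being an octave (12 semitones).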

The following command saves the table as a CSV file.

Please first enter a local path and a file name, then delete the '#' and click 'Run'.
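As a sketch of what saving with 'df.to_csv' looks like in pandas, the following example builds a small stand-in table and writes it to the system's temporary folder. The column names, the values and the file name 'interval_stats.csv' are hypothetical; in the tutorial, 'df' is the table created by the query above, and you would use your own path.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the table 'df'; all names and values are invented.
df = pd.DataFrame(
    {
        "File": ["movement_1", "movement_2"],
        "Part": ["Violin 1", "Violin 1"],
        "0": [12.5, 10.0],
    }
)

# Assumed target location: the temporary folder of the operating system.
csv_path = os.path.join(tempfile.gettempdir(), "interval_stats.csv")

# index=False leaves out the row numbers, so the file contains only the data columns.
df.to_csv(csv_path, index=False)

# Reading the file back is a quick check that it was written as expected.
reloaded = pd.read_csv(csv_path)
print(reloaded)
```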

When you run a new query, you can give the table a different variable name, e.g. 'df2' instead of 'df'. Here is an example that queries intervals of up to 12 semitones, i.e. one octave (up and down), in percent. The general pitch statistics and the pitch classes are no longer of interest here, thus:

'include_basic_stats=False' and 'include_pitchclass=False'.

How do the individual movements (and voices) differ with respect to the intervals that occur? Which intervals occur frequently, and which (almost) never?

And now have fun with the comparative study of the distribution of pitch classes and interval progressions in compositions of your choice!