(version 2022_02_06)
1. Query for simple statistical data
3. Two-dimensional frequency distributions
This tutorial introduces the computer-assisted possibilities of simple statistical queries based on CAMAT (Computer-Assisted Music Analysis Tool) with music examples.
Working through the tutorial should enable you to examine your own sheet music files using computer-assisted methods.
Each session with a Jupyter Notebook begins with the import of a set of Python libraries required for the analysis:
import sys
import os
sys.path.append(os.getcwd().replace(os.path.join('music_xml_parser', 'ipynb'), ''))
import music_xml_parser as mp
from music21 import *
import csv
from IPython.display import HTML, display
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# These commands load the CAMAT music_xml_parser
# as well as the libraries 'numpy' and 'pandas' for statistical evaluations
# and 'music21' and 'matplotlib' for graphical representations.
# The following command enables the download of xml files from the Internet.
environment.set('autoDownload', 'allow')
# The following commands are used to set the formatting for the tables,
# which are shown below - '9999' is the maximum value:
pd.set_option('display.max_rows', 10)
pd.set_option('display.max_columns', 9999)
pd.set_option('display.width', 9999)
Then, you have to load the sheet music file you want to examine (from the internet or from your hard disk) and activate the xml-parser. This will create a new dataframe ('m_df') from the xml file, which will be the basis for the following statistical queries (cf. https://analyse.hfm-weimar.de/jupyter/CAMAT_Basics_Part1_Einfuehrung.html).
As the music example for this tutorial we use the first movement of the String Quartet K. 171 by Wolfgang Amadeus Mozart (see Basics Part 1).
xml_file = 'https://analyse.hfm-weimar.de/database/03/MoWo_K171_COM_1-4_StringQuar_003_00867.xml'
m_df = mp.parse.with_xml_file(file=xml_file,
save_file_name=None,
do_save=False)
# You can also assign another variable name instead of 'm_df'.
# This is useful if you want to work with multiple sheet music files in parallel.
# For the setting of the two parameters, please also refer to the introductory tutorial (Part1).
File at: ../music_xml_parser/data/xmls_to_parse/hfm_database/MoWo_K171_COM_1-4_StringQuar_003_00867.xml
# If you want to view the dataframe table in sections in the browser (and not in the external csv file),
# you have to activate the command 'm_df' by deleting the hash key.
# m_df
# For a complete view use the following command:
# mp.utils.print_full_df(m_df)
# IMPORTANT: This can be very computationally intensive!!
We start with simple statistical queries of the number of voices, the length in measures, the number of notes (total and per voice), and the ambitus of each voice.
Please open the sheet music file in parallel in your score editor (e.g. MuseScore).
Note: The first evaluation command reads in the data for the first time, so it may take a relatively long time to run (up to several minutes, depending on your computer and the file size). All subsequent commands run very quickly.
v = m_df[['PartID','PartName']].drop_duplicates().to_numpy()
mp.utils.display_table(data=v,
columns=['Part ID', 'Part Name'])
# In the first command line, from the dataframe list of the xml parser ('m_df')
# the PartIDs and PartNames are read,
# i.e. the IDs and designations of the individual voices.
# The variable 'v' is assigned for this purpose.
# The second line defines the column headings of the table.
# IMPORTANT: If no voice name is specified in a MusicXML file,
# the PartName can, of course, not be displayed here ('None')!
Part ID | Part Name |
---|---|
1 | Violino I |
2 | Violino II |
3 | Viola |
4 | Violoncello |
Query of the piece length in measures:
m = m_df['Measure'].to_numpy(dtype=int)
max(m)
159
Query of the number of tones per voice, with tied tones each counted as one tone:
n_notes, c_notes = np.unique(m_df['PartName'], return_counts=True)
data = [[i, c] for i, c in zip(n_notes, c_notes) ]
mp.utils.display_table(data=data,
columns=['Part Name', 'Number of Notes'])
# If the voices in the xml file have no designations,
# then 'PartName' (and 'Part Name') must be replaced by 'PartID'!
Part Name | Number of Notes |
---|---|
Viola | 382 |
Violino I | 576 |
Violino II | 626 |
Violoncello | 385 |
The ambitus of each voice is the distance in semitones between its lowest (min) and highest (max) note. The pitches are given as MIDI values, with c' = C4 = 60, c'' = C5 = 72 and so on:
ambitus = mp.analyse.ambitus(m_df,
output_as_midi=True)
# By the parameter 'output_as_midi=False' the notes are specified with names.
# The following command sets the output table:
mp.utils.display_table(data=ambitus,
columns=['Part ID', 'PartName', 'min', 'max', 'Ambitus'])
# In the last command line the table columns are named.
# You can rename them as you wish.
Part ID | PartName | min | max | Ambitus |
---|---|---|---|---|
1 | Violino I | 57 | 87 | 30 |
2 | Violino II | 55 | 82 | 27 |
3 | Viola | 48 | 75 | 27 |
4 | Violoncello | 39 | 67 | 28 |
# The addition 'output_as_midi=False' specifies the tones with tone names.
ambitus = mp.analyse.ambitus(m_df,
output_as_midi=False)
mp.utils.display_table(data=ambitus,
columns=['Part ID', 'PartName', 'min', 'max', 'Ambitus'])
Part ID | PartName | min | max | Ambitus |
---|---|---|---|---|
1 | Violino I | A3 | D#6 | 30 |
2 | Violino II | G3 | A#5 | 27 |
3 | Viola | C3 | D#5 | 27 |
4 | Violoncello | D#2 | G4 | 28 |
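The relationship between the two tables can be reproduced by hand. The following is a minimal sketch in plain Python, not CAMAT's own implementation; the helper names `midi_to_name` and `ambitus` are hypothetical, and sharp-based spelling is assumed because that is what the table above shows.

```python
# Hypothetical helpers for illustration; not part of CAMAT.
NOTE_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']

def midi_to_name(midi):
    """Return a sharp-based note name, with C4 = MIDI 60."""
    octave = midi // 12 - 1
    return f"{NOTE_NAMES[midi % 12]}{octave}"

def ambitus(midi_pitches):
    """Return (lowest, highest, range in semitones) for a list of MIDI pitches."""
    low, high = min(midi_pitches), max(midi_pitches)
    return low, high, high - low

# Lowest and highest pitch of Violino I, taken from the MIDI table above.
low, high, span = ambitus([57, 87])
print(midi_to_name(low), midi_to_name(high), span)
```

With the Violino I values this reproduces the row "A3, D#6, 30" of the table.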
To characterize individual compositions and to compare different pieces of music, it can be useful to determine the frequency of certain elements (pitches, note durations, etc.). For such issues frequency tables and graphical representations, so-called histograms, can be created.
Which pitches appear how often? How diatonic is the tonal resource of a composition, how many additional chromatic notes appear?
pitch_hist = mp.analyse.pitch_histogram(m_df,
do_plot=True,
visulize_midi_range=None)
# With the first line a histogram representation with the name 'pitch_hist' is generated from 'm_df'.
# By the parameter 'do_plot=True' the graphic is displayed.
# By 'visulize_midi_range=None' a table display (see below) is prevented.
What can we recognize?
Mozart apparently uses mainly notes of the E-flat major scale in the composition (E-flat=D#, A-flat=G#, B-flat=A#, etc.) and hardly any chromatic notes.
Tip: The graphic can also be displayed in an external pop-up window of the program Matplotlib and further processed, enlarged, reformatted and saved etc. there. To do this, the code must be preceded by the command '%matplotlib'. Afterwards Matplotlib must be switched off again by the command '%matplotlib inline'. Otherwise all following graphics are also displayed externally.
We now want to know exactly how often the individual pitches appear!
# The frequency table is generated when we rearrange the parameters:
# 'do_plot=None'
pitch_hist = mp.analyse.pitch_histogram(m_df,
do_plot=None,
do_plot_full_axis=True,
visulize_midi_range=None,
filter_dict=None,
enharmonic=True)
mp.utils.display_table(data=pitch_hist,
columns=['MIDI', 'Pitch', 'Octave', 'Occurrences'])
# The second command displays and labels the table.
MIDI | Pitch | Octave | Occurrences |
---|---|---|---|
39 | E-1 | 2 | 7 |
41 | F | 2 | 7 |
43 | G | 2 | 9 |
44 | A-1 | 2 | 13 |
45 | A0 | 2 | 6 |
46 | B-1 | 2 | 51 |
47 | B0 | 2 | 4 |
48 | C | 3 | 22 |
49 | D-1 | 3 | 2 |
50 | D | 3 | 25 |
51 | E-1 | 3 | 59 |
53 | F | 3 | 46 |
55 | G | 3 | 45 |
56 | A-1 | 3 | 39 |
57 | A0 | 3 | 35 |
57 | A | 3 | 2 |
58 | B-1 | 3 | 121 |
59 | B0 | 3 | 11 |
60 | C | 4 | 70 |
62 | D | 4 | 98 |
63 | E-1 | 4 | 174 |
65 | F | 4 | 119 |
66 | F1 | 4 | 2 |
67 | G | 4 | 84 |
68 | A-1 | 4 | 73 |
69 | A0 | 4 | 30 |
70 | B-1 | 4 | 101 |
71 | B0 | 4 | 2 |
72 | C | 5 | 53 |
74 | D | 5 | 54 |
75 | E-1 | 5 | 88 |
76 | E0 | 5 | 1 |
77 | F | 5 | 42 |
78 | F1 | 5 | 2 |
79 | G | 5 | 34 |
80 | A-1 | 5 | 25 |
81 | A0 | 5 | 13 |
82 | B-1 | 5 | 29 |
83 | B0 | 5 | 1 |
84 | C | 6 | 5 |
86 | D | 6 | 5 |
87 | E-1 | 6 | 7 |
The table also shows the accidentals as they are notated in the score, which can be useful when the music modulates into distant keys, for example. Here -1 stands for a flat, 1 for a sharp, 0 for a natural sign, -2 for a double flat and so on. For example, E-1 is E-flat, F1 is F-sharp and A0 is A with a natural sign.
Please pay attention to A3: MIDI pitch 57 appears in two rows, since the A occurs 35 times with a natural sign in the score, but twice without one (presumably later in a measure in which the flat had already been cancelled).
If this level of detail is not needed, it can be switched off with the additional parameter 'enharmonic=False'. Now E-flat becomes D-sharp (D#), A-flat becomes G-sharp (G#) and so on. The column names of the table also have to be adjusted, because pitch and octave are now merged into one column:
pitch_hist = mp.analyse.pitch_histogram(m_df,
do_plot=None,
do_plot_full_axis=True,
visulize_midi_range=None,
filter_dict=None,
enharmonic=False)
mp.utils.display_table(data=pitch_hist,
columns=['MIDI', 'Pitch', 'Occurrences'])
# The second command redraws and labels the table columns.
MIDI | Pitch | Occurrences |
---|---|---|
39 | D#2 | 7 |
41 | F2 | 7 |
43 | G2 | 9 |
44 | G#2 | 13 |
45 | A2 | 6 |
46 | A#2 | 51 |
47 | B2 | 4 |
48 | C3 | 22 |
49 | C#3 | 2 |
50 | D3 | 25 |
51 | D#3 | 59 |
53 | F3 | 46 |
55 | G3 | 45 |
56 | G#3 | 39 |
57 | A3 | 37 |
58 | A#3 | 121 |
59 | B3 | 11 |
60 | C4 | 70 |
62 | D4 | 98 |
63 | D#4 | 174 |
65 | F4 | 119 |
66 | F#4 | 2 |
67 | G4 | 84 |
68 | G#4 | 73 |
69 | A4 | 30 |
70 | A#4 | 101 |
71 | B4 | 2 |
72 | C5 | 53 |
74 | D5 | 54 |
75 | D#5 | 88 |
76 | E5 | 1 |
77 | F5 | 42 |
78 | F#5 | 2 |
79 | G5 | 34 |
80 | G#5 | 25 |
81 | A5 | 13 |
82 | A#5 | 29 |
83 | B5 | 1 |
84 | C6 | 5 |
86 | D6 | 5 |
87 | D#6 | 7 |
Now A3 appears in only one row, with frequency 37.
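The merging of the two rows for MIDI pitch 57 can be checked with a few lines of plain Python. This is only an illustrative sketch with three rows copied from the first table; it does not claim to show how CAMAT performs the merge internally.

```python
from collections import defaultdict

# Rows copied from the 'enharmonic=True' table: (MIDI, label, octave, count).
rows = [(57, 'A0', 3, 35), (57, 'A', 3, 2), (58, 'B-1', 3, 121)]

# Add up the counts of all rows that share the same MIDI pitch.
totals = defaultdict(int)
for midi, _label, _octave, count in rows:
    totals[midi] += count

print(dict(totals))
```

The two spellings of A3 (35 + 2) add up to the 37 occurrences shown in the second table.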
The following command exports the list of pitch frequencies as a csv file (csv = comma-separated values). The export can be used to generate tables for comparisons and corpus analysis. The csv file is saved in the export folder and can be opened with a text editor or a spreadsheet program (e.g. Excel).
mp.utils.export_as_csv(data=pitch_hist,
columns=['MIDI','Pitch','Occurrences'],
save_file_name ='pitch_histogram.csv', # other file names are also possible
do_save=True, # Command for saving
do_print=False, # at 'True' the file is displayed again in the browser
sep=';', # a semicolon is used as separator
header=True) # the headers of the columns are displayed
# The pitch_histogram.csv file is automatically saved in the music_xml_parser\data\exports\ folder.
# If you want to save it in another folder,
# you have to enter a path under save_file_name (e.g. 'C:/pitch_histogram.csv')
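To make clear what such a semicolon-separated file contains, here is a small sketch using only Python's standard `csv` module. It is not the code behind `export_as_csv`; the three data rows are copied from the table above, and an in-memory buffer stands in for a real file.

```python
import csv
import io

# The first three rows of the pitch table above: (MIDI, pitch, occurrences).
rows = [(39, 'D#2', 7), (41, 'F2', 7), (43, 'G2', 9)]

# StringIO stands in for a file opened with open(path, 'w', newline='').
buffer = io.StringIO()
writer = csv.writer(buffer, delimiter=';', lineterminator='\n')
writer.writerow(['MIDI', 'Pitch', 'Occurrences'])  # header row
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```

A semicolon as separator is convenient for spreadsheet programs in locales that use the comma as decimal mark.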
There are two ways to make the graphical representation a little clearer: On the one hand, the representation can be restricted to a certain pitch range. On the other hand, only those pitches can be selected that actually occur.
ph = mp.analyse.pitch_histogram(m_df,
do_plot=True,
visulize_midi_range=[50, 90])
# The addition 'visulize_midi_range=[50, 90]' limits the displayed section
# to the range between MIDI pitch 50 (= D3) and 90 (= F6).
ph2 = mp.analyse.pitch_histogram(m_df,
do_plot=True,
visulize_midi_range=None,
do_plot_full_axis=False)
# With the addition 'do_plot_full_axis=False,' only the frequencies
# of tones that actually occur are displayed in the graph.
# All other tones are deleted on the x-axis.
For harmonic analyses, it is much clearer not to look at the individual pitches, but to group them into pitch classes.
pitchclass_hist = mp.analyse.pitch_class_histogram(m_df,
do_plot=True)
Now it can be seen at a glance: Mozart uses almost exclusively the notes of the E-flat major scale, with one interesting exception: the note A, a tritone above the tonic, appears relatively often!
What could this be related to? To answer this question, you need to look at the score and check how the note A is used there. Could it have to do with the use of the dominant of the dominant (F major)?
As for pitches (see section 2.1), tables can also be displayed for pitch classes. The frequency table is returned as soon as plotting is switched off ('do_plot=None'):
pitchclass_hist = mp.analyse.pitch_class_histogram(m_df,
do_plot=None)
mp.utils.display_table(data=pitchclass_hist,
columns=['Pitch Class','Occurrences'])
# The second command displays and labels the table.
# You can choose the column labels freely (the expressions shown in red).
Pitch Class | Occurrences |
---|---|
C | 150 |
C# | 2 |
D | 182 |
D# | 335 |
E | 1 |
F | 214 |
F# | 4 |
G | 172 |
G# | 150 |
A | 86 |
A# | 302 |
B | 18 |
# The frequencies of the pitch classes are exported as a csv file with the following command:
mp.utils.export_as_csv(data=pitchclass_hist,
columns=['Pitch Class','Occurrences'],
save_file_name ='pitch_class_hist.csv',
do_save=False,
do_print=None,
sep=';',
header=True)
# (For the parameters see above, 2.1)
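The step from pitches to pitch classes amounts to folding all octaves together, i.e. taking the MIDI number modulo 12. The sketch below illustrates this with the counts of all C and A pitches from the pitch table in section 2.1; it is plain Python for illustration and makes no claim about CAMAT's internals.

```python
from collections import Counter

NOTE_NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']

# MIDI pitch -> occurrences for every C and every A in the pitch table.
pitch_counts = {48: 22, 60: 70, 72: 53, 84: 5,   # C3, C4, C5, C6
                45: 6, 57: 37, 69: 30, 81: 13}   # A2, A3, A4, A5

# Fold all octaves together: the pitch class is the MIDI number modulo 12.
pitch_class_counts = Counter()
for midi, count in pitch_counts.items():
    pitch_class_counts[NOTE_NAMES[midi % 12]] += count

print(dict(pitch_class_counts))
```

The sums (150 for C, 86 for A) agree with the pitch class table above.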
How often does a certain interval step occur in the individual voices? Do all voices have a similar interval progression - or are there more leaps in the lower voices, for example, and more steps in the melody voice?
First, let's look at the interval distribution in the first violin
v = m_df[['PartID','PartName']].drop_duplicates().to_numpy()
print(v)
# This command first displays the voices and their names.
[['1' 'Violino I'] ['2' 'Violino II'] ['3' 'Viola'] ['4' 'Violoncello']]
# Now we select the first voice (with the PartID=1): 'part='1'.
interval_hist = mp.analyse.interval(m_df,
part='1',
do_plot=True)
The first violin progresses primarily in seconds, thirds, and fourths, with descending steps being more common than ascending ones. Larger intervals also occur, but are much rarer.
# The following command displays the distribution of interval frequencies
# as a table (only if 'do_print=True')
# and exports the table to the file 'interval_1.csv' (only if 'do_save=True').
mp.utils.export_as_csv(data=interval_hist,
columns=['Interval', 'Occurrences'],
save_file_name ='interval_1.csv',
do_save=True,
do_print=False,
do_return_pd=False,
sep=';',
index=False,
header=True)
Now, what about the cello part?
To find out, you simply have to replace the '1' with a '4' in the 'part' parameter.
Copy the entire command into a new code cell and adjust the voice selection:
# Here is the customized command:
interval_hist = mp.analyse.interval(m_df,
part='4',
do_plot=True)
Noticeably, ascending fourths (5 semitones), ascending seconds (2) and descending fifths (-7) occur quite frequently. Perhaps this indicates that the cello often plays chord roots, so that its steps can be interpreted harmonically?
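The interval values on the x-axis are signed semitone distances between consecutive notes. As a minimal sketch (assuming that this is how the intervals are counted; the function name `interval_histogram` and the bass line are hypothetical), a I-IV-V-I bass progression in E-flat major yields exactly the three intervals mentioned above:

```python
from collections import Counter

def interval_histogram(midi_pitches):
    """Count the signed semitone steps between consecutive notes."""
    steps = [b - a for a, b in zip(midi_pitches, midi_pitches[1:])]
    return Counter(steps)

# Hypothetical bass line: E-flat2, A-flat2, B-flat2, E-flat2 (I-IV-V-I).
bass = [39, 44, 46, 39]
hist = interval_histogram(bass)
print(sorted(hist.items()))
```

The result contains one ascending fourth (5), one ascending second (2) and one descending fifth (-7).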
Now let's turn to rhythmic shaping: what duration values are used in the composition, and how often do they occur in each case?
In the following evaluation, the quarter note is given the value 1. Shorter and longer note values are expressed accordingly as fractions or multiples of 1.
quarter_dur_hist = mp.analyse.quarterlength_duration_histogram(m_df,
do_plot=True)
As expected, Mozart uses mainly quarter notes (1) and smaller note values (<1). However, there are also a few longer notes. If we want to know the exact number of duration values and are also interested in the <1 range, we have to display the frequency table again:
quarter_dur_hist = mp.analyse.quarterlength_duration_histogram(m_df,
do_plot=False)
# The table is created only with 'do_plot=False' or 'do_plot=None'
# The table is created and labeled by the following command:
mp.utils.display_table(data=quarter_dur_hist,
columns=['Duration Class','Occurrences'])
Duration Class | Occurrences |
---|---|
0.125 | 38 |
0.250 | 202 |
0.375 | 4 |
0.500 | 497 |
0.750 | 22 |
1.000 | 694 |
1.500 | 30 |
2.000 | 51 |
3.000 | 42 |
4.000 | 15 |
5.000 | 1 |
6.000 | 18 |
7.000 | 1 |
10.000 | 1 |
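To explain the duration values: they are multiples or fractions of a quarter note (= 1). Thus 0.125 is a thirty-second note, 0.25 a sixteenth, 0.5 an eighth, 2 a half note and 4 a whole note; dotted values are 1.5 times as long (0.75 = dotted eighth, 1.5 = dotted quarter, 3 = dotted half). Values such as 5, 7 or 10 presumably result from tied notes, which are counted as one tone.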
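The mapping from quarter lengths to note names can be written down as a small lookup. This is an illustrative sketch only; the function `duration_name` is hypothetical and covers just plain and single-dotted values.

```python
from fractions import Fraction

BASE_NAMES = {
    Fraction(4): 'whole', Fraction(2): 'half', Fraction(1): 'quarter',
    Fraction(1, 2): 'eighth', Fraction(1, 4): 'sixteenth',
    Fraction(1, 8): 'thirty-second',
}

def duration_name(quarter_length):
    """Name a duration given in quarter notes (plain or single-dotted)."""
    value = Fraction(quarter_length)
    if value in BASE_NAMES:
        return BASE_NAMES[value]
    # A dotted note lasts 1.5 times as long as its base value.
    base = value / Fraction(3, 2)
    if base in BASE_NAMES:
        return 'dotted ' + BASE_NAMES[base]
    return 'other (e.g. tied notes)'

for ql in [0.125, 0.5, 0.75, 1.0, 3.0, 5.0]:
    print(ql, '=', duration_name(ql))
```

Values that are neither plain nor single-dotted, such as 5.0, fall into the last category.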
# Here is the command to save the table:
mp.utils.export_as_csv(data=quarter_dur_hist,
columns=['Duration Values', 'Occurrences'],
save_file_name ='quarter_duration_hist.csv',
do_save=True,
do_print=False,
sep=';',
index=False,
header=True)
How clearly is the meter articulated in the voices of a composition - by the placement of tones on measure beginnings or on metrically important positions within the measure (e.g. the middle of the measure or on the quarter positions)? For this purpose, a list of the frequencies of tones on the various metrical positions can be displayed.
Of course, such a profile presupposes that the examined piece is in a single meter and has no meter changes. This can be checked with the following command:
ts_hist = mp.analyse.time_signature_histogram(m_df,
do_plot=False)
mp.utils.display_table(data=ts_hist,
columns=['Meter', 'Occurrences'])
Meter | Occurrences |
---|---|
4/4 | 31 |
3/4 | 128 |
In our Mozart movement, both 4/4 and 3/4 measures appear, and 3/4 even predominates, although the piece begins in 4/4 time. The following command therefore creates two different metric profiles: one for the 4/4 measures and one for the 3/4 measures.
mp_ts_dict_2d = mp.analyse.metric_profile_split_time_signature(m_df,
do_plot=True)
# The command for the table display (do_print=True)
# and the csv export (do_save=True) looks a bit complicated,
# but does the same as always:
for k2 in mp_ts_dict_2d.keys():
    print(f"Time Signature {k2}")
    # One file name per time signature, e.g. 'metric_profile_ts_3-4.csv'
    saveas = 'metric_profile_ts_' + k2.replace('/', '-') + '.csv'
    mp.utils.export_as_csv(data=mp_ts_dict_2d[k2],
                           columns=['Metric Profile', 'Occurrences'],
                           save_file_name=saveas,
                           do_save=False,
                           do_print=True,
                           do_return_pd=False,
                           sep=';',
                           index=False,
                           header=True)
Time Signature 3/4
Metric Profile | Occurrences |
---|---|
1.00 | 344 |
1.50 | 55 |
1.75 | 18 |
1.88 | 13 |
2.00 | 227 |
2.50 | 78 |
3.00 | 258 |
3.25 | 3 |
3.50 | 65 |
3.75 | 9 |
3.88 | 4 |
Time Signature 4/4
Metric Profile | Occurrences |
---|---|
1.00 | 85 |
1.25 | 12 |
1.38 | 4 |
1.50 | 27 |
1.75 | 12 |
2.00 | 71 |
2.25 | 12 |
2.50 | 46 |
2.75 | 12 |
3.00 | 91 |
3.25 | 16 |
3.50 | 29 |
3.75 | 13 |
4.00 | 52 |
4.25 | 13 |
4.50 | 34 |
4.75 | 13 |
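How such a profile comes about can be sketched in a few lines. The example below is a simplified illustration with invented notes, not CAMAT's implementation: each note is described by its time signature and its offset from the barline in quarter notes, and the metric position is counted from 1 as in the tables above.

```python
from collections import Counter, defaultdict

# Hypothetical notes: (time signature, offset from the barline in quarter notes).
notes = [('4/4', 0.0), ('4/4', 2.0), ('4/4', 2.5),
         ('3/4', 0.0), ('3/4', 0.0), ('3/4', 1.5)]

# One separate profile per time signature.
profiles = defaultdict(Counter)
for time_signature, offset in notes:
    position = offset + 1  # position 1.0 is the downbeat
    profiles[time_signature][position] += 1

for time_signature, profile in profiles.items():
    print(time_signature, sorted(profile.items()))
```

Keeping the two meters apart matters because position 3.0 is the middle of a 4/4 measure but the last beat of a 3/4 measure.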
We have already looked at the frequencies of pitches and pitch classes. Now we could say: Longer tones naturally have more weight than short tones or tones between beats. We can pursue this idea further by looking at combined, 'double' or 'bivariate' frequency distributions: for example, the frequencies of the pitches for different duration values, or the frequencies of the pitch class for the different metrical positions. In the following, we will deal with this by means of two examples.
Example 1: Durations per pitch classes. Are there differences in the duration values for different pitch classes?
Example 2: Pitch classes on metric positions. Do the pitch classes differ in the positions within the measure on which they occur?
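The idea of a bivariate distribution is simply to count pairs of values instead of single values. A minimal sketch with invented notes (plain Python, independent of CAMAT):

```python
from collections import Counter

# Hypothetical notes: (pitch class, duration in quarter notes).
notes = [('D#', 1.0), ('D#', 1.0), ('D#', 0.5), ('A#', 1.0), ('A', 0.5)]

# Bivariate distribution: count every (pitch class, duration) pair.
joint = Counter(notes)

# Summing over the durations gives back the simple pitch class counts.
per_pitch_class = Counter()
for (pitch_class, _duration), count in joint.items():
    per_pitch_class[pitch_class] += count

print(sorted(joint.items()))
print(sorted(per_pitch_class.items()))
```

Summing the pair counts over one variable always recovers the one-dimensional distribution of the other.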
The following command creates a so-called 3D graphic, where the frequencies of duration values per pitch class are displayed. Both the height and the color of the columns stand for the respective frequency (from blue=very rare via green and yellow to red=very frequent):
dur_hist = mp.analyse.quarterlength_duration_histogram(m_df,
plot_with='PitchClass',
do_plot=True)
Since the assignment of the bars to the note values is a bit confusing (the numbers refer to the subsequent fields), we use the following command to display the corresponding frequency table:
dur_hist = mp.analyse.quarterlength_duration_histogram(m_df,
plot_with='PitchClass',
do_plot=False)
mp.utils.export_as_csv(data=dur_hist,
columns=['Pitch Class', 'Duration Value', 'Occurrences'],
save_file_name='QuarterLength.csv',
do_save=False, # With do_save=True a csv file is saved.
do_print=True, # With do_print=True a table is displayed.
do_return_pd=False,
sep=';',
index=False,
header=True)
Pitch Class | Duration Value | Occurrences |
---|---|---|
C | 0.125 | 5 |
C | 0.25 | 16 |
C | 0.5 | 38 |
C | 0.75 | 4 |
C | 1.0 | 67 |
C | 1.5 | 6 |
C | 2.0 | 7 |
C | 3.0 | 3 |
C | 4.0 | 4 |
C# | 0.5 | 2 |
A# | 0.125 | 4 |
A# | 0.25 | 32 |
A# | 0.5 | 91 |
A# | 0.75 | 3 |
A# | 1.0 | 135 |
A# | 1.5 | 7 |
A# | 10.0 | 1 |
A# | 2.0 | 11 |
A# | 3.0 | 10 |
A# | 4.0 | 2 |
A# | 6.0 | 5 |
A# | 7.0 | 1 |
B | 0.5 | 2 |
B | 1.0 | 16 |
D | 0.125 | 4 |
D | 0.25 | 19 |
D | 0.5 | 69 |
D | 0.75 | 4 |
D | 1.0 | 79 |
D | 2.0 | 2 |
D | 3.0 | 5 |
D# | 0.125 | 5 |
D# | 0.25 | 41 |
D# | 0.375 | 2 |
D# | 0.5 | 100 |
D# | 0.75 | 2 |
D# | 1.0 | 144 |
D# | 1.5 | 7 |
D# | 2.0 | 13 |
D# | 3.0 | 15 |
D# | 4.0 | 2 |
D# | 5.0 | 1 |
D# | 6.0 | 3 |
E | 0.5 | 1 |
F | 0.125 | 8 |
F | 0.25 | 33 |
F | 0.5 | 63 |
F | 0.75 | 3 |
F | 1.0 | 88 |
F | 1.5 | 6 |
F | 2.0 | 7 |
F | 3.0 | 4 |
F | 4.0 | 2 |
F# | 1.0 | 4 |
G | 0.125 | 6 |
G | 0.25 | 28 |
G | 0.5 | 53 |
G | 0.75 | 3 |
G | 1.0 | 67 |
G | 1.5 | 2 |
G | 2.0 | 1 |
G | 3.0 | 5 |
G | 4.0 | 4 |
G | 6.0 | 3 |
G# | 0.125 | 2 |
G# | 0.25 | 29 |
G# | 0.375 | 2 |
G# | 0.5 | 46 |
G# | 0.75 | 1 |
G# | 1.0 | 51 |
G# | 1.5 | 2 |
G# | 2.0 | 9 |
G# | 4.0 | 1 |
G# | 6.0 | 7 |
A | 0.125 | 4 |
A | 0.25 | 4 |
A | 0.5 | 32 |
A | 0.75 | 2 |
A | 1.0 | 43 |
A | 2.0 | 1 |
It is not very surprising that the root E-flat (=D#) and the fifth B-flat (=A#) occur mainly as eighth notes (0.5) and quarter notes (1.0). After all, these are the most frequent duration values!
The plot can be changed from pitch classes to pitches by setting the parameter plot_with to 'Pitch' (in single quotes). Now, however, the 3D graphic becomes a bit more confusing...
dur_pitch_hist = mp.analyse.quarterlength_duration_histogram(m_df,
plot_with='Pitch',
do_plot=True)
# Commands for tables and export (see above)
# In the column names (columns=), 'Pitch Class' has to be changed to 'Pitch'.
dur_pitch_hist = mp.analyse.quarterlength_duration_histogram(m_df,
plot_with='Pitch',
do_plot=False)
mp.utils.export_as_csv(data=dur_pitch_hist,
columns=['Pitch', 'Duration Value', 'Occurrences'],
save_file_name='QuarterLength.csv',
do_save=False,
do_print=True,
do_return_pd=False,
sep=';',
index=False,
header=True)
Pitch | Duration Value | Occurrences |
---|---|---|
D#2 | 0.5 | 2 |
D#2 | 1.0 | 4 |
D#2 | 3.0 | 1 |
F2 | 0.5 | 2 |
F2 | 1.0 | 4 |
F2 | 2.0 | 1 |
G2 | 0.5 | 2 |
G2 | 1.0 | 6 |
G2 | 3.0 | 1 |
G#2 | 0.5 | 3 |
G#2 | 1.0 | 8 |
G#2 | 1.5 | 1 |
G#2 | 6.0 | 1 |
A2 | 1.0 | 6 |
A#2 | 0.5 | 18 |
A#2 | 1.0 | 27 |
A#2 | 2.0 | 2 |
A#2 | 3.0 | 4 |
B2 | 0.5 | 1 |
B2 | 1.0 | 3 |
C3 | 0.5 | 7 |
C3 | 1.0 | 12 |
C3 | 2.0 | 2 |
C3 | 4.0 | 1 |
C#3 | 0.5 | 2 |
D3 | 0.5 | 4 |
D3 | 1.0 | 18 |
D3 | 2.0 | 1 |
D3 | 3.0 | 2 |
D#3 | 0.5 | 8 |
D#3 | 1.0 | 39 |
D#3 | 1.5 | 2 |
D#3 | 2.0 | 1 |
D#3 | 3.0 | 8 |
D#3 | 5.0 | 1 |
F3 | 0.5 | 11 |
F3 | 1.0 | 28 |
F3 | 1.5 | 2 |
F3 | 2.0 | 2 |
F3 | 3.0 | 1 |
F3 | 4.0 | 2 |
G3 | 0.25 | 2 |
G3 | 0.5 | 11 |
G3 | 1.0 | 25 |
G3 | 1.5 | 1 |
G3 | 3.0 | 4 |
G3 | 4.0 | 2 |
G#3 | 0.25 | 5 |
G#3 | 0.5 | 11 |
G#3 | 1.0 | 15 |
G#3 | 1.5 | 1 |
G#3 | 2.0 | 3 |
G#3 | 4.0 | 1 |
G#3 | 6.0 | 3 |
A3 | 0.25 | 4 |
A3 | 0.5 | 5 |
A3 | 1.0 | 27 |
A3 | 2.0 | 1 |
A#3 | 0.25 | 15 |
A#3 | 0.5 | 9 |
A#3 | 1.0 | 72 |
A#3 | 1.5 | 4 |
A#3 | 2.0 | 7 |
A#3 | 3.0 | 6 |
A#3 | 4.0 | 1 |
A#3 | 6.0 | 5 |
A#3 | 7.0 | 1 |
A#3 | 10.0 | 1 |
B3 | 0.5 | 1 |
B3 | 1.0 | 10 |
C4 | 0.25 | 11 |
C4 | 0.5 | 12 |
C4 | 0.75 | 2 |
C4 | 1.0 | 36 |
C4 | 1.5 | 3 |
C4 | 2.0 | 1 |
C4 | 3.0 | 3 |
C4 | 4.0 | 2 |
D4 | 0.25 | 19 |
D4 | 0.5 | 27 |
D4 | 0.75 | 2 |
D4 | 1.0 | 46 |
D4 | 2.0 | 1 |
D4 | 3.0 | 3 |
D#4 | 0.25 | 37 |
D#4 | 0.5 | 41 |
D#4 | 1.0 | 74 |
D#4 | 1.5 | 4 |
D#4 | 2.0 | 7 |
D#4 | 3.0 | 6 |
D#4 | 4.0 | 2 |
D#4 | 6.0 | 3 |
F4 | 0.25 | 29 |
F4 | 0.5 | 30 |
F4 | 1.0 | 50 |
F4 | 1.5 | 3 |
F4 | 2.0 | 4 |
F4 | 3.0 | 3 |
F#4 | 1.0 | 2 |
G4 | 0.125 | 2 |
G4 | 0.25 | 23 |
G4 | 0.5 | 27 |
G4 | 1.0 | 25 |
G4 | 1.5 | 1 |
G4 | 2.0 | 1 |
G4 | 4.0 | 2 |
G4 | 6.0 | 3 |
G#4 | 0.125 | 1 |
G#4 | 0.25 | 20 |
G#4 | 0.5 | 21 |
G#4 | 0.75 | 1 |
G#4 | 1.0 | 23 |
G#4 | 2.0 | 4 |
G#4 | 6.0 | 3 |
A4 | 0.125 | 3 |
A4 | 0.5 | 18 |
A4 | 0.75 | 1 |
A4 | 1.0 | 8 |
A#4 | 0.125 | 3 |
A#4 | 0.25 | 14 |
A#4 | 0.5 | 45 |
A#4 | 0.75 | 2 |
A#4 | 1.0 | 31 |
A#4 | 1.5 | 3 |
A#4 | 2.0 | 2 |
A#4 | 4.0 | 1 |
B4 | 1.0 | 2 |
C5 | 0.125 | 5 |
C5 | 0.25 | 5 |
C5 | 0.5 | 18 |
C5 | 0.75 | 2 |
C5 | 1.0 | 18 |
C5 | 1.5 | 3 |
C5 | 2.0 | 1 |
C5 | 4.0 | 1 |
D5 | 0.125 | 4 |
D5 | 0.5 | 34 |
D5 | 0.75 | 2 |
D5 | 1.0 | 14 |
D#5 | 0.125 | 5 |
D#5 | 0.25 | 4 |
D#5 | 0.375 | 2 |
D#5 | 0.5 | 43 |
D#5 | 0.75 | 2 |
D#5 | 1.0 | 26 |
D#5 | 1.5 | 1 |
D#5 | 2.0 | 5 |
E5 | 0.5 | 1 |
F5 | 0.125 | 8 |
F5 | 0.25 | 4 |
F5 | 0.5 | 20 |
F5 | 0.75 | 3 |
F5 | 1.0 | 6 |
F5 | 1.5 | 1 |
F#5 | 1.0 | 2 |
G5 | 0.125 | 4 |
G5 | 0.25 | 3 |
G5 | 0.5 | 13 |
G5 | 0.75 | 3 |
G5 | 1.0 | 11 |
G#5 | 0.125 | 1 |
G#5 | 0.25 | 4 |
G#5 | 0.375 | 2 |
G#5 | 0.5 | 11 |
G#5 | 1.0 | 5 |
G#5 | 2.0 | 2 |
A5 | 0.125 | 1 |
A5 | 0.5 | 9 |
A5 | 0.75 | 1 |
A5 | 1.0 | 2 |
A#5 | 0.125 | 1 |
A#5 | 0.25 | 3 |
A#5 | 0.5 | 19 |
A#5 | 0.75 | 1 |
A#5 | 1.0 | 5 |
B5 | 1.0 | 1 |
C6 | 0.5 | 1 |
C6 | 1.0 | 1 |
C6 | 2.0 | 3 |
D6 | 0.5 | 4 |
D6 | 1.0 | 1 |
D#6 | 0.5 | 6 |
D#6 | 1.0 | 1 |
# In the external display, the graphic can be rotated and enlarged.
# Please delete the # in front of the command:
# %matplotlib
dur_p_hist = mp.analyse.quarterlength_duration_histogram(m_df,
plot_with='Pitch',
do_plot=True)
Using matplotlib backend: Qt5Agg
# To switch off the external display, please, choose the following command:
%matplotlib inline
Now to the question: at which positions in the measure do the twelve pitch classes occur? The following command generates the corresponding 3D graphic.
mp_p_hist = mp.analyse.metric_profile(m_df,
plot_with='PitchClass',
do_plot=True)
# Here is the corresponding table with the usual export function.
mp_p_hist = mp.analyse.metric_profile(m_df,
plot_with='PitchClass',
do_plot=False)
mp.utils.export_as_csv(data=mp_p_hist,
columns=['Pitch Class', 'Metric Position', 'Occurrences'],
save_file_name ='metric_profile_hist.csv',
do_save=True,
do_print=True,
do_return_pd=False,
sep=';',
index=False,
header=True)
Pitch Class | Metric Position | Occurrences |
---|---|---|
C | 0.0 | 48 |
C | 0.375 | 2 |
C | 0.5 | 4 |
C | 0.75 | 2 |
C | 0.875 | 1 |
C | 1.0 | 23 |
C | 1.25 | 3 |
C | 1.5 | 13 |
C | 2.0 | 33 |
C | 2.5 | 9 |
C | 3.0 | 3 |
C | 3.25 | 2 |
C | 3.5 | 7 |
C# | 0.0 | 1 |
C# | 1.0 | 1 |
D | 0.0 | 34 |
D | 0.25 | 1 |
D | 0.5 | 11 |
D | 0.75 | 2 |
D | 0.875 | 2 |
D | 1.0 | 42 |
D | 1.25 | 1 |
D | 1.5 | 14 |
D | 1.75 | 3 |
D | 2.0 | 46 |
D | 2.5 | 12 |
D | 3.0 | 6 |
D | 3.25 | 4 |
D | 3.5 | 4 |
D# | 0.0 | 108 |
D# | 0.25 | 4 |
D# | 0.5 | 21 |
D# | 0.75 | 8 |
D# | 0.875 | 2 |
D# | 1.0 | 47 |
D# | 1.5 | 22 |
D# | 1.75 | 1 |
D# | 2.0 | 65 |
D# | 2.25 | 6 |
D# | 2.5 | 24 |
D# | 2.75 | 7 |
D# | 3.0 | 11 |
D# | 3.5 | 2 |
D# | 3.75 | 7 |
E | 1.0 | 1 |
F | 0.0 | 58 |
F | 0.25 | 1 |
F | 0.375 | 2 |
F | 0.5 | 3 |
F | 0.75 | 6 |
F | 0.875 | 1 |
F | 1.0 | 35 |
F | 1.25 | 3 |
F | 1.5 | 27 |
F | 1.75 | 3 |
F | 2.0 | 42 |
F | 2.25 | 3 |
F | 2.5 | 8 |
F | 2.75 | 7 |
F | 2.875 | 2 |
F | 3.0 | 5 |
F | 3.5 | 8 |
F# | 2.0 | 4 |
G | 0.0 | 45 |
G | 0.5 | 7 |
G | 0.75 | 1 |
G | 0.875 | 3 |
G | 1.0 | 30 |
G | 1.25 | 4 |
G | 1.5 | 9 |
G | 2.0 | 42 |
G | 2.5 | 17 |
G | 2.75 | 2 |
G | 3.25 | 6 |
G | 3.5 | 6 |
G# | 0.0 | 26 |
G# | 0.25 | 4 |
G# | 0.5 | 18 |
G# | 0.75 | 5 |
G# | 0.875 | 1 |
G# | 1.0 | 35 |
G# | 1.5 | 13 |
G# | 2.0 | 20 |
G# | 2.25 | 7 |
G# | 2.5 | 2 |
G# | 3.0 | 16 |
G# | 3.25 | 1 |
G# | 3.75 | 2 |
A | 0.0 | 8 |
A | 0.25 | 1 |
A | 0.5 | 4 |
A | 0.75 | 2 |
A | 1.0 | 26 |
A | 1.5 | 5 |
A | 1.75 | 3 |
A | 2.0 | 28 |
A | 2.5 | 6 |
A | 2.875 | 2 |
A | 3.0 | 1 |
A# | 0.0 | 101 |
A# | 0.25 | 1 |
A# | 0.5 | 14 |
A# | 0.75 | 4 |
A# | 0.875 | 3 |
A# | 1.0 | 46 |
A# | 1.25 | 1 |
A# | 1.5 | 21 |
A# | 1.75 | 2 |
A# | 2.0 | 63 |
A# | 2.25 | 3 |
A# | 2.5 | 16 |
A# | 2.75 | 6 |
A# | 3.0 | 10 |
A# | 3.5 | 7 |
A# | 3.75 | 4 |
B | 1.0 | 12 |
B | 2.0 | 6 |
The same works with pitches. For this we simply have to replace 'PitchClass' with 'Pitch' in the 'plot_with' parameter:
mp_pc_hist = mp.analyse.metric_profile(m_df,
plot_with='Pitch',
do_plot=True)
# table (do_print=True) and csv export (do_save=True):
mp_pc_hist = mp.analyse.metric_profile(m_df,
plot_with='Pitch',
do_plot=None)
# There are now four columns in the table output,
# so four labels have to be supplied:
mp.utils.export_as_csv(data=mp_pc_hist,
columns=['MIDI-Pitch','Pitch','Metric Position','Occurrences'],
save_file_name ='metric_profile_hist.csv',
do_save=False,
do_print=False,
do_return_pd=False,
sep=';',
index=False,
header=True)
Again, it may be useful to look at the graph in Matplotlib's external pop-up window.
# %matplotlib
# mp_pc_hist = mp.analyse.metric_profile(m_df, plot_with='PitchClass', do_plot=True)
%matplotlib inline
However, we have made a mistake: the Mozart movement does contain a change of time signature, and we did not distinguish between 4/4 and 3/4 time in our evaluation. We therefore have to repeat the commands, this time separately for the two time signatures.
# Here again the command for the distinction of two different meters:
# In direct connection the pitch-duration profiles are displayed separately:
mp_ts_dict_2d = mp.analyse.metric_profile_split_time_signature(m_df,
plot_with=None,
do_plot=True)
# Here is the command for displaying
# the 3D graphics (do_plot=True),
# the table (do_print=True)
# and the csv export (do_save=True):
mp_ts_dict_p = mp.analyse.metric_profile_split_time_signature(m_df,
plot_with='PitchClass',
do_plot=True)
for k2p in mp_ts_dict_p.keys():
    print(f"Time Signature {k2p}")
    # One file name per time signature, e.g. 'metric_profile_ts_p_3-4.csv'
    saveas = 'metric_profile_ts_p_' + k2p.replace('/', '-') + '.csv'
    mp.utils.export_as_csv(data=mp_ts_dict_p[k2p],
                           columns=['Pitch Class', 'Metric Position', 'Occurrences'],
                           save_file_name=saveas,
                           do_save=False,
                           do_print=False,
                           do_return_pd=False,
                           sep=';',
                           index=False,
                           header=True)
Time Signature 3/4
Time Signature 4/4
All statistical queries can be executed on any sections and voices with the help of an easy-to-use filter function (cf. Tutorial Part 1, Section 5). All you have to do is to enter the corresponding measures and voice names.
Here are two examples:
This selection can be changed at will!
filter_dict_interval ={'PartID':'1-2', 'Measure':'1-5'}
interval_hist_example = mp.analyse.interval(m_df,
do_plot=True,
filter_dict = filter_dict_interval)
mp.utils.export_as_csv(data=interval_hist_example,
columns=['Interval', 'Occurrences'],
save_file_name ='interval.csv',
do_save=False,
do_print=False,
do_return_pd=False,
sep=';',
index=False,
header=True)
This selection can also be changed as desired!
#filter_dict_cello ={'PartID':'4', 'Measure':'1-10'}
filter_dict_cello ={'PartID':'4'}
pitchclass_hist_cello = mp.analyse.pitch_class_histogram(m_df,
do_plot=True,
filter_dict = filter_dict_cello)
mp.utils.export_as_csv(data=pitchclass_hist_cello,
columns=['Pitch Class', 'Occurrences'],
save_file_name ='cello.csv',
do_save=False,
do_print=True,
do_return_pd=False,
sep=';',
index=False,
header=True)
Pitch Class | Occurrences |
---|---|
C | 26 |
C# | 2 |
D | 22 |
D# | 49 |
F | 33 |
G | 21 |
G# | 19 |
A | 8 |
A# | 57 |
B | 4 |
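The filter dictionaries used above give ranges as strings such as '1-2' or '1-5'. How such a specification could be interpreted is sketched below; this is a hypothetical illustration in plain Python (the helpers `parse_range` and `apply_filter` are invented here), not the actual filter code of CAMAT.

```python
def parse_range(spec):
    """Turn '4' or '1-5' into an inclusive range of integers."""
    first, _, last = spec.partition('-')
    return range(int(first), int(last or first) + 1)

def apply_filter(notes, filter_dict):
    """Keep only the notes whose values lie inside every given range."""
    allowed = {key: parse_range(spec) for key, spec in filter_dict.items()}
    return [note for note in notes
            if all(int(note[key]) in allowed[key] for key in allowed)]

# Hypothetical note records with a part ID and a measure number.
notes = [{'PartID': '1', 'Measure': 1}, {'PartID': '2', 'Measure': 5},
         {'PartID': '3', 'Measure': 2}, {'PartID': '1', 'Measure': 6}]

selected = apply_filter(notes, {'PartID': '1-2', 'Measure': '1-5'})
print(selected)
```

Only notes that satisfy all conditions at once (here: parts 1 to 2 and measures 1 to 5) are kept.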
So far, we have only looked at the results on the basis of a single piece. But how does the situation look now if we compare several pieces, e.g. several or all movements of a composition, with each other - and with other pieces? Are there stylistic regularities - or do the differences predominate?
Load compositions of your choice (different genres, composers, and eras) and compare these pieces with each other in terms of frequencies of pitches, pitch classes, note values, and intervals. Interpret the results in each case with a look at the sheet music!
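When pieces or voices of different length are compared, absolute counts are hard to read, so it can help to convert them to relative frequencies first. The following sketch does this for the two pitch class tables shown in this tutorial (whole movement and cello part); the function `relative_frequencies` is a hypothetical helper, not part of CAMAT.

```python
def relative_frequencies(counts):
    """Convert absolute counts into shares of the total."""
    total = sum(counts.values())
    return {key: value / total for key, value in counts.items()}

# Pitch class counts copied from the tables in this tutorial.
movement = {'C': 150, 'C#': 2, 'D': 182, 'D#': 335, 'E': 1, 'F': 214,
            'F#': 4, 'G': 172, 'G#': 150, 'A': 86, 'A#': 302, 'B': 18}
cello = {'C': 26, 'C#': 2, 'D': 22, 'D#': 49, 'F': 33,
         'G': 21, 'G#': 19, 'A': 8, 'A#': 57, 'B': 4}

rel_movement = relative_frequencies(movement)
rel_cello = relative_frequencies(cello)

# Pitch classes that do not occur in the cello part are shown as 0.
for pitch_class in movement:
    print(f"{pitch_class:2} {rel_movement[pitch_class]:.3f} "
          f"{rel_cello.get(pitch_class, 0.0):.3f}")
```

The same normalization can be applied to the exported csv files of several compositions before comparing them.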