This tutorial continues and deepens the insights gained in Tutorial: Spectral Representation with the Sonic Visualiser, this time using a vocal recording. The focus is on analyzing the melodic design, the sound characteristics and the rhythm.

 Please download Audio02.mp3 to your computer. 
 Listen to the recording at your leisure. What stands out?


This is a recording of the song "Come Back, Baby" (1954) by soul singer Ray Charles. In this mono recording, Charles' vocals take center stage; the backing band (rhythm section, horns) is relatively quiet in the background. This is advantageous for analyzing the vocals, whose spectral representation is clearly visible in the spectrogram and not "covered up" by the accompanying band.

 Please create a spectral representation of the recording in Sonic Visualiser. 
 Look at the spectral representation while you listen to the recording again (follow playback: scroll). 
 Compare what you hear with what you see in the spectrogram.
 What else do you notice?    

The following sections are devoted to conspicuous features of the melodic, tonal and rhythmic design of the recording as observed in the spectrogram.

The fundamental (F0) and harmonics can be seen as parallel lines in the spectrogram.
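
Why these lines run in parallel can be illustrated with a small synthetic example. The following Python sketch is not part of the Sonic Visualiser workflow. It builds a tone from a fundamental and two harmonics (the 200 Hz fundamental, the amplitudes and the sample rate are illustrative assumptions) and uses a naive discrete Fourier transform to list the frequencies that carry energy. In a spectrogram, each of these frequencies would appear as one horizontal line, spaced at whole-number multiples of F0.

```python
import cmath
import math


def harmonic_tone(f0, amplitudes, sample_rate, n_samples):
    """Sum of sine waves at f0, 2*f0, 3*f0, ... with the given amplitudes."""
    return [
        sum(a * math.sin(2 * math.pi * (h + 1) * f0 * t / sample_rate)
            for h, a in enumerate(amplitudes))
        for t in range(n_samples)
    ]


def spectrum_peaks(samples, sample_rate, rel_threshold=0.05):
    """Frequencies (Hz) whose DFT magnitude exceeds a share of the maximum."""
    n = len(samples)
    magnitudes = []
    for k in range(n // 2):  # bins from 0 Hz up to just below Nyquist
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                  for t, x in enumerate(samples))
        magnitudes.append(2 * abs(acc) / n)
    limit = rel_threshold * max(magnitudes)
    return [k * sample_rate / n for k, m in enumerate(magnitudes) if m > limit]


# Illustrative values only: 0.1 s of a 200 Hz tone sampled at 4000 Hz.
tone = harmonic_tone(200, [1.0, 0.5, 0.25], sample_rate=4000, n_samples=400)
print(spectrum_peaks(tone, sample_rate=4000))
```

A sung note differs from this idealized tone in that F0 moves over time, so the lines bend together during glides and vibrato while keeping their harmonic spacing.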

 What stands out when you look at the vocal line?
 By comparison, what do the horn chords in the background look like?
 Look for passages in the recording with
 - a strong glide of the voice
 - vibrato
 - various ornamentations 
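
When you read approximate F0 values off the frequency axis, the size of a glide or the width of a vibrato can be expressed in semitones. The following is a minimal Python sketch, assuming equal temperament with A4 = 440 Hz; the function names are hypothetical helpers and not part of Sonic Visualiser.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def interval_in_semitones(f_start, f_end):
    """Signed distance between two frequencies in equal-tempered semitones."""
    return 12 * math.log2(f_end / f_start)


def nearest_note(frequency_hz, a4_hz=440.0):
    """Name of the equal-tempered note closest to the given frequency."""
    midi = round(69 + 12 * math.log2(frequency_hz / a4_hz))
    return f"{NOTE_NAMES[midi % 12]}{midi // 12 - 1}"


# Hypothetical reading: a glide that starts near 220 Hz and ends near 262 Hz.
print(nearest_note(220.0), nearest_note(262.0))
print(round(interval_in_semitones(220.0, 262.0), 2))
```

Because pitch perception is roughly logarithmic, the same distance in Hz corresponds to a smaller interval the higher the voice sings.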

Various aspects overlap in the sound of the singing voice:

  • peculiarities of speech (vowel formants, consonants),
  • the anatomically determined personal sound of a voice,
  • special melodic means of expression (slides, ornaments, etc.),
  • use of different vocal registers (chest voice, falsetto, etc.),
  • use of different types of phonation and vocal techniques (shouting, belting, crooning; roughness, breathiness, twang, etc.).

With a little practice, some of these peculiarities can be detected in the spectrogram - but unfortunately not all and not in every vocal recording.

 Listen to the recording again. 
 Make a note of Ray Charles' sonic idiosyncrasies.
 How does his voice sound - in general and in certain passages?

Overtone structure and formants (formant ranges) of vowels:
The fundamental frequency of the vocal fold vibration (F0) generally corresponds to the perceived pitch. The first two formants, F1 (pharyngeal area) and F2 (labial area), are important for the intelligibility of vowels: the position of these two formant ranges characterizes the spoken or sung vowel. The following table gives orientation values for the position of F1 and F2.

 Vowel   IPA   F1 (Hz)   F2 (Hz)
 U       u     320       800
 O       o     500       1000
 Å       ɑ     700       1150
 A       a     1000      1400
 Ö       ø     500       1500
 Ü       y     320       1650
 Ä       ɛ     700       1800
 E       e     500       2300
 I       i     320       3200

 Please enlarge the lower range of the spectrogram 0 - approx. 4000 Hz. 
 Can you recognize the different formant ranges F1 and F2 for (loudly) sung vowels?
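
To turn the orientation values from the table into a quick plausibility check, the following Python sketch looks up the vowel whose (F1, F2) pair lies closest to a pair of values read off the spectrogram. The straight-line distance in Hz is a simplifying assumption, and the function name is a hypothetical helper; real formant positions vary with singer, pitch and loudness, so the result is only a rough hint.

```python
import math

# Orientation values (F1, F2) in Hz, keyed by IPA symbol, taken from the table.
VOWEL_FORMANTS = {
    "u": (320, 800),
    "o": (500, 1000),
    "ɑ": (700, 1150),
    "a": (1000, 1400),
    "ø": (500, 1500),
    "y": (320, 1650),
    "ɛ": (700, 1800),
    "e": (500, 2300),
    "i": (320, 3200),
}


def closest_vowel(f1_hz, f2_hz):
    """IPA symbol of the table entry nearest to the measured F1/F2 pair."""
    return min(
        VOWEL_FORMANTS,
        key=lambda v: math.dist((f1_hz, f2_hz), VOWEL_FORMANTS[v]),
    )


# Hypothetical readings from the spectrogram's frequency axis.
print(closest_vowel(330, 3000))
print(closest_vowel(950, 1350))
```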

The third and fourth formant ranges, F3 (oral cavity, approx. 2-3 kHz) and F4 (approx. 3-5 kHz), are no longer essential for the articulation of vowels. Instead, they characterize the singer's individual anatomy, articulatory habits and 'timbre'.

Only rough orientation values exist for how a singing voice sounds when particular formant or frequency ranges are strongly emphasized (source):

 High level at       Sound impression   Formants
 200 to 400 Hz       sonorous           1st formant of u
 400 to 600 Hz       full               1st formant of o
 800 to 1200 Hz      marked             1st formant of a
 1200 to 1800 Hz     nasal              2nd formant of ü
 1800 to 2600 Hz     bright             2nd formant of e
 2600 to 4000 Hz     brilliant          2nd formant of i
 8000 Hz             sharp              diffuse highs
 above 10000 Hz      sharp              overtone "shine"
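
As a small aid for note-taking, the ranges from this table can be written as a lookup. The Python sketch below covers only the six rows given as ranges and treats each range as including its lower but not its upper limit; both choices are assumptions made for this illustration, as is the function name.

```python
# (lower limit in Hz, upper limit in Hz, sound impression), from the table.
SOUND_IMPRESSIONS = [
    (200, 400, "sonorous"),
    (400, 600, "full"),
    (800, 1200, "marked"),
    (1200, 1800, "nasal"),
    (1800, 2600, "bright"),
    (2600, 4000, "brilliant"),
]


def sound_impression(frequency_hz):
    """Descriptor for an emphasized frequency, or None if the table has no entry."""
    for low, high, label in SOUND_IMPRESSIONS:
        if low <= frequency_hz < high:
            return label
    return None


# Hypothetical example: strong energy around 2200 Hz in a sustained note.
print(sound_impression(2200))
```

Frequencies that fall into a gap of the table, such as 700 Hz, return None rather than a guessed descriptor.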

Consonants: Sibilants such as s or sh appear as clouds (noise) in the spectrogram.

 Where are sibilants clearly visible in the spectral representation of the recording?

Passages in falsetto (men) or head voice (women) have relatively few or less pronounced harmonics compared to the full chest or modal voice. This is because only the margins of the vocal folds vibrate, in a comparatively sinusoidal motion.

Passages with rough or hoarse phonation are recognizable either as noise (gray clouds) or by additional parallel lines between the parallel harmonics. These subharmonics (lying between the harmonic partials) indicate a second vibrating element (e.g., the false vocal folds above the vocal folds) whose vibrations overlap with those of the vocal folds.
Breathy phonation can be recognized by its high proportion of noise (gray clouds).
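
Where to expect such additional lines can be estimated with a little arithmetic. The Python sketch below assumes, purely for illustration, that the second vibrating element oscillates at a whole-number fraction of F0 (for example at half of F0); the function name and the example values are hypothetical.

```python
def subharmonic_lines(f0_hz, divisor, max_hz):
    """Frequencies at multiples of f0/divisor that are not harmonics of f0."""
    step = f0_hz / divisor
    lines = []
    k = 1
    while k * step <= max_hz:
        if k % divisor != 0:  # skip multiples that coincide with harmonics
            lines.append(k * step)
        k += 1
    return lines


# Hypothetical example: F0 = 200 Hz, second element vibrating at F0 / 2.
print(subharmonic_lines(200, divisor=2, max_hz=800))
```

With a divisor of 2, one extra line appears midway between each pair of neighboring harmonics; with a divisor of 3, two extra lines would appear in each gap.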

 Now try to relate your notes on the vocal sound (see above) to the phenomena described. 
 Look for the phenomena mentioned in the spectrogram of the Ray Charles recording: 
 - certain timbre qualities
 - passages in falsetto
 - passages with great roughness
 Take a closer look at the passage 1:42 - 1:46. 
 What can be discerned here?

 What is the relationship between the rhythm of the vocals and of the accompanying band?

Tip: Choose a slower playback speed and try to tap along with the basic beat of the band (drums). Note which vertical lines in the spectrogram correspond to the beat. Is the basic beat divided into two (binary), three (ternary) or four (quaternary) parts? Now look at where Ray Charles' vocal notes begin. Do they sound in sync with the band, or before or after it? If they are out of sync: what is the approximate distance between the vocals and the band in milliseconds? What note value does this correspond to? (For reference, at a tempo of 60 bpm, each beat is exactly 1 second long.)
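
The conversion between milliseconds and note values can be sketched in a few lines of Python. The sketch assumes that one beat is notated as a quarter note and considers only a handful of subdivisions; the tempo and offset in the example are hypothetical and not measured from the recording.

```python
# Length of common note values as a fraction of one beat (beat = quarter note).
NOTE_VALUES = {
    "quarter note": 1.0,
    "eighth note": 1 / 2,
    "eighth-note triplet": 1 / 3,
    "sixteenth note": 1 / 4,
}


def beat_duration_ms(bpm):
    """Duration of one beat in milliseconds at the given tempo."""
    return 60_000 / bpm


def closest_note_value(offset_ms, bpm):
    """Note value whose length is nearest to the measured offset."""
    fraction = abs(offset_ms) / beat_duration_ms(bpm)
    return min(NOTE_VALUES, key=lambda name: abs(NOTE_VALUES[name] - fraction))


# Hypothetical example: the voice enters 125 ms after the beat at 120 bpm.
print(beat_duration_ms(120))
print(closest_note_value(125, 120))
```

Offsets much smaller than the shortest subdivision are better described as microtiming (singing slightly ahead of or behind the beat) than as a note value.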

Now visualize a vocal recording of your own choice and try to describe the special features of the melodic, tonal and rhythmic design using the spectral representation of selected excerpts. When making your selection, make sure that the vocals are not obscured by the accompanying band - or select passages in which the vocals are particularly exposed (e.g. solo passages at the beginning of a recording or during breaks).

Tip: You can also write notes directly into the spectrogram by creating a text layer and then saving the entire session. Export meaningful sections (File - Export Image File) to include them in a text document or presentation.

Heidemann, Kate (2016): "A System for Describing Vocal Timbre in Popular Song", in: Music Theory Online 22/1, online.

  • en/tutorial_singing.txt
  • Last modified: 2021/10/04 07:15
  • by andres_romero