Materials Part III

M-N: Note on the Method

Empirical basis

As mentioned in the introduction, the empirical basis of this treatise—and the basis of the series of vowel sounds selected for presentation here—consists of recordings from various areas of everyday life, the entertainment sector and art, that is, stage voices in music and straight theatre. (For an additional investigation of sounds of birds imitating human utterances, see Section M10.A.)

The recordings were collected over a period of more than 20 years, using different techniques that yielded different sound qualities. They represent utterances of speakers differing in age and gender, producing vowel sounds in different contexts, with different durations and different vocal efforts. Such variation is not a shortcoming but intentional, since this treatise focuses on the psychophysical question of the vowel (see the introduction and Section 13.7): given that different vowel sounds are perceived as being related to a single vowel quality, in contrast to the variation of other vocal sound characteristics, which describable physical characteristic, or which ensemble of physical characteristics, may be said to represent that quality?

Concerning the acoustic characteristics of vowel sounds, the sound examples presented here were produced in isolation or in word context by native German or Swiss-German speakers, with a few exceptions, and the vowel qualities correspond to Standard German. Because of the psychophysical perspective adopted here, and because of the large fundamental frequency range considered (including many high-pitched vowel sounds produced in isolation or in the context of high-pitched speech by untrained children, women and men as well as by professional actresses and actors), no principal distinction is made between speaking and singing for isolated vowel sounds or extracted vowel nuclei, and the figures give no corresponding indication that would relate to a classificatory system of modes of vowel production. Acoustic analysis as well as perceptual identification relate to sounds produced in isolation or extracted as vowel nuclei from words.

Concerning the acoustic characteristics of pitch contours, the examples presented here (see Section 8.2) concern only contours of speech. They relate to utterances of speakers of different languages (see the corresponding figure legends).

Whereas one part of these recordings forms the basis of single, published investigations undertaken in the past, which included listening tests, another part is unpublished, and the corresponding recordings have not been subjected to any further identification tests apart from identification by the author: in the course of creating this publication, the author evaluated the perceptual vowel quality of each sound separately for each sound series of a single figure presented in the Materials section. Moreover, only sounds for which the intended and the perceived vowel quality correspond are presented.

Acoustic analysis

With regard to the acoustic analysis of the sounds in general, and to the calculation of fundamental and formant frequencies in particular, the figures of Chapters 7 to 10 give values calculated automatically with routines from the Praat software (Boersma & Weenink, 2015), using the corresponding standard parameters.

Acoustic analysis was conducted on isolated vowel sounds or on extracted vowel nuclei and concerned F0, spectrum, formant frequencies and LPC curve. The present digital version of the Materials further includes the analysis of pitch contour, spectrogram, formant tracks and comparison of three formant patterns and three LPC curves related to the three standard parameter settings for children, women and men (see below).

For longer vowel sounds, a middle sound fragment of 0.3 s was analysed; for shorter sounds, a middle vowel nucleus excluding onset and offset was analysed.
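The fragment selection described above can be sketched as follows. This is a minimal illustration, not the procedure actually used: the text does not state how onset and offset were delimited for shorter sounds, so the function name `middle_fragment` and the parameter `trim_fraction` are hypothetical, and the share trimmed at each edge is an assumption.

```python
def middle_fragment(duration_s, fragment_s=0.3, trim_fraction=0.25):
    """Return (start, end) in seconds of the part of a sound to analyse.

    fragment_s is the 0.3 s window named in the text. trim_fraction is a
    hypothetical stand-in for "excluding onset and offset": the text does
    not say how those boundaries were set.
    """
    if duration_s <= 0:
        raise ValueError("duration must be positive")
    if duration_s > fragment_s:
        # Longer sound: centre a fixed-length window on the midpoint.
        start = (duration_s - fragment_s) / 2
        return start, start + fragment_s
    # Shorter sound: drop an assumed share at each edge (onset, offset).
    margin = duration_s * trim_fraction
    return margin, duration_s - margin


long_start, long_end = middle_fragment(1.0)     # a 1.0 s sustained vowel
short_start, short_end = middle_fragment(0.2)   # a 0.2 s vowel from a word
```

For a 1.0 s sound the window runs from about 0.35 s to 0.65 s; for a 0.2 s sound the assumed trimming leaves the span from about 0.05 s to 0.15 s.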

The fundamental frequency of a sound fragment was calculated as an average value using the Praat command To Pitch. Calculated values were perceptually crosschecked. If calculation errors occurred, the parameters "Pitch floor" and "Pitch ceiling" were adjusted.
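Why the two pitch bounds matter can be illustrated with a deliberately simplified estimator. The sketch below is not Praat's algorithm: it is a plain autocorrelation search written for this illustration, the function name `estimate_f0` is hypothetical, the default bounds of 75 Hz and 600 Hz are assumptions, and the test tone is synthetic rather than taken from the recordings.

```python
import math


def estimate_f0(samples, sample_rate, pitch_floor=75.0, pitch_ceiling=600.0):
    """Estimate F0 in Hz from the strongest autocorrelation lag.

    Only lags whose frequency lies between pitch_floor and pitch_ceiling
    are searched, so both bounds directly constrain the result.
    """
    min_lag = math.ceil(sample_rate / pitch_ceiling)
    max_lag = min(math.floor(sample_rate / pitch_floor), len(samples) - 1)
    best_lag, best_score = None, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        # Unnormalised autocorrelation of the fragment at this lag.
        score = sum(samples[i] * samples[i + lag]
                    for i in range(len(samples) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    if best_lag is None:
        raise ValueError("no lag fits between the pitch floor and ceiling")
    return sample_rate / best_lag


# Synthetic 200 Hz tone, 0.3 s at 8 kHz (illustrative values, not source data).
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 200 * n / SAMPLE_RATE) for n in range(2400)]

f0_default = estimate_f0(tone, SAMPLE_RATE)
f0_low_ceiling = estimate_f0(tone, SAMPLE_RATE, pitch_ceiling=150.0)
```

With the default bounds the sketch returns 200 Hz. With the ceiling lowered to 150 Hz the true period is excluded from the search and the estimate falls to 100 Hz, an octave error of the kind that a perceptual crosscheck would reveal.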

The spectrum of a sound fragment was calculated as an average spectrum for 0–5.5 kHz.

The formant frequencies of a sound fragment were automatically calculated as average values of LPC analysis using the Praat command To Formant (robust). For the analysis of a frequency range of 0–5.5 kHz, Praat indicates a maximum number of 5 formants as the standard for women, which was applied here; relative to this standard, the maximum number of formants was set to 4 for children and to 6 for men. For illustration purposes, an LPC curve was calculated for the analysis window in the middle of the sound fragment analysed.
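The three parameter settings can be written down as a small lookup table. This is a sketch only: the names `MAX_FORMANTS` and `formant_settings` are hypothetical, and the average band per formant is a derived figure added here for orientation, not a value stated in the text.

```python
# Upper limit of the analysed frequency range, as stated in the text.
MAX_FREQUENCY_HZ = 5500

# Maximum number of formants per speaker group, as stated in the text.
MAX_FORMANTS = {"children": 4, "women": 5, "men": 6}


def formant_settings(group):
    """Return the analysis settings for one speaker group."""
    if group not in MAX_FORMANTS:
        raise KeyError(f"unknown speaker group: {group!r}")
    count = MAX_FORMANTS[group]
    return {
        "max_formants": count,
        "max_frequency_hz": MAX_FREQUENCY_HZ,
        # Derived for orientation only: average band available per formant.
        "hz_per_formant": MAX_FREQUENCY_HZ / count,
    }


# All three settings, e.g. for computing the three formant patterns of a sound.
all_settings = {group: formant_settings(group) for group in MAX_FORMANTS}
```

Because the frequency range is held at 0–5.5 kHz, raising the maximum number of formants from 4 to 6 narrows the average band per formant from 1375 Hz to roughly 917 Hz, which is one way to see why the same sound can yield different calculated formant patterns under the three settings.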

Please note:

For longer recordings of speech (see Section M8.2), only the pitch contour was analysed and perceptually crosschecked. If major calculation errors occurred, the parameters "Pitch floor" and "Pitch ceiling" were again adjusted.

Illustrations related to vowel sounds

For each figure relating to a series of vowel sounds, the subject matter of illustration is explained in the text; it is also indicated in short form in the figure legend, followed by a corresponding link to the sound archive for a display of the sound spectra and for additional illustration of acoustic analysis.

Illustration form of the thumbnails (Mini) and Medium (M) layouts: Each spectrum of a vowel sound is given in terms of the display of the sound pressure level (SPL) in dB/Hz (y-coordinate) for a frequency range of 0–5.5 kHz (x-coordinate). The LPC curve corresponding to the middle window of the sound analysed (see above) is overlaid.

Below a spectrum, the following indications are given:

Since the single vowel spectra relate to single vowel sounds, the vowel quality is given in square brackets. Note that the vowel quality of /a–ɑ/ is represented by the character "a" with no further differentiation. 

Additional indications in the layouts Medium and Large: Below the information listed above, all three formant patterns related to the three different age- and gender-related parameter settings are given in order to indicate the influence of these settings on the calculated formant frequencies.

The illustration in the layout Large (L) contains:

Note that the order of sound presentation in relation to vowel qualities and to F0 is not uniform throughout the entire Materials section; within each section, the order follows the subject matter illustrated and the choice of the author.

Illustrations related to extracts of speech and of singing

For each figure relating to a series of extracts of speech and/or singing, as for the above, the subject matter of illustration is explained in the text; it is also indicated in short form in the figure legend, followed by a corresponding link to the sound archive for a display of the pitch contours and for additional illustration of acoustic analysis.

Illustration form of the thumbnails (Mini) and Medium (M) layouts: Each pitch contour of an extract of speech or singing is given as the pitch frequency in Hz (y-coordinate) over a time range in s (x-coordinate).

Below each contour, the following indications are given:

The illustration in the layout Large (L) contains:

Display options

For the display options, please refer to the menu item "Help – Assistant" in the sound archive: the assistant explains the details of the user interface.