Materials Part III - Lack of Correspondence between Patterns of Relative Spectral Energy Maxima or Formant Patterns and Age- and Gender-Related Speaker Groups or Vocal-Tract Sizes

M10.1: Similar Patterns of Relative Spectral Maxima and Similar Formant Patterns ≤ 1.5 kHz for Different Age- and Gender-Related Speaker Groups or Vocal-Tract Sizes

Content of illustration

Figure 1 shows sounds of the vowel /o/ produced by a child (age 8), a woman and a man. Each speaker produced sounds at different F0 so that the sounds of the three speakers (representing the three main speaker groups according to age and gender) can be compared at both different and similar F0. The comparison shows that age- and gender-related differences ≤ 1.5 kHz, as given in formant statistics for citation-form words, can decrease or even disappear if the F0 of the vocalisations corresponds for children, women and men. In this regard, comparisons of vocalisations of /o/ are of special interest (and shown first) because an F0-dependence of the lower spectral frequency range can be observed for F0 clearly below statistical F1, and because the frequency range ≤ 1.5 kHz covers the entire range related to the vowel identity in question. Data for speakers, ranges of F0 and calculated F1 and F2:

Figure 2 demonstrates the same phenomenon for sounds of the vowel /e/ produced by a child (age 10), a woman and a man, concerning the lowest spectral peak and calculated F1. Data for speakers, ranges of F0 and calculated F1:

Similar indications as shown for sounds of /e/ can be found for sounds of /ø/.

Figure 3 demonstrates the same phenomenon for sounds of the vowel /u/ produced by a child (age 8), a woman and a man. However, only the first lower peak and calculated F1 are discussed because, for several sounds, an interpretation of F2 lacks methodological substantiation. Data for speakers, ranges of F0 and calculated F1:

Figure 4 demonstrates the same phenomenon for sounds of the vowel /i/ produced by a child (age 8), a woman and a man, concerning the lower spectral peak and calculated F1. Data for speakers, ranges of F0 and calculated F1:

Similar indications as shown for sounds of /i/ can be found for sounds of /y/.

With regard to sounds of /a–ɑ/, compiling corresponding sound series similar to those presented for the other vowels is often difficult for two main reasons: spectral peaks and formant patterns often do not shift markedly with rising F0, and children often produce a very open /a/, while many adults produce an intermediate sound of /a–ɑ/ or even a sound of /ɑ/, although all speakers speak the same language and live in a geographically limited area. However, Figure 5 demonstrates a case of comparable vowel spectra and comparable formant patterns for sounds of /a/ produced by a child (age 10), a woman and a man. Data for speakers, ranges of F0 and calculated F1:

The sounds presented in the previous figures may raise the question of whether, with rising F0 and the related shifts of the lower spectral peaks and of the calculated lower formants, the perceived age and gender of the speaker change, i.e. whether the sounds of adults are perceived as produced by children at F0 > c. 260 Hz, and whether the sounds of men are perceived as produced by women at F0 > c. 200 Hz. This may indeed be the case when comparing the sounds of some speakers, while it does not hold true for others. To demonstrate the latter, Figure 6 shows similar vowel spectra and similar formant patterns for sounds of the vowel /o/ produced by a child (age 10), a woman (untrained speaker) and a man (classical opera singer, baritone). For these sounds, the perceived vowel quality corresponds very well. However, the baritone is perceived as such at all F0 of his singing, which is reflected in his vowel spectra by a so-called “singer’s formant cluster”. (Again, only the first lower peak and calculated F1 are discussed since most sounds exhibit only one spectral peak; for these sounds, the calculated F2 is weak and its role in vowel perception is questionable; see Section M7.1.) Data for speakers, ranges of F0 and calculated F1:

As a direct consequence of the documented observations, it follows that, for back vowels, the sounds of men (at higher F0) may exhibit higher vowel-related spectral peaks and higher calculated F1 or F1–F2 patterns than the sounds of women (at lower F0). The same holds true for the lowest spectral peak and calculated F1 of front vowels and may also occur when comparing sounds of adults and children.

Figure 7 shows such an “inversion” of expected age- and gender-related differences by comparing sounds of the vowel /o/ produced by a child and a man, selected from the sound series of the previous Figure 6. If the F0 of the sounds of the man substantially exceeds the F0 of a sound of the child, the first spectral peak and calculated F1 of the sounds of the man are also above the corresponding peak and F1 of the sound of the child (compare Spectra 7-1 to 7-3). The same holds true for calculated F2, but as mentioned, the measurement and perceptual role of F2 are in question. However, if the comparison relates to the sounds of the man at F0 corresponding to statistical values (given for citation-form words), the first spectral peak and calculated F1 (and F2) are lower for the man than for the child, as is generally expected (see Spectra 7-4 and 7-5). Data for speakers, ranges of F0 and calculated F1 (and F2), in the order of F0:

Figure 8 demonstrates the same inversion < 1.5 kHz by comparing selected sounds of the vowel /e/ shown in Figure 2. Data for speakers and ranges of F0 and calculated F1:

Figure 9 demonstrates the same inversion < 1.5 kHz by comparing selected sounds of the vowel /u/ shown in Figure 3. Data for speakers and ranges of F0 and calculated F1:

Figure 10 demonstrates the same inversion < 1.5 kHz by comparing selected sounds of the vowel /i/ shown in Figure 4. Data for speakers and ranges of F0 and calculated F1:

Comparisons are limited to children and men because the corresponding differences in the vocal-tract sizes are assumed to be highest.
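The "inversion" criterion used in these comparisons can be stated compactly in code. The following is only a minimal sketch under stated assumptions: the class `Sound`, the function `is_inverted` and all numerical values are hypothetical illustrations introduced here, not measurements from the figures. It merely checks whether the speaker with the presumably larger vocal tract (the man) shows a higher calculated F1 than the speaker with the smaller vocal tract (the child).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sound:
    """One vowel sound with its F0 and calculated F1 (both in Hz)."""
    speaker: str
    f0_hz: float
    f1_hz: float


def is_inverted(smaller_tract: Sound, larger_tract: Sound) -> bool:
    """Return True if the larger-tract speaker has the HIGHER calculated F1.

    Formant statistics lead one to expect a lower F1 for the larger vocal
    tract, so a higher F1 counts as an "inversion" of that expectation.
    """
    return larger_tract.f1_hz > smaller_tract.f1_hz


# Hypothetical placeholder values, NOT data from the figures.
child = Sound("child", f0_hz=220.0, f1_hz=450.0)
man_high_f0 = Sound("man", f0_hz=330.0, f1_hz=520.0)  # F0 well above the child's
man_low_f0 = Sound("man", f0_hz=110.0, f1_hz=380.0)   # F0 near statistical values

print(is_inverted(child, man_high_f0))  # inversion present
print(is_inverted(child, man_low_f0))   # expected ordering
```

With these placeholder values, the first comparison yields an inversion and the second the generally expected ordering; real comparisons would use the F0 and calculated F1 values reported for the individual spectra.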

For earlier accounts, see Maurer, Cook, Landis, and d’Heureuse (1992) and Maurer, Suter, Friedrichs, and Dellwo (2015b); note also some related reflections in Potter and Steinberg (1950).

Link to the spectra of the Figures

Figure 1: Sounds of /o/ produced by a child, a woman and a man at comparable levels of F0.

>> Link to Figure 1

Figure 2: Sounds of /e/ produced by a child, a woman and a man at comparable levels of F0.

>> Link to Figure 2

Figure 3: Sounds of /u/ produced by a child, a woman and a man at corresponding F0.

>> Link to Figure 3

Figure 4: Sounds of /i/ produced by a child, a woman and a man at corresponding F0.

>> Link to Figure 4

Figure 5: Sounds of /a/ produced by a child, a woman and a man at comparable levels of F0.

>> Link to Figure 5

Figure 6: Sounds of /o/ produced by a child, a woman (untrained speaker) and a man (professional opera singer, baritone) at comparable levels of F0.

>> Link to Figure 6

Figure 7: "Inverted" age- or size-related differences in vowel-related lower spectral peak(s) and calculated F1 (and F2) for sounds of /o/.

>> Link to Figure 7

Figure 8: "Inverted" age- or size-related differences in vowel-related lower spectral peak(s) and calculated F1 (and F2) for sounds of /e/.

>> Link to Figure 8

Figure 9: "Inverted" age- or size-related differences in vowel-related lower spectral peak(s) and calculated F1 (and F2) for sounds of /u/.

>> Link to Figure 9

Figure 10: "Inverted" age- or size-related differences in vowel-related lower spectral peak(s) and calculated F1 (and F2) for sounds of /i/.

>> Link to Figure 10