Materials Part II

M3: Vowels and Number of Formants

Formant merging - Spurious formant - “Flat” vowel spectra

Formant merging

“If you know you are analyzing a low back vowel, don’t be surprised to find one thick bar on the spectrogram that really corresponds to two formants close together below 1,000 Hz.” (Ladefoged, 2003, p. 114)

Referring to vocalisations of /ɔ/ as in caught: “When the formants are close together […] neither the wide- nor the narrowband spectrum gives a good indication of the formant frequencies. […] The first two formants appear as a single peak below 1,000 Hz. Their frequencies cannot be determined from these spectra.” (Ladefoged, 2003, pp. 119–120)
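
The merging described in these two passages can be illustrated with a deliberately crude toy model, which is not a vocal tract model and is not taken from Ladefoged (2003). Each formant is represented by a Gaussian bump of equal height, and a single width parameter stands in for the combined effect of formant bandwidth and analysis smoothing. The formant values (500 and 900 Hz), the two widths, and the function names smoothed_envelope and count_peaks are assumptions chosen only for illustration.

```python
import math

def smoothed_envelope(freqs_hz, formants_hz, width_hz):
    """Sum of equal-height Gaussian bumps, one per formant (toy model)."""
    return [
        sum(math.exp(-((f - centre) ** 2) / (2 * width_hz ** 2))
            for centre in formants_hz)
        for f in freqs_hz
    ]

def count_peaks(values):
    """Count strict local maxima, ignoring the two end points."""
    return sum(
        1 for i in range(1, len(values) - 1)
        if values[i - 1] < values[i] > values[i + 1]
    )

freqs = list(range(0, 1501, 10))   # 0 to 1,500 Hz in 10 Hz steps
formants = [500, 900]              # hypothetical F1 and F2, both below 1,000 Hz

narrow = smoothed_envelope(freqs, formants, width_hz=100)
broad = smoothed_envelope(freqs, formants, width_hz=250)

print(count_peaks(narrow))  # 2: the two formants are resolved
print(count_peaks(broad))   # 1: the two formants appear as a single peak
```

In this toy model, two equal Gaussian bumps fuse into one maximum once their spacing falls below twice the width. Real spectra also depend on the harmonic spacing and the relative formant amplitudes, which the sketch ignores.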

Spurious formant

“Sometimes it is not immediately obvious whether a particularly wide band represents one formant or two. Figure 5.8 is a spectrogram of the word bud, spoken by a female speaker of Californian English. There is a wide band below 1,000 Hz, but is this one formant or two formants close together as in Figure 5.7? Noting that there is a clear formant at about 1,500 Hz in Figure 5.8, and additional formants higher, we must take it that there is only a single formant below 1,000 Hz. It seems that there is some kind of extra formant near the first formant, making this dark bar wider. From the evidence of this one vowel it is impossible to say whether the additional energy is above or below the first formant. Further analysis of this speaker’s voice showed that there was often energy around the 1,000 Hz region, irrespective of the vowel. This spurious formant is not connected with the vowel quality, but is simply a characteristic of the particular speaker’s voice. This is a good example of the necessity of looking at a representative sample of a speaker’s voice before making any measurements of the formants.” (Ladefoged, 2003, pp. 114–115)

“Flat” vowel spectra

“Flat-spectrum stimuli, consisting of many equal‐amplitude harmonics, produce timbre sensations that can depend strongly on the phase angles of the individual harmonics. For fundamental frequencies in the human pitch range, many realizable timbres have vowel-like perceptual qualities. This observation suggests the possibility of constructing intelligible voiced speech signals that have flat-amplitude spectra.” (Schroeder & Strube, 1986)
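
As a minimal sketch of what “flat-spectrum stimuli” with different phase angles means, the following code builds one period of two signals from the same equal-amplitude harmonics and changes only the phases. The number of harmonics (16), the sampling grid (256 samples per period), and the quadratic phase pattern are assumptions chosen for illustration and are not the stimuli used by Schroeder & Strube (1986). The sketch shows only that the amplitude spectrum stays flat while the waveform changes; it says nothing about the resulting timbre.

```python
import cmath
import math

def harmonic_complex(num_harmonics, phases, samples_per_period):
    """One period of a sum of unit-amplitude cosines at harmonics 1..N."""
    return [
        sum(math.cos(2 * math.pi * k * n / samples_per_period + phases[k - 1])
            for k in range(1, num_harmonics + 1))
        for n in range(samples_per_period)
    ]

def harmonic_amplitudes(signal, num_harmonics):
    """Amplitude of each harmonic, from a direct DFT over one period."""
    size = len(signal)
    amps = []
    for k in range(1, num_harmonics + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * n / size)
                    for n, x in enumerate(signal))
        amps.append(2 * abs(coeff) / size)
    return amps

def crest_factor(signal):
    """Peak absolute value divided by the RMS value."""
    rms = math.sqrt(sum(x * x for x in signal) / len(signal))
    return max(abs(x) for x in signal) / rms

N = 16         # number of harmonics (assumed)
SAMPLES = 256  # samples per fundamental period (assumed)

zero_phases = [0.0] * N
# Quadratic phase pattern, an illustrative choice only.
quadratic_phases = [-math.pi * k * (k - 1) / N for k in range(1, N + 1)]

pulse_like = harmonic_complex(N, zero_phases, SAMPLES)
spread_out = harmonic_complex(N, quadratic_phases, SAMPLES)

# Both signals have the same flat amplitude spectrum ...
print([round(a, 6) for a in harmonic_amplitudes(pulse_like, N)])
print([round(a, 6) for a in harmonic_amplitudes(spread_out, N)])
# ... but clearly different waveforms, summarised here by the crest factor.
print(crest_factor(pulse_like), crest_factor(spread_out))
```

With all phases at zero, the harmonics add up to a sharp pulse once per period. With the quadratic phases, the same energy is spread across the period, so the crest factor is lower even though every harmonic keeps amplitude 1.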