Speech Rhythm
Neuroscientists are often interested in the temporal correspondence between speech and a listener's brain response. For instance, many hypothesise that the coordinated activity of groups of neurons may become coupled in time with recurring elements of speech rhythm. The unit of this recurrence is widely speculated to be the syllable, a phonetically defined construct. Acoustically speaking, syllables in natural, continuous speech are difficult to define; some researchers, however, believe that characteristic peaks in the speech envelope closely approximate syllables. The limits of this correspondence have not been empirically established, especially for spontaneous natural speech.
We systematically compared several methods of speech envelope extraction to determine the similarity between manually annotated syllables and peaks in the speech envelope. We found that the choice of envelope extraction algorithm has nontrivial consequences for how closely the envelope resembles the annotated vowel time series. Moreover, algorithmic performance was strongly modulated by language and speaking style. Our results call into question the equivalence of the syllabic time series and the time series calculated from acoustic features and, in addition, underscore possible limitations of applying linguistic-theoretical concepts in speech perception research.
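To make the kind of comparison described above concrete, the following is a minimal sketch of one possible pipeline: extract an envelope, pick its peaks, and match them against annotated times. It is not the procedure used in this study. The envelope method (full-wave rectification followed by a moving average), the synthetic test signal, the function names and every parameter value (window length, peak threshold, minimum peak spacing, matching tolerance) are assumptions chosen only for illustration.

```python
import math

def rectified_envelope(signal, sample_rate, window_s=0.05):
    """Full-wave rectify, then smooth with a centred moving average
    (a crude low-pass filter; one of many possible envelope methods)."""
    half = max(1, int(window_s * sample_rate / 2))
    prefix = [0.0]  # prefix sums give an O(n) moving average
    for x in signal:
        prefix.append(prefix[-1] + abs(x))
    n = len(signal)
    env = []
    for i in range(n):
        lo, hi = max(0, i - half), min(n, i + half + 1)
        env.append((prefix[hi] - prefix[lo]) / (hi - lo))
    return env

def envelope_peaks(env, sample_rate, min_height, min_gap_s=0.1):
    """Times (s) of local maxima above min_height. Taller maxima are
    accepted first; any maximum within min_gap_s of one is dropped."""
    candidates = [i for i in range(1, len(env) - 1)
                  if env[i] >= min_height
                  and env[i] > env[i - 1] and env[i] >= env[i + 1]]
    accepted = []
    for i in sorted(candidates, key=lambda k: env[k], reverse=True):
        t = i / sample_rate
        if all(abs(t - other) >= min_gap_s for other in accepted):
            accepted.append(t)
    return sorted(accepted)

def match_peaks(peaks, annotated, tolerance_s=0.05):
    """Greedy one-to-one matching of annotated times to envelope peaks.
    Returns (hits, misses, false_alarms)."""
    unused = list(peaks)
    hits = 0
    for t in annotated:
        near = [p for p in unused if abs(p - t) <= tolerance_s]
        if near:
            unused.remove(min(near, key=lambda p: abs(p - t)))
            hits += 1
    return hits, len(annotated) - hits, len(unused)

# Synthetic stand-in for speech: a 200 Hz carrier whose amplitude rises
# and falls around three hypothetical "syllable" centres.
sample_rate = 2000
annotated = [0.2, 0.5, 0.8]  # hypothetical annotation times in seconds
signal = []
for k in range(sample_rate):  # one second of signal
    t = k / sample_rate
    amplitude = sum(math.exp(-0.5 * ((t - c) / 0.03) ** 2) for c in annotated)
    signal.append(amplitude * math.sin(2 * math.pi * 200 * t))

env = rectified_envelope(signal, sample_rate)
peaks = envelope_peaks(env, sample_rate, min_height=0.2)
hits, misses, false_alarms = match_peaks(peaks, annotated)
print(peaks, hits, misses, false_alarms)
```

On real recordings, each of these choices (the smoothing window, the peak threshold, the matching tolerance) changes which peaks are found and how many of them line up with the annotations, which is the sensitivity to the extraction algorithm that the comparison above is concerned with.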