By Andries W. Coetzee
Apr 07, 2013
Five current and former members of our Department just had a paper appear in the Journal of the Acoustical Society of America. This paper is the result of a long collaboration between Pam Beddor, Kevin McGowan (Ph.D. 2011, currently at Rice University), Julie Boland, Andries Coetzee and Anthony Brasher. The paper combines some of Pam's longstanding research interests in the perception of coarticulated speech with Julie's experience and expertise in using eye-tracking to probe into language processing.
Because speech sounds influence the articulation, and hence the acoustic properties, of other speech sounds in their environment, a single phoneme does not have the same acoustic realization in every context in which it occurs. This study investigates how listeners cope with these coarticulatory effects between neighboring speech sounds, and it goes beyond existing research by using eye-tracking to measure perceptual processing. Most previous research on the perception of coarticulated speech has relied on behavioral measures such as button pushes. Although a lot can be learned about how listeners process coarticulated speech by analyzing the patterns in listeners' button pushes, this approach also misses much about real-time processing: button pushes reflect only the end-state of perceptual processing and cannot tell us anything about the moment-by-moment processing of coarticulated speech. Given the high temporal resolution of eye-tracking, however, it is possible to see how listeners respond to coarticulated speech as the signal unfolds over time. The study documents many new and interesting results. It finds, for instance, that listeners who arrive at the same final percept (i.e., who would have pushed the same button in a more traditional button-push experiment) do not necessarily arrive at that endpoint via the same processing route. These results answer some longstanding questions in the field of speech processing and open up exciting new avenues for research.
The full bibliographic information for the paper, as well as its abstract, is given below and is also available on the journal's website.
Beddor, Patrice Speeter, Kevin B. McGowan, Julie E. Boland, Andries W. Coetzee & Anthony Brasher. (2013) The time course of perception of coarticulation. Journal of the Acoustical Society of America, 133(4):2350-2366.
The perception of coarticulated speech as it unfolds over time was investigated by monitoring eye movements of participants as they listened to words with oral vowels or with late or early onset of anticipatory vowel nasalization. When listeners heard [CṼNC] and had visual choices of images of CVNC (e.g., send) and CVC (said) words, they fixated more quickly and more often on the CVNC image when onset of nasalization began early in the vowel compared to when the coarticulatory information occurred later. Moreover, when a standard eye movement programming delay is factored in, fixations on the CVNC image began to occur before listeners heard the nasal consonant. Listeners' attention to coarticulatory cues for velum lowering was selective in two respects: (a) listeners assigned greater perceptual weight to coarticulatory information in phonetic contexts in which [Ṽ] but not N is an especially robust property, and (b) individual listeners differed in their perceptual weights. Overall, the time course of perception of velum lowering in American English indicates that the dynamics of perception parallel the dynamics of the gestural information encoded in the acoustic signal. In real-time processing, listeners closely track unfolding coarticulatory information in ways that speed lexical activation.