Ongoing Research Projects

Maintenance of Subcategorical Information in Speech Perception

Suppose you hear someone say "I noticed a ?ent in my fender." How do you know what the "?" segment is? Of course, there is the acoustic information in the segment itself -- perhaps it is most consistent with "t", a little less with "d", and even less with "v", "l", etc. However, there is other information in the sentence, namely the subsequent context "fender", which is presumably most consistent with the speaker intending to produce "d". How do listeners come to a categorization decision in this case? What is interesting about spoken language is that it unfolds across time: if a listener commits to a categorization judgment (say, "t") early in the sentence, they have no way to integrate the subsequent context, because they have already made a binary decision. But if the listener maintains a richer (subcategorical) representation of the segment in memory, they can integrate the two pieces of information to understand the word. In a series of experiments, we have shown that listeners appear to take the latter strategy, successfully integrating cues that are distant in time. In current work, we are exploring the dynamics of this process further. For example, optimal cue integration predicts that the two cues (acoustics and context) should be combined in the same way regardless of their order; is this actually the case? While prior work has established that context is used whether it precedes or follows the acoustic information, the two orderings have not been directly tested and compared within the same study. Zac Longo, an undergraduate at the University of Hartford, is working on a study to test this question.
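
To make the intuition concrete, here is a small Python sketch with made-up numbers; it illustrates the general idea rather than any model from these papers. It contrasts a strategy that maintains graded evidence and multiplies the two likelihoods together, which gives the same answer no matter which cue arrives first, with an early-commitment strategy that categorizes on the acoustics alone and then has nothing left to combine with the context.

    # Illustrative sketch only: Bayesian combination of an acoustic cue and a
    # context cue for a "tent"/"dent" decision. All numbers are hypothetical.
    def normalize(d):
        total = sum(d.values())
        return {k: v / total for k, v in d.items()}

    prior = {"t": 0.5, "d": 0.5}
    p_acoustics = {"t": 0.6, "d": 0.4}   # acoustics slightly favor /t/
    p_context = {"t": 0.2, "d": 0.8}     # "...in my fender" strongly favors /d/

    # Strategy 1: keep graded (subcategorical) evidence and integrate both cues.
    # Multiplication is commutative, so the order of the cues cannot matter.
    posterior = normalize({k: prior[k] * p_acoustics[k] * p_context[k] for k in prior})

    # Strategy 2: commit to a category from the acoustics alone, discarding gradience.
    early_choice = max(p_acoustics, key=p_acoustics.get)

    print(posterior)     # {'t': ~0.27, 'd': ~0.73}: "dent" wins once context is integrated
    print(early_choice)  # 't': the early-committing listener is stuck with the wrong word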

Related publications: Bushong & Jaeger (under review); Bushong & Jaeger (2019) JASA; Bushong & Jaeger (2017) CogSci

Use of Contextual Information in Speech Adaptation

Speech is idiosyncratically variable: every person speaks with a slightly different "accent," even speakers from very similar backgrounds. And yet we typically have little trouble understanding a new acquaintance, because our minds engage in an adaptation process in which we learn the characteristics of the new person's speech. What might aid adaptation? From previous work in the lab (described above), we know that listeners can integrate multiple cues to understand a speech sound. It is possible that listeners use this information to update their expectations about a speaker's accent. Dan LaMarche, lab alumnus and PhD student at the University at Buffalo, is investigating this question within a distributional learning paradigm. Elayna Espinal, a current undergraduate student, assists on the project.
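
As a rough illustration of what a distributional-learning account can look like formally (a toy normal-normal belief update over a talker's voice onset times; the quantities and paradigm here are assumptions for illustration, not details of this project):

    # Illustrative sketch only: a listener tracks a new talker's mean VOT for /t/
    # with a normal belief and an assumed, known production variance.
    prior_mean, prior_var = 70.0, 100.0   # prior belief about /t/ VOT (ms) and its uncertainty
    production_var = 225.0                # assumed trial-to-trial variability (ms^2)

    observed_vots = [52.0, 48.0, 55.0, 50.0, 47.0]   # hypothetical tokens from the new talker

    # Conjugate normal-normal update: the belief shifts toward the sample mean
    # in proportion to how many tokens have been heard.
    n = len(observed_vots)
    sample_mean = sum(observed_vots) / n
    post_var = 1.0 / (1.0 / prior_var + n / production_var)
    post_mean = post_var * (prior_mean / prior_var + n * sample_mean / production_var)

    print(round(post_mean, 1), round(post_var, 1))   # belief shifts from 70 ms to roughly 56.5 ms

Contextual information (for example, a disambiguating following word) could in principle supply the category labels that make such an update possible, which is one way cue integration might feed adaptation.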

Modeling Cue Integration in Word Recognition

The projects above are part of an effort to characterize the basic properties of how listeners use multiple cues in speech perception and spoken word recognition. However, these behavioral results alone are potentially consistent with many different theories of memory and cue integration. It is therefore critical that we develop formal quantitative models that make precise predictions about results in the paradigms we use. Wednesday's dissertation took a first step in this direction, spelling out four cognitive mechanisms by which listeners might integrate contextual and acoustic cues in word recognition. Ongoing work seeks to relate these models to existing theories of speech perception and word recognition.
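
Purely as an illustration of how such models yield diverging quantitative predictions (these two toy mechanisms are stand-ins, not the four mechanisms from the dissertation), the sketch below generates predicted "dent"-response rates along a /t/-/d/ acoustic continuum under a full-integration mechanism versus a categorize-and-discard mechanism:

    # Illustrative sketch only: two hypothetical linking mechanisms that predict
    # different response curves in a context + acoustic-continuum paradigm.
    import math

    def p_d_given_acoustics(step, n_steps=7):
        # Toy psychometric function: higher continuum steps sound more /d/-like.
        return 1.0 / (1.0 + math.exp(-(step - (n_steps + 1) / 2)))

    CONTEXT_BIAS = 0.8   # hypothetical strength of a "dent"-biasing context

    def full_integration(step):
        # Maintain graded acoustic evidence and combine it with the context cue.
        p_d = p_d_given_acoustics(step)
        return (p_d * CONTEXT_BIAS) / (p_d * CONTEXT_BIAS + (1 - p_d) * (1 - CONTEXT_BIAS))

    def categorize_then_discard(step):
        # Commit to whichever category the acoustics favor; context never enters.
        return 1.0 if p_d_given_acoustics(step) > 0.5 else 0.0

    for step in range(1, 8):
        print(step, round(full_integration(step), 2), categorize_then_discard(step))

The point is only that different linking mechanisms predict measurably different response curves, which is what allows the behavioral paradigms above to discriminate among them.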

Related publications: Bushong (in prep); Bushong (2020); Bushong & Jaeger (2019) CMCL