Abstract
The audiovisual speech signal contains multimodal information to phrase boundaries. In three artificial language learning studies with 12 groups of adult participants we investigated whether English monolinguals and bilingual speakers of English and a language with opposite basic word order (i.e., in which objects precede verbs) can use word frequency, phrasal prosody and co-speech (facial) visual information, namely head nods, to parse unknown languages into phrase-like units. We showed that monolinguals and bilinguals used the auditory and visual sources of information to chunk “phrases” from the input. These results suggest that speech segmentation is a bimodal process, though the influence of co-speech facial gestures is rather limited and linked to the presence of auditory prosody. Importantly, a pragmatic factor, namely the language of the context, seems to determine the bilinguals’ segmentation, overriding the auditory and visual cues and revealing a factor that begs further exploration.
Author supplied keywords
Cite
CITATION STYLE
de la Cruz-Pavía, I., Werker, J. F., Vatikiotis-Bateson, E., & Gervain, J. (2020). Finding Phrases: The Interplay of Word Frequency, Phrasal Prosody and Co-speech Visual Information in Chunking Speech by Monolingual and Bilingual Adults. Language and Speech, 63(2), 264–291. https://doi.org/10.1177/0023830919842353
Register to see more suggestions
Mendeley helps you to discover research relevant for your work.