Cross-Modal Effects in Statistical Learning

Open Access
Author:
Mitchel, Aaron Daniell
Graduate Program:
Psychology
Degree:
Doctor of Philosophy
Document Type:
Dissertation
Date of Defense:
December 14, 2010
Committee Members:
  • Daniel J. Weiss, Committee Chair
  • Judith F. Kroll, Committee Member
  • Reginald B. Adams, Committee Member
  • Chip Gerfen, Committee Member
Keywords:
  • multimodal integration
  • statistical learning
  • language acquisition
Abstract:
A central question of research on language acquisition concerns which types of information in the environmental input are available to language learners, as well as the extent to which learners utilize this information. Although research examining this issue has traditionally focused on the nature of the auditory input to language learners, there is a growing body of research indicating that the language learning environment is multimodal and that processes supporting language acquisition exploit cues in the visual domain (e.g., cues in the speaker’s face). In this dissertation, I examined how mechanisms of early language acquisition operate over multimodal input, focusing on processes underlying speech segmentation, an early obstacle faced by language learners. One mechanism believed to support speech segmentation is statistical learning, in which learners use distributional regularities in the input to identify word boundaries. Few studies have examined statistical learning in a multimodal context, despite evidence that perception is fundamentally multisensory. Thus, across a series of three studies, I investigated the interaction of visual and auditory input during statistical learning. In Chapter II, I tested adults’ ability to use distributional cues to segment streams of tone and shape triplets presented simultaneously. Learners were able to segment each stream as long as the triplet boundaries aligned across streams. When the streams were misaligned, performance dropped to chance. In Chapter III, I used an illusion arising from audiovisual integration (the McGurk effect) to alter the statistical representations of two languages, indicating that learners can integrate audio and visual input during statistical learning. Finally, in Chapter IV, I found that learners were able to use facial cues alone to segment a speech stream.
The results of these three studies provide evidence of cross-modal effects on statistical learning, suggesting that statistical learning does not occur independently across modalities. In addition, these results provide further evidence for the relevance of visual input, in particular cues in the speaker’s face, for processes supporting language acquisition. I conclude the dissertation with a discussion of how these findings inform models of statistical learning, as well as potential broader applications of these results to atypical language development.