Infants' audiovisual speech integration does not hinge on phonetic knowledge

Baart, M.1, Vroomen, J.2, Shaw, K.3 & Bortfeld, H.3,4

1 Basque Center on Cognition, Brain and Language, Donostia, Spain
2 Tilburg University, Dept. of Cognitive Neuropsychology, Tilburg, the Netherlands
3 University of Connecticut, Dept. of Psychology, Storrs, CT, USA
4 Haskins Laboratories, New Haven, CT, USA

Infants and adults are able to match auditory and visual speech, but the cues on which they rely (viz., temporal, energetic, and phonetic correspondence between the auditory and visual speech streams) may differ. Here, we assessed the relative contribution of these cues using sine-wave speech (SWS). Adults (N=52) and infants (N=30) matched two trisyllabic speech sounds ('kalisu' and 'mufapi'), presented as either natural speech or SWS, with visual speech information. On each trial, adults saw two articulating faces and matched a sound to one of them, while infants were presented with the same stimuli in a preferential looking paradigm. Adults were nearly flawless with natural speech, but significantly less accurate with SWS. In contrast, infants looked longer at the articulating face that matched the sound, irrespective of whether it was natural speech or SWS. These findings are in line with a multi-stage view of audiovisual speech integration and suggest that phonetic knowledge becomes perceptually more important over the course of development.
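
For readers unfamiliar with the stimulus manipulation, the sketch below illustrates the general idea behind sine-wave speech: the utterance is reduced to a few time-varying sinusoids that track the centre frequencies and amplitudes of its formants, retaining temporal and energetic structure while degrading natural phonetic detail. This is a minimal, hypothetical Python illustration that assumes formant tracks have already been estimated elsewhere (e.g., with an LPC-based tracker); it is not the procedure used to generate the stimuli in this study.

```python
import numpy as np


def synthesize_sws(formant_freqs, formant_amps, frame_rate, sample_rate=16000):
    """Resynthesize an utterance as sine-wave speech (SWS).

    formant_freqs, formant_amps -- arrays of shape (n_frames, n_formants)
    holding per-frame formant centre frequencies (Hz) and amplitudes,
    assumed to have been estimated from the natural recording beforehand.
    """
    n_frames, n_formants = formant_freqs.shape
    n_samples = int(n_frames * sample_rate / frame_rate)
    t_frames = np.arange(n_frames) / frame_rate
    t = np.arange(n_samples) / sample_rate

    signal = np.zeros(n_samples)
    for k in range(n_formants):
        # Upsample the frame-level tracks to the audio sample rate.
        freq = np.interp(t, t_frames, formant_freqs[:, k])
        amp = np.interp(t, t_frames, formant_amps[:, k])
        # Integrate the instantaneous frequency to obtain the phase, then
        # add this formant's amplitude-modulated sinusoid.
        phase = 2.0 * np.pi * np.cumsum(freq) / sample_rate
        signal += amp * np.sin(phase)

    # Normalise to avoid clipping when written to an audio file.
    return signal / np.max(np.abs(signal))


# Toy demonstration with made-up, stationary formant tracks (placeholder
# values only; real SWS uses tracks measured from a natural utterance).
if __name__ == "__main__":
    frames = 200                                        # 2 s at a 100 Hz analysis rate
    freqs = np.tile([500.0, 1500.0, 2500.0], (frames, 1))
    amps = np.tile([1.0, 0.6, 0.3], (frames, 1))
    sws = synthesize_sws(freqs, amps, frame_rate=100)
```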