Understanding the mix: Clear cases and ambiguity in infant word-referent learning.

Smith, L.

Indiana University, USA

The world offers data to novice word learners in the form of word-scene co-occurrences. Many theories of early word-referent learning begin with the assumption that these data are noisy, with many spurious co-occurrences between heard words and scene elements. Theories and experiments have shown how infant word learners might use statistical inference procedures to find the underlying system of words and referents in the noisy data. However, other research shows that amidst the ambiguity that characterizes very early learning environments, there are also moments of great clarity in which the referent of a heard word is visually salient and dominant over competitors. If enough of the early data are relatively clean, infants might not need powerful statistical inference mechanisms at all; they could simply ignore moments that are too uncertain and learn only from the clear cases. I will present findings on the statistical regularities of words and scenes in infant learning environments and show that there is a mix of clear and unclear cases. I will then present findings from both experimental and modeling studies showing that infant word-referent learning critically depends on this mix. The talk will finish with a consideration of what we do not know, but need to know, about infant statistical word-referent learning.