What are the units? Factoring morphological statistics into the learning of verbs

Willits, J. & Jones, M.

Indiana University

Statistical learning mechanisms appear to be important to many aspects of language acquisition. But what are the units over which statistical learning operates: phonemes, syllables, morphemes, or words? Claims about the sufficiency of statistical learning mechanisms depend on which units are assumed. Previous research has shown that infants do not recognize verbs in fluent speech until 10-13 months, compared with 7.5 months for nouns. This difference makes sense given the words' relative distributional statistics: noun statistics provide better segmentation cues. However, the statistical difference disappears in the presence of inflections like -ing. Consistent with these morpheme-level statistics, Willits, Seidenberg, & Saffran (2009) found that 7.5-month-old infants presented with verbs in -ing contexts do show evidence of recognizing verbs. Here we present an analysis of infant and toddler verb learning based on statistics in child-directed speech, with the critical manipulation being how verb inflections are treated. The analysis incorporates a number of statistics about verbs, including their ease of articulation, word frequency, repetition frequency, and contextual diversity. All of these factors contribute significantly to predicting verbs' difficulty of acquisition. Multiple regression analyses also showed that the importance of these statistics changes drastically between 0 and 30 months of age. Frequency, repetition, and discourse-situation contextual diversity play important roles at younger ages, and lexical contextual diversity plays a critical role at older ages. Critically, all of these statistics provide a better account of verbs' age of acquisition if inflections are split off from verbs and counted as separate units in the analysis. Together, these studies help answer questions about which factors contribute to vocabulary development. They also address the scope of statistical learning-based theories of acquisition, suggesting that tracking morpheme-level statistics plays a critical role in infants' acquisition of their native language.