Roete, I. (1, 2), Casillas, M. (1), Frank, S. (2), & Fikkert, P. (2)
(1) Max Planck Institute for Psycholinguistics, Nijmegen
(2) Centre for Language Studies, Radboud University, Nijmegen
Usage-based approaches to language acquisition (e.g., Tomasello, 2003) propose that children use multi-word utterances, or chunks, to build up grammatical knowledge from recurring patterns in their linguistic input. We investigate how the influence of this statistical, chunk-based learning on children's language production changes over time, using the CAPPUCCINO model (McCauley & Christiansen, 2011), which simulates child language production using chunks extracted from caregivers' speech.
We selected transcripts of conversations between 6 North American children and their caregivers, sampled at 6-month intervals between ages 1;0 and 4;0 (Providence corpus; Demuth, Culbertson, & Alter, 2006). After training the model on the caregivers' speech, we attempted to reconstruct the children's actual utterances from the transitional probabilities between the chunks detected during training. The number of child utterances that were reconstructed correctly decreased over time (beta = -0.720, SE = 0.157, p < 0.001). However, the number of utterances containing words or chunks that the caregivers did not use increased (beta = 0.547, SE = 0.064, p < 0.001). These results indicate that, over time, children's speech imitates the chunk sequences in their caregivers' speech less directly, partly because their chunk combinations become more inventive.
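To make the reconstruction step concrete, the following is a minimal sketch and not the CAPPUCCINO implementation itself. It assumes that caregiver and child utterances have already been segmented into chunks, uses simple forward transitional probabilities between chunks, and orders the child's chunks greedily. The names `train_transitions`, `transition_prob`, `reconstruct`, and the `START` marker are hypothetical, and the toy utterances are invented for illustration only.

```python
from collections import defaultdict

START = "<s>"  # hypothetical utterance-start marker (an assumption of this sketch)


def train_transitions(caregiver_utterances):
    """Count chunk-to-chunk transitions in pre-chunked caregiver utterances."""
    counts = defaultdict(lambda: defaultdict(int))
    for chunks in caregiver_utterances:
        prev = START
        for chunk in chunks:
            counts[prev][chunk] += 1
            prev = chunk
    return counts


def transition_prob(counts, prev, nxt):
    """Forward transitional probability P(nxt | prev); 0.0 if prev was never seen."""
    following = counts.get(prev, {})
    total = sum(following.values())
    return following.get(nxt, 0) / total if total else 0.0


def reconstruct(counts, child_chunks):
    """Greedily order an unordered bag of chunks by transitional probability."""
    remaining = list(child_chunks)
    ordered, prev = [], START
    while remaining:
        # Pick the chunk most likely to follow the previous one.
        best = max(remaining, key=lambda c: transition_prob(counts, prev, c))
        ordered.append(best)
        remaining.remove(best)
        prev = best
    return ordered


# Toy data, invented for illustration (not taken from the Providence corpus).
caregiver = [
    ["do you", "want", "the ball"],
    ["do you", "see", "the ball"],
    ["where is", "the ball"],
]
child_utterance = ["do you", "want", "the ball"]

counts = train_transitions(caregiver)
# Shuffle the child's chunks into a bag, then try to recover the original order.
reconstructed = reconstruct(counts, ["the ball", "want", "do you"])
print(reconstructed == child_utterance)  # True: the utterance is reconstructed correctly
```

In this sketch, an utterance counts as correctly reconstructed only when the recovered order matches the child's actual chunk order. A chunk that never occurs in the caregivers' speech receives a transitional probability of 0.0 from every context, which illustrates why utterances containing novel words or chunks cannot be reconstructed reliably from the caregiver input alone.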