Segmenting words of varying length: Statistics don't always win

Pomper, R., Potter, C., Benitez, V., & Saffran, J.

University of Wisconsin-Madison

Human learners consistently demonstrate a powerful ability to segment words in artificial language experiments (e.g., Saffran et al., 1996). However, few studies have explored exactly which units learners are abstracting in these tasks. Adult participants (N = 130) heard an artificial language containing both disyllabic (e.g., mufi) and trisyllabic (e.g., pabiku) words. Participants successfully segmented words from this stream, rating words as more familiar on a Likert scale than both non-words and part-words (e.g., bikumu). However, partial-word lures that either contained a disyllabic word (e.g., mufipa and kumufi) or consisted of the first two syllables of a trisyllabic word (e.g., pabi) were rated as familiar as words from the language. Lures lacking the first syllable of a trisyllabic word (e.g., biku) were rated as less familiar than words despite being statistically coherent, suggesting that participants were most sensitive to the co-occurrence of the first two syllables of words. Preliminary data from two-alternative forced-choice (2AFC) measures provide converging support for this pattern. These results suggest that participants do not weight all statistics equally. Ongoing studies are exploring how the composition of the artificial language affects learner biases.
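To make "statistically coherent" concrete, the following is a minimal sketch, assuming the relevant statistic is the forward transitional probability between adjacent syllables (the measure used by Saffran et al., 1996). The words mufi and pabiku come from the abstract; dalo and tegoni, the stream length, and the no-immediate-repetition rule are hypothetical choices made only for illustration and are not the study's actual materials.

```python
import random
from collections import Counter

# "mufi" and "pabiku" appear in the abstract; "dalo" and "tegoni" are
# hypothetical filler words, not the study's actual stimuli.
WORDS = ["mufi", "pabiku", "dalo", "tegoni"]


def syllabify(word):
    """Split a word into two-letter CV syllables (assumes even length)."""
    return [word[i:i + 2] for i in range(0, len(word), 2)]


def make_stream(n_words, seed=0):
    """Concatenate randomly ordered words with no immediate repetition
    (an assumed design choice) into one continuous syllable stream."""
    rng = random.Random(seed)
    stream, prev = [], None
    for _ in range(n_words):
        word = rng.choice([w for w in WORDS if w != prev])
        stream.extend(syllabify(word))
        prev = word
    return stream


def transitional_probabilities(stream):
    """Forward TP(a -> b) = count(a followed by b) / count(a followed by anything)."""
    pairs = Counter(zip(stream, stream[1:]))
    firsts = Counter(stream[:-1])
    return {(a, b): n / firsts[a] for (a, b), n in pairs.items()}


stream = make_stream(1000)
tp = transitional_probabilities(stream)

# Within-word transitions: both halves of "pabiku" are equally predictable.
print("pa -> bi:", tp[("pa", "bi")])
print("bi -> ku:", tp[("bi", "ku")])
# Across a word boundary (as in the part-word "bikumu"), predictability drops.
print("ku -> mu:", tp.get(("ku", "mu"), 0.0))
```

In this toy language every syllable belongs to exactly one word, so the transition inside biku is as predictable as the one inside pabi. A learner tracking only these probabilities would have no basis for rating the two lures differently, which is why the reported asymmetry points to something beyond the statistics alone.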