[PS-1.24] A Cognitive Explanation for Zipfian Distributions? Low Entropy Is Beneficial for Language Learning

Lavi-Rotbain, O. & Arnon, I.

The Hebrew University of Jerusalem

Words in natural language follow a Zipfian distribution: a few words are very frequent while the rest are rare. This distribution is found consistently across languages and parts of speech, yet its source remains unclear. Here, we examine its possible cognitive roots by showing that low entropy is beneficial for language learning. We tested children's and adults' word segmentation in an artificial language across three levels of entropy: high (a uniform distribution of items), medium, and low. Entropy was reduced by making one word more frequent. For both age groups, segmentation was better in the low-entropy condition, both for the language as a whole and for the less frequent words, which were learned better than the words in the high-entropy condition despite appearing half as often (9 vs. 19 times). This facilitation was not driven by learning only the frequent word: accuracy was similar whether or not foils shared a syllable with that word. To our knowledge, this is the first demonstration that low entropy is beneficial for language learning, suggesting a cognitive explanation for Zipfian distributions in natural language. We discuss implications for artificial language learning experiments, which may underestimate learners' performance by using uniform distributions.
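To make the entropy manipulation concrete, a minimal sketch (our illustration; the six-word language and the specific probabilities below are hypothetical, not the authors' design): entropy here can be read as the Shannon entropy of the word-frequency distribution,

$$H = -\sum_{i=1}^{N} p_i \log_2 p_i,$$

which is maximal for a uniform language ($p_i = 1/N$, so $H = \log_2 N \approx 2.58$ bits for a hypothetical six-word language) and decreases as one word takes a larger share of the tokens (e.g., $p_1 = 1/2$ with the remaining five words equiprobable gives $H = 1 + \tfrac{1}{2}\log_2 5 \approx 2.16$ bits).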