OS_20.2 - Information-theoretic measures of cognitive processing effort predict word-reading times

Frank, S.

Department of Cognitive, Perceptual and Brain Sciences, University College London, London, United Kingdom

In the computational psycholinguistics literature, it has been argued that the amount of information conveyed by each word in a sentence is a measure of the amount of cognitive effort needed to process that word. Two complementary formalizations of word information have been proposed: surprisal and entropy reduction. These quantify, respectively, the extent to which a word’s occurrence was unexpected, and the word’s effect on the uncertainty about the rest of the sentence. The goal of our study was to investigate whether both of these information measures indeed predict processing effort, as observed in word-reading times. A recurrent neural network was trained on 700,000 sentences (comprising 6.9 million word tokens; 7,754 types) from the British National Corpus. Next, the network generated surprisal and entropy reduction estimates for 5,043 word tokens of 361 sentences, selected from three novels on www.freeonlinenovels.com. Reading times on the same words were collected in a self-paced reading task involving 54 native English speakers. Mixed-effects regression analyses showed that both surprisal and entropy reduction are positively related to word-reading time. This supports the hypothesis that more cognitive effort is required to process words that convey more information, and suggests that both unexpectedness and uncertainty reduction quantify information content.
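To make the two measures concrete, here is a minimal Python sketch on a toy language. It is not the recurrent neural network used in the study: the sentence set, its probabilities, and the function names (`consistent`, `entropy`, `surprisal`, `entropy_reduction`) are all hypothetical and chosen only for illustration. The sketch assumes that surprisal is the negative log probability of a word given the preceding words, and that entropy reduction is the plain difference in entropy over possible sentence completions before and after the word, with no clipping of negative values.

```python
import math

# Hypothetical toy "language": complete sentences with made-up probabilities.
SENTENCES = {
    ("the", "dog", "barks"): 0.4,
    ("the", "dog", "sleeps"): 0.2,
    ("the", "cat", "sleeps"): 0.3,
    ("a", "cat", "sleeps"): 0.1,
}


def consistent(prefix):
    """Sentences that start with `prefix`, with probabilities renormalized."""
    n = len(prefix)
    matches = {s: p for s, p in SENTENCES.items() if s[:n] == tuple(prefix)}
    total = sum(matches.values())
    return {s: p / total for s, p in matches.items()}


def entropy(prefix):
    """Uncertainty (in bits) about the full sentence, given the words so far."""
    return -sum(p * math.log2(p) for p in consistent(prefix).values())


def surprisal(prefix, word):
    """Unexpectedness (in bits) of `word` after `prefix`: -log2 P(word | prefix)."""
    n = len(prefix)
    p_word = sum(p for s, p in consistent(prefix).items()
                 if len(s) > n and s[n] == word)
    return math.inf if p_word == 0 else -math.log2(p_word)


def entropy_reduction(prefix, word):
    """Drop in sentence-level uncertainty caused by reading `word`."""
    return entropy(prefix) - entropy(list(prefix) + [word])


if __name__ == "__main__":
    print(f"{surprisal(['the'], 'dog'):.3f}")                   # 0.585
    print(f"{entropy_reduction(['the', 'dog'], 'barks'):.3f}")  # 0.918
```

In this toy setting, "barks" after "the dog" removes all remaining uncertainty about the sentence, so its entropy reduction equals the full entropy before the word. In the study itself, both quantities were estimated by the trained network and then entered as predictors of reading time.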