[PS-2.10] Corpus is better than cloze: comparing two predictability measures

Lopukhina, A. 1, Lopukhin, K. 2 & Laurinavichyute, A. 1

1 Center for Language and Brain, National Research University Higher School of Economics
2 Scrapinghub

When reading or listening, people can anticipate upcoming words based on the context. Each word therefore has a degree of predictability, which is usually measured with the cloze task. Although convenient, this task has been criticized for its systematic lexical and semantic biases (Staub et al., 2015) and for providing no information about very improbable continuations, which may nevertheless affect processing difficulty (Smith and Levy, 2013). To overcome these limitations, several studies have turned to corpus data to quantify predictability. Although corpus-based probabilities are sometimes used as a substitute for cloze, we still don't know whether they are better than cloze.
We compared cloze probabilities with corpus-based probabilities obtained from an LSTM recurrent neural network language model trained on a large corpus, in terms of the amount of variance in eye movements during reading that each type of probability explains. To do so, we fitted linear models that explain variance at the item level, with eye-movement measures averaged across participants. Following Hofmann et al. (2017), we calculated the Pearson correlation coefficient r and squared it to determine the amount of explained variance (r-squared). We used two baseline features, word frequency and the relative position of a word in the sentence, and added to them either cloze probabilities, corpus-based probabilities, or both.
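The model comparison can be illustrated with a minimal sketch. It makes several assumptions that are not stated in the abstract: the data are synthetic, the variable names (freq, pos, cloze, corpus, trt) are hypothetical, the linear model is plain ordinary least squares with an intercept, and r is taken between the fitted and the observed item-level values. It is not the authors' analysis code.

```python
import random
from statistics import mean


def solve(a, b):
    """Solve the square system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_predict(rows, y):
    """Fit ordinary least squares with an intercept; return in-sample fitted values."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return [sum(b * v for b, v in zip(beta, x)) for x in X]


def pearson_r(u, v):
    """Pearson correlation coefficient between two equal-length sequences."""
    mu, mv = mean(u), mean(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var_u = sum((a - mu) ** 2 for a in u)
    var_v = sum((b - mv) ** 2 for b in v)
    return cov / (var_u * var_v) ** 0.5


def r_squared(rows, y):
    """Explained variance: squared correlation of fitted and observed values."""
    return pearson_r(fit_predict(rows, y), y) ** 2


# Synthetic item-level data (assumption: invented for illustration only).
rng = random.Random(0)
n = 200
freq = [rng.gauss(0, 1) for _ in range(n)]             # word frequency (standardized)
pos = [rng.random() for _ in range(n)]                 # relative position in sentence
cloze = [rng.random() for _ in range(n)]               # cloze probability
corpus = [0.6 * c + 0.4 * rng.random() for c in cloze] # corpus-based probability
trt = [300 - 20 * f + 10 * p - 40 * c - 30 * k + rng.gauss(0, 25)
       for f, p, c, k in zip(freq, pos, cloze, corpus)]  # mean total reading time

# Baseline features alone, then with each predictability measure added.
r2 = {
    "baseline": r_squared(list(zip(freq, pos)), trt),
    "cloze": r_squared(list(zip(freq, pos, cloze)), trt),
    "corpus": r_squared(list(zip(freq, pos, corpus)), trt),
    "both": r_squared(list(zip(freq, pos, cloze, corpus)), trt),
}
for name, value in r2.items():
    print(f"{name}: r-squared = {value:.3f}")
```

Because the models are nested and evaluated in-sample, adding a predictor can never lower r-squared in this setup, so the informative comparison is between the cloze-only and corpus-only models.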
We found that corpus-based probabilities explain more variance in eye movements during reading than cloze probabilities, although the difference is not significant. For example, for total reading time, r-squared is 0.476 with cloze probabilities, 0.505 with corpus-based probabilities, and 0.512 with both types of probabilities. We suggest that corpus-based probabilities should be used in psycholinguistic experiments, as they are free from the biases of the cloze task, usually contain more information about possible continuations of a sentence than cloze data, and are much easier to obtain.