The applet enables researchers to estimate an infant's or toddler's vocabulary size from the child's score on the MacArthur-Bates CDI, Words and Gestures (CDI-WG) or Words and Sentences (CDI-WS).

The mapping from CDI score to vocabulary size is highly non-linear; the difference between CDI score and vocabulary size increases dramatically with age and vocabulary. Estimating vocabulary size enables researchers to perform unbiased correlational analyses, e.g., between indices of cognitive development and lexical development, by effectively removing the ceiling effect introduced when infants and toddlers know a substantial fraction of the words on the CDI forms.
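To make the ceiling effect concrete, here is a minimal sketch using a toy saturating curve. It is not the Mayor and Plunkett model: the form length, the saturation scale, the functions `toy_score` and `toy_vocabulary`, and the synthetic developmental index are all assumptions chosen for illustration. The point is only that a score bounded by the length of a checklist weakens a correlation that is perfect on the underlying vocabulary scale, and that inverting the curve restores it.

```python
import math

FORM_LENGTH = 400   # hypothetical number of checklist items, not the real CDI length
SCALE = 600.0       # hypothetical saturation scale of the toy curve


def toy_score(vocabulary: float) -> float:
    """Toy saturating checklist score: approaches FORM_LENGTH as vocabulary grows."""
    return FORM_LENGTH * (1.0 - math.exp(-vocabulary / SCALE))


def toy_vocabulary(score: float) -> float:
    """Inverse of toy_score; valid for 0 <= score < FORM_LENGTH."""
    return -SCALE * math.log(1.0 - score / FORM_LENGTH)


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Synthetic children: a made-up developmental index that tracks vocabulary linearly.
vocabularies = [100.0 * i for i in range(1, 21)]   # 100 .. 2000 words
index = [0.01 * v + 5.0 for v in vocabularies]
scores = [toy_score(v) for v in vocabularies]       # compressed near the ceiling
recovered = [toy_vocabulary(s) for s in scores]     # curve inverted

r_raw = pearson(index, scores)
r_corrected = pearson(index, recovered)
print(f"r(index, raw score)       = {r_raw:.3f}")
print(f"r(index, recovered vocab) = {r_corrected:.3f}")
```

In this toy setting the raw-score correlation falls below the correlation computed on the recovered vocabulary, purely because the score flattens as it approaches the form length. The applet's actual correction is the one described in the reference below, not this curve.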

Reference: Mayor, J. and Plunkett, K. (in press). A Statistical Estimate of Infant and Toddler Vocabulary Size from CDI Reports. Developmental Science.

For the last twenty years, developmental psychologists have measured the variability in lexical development of infants using the MacArthur-Bates Communicative Development Inventories (CDIs), the most widely used parental report forms for assessing language and communication skills in infants and young children. We show that CDI reports can serve as a basis for estimating infants' total vocabulary sizes, beyond serving as a tool for assessing their language development relative to other infants. We investigate the link between estimated total vocabulary size and raw CDI scores from a mathematical perspective, using both single developmental trajectories and population data. The method capitalises on robust regularities, such as the overlap of individual vocabularies observed across infants and toddlers, and takes into account both shared knowledge and idiosyncratic knowledge. This statistical approach enables researchers to approximate the total vocabulary size of an infant or a toddler, based on the child's raw MacArthur-Bates CDI score. Using the model, we propose new normative data for productive and receptive vocabulary in early childhood, as well as a tabulation that relates individual CDI measures to realistic lexical estimates. The correction required to estimate total vocabulary is non-linear, with a far greater impact at older ages and higher CDI scores. Therefore, we suggest that developmental indices should be correlated with vocabulary size as estimated by the model rather than with raw CDI scores.