Learning to predict rewards: episodes, averages, and integration

Shohamy, D.

Predicting rewards is central to adaptive behavior. How do we learn from past experience to predict which actions will lead to reward in the future? Recent advances suggest that repeated experience with reward serves to update the learned value of candidate cues and actions, leading to improved predictions over time. These advances have emerged from the field of reinforcement learning, which offers insight into the computational, cognitive, and neural mechanisms supporting the learning of stimulus-reward associations. At first glance, reinforcement learning shares many features with statistical learning: both involve gradual, incremental learning processes that capture regularities in the temporal coupling of experienced events. However, the comparison also exposes fundamental gaps in the ability of reinforcement-learning models to account for many aspects of reward and value learning. In this talk I will share recent studies showing that even seemingly simple forms of reward learning involve the construction of stimulus-stimulus associative structures, encoding rich information about context, trial-unique features, and cue configurations. This learning involves interactions between the striatum and the hippocampus, two learning systems typically considered to operate in isolation. Together, these findings suggest a new framework for how multiple forms of learning work together to support the integration of relational stimulus-stimulus learning with stimulus-reward associations, providing a mechanism for building a rich, predictive model of the world that is well-suited to guide flexible behavior.
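The incremental value updating the abstract refers to is commonly formalized as a delta rule (as in Rescorla-Wagner or temporal-difference models). The sketch below is a minimal illustration of that idea only; the learning rate and reward values are chosen for demonstration and are not parameters from the talk.

```python
# Minimal sketch of delta-rule value updating, the incremental
# reward-learning mechanism the abstract describes. All numeric
# values here are illustrative assumptions, not from the talk.

def update_value(value: float, reward: float, alpha: float = 0.1) -> float:
    """Move the learned value toward the received reward by a
    fraction (alpha, the learning rate) of the prediction error."""
    prediction_error = reward - value
    return value + alpha * prediction_error

# Repeated experience with a rewarded cue incrementally improves
# the prediction, converging toward the true reward.
v = 0.0
for _ in range(50):
    v = update_value(v, reward=1.0)
print(f"learned value after 50 rewarded trials: {v:.3f}")
```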