Human subthalamic nucleus and globus pallidus internus carry information on word onsets and show speaker selectivity

Schepers, I. M. 1, 4, 5 , Ahrens, H. 2 , Beck, A. 3, 4 , Schwabe, K. 3, 4 , Abdallat, M. 3 , Krauss, J. K. 3, 4 & Rieger, J. W. 1, 4, 5

1 Department of Psychology, Oldenburg University, Oldenburg, Germany
2 Neuroimaging Center, Oldenburg University, Oldenburg, Germany
3 Department of Neurosurgery, Hannover Medical School, Hannover, Germany
4 Cluster of Excellence Hearing4all
5 Research Center Neurosensory Science, Oldenburg University, Oldenburg, Germany

The basal ganglia play an important role in time perception, temporal chunking, rhythm processing and sensory and attentional gating. We wanted to determine whether temporal speech information is represented in the neural responses in human basal ganglia nuclei.
In patients implanted bilaterally for deep brain stimulation (DBS), we obtained local field potential recordings from the subthalamic nucleus (STN; N = 8, 48 bipolar contacts) or the globus pallidus internus (GPi; N = 6, 36 bipolar contacts) while they listened to two-speaker speech streams. One stream was task relevant; the other served as a distractor. Temporal response functions (TRFs) were estimated separately for each contact. These TRFs describe the mapping between the word onsets of both speaker streams and the neural responses at the respective DBS contacts.
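As an illustration of what a TRF mapping between word onsets and a neural response can look like, the following is a minimal sketch on simulated data. It assumes a ridge-regularized linear model over time-lagged onset vectors, which is one common way to estimate TRFs and is not necessarily the estimator used in this study. The function names `lagged_design` and `fit_trf`, the number of lags, the onset rate, and the ridge value are all hypothetical choices made for the example.

```python
import numpy as np

def lagged_design(stimulus, n_lags):
    """Stack time-lagged copies of each stimulus column (lags 0..n_lags-1 samples)."""
    n_samples, n_features = stimulus.shape
    X = np.zeros((n_samples, n_features * n_lags))
    for lag in range(n_lags):
        # Column block `lag` holds the stimulus delayed by `lag` samples.
        X[lag:, lag * n_features:(lag + 1) * n_features] = stimulus[:n_samples - lag]
    return X

def fit_trf(stimulus, response, n_lags, ridge=1.0):
    """Ridge regression (assumed estimator); returns weights of shape (n_lags, n_features)."""
    X = lagged_design(stimulus, n_lags)
    XtX = X.T @ X + ridge * np.eye(X.shape[1])
    weights = np.linalg.solve(XtX, X.T @ response)
    return weights.reshape(n_lags, stimulus.shape[1])

# Simulated example: two word-onset streams (one per speaker) and one contact.
rng = np.random.default_rng(0)
onsets = (rng.random((2000, 2)) < 0.05).astype(float)   # 1.0 at simulated word onsets
true_trf = rng.standard_normal((10, 2))                 # hypothetical ground-truth kernel
response = lagged_design(onsets, 10) @ true_trf.ravel() # simulated neural response
response += 0.1 * rng.standard_normal(2000)             # additive noise

estimated = fit_trf(onsets, response, n_lags=10, ridge=1e-3)
```

In this sketch, each column of `estimated` is the response kernel for one speaker stream, so the two columns can be inspected or compared separately for a given contact.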
All subjects showed sustained neural responses in the beta to low gamma range (15-60 Hz) to speech compared to baseline (p < 0.05). Encoding models based on TRF estimation showed that these neural responses track the word onsets in the two speech streams (34/48 STN contacts (71%), 23/36 GPi contacts (64%), p < 0.05). Next, speech stream tracking was compared between the two speakers at contacts showing significant word onset tracking. This analysis revealed that 27 of 34 STN contacts (79%) and 10 of 23 GPi contacts (43%) showed selectivity for one of the two speaker streams (p < 0.05). A similar number of contacts showed speaker selectivity for the task-relevant speech stream (STN: 14, GPi: 7) as for the distractor speech stream (STN: 13, GPi: 3).
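The proportions in this paragraph follow directly from the reported contact counts. The short sketch below recomputes them and checks that the task-relevant and distractor counts add up to the number of speaker-selective contacts. The helper `percent` and the dictionary names are hypothetical; the counts are those given in the text.

```python
def percent(part, whole):
    """Return part/whole as a percentage rounded to the nearest integer."""
    return round(100 * part / whole)

# Counts as reported in the text: (contacts with the property, contacts considered)
tracking = {"STN": (34, 48), "GPi": (23, 36)}
selective = {"STN": (27, 34), "GPi": (10, 23)}
# Speaker-selective contacts split by preferred stream: (task-relevant, distractor)
preferred = {"STN": (14, 13), "GPi": (7, 3)}

for nucleus in ("STN", "GPi"):
    # The two preference counts should account for every selective contact.
    assert sum(preferred[nucleus]) == selective[nucleus][0]
    print(nucleus,
          "tracking:", percent(*tracking[nucleus]), "%,",
          "selective:", percent(*selective[nucleus]), "%")
```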
Our findings provide evidence that neural responses in human STN and GPi carry temporal speech information (i.e., word onsets) during continuous speech presentation and that speaker selectivity is present in these subcortical structures. This selectivity may be important for gating task-relevant information and suppressing task-irrelevant information, and thus for attentional selection.