Towards Automaticity in Reinforcement Learning: A Model-Based Functional Magnetic Resonance Imaging Study

Burak ERDENİZ, John DONE
2020 June - 57 (2)

Abstract

Introduction: Previous studies have shown that, over the course of
learning, many neurons in the medial prefrontal cortex adapt their firing
rate towards the options with the highest predicted reward value. However,
it has also been shown that during later learning trials the brain switches
to a more automatic processing mode governed by the basal ganglia. Based on
this evidence, we hypothesized that during early learning trials the
predicted values of chosen options would be coded by a goal-directed
system in the medial frontal cortex, whereas during late trials the
predicted values would be coded by the habitual learning system in the
dorsal striatum.
Methods: Using a 3 Tesla functional magnetic resonance imaging (fMRI)
scanner, blood oxygen level dependent (BOLD) signal data were collected
while participants (N=12) performed a reinforcement learning task. The
task consisted of instrumental conditioning trials; on each trial, a
participant chose one of two available options in order to win money or
avoid losing money. Depending on the experimental condition, participants
received a monetary reward (gain money), a monetary penalty (lose money),
or a neutral outcome.
Results: Using model-based analysis for event-related fMRI designs, region
of interest (ROI) analyses were performed on the nucleus accumbens, medial
frontal cortex, caudate nucleus, putamen, and the internal and external
segments of the globus pallidus. To compare brain activity between early
(goal-directed) and late (habitual, automatic) learning trials, separate
ROI analyses were performed for each anatomical sub-region. For the reward
condition, we found significant activity in the medial frontal cortex
(p<0.05) only for early learning trials, whereas activity shifted to the
bilateral putamen (p<0.05) during later trials. For the loss condition, no
significant activity was found for early trials, but the globus pallidus
internal segment showed significant activity (p<0.05) during later trials.
Conclusion: We found that, during reinforcement learning, activation in
the brain shifted from the medial frontal regions to dorsal regions of the
striatum. These findings suggest that there are two separable learning
systems in the brain: an early goal-directed system and a late habitual
system.
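
The model-based analysis summarized in the Results can be illustrated with a minimal sketch. The abstract does not state which learning model, learning rate, or early/late split the authors used, so everything below is an assumption made for illustration: a simple prediction-error (Rescorla-Wagner style) update, a hypothetical learning rate `alpha`, and an early/late split at the midpoint of the trial sequence. The function names `predicted_values` and `split_early_late` are likewise hypothetical.

```python
def predicted_values(choices, outcomes, alpha=0.2, n_options=2):
    """Return the predicted value of the chosen option on each trial.

    Assumed model: each option's value is nudged towards the received
    outcome by a fraction alpha of the prediction error.
    """
    q = [0.0] * n_options              # initial value estimate per option
    chosen_values = []
    for choice, outcome in zip(choices, outcomes):
        # Record the value held *before* feedback; this is the kind of
        # trial-wise quantity that could serve as a parametric regressor.
        chosen_values.append(q[choice])
        # Prediction-error update for the chosen option only.
        q[choice] += alpha * (outcome - q[choice])
    return chosen_values


def split_early_late(values):
    """Split trial-wise values at the midpoint (an assumed criterion)."""
    half = len(values) // 2
    return values[:half], values[half:]


# Toy data: option indices (0 or 1) and outcomes (1 = reward, 0 = neutral).
choices = [0, 1, 0, 0, 0, 0]
outcomes = [1, 0, 1, 1, 0, 1]

values = predicted_values(choices, outcomes)
early, late = split_early_late(values)
print("early-trial predicted values:", early)
print("late-trial predicted values:", late)
```

In a sketch like this, the early and late value series would be entered as separate trial-wise regressors, so that activity tracking predicted value can be tested independently in each learning phase.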