Abstract

Cognitive flexibility, the capacity to adapt behaviour to changes in the environment, is impaired in a range of brain disorders, including substance use disorder and Parkinson’s disease. Putative neural substrates of cognitive flexibility include mesencephalic pathways to the ventral striatum (VS) and dorsomedial striatum (DMS), hypothesised to encode learning signals needed to maximize rewarded outcomes during decision-making. However, it is unclear whether mesencephalic projections to the ventral and dorsal striatum make distinct contributions to flexible reward-related learning. Here, rats acquired a two-choice spatial probabilistic reversal learning (PRL) task, reinforced on an 80%:20% basis, that assessed behavioural flexibility across repeated reversals of response-outcome contingencies. We report that optogenetic stimulation of projections from the ventral tegmental area (VTA) to the nucleus accumbens shell (NAcbS) in the VS significantly impaired reversal learning when optical stimulation was temporally aligned with negative feedback (i.e., reward omission). Moreover, the exploitation-exploration parameter, β, was increased (indicating greater exploitation of information) when this pathway was optogenetically stimulated after a spurious loss (i.e., an unrewarded (20%) response at the 80% reinforced location) compared to after a spurious win (i.e., a rewarded (20%) response at the 20% reinforced location). VTA → NAcbS stimulation during other phases of the behavioural task had no effect. Optogenetic stimulation of projection neurons from the substantia nigra (SN) to the DMS, whether aligned with reward receipt, with reward omission, or with the period before a choice was made, had no effect on reversal learning. These findings are consistent with the notion that enhanced activity in VTA → NAcbS projections leads to maladaptive perseveration as a consequence of an inappropriate bias towards exploitation via positive reinforcement.
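For readers unfamiliar with how an exploitation-exploration parameter such as β enters a model of the PRL task, the following is a minimal illustrative sketch and not the model fitted in this study. It assumes a standard Q-learning agent with a softmax choice rule, in which β acts as an inverse temperature (larger β means choices follow the learned values more deterministically). The learning rate `alpha`, the reversal criterion `criterion` (a run of consecutive responses at the 80% location) and all function names are hypothetical choices made for illustration only.

```python
import math
import random


def softmax_p_first(q_first, q_second, beta):
    """Probability of choosing the first option under a two-choice softmax rule.

    beta is the inverse temperature: higher values mean greater exploitation
    of the current value estimates, lower values mean more exploration.
    """
    return 1.0 / (1.0 + math.exp(-beta * (q_first - q_second)))


def simulate_prl(n_trials=400, alpha=0.3, beta=3.0, criterion=8, seed=0):
    """Simulate a two-choice 80%:20% probabilistic reversal learning session.

    Assumptions (hypothetical, for illustration): a single learning rate
    `alpha`, and a reversal triggered after `criterion` consecutive responses
    at the currently 80%-reinforced location. Returns the number of reversals.
    """
    rng = random.Random(seed)
    q = [0.5, 0.5]   # value estimates for the two locations
    best = 0         # index of the location currently reinforced at 80%
    streak = 0       # consecutive responses at the 80% location
    reversals = 0
    for _ in range(n_trials):
        p0 = softmax_p_first(q[0], q[1], beta)
        choice = 0 if rng.random() < p0 else 1
        p_reward = 0.8 if choice == best else 0.2
        reward = 1.0 if rng.random() < p_reward else 0.0
        # Prediction-error update of the chosen option only
        q[choice] += alpha * (reward - q[choice])
        streak = streak + 1 if choice == best else 0
        if streak >= criterion:
            best = 1 - best  # contingencies reverse
            streak = 0
            reversals += 1
    return reversals


if __name__ == "__main__":
    for b in (0.5, 3.0, 10.0):
        print(f"beta={b}: reversals completed = {simulate_prl(beta=b)}")
```

In this sketch, a spurious loss corresponds to `reward == 0.0` when `choice == best`, and a spurious win to `reward == 1.0` when `choice != best`. A model that allowed β to differ by the preceding outcome type would be one way to express the comparison described above, although that extension is not implemented here.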