5.2. Reinforcement learning in PD

5.2.1. The role of dopamine in salience, motivation, and reward

Our results from PD patients suggest that dopamine is essential in reward signals: in the case of mesencephalic-striatal dopamine loss, reward learning is impaired but punishment learning is spared, and dopamine agonists reverse this pattern of deficit.

Therefore, dopaminergic signals do not encode the general behavioral relevance of outcome signals (salience) regardless of their value (reward vs. punishment), as suggested by some authors (Berridge, 2007). However, several important caveats must be mentioned. First, the drugs used in our study were agonists of the D2 and D3 dopamine receptors, so their effect is not equivalent to a general increase in dopamine level or to phasic dopamine release stimulating all receptor types. Second, it is unknown how these drugs affect reward learning in healthy persons and how they interact with the altered dopaminergic status in PD. Third, dopaminergic neurotransmission was not directly visualized in our PD patients, and hence we have no direct data regarding dopamine loss in different subregions of the striatum or neuronal signals in these regions.

Frank and O’Reilly (2006) compared the effect of low doses of D2 agents cabergoline and haloperidol on reward and punishment learning in healthy volunteers.

Paradoxically, cabergoline impaired, while haloperidol enhanced learning from positive reinforcement, which can be explained by their effect on the presynaptic autoreceptors in the low dose range. Cabergoline also caused a globally decreased inhibitory signal.

Importantly, these effects were also observed in the case of attention and working memory, suggesting that basal ganglia dopaminergic transmission contributes to the updating of prefrontal representations.

In addition to behavioral tasks, Pessiglione et al. (2006) used pharmacological functional neuroimaging to visualize the role of dopaminergic modulation in reward prediction error in humans. During learning, the magnitude of the reward prediction error signal in the striatum was increased by L-dopa and decreased by haloperidol, a D2/D1 receptor antagonist. At the behavioral level, participants receiving L-dopa tended to choose the most rewarding action, in contrast to participants on haloperidol.
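The notion of a reward prediction error can be made concrete with a minimal sketch. The code below is an illustration only and is not the computational model used by Pessiglione et al. (2006) or in our study: it assumes a simple delta-rule learner in which the prediction error is the obtained outcome minus the expected outcome, and it introduces a hypothetical parameter, `positive_gain`, as a stand-in for a dopaminergic manipulation that scales positive prediction errors only. All function names, parameter names, and numerical values are assumptions chosen for illustration.

```python
import random


def prediction_error(reward, expected):
    """Reward prediction error: obtained outcome minus expected outcome."""
    return reward - expected


def update_value(expected, reward, learning_rate=0.1, positive_gain=1.0):
    """Delta-rule update of an expected value.

    positive_gain is a hypothetical parameter (an assumption, not taken
    from the cited studies). It scales positive prediction errors only
    and leaves negative prediction errors unchanged.
    """
    delta = prediction_error(reward, expected)
    if delta > 0:
        delta *= positive_gain
    return expected + learning_rate * delta


def simulate(n_trials=200, p_reward=0.8, positive_gain=1.0, seed=0):
    """Learn the value of one option rewarded with probability p_reward."""
    rng = random.Random(seed)  # fixed seed: identical reward sequence per run
    value = 0.0
    for _ in range(n_trials):
        reward = 1.0 if rng.random() < p_reward else 0.0
        value = update_value(value, reward, positive_gain=positive_gain)
    return value


if __name__ == "__main__":
    # Compare a dampened and an amplified positive prediction error signal.
    for gain in (0.5, 1.0, 1.5):
        print(f"positive_gain={gain}: learned value={simulate(positive_gain=gain):.3f}")
```

In this toy setting, a larger `positive_gain` yields a higher learned value for the same reward sequence, which is in the same direction as the behavioral observation that participants on L-dopa tended to choose the most rewarding action. The sketch does not model receptor subtypes, presynaptic effects, or punishment learning.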

Antipsychotics with strong dopamine receptor antagonist properties dampen reward prediction signal in the ventral striatum (Juckel et al., 2006).

In contrast, Jensen et al. (2007) arrived at a different conclusion. These authors stressed that reward prediction studies focus on the role of ventral striatum in positive reinforcers, but it is not clear whether this region responds to behaviorally relevant aversive stimuli (e.g., a loud noise). Using functional neuroimaging, Jensen et al. (2007) showed that prediction error signal in the ventral striatum was not modulated by the valence of the stimuli (reward vs. punishment), in contrast to the orbitofrontal and insular regions that differentiated between reinforcers with positive and negative values.

However, the relevance of these results to our data is not clear, given that Jensen et al. (2007) used a classic Pavlovian paradigm.

Au et al. (2011) used a set-shifting task to evaluate the effect of dopaminergic status on learning in PD. Response times were prolonged in PD patients when they received no L-dopa; L-dopa normalized response times in the case of positive feedback (reward). The ability to set-shift using negative feedback was not affected in PD. Interestingly, positive feedback deactivated the lateral prefrontal cortex in unmedicated PD patients, whereas negative feedback enhanced brain activation in posterior sensorimotor regions.

L-dopa ameliorated activation in the cortico-striatal loops but not in each region of the neocortex. In conclusion, PD patients’ performance on the set-shifting task was modified by dopamine status and feedback valence. The key finding was the same as in our study: unmedicated PD patients had deficits when positive feedback was used, and these deficits were reversed by dopamine replacement therapy.

Recent animal studies seem to disentangle the components of general motivational and reinforcing signals. First of all, it seems that the mesocortical and mesolimbic dopaminergic pathways are involved in multiple functions, including the hedonic impact of stimuli (“liking”), motivation (“wanting”), and prediction errors during associative learning (“correcting”) (Smith et al., 2011). In a Pavlovian paradigm, rats were trained to associate stimuli with sucrose reward. If an opioid-stimulating drug was injected into the ventral striatum (nucleus accumbens), the hedonic “liking” impact of sucrose was increased in parallel with the firing rates of neurons in the ventral pallidum. The incentive salience of stimuli was also encoded by these neurons but only if the stimuli were close to reward in time. In contrast, dopamine agonists enhanced the motivation component but did not affect hedonic aspects. There were different neuronal groups that encoded hedonic impact and incentive salience, and even these neurons’ firing rates were different for these functions (Smith et al., 2011). Separate striatal neuronal groups also exist for motor control, motivation, and reward prediction (Ena et al., 2011).

Direct recordings from dopaminergic neurons in the mesencephalon also revealed their multifaceted role (Matsumoto and Hikosaka, 2009). The classic hypothesis was that these neurons are activated by stimuli predicting reward and inhibited by aversive events (Schultz, 2007). If the properties of a more widespread population of these neurons are investigated in a Pavlovian paradigm with reward and punishment (liquid rewards and airpuffs directed at the face, respectively), two subpopulations of cells can be distinguished: some dopamine neurons are excited by reward-predicting stimuli and inhibited by punishment-predicting stimuli, whereas another group of cells is excited by both types of stimuli. Neurons excited by the punishment-predicting stimuli were located dorsolaterally in the substantia nigra pars compacta, whereas those inhibited by the stimuli were located ventromedially, extending to the ventral tegmental area (Matsumoto and Hikosaka, 2009).

Our results raise the possibility that dopamine agonists used in clinical practice may affect only one aspect of these basic neuronal coding mechanisms in patients with PD: enhancing reward signals and diminishing punishment signals in the clinically used dose ranges for a longer period of time. The most likely explanation is that these drugs predominantly bind to D2/D3 receptors densely expressed in neurons coding reward signals (Lee et al., 2007). However, in this respect presynaptic and postsynaptic effects are hard to differentiate, and dose and treatment duration are essential. For example, a single low dose of the D2/D3 agonist pramipexole activates autoreceptors, reduces phasic dopamine release, and impairs reward learning in humans (Santesso et al., 2009). These participants show pramipexole-induced feedback-related negativity to rewards and decreased activation in the dorsal anterior cingulate cortex; disrupted reward learning is associated with reduced presynaptic dopaminergic signaling in response to reward (Santesso et al., 2009).

5.2.2. The possible clinical relevance of reward-learning and personality changes associated with dopamine agonists in PD

Although anecdotal reports suggest personality changes, i.e. decreased novelty seeking, rigidity, and neuroticism, well before the onset of PD motor symptoms (Menza, 2000), the data are not consistent and there are methodological uncertainties in these reports. Arabia et al. (2010) found that introversion and decreased novelty seeking do not predict PD in a community sample, whereas anxious temperament may represent a minor risk (Bower et al., 2010). However, the nature of anxious temperament is too general to be related to specific diseases and may be related to general health issues.

Gatto et al. (2011) argued against the existence of a specific “Parkinsonian” personality pattern.

In a positron emission tomography study, Kaasinen et al. (2001) found that novelty seeking was not associated with (18)F-dopa uptake in any of the brain regions studied in patients with PD. Harm avoidance, which is closely related to anxious and depressive traits, was increased in PD patients and exhibited a positive correlation with the (18)F-dopa uptake in the right caudate nucleus.

Tomer and Aharon-Peretz (2004) suggested an interesting lateralization theory of personality changes in PD: patients with greater dopamine loss in the left hemisphere showed reduced novelty seeking, whereas patients with reduced dopamine in the right hemisphere reported higher harm avoidance. Kaasinen et al. (2004) found that decreased novelty seeking in PD patients may be related to altered dopaminergic transmission in the insula.

Our results from never-medicated, young, non-depressed patients showed that decreased novelty seeking and reward processing deficits are early signs of PD, although we cannot claim that these changes appeared before the evolution of motor symptoms. Our sample was not large enough to test differences between patients with right- and left-sided motor symptoms, but many of our patients displayed right-sided symptoms (left-hemisphere dopamine deficiency), which is consistent with the results of Tomer and Aharon-Peretz (2004). Furthermore, dopaminergic replacement increased novelty seeking, which is similar to the change found after deep-brain stimulation in PD (Fassino et al., 2010). Deep-brain stimulation of the subthalamic nucleus is also associated with increased reward sensitivity and enhanced reinforcement learning (Frank et al., 2007).

In the current literature there is greatly increased attention to dopamine dysregulation syndrome (DDS) in PD, which is interpreted as a consequence of a dysfunctional reward system due to dopamine replacement therapy. The symptoms range from craving for dopaminergic medication to impulse control disorders (gambling, punding, hypersexuality, compulsive shopping, eating disorders, compulsive internet use), hypomania, and even psychosis (Evans and Lees, 2004; Merims and Giladi, 2008). Although neuropsychiatric complications associated with dopamine replacement were reported at the same time as L-dopa entered clinical practice (Yahr et al., 1969), some authors believe that the increasing prevalence of DDS is due to the application of dopamine agonists (Perez-Lloret and Rascol, 2010). According to Bostwick et al. (2009), 18% of PD patients receiving dopamine agonists develop impulse control disorders, whereas in the case of L-dopa such problems are much rarer (<5%).

In the largest case-control study so far, Voon et al. (2011a) showed that impulse control disorder in PD is associated with general functional impairment, more severe anxiety and depression, obsessive-compulsive symptoms, dyskinesias, higher novelty seeking, impulsivity, and reward preference. These clinical observations are highly consistent with our results, which also indicate increased novelty seeking and reward preference after the initiation of dopamine agonist medications. However, it is important to emphasize that none of the patients in our study showed DDS, which may indicate that biases in novelty seeking and reward sensitivity appear without the development of clinical symptoms. These changes in tests and TCI scores may reflect an endophenotype for DDS, and in the future they may serve as vulnerability markers.

In addition to specific dysfunctions in reward processing and novelty seeking, DDS is associated with general neuropsychological deficits. Vitale et al. (2011) showed that PD patients with impulse control disorders scored lower on tasks for spatial planning and attentional set-shifting. Patients with hypersexuality, compulsive eating, and multiple impulsive symptoms performed worse on verbal learning and memory tests than did patients with pathological gambling. Punding may be a different form of DDS associated with the severity of dyskinesia, younger age at disease onset, longer disease duration, and male gender (Spencer et al., 2011).

Beyond reward, novelty seeking, and general neuropsychological functions, what are the mechanisms of DDS? Using the Balloon Analogue Risk Task, Claassen et al. (2011) assessed PD patients with impulse control disorders when they received and when they ceased dopamine agonists. Dopamine agonists boosted risk taking only in patients with impulse control disorders, especially when doses were higher. However, both patient groups similarly reduced risk-taking choices in high-risk conditions and after negative consequences.

To elucidate the neuronal mechanism of DDS in PD, Voon et al. (2011b) used a pharmacological functional neuroimaging method during a gambling task. As expected, patients with impulse control disorders made more risky decisions, especially when there was a likelihood of gaining points (reward), which was associated with decreased orbitofrontal cortex and anterior cingulate activity. Healthy controls and PD patients without DDS showed the opposite pattern. After the administration of dopamine agonists, risky gain choices increased in DDS patients, together with decreased activity in the ventral striatum. Again, the pattern was exactly the opposite in PD patients without DDS and healthy controls. These results highlight the importance of non-striatal regions in risk evaluation in DDS, and the fact that the ventral striatum exhibits a paradoxically decreased responsiveness after dopamine agonist challenge in DDS but not in PD without such neuropsychiatric complications.