


How Does the Brain Learn What Counts?

Are we consistent in assigning value?

Does it always taste the same?
Source: Charles Deluvio. Unsplash.

This morning, as you opened the refrigerator door, you looked at the food options and asked yourself, “What shall I have for breakfast? Cereal topped with bananas? Sausage and eggs? Pancakes? Fruit bowl? Bagel?”

Each option has a value, which changes from morning to morning. Each morning, the brain has you assess the relative values and decides which food choice counts the most for that particular moment. How does the brain determine that value?

Whether at home, in the workplace, at school, or in interpersonal relationships, we constantly face experiences to which we assign value. Often, we must weigh the relative values of several competing experiences in order to choose a single option to act upon.

In humans, it seems clear that value assignment depends heavily upon function in the prefrontal cortex (PFC). But how does this circuitry assign value? Two possibilities come to mind: 1) the neural response to the stimuli associated with an experience may be consistent across different contexts (absolute coding), or 2) the encoding may be relative, depending on the experience’s context. Relative coding seems likely for the breakfast example mentioned above. Your choice will not be the same every morning. It depends, among other things, on how you feel, your level of appetite, and your recent breakfast choices.
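To make the two possibilities concrete, here is a minimal sketch in Python. It is only an illustration, not the study's model: the function names, the payoff amounts, and the idea of using the context's average outcome as the reference point are all assumptions chosen for simplicity.

```python
from statistics import mean


def absolute_value(outcome):
    """Absolute code: the signal tracks the outcome itself, whatever the context."""
    return outcome


def relative_value(outcome, context_outcomes):
    """Relative code: the signal tracks the outcome minus the context's average outcome.

    Using the average as the reference point is an assumption made for illustration.
    """
    return outcome - mean(context_outcomes)


# Hypothetical payoffs (illustrative amounts, not taken from the study).
reward_context = [0.0, 0.5]        # win nothing or win 0.50
punishment_context = [-0.5, 0.0]   # lose 0.50 or lose nothing

# The same outcome (0.0) is scored identically by the absolute code,
# but differently by the relative code depending on the context.
for name, context in [("reward", reward_context), ("punishment", punishment_context)]:
    print(name, absolute_value(0.0), relative_value(0.0, context))
```

Under these assumptions, getting nothing counts as a mild disappointment when a win was possible and as a mild relief when a loss was possible, even though the absolute payoff is the same in both cases.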

Also, value assignment can be thought of as a learning experience. If we have never tasted bananas, for example, our first exposure entails a value assessment of how good they taste, which we then learn to apply to future decisions about whether we want a banana for breakfast or at any other time. Clearly, feedback is important. How did it taste, and how did that taste compare with other kinds of food that we have eaten, especially recently? If you have had bananas every morning for a week, you may be tired of eating bananas. Then, there is the issue of the context in which perception occurs. Bananas might have more appeal for breakfast than they do for supper.

Research a decade or so ago revealed that our perception is stable across multiple contexts. For example, we can see a banana in the light or dark. We see a green banana or a ripe yellow banana, and still know it is a banana. We can even shut our eyes and feel the shape and still conclude it is a banana.

Assigning value to what we perceive could be another matter. Does value depend on the choice context, rather than being invariant across contexts? That is, do neural circuits rescale value assignment depending on the context? Earlier fMRI brain-scan studies showed that context is important to value assessment in several areas: the medial prefrontal cortex (mPFC), the orbitofrontal cortex (OFC), and the cingulate cortex.

A recent study examined whether context-dependent coding occurs in all PFC regions and how it is affected by feedback information. Twenty-eight human participants (both sexes) performed an instrumental learning task in which they were trained to maximize their monetary payoff. Choice options produced either reward (adding money to their account) or punishment (subtracting money).

Subjects performed four learning trials while in the fMRI scanner, during which they were repeatedly presented with a pair of abstract symbols. For each run, they were presented with eight different symbol pairs to produce four choice contexts (i.e., reward/partial feedback, reward/complete, punishment/partial, and punishment/complete).

In each trial, they chose between two symbols, each associated with a particular monetary outcome. Thus, the contexts were defined by the possible outcome (either reward or punishment, that is, receiving or losing a specified amount of money). Half of the trials presented complete feedback, in which the outcome of the unchosen option was displayed as well, while in the other half subjects were informed only of the payoff of the chosen option.
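The difference between partial and complete feedback can be sketched with a simple learning rule. The snippet below uses a standard delta-rule update as a stand-in; it is an assumption for illustration, not the model the authors fitted, and the function name `update_values`, the symbol names, the learning rate, and the payoffs are all hypothetical.

```python
def update_values(values, chosen, outcomes, learning_rate=0.5, complete_feedback=False):
    """Return new value estimates after one trial (simple delta rule).

    values   -- dict mapping symbol name to its current value estimate
    chosen   -- name of the symbol the learner picked
    outcomes -- dict mapping symbol name to the payoff it produced this trial
    """
    updated = dict(values)
    # Partial feedback: only the chosen symbol's payoff is seen.
    # Complete feedback: the unchosen symbol's payoff is seen as well.
    seen = list(values) if complete_feedback else [chosen]
    for symbol in seen:
        prediction_error = outcomes[symbol] - values[symbol]
        updated[symbol] = values[symbol] + learning_rate * prediction_error
    return updated


# Hypothetical trial: the learner picks symbol "A".
start = {"A": 0.0, "B": 0.0}
payoffs = {"A": 0.5, "B": -0.5}

partial = update_values(start, "A", payoffs)
complete = update_values(start, "A", payoffs, complete_feedback=True)
print(partial)
print(complete)
```

With partial feedback only the estimate for the chosen symbol moves. With complete feedback the learner also updates the symbol it did not pick, so each trial carries information about both options.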

As the learning trials progressed, subjects learned to optimize their payoffs. The fMRI signal change reflecting differences between good and bad outcomes was larger for chosen than for unchosen outcomes, with no difference among chosen outcomes between partial and complete feedback.

Increased activity in all of the PFC regions and the cingulate cortex confirmed their role in encoding and processing value assessment. Anterior PFC activity increased for chosen outcomes but decreased during the processing of unchosen outcomes. Activity patterns also varied depending on whether partial or complete feedback was given. How does the brain assign different values according to situational context? The explanation is that the neurons must rescale their impulse discharge response to the perceived value of object properties relative to the specific context.

The amount of feedback, partial or complete, greatly affected context-dependent value learning, as revealed by brain activation in multiple regions of PFC and the cingulate cortex. Complete feedback produced the best learning and also caused a switch to assigning value depending on the context. Overall, the subjects learned equally well in reward and punishment contexts.

The authors used a complicated way to show what we already know from personal experience. We learn what we value from the feedback we receive from our choices, and the value we assign depends on situational context. We readily learn to like bananas on breakfast cereal, but bananas have much less value on pizza at dinner.

The demonstration of the role of the prefrontal cortex is important. These results tell us that concussion, stroke, or other damage that affects this part of the brain will impair our ability to make reasoned judgments about the choices we make.

The take-home message is that value assessment occurs in multiple PFC areas in multiple ways, and neural activity does depend on situational context. The coding process is shaped by experience and by how comprehensive the feedback is. This is consistent with what decades of learning and memory research have shown, as I summarize in my book on memory.

References

Pischedda, D., Palminteri, S., & Coricelli, G. (2020). The effect of counterfactual information on outcome value coding in medial prefrontal and cingulate cortex: From an absolute to a relative neural code. Journal of Neuroscience, 40(16), 3268-3277. https://doi.org/10.1523/JNEUROSCI.1712-19.2020

Klemm, W. R. (2012). Memory Power 101. New York: Skyhorse.
