Comparing Pain, Cognitive, and Salience Accounts of dACC
A reply to Tal Yarkoni's blog on our paper
Posted Dec 10, 2015
Naomi Eisenberger and I recently published a paper in the Proceedings of the National Academy of Sciences titled “The dorsal anterior cingulate is selective for pain: Results from large-scale reverse inference”. You can read the paper here or here. For reference, here is the abstract:
“Dorsal anterior cingulate cortex (dACC) activation is commonly observed in studies of pain, executive control, conflict monitoring, and salience processing, making it difficult to interpret the dACC’s specific psychological function. Using Neurosynth, an automated brainmapping database [of over 10,000 functional MRI (fMRI) studies], we performed quantitative reverse inference analyses to explore the best general psychological account of the dACC function P(Ψ process |dACC activity). Results clearly indicated that the best psychological description of dACC function was related to pain processing—not executive, conflict, or salience processing. We conclude by considering that physical pain may be an instance of a broader class of survival-relevant goals monitored by the dACC, in contrast to more arbitrary temporary goals, which may be monitored by the supplementary motor area.”
Tal Yarkoni (hereafter, TY), the creator of Neurosynth, has now posted a blog (here) suggesting that pretty much all of our claims are either false, trivial, or already well-known. While this response was not unexpected, it’s disappointing because we love Neurosynth and think it’s a powerful tool for drawing exactly the kinds of conclusions we’ve drawn. While TY is the creator of Neurosynth, we don’t think that means he has the last word when it comes to what is possible to do with it (nor does he make this claim). In the end, we think there may actually be a fair bit of agreement between us and TY. We do think that TY has misunderstood some of our claims (section 1 below) and failed to appreciate the significance and novelty of our actual claims (sections 2 and 4). TY also thinks we should have used different statistical analyses than we did, but his critique assumes we had a different question than the one we really had (section 5).
Not having word limits and having a lot to say made our blog longer than the original paper. We’ve added the list of issues and our very abbreviated replies (here) so you can get the gist of our reply without wading through all of this if you prefer.
1. Misunderstandings (where we sort of probably agree)
We think a lot of the heat in TY’s blog comes from two main misunderstandings of what we were trying to accomplish. The good news (and we really hope it is good news) is that ultimately, we may actually mostly agree on both of these points once we get clear on what we mean. The two issues have to do with the use of the term “selective” and then why we chose to focus on the four categories we did (pain, executive, conflict, salience) and not others like fear and autonomic.
Misunderstanding #1: Selectivity. Regarding the term selective, I suppose we could say there’s a strong form and a weak form of the word, with the strong form entailing further constraints on what constitutes an effect being selective. TY writes in his blog: “it’s one thing to use Neurosynth to support a loose claim like “some parts of the dACC are preferentially associated with pain”, and quite another to claim that the dACC is selective for pain, that virtually nothing else activates dACC”. The last part there gets at what TY thinks we mean by selective and what we would call the strong form of selectivity. We can state it this way:
Selectivity (strong): The dACC is selective for pain if pain and only pain activates the dACC.
We agree this is an absurdly strong claim to make and we never took selectivity to imply this. I’m not sure if I know of any region or even network in the brain that I would make this sort of selectivity claim for. But the truth is that we haven’t been able to find this strong use of selectivity in the fMRI literature. There are many MVPA studies that talk about selectivity, particularly whether certain regions are category selective. These studies (and there are many including multiple papers from Jim Haxby, who published the first empirical MVPA paper) tend to use selectivity when comparing a few classes of stimuli – usually two. Here are two examples that reflect roughly what we meant by selectivity.
- From Haxby (2006): “numerous small spots of cortex were found that respond with very high selectivity to faces. However, these spots were intermixed with spots that responded with equally high selectivity to the other three categories.”
- From Pereira, Mitchell, & Botvinick (2009): “features in the fMRI domain are often voxels and there is a desire to be able to look for certain types of voxel behaviour, e.g. is a voxel very selective for one particular condition or discriminating between two groups of conditions.”
In both of these cases, as in many others, selectivity is defined in terms of a stronger response to one class of stimuli than to another class of stimuli (e.g. faces vs. objects). We’ve never seen a response to one of these papers that says they were wrong to make these claims because they didn’t test for the thousands of other things the region of interest might respond to. Thus the weak form of selectivity, the version we were using, can be stated this way:
Selectivity (weak): The dACC is selective for pain if pain is a more reliable source of dACC activation than the other terms of interest (executive, conflict, salience).
We mean this in the same way that Haxby and lots of others do. We never give a technical definition of selectivity in our paper, though in the abstract we do characterize our results as follows:
“Results clearly indicated that the best psychological description of dACC function was related to pain processing—not executive, conflict, or salience processing.”
Thus, the set of comparisons our selectivity refers to is given in the same sentence, right up front in the abstract. In the end, we would have been just as happy if “selective” in the title were replaced with “preferentially activated”. We think this is what the weak form of selectivity entails and it is really what we meant. We stress again, we are not familiar with researchers who use the strong form of selectivity. TY’s blog is the first place we have encountered it, and it is not what we meant in the paper.
Before moving on, we want to note that in TY’11 (i.e. the Yarkoni et al., 2011 paper announcing Neurosynth), the weak form of selectivity is used multiple times. In the caption for Figure 2, the authors refer to “regions in c were selectively associated with the term” when as far as we can tell, they are talking only about the comparison of three terms (working memory, emotion, pain). Similarly on p. 667 the authors write “However, the reverse inference map instead implicated the anterior prefrontal cortex and posterior parietal cortex as the regions that were most selectively activated by working memory tasks.” Here again, the comparison is to emotion and pain, and the authors are not claiming selectivity relative to all other psychological processes in the Neurosynth database. If it is fair for Haxby, Botvinick, and the eminent coauthors of TY’11 to use selectivity in this manner, we think it was fine for us as well.
We would also point readers to the fullest characterization of the implication of our results on p. 15253 of the article:
“The conclusion from the Neurosynth reverse inference maps is unequivocal: The dACC is involved in pain processing. When only forward inference data were available, it was reasonable to make the claim that perhaps dACC was not involved in pain per se, but that pain processing could be reduced to the dACC’s “real” function, such as executive processes, conflict detection, or salience responses to painful stimuli. The reverse inference maps do not support any of these accounts that attempt to reduce pain to more generic cognitive processes.”
We think this claim is fully defensible and nothing in TY’s blog contradicts this. Indeed, he might even agree with it.
Misunderstanding #2: We did not focus on fear, emotion, and autonomic accounts. TY criticizes us several times for not focusing on other accounts of the dACC including fear, emotion, and autonomic processes. We agree with TY that these kinds of processes are relevant to dACC function. Indeed, we were writing about the affective functions of dACC (Eisenberger & Lieberman, 2004) when the rest of the field was saying that the dACC was purely for cognitive processes (Bush, Luu, & Posner, 2000). We have long posited that one of the functions of the dACC was to sound an alarm when certain kinds of conflict arise. We think the dACC is evoked by a variety of distress-related processes including pain, fear, and anxiety. As Eisenberger (2015) wrote: “Interestingly, the consistency with which the dACC is linked with fear and anxiety is not at odds with a role for this region in physical and social pain, as threats of physical and social pain are key elicitors of fear and anxiety.” And the outputs of this alarm process are partially autonomic in nature. Thus, we don’t think of fear and autonomic accounts as in opposition to the pain account, but rather in the same family of explanations. We think this class of dACC explanations stands in contrast to the cognitive explanations that we did compare to (executive, conflict, salience). Most of this, and what is said below, is discussed in Naomi Eisenberger’s (2015) Annual Review chapter.
We speak to some but not all of this in the paper. On p. 15254, we revisit our neural alarm account and write “Distress-related emotions (“negative affect” “distress” “fear”) were each linked to a dACC cluster, albeit much smaller than the one associated with “pain”.” While we could have said more explicitly that pain is in this distress-related category, we have written about this several times before and assumed this would be understood by readers.
So why did we focus on executive, conflict, and salience? Like most researchers, we are the products of our early (academic) environment. When we were first publishing on social pain, we were confused by the standard account of dACC function. A half century of lesion data and a decade of fMRI studies of pain pointed towards more evidence of the dACC’s involvement in distress-related emotions (pain & anxiety), yet every new paper about the dACC’s function described it in cognitive terms. These cognitive papers either ignored all of the pain and distress findings for dACC or they would redescribe pain findings as reducible to or just an instance of something more cognitive.
When we published our first social pain paper, the first rebuttal paper suggested our effects were really just due to “expectancy violation” (Somerville et al., 2006), an account that was later invalidated (Kawamoto 2012). Many other cognitive accounts have also taken this approach to physical pain (Price 2000; Vogt, Derbyshire, & Jones, 2006).
Thus for us, the alternatives to pain accounts of dACC all these years were conflict detection and cognitive control explanations. This led to the focus on the executive and conflict-related terms. In more recent years, several papers have attempted to explain away pain responses in the dACC as nothing more than salience processes (e.g., Iannetti’s group) that have nothing to do with pain, and so salience became a natural comparison as well. We haven’t been besieged with papers saying that pain responses in the dACC are “nothing but” fear or “nothing but” autonomic processes, so those weren’t the focus of our analyses.
We want to comment further on fear specifically. We think one of the main reasons that fear shows up in the dACC is because so many studies of fear use pain manipulations (i.e. shock administration) in the process of conditioning fear responses. This is yet another reason that we were not interested in contrasting pain and fear maps. That said, if we do compare the Z-scores in the same eight locations we used in the PNAS paper, the pain effect has more accumulated evidence than fear in all seven locations where there is any evidence for pain at all.
It’s interesting to us that TY does not in principle seem to like us trying to generate some kind of unitary account of dACC, writing “There’s no reason why nature should respect our human desire for simple, interpretable models of brain function.” Yet, TY then goes on to offer a unitary account more to his liking. He highlights Vogt’s “four-region” model of the cingulate, writing “I’m especially partial to the work of Brent Vogt…”. In Vogt’s model, the aMCC appears to be largely the same region as what we are calling dACC. Although the figure shown by TY doesn’t provide anatomical precision, in other images, Vogt shows the regions with anatomical boundaries. Rotge et al. (2015) used such an image from Vogt (2009) to estimate the boundaries of aMCC as spanning 4.5 ≤ y ≤ 30, which is very similar to our dACC anterior/posterior boundaries of 0 ≤ y ≤ 30 (see Figure below). Vogt ascribes the function of avoidance behavior to this region, which is a pretty unitary description of a region for which TY thinks we should avoid unitary descriptions.
In the end though, if TY prefers a fear story to our pain story, we think there is some evidence for both of these (a point we make in our PNAS paper). We think they are in a class of processes that overlap both conceptually (i.e. distress-related emotions) and methodologically (i.e. many fear studies use pain manipulations to condition fear).
2. Lieberman & Eisenberger’s (hereafter, L&E) unobjectionable claims are hardly novel.
After focusing on potential misunderstandings we want to turn to our first disagreement with TY. Near the end of his blog, TY surprised us by writing that the following conclusions can be reasonably drawn from Neurosynth analyses:
- “There are parts of dACC (particularly the more posterior aspects) that are preferentially activated in studies involving painful stimulation.”
- “It’s likely that parts of dACC play a greater role in some aspect of pain processing than in many other candidate processes that at various times have been attributed to dACC (e.g., monitoring for cognitive conflict)”
Our first response was ‘Wow. After pages and pages of criticizing our paper, TY pretty much agrees with what we take to be the major claims of our paper. Yes, his version is slightly watered down from what we were claiming, but these are definitely in the ballpark of what we believe.’ But then TY’s next statement surprised us in a different sort of way. He wrote
“I think these are all interesting and potentially important observations. They’re hardly novel…”.
We’ve been studying the dACC for more than a decade and wondered what he might have meant by this. We can think of two alternatives for what he might have meant:
- That L&E and a small handful of others have made this claim for over a decade (but clearly not with the kind of evidence that Neurosynth provides).
- That TY already used Neurosynth in 2011 to show this. In the blog, he refers to this paper writing “We explicitly noted that there is preferential activation for pain in dACC”.
In either case, “they’re hardly novel” implies this is old news and that everyone knows and believes this, as if we’re claiming to have discovered that most people have two eyes, a nose, and a mouth. But this implication could not be further from the truth.
There is a 20+ year history of researchers ignoring or explaining away the role of pain processing in dACC. When pain effects are mentioned in most papers about the function of dACC, it is usually to say something along the lines of ‘Pain effects in the dACC are just one manifestation of the broader cognitive function of conflict detection (or salience or executive processes)’. This long history is indisputable. Here are just a few examples (and these are all reasonable accounts of dACC function in the absence of reverse inference data):
- Executive account: Price’s 2000 Science paper on the neural mechanisms of pain assigns to the dACC the roles of “directing attention and assigning response priorities”
- Executive account: Vogt et al. (1996) says the dACC “is not a ‘pain centre’” and “is involved in response selection” and “response inhibition or visual guidance of responses”
- Conflict account: Botvinick et al. (2004) wrote that “the ACC might serve to detect events or internal states indicating a need to shift the focus of attention or strengthen top-down control [...], an idea consistent, for example, with the fact that the ACC responds to pain”
- Salience account: Iannetti suggests the ‘pain matrix’ is a myth and in Legrain et al. (2011) suggests that the dACC’s responses to pain “could mainly reflect brain processes that are not directly related to the emergence of pain and that can be engaged by sensory inputs that do not originate from the activation of nociceptors.”
But perhaps this approach to dACC function has changed in light of TY’11 findings (i.e. Yarkoni et al. 2011). There he wrote “For pain, the regions of maximal pain-related activation in the insula and DACC shifted from anterior foci in the forward analysis to posterior ones in the reverse analysis.” This hardly sounds like a resounding call for a different understanding of dACC that involves an appreciation of its preferential involvement in pain. Here are quotes from other papers showing how they view the dACC in light of TY’11:
- Poldrack (2012) “The striking insight to come from analyses of this database (Yarkoni et al., in press) is that some regions (e.g., anterior cingulate) can show high degrees of activation in forward inference maps, yet be of almost no use for reverse inference due to their very high base rates of activation across studies”
- Chang, Yarkoni et al. (2012) “the ACC tends to show substantially higher rates of activation than other regions in neuroimaging studies (Duncan and Owen 2000; Nelson et al. 2010; Yarkoni et al. 2011), which has lead some to conclude that the network is processing goal-directed cognition (Yarkoni et al. 2009)”
- Atlas & Wager (2012) “In fact, the regions that are reliably modulated (insula, cingulate, and thalamus) are actually not specific to pain perception, as they are activated by a number of processes such as interoception, conflict, negative affect, and response inhibition”
Perhaps the reason people who cite TY’11 in their discussion of dACC didn’t pay much attention to the above quote from TY’11 (“For pain, the regions of maximal pain-related…”) is that they read and endorsed the more direct conclusion that followed: “…because the dACC is activated consistently in all of these states [cognitive control, pain, emotion], its activation may not be diagnostic of any one of them” (bracketed text added). If this last quote is taken as TY’11’s global statement regarding dACC function, then it strikes us still as quite novel to assert that the dACC is more consistently associated with one category of processes (pain) than others (executive, conflict, and salience processes).
3. L&E cherry picked the data they showed
In the article, we showed forward and reverse inference maps for 21 terms and then another 9 in the supplemental materials. These are already crowded, busy figures and so we didn’t have room to show multiple slices for each term. Fortunately, since Neurosynth is easily accessible (go check it out now at neurosynth.org – it’s awesome!) you can look at anything we didn’t show you in the paper. Tal takes us to task for this. He then shows a bunch of maps from x=-8 to x=+8 on a variety of terms. Many of these terms weren’t the focus of our paper because we think they are in the same class of processes as pain (as noted above). So it’s no surprise to us that terms such as ‘fear,’ ‘empathy,’ and ‘autonomic’ produce dACC reverse inference effects. In the paper, we reported that ‘reward’ does indeed produce reverse inference effects in the anterior portion of the dACC (and show the figure in the supplemental materials), so no surprise there either. Then at the bottom he shows cognitive control, conflict, and inhibition, which all show very modest footprints in dACC proper, as we report in the paper. Two things make the comparison between what he shows and what we reported in the paper unfair.
First, his maps are thresholded at p<.001 and yet all the maps that we report use Neurosynth’s standard, more conservative, FDR criterion of p<.01 (a standard TY literally set). Here, TY is making a biased, apples-to-oranges comparison by juxtaposing the maps at a much more liberal threshold than the one we used. Given that each of the terms we were interested in (pain, executive, conflict, salience) had more than 200 studies in the database, it’s not clear why TY moved from FDR to uncorrected maps here.
Second, the Neurosynth database has been updated since we did our analyses. The number of studies in the database has only increased by about 5% (from 10,903 to 11,406 studies) and yet there are some curious changes. For instance, fear shows more robust dACC effects now than it did a few months ago even though it only increased from 272 studies to 298 studies. We were more surprised to discover that the term ‘rejection’ has been removed from the Neurosynth database altogether such that it can no longer be used as a term to generate forward and reverse inference maps (even though it was in the database prior to the latest update). Given that Neurosynth is practically a public utility and federally funded, it would be valuable to know more about the specific procedures that determine which journals and articles are added to the database and on what schedule. Also, what are the conditions that can lead to terms being removed from the database, and which terms that were once included have now been removed?
In any event, we did not cherry pick data. We used the data that was available to us as of June 2015 when we wrote the paper. For the four topics of interest, below we provide more representative views of the dACC, thresholded as typical Neurosynth maps are, at FDR p<.01. We’ve made the maps nice and big so you can see the details and have marked in green the dACC region on the different slices (the coronal slices are at y=14 and y=22). When you look at these, we think they tell the same story we told in the paper.
4. Surprising lack of appreciation for what the reverse inference maps show in a pretty straightforward manner.
Let’s start with pain and salience. Iannetti and his colleagues have made quite a bit of hay the last few years saying that the dACC is not involved in pain, but rather codes for salience. One of us has critiqued the methods of this work elsewhere (Eisenberger, 2015, Annual Review). The reverse inference maps above show widespread robust reverse inference effects throughout the dACC for pain and not a single voxel for salience. When we ran this initially for the paper, there were 222 studies tagged for the term salience and now that number is up to 269 and the effects are the same.
Should our tentative conclusion be that we should hold off judgment until there is more evidence? TY thinks so: “If some terms have too few studies in Neurosynth to support reliable comparisons with pain, the appropriate thing to do is to withhold judgment until more data is available.” This would be reasonable if we were talking about topics with 10 or 15 studies in the database. But, there are 269 studies for the term salience and yet there is nothing in the dACC reverse inference maps. I can’t think of anyone who has ever run a meta-analysis of anything with 250 studies, found no accumulated evidence for an effect and then said “we should withhold judgment until more data is available”.
TY and his collaborators have criticized researchers in major media outlets (e.g. New York Times) for poor reverse inference – for drawing invalid reverse inference conclusions from forward inference data. The analyses we presented suggest that claims about salience and the dACC are also based on unfounded reverse inference claims. One would assume that TY and his collaborators are readying a statement to criticize the salience researchers in the same way they have previously.
But no. Nowhere in the blog does TY comment on this finding that directly contradicts a major current account of the dACC. Not so much as a “Geez, isn’t it crazy that so many folks these days think the dACC and AI can be best described in terms of salience detection and yet there is no reverse inference evidence at all for this claim.”
For the terms executive and conflict, our Figure 3 in the PNAS paper shows a tiny bit of dACC. We think the more comprehensive figures we’ve included here continue to tell the same story. If someone wants to tell the conflict story of why pain activates the dACC, we think there should be evidence of widespread robust reverse inference mappings from the dACC to conflict. But the evidence for such a claim just isn’t there. Whatever else you think about the rest of our statistics and claims, this should give a lot of folks pause, because this is not what almost any of us would have expected to see in these reverse inference maps (including us).
If you generally buy into Neurosynth as a useful tool (and you should), then when you look at the four maps above, it should be reasonable to conclude, at least among these four processes, that the dACC is much more involved in that first one (i.e. pain). Let’s test this intuition in a new thought experiment.
Imagine you were given the three reverse inference maps below and you were interested in the function of the occipital cortex area marked off with the green outline. You’d probably feel comfortable saying the region seems to have a lot more to do with Term A than Terms B or C. And if you know much about neuroanatomy, you’d probably be surprised, and possibly even angered, when I tell you that Term A is ‘motor’, Term B is ‘engaged’, and Term C is ‘visual’. How is this possible since we all know this region is primarily involved in visual processes? Well, it isn’t possible, because I lied. Term A is actually ‘visual’ and Term C is ‘motor’. And now the world makes sense again because these maps do indeed tell us that this region is widely and robustly associated with vision and only modestly associated with engagement and motor processes. The surprise you felt, if you believed momentarily that Term A was motor, arose because you have the same intuition we do that these reverse inference maps tell us that Term A is the likely function of this region, not Term B or Term C – and we’d like that reverse inference to be what we always thought this region was associated with – vision. It’s important to note that while a few voxels appear in this region for Terms B and C, it still feels totally fine to say this region’s psychological function can best be described as vision-related. It is the widespread robust nature of the effect in Term A, relative to the weak and limited effects of Terms B and C, that makes this a compelling explanation of the region.
Another point of this thought experiment is that if Term A is what we expect it to be (i.e. vision) then we can keep assuming that Neurosynth reverse inference maps tell us something valuable about the function of this region. But if Term A violates our expectation of what this region does, then we are likely to think about the ways in which Neurosynth’s results are not conclusive on this point.
We suspect that if the dACC results had come out differently, say with conflict showing wide and robust reverse inference effects throughout the dACC, and pain showing little to nothing in dACC, most of our colleagues would have said “Makes sense. The reverse inference map confirms what we thought – that dACC serves a general cognitive function of detecting conflicts.” We think it is the content of the results, rather than our approach, that is likely to draw ire from many.
5. L&E did the wrong analyses
TY suggests that we made a major error by comparing the Z-scores associated with different terms and should have used posterior probabilities instead. If our goal had been to compare effect sizes this might have made sense, but comparing effect sizes was not our goal. Our goal was to see whether there was accumulated evidence across studies in the Neurosynth database to support reverse inference claims from the dACC. While we think the maps for each term speak volumes just from visual inspection, we thought it was also critical to run the comparisons across terms directly. We all know the statistical error of showing that A is significant while B is not, and then assuming, but not directly testing, that A > B. TY has a section called “A>B does not imply ~B” (where ~B means ‘not B’). Indeed it does not, but all the reverse inference maps for the executive, conflict, and salience terms already established ~B. We were just doing due diligence by showing that the difference between A and B was indeed significant.
If it’s reasonable to use the Z-scores from Neurosynth to say “How much evidence is there for process A being a reliable reverse inference target for region X” then it has to be reasonable to compare Z-scores from two analyses to ask “How much MORE evidence is there for process A than process B being a reliable reverse inference target for region X”. This is all we did when we compared the Z-scores for different terms to each other (using a standard formula from a meta-analysis textbook) and we think this is the question many people are asking when they look at the Neurosynth maps for any two competing accounts of a neural region.
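To make the kind of comparison described above concrete, here is a minimal sketch. It assumes the textbook formula is the common difference test for two independent Z-scores, Z_diff = (Z_A - Z_B) / sqrt(2); that choice, the function name compare_z, and the example values are illustrative assumptions, not a restatement of the exact computation in the paper.

```python
import math
from typing import Tuple


def compare_z(z_a: float, z_b: float) -> Tuple[float, float]:
    """Test whether two independent Z-scores differ.

    Assumed formula (hypothetical stand-in for the textbook one):
    Z_diff = (Z_A - Z_B) / sqrt(2). Returns (Z_diff, two-tailed p).
    """
    z_diff = (z_a - z_b) / math.sqrt(2.0)
    # Two-tailed p-value under the standard normal distribution:
    # 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2)).
    p_value = math.erfc(abs(z_diff) / math.sqrt(2.0))
    return z_diff, p_value


if __name__ == "__main__":
    # Made-up Z-scores for "term A" and "term B" at a single location.
    z_diff, p_value = compare_z(8.0, 2.0)
    print(f"Z_diff = {z_diff:.2f}, two-tailed p = {p_value:.2g}")
```

Under this assumption, the test asks how much more accumulated evidence there is for term A than for term B at a location, which is a different question from comparing their effect sizes.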
TY then raises two quite reasonable issues with the Z-score comparisons, one of which we already directly addressed in our paper. First, TY raises the issue that Z-scores increase with accumulating evidence, so terms with more studies in the database will tend to have larger Z-scores. This suggests that terms with the most studies in the database (e.g. motor with 2081 studies) should have significant Z-scores everywhere in the brain. But terms with the most studies don’t look like this. Indeed, the reverse inference map for “functional magnetic” with 4990 studies is a blank brain with no significant Z-scores.
However, TY has a point. If two terms have similar true underlying effects in dACC, then the one with the larger number of studies will have a larger Z-score, all else being equal. We addressed this point in the limitations section of our paper, writing “It is possible that terms that occur more frequently, like “pain,” might naturally produce stronger reverse inference effects than less frequent terms. This concern is addressed in two ways. First, the current analyses included a variety of terms that included both more or fewer studies than the term “pain” and no frequency-based gradient of dACC effects is observable.” So while pain (410 studies) is better represented in the Neurosynth database than conflict (246 studies), effort (137 studies), or Stroop (162 studies), several terms are better represented than pain including auditory (1004 studies), cognitive control (2474 studies), control (2781 studies), detection (485 studies), executive (531 studies), inhibition (432 studies), motor (1910 studies), and working memory (815 studies). All of these, regardless of whether they are better or worse represented in the Neurosynth database, show minimal presence in the dACC reverse inference maps. It’s also worth noting that painful and noxious, with only 158 and 85 studies respectively, both show broader coverage within the dACC than any of the cognitive or salience terms considered in our paper.
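The relationship between study counts and Z-scores can be illustrated with a toy calculation. The sketch below is not Neurosynth’s actual pipeline; it assumes a simple pooled two-proportion z-test comparing how often a voxel is active in studies tagged with a term versus all other studies, and every count and rate in it is hypothetical.

```python
import math


def two_proportion_z(k_term: int, n_term: int, k_other: int, n_other: int) -> float:
    """Pooled two-proportion z-test (illustrative assumption only).

    k_term / n_term:   studies WITH the term that activate the voxel / total.
    k_other / n_other: studies WITHOUT the term that activate it / total.
    """
    p_term = k_term / n_term
    p_other = k_other / n_other
    pooled = (k_term + k_other) / (n_term + n_other)
    se = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_term + 1.0 / n_other))
    return (p_term - p_other) / se


if __name__ == "__main__":
    # Hypothetical baseline: 15% of 10,000 other studies activate the voxel.
    k_other, n_other = 1500, 10000
    for n_term in (100, 400, 1600):
        # Same underlying 30% activation rate, more studies: Z grows.
        z_effect = two_proportion_z(n_term * 3 // 10, n_term, k_other, n_other)
        # Rate equal to baseline (15%): Z stays at zero however many studies.
        z_null = two_proportion_z(n_term * 15 // 100, n_term, k_other, n_other)
        print(n_term, round(z_effect, 2), round(z_null, 2))
```

In this toy setting both points made above hold at once: with a fixed real difference in activation rates, more studies yield a larger Z-score, but a term whose rate matches the baseline produces no Z-score at all, no matter how many studies are tagged with it.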
TY’s second point is also reasonable, but it too is not a problem for our findings. TY points out that some effects may be easier to produce in the scanner than others and thus may be biased towards larger effect sizes. We are definitely sympathetic to this point in general, but TY goes on to focus on how this is a problem for comparing pain studies to emotion studies because pain is easy to generate in the scanner and emotion is hard. If we were writing a paper comparing effect sizes of pain and emotion effects, this would be a problem, but (a) we were not primarily interested in comparing effect sizes and (b) we definitely weren’t comparing pain and emotion, because we think the aspect of pain that the dACC is involved in is the affective component of pain, as we’ve written in many other papers dating back to 2003 (Eisenberger & Lieberman, 2004; Eisenberger, 2012; Eisenberger, 2015).
Is TY’s point relevant to our actual terms of comparison: executive, conflict, and salience processes? We think not. Conflict tasks are easy and reliable ways to produce conflict processes. In multiple ways, we think pain is actually at a disadvantage in the comparison to conflict. First, pain effects are so variable from one person to the next that most pain researchers begin by calibrating the objective pain stimuli delivered to each participant’s subjective responses to pain. As a result, each participant may actually be receiving different objective inputs, and this might limit the reliability or interpretability of certain observed effects. Second, unlike conflict, pain can only be studied at the low end of its natural range. Due to ethical considerations, we do not come close to studying the full spectrum of pain phenomena. Both of these issues may limit the observation of robust pain effects relative to our actual comparisons of interest (executive, conflict, and salience processes).
6. About those effect size comparison maps
After criticizing us for comparing Z-scores rather than effect sizes, TY goes on to produce his own maps comparing the effect sizes of different terms, claiming that these represent evidence that the dACC is not selective for pain. A lot of our objections to these analyses as evidence against our claims repeat what’s already been said, so we’ll start with what’s new and then only briefly reiterate the earlier points.
a) We don’t think it makes much sense to compare effect sizes for terms in voxels where there is no evidence that the term is a valid reverse inference target. For instance, the posterior probability at 0 26 26 for pain is .80 and for conflict is .61 (with .50 representing a null effect). Are these significantly different from one another? I don’t think it matters much because the Z-score associated with conflict at this spot is 1.37, which is far from significant (or at least it was when we ran our analyses last summer. Strangely, now, any non-significant Z-scores seem to come back with a value of 0, whereas they used to give the exact non-significant Z-score).
If I flip a coin twice I might end up with a probability estimate of 100% heads, but this estimate is completely unreliable. Comparing this estimate to those from a coin flipped 10,000 times which comes up 51% heads makes little sense. Would the first coin having a higher probability estimate than the second tell us anything useful? No, because we wouldn’t trust the probability estimate to be meaningful. Similarly, if a high posterior probability is associated with a non-significant Z-score, we shouldn’t take this posterior probability as a particularly reliable estimate.
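The coin analogy can be checked with a few lines of arithmetic. The sketch below uses a normal-approximation Z-score against a fair coin and the conventional 1.96 cutoff; both are our illustrative assumptions (the approximation is crude for two flips), and none of this is Neurosynth code.

```python
import math

def heads_z_score(n_flips: int, n_heads: int) -> float:
    """Normal-approximation z-score against a fair coin (p = 0.5).

    The approximation is crude for tiny samples; it is used here only
    to illustrate the argument, not as a recommended test.
    """
    expected_heads = n_flips * 0.5
    standard_deviation = math.sqrt(n_flips * 0.25)
    return (n_heads - expected_heads) / standard_deviation

SIGNIFICANCE_THRESHOLD = 1.96  # conventional two-sided cutoff at alpha = .05

# Coin 1: two flips, both heads (estimated probability of heads = 1.00).
z_two_flips = heads_z_score(2, 2)
# Coin 2: 10,000 flips, 5,100 heads (estimated probability of heads = 0.51).
z_many_flips = heads_z_score(10_000, 5_100)

for label, z in [("2 flips, 100% heads", z_two_flips),
                 ("10,000 flips, 51% heads", z_many_flips)]:
    verdict = "significant" if abs(z) > SIGNIFICANCE_THRESHOLD else "not significant"
    print(f"{label}: z = {z:.2f} ({verdict})")
```

Under these assumptions, the coin with the 100% estimate falls short of the threshold while the coin with the 51% estimate clears it, so the larger probability estimate is the less trustworthy of the two.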
b) TY’s approach for these analyses is to compare the effect sizes for any two processes A & B by finding studies in the database tagged for A but not B, finding others tagged for B but not A, and comparing these two sets. In some cases this might be fine, but in others it leaves us with a clean but totally unrealistic comparison. To give the most extreme example, imagine we did this for the terms pain and painful. It’s possible there are some studies tagged for painful but not pain, but how representative would these studies be of “painful” as a general term or construct? It’s much like the clinical problem of comparing depression to anxiety by comparing those with depression (but not anxiety) to those with anxiety (but not depression). These folks are actually pretty rare because depression and anxiety are so highly comorbid, so the comparison is hardly a valid test of depression vs. anxiety. Given that we think pain, fear, emotion, and autonomic are actually all in the same class of explanations, we think comparisons within this family are likely to suffer from this issue.
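A toy example shows how the "A but not B" step behaves when two terms overlap heavily. The study IDs below are invented purely for illustration, and the helper functions are hypothetical; they are not part of Neurosynth or of TY's analysis code.

```python
def exclusive_study_sets(studies_a: set, studies_b: set) -> tuple:
    """Return (studies tagged A but not B, studies tagged B but not A)."""
    return studies_a - studies_b, studies_b - studies_a

def fraction_retained(all_studies: set, exclusive_studies: set) -> float:
    """Share of a term's studies that survive the exclusion step."""
    return len(exclusive_studies) / len(all_studies)

# Hypothetical study IDs, invented purely for illustration.
pain = set(range(0, 10))       # studies 0..9
painful = set(range(1, 11))    # studies 1..10, nearly identical to "pain"
conflict = set(range(9, 19))   # studies 9..18, sharing one study with "pain"

pain_only, painful_only = exclusive_study_sets(pain, painful)
pain_not_conflict, conflict_only = exclusive_study_sets(pain, conflict)

print("painful kept:", fraction_retained(painful, painful_only))
print("conflict kept:", fraction_retained(conflict, conflict_only))
```

In this made-up case the exclusion step keeps most of the conflict studies but only a sliver of the painful studies, and that sliver is unlikely to be representative of the term as a whole.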
c) TY compared topics (i.e., a cluster of related terms), not terms. This is fine, but it is one more way that what TY did is not comparable to what we did (i.e. one more way his maps can’t be compared to those we presented).
d) Finally and most importantly, our question would not have led us to compare effect sizes. We were interested in whether there was greater accumulated evidence for one term (i.e. pain) being a reverse inference target for dACC activations than for another term (e.g. conflict). Using the Z-scores as we did is a perfectly reasonable way to do this.
7. Biases all around
Towards the end of his blog, TY says what we think many cognitive folks believe:
“I don’t think it’s plausible to think that much of the brain really prizes pain representation above all else.”
We think this is very telling because it suggests that findings such as those in our PNAS paper are likely to be unacceptable regardless of what the data show.
In contrast, we can’t think of too many things that the brain would prize above pain (and distress) representations. People who don’t feel pain (i.e. congenital insensitivity to pain) invariably die an early death – it is literally a death sentence to not feel pain. What could be more important for survival? Blind and deaf people survive and thrive, but those without the ability to feel pain are pretty much doomed.
So we all have our biases and it’s not surprising that this might shape how we evaluate the evidence. We’re not really referring to TY here, because we suspect that we might actually share a more similar view of the dACC than our respective blogs suggest. But we do think there is a strong expectancy on the part of many cognitive neuroscientists that the dACC performs either a general cognitive function (e.g. executive processing) or a more specific cognitive function (e.g. conflict detection or salience).
Similar (but not identical) to TY’s conclusions that we opened this blog with, we think the following conclusions are supported by the Neurosynth evidence in our PNAS paper:
- There is more widespread and robust reverse inference evidence for the role of pain throughout the dACC than for executive, conflict, and salience-related processes.
- There is little to no evidence from the Neurosynth database that executive, conflict, and salience-related processes are reasonable reverse inference targets for dACC activity.
- Pain processes, particularly the affective or distressing part of pain, are in the same family with other distress-related processes including terms like distress, fear, and negative affect.
Postscript. L&E should have used reverse inference, not forward inference, when examining the anatomical boundaries of dACC.
We saved this one for the postscript because this has little bearing on the major claims of our paper. In our paper, we observed that when one does a forward inference analysis of the term ‘dACC’ the strongest effect occurs outside the dACC in what is actually SMA. This suggested to us that people might be getting activations outside the dACC and calling them dACC (much as many activations clearly not in the amygdala have been called amygdala because it fits a particular narrative). TY admits having been guilty of this in TY’11 and points out that we made this mistake in our 2003 Science paper on social pain. A couple of thoughts on this.
a) In 2003, we did indeed call an activation outside of dACC (-6 8 45) by the term “dACC”. TY notes that if this is entered into a Neurosynth analysis the first anatomical term that appears is SMA. Fair enough. It was our first fMRI paper ever and we identified that activation incorrectly. What TY doesn’t mention is that there are two other activations from the same paper (-8 20 40; -6 21 41) where the top named anatomical term in Neurosynth is anterior cingulate. And if you read this in TY’s blog and thought “I guess social pain effects aren’t even in the dACC”, we would point you to the recent meta-analysis of social pain by Rotge et al. (2015) where they observed the strongest effect for social pain in the dACC (8 24 24; Z=22.2 PFDR<.001). So while we made a mistake, no real harm was done.
In contrast, TY’11’s mistake is probably of greater significance. Many have taken Figure 3 of TY’11 as strong evidence that dACC activity can’t be reliably associated with working memory, emotion, or pain. If TY had instead tested (2 8 40), a point directly below his that is actually in dACC (rather than 2 8 50, which TY now acknowledges is in SMA), he would have found that pain produces robust reverse inference effects, while neither working memory nor emotion does. This would have led to a very different conclusion than the one most have taken from TY’11 about the dACC.
b) TY suggested that we should have looked for “dACC” in the reverse inference map rather than the forward inference map, writing: “All the forward inference map tells you is where studies that use the term “dACC” tend to report activation most often”. Yet this is exactly what we were interested in. If someone is talking about dACC in their paper, is that the region most likely to appear in their tables? The answer appears to be no.
c) But again, this is not one of the central claims of the paper. We just thought it was noteworthy so we noted it. Nothing else in the paper depends on these results.