Antidepressant, Talk Therapy Fail to Beat Placebo--Really?

A fresh look at news coverage and the actual study

Posted Jan 03, 2012

"Antidepressant, talk therapy fail to beat placebo." So headlined a Reuters news item that was quickly picked up by other news services, almost always verbatim. The message spread across the Internet via Twitter and blog commentaries. The usual ideologues interpreted the study as evidence for claims that antidepressants are no better than placebos. If the claim were true about antidepressants, it would be just as true of psychotherapies versus placebo, but no one wants to make that point. We will come back to that.

The news item continued:

  • Neither antidepressants nor "talk therapy" were able to outperform inactive placebo pills in a new clinical trial on depression treatment -- though there were hints that the effects varied based on people's sex and race, researchers report.
  • After 16 weeks, there were no overall differences in how the three groups fared.
  • Of antidepressant patients, 31 percent were treatment "responders" (meaning they'd fallen below a certain score on a standard measure of depression symptoms, or had seen their score drop at least 50 percent.)
  • The same was true of about 28 percent of patients in the talk-therapy group, and 24 percent in the placebo group. The differences among the three groups were so small as to be likely due to chance.
  • "I was surprised by the results. They weren't what I'd expected," said lead researcher Jacques P. Barber, dean of the Institute of Advanced Psychological Studies at Adelphi University in Garden City, New York.

Immediate Grounds for Skepticism.

• This finding is contradicted by a large literature demonstrating that antidepressants are superior to pill placebo and a smaller literature showing that psychotherapy has a similar advantage when measured against a pill placebo provided in a clinical trial.
• Supportive expressive therapy, a short-term version of psychodynamic and psychoanalytic treatments, lacks empirical support for depression, particularly compared to the well-validated cognitive behavior therapy. So, the label "talk therapy" in the news item is too broad and inaccurate.
• The rates of response to both antidepressants and pill placebo in this trial are lower than in other trials.
• This trial is too small to detect differences among active treatments, even if they were present.

Going to the Original Article.

I was most interested in whether this trial succeeded in obtaining the intended sample size, if the trial was analyzed with "gold standard" intent-to-treat analyses that take into account all patients who were randomized, and if sufficient numbers of patients obtained adequate exposure to treatment.

The article acknowledges difficulties recruiting the intended sample size of 180 patients. A number of different strategies were adopted to recruit 156 patients, including ads in newspapers. The article does not state what incentives were offered, but such recruitment strategies require financial incentives to attract and retain patients. Substantial payments have the disadvantage of attracting patients who are motivated by the money, not improvement in their depression. Many "professional research participants" earn a good proportion of their income from volunteering for clinical trials.

For patients with insurance, treatment is already available in the community from primary care physicians. The antidepressants used in the study have already gone generic, and cheap versions are available for a few dollars a month. Additionally, patients with insurance have little incentive to enroll in a clinical trial where they might not get their preferred treatment. These considerations become important in a trial where there are no innovative treatments: a rather traditional psychodynamic treatment versus a cheap, generic antidepressant.

So, this is an unrepresentative sample that likely lacks the motivation of depressed patients seeking treatment in clinical settings. The sample was low income--typical of Philadelphians enrolling in clinical trials with financial incentives. But the article indicates no special steps to educate patients about treatment, increase their adherence, or retain them in the study. That missing ingredient may have contributed to the 40% dropout rate. My group found that it takes persistent effort, special accommodations like flexible scheduling of contacts, and incentives to recruit and retain representative samples of low-income, urban research participants.

The study is flawed by too many patients getting an inadequate exposure to treatment. Only 91 of the 156 patients completed the study. The investigators attempted to compensate for the dropouts by using a statistical technique known as Last Observation Carried Forward (LOCF). This technique treats the last outcome data collected from a patient as that patient's final outcome. LOCF is known to provide biased estimates of group differences in the outcome of a clinical trial. It assumes dropouts are random and ignores whether patients were improving or deteriorating when they dropped out.

Thus, use of LOCF allowed the investigators to have data for all patients, even dropouts. Including all patients who were randomized -- what are called intent-to-treat analyses -- is the gold standard for clinical trials. Such analyses answer the question of what happens if patients are randomized to a particular treatment. If many patients drop out, that is a relevant outcome. Moreover, limiting analyses to patients who completed a trial no longer preserves the benefits of randomization, because dropouts are not random. So, the best strategy is to rely on intent-to-treat analyses, but relying on LOCF introduced bias in this trial. Still, it is also important to pay attention to "as treated" analyses focusing on patients who actually got the treatment as planned. And here the numbers get a lot smaller.
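
To make the imputation described above concrete, here is a minimal Python sketch of carrying a patient's last observed score forward to the end of the trial. The visit schedule and all scores are hypothetical and chosen only for illustration; nothing here is taken from the trial's data or the investigators' actual analysis code.

```python
from typing import List, Optional, Sequence


def carry_last_observation_forward(
    scores: Sequence[Optional[float]],
) -> List[Optional[float]]:
    """Fill each missing visit (None) with the most recent observed score."""
    filled: List[Optional[float]] = []
    last_seen: Optional[float] = None
    for score in scores:
        if score is not None:
            last_seen = score
        filled.append(last_seen)
    return filled


# Hypothetical depression scores at weeks 0, 4, 8, 12, 16 (higher = worse).
# None marks visits missed after the patient dropped out.
patients = {
    "completer": [24, 20, 16, 13, 11],
    "dropout who was improving": [24, 20, None, None, None],
    "dropout who was worsening": [24, 27, None, None, None],
}

for label, scores in patients.items():
    final_score = carry_last_observation_forward(scores)[-1]
    print(f"{label}: imputed week-16 score = {final_score}")
```

Both dropouts are frozen at their week-4 score (20 and 27 in this made-up example), whatever direction they were heading in. The imputed endpoint therefore depends on when and why a patient left, which is one way the technique can distort group comparisons.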

The odds were against finding any differences between treatments. According to the statistical power analyses used to design the study, if the investigators had gotten the sample size they planned, there would have been an 80% chance of finding a difference between either the psychodynamic therapy or the antidepressants and an inert treatment, such as waitlist control. However, the problem is that pill placebo administered in the context of a clinical trial is not an inert treatment, as the investigators note. Both patients and providers are blinded, so neither knows which patients are getting a placebo rather than an antidepressant. The patients are given positive expectations and a lot of encouragement and support that may be sufficient to produce improvement.

So, the initial power calculations used to determine sample size were unrealistic because they assumed that the comparison was with an inert treatment. The intended sample size of 180 distributed across three groups is too small. But we should also take into account that the investigators succeeded in recruiting only 156 patients, with 40% dropouts among patients getting medication or pill placebo and 23% among the patients getting psychotherapy. Under those conditions, the chances of obtaining a significant effect are well below 50-50, especially if any effect depended on patients actually having been sufficiently exposed to treatments. The investigators had little chance of finding a statistically significant difference among the three groups. Why was the lead investigator surprised by the results?
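
A rough way to see how slim the chances were is to compute approximate power for a two-group comparison of response rates. The sketch below uses a standard normal approximation for two independent proportions. It treats the 31 percent and 24 percent response rates quoted in the news item as if they were the true rates, and it assumes patients were split equally across the three arms; both are assumptions made only for illustration, since the post does not report the actual group sizes.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p1: float, p2: float, n_per_group: int,
                          alpha: float = 0.05) -> float:
    """Approximate power of a two-sided z-test comparing two proportions."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1 - alpha / 2)
    # Standard error of the difference between two independent proportions.
    se = sqrt(p1 * (1 - p1) / n_per_group + p2 * (1 - p2) / n_per_group)
    z_effect = abs(p1 - p2) / se
    # Probability that the test statistic lands in either rejection region.
    return norm.cdf(z_effect - z_crit) + norm.cdf(-z_effect - z_crit)


# Response rates from the news item, treated as true rates (an assumption).
antidepressant_rate = 0.31
placebo_rate = 0.24

# Hypothetical per-arm sizes: equal three-way splits of 180, 156, and 91.
for label, n in [("planned (180 / 3)", 60),
                 ("recruited (156 / 3)", 52),
                 ("completers (91 / 3, rounded)", 30)]:
    power = power_two_proportions(antidepressant_rate, placebo_rate, n)
    print(f"{label}: n per arm = {n}, approximate power = {power:.2f}")
```

Under these assumptions the approximate power stays well under 20 percent at all three sample sizes, which is consistent with the argument that a nonsignificant result was the most likely outcome from the start.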

In the larger literature, getting an antidepressant versus a pill placebo in a clinical trial has a small effect size (about r = .30). In the entire literature, there are fewer than a dozen comparisons between psychotherapy and pill placebo, but the advantage of psychotherapy over placebo is about the same as for an antidepressant. This, of course, is an average effect, with some patients doing much better and others worse.

This does not mean that a primary care physician prescribing a sugar pill to patients would have a similar sized positive effect. Actually, given the poor quality of routine care for depression in the community, antidepressants given by a primary care physician have about the same effects on depressed patients as pill placebos given in the context of a clinical trial, where there is support and active clinical management and, importantly, follow-up.

Pim Cuijpers has pointed out that the largest difference typically found between two credible, structured psychotherapies is r = .20. A trial able to detect such a small effect without missing it by chance would require about 1000 patients. That would be wasteful and impractical, which raises the question of why NIH felt the need to fund an expensive, 6-year trial comparing a short-term psychodynamic therapy to antidepressants and a pill placebo, given what we already know about cognitive behavior therapy and given how little chance there was of finding a clinically significant difference.
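
The sample-size arithmetic behind a figure like "about 1000 patients" can be sketched with the usual normal-approximation formula for comparing two group means. One loud caveat: the sketch reads the .20 as a standardized mean difference (Cohen's d). The post writes it as r, and the post does not say which power level was assumed, so both the d reading and the 80 and 90 percent power levels below are assumptions rather than the author's stated method.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_for_d(d: float, power: float = 0.80,
                      alpha: float = 0.05) -> int:
    """Patients needed per arm to detect a standardized difference d
    in a two-arm trial (two-sided test, normal approximation)."""
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2)
    z_power = norm.inv_cdf(power)
    return ceil(2 * (z_alpha + z_power) ** 2 / d ** 2)


# Assumed effect size: .20 read as Cohen's d (the post labels it r).
effect_size = 0.20

for target_power in (0.80, 0.90):
    per_arm = n_per_group_for_d(effect_size, power=target_power)
    print(f"power {target_power:.0%}: {per_arm} per arm, "
          f"{2 * per_arm} in a two-arm trial")
```

Under that reading, a two-arm trial needs 393 patients per arm (786 in total) for 80 percent power and 526 per arm (1052 in total) for 90 percent power, which is in the neighborhood of the figure quoted above. A third arm would push the total higher still.
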

The claim in the press coverage that outcomes varied by gender and race is likely to be due to chance and certainly not meaningful, because such claims do not take differences in exposure to treatment into account and are based on exceptionally small samples of patients. Moreover, many comparisons were examined to produce a few positive results that probably would not have been predicted ahead of time.
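
The multiple-comparisons problem can be put in numbers with a small Python sketch. It assumes the tests are independent and each uses the conventional .05 threshold. The post does not say how many subgroup comparisons were run, so the counts below are hypothetical.

```python
def chance_of_false_positive(n_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one 'significant' result among n_tests
    independent tests when no true effect exists."""
    return 1 - (1 - alpha) ** n_tests


# Hypothetical numbers of subgroup comparisons.
for n_tests in (1, 5, 10, 20):
    risk = chance_of_false_positive(n_tests)
    print(f"{n_tests:2d} comparisons: {risk:.0%} chance of a spurious finding")
```

With ten independent comparisons the chance of at least one spurious "hit" is about 40 percent, and with twenty it is roughly 64 percent, so a few positive subgroup findings are about what chance alone would produce.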

Why was the lead investigator surprised by the results? He was probably lulled into believing that such a small trial with substantial dropout rates could yield a significant effect because of what he saw already published in what are supposedly the best journals. The frequency with which small trials with positive results occur in the literature and the rarity of negative results being published are striking. They defy all predictions from power analyses, particularly in what are considered the best journals for publishing psychotherapy trials, like the Journal of Consulting and Clinical Psychology (JCCP).

In an earlier blog, I gave an example of a study of whether a short course of acceptance and commitment therapy (ACT) reduced rehospitalization. The investigators conveniently dropped patients who killed themselves or went to jail and used flexible approaches to data analysis in order to get published as a positive study in the prestigious JCCP. This journal has a long-standing policy of rejecting studies with negative findings and not allowing post-publication peer commentary on the flaws of studies already published there. A strong confirmatory bias is maintained by proponents of particular therapies getting away with biased analysis and interpretation of their results and by the journal rejecting honestly and transparently presented negative results. And no one gets to publish critical commentaries on egregious examples of this appearing in the journal.

Elsewhere, the meta-analysis by ACT's developer that served as the basis of his claims in Time Magazine for its superiority over other therapies, and the more recent claims for the superiority of long-term psychodynamic psychotherapy over briefer therapies made in the British Journal of Psychiatry and JAMA, depend entirely on flawed, underpowered studies achieving positive results at statistically improbable rates. This bias is further amplified in the selection, synthesis, and interpretation of available studies in meta-analyses. Show me claims for the superiority of a psychotherapy over a credible, structured alternative, and I will probably be able to show you a reliance on underpowered, flawed studies and a gross confirmatory bias in the publishing and synthesis of the studies.

The article that prompted the claims of no difference between psychotherapy, medication and pill placebo was not particularly flawed, relative to other papers published in JCCP, but given its negative results, there was no way that it would get into that journal. And its results really cannot unseat conclusions from a large literature. I agree with at least half of a comment in the Reuters news article: "Those findings are interesting, but need to be interpreted with a grain of salt," said Dr. David Mischoulon, an associate professor of psychiatry at Medical School.

Stay tuned, and the Skeptical Sleuth will be providing more such examples of exaggerated claims based on modest, flawed studies and clues for detecting hype and hokum yourself.

About the Author

Jim Coyne, Ph.D., is a clinical health psychologist and Professor in the Department of Psychiatry at the University of Pennsylvania.
