Five years ago, Maryland psychologist Ed Pigott read the first published results of the NIMH's large STAR*D study of antidepressants and depression. Even as he read that first article, he got the sense that "significant researcher trickery was afoot." Since then he has systematically exposed the trickery, piece by piece.
His latest article on the study, "STAR*D: A Tale and Trail of Bias," has just been published in Ethical Human Psychology and Psychiatry. He is also now blogging about his findings on madinamerica.com, and has posted documents there that he relied upon in his "deconstruction" of the $35 million study.
Here is a quick summary of his findings.
I. The study was designed to produce an inflated "remission" rate
The study was designed to assess the results that could be obtained with exemplary free acute and continuing care. Patients who didn't respond to a first antidepressant would then be offered a second pharmacological treatment, and so on through four possible drug regimens.
The protocol allowed for the enrollment of patients who were only mildly depressed (a Hamilton Rating Scale for Depression (HRSD) score ≥ 14, whereas most studies of antidepressants require patients to have an HRSD score ≥ 20). The protocol allowed for liberal use of medications other than antidepressants. The protocol also had an "educational" component, which informed the patients and their families that depression was a "disease, like diabetes or high blood pressure," and that this disease "can be treated as effectively as other illnesses." These protocol features were said to mimic exemplary real-world care.
But the protocol also included an unusual method for determining whether patients had "remitted." During the 14 weeks of acute treatment, the patients' symptoms were assessed every two weeks. Depressive symptoms are known to wax and wane, particularly in patients who are only mildly depressed. Yet if a patient was assessed as having remitted at any one of these two-week assessments, the patient was removed from the acute part of the trial and whisked into the long-term follow-up, marked down as having remitted on the drug. That patient might have become depressed again in the following days, and have ended up quite depressed at the end of 14 weeks (and thus enjoyed only a few days of relief), but still would be counted as a "remitted" patient in the published reports.
Wrote Pigott: "I termed it the ‘tag, you're healed' research design, since once ‘tagged,' patients were counted as remitted without the possibility of unremitting during the remaining weeks of acute-care treatment. Everyone knows that depression ebbs and flows . . . I'd never before seen such an obviously biased research design whose very purpose seemed to be to inflate the reported remission rate."
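To see why counting a patient as remitted at any single assessment pushes the rate upward, here is a minimal simulation sketch. Everything in it is a hypothetical illustration, not STAR*D data: the seven biweekly visits, the range of underlying severity, and the size of the visit-to-visit fluctuation are all assumptions, and the toy model includes no treatment effect at all. Symptoms simply fluctuate around a stable level.

```python
import random

def simulate(n_patients=10_000, n_visits=7, threshold=7, seed=0):
    """Compare 'below threshold at ANY visit' with 'below threshold at the FINAL visit'.

    All parameters are illustrative assumptions, not values taken from the study.
    """
    rng = random.Random(seed)
    any_visit = 0
    final_visit = 0
    for _ in range(n_patients):
        # Hypothetical patient: a stable underlying severity score...
        underlying = rng.uniform(6, 20)
        # ...plus random waxing and waning at each biweekly assessment.
        scores = [underlying + rng.gauss(0, 3) for _ in range(n_visits)]
        if any(s < threshold for s in scores):
            any_visit += 1      # "tagged" as remitted the first time a score dips
        if scores[-1] < threshold:
            final_visit += 1    # remitted only if the score is low at the end
    return any_visit / n_patients, final_visit / n_patients

any_rate, final_rate = simulate()
print(f"counted as remitted at any visit:   {any_rate:.1%}")
print(f"counted as remitted at final visit: {final_rate:.1%}")
```

Because the final visit is itself one of the visits, the "any visit" rate can never be lower than the "final visit" rate, and with fluctuating scores it will generally be higher. The size of the gap depends entirely on the assumed numbers.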
II. Various statistical manipulations were used to inflate the reported "remission" rate
The STAR*D researchers reported in the American Journal of Psychiatry that the "overall cumulative remission rate was 67%," which told of a paradigm of care that was highly effective. Here are the main statistical machinations--all of which fall into the "bad science" category--that the researchers used to produce that highly inflated result.
a) There were 607 patients entered into the trial who had a baseline HRSD score < 14, and thus, according to the protocol, were ineligible to be included in the analysis. However, the researchers included this group of patients when summarizing their findings in their published reports. Naturally, these patients were more likely to remit than those with a baseline HRSD score ≥ 14, and thus including them in the published reports inflated the remission rate.
b) There were 4,041 patients (including the 607 ineligible for the trial) who were treated with the antidepressant citalopram (Celexa) in the first stage of treatment. Of this group, 370 never came back for a second visit. The researchers had previously decided that any patient who was treated with an antidepressant and dropped out (without allowing researchers to take an "exit HRSD score") would be classified as a "non-remitter." However, when the STAR*D investigators reported their cumulative remission rates, they stated that only 3,671 people were eligible for analysis, rather than 4,041. The 370 who dropped out without coming back for a second visit were now excluded from the study pool. This increased the percentage of patients said to have remitted during the four stages of treatment, as the denominator used in that calculation (number remitted divided by number in study pool) was now lower than it should have been.
c) The protocol stated that the HRSD scale would be used to assess remission and response rates, with a patient scoring < 7 on the HRSD scale deemed to have remitted. The NIMH also noted that this was a blinded assessment: "To ensure that there would be no bias in assessing how well each treatment worked, the information that was used for measuring the outcome results of the study was collected both by an expert clinician over the phone who had no knowledge of what treatment the participants were receiving and by a novel computer-based interactive voice response system."
However, clinicians also used the Self-Reported Quick Inventory of Depressive Symptoms (QIDS-SR) to periodically assess depressive symptoms, with this non-blinded assessment used to guide subsequent care (dosage levels, etc.). According to the protocol, this score was not to be used to assess the reported "outcomes" of the treatment. Yet the STAR*D investigators did just that in their published articles, highlighting remission rates according to the QIDS-SR scale rather than the HRSD scores. Switching to this non-blinded scale, Pigott found, added more than 200 patients to the remitted group. (This increased the numerator in the equation used to calculate the percentage of patients who remitted during the four stages of treatment.)
d) Even with the use of the three statistical manipulations cited above and the "tag, you're healed" research design, only 1,854 of the 3,671 patients (50.5%) "remitted" during the four stages of treatment. But the researchers were not yet finished with their statistical adjustments. They then reasoned that if all of the non-remitted patients who dropped out during the acute treatment phase had instead stayed in the trial throughout all four stages of treatment, these drop-outs would have remitted at the same rate as those who went through all four stages, and if so, this would have produced a cumulative remission rate of 67 percent.
Pigott concluded that it is impossible from the published data to precisely calculate a remission rate based on the guidelines set forth in the protocol (i.e., the remission rate for the 3,343 patients with a baseline HRSD score ≥ 14 who then saw their symptoms drop to HRSD < 7 during the four stages of acute treatment). My best estimate is that roughly 1,300 remitted, or around 38%.
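The arithmetic behind these percentages is easy to check. The short sketch below uses only the patient counts quoted in this article; the helper name `remission_rate` is hypothetical. Holding the reported numerator of 1,854 fixed while restoring the 370 excluded drop-outs to the denominator is an illustration of the denominator effect alone, not a corrected result, since that numerator still reflects the other adjustments described above.

```python
def remission_rate(remitted, pool):
    """Remission rate as a fraction: number remitted divided by number in the study pool."""
    return remitted / pool

enrolled = 4041            # patients who started citalopram in stage one
no_second_visit = 370      # never returned for a second visit
analyzed = enrolled - no_second_visit   # 3671, the pool used in the published reports
reported_remitters = 1854  # remitters counted in the published reports

reported = remission_rate(reported_remitters, analyzed)
full_pool = remission_rate(reported_remitters, enrolled)
estimate = remission_rate(1300, 3343)   # the rough protocol-based estimate quoted above

print(f"reported numerator / reduced pool: {reported:.1%}")   # 50.5%
print(f"reported numerator / full pool:    {full_pool:.1%}")  # 45.9%
print(f"rough protocol-based estimate:     {estimate:.1%}")   # 38.9%
```

Dropping 370 patients from the denominator moves the figure by several percentage points on its own, before any change to the numerator. Each of these calculated rates sits well below the 67% projection.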
III. In its press releases, the NIMH further hyped the already inflated results
In its many press releases, the NIMH regularly announced that "over the course of all four levels, about 70% of those who did not withdraw from the study became symptom free." But, as Pigott observed, the remitted patients did not necessarily become "symptom free." Remission was defined as a HRSD score < 7, but the patients could still have up to seven "mild" depressive symptoms and end up with that score. For instance, on the Hamilton scale, the following symptoms all account for only one point: "feels like life is not worth living," "feels he/she has let people down," "feels incapable, listless, less efficient," and "has decreased sexual drive and satisfaction." The press release statements that 70% of the patients became symptom free, Pigott wrote, "makes medical claims for antidepressant drugs' level of effectiveness that are simply not true and suggests a profound pro-drug bias within this taxpayer-funded agency."
IV. The STAR*D investigators hid the long-term results
When the STAR*D investigators designed the study, they sought to maximize the stay-well rate for remitted patients during twelve months of "continuing care." In this stage of treatment, physicians could change the patients' medications, alter dosages, and add new medications. Patients were paid $25 each time they had their symptoms assessed, as it was thought this would help keep patients in the study.
In their published articles, the STAR*D investigators reported that a majority of the remitted patients stayed well during the twelve months. Combined with the 67% cumulative remission rate, this finding made it seem that around 40% of the patients who had entered the trial had remitted and stayed well for a year. In a 2006 article, however, the researchers did publish a graphic that seemed to provide numerical data charting the stay-well rate for the remitted patients, if only the chart could be understood (most readers couldn't make any sense of it). Pigott and his collaborators eventually deciphered it, and they found that only 108 of the 4,041 patients (3%) who had entered the trial remitted and then remained well and in the study throughout the 12 months of continuing treatment. This is data that tells of a failed paradigm of care.
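The gap between the impression left by the published articles and the deciphered figure can be put side by side in a few lines. The 60% stay-well fraction below is a hypothetical stand-in for "a majority," chosen only because it reproduces the "around 40%" impression; the published reports' exact figure is not given in this article.

```python
cumulative_remission = 0.67     # projected cumulative remission rate from the published reports
assumed_stay_well = 0.60        # ASSUMPTION: an illustrative value for "a majority"

# Impression a reader might form by multiplying the two published claims.
implied = cumulative_remission * assumed_stay_well

# Figure Pigott and his collaborators extracted from the 2006 graphic.
observed = 108 / 4041

print(f"implied remitted-and-stayed-well rate:  {implied:.1%}")
print(f"observed remitted-and-stayed-well rate: {observed:.1%}")   # 2.7%
```

The observed 108 of 4,041 works out to about 2.7%, which the article rounds to 3%.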
V. The STAR*D investigators failed to publish the data that was gathered to assess global outcomes
During the trial, investigators gathered data to assess 12 other long-term outcome measures that would provide a global picture of how drug treatment affected patients' lives. These measures included assessment of depressive symptoms before and after the study, level of functioning, patient satisfaction with the treatment, quality of life, the burden of side effects, health care utilization and cost of care, health status, work productivity, and personal income. Even though STAR*D investigators have published more than 70 peer-reviewed articles, they still have not published the data for any of these 12 long-term measures.
It is easy to speculate why that is so. "STAR*D's failure to publish its findings as prespecified (in the protocol) is highly suggestive that antidepressant drug care failed to deliver the wide range of positive outcomes and offsetting costs its authors and NIMH expected so they chose not to publish this damning data," Pigott wrote.
Such is the latest on the STAR*D trial. It cost American taxpayers $35 million, and was touted as the "largest antidepressant effectiveness trial ever conducted." As Pigott's deconstruction of the study makes clear, what the taxpayers got for their money were published reports and press releases that should be classified as government "propaganda," rather than reports of honest science. Unfortunately, many prescribers of antidepressants now rely on that propaganda as their "evidence base," citing the 70% figure as proof of the effectiveness of this form of treatment. Propaganda masquerading as science can exact a very high cost.