We Need a Thorough Investigation of the STAR*D Scandal
Should reports from the NIMH's STAR*D trial be retracted?
Posted May 04, 2011
For some time now, the medical community -- and to a certain extent, the general public -- has understood that the reports in the medical literature of industry-funded trials of psychiatric drugs do not provide an accurate representation of the drugs' merits. The trials of the second-generation psychotropics were often biased by design; published results were spun; adverse events were minimized; negative studies went unpublished. The studies published in the medical literature really tell of a marketing exercise, as opposed to a scientific one.
However, the medical community and the public have long thought -- or at least hoped -- that psychiatric drug studies funded by the National Institute of Mental Health are not similarly tainted. Here, at least, the published reports would tell of studies that were not biased by design, with the results honestly reported. At least that is the expectation. And this is why the NIMH's STAR*D study is such a disappointment, and why it is so important that the full details of that scandal be made known.
Maryland psychologist Ed Pigott has spent more than five years investigating the details of this study. He has authored or co-authored three journal articles about it, and is now posting new findings, along with supporting documents, on a blog. In his latest post, Pigott tells of how he requested that two journals, the Journal of Clinical Psychopharmacology and Psychological Medicine, retract two STAR*D articles they published.
By doing so, Pigott is continuing to put a spotlight on this scandal, which I believe needs to be thoroughly investigated by the NIH or another governmental investigative body. We need to know all of the scientific sins that were committed, and we need an accounting of the investigators' financial conflicts of interest. STAR*D was hailed as the largest trial of antidepressants ever conducted, at a cost of $35 million to the American taxpayers, and we deserve to know why the results weren't honestly reported.
Here are several of the major sins that Pigott and others have revealed:
• Patients who started on Celexa in the initial step of the trial but then dropped out without having their exit symptoms measured by the Hamilton Rating Scale for Depression (HRSD) were supposed to be counted as treatment failures. The STAR*D authors did not include these failures when calculating remission and response rates.
• In their reporting of outcomes, the STAR*D investigators included patients who were too mildly depressed at baseline (as measured by HRSD) to be eligible for inclusion in the analysis of outcomes, thereby boosting reported response and remission rates.
• The protocol specified the HRSD as the primary tool for measuring outcomes. However, in their published articles, the STAR*D authors switched to reporting outcomes as measured by the Quick Inventory of Depressive Symptomatology-Self Report (QIDS-SR) scale. This switch, Pigott writes, "had a dramatic impact on inflating STAR*D's published remission and response rates."
• The authors hid the fact that only 108 of the 4,041 patients who entered the trial remitted and then stayed well and in the trial through the year-long follow-up. That is a documented stay-well rate of roughly 3%; the STAR*D authors published reports that indicated around 40% of the entering patients remitted and then stayed well during the maintenance period.
• In a STAR*D summary article, the researchers rounded up results six times in an incorrect manner.
As Pigott notes, all of these statistical machinations served to inflate the reported remission and response rates. "This is a story of scientific fraud," he said.
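The stay-well figure in the list above is simple to check. Here is a minimal sketch of the arithmetic in Python, using only the two counts quoted in this post; the variable names are mine, and the 40% value is the approximate figure attributed to the published reports ("around 40%"), not an exact number.

```python
# Counts as quoted in this post; variable names are illustrative only.
entered = 4041       # patients who entered the trial
stayed_well = 108    # remitted, stayed well, and remained in the trial for the year

stay_well_rate = stayed_well / entered
print(f"Documented stay-well rate: {stay_well_rate:.1%}")

# Assumption: "around 40%" is treated as exactly 0.40 purely for comparison.
reported_rate = 0.40
implied_patients = round(reported_rate * entered)
print(f"Patients implied by a 40% rate: {implied_patients}")
```

On these numbers, a 40% rate would correspond to roughly fifteen times as many patients as the 108 who were documented as remitting and staying well.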
In an April 5 letter to the editors of the Journal of Clinical Psychopharmacology, Pigott asked for a retraction of an article it published in April titled "Residual Symptoms in Depressed Outpatients Who Respond by 50% But Do Not Remit to Antidepressant Medication." A day later, he wrote the editors of Psychological Medicine to ask that they retract a 2010 article titled "Residual Symptoms After Remission of Major Depressive Disorder with Citalopram and Risk of Relapse."
In his letters to the editors, Pigott recited all of the various ways that the investigators inflated the remission and response rates. In addition, he charged that the investigators lied about their use of the QIDS-SR scale, presenting it as a blinded assessment, when in fact much of the QIDS-SR data was collected in a non-blinded manner.
According to Pigott, the protocol stated that the QIDS-SR would be administered in a non-blinded manner by a clinical research coordinator at the beginning of each clinical visit. The protocol "explicitly excluded" the use of this non-blinded assessment to measure outcomes. However, the protocol also stated that the QIDS-SR would be periodically administered as patients moved from one stage to the next using a "telephone-based interactive voice response" system, which would thus gather "blinded" data. Yet, in the articles that Pigott wants retracted, the STAR*D investigators reported results that they said were collected through the administration of the QIDS-SR by telephone "within 72 hours of each clinic visit." Pigott maintains this is impossible, because the protocol didn't call for such telephone assessments after each visit.
If Pigott is correct about this, the STAR*D investigators not only switched scales in order to inflate the reported remission and response rates (from HRSD to QIDS-SR), they also have reported -- at least in the two articles that Pigott wants retracted -- that the QIDS-SR data was blinded data, when in fact it was non-blinded data.
Pigott noted that there is another problem with the two articles that he wants retracted. In these reports, the STAR*D investigators relied on the QIDS-SR data to conclude that few patients experienced "treatment-emergent" suicidal ideation. For instance, in the Journal of Clinical Psychopharmacology article, the STAR*D investigators wrote that "suicidal ideation was the least common treatment-emergent symptom," occurring in only 0.7% of patients. They concluded that "this study provides new evidence to suggest little to no relation between use of a selective serotonin reuptake inhibitor and self-reported suicidal ideation."
The STAR*D investigators are seeking here to remove the dark cloud of drug-induced suicide from the SSRIs. But as Pigott noted in his retraction letters, the STAR*D investigators previously reported in 2007 -- in articles published in the American Journal of Psychiatry and Archives of General Psychiatry -- that more than 6% of patients taking Celexa experienced treatment-emergent suicidal ideation, and that much higher rate was also derived from the QIDS-SR. So how -- and why -- does that rate decline to 0.7% in the more recently published articles?
As to the why, here is one speculative possibility. In the earlier articles, the investigators were reporting that they may have found "genetic markers" for predicting patients who would be at high risk of suffering this side effect, with several of the STAR*D investigators filing for a patent on the biomarkers. Thus, it's easy to see that the report of a higher suicidal ideation rate would make such a patent potentially more valuable than a lower rate (since the biomarkers could help minimize a very significant risk with use of the drugs). But in the two more recently published articles, the authors came to a conclusion that presented Celexa (and other SSRIs) in a favorable light. And why might they be eager to do that? Most of the STAR*D authors had financial ties to Forest Pharmaceuticals and other makers of SSRIs. In the recent articles, the financial forces pushed for findings in a different direction.
As to the "how," the researchers -- in their recent articles -- didn't refer to the earlier reports and thus didn't explain the discrepancy. But the newer articles report on a smaller subset of patients than the earlier ones. Thus, by failing to reference the findings from the earlier articles, the researchers are committing another scientific sin, which comes under the heading of "data mining." Researchers will sift through the data for various population groups and then publish results that make the drug look good. Here, they are presenting the subset data as "global" evidence that SSRIs don't cause treatment-emergent suicidal ideation, even though that finding is contradicted by their earlier articles, which reported on treatment-emergent suicidal ideation in a larger group of patients.
The editors of the two journals have informed Pigott that they won't retract the articles. Their response is, in fact, understandable. They don't have any way of knowing what is true and what isn't. "We do not have access to the protocol, or other documents that are not in the public domain . . . [and] the Journal is not in the business of investigation," explained the editors of the Journal of Clinical Psychopharmacology in their reply to Pigott. "Your concerns are best addressed, for example, to the NIH project or contract officer who oversees the study, and/or to the department chairperson at the senior author's institution."
That is what we need now -- an investigation. We need an independent group of researchers to review the data and report what percentage of patients who were eligible for the trial "remitted" according to the HRSD scale, with that percentage calculated based on the methodology specified in the protocol. We need to know why the long-term outcomes data was hidden. We need to know whether the investigators, in their reports of QIDS-SR data, were relying on non-blinded assessments made by the clinical research coordinators, or on assessments made over the telephone, and whether the investigators correctly distinguished between the two in their articles (or instead lied to readers about this). We need to know why the researchers reported a very low rate of treatment-emergent suicidal ideation in recently published articles, and nearly a ten-fold higher rate in a previous set of articles.
We also need an investigation of the investigators' financial interests. Most had ties to Forest Laboratories or some other maker of antidepressants, and thus they were in a position to want to report that the drugs were effective. One of the principal investigators, John Rush, owned the copyright for the QIDS-SR. Thus, using the QIDS-SR to assess outcomes in this prominent trial would serve to validate its use, and this promised to make the copyright much more valuable. The investigators who may have found a genetic marker for those at risk of experiencing drug-caused suicidal ideation took out a patent on their findings. How did all of these financial ties affect the reporting of results?
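For readers who want to check the size of that gap, here is a rough sketch of the comparison in Python. It uses only the two rates quoted earlier in this post. The earlier articles are described as reporting "more than 6%," so the 6.0 below is a floor that I am assuming for illustration, and the ratio it yields is a lower bound.

```python
# Rates as quoted in this post, in percent; names are illustrative only.
earlier_rate = 6.0   # assumption: floor for "more than 6%" in the 2007 articles
recent_rate = 0.7    # rate reported in the more recent articles

ratio = earlier_rate / recent_rate
print(f"Earlier rate is at least {ratio:.1f} times the recent rate")
```

Because the earlier figure was "more than 6%," the true multiple is somewhat higher than this floor.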
The public has lost faith that industry-funded trials of psychiatric drugs will be honestly designed and reported. Now, with the ongoing reports of this STAR*D scandal, the public has reason to lose faith in NIMH-funded trials of psychiatric drugs. We need a thorough, honest investigation of STAR*D in order for that faith to be restored.