Readers should be able to assume that the conclusions of a meta-analysis published in a prestigious journal are valid. After all, the article survived rigorous peer review and was probably strengthened by revisions the authors made in response to a likely "revise and resubmit" decision. But seriously flawed meta-analyses are all too common in high-impact journals, as my colleagues and I showed in the case of meta-analyses appearing in JAMA and Health Psychology. Over the past three blog posts [1,2,3], I have revealed the exceptionally bad science of a meta-analysis written by an American antiabortion activist and published in the British Journal of Psychiatry. The meta-analysis made wild claims, such as that over a third of all suicides and 10% of all mental health problems among women of reproductive age are due to their being distraught after having had an abortion.
Scathing Rapid Responses from an international group of experts continue to be posted on the journal website, attacking the article's many flaws and the serious conflict of interest of its author, Priscilla Coleman. The author's evasive Rapid Response reply is unlikely to satisfy anyone who has actually read her meta-analysis carefully.
Coleman dismisses criticisms that she relies too heavily on her own poor-quality articles to the exclusion of the work of others, and that she has no business performing a meta-analysis that evaluates what is basically her own work: "I believe I am more widely published on this topic than any other researcher in the world. It makes sense, therefore, that I am a co-author on a significant proportion of the included studies." Hmm, but when a Royal College of Psychiatrists panel independently evaluated Coleman's research, it concluded that 10 of the 11 studies that she included in her own meta-analysis should have been excluded because they were of poor quality and not relevant to answering the research questions that Coleman posed.
Coleman also dismisses allegations based on well-documented statements of extreme bias by her and her co-authors:
Rather than hurling unfounded accusations of personal bias, we need to more effectively utilize the well-established methods of science to fairly scrutinize the methodologies of individual studies, expand the empirical investigation of abortion and mental health, and develop a consensus-based standardized set of criteria for ranking studies meriting inclusion in reviews.
Fortunately, there are accepted standards for evaluating meta-analyses for publication. These standards address not so much the complicated matter of the validity of a meta-analysis's conclusions as whether the meta-analysis is reported with sufficient transparency that readers can decide for themselves if it is valid.
The Assessment of Multiple Systematic Reviews (AMSTAR) is a short checklist with good content and face validity. Professor Julia Littell of Bryn Mawr College and I used the checklist to evaluate the Coleman meta-analysis and found her review deficient on all items. In her Rapid Response posted at the journal's website, Coleman noted our Rapid Response in which we detailed our AMSTAR analysis, but she did not respond to our specific criticisms.
There are some technical aspects of meta-analysis that require considerable background to evaluate. But policymakers, clinicians, and patient-consumers are often confronted with the need to make decisions based on health-relevant information obtained from a meta-analysis. Should they simply defer to the "experts" and be left at the mercy of authors, or of the editors and peer reviewers who decided to publish a meta-analysis? Is it possible for motivated readers to decide whether they are being fed junk science?
Maybe. It would be a useful exercise to retrieve the Coleman article and the AMSTAR and see to what extent the AMSTAR can be used, if not to evaluate the validity of the Coleman article definitively, at least to decide whether the meta-analysis was well done and free of bias. You can do just that by clicking on the links, or you can simply read on and learn what Professor Littell and I concluded. But first, some definitions to help.
A meta-analysis is a review in which the results of previous studies are combined and re-analyzed, allowing calculation of an overall summary effect size characterizing the literature. This effect size might represent the benefits of an intervention or the strength of an association between antecedent conditions and consequences. The Coleman paper was an attempt to characterize the overall association between a woman obtaining an abortion and her subsequent mental health.
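To make the idea of a summary effect size concrete, here is a minimal sketch in Python of fixed-effect, inverse-variance pooling, one standard way of combining study results. The numbers are entirely made up for illustration; they do not come from the Coleman meta-analysis or any real study:

```python
import math

# Hypothetical log odds ratios and their standard errors from five studies.
# These values are illustrative only, not drawn from any real literature.
log_odds_ratios = [0.20, 0.35, 0.10, 0.50, 0.25]
standard_errors = [0.15, 0.20, 0.12, 0.30, 0.18]

# Fixed-effect (inverse-variance) pooling: each study is weighted by the
# inverse of its variance, so more precise studies count for more.
weights = [1 / se ** 2 for se in standard_errors]
pooled_log_or = sum(w * y for w, y in zip(weights, log_odds_ratios)) / sum(weights)
pooled_se = math.sqrt(1 / sum(weights))

print(f"pooled OR = {math.exp(pooled_log_or):.2f}")
print(f"95% CI    = ({math.exp(pooled_log_or - 1.96 * pooled_se):.2f}, "
      f"{math.exp(pooled_log_or + 1.96 * pooled_se):.2f})")
```

Real meta-analyses often use a random-effects model instead, which allows the true effect to vary across studies, but the weighted-average logic is the same.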
Ideally, a meta-analysis should be conducted like an experiment, with investigators committing themselves to a hypothesis, deciding on the means of collecting data, testing their hypothesis, and coming to a conclusion. However, unlike in a laboratory experiment, the evidence that is collected consists of studies that have already been conducted.
There are a number of threats to the validity of a meta-analysis. First, there is the adequacy of the systematic search to locate all relevant studies, typically done using electronic bibliographic resources such as PubMed with appropriate search terms. Next, there is the overall quality of the studies on which the meta-analysis depends. Meta-analyses can overcome some weaknesses in individual studies, but cannot compensate for a literature that shares the same basic limitations. Readers need to be assured that the investigators conducting a meta-analysis have evaluated the quality of the studies, and there should be some formal system of rating done by more than one person, in order to reduce bias. The literature assembled for a meta-analysis may also be subject to publication bias, typically because positive studies get published and negative studies are left in desk drawers. Various means exist for detecting publication bias, such as visually examining a graph of results to see whether their distribution suggests a gap where weak or negative findings would fall, because such results do not get published.
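One common formal check for this kind of gap is Egger's regression test, which regresses each study's standardized effect on its precision; an intercept far from zero suggests that small, imprecise studies are reporting systematically larger effects, the classic funnel-plot asymmetry. A rough sketch with invented numbers (not from any real literature):

```python
# Hypothetical study results, constructed so that smaller studies (larger
# standard errors) show larger effects -- the pattern publication bias produces.
effects = [0.10, 0.15, 0.30, 0.45, 0.60]
ses     = [0.05, 0.08, 0.15, 0.25, 0.35]

# Egger's test: regress standardized effect (effect / se) on precision (1 / se).
y = [e / s for e, s in zip(effects, ses)]
x = [1 / s for s in ses]
n = len(x)
mean_x, mean_y = sum(x) / n, sum(y) / n
slope = (sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
         / sum((xi - mean_x) ** 2 for xi in x))
intercept = mean_y - slope * mean_x

# An intercept well away from zero hints at funnel-plot asymmetry.
print(f"Egger intercept = {intercept:.2f}")
```

In practice one would also compute a standard error and significance test for the intercept; the sketch only shows where the asymmetry signal comes from.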
There is also the question of whether there are enough similarities across studies to warrant calculating a single effect size. Thus, in the case of the Coleman meta-analysis, studies were included that compared the prior marijuana smoking of women who subsequently received an abortion to that of women who were carrying a wanted pregnancy to term. These results were then combined with studies comparing the depressed mood of women after completing a wanted pregnancy versus the mood of women who had aborted an unwanted pregnancy. Calculating a single effect size combining these studies would not make much sense. So, readers need to be assured that a meta-analysis has examined the heterogeneity of the studies, that is, how different they are, both statistically and in terms of the nature of the outcomes that were combined.
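Statistical heterogeneity is conventionally summarized with Cochran's Q and the I-squared statistic, which estimates the share of variability across studies beyond what chance alone would produce. A sketch with made-up effect sizes, chosen here to be visibly inconsistent with one another:

```python
# Hypothetical log effect sizes and standard errors (illustrative only).
effects = [0.05, 0.60, -0.10, 0.80, 0.30]
ses     = [0.15, 0.20, 0.12, 0.30, 0.18]

# Inverse-variance weights and the fixed-effect pooled estimate.
weights = [1 / se ** 2 for se in ses]
pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)

# Cochran's Q: weighted squared deviations of each study from the pooled value.
q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
df = len(effects) - 1

# I^2: the proportion of total variability attributable to real differences
# between studies rather than to sampling error (floored at zero).
i_squared = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

print(f"Q = {q:.2f} on {df} df, I^2 = {i_squared:.0f}%")
```

An I-squared in this range would usually be described as substantial heterogeneity, a warning sign that a single pooled number may be summarizing studies that do not belong together.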
Here are some of the main conclusions we reached concerning the Coleman review when using the AMSTAR to evaluate it:
Reviewers evaluating the Coleman meta-analysis for BJP should have been expected to assess her manuscript against these noncontroversial criteria. They could readily have done so with the AMSTAR, and readers can again puzzle over the question of just how this paper made it through peer review and was published in such a prestigious journal.
But here is another question for you, the reader: Can the AMSTAR standards be readily applied by motivated readers? Consumers of meta-analyses cannot be expected to be experts, but if the conclusions of a meta-analysis have importance for them, can they at least be equipped to decide for themselves whether a meta-analysis is biased and seriously flawed?
Special thanks to Iona Cristea, Acacia Parks, and Ghassan El-baalbaki who offered helpful feedback on earlier drafts.