This editorial appears in the March issue of the American Journal of Bioethics.
Increasing attention has been devoted to questionable research practices (QRPs) in the health and behavioral sciences. Practices such as presenting post hoc findings as if they were a priori hypotheses, failing to report null results in a study with multiple outcomes, and manipulating sample sizes or covariates to yield statistically significant results (“p-hacking”) are ethically and practically problematic because they create the impression that research findings are more robust than they actually are. More generally, they can foster cynicism about the scientific endeavor and skepticism about the findings from health and behavioral science research. Miscitation is another QRP, one that has generally received less attention than QRPs involving methodological and statistical issues.
According to Cobb et al., miscitation is a “failure to adhere to the practice of providing a complete and correct account of the cited content of a study” (300). Citations can be accurate, partially accurate (omitting relevant context or nuance that would qualify the findings), or inaccurate (making claims contrary to or irrelevant to the findings from the cited study). Miscitations in scientific papers can impede theory development, contribute to a false consensus in which researchers assume that particular findings are more consistent than they are, and support the implementation of interventions whose empirical support is weaker than presumed. Although these are all serious concerns, miscitations of scientific studies in government reports and other documents that inform public policy can have even more wide-ranging harmful consequences. These documents are often used to set funding priorities, justify laws and regulations, and guide public health decisions. Distortion of the evidence base can result in a misallocation of scarce resources, in which research and policies supporting dubious, ineffective, or harmful interventions are promoted, while investment in promising and well-established interventions is undermined. These effects may compound over time, threatening individual and public health outcomes (including increased injuries, disability, and death) and eroding public trust in the scientific enterprise (see, for example, the impact of The Joint Commission’s pain standards on the opioid epidemic). Consequently, it is essential that the citations to empirical research in “The Make America Healthy Again (MAHA) Report,” which aims to set government policy for improving children’s physical and mental health, be free from inaccuracies.
When it was first released, the MAHA Report garnered attention for citing studies that did not exist. A revision of the report deleted all but one of these phantom studies, but some researchers whose work was cited in the report stated that the document mischaracterized their findings. Despite these anecdotal reports, there has been no systematic review of the citation accuracy of the MAHA Report. Assessing whether these high-profile miscitations were anomalies or whether they reflect a more pervasive pattern in the MAHA Report can help discern whether the scientific literature was used to accurately inform evidence-based policies (i.e., policies and recommendations made following a careful review of the evidence) or whether the research citations were used as rhetorical window dressing to bolster preconceived beliefs.
We applied Cobb et al.’s tripartite coding system to the 154 citations to peer-reviewed empirical studies in the MAHA Report. Because we were interested in how the authors of the MAHA Report used scientific research to support their claims, we limited coding to peer-reviewed empirical studies (including meta-analyses) and did not code citations to narrative reviews, policy papers, commentaries, government documents, etc. Each citation was independently coded by two of the coauthors, and the pairings were rotated so that each author coded roughly the same number of citations with each of the other coauthors. After completing their independent coding, the two raters met and resolved any discrepancies in their ratings. If, after discussion, the two raters were unable to reach agreement, they assigned the less critical code to the citation, giving the MAHA Report the benefit of the doubt and providing a lower-bound estimate of the rate of miscitations. We examined interrater reliability using the initial codes provided by the independent raters before they resolved any discrepancies. A weighted kappa was used to assess the level of interrater agreement for these ordinal data. Kappa corrects for chance agreement; values between .61 and .80 are evidence of substantial agreement, and values greater than .81 are indicative of almost perfect agreement. The weighted kappa for the present study was substantial, .69 (95% CI [.60–.78]).
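For readers unfamiliar with the statistic, weighted kappa can be illustrated with a short sketch. The editorial does not specify the weighting scheme, so linear weights are assumed here, and the codes 0, 1, and 2 standing for accurate, partially accurate, and inaccurate are an illustrative convention; this is not the authors' actual analysis code.

```python
from collections import Counter

def weighted_kappa(r1, r2, k=3):
    """Linear-weighted Cohen's kappa for two raters' ordinal codes 0..k-1.

    Disagreements are weighted by their distance on the ordinal scale, so
    confusing "accurate" with "inaccurate" counts more heavily than
    confusing it with "partially accurate".
    """
    n = len(r1)
    # Observed weighted disagreement, normalized to [0, 1].
    obs = sum(abs(a - b) for a, b in zip(r1, r2)) / (n * (k - 1))
    # Expected weighted disagreement under chance, from the raters' marginals.
    p1, p2 = Counter(r1), Counter(r2)
    exp = sum(abs(i - j) * p1[i] * p2[j]
              for i in range(k) for j in range(k)) / (n * n * (k - 1))
    # Kappa = 1 - observed/expected; 1.0 means perfect agreement.
    return 1 - obs / exp

# Hypothetical ratings: 0 = accurate, 1 = partially accurate, 2 = inaccurate.
rater_a = [0, 0, 1, 1, 2, 2]
rater_b = [0, 0, 1, 2, 2, 2]
print(round(weighted_kappa(rater_a, rater_b), 3))
```

Equivalent results can be obtained with `sklearn.metrics.cohen_kappa_score(r1, r2, weights="linear")`; the hand-rolled version above simply makes the chance-correction explicit.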
Only 75 (48.7%) of the 154 citations to peer-reviewed studies in the MAHA Report were accurate, whereas 48 (31.2%) were partially accurate, and 31 (20.1%) were inaccurate. These results represent a strikingly high rate of inaccurate and partially accurate citations. By contrast, Cobb et al. found that 81.2% of the citations in major psychology journals were accurate. Our recent review of citation accuracy in American Psychological Association amicus briefs found that 20.3% of the citations were partially accurate and that 6.9% of the citations were inaccurate. Perfect citation accuracy may be aspirational, but when fewer than half of a document’s citations are accurate, epistemic integrity is seriously weakened and the reliability of the evidence becomes unclear. In the context of health care policies affecting children, this uncertainty raises concerns regarding the ethical principles of beneficence and nonmaleficence by obscuring whether recommendations accurately reflect established risks and benefits. Implementing policies with unclear evidentiary support can have long-lasting consequences, particularly in developmental samples.
Although a detailed review of all the inaccurate citations in the MAHA Report is beyond the scope of this editorial, a sampling of the inaccurate and partially accurate citations in the report can give readers a sense of the problem. In a section outlining “proven harms” of overtreatment, the MAHA Report cited Waters et al. to support the claim that “Adenotonsillectomy for children with sleep apnea, a historically common procedure, conferred no benefit in trials” (p. 60), when in fact Waters et al. reported that “Improvements were seen for polysomnogram arousals and apnea indices and for parent reports of symptoms…, behavior…, overall health, and daytime napping” (1), so the claim that adenotonsillectomy provided no benefits is a gross misrepresentation of the study. The MAHA Report cited Hibbs et al.’s analysis of data submitted to the Vaccine Adverse Event Reporting System (VAERS) to support the claim that “Many health care professionals do not report to VAERS because they are not mandated to do so or they may not connect the adverse event to a vaccination” (63). Hibbs et al. examined the reporting of vaccination errors (e.g., inappropriately scheduled vaccinations, storage and dispensing errors), not adverse reactions to vaccines.
Partially accurate citations can often be as misleading as inaccurate citations. For example, the report cited Sofi et al.’s meta-analysis to support the claim that “Research also consistently links diets centered on whole foods to lower rates of obesity, type 2 diabetes, heart disease, certain cancers, and mental illness” (p. 26). Although this meta-analysis found associations between a Mediterranean diet and lower rates of heart disease and cancer, none of the supporting citations examined obesity or diabetes. More importantly, some of the whole foods highlighted in the MAHA Report as beneficial—including beef and whole milk—are rarely consumed within a Mediterranean diet. To support its call for reducing food safety regulations, the MAHA Report stated that “the implementation of the Hazard Analysis and Critical Control Points (HACCP) system has further complicated operations for smaller producers without the expertise or capital to navigate such comprehensive safety protocols,” citing Dima, Radu, and Dobrin. Although readers could reasonably assume that this study examined US producers, the MAHA Report failed to note that Dima et al. investigated the Romanian meat industry. As a final example, the MAHA Report stated, “Tympanostomy tubes for recurrent ear infections, despite being recommended by professional societies, did not reduce infections in trials—showing common surgeries cause harm without offering benefits,” citing Hoberman et al. Hoberman et al. found that tubes did not improve outcomes compared to antibiotic treatment; however, several secondary treatment outcomes did favor the tube group, and there was minimal evidence of serious adverse events during the 2-year monitoring period in either treatment arm.
Omitting that the comparison condition was an active treatment might not have been a serious miscitation, except that the MAHA Report also advocated against the use of antibiotics, writing, “Antibiotics are over-prescribed to millions of US children annually, causing serious harms like rashes, diarrhea, recurrent infections, allergic reactions, and antibiotic resistance”.
A recent editorial in Science argued that “checking citations is just as important as carrying out experiments” and noted that there was at least one citation in the MAHA Report where the “content was misrepresented”. Our systematic review found that, in fact, more than half of the citations were less than fully accurate. The MAHA Report accurately identified several legitimate concerns about the health and mental health of children in the United States. However, effective interventions should be evidence-based and informed by the science, not justified by misrepresenting the research literature to claim support for questionable policies.
Notes
Interested readers can find a complete listing of all of the citations and their codes at https://osf.io/b4rew/overview?view_only=a84540244e314bb6835bc1350ddcc6d1
David K. Marcus, PhD; Keira L. Monaghan; Jessica L. Fales, PhD; & Christopher T. Barry, PhD