From biology to economics to psychology, researchers who examine whether the conclusions of published peer-reviewed papers can be reproduced have found plenty of reasons to be concerned. Because independent reproducibility is a cornerstone of valid scientific research, many prominent scientific journals, including Proceedings of the National Academy of Sciences (PNAS) and Science, have made the research they publish more transparent and the underlying data sets more available. The federal journals controlled by the U.S. Department of Health and Human Services (HHS), however, are exceptions to this trend toward transparency. These journals include Emerging Infectious Diseases (EID), Environmental Health Perspectives, Morbidity and Mortality Weekly Report (MMWR), and Journal of Health and Pollution, which was recently acquired by the U.S. National Institute of Environmental Health Sciences.
Federal and state policymakers craft laws and regulations based on the findings published in these journals—yet it can sometimes be impossible for other experts to check the original work. None of these journals has adopted the research transparency policies that are now common among scientific journals. None even asks or recommends that authors share data and computer code with other researchers who request them. HHS or Congress should require these federally controlled and supported peer-reviewed journals to adopt modern transparency measures to promote the reproducibility and robustness of the research that they publish.
Improving the reproducibility and replicability of scientific research has long been seen as key to improving scientific progress and the credibility of scientific research. In 1986, Dewald, Thursby, and Anderson published their largely unsuccessful efforts to reproduce peer-reviewed economics papers, taking advantage of support from the National Science Foundation. That work spurred broader attention to reproducibility, including changes in access to data and code at the American Economic Association, and reports and debate in Journal of the American Medical Association in 2005, Nature in 2016, and PNAS in 2018. The National Academies of Sciences, Engineering, and Medicine issued a report on open science, including sharing of data and code, in 2018, and on reproducibility and replicability in science in 2019.
The credibility of scientific research depends on its independent reproducibility. If irreproducible results are published, researchers and students may waste time and resources, wrongly believing that some peer-reviewed papers have reliable findings. Worse, private firms and government officials may make decisions expecting certain effects that, in fact, can occur only by serendipity. Ultimately, publication of irreproducible results may threaten respect for science as an enterprise capable of improving human understanding.
Best Practices to Promote Transparency
To promote independent reproducibility, many top journals have adopted policies to increase transparency, especially public access to data and computer code.
Six journals associated with Science generally require all data underlying the results in published papers to be publicly and immediately available. These journals do not allow post-publication embargos, nor are readers required to contact the authors for data and code. PNAS states that authors must make materials, data, and associated protocols, including code and scripts, available to readers in a public repository upon publication, with some exceptions. The American Economic Association’s policy—which covers eight AEA journals, including American Economic Review—is to publish papers only if the data and code used in the analysis are clearly and precisely documented and if access to the data and code is nonexclusive to the authors. Nature requires authors to make data and materials publicly available upon publication. This includes depositing data into the relevant databases and arranging for them to be publicly released by the online publication date (not after). The American Chemical Society’s Environmental Science & Technology requires all authors of published articles to make materials, data, and protocols available to readers through deposition in a public database. The submitted manuscript must include a statement confirming submission of the data and indicating the data bank and any pertinent accession codes. Clinical Microbiology Reviews has similar data access policies.
At least one federal journal has adopted a strong policy to promote public access to data suitable for statistical analysis (but not the computer code for that analysis). The U.S. Fish and Wildlife Service, which controls the peer-reviewed Journal of Fish and Wildlife Management, requires, as a condition of publication, that data supporting the results be published, either in the JFWM paper, in supplemental online material, or in an online data archive. It elaborates that “all data required to recreate the results in the paper … should be provided.”
Journals promoting public data access include those specializing in epidemiology. International Journal of Epidemiology, among the highest ranked in that field, seeks to protect the transparency of the research that it publishes through access to data and code. It requires authors to make all software code underlying the paper available to readers. Where ethically feasible, the journal also strongly encourages authors to make all data on which the conclusions of the paper rely available to readers. Authors are required to include a data availability statement in their articles.
Federal Health Journals Lag Best Practices on Transparency
The four HHS-managed scientific journals are woeful laggards when it comes to promoting research transparency. Because these journals have no policies regarding access to data and code, they fall short of best practices both among private journals (such as Science) and among other federal journals (such as Journal of Fish and Wildlife Management). These HHS journals do not meet even the most modest “level-one” set of policies regarding access to data and code identified by Oxford University Press (OUP). The level-one policy simply “encourages all authors, where ethically possible, to publicly release all data underlying any published paper” (emphasis added). Level four, which is the strongest in terms of transparency, “requires all authors, where ethically possible, to publicly release all data underlying any published paper as a condition of publication” (emphasis added). Furthermore, the data “must undergo peer review along with the manuscript” as part of the acceptance and publication process. Finally, the level-four standard also requires authors to include a statement on data availability in their published articles.
The lack of attention to the independent reproducibility of HHS research is not merely an academic concern. In 2020, the U.S. Government Accountability Office studied HHS modeling of infectious disease and found that guidelines and policy decisions from the Centers for Disease Control and Prevention (CDC) did not address reproducibility of models or their code, a flaw that may jeopardize the reliability of CDC’s research into Covid-19. It recommended that CDC establish guidelines to ensure full reproducibility of CDC’s research by sharing with the public all permissible and appropriate information needed to reproduce research results, including, but not limited to, model code.
Many of the journals that turned toward transparency were motivated by large-scale assessments of published research that found that many peer-reviewed papers are not independently reproducible:
- A 2015 project to explore the independent replicability of 100 psychology experiments published in high-ranking journals found that only one-third to half of the original findings were also observed in studies that attempted to replicate the results.
- In 2015, Begley and Ioannidis summarized problems of limited reproducibility in more than a dozen specific areas of biology (in microarray data, mouse in-vivo studies, oncology, and more) and argued that limited access to raw data is a key contributor.
- In 2015, economists Chang and Li found that results in fewer than half of more than 60 published and peer-reviewed research papers could be reproduced.
- In 2018, Niven et al. evaluated critical care practices recommended by research published in high-profile journals and found that fewer than half had reproducible results.
- A 2018 reproducibility assessment of 21 recent social science experiments in Nature and Science by Camerer et al. found a significant effect in the same direction as the original study for just 13 studies (62%) and an effect size, on average, of only about 50% of the original.
Generalizing about reproducibility across academic disciplines is difficult because reproducibility and replicability have different meanings in different contexts. For psychologists who conduct experiments in the classroom, as well as others who use randomized trials, reproducibility may mean that applying the same protocol to a new sample of subjects drawn from the same population generates the same findings. For economists, it typically means that independent researchers applying identical analytic methods to a given data set get identical results. In some cases, however, it means that application of similar analytic methods to the same data yields results that are similar. Importantly, different academic disciplines have made unequal efforts to assess independent reproducibility. Evidence of irreproducible peer-reviewed research is common in many disciplines, but the absence of such evidence in a given discipline should not be interpreted to mean that it has no reproducibility problem.
The 2019 National Academies report found: “Reproducibility is strongly associated with transparency; a study’s data and code have to be available in order for others to reproduce and confirm results” (emphasis added). While mandatory access to data and computer code is associated with much higher success at independent replication, mandating such access is only a partial remedy.
Preregistration and Research Accountability
In a detailed 2018 analysis of research practices in many fields, Christensen and Miguel identify two concerns with published research papers that public access to data and code is unlikely to resolve.
One concern is publication bias—the tendency of researchers and journal editors to publish papers that report statistically significant results. This tendency can lead to a mistaken impression that certain relationships among variables of interest may be observed in multiple data sets simply because analyses of those relationships that show no link are rarely or never published. Worse, the publication process may make such biases self-reinforcing if journal editors seek experts to conduct peer review from among the authors of the published studies that have reported a relationship. To reduce such publication bias, Christensen and Miguel suggest the registration of studies in advance, so that a hypothesis, research design, and analysis are documented before implementing a research project. This practice is common in randomized clinical trials and in randomized experiments in economics. Registration in advance could limit publication bias by allowing identification of studies that were initiated but not published in a final form.
Christensen and Miguel’s second concern is with the cherry-picking of results. Researchers using modern software to analyze a given data set may generate hundreds or thousands of models. They may consciously or subconsciously select those that yield results consistent with their preconceived notions. Such data-mining or cherry-picking of results invalidates conventional tests for statistical significance. As a result, some journals and professional societies have moved toward preregistration of research plans—i.e., prior to conducting the data analysis, researchers must publicly articulate an analysis plan that specifies measurement of outcomes of interest, criteria for dropping potentially invalid observations, choice of statistical models, and so on.
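The claim that selecting among many models invalidates conventional significance tests can be illustrated with a small simulation. The sketch below is only an illustration under simplifying assumptions, not a method from Christensen and Miguel: it assumes there is no true effect, that each model's test is independent of the others (models fit to the same data set usually are not), and that a result counts as "significant" when p < 0.05. The function names are hypothetical.

```python
import random


def chance_of_false_positive(num_models: int, alpha: float = 0.05) -> float:
    """Chance that at least one of num_models independent tests of a
    true null hypothesis yields p < alpha."""
    return 1.0 - (1.0 - alpha) ** num_models


def simulate_cherry_picking(num_models: int, num_studies: int = 10_000,
                            alpha: float = 0.05, seed: int = 0) -> float:
    """Share of simulated studies whose smallest p-value is below alpha.

    Assumption: under a true null hypothesis a p-value is uniform on
    [0, 1), so each model's p-value is drawn with rng.random().
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(num_studies):
        # The analyst runs num_models models and reports only the best one.
        best_p = min(rng.random() for _ in range(num_models))
        if best_p < alpha:
            hits += 1
    return hits / num_studies


if __name__ == "__main__":
    for k in (1, 10, 100):
        print(k, round(chance_of_false_positive(k), 3),
              round(simulate_cherry_picking(k), 3))
```

Under these assumptions, a researcher who runs a single prespecified model has a 5% chance of a spurious "significant" finding. One who tries 10 models and reports the best has roughly a 40% chance, and with 100 models the chance exceeds 99%.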
Preregistration of research plans limits data-mining by requiring researchers to think carefully about how to test hypotheses prior to analyzing data and to distinguish carefully between predictions and “postdictions.” It does not limit researchers to analyzing only the plans made prior to data analysis, but it does require them to distinguish between tests implemented based on those plans and other ad hoc analyses developed after data analysis has begun.
PNAS encourages preregistration of research plans, a transparency measure consistent with long-standing guidance from the Food and Drug Administration (FDA) and therefore very common in clinical trials of medical products. The PNAS policy extends to all empirical studies, including, for example, retrospective analysis of longitudinal observational data.
Federal Protections for Research Transparency
Reforming federal journals’ policies to promote transparency is important because federal efforts regarding taxpayer-funded data and code are inadequate. National Institutes of Health guidelines for research that it funds do not specify conditions for researchers to make their data and code publicly available. In February 2020, the White House Office of Science and Technology Policy published a request for information on this topic, citing an earlier 2013 memo. The seven-year delay suggests a glacial pace of reform. In its budget request for fiscal year 2023, CDC states that it “provides agency-wide leadership and oversight to ensure the highest standards of scientific integrity, relevance, credibility, and transparency for any data, publications, research, and communication materials.” Left unaddressed, however, is whether data used in peer-reviewed articles published in federally controlled journals like CDC’s MMWR and EID will be publicly available at the time those articles are published.
Additional measures to promote transparency in research that the federal government sponsors or publishes are surely feasible. Policies regarding data and code at Science and PNAS are quite broad, covering all submitted papers, including clinical research. These journals have specific policies that could be applied to the HHS journals with minimal adaptation.
Policies to require authors of articles published in federal journals to share data and code would generally have relatively low costs for HHS and researchers alike, since they are already commonplace at many private-sector journals, and the federal Journal of Fish and Wildlife Management has required authors to provide access to data for years. Preregistration of research plans is less common but growing fast and may be the best approach to limit data-mining.
If editors of federal journals are unable to adopt the best practices used by private journals to promote research transparency and reproducibility, perhaps those federal journals should be privatized—managed and controlled by private entities and not government agencies.
Sharing data and code may help researchers by increasing citations to their work. Christensen et al. assessed the impact of data-sharing on article citations and found that authors who share data may eventually be rewarded with more citations from colleagues in their field.
Measures to promote transparency in research, and thereby to improve the reliability of research findings, are not free. Preparing computer code that can be understood by third parties and sharing data in an accessible manner require skilled staff. Preregistration of research plans may require additional time and effort in the development of a research project. All such measures, however, are actively used by researchers in different research fields, suggesting that the costs to researchers are typically manageable within research budgets.
Influential Journals Need Accountability
After a journal publishes irreproducible results, the publication may be retracted or withdrawn if the researchers or the editors become aware of disqualifying flaws. Retractions or withdrawals may occur long after the research in question was used to make public policy, develop new medical products, or pursue related research questions. Retraction Watch, an affiliate of the Center for Scientific Integrity, tracks retractions of academic papers and lists 280 retracted papers about Covid-19, as of mid-December 2022. Some of those examples involve influential papers. For instance, the online medical journal Lancet retracted a 2020 article that addressed the risks of off-label use of hydroxychloroquine to treat Covid-19 because the owner of the underlying data would not provide the data for a third-party review. A similar fate hit a related paper in the New England Journal of Medicine. A publisher of medical “preprints” retracted a large study of ivermectin for Covid-19 patients because of concerns over plagiarism and serious anomalies in the data set.
Extreme examples of published peer-reviewed papers with serious errors are not limited to papers about the Covid-19 pandemic. Nature Biotechnology recently retracted a 2010 paper—involving the survival of mice with spinal muscular atrophy—that supported later clinical research to develop Zolgensma, a gene-therapy drug now approved by FDA and marketed by Novartis. Nature recently reported that an 11-year-old paper assessing genome sequences of orangutans inadvertently mixed up which sequence belonged to which orangutan, suggesting material flaws in data underlying hundreds of later papers.
The journals managed by HHS are important according to impact factors that suggest their influence among other researchers. Impact factors reflect the average number of times an article published in the last two years has been cited in the current year. Impact factors for 2022 for EID, Environmental Health Perspectives, Journal of Health and Pollution, and MMWR are 9.9, 7.3, 2.8, and 23.0, respectively.
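The impact factor definition above reduces to a single division. The sketch below shows that arithmetic; the function name is hypothetical, and the counts are invented for illustration rather than taken from any HHS journal.

```python
def impact_factor(citations_this_year: int, articles_prior_two_years: int) -> float:
    """Average number of citations received this year per article
    published in the previous two years."""
    if articles_prior_two_years <= 0:
        raise ValueError("articles_prior_two_years must be positive")
    return citations_this_year / articles_prior_two_years


# Hypothetical journal: 500 articles published over the prior two years,
# cited a combined 2,500 times during the current year.
example = impact_factor(citations_this_year=2500, articles_prior_two_years=500)
print(example)
```

In this invented case the journal's impact factor is 5.0, meaning its recent articles were cited five times each, on average, during the year.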
Despite their influence, these journals have no identifiable policies regarding public access to data and computer code used to prepare the research papers submitted for publication or published. They do not require as a condition of publication that authors of published papers make data and code available to the public or, more simply, upon request. They lack statements about the value of preregistering research plans for studies that test hypothesized relationships among variables of interest. Like other scientific and technical journals, however, they do have policies regarding the disclosure of conflicts of interest and minimum contributions for authorship.
The senior management of these federal journals has effectively been AWOL in ongoing efforts to improve protections for the validity of published research. While the policies of leading journals to promote independent reproducibility through greater transparency are notably stronger than they were 20 years ago, efforts to strengthen them further are ongoing. For example, the American Economic Association addresses distinctions between “raw” data extracted from other databases and analysis data, and it proposes solutions to the replication problems posed by “confidential” data. It also supports ongoing inquiries into whether policies of Institutional Review Boards (IRBs) to protect human subjects’ privacy (e.g., destruction of original data after publication) conflict with conditions to ensure independent reproducibility.
The example of AEA is relevant because its journals publish many articles using sophisticated statistical analyses of large-scale data sets, analyses broadly similar to the epidemiological studies published in the federal HHS journals. Economists have also been creative in finding and exploiting natural experiments to make valid causal inferences from observational data. To take one example, Freedman et al. identify the real-world effects of vaccination and also the indirect effects of vaccination on third parties who may be exposed to the people who were vaccinated. Their report builds on the safe assumption that when sixth-graders were too young to be eligible for vaccines, their exposure to the new coronavirus in school was likely much lower in middle schools, where fellow students in higher grades were vaccine-eligible, than in elementary schools, where fellow students in lower grades were then vaccine-ineligible. Their estimates of indirect effects were small and statistically insignificant; there was “essentially no difference in Covid-19 incidence between sixth graders in middle schools and sixth graders in elementary schools.” This is exactly the kind of research, with real-world implications, that HHS journals purport to delve into. We would all benefit if the analyses published in those journals were transparent and the underlying data and code were available.
The U.S. Department of Health and Human Services should:
- Direct all federal journals managed by HHS to adopt public-access policies for data, materials, and code no weaker than those of the multidisciplinary journals Proceedings of the National Academy of Sciences or Science.
- Sponsor or encourage studies of independent reproducibility of research published in federal journals that it controls.
- Encourage preregistration of research plans, following Proceedings of the National Academy of Sciences, for studies of animals and human subjects alike.
- Encourage registration of all randomized trials generally.
- Direct journals to cosponsor research endeavors, such as the Conference on Reproducibility and Replicability in Economics and the Social Sciences, to look at how federal regulations for the protection of human subjects may interact with efforts to protect reproducibility and replicability.
HHS has a lot of work to do to adequately protect the reproducibility and robustness of research published in the journals that it controls. This is essential to protect and improve the reputation of the agency’s work. Given the lack of agency attention to this issue, as well as the slow pace of reform within the executive branch more broadly, congressional oversight may be necessary to strengthen management of the HHS-controlled journals.