2025/021 | Questionable Research Practices: The “Gray Zone” Between Good Science and Misconduct
Questionable research practices (QRPs) are behaviors that fall between clearly acceptable scientific conduct and obvious misconduct such as fabrication, falsification, and plagiarism. QRPs inhabit a “gray zone” because they often involve real data and ostensibly legitimate methods, yet distort the research record in subtle but systematic ways. They are especially problematic because they are widespread, frequently rationalized as harmless or necessary, and driven by structural pressures in academia rather than by overtly malicious intent. Understanding what QRPs are, why they occur, and how they differ from outright misconduct is essential for promoting research integrity and designing effective reforms.
Defining QRPs and the Gray Zone
Classical research misconduct is typically defined using the “FFP” triad: fabrication (making up data or results), falsification (manipulating research materials, equipment, or processes, or changing data or results), and plagiarism (appropriating another’s ideas, processes, results, or words without proper credit). By contrast, QRPs involve practices that may not clearly violate formal rules but conflict with the spirit of honest, transparent, and rigorous science. Examples include selectively reporting only significant outcomes, failing to disclose methodological changes, or presenting exploratory findings as if they were confirmatory tests of pre-specified hypotheses.
This gray zone arises because science is complex, methods are flexible, and many analytical decisions are ambiguous. Researchers routinely choose among multiple measures, models, and exclusion criteria, and the line between defensible flexibility and opportunistic manipulation is often blurry. Unlike fabrication and falsification, which most researchers agree are unacceptable, QRPs are frequently debated: what one scholar sees as scientific pragmatism, another sees as biasing the evidence base. The gray zone therefore reflects both ethical ambiguity and the limitations of existing formal definitions of misconduct.
Common Types of Questionable Research Practices
Several QRPs have been repeatedly documented across disciplines. One of the most discussed is “p-hacking,” where researchers try multiple statistical analyses, variable codings, or subsamples and then report only those combinations that produce statistically significant results. A related practice is outcome switching: changing primary outcomes after seeing the data or emphasizing secondary outcomes that turned out significant, without transparently reporting these shifts. Both practices inflate the likelihood of false positives, giving the impression of robust effects where none may exist.
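To make the mechanics concrete, here is a minimal simulation sketch in Python (using numpy and scipy; the sample sizes, the number of candidate outcomes, and all other parameters are illustrative assumptions, not values from any cited study). It compares the false-positive rate of a single pre-specified test against “best-of-many” reporting when the null hypothesis is true for every outcome.

```python
# Toy simulation: trying many outcome measures and reporting only the
# smallest p-value inflates the false-positive rate.
# All names and parameters are illustrative, not from any cited study.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
N_SIMS = 5_000      # simulated "studies"
N_PER_GROUP = 30    # participants per condition
K_OUTCOMES = 5      # independent outcome measures the researcher can try
ALPHA = 0.05

fp_honest = 0   # test only the single pre-specified outcome
fp_hacked = 0   # test all outcomes, report the best p-value

for _ in range(N_SIMS):
    # Null world: no true group difference on any outcome.
    control = rng.normal(size=(N_PER_GROUP, K_OUTCOMES))
    treatment = rng.normal(size=(N_PER_GROUP, K_OUTCOMES))
    pvals = [stats.ttest_ind(treatment[:, k], control[:, k]).pvalue
             for k in range(K_OUTCOMES)]
    fp_honest += pvals[0] < ALPHA
    fp_hacked += min(pvals) < ALPHA

print(f"Pre-specified outcome only: {fp_honest / N_SIMS:.3f}")  # ~0.05
print(f"Best of {K_OUTCOMES} outcomes:         {fp_hacked / N_SIMS:.3f}")  # ~0.23
```

With five independent outcomes at α = .05, the chance that at least one test is significant by luck alone is roughly 1 − 0.95^5 ≈ 0.23, which the simulation recovers: the nominal error rate more than quadruples without any data being fabricated.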
Another prevalent QRP is HARKing—hypothesizing after results are known—where researchers present post hoc explanations as if they had been predicted in advance. When hypotheses are retrofitted to match observed data but framed as pre-specified, the literature starts to look more confirmatory and theoretically cohesive than it truly is. Selective reporting also takes the form of omitting null results, dropping inconvenient conditions, or reporting only a subset of measures. Surveys of researchers have found that a large proportion admit to at least one such practice in their careers, suggesting that QRPs are not rare aberrations but embedded in normal scientific work.
Prevalence and Perceived Severity
Empirical studies over recent decades have attempted to quantify how common QRPs are and how scientists perceive them. Survey research in psychology, medicine, and other fields shows that while very few researchers admit to fabrication or falsification, substantial minorities acknowledge engaging in QRPs such as data exclusion without reporting, selective reporting of outcomes, or failing to publish studies with null results. Meta-research mapping fifty years of literature on QRPs indicates that concern about these practices has grown in response to replication crises and high-profile failures to reproduce famous findings.
Perceptions of severity vary across practices. Many scientists view fabrication and falsification as clearly unacceptable and career-ending if detected, whereas QRPs are seen as more ambiguous, sometimes justified as “borderline” or “minor” offenses. Some practices, such as not reporting all variables or conditions, are often seen as problematic but still less serious than direct data fabrication. This divergence between perceived severity and actual impact is crucial: modest-seeming QRPs, when aggregated across many studies, can produce serious distortions in the evidence base, arguably rivaling the harm of rare cases of outright fraud.
Structural Causes and Incentive Systems
QRPs do not occur in a vacuum. They are closely linked to the incentive structures of modern academia. Researchers face pressures to publish frequently, preferably in high-impact journals, and such outlets often prioritize novel, positive, and “clean” results over incremental, null, or messy findings. This “publish or perish” environment creates strong incentives to produce statistically significant outcomes and compelling narratives, even when the underlying data are complex or equivocal. QRPs offer a way to reconcile these pressures with the constraints of real research by subtly shaping analyses and reporting to make results more publishable.
In addition, career advancement often depends on metrics like number of publications, citation counts, and grant income. These metrics rarely reward transparency, replication, or publication of negative findings. Junior researchers, in particular, may feel compelled to adopt local norms that tolerate or even encourage QRPs, especially if senior colleagues implicitly endorse such strategies. Organizational culture and mentoring thus play a central role: departments that emphasize ethical reflection, methodological rigor, and openness can mitigate QRPs, whereas competitive, metric-driven environments can normalize them.
Ambiguity, Rationalization, and Moral Disengagement
One reason the gray zone persists is that many QRPs are easily rationalized. Researchers may convince themselves that removing outliers without disclosure, or trying several models and presenting only the best one, is consistent with good scientific judgment. Ambiguity in methodological standards allows for motivated reasoning, where choices that favor significant results are interpreted as reasonable and those that do not are discarded as flawed. Over time, such rationalizations can become embedded in disciplinary norms, making it difficult for individuals to recognize their behavior as questionable.
Psychological mechanisms of moral disengagement further help explain why otherwise conscientious scientists may engage in QRPs. By diffusing responsibility (“everyone does this”), minimizing harm (“it’s just one analysis choice”), or appealing to higher goals (“this will help my students’ careers”), researchers can distance themselves from the ethical implications of their actions. These mechanisms are not unique to science, but they interact with the technical complexity of research to make boundaries especially fuzzy. The gray zone is thus sustained not only by structural incentives but also by human tendencies to protect self-image and justify borderline behaviors.
Consequences for Science and Society
The cumulative consequences of QRPs are profound. At the epistemic level, QRPs inflate the prevalence of false-positive findings, exaggerate effect sizes, and reduce the replicability of published results. This contributes to replication crises, where attempts to reproduce established findings fail at high rates. When literatures are built on selectively reported and analytically massaged evidence, meta-analyses become misleading, theoretical debates rest on unstable foundations, and the efficiency of scientific progress declines. Even if individual QRPs appear minor, their aggregate effect can severely distort what is believed to be true.
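The effect-size exaggeration described above can be illustrated with another short, hedged sketch (Python with numpy and scipy; the true effect, sample size, and selection rule are illustrative assumptions, not estimates from any cited meta-analysis). It simulates a literature in which only statistically significant studies are reported and compares the pooled published effect with the truth.

```python
# Toy simulation: if only "significant" studies enter the literature,
# the pooled effect size is biased upward even when a true effect exists.
# All parameters are illustrative assumptions.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
TRUE_D = 0.2        # small true standardized effect
N_PER_GROUP = 25
N_STUDIES = 2_000

observed_d, published_d = [], []
for _ in range(N_STUDIES):
    control = rng.normal(0.0, 1.0, N_PER_GROUP)
    treatment = rng.normal(TRUE_D, 1.0, N_PER_GROUP)
    t, p = stats.ttest_ind(treatment, control)
    # Cohen's d with the pooled standard deviation (equal group sizes).
    d = (treatment.mean() - control.mean()) / np.sqrt(
        (treatment.var(ddof=1) + control.var(ddof=1)) / 2)
    observed_d.append(d)
    if p < 0.05:    # selective reporting: only "significant" results appear
        published_d.append(d)

print(f"True effect:               {TRUE_D:.2f}")
print(f"Mean of all studies:       {np.mean(observed_d):.2f}")   # ~0.20
print(f"Mean of published studies: {np.mean(published_d):.2f}")  # much larger
```

Because underpowered studies reach significance only when sampling error happens to exaggerate the effect, the published subset can overstate a true effect of 0.2 by a factor of three or more; a meta-analysis of that subset inherits the bias.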
Beyond epistemic damage, QRPs have social and ethical costs. Public trust in science can be undermined when high-profile results fail to replicate or when practices like p-hacking come to light. Policymaking based on biased evidence may misallocate resources or promote ineffective interventions, particularly in medicine, education, and public health. Within the scientific community, QRPs can entrench unfair advantages: researchers willing to stretch norms may publish more and gain prestige over those who adhere to stricter standards, creating a perverse selection mechanism that rewards questionable behavior.
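The “perverse selection mechanism” can also be caricatured computationally. The toy replicator dynamic below is loosely inspired by Smaldino and McElreath’s (2016) “natural selection of bad science” model, but it is a heavily simplified sketch: every quantity (the link from QRP use to output, the imitation rule, the mutation step) is an illustrative assumption, not their actual model.

```python
# Toy replicator dynamic: labs differ only in how aggressively they
# exploit QRPs; labs that publish more are imitated more often, so the
# mean QRP propensity drifts upward even though no individual lab ever
# "decides" to cut corners more. All quantities are illustrative.
import numpy as np

rng = np.random.default_rng(2)
N_LABS = 200
GENERATIONS = 50

qrp_rate = rng.uniform(0.0, 0.3, N_LABS)   # initial propensity per lab
print(f"Initial mean QRP propensity: {qrp_rate.mean():.2f}")

for _ in range(GENERATIONS):
    # Publication output grows with QRP use (more "significant" results).
    output = 1.0 + 3.0 * qrp_rate + rng.normal(0.0, 0.2, N_LABS)
    # A new cohort imitates existing labs in proportion to their output.
    probs = np.clip(output, 0.01, None)
    parents = rng.choice(N_LABS, size=N_LABS, p=probs / probs.sum())
    # Offspring copy the parent's practices with small random drift.
    qrp_rate = np.clip(qrp_rate[parents] + rng.normal(0, 0.01, N_LABS), 0, 1)

print(f"Final mean QRP propensity:   {qrp_rate.mean():.2f}")
```

Because imitation is proportional to output and output rewards QRP use, the population’s mean propensity ratchets upward without anyone intending to behave worse, which is precisely the structural point: selection can favor questionable practices even among well-meaning researchers.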
Distinguishing QRPs from Misconduct
Despite their serious consequences, QRPs are typically treated differently from outright misconduct in policy frameworks and institutional practice. Formal misconduct investigations usually focus on FFP, leaving most QRPs outside the scope of legal or administrative sanctions. This distinction is often grounded in differences in intent and clarity. Fabrication and falsification involve clear deceptive intent and explicit rule violations, whereas many QRPs emerge from ambiguous norms, poor training, or unreflective conformity to local practices rather than explicit plans to deceive.
However, the moral distinction is not absolute. Some QRPs—such as knowingly omitting critical methodological details to make results appear stronger—may border on falsification, and repeated use of such practices with awareness of their misleading nature can be ethically comparable to classical misconduct. The gray zone thus challenges binary categorizations of behavior as either “ethical” or “fraudulent” and raises questions about whether policy should be expanded to explicitly address a wider spectrum of behaviors. It also underscores the need to move beyond individual blame toward systemic reforms that reduce the temptation and opportunity for QRPs.
Strategies for Prevention and Reform
Addressing QRPs requires both cultural and structural changes. On the cultural side, research training should explicitly discuss QRPs, their consequences, and the moral reasoning that can lead to their normalization. Open conversation about errors, uncertainty, and ambiguity can help shift norms toward valuing transparency over perfection. Mentors who model robust practices—such as pre-registering studies, sharing data and code, and fully reporting analyses—can shape the expectations of early-career researchers and demonstrate that integrity and success need not conflict.
Structurally, reforms in publication and evaluation systems are crucial. Journals can adopt practices such as registered reports, in which study protocols are peer reviewed and accepted before results are known, reducing incentives for p-hacking and HARKing. Encouraging or requiring data and code sharing enables independent verification and reanalysis. Funding agencies and institutions can revise reward systems to value replication studies, methodological rigor, openness, and responsible conduct, rather than focusing narrowly on publication counts and journal prestige. Over time, such reforms can shrink the gray zone by aligning incentives with the ideals of good science.
Reframing Good Science and Responsibility
The existence of QRPs forces a re-examination of what “good science” means in practice. Good science is not merely the absence of overt fraud but a commitment to transparency, honesty about uncertainty, and a willingness to expose methods and data to scrutiny. In this light, practices that systematically bias the record, even if technically defensible or widely used, fall short of scientific ideals. Recognizing this gap does not require demonizing individual researchers; instead, it invites collective responsibility for changing norms, training, and incentives.
Ultimately, the gray zone between good science and misconduct is shaped by the interaction of individual choices, community standards, and institutional structures. Questionable research practices persist because they are often rewarded, rarely sanctioned, and easily rationalized. Reducing their prevalence will require not only clearer guidelines and stronger oversight but also a cultural shift toward valuing integrity, openness, and long-term epistemic reliability over short-term productivity and prestige. By confronting QRPs as a systemic issue rather than isolated moral failings, the scientific community can strengthen both the credibility and the cumulative progress of research.
Bibliography
- Butler, N., Delaney, H., & Spoelstra, S. (2017). The gray zone: Questionable research practices in the business school. Academy of Management Learning & Education, 16(1), 94–109. https://doi.org/10.5465/amle.2015.0201
- Fanelli, D. (2009). How many scientists fabricate and falsify research? A systematic review and meta-analysis of survey data. PLoS ONE, 4(5), e5738. https://doi.org/10.1371/journal.pone.0005738
- Gopalakrishna, G., Ter Riet, G., Vink, G., Stoop, I., Wicherts, J. M., & Bouter, L. M. (2022). Prevalence of questionable research practices, research misconduct and their potential explanatory factors: A survey among academic researchers in The Netherlands. PLOS ONE, 17(2), e0263023. https://doi.org/10.1371/journal.pone.0263023
- John, L. K., Loewenstein, G., & Prelec, D. (2012). Measuring the prevalence of questionable research practices with incentives for truth telling. Psychological Science, 23(5), 524–532. https://doi.org/10.1177/0956797611430953
- Nosek, B. A., Spies, J. R., & Motyl, M. (2012). Scientific utopia: II. Restructuring incentives and practices to promote truth over publishability. Perspectives on Psychological Science, 7(6), 615–631. https://doi.org/10.1177/1745691612459058
- Sijtsma, K. (2016). Playing with data—Or how to discourage questionable research practices and stimulate researchers to do things right. Psychometrika, 81(1), 1–15. https://doi.org/10.1007/s11336-015-9446-0
- Smaldino, P. E., & McElreath, R. (2016). The natural selection of bad science. Royal Society Open Science, 3(9), 160384. https://doi.org/10.1098/rsos.160384
- Sterling, S., Yaw, K., Plonsky, L., Larsson, T., & Kytö, M. (2025). Investigating researcher perceptions of questionable research practices. Journal of Second Language Studies, 8(2), 219–243. https://doi.org/10.1075/jsls.00048.ste
- Wicherts, J. M., Veldkamp, C. L. S., Augusteijn, H. E. M., Bakker, M., Van Aert, R. C. M., & Van Assen, M. A. L. M. (2016). Degrees of freedom in planning, running, analyzing, and reporting psychological studies: A checklist to avoid p-hacking. Frontiers in Psychology, 7, 1832. https://doi.org/10.3389/fpsyg.2016.01832