
Determinants of publication likelihood and timeliness for clinical studies

Published online by Cambridge University Press:  04 December 2025

Haoyuan Wang
Affiliation:
Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA
Le Li
Affiliation:
University of Texas at Austin, Austin, TX, USA
Chuan Hong
Affiliation:
Department of Biostatistics and Bioinformatics, Duke University, Durham, NC, USA; Duke Clinical Research Institute, Duke University, Durham, NC, USA
Rui Yang
Affiliation:
Department of Biomedical Informatics, Yong Loo Lin School of Medicine, National University of Singapore, Singapore, Singapore
Karen Chiswell
Affiliation:
Duke Clinical Research Institute, Duke University, Durham, NC, USA
Sara B. Calvert
Affiliation:
Duke Clinical Research Institute, Duke University, Durham, NC, USA
Lesley Curtis
Affiliation:
Department of Medicine, Duke University, Durham, NC, USA
Ali B. Abbasi
Affiliation:
Department of Surgery, University of California, San Francisco, CA, USA
Scott Michael Palmer
Affiliation:
Department of Medicine, Duke University, Durham, NC, USA
Adrian F. Hernandez
Affiliation:
Duke Clinical Research Institute, Duke University, Durham, NC, USA
Frank W. Rockhold
Affiliation:
Duke Clinical Research Institute, Duke University, Durham, NC, USA
Christopher Lindsell*
Affiliation:
Duke Clinical Research Institute, Duke University, Durham, NC, USA
Corresponding author: C.J. Lindsell; Email: chris.lindsell@duke.edu

Abstract

Introduction:

Timely dissemination of clinical trial results is essential to advance knowledge, guide practice, and improve outcomes, yet many trials remain unpublished, limiting impact. We examined what drives publication likelihood and timeliness across three major clinical domains.

Methods:

We analyzed study design and factors associated with dissemination of interventional trials, focusing on cardiovascular disease (CVD), cancer, and COVID-19. A total of 10,785 trials (CVD: 5929; cancer: 4210; COVID-19: 646) were linked to PubMed publications using National Clinical Trial identifiers. Study design, operational, and transparency-related features were assessed as predictors of time to publication, defined as the interval from study completion to first publication, using a Cox proportional hazards model.

Results:

COVID-19 trials had the highest publication rate (49.6%), followed by CVD (42.3%) and cancer (32.9%), likely reflecting pandemic-related prioritization. Faster publication was associated with larger enrollment, more sites, result posting, randomization, data monitoring committee (DMC) presence, and higher blinding levels (all p < 0.05). Slower publication was linked to supportive care or diagnostic trials (CVD), basic science (cancer), and later COVID-19 trial completion. In subgroups, U.S. facility presence (CVD) and phase 3 design (cancer) predicted faster publication, while healthy volunteer inclusion (CVD) predicted slower publication. Among DMC trials, more secondary outcomes were linked to faster publication across all disease areas.

Conclusions:

Key study design and operational factors consistently predict whether and when trials are published. Strengthening methodological rigor, result reporting, and multi-site collaboration may accelerate timely dissemination into peer-reviewed literature.

Information

Type
Research Article
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution licence (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted re-use, distribution and reproduction, provided the original article is properly cited.
Copyright
© The Author(s), 2026. Published by Cambridge University Press on behalf of Association for Clinical and Translational Science

Introduction

The dissemination of clinical research findings is vital for advancing medical knowledge, guiding evidence-based practices, and improving patient outcomes. Despite the increasing number of clinical trials conducted globally, a significant proportion of these studies fail to progress to publication [Reference Fogel1]. This limits their impact on the scientific community and healthcare decision-making, highlighting the importance of identifying and addressing barriers to the dissemination of clinical trial findings. Such selective availability of evidence increases the risk of clinicians making decisions based on incomplete or skewed data, with potential harm to patients and public health. Recognizing these dangers underscores the importance of understanding the factors that influence publication likelihood to ensure that valuable research outcomes remain accessible and inform clinical practice and policy [Reference Prager, Chambers and Plotkin2].

Several barriers to publication have been documented, including limitations in study design, lack of transparency, and operational challenges [Reference Prager, Chambers and Plotkin2]. Single-site trials, for instance, are often perceived as less rigorous or generalizable compared to multi-center randomized controlled trials, leading to less desirability for publication [Reference He, Tang and Yang3]. Additionally, design features like the presence of data monitoring committees (DMCs) and masking protocols are associated with increased credibility and higher publication rates, underscoring the importance of study rigor in promoting research dissemination [Reference Fleming4].

Clinical trial registries and machine learning tools, such as TrialEnroll [Reference Yue, Xing, Chen and Fu5] and LIFTED [Reference Zheng, Peng and Xu6] have been used to optimize operational efficiencies, predict enrollment success and trial outcomes [Reference Bashir, Bourgeois and Dunn7, Reference Scherer, Huynh, Ervin, Taylor and Dickersin8, Reference Citrome9, Reference Venugopal and Saberwal10, Reference Showell, Buckman and Berber11]. However, gaps in timely result reporting and publication persist.

This study builds on prior efforts by examining what drives publication likelihood and timeliness across three major clinical domains: cardiovascular disease (CVD), cancer, and COVID-19. These areas reflect different research contexts: chronic high-burden (CVD), complex innovation-driven (cancer), and urgent public health response (COVID-19). Using a large ClinicalTrials.gov dataset, we analyze patterns and predictors of publication likelihood and speed, with a focus on design, operations, and transparency.

Materials and methods

Cohort definition & inclusion/exclusion criteria

This observational cohort study used data from ClinicalTrials.gov and PubMed to examine factors associated with whether and when interventional clinical trials are published. The study population consisted of interventional trials registered on ClinicalTrials.gov. Data for interventional trials were programmatically extracted from ClinicalTrials.gov using structured fields for study characteristics, operational details, masking protocols, and outcome measures. Trials were identified based on their disease category using disease-specific terms from the conditions and browse_conditions tables. Disease-specific terms were grouped into the three target categories (CVD, cancer, and COVID-19) by mapping condition keywords and MeSH terms to broader disease domains using a standardized codebook developed through clinical expert review and prior published taxonomies [Reference Tasneem, Aberle and Ananth12].

Eligible trials were interventional studies of CVD, cancer, or COVID-19 registered on or after January 1, 2010 and completed before January 1, 2023. Trials were required to be available in the AACT database and to have a National Clinical Trial (NCT) number.
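
To make the cohort construction concrete, the sketch below shows one way the eligibility filters and condition-to-domain mapping could be implemented against an AACT snapshot. It is illustrative only: the table and column names follow the public AACT schema, the database connection string is hypothetical, and the keyword lists are placeholders rather than the codebook actually used in this study.

```python
# Illustrative cohort extraction from an AACT snapshot (not the study's actual pipeline).
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine("postgresql://user:pass@localhost:5432/aact")  # hypothetical connection

DISEASE_KEYWORDS = {  # placeholder keyword map, not the published taxonomy
    "CVD": ["cardiovascular", "myocardial", "heart failure", "stroke"],
    "Cancer": ["cancer", "carcinoma", "neoplasm", "lymphoma", "leukemia"],
    "COVID-19": ["covid", "sars-cov-2"],
}

studies = pd.read_sql(
    """
    SELECT nct_id, study_type, study_first_submitted_date,
           primary_completion_date, completion_date
    FROM studies
    WHERE study_type = 'Interventional'
    """,
    engine,
    parse_dates=["study_first_submitted_date", "primary_completion_date", "completion_date"],
)

# Eligibility window: registered on/after 2010-01-01 and completed before 2023-01-01.
eligible = studies[
    (studies["study_first_submitted_date"] >= "2010-01-01")
    & (studies["completion_date"] < "2023-01-01")
]

# Map condition terms (free text + MeSH) to the three disease domains.
conditions = pd.read_sql("SELECT nct_id, downcase_name AS term FROM conditions", engine)
mesh = pd.read_sql("SELECT nct_id, lower(mesh_term) AS term FROM browse_conditions", engine)
terms = pd.concat([conditions, mesh], ignore_index=True)
terms["term"] = terms["term"].fillna("")

def classify(term: str):
    """Return the first disease domain whose keywords appear in the term, else None."""
    for domain, keywords in DISEASE_KEYWORDS.items():
        if any(k in term for k in keywords):
            return domain
    return None

terms["domain"] = terms["term"].map(classify)
domains = terms.dropna(subset=["domain"]).drop_duplicates(["nct_id", "domain"])
cohort = eligible.merge(domains[["nct_id", "domain"]], on="nct_id", how="inner")
```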

Outcomes ascertainment

To link trials with corresponding publications, PubMed was searched for articles that contained the trial’s NCT identifier in their abstract, keywords, or body text. An automated algorithm facilitated this process, ensuring precise matching and minimizing errors associated with manual linkage (Appendix A of Supplementary Materials) [Reference Tasneem, Aberle and Ananth12]. For trials without a direct match, additional metadata, including study title, principal investigator (PI) name, and publication year, were used to perform supplementary searches. This approach maximized linkage completeness while maintaining accuracy. Multiple publications per trial were recorded to capture the full publication history.
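
As a minimal illustration of NCT-based linkage (not the algorithm described in Appendix A), the sketch below queries the public NCBI E-utilities esearch endpoint for PubMed records containing a trial's NCT identifier; the field tags and the example NCT number are simplifying assumptions.

```python
# Minimal sketch: find PubMed IDs that mention a trial's NCT identifier.
# Uses the public NCBI E-utilities esearch endpoint; this is not the
# linkage algorithm described in Appendix A of the paper.
import requests

ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"

def pubmed_ids_for_trial(nct_id: str) -> list:
    """Return PubMed IDs whose records contain the given NCT number."""
    params = {
        "db": "pubmed",
        "term": f"{nct_id}[si] OR {nct_id}[tiab]",  # secondary source ID or title/abstract
        "retmode": "json",
        "retmax": 200,
    }
    resp = requests.get(ESEARCH, params=params, timeout=30)
    resp.raise_for_status()
    return resp.json()["esearchresult"]["idlist"]

# Example usage (NCT number is illustrative); keep requests under NCBI rate limits.
# pmids = pubmed_ids_for_trial("NCT01234567")
```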

The primary outcome was the time from a trial’s primary completion date (index date) to its first linked publication. Publications that appeared before the primary completion date (typically protocol or design papers) were not counted toward this outcome. Trials with at least one eligible publication before January 1, 2025, were considered to have an observed event; those without were administratively censored on January 1, 2025. We note that posting results to ClinicalTrials.gov is distinct from publication in a peer-reviewed journal; the primary outcome for this study refers specifically to the latter.
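
Under these definitions, the time-to-event outcome can be sketched as follows; column names are hypothetical, and the logic simply takes the first publication on or after the primary completion date as the event, with administrative censoring on January 1, 2025.

```python
# Sketch: derive time-to-publication and event indicator per trial.
# Column names are hypothetical; dates are pandas Timestamps.
import pandas as pd

CENSOR_DATE = pd.Timestamp("2025-01-01")

def add_outcome(trials: pd.DataFrame, pubs: pd.DataFrame) -> pd.DataFrame:
    """trials: one row per trial with ['nct_id', 'primary_completion_date'];
    pubs: one row per linked publication with ['nct_id', 'pub_date']."""
    merged = pubs.merge(trials, on="nct_id")
    # Keep only result publications: those on/after primary completion.
    post = merged[merged["pub_date"] >= merged["primary_completion_date"]]
    first_pub = (
        post.groupby("nct_id")["pub_date"].min().rename("first_pub_date").reset_index()
    )

    out = trials.merge(first_pub, on="nct_id", how="left")
    out["event"] = out["first_pub_date"].notna().astype(int)   # 1 = published
    end = out["first_pub_date"].fillna(CENSOR_DATE)             # censor unpublished trials
    out["time_days"] = (end - out["primary_completion_date"]).dt.days
    return out
```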

To address the potential misclassification of linked publications that may not represent primary results (e.g., protocol papers, systematic reviews, or commentaries), we conducted a targeted manual check of a random subset of linked articles. For a sample of 100 linked publications across CVD, cancer, and COVID-19, we reviewed abstracts to confirm that the publication reported primary or secondary trial results. This spot check indicated that the large majority (>90%) were indeed primary result papers, suggesting limited risk of systematic misclassification. We acknowledge that some protocol or secondary analyses may remain in the linked set, but their proportion is expected to be small.

In instances where the first linked publication date preceded the trial’s recorded primary completion date – which typically indicates a protocol paper or interim report – these publications were excluded from the primary time-to-event outcome to ensure that only result reports contributed to the time-to-publication measure. Trials with only pre-completion publications but no post-completion result publications were considered unpublished for the main outcome.

Statistical analysis

Study characteristics and comparison

Categorical variables are presented as frequencies and percentages, and continuous variables are summarized using means with standard deviations (SDs) and medians where appropriate. The descriptive variables included study design characteristics such as the presence of a DMC, whether the study had a U.S.-based facility, whether it was a multi-site trial, masking level (none, single, double, triple, or quadruple), intervention model (parallel, crossover, single-group, or other), allocation (randomized or non-randomized), primary purpose, inclusion of healthy volunteers, gender eligibility, inclusion of adults, children, or older adults, minimum age, and whether the study provided expanded access. Operational characteristics included funding source (Government, Private, or Other), total number of facilities, classification of arms (number of arms = 1, 2, or >2), enrollment size, number of primary and secondary outcomes to be measured, trial duration measures (e.g., months to completion and primary completion), and whether the study was classified as multiple records. For each disease type (i.e., CVD, cancer, or COVID-19), variables with more than 30% missing data were excluded from the analysis (Appendix B of Supplementary Materials). For the remaining variables, missingness was addressed through complete case analysis. COVID-19 analyses also included the calendar year of trial completion as an additional predictor, to account for rapid changes in research context and urgency over the course of the pandemic.
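
The missingness rule described above amounts to a simple two-step filter; a minimal sketch, applied to a hypothetical per-disease data frame, is shown below.

```python
# Sketch: drop variables with >30% missingness, then keep complete cases.
import pandas as pd

def apply_missingness_rule(df: pd.DataFrame, threshold: float = 0.30) -> pd.DataFrame:
    missing_frac = df.isna().mean()                       # per-column missingness
    keep_cols = missing_frac[missing_frac <= threshold].index
    return df[keep_cols].dropna()                         # complete-case analysis
```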

The outcome variable, “Publication,” represents whether a corresponding PubMed publication was observed. Bivariate analyses compared trials with and without a publication. Chi-squared tests or Fisher’s exact tests were used to compare categorical variables, while t-tests were employed for continuous variables, depending on their distribution. Kaplan-Meier curves and log-rank tests were used to compare time-to-event outcomes (e.g., time to publication) across key predictor groups.
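
These comparisons could be carried out along the following lines; the sketch uses scipy and lifelines with hypothetical column names, and is not necessarily the software the authors used.

```python
# Sketch: bivariate comparisons and a log-rank test for time to publication.
import pandas as pd
from scipy import stats
from lifelines import KaplanMeierFitter
from lifelines.statistics import logrank_test

def compare_groups(df: pd.DataFrame):
    # Categorical predictor vs. publication status: chi-squared test.
    table = pd.crosstab(df["has_dmc"], df["event"])
    chi2, p_cat, _, _ = stats.chi2_contingency(table)

    # Continuous predictor vs. publication status: two-sample t-test.
    pub = df.loc[df["event"] == 1, "enrollment"]
    unpub = df.loc[df["event"] == 0, "enrollment"]
    t_stat, p_cont = stats.ttest_ind(pub, unpub, equal_var=False)

    # Time to publication by DMC presence: Kaplan-Meier curve + log-rank test.
    with_dmc, without_dmc = df[df["has_dmc"] == 1], df[df["has_dmc"] == 0]
    kmf = KaplanMeierFitter()
    kmf.fit(with_dmc["time_days"], event_observed=with_dmc["event"], label="DMC")
    lr = logrank_test(
        with_dmc["time_days"], without_dmc["time_days"],
        event_observed_A=with_dmc["event"], event_observed_B=without_dmc["event"],
    )
    return p_cat, p_cont, lr.p_value
```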

To examine factors associated with publication likelihood, the Cox Proportional Hazards (Cox PH) model was applied. The PH assumption was tested using Schoenfeld residuals, and stratification was used if the assumption was violated for any variable. All study design and operational characteristics described above were included as covariates in the models. For COVID-19 trials, the calendar year of trial completion was also included to account for changes in the research context during the pandemic. Categorical variables were regrouped for interpretability and converted to factors. Reference categories generally reflected the most common or policy-relevant level; for example, “Government” (funding source), “Early Phase” (phase), “Treatment” (primary purpose), “Parallel” (intervention model), “None” (masking), “Randomized” (allocation), and “all” (gender eligibility).
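
A minimal Cox model sketch consistent with this description is shown below, again with hypothetical column names; lifelines’ check_assumptions performs a Schoenfeld-residual-based test of the PH assumption, and stratification is shown as the fallback when it fails.

```python
# Sketch: Cox proportional hazards model with a Schoenfeld-residual check.
import pandas as pd
from lifelines import CoxPHFitter

def fit_cox(df: pd.DataFrame) -> CoxPHFitter:
    """df holds 'time_days', 'event', and numerically encoded covariates."""
    covariates = ["enrollment_log", "n_facilities", "has_dmc", "randomized",
                  "masking_level", "n_secondary_outcomes", "results_posted"]
    model_df = df[["time_days", "event"] + covariates]

    cph = CoxPHFitter()
    cph.fit(model_df, duration_col="time_days", event_col="event")

    # Schoenfeld-residual test of the PH assumption; prints advice per covariate.
    cph.check_assumptions(model_df, p_value_threshold=0.05)

    # If the assumption fails for a covariate, refit stratified on it, e.g.:
    # cph.fit(model_df, duration_col="time_days", event_col="event", strata=["masking_level"])
    return cph

# cph.print_summary() reports log hazard ratios analogous to those in Figure 2.
```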

Results

Study characteristics and comparison

Overall study characteristics

As shown in Table 1, studies commonly had a DMC, with proportions ranging from about 45% for CVD trials to over 59% for COVID-19 studies. Most trials were funded by non-governmental sponsors, with “Other” or private sources – including individual and industry funding – accounting for the vast majority across all disease areas. Enrollment sizes varied widely, with cancer trials averaging 120 participants (median 43) and COVID-19 studies averaging 4344 participants (median 120). Most studies focused on treatment and used parallel or single-group designs. Randomization was used in most trials: about 90% of CVD and COVID-19 trials. Blinding levels ranged widely from none to quadruple; open-label designs remained common, especially in cancer trials. Eligibility was generally open to all genders and age groups, though inclusion of children was rare. The median number of secondary outcomes typically ranged from about four to five. Expanded access programs were uncommon in all three areas. See Table S1 of Supplementary Materials for variable details.

Table 1. Descriptive characteristics of clinical studies across three types of studies

Variation across disease areas

CVD trials (N = 6617) were notable for larger average enrollments (mean ∼ 3898, median ∼ 79) and moderate trial duration (mean ∼ 28, median ∼ 24 months). They were more likely to be conducted at a single site (68.3%) and less likely to include a U.S. facility (34.1%) compared to cancer trials. Cancer trials (N = 4525) tended to run longer (mean duration ∼ 40 months) but enrolled far fewer participants (mean ∼ 120, median 36), consistent with their frequent focus on smaller, tightly controlled early-phase studies. Over half of cancer trials were single-arm (51.1%) or two-arm (32.6%), and nearly 87% used no blinding, indicating open-label designs were dominant. COVID-19 studies were also more likely to use parallel designs (88.7%) and had relatively high rates of randomization (90%).

U.S. facility inclusion was highest for cancer trials (55%) and lowest for COVID-19 studies (26.8%). Cancer studies had the highest proportion reporting results (39.2%) in ClinicalTrials.gov, compared to CVD (25.0%) and COVID-19 (27.2%). Blinding patterns varied substantially: most cancer studies were open-label, while COVID-19 studies more frequently used higher-level masking than CVD trials. Minimum participant age and use of healthy volunteers also varied, with COVID-19 trials more frequently including healthy volunteers (31.6%) than CVD (18.1%) or cancer (3.3%) studies.

Factors associated with publication

Overall, publication was more common for COVID-19 trials (49.6%) than for CVD (42.3%) or cancer (32.9%). As shown in Table 2, across all three disease areas, published trials consistently showed patterns of greater operational scale, stronger oversight, and more robust design features compared to unpublished trials. Published trials were more likely to have a DMC, to report results, to be randomized, and to use a parallel intervention model. They also tended to involve multiple facilities, rather than single sites, and to apply higher levels of masking. Published studies generally reported more secondary outcomes and showed faster submission and posting timelines. While the mean enrollment was higher for published cancer trials than unpublished cancer trials, a similar difference was not observed for CVD or COVID-19.

Table 2. Comparison of study characteristics between clinical trials with and without a linked PubMed publication

Variation across disease areas

The patterns linking design features to publication varied by condition. For CVD trials, published studies more often had a DMC (52.3% vs. 39.6%), were randomized (94.1% vs. 88%), used parallel designs (83.8% vs. 78.4%), ran longer (30.02 vs. 26.23 months), operated at more sites (18 vs. 5), and measured more secondary outcomes than unpublished CVD studies. In cancer, the strongest differences reflected scale and phase: published trials were more often Phase 3 (18% vs. 8%), enrolled more participants (188 vs. 86), used parallel rather than single-group designs (44.5% vs. 34.5%), ran at more sites (25 vs. 10), and applied higher blinding levels, particularly quadruple blinding (6% vs. 3%). For COVID-19 trials, the clearest variation was operational urgency: published studies more often had a DMC (65.9% vs. 52.9%), used parallel models (97.1% vs. 85.8%), spanned more sites (11 vs. 5), measured more secondary outcomes, and showed faster registration and result reporting than unpublished COVID trials.

Variation across U.S. sponsored trials (Supplementary Table S2)

In the subgroup analysis of trials with U.S. sponsors, results showed patterns largely consistent with the overall trials, with only modest differences in magnitude. For CVD trials, published studies more often had a DMC (55.3% vs. 38.4%), were randomized (93.5% vs. 85.2%), and used parallel designs (82.1% vs. 75.8%). They also tended to have longer durations (31.4 vs. 30.1 months) and a greater number of participating sites (23 vs. 5). In cancer trials, the most pronounced differences reflected scale and phase. Published trials were more often Phase 3 (19% vs. 7%), enrolled more participants (181 vs. 86), and used parallel rather than single-group designs (47.2% vs. 33.6%). They also operated at more sites (27 vs. 11) and employed higher levels of blinding, particularly quadruple blinding (6% vs. 3%). For COVID-19 trials, published trials more frequently included a DMC (68.0% vs. 52.2%), used parallel intervention models (91.0% vs. 80.4%), and spanned more study sites (21 vs. 8).

Kaplan-Meier analysis of publication rates

Figure 1 presents the Kaplan–Meier curves comparing time to publication across the three disease areas. COVID-19 trials demonstrated the fastest time to publication, with survival probabilities dropping sharply within the first 1000 days, reflecting rapid dissemination during the pandemic. In contrast, CVD trials showed more moderate publication timelines, while cancer trials had the slowest publication rates, maintaining the highest probability of remaining unpublished over time.

Figure 1. Kaplan-Meier curve for time to publication of CVD, cancer, and COVID-19 clinical studies at ClinicalTrials.gov.

Factors associated with time to publication

As shown in Figure 2, across all three disease areas, the forest plot results showed that trials with greater operational scale and richer data collection tended to publish results more quickly. Larger enrollment sizes, more study sites, and measuring more secondary outcomes were consistent drivers of faster publication for CVD and cancer trials. Indicators of trial rigor (e.g., having a DMC, using robust masking, and reporting results) were significant in some contexts but did not emerge as significant predictors for COVID-19 trials. Offering expanded access was associated with increased publication likelihood for cancer and COVID-19 studies.

Figure 2. Forest plot of log hazard ratio from Cox PH model for predictors of linked PubMed publication. (A) Forest plot for CVD trials. (B) Forest plot for cancer trials. (C) Forest plot for COVID trials.

Variation across disease areas

While some predictors were broadly consistent, the strength and pattern of associations varied by disease. In CVD trials (Figure 2A), multiple operational and design features, such as having a DMC (0.20, p < 0.001), randomization (0.38, p < 0.001), higher blinding (single: 0.10, p = 0.03; quadruple: 0.23, p < 0.001), and including older adults (0.18, p = 0.03), were strongly linked to faster publication. In contrast, privately funded CVD trials (−0.24, p = 0.03) and those focused on supportive care (−0.22, p = 0.01) or diagnostics (−0.32, p = 0.005) lagged behind.

For cancer trials (Figure 2B), operational scale and transparency stood out more clearly: larger enrollment (0.24, p < 0.001), more sites (0.04, p = 0.04), more secondary outcomes (0.12, p < 0.001), and result reporting (0.25, p < 0.001) all favored faster publication. Unique to cancer, basic science trials (−1.45, p = 0.003), supportive care (−0.37, p = 0.03), and trials with healthy volunteers (−0.35, p = 0.05) showed clear delays. COVID-19 trials (Figure 2C) differed: most design features were not significant predictors once urgency faded. Instead, larger enrollment (0.18, p < 0.001) and expanded access (1.05, p = 0.045) drove faster reporting. Crucially, trials registered in later pandemic years were much slower to publish (−1.94, p < 0.001).

Subgroup analysis across trial structure and contextual factors

As shown in Figure 3, in rigorously designed trials that included a DMC, a greater number of secondary outcomes was consistently associated with faster publication across all three disease areas: CVD (0.10), cancer (0.09), and COVID-19 (0.12), all with p < 0.05. Interestingly, in contrast to the overall analysis, single-arm design was negatively associated with publication in cancer trials (−0.71; p = 0.01), suggesting that even among trials with oversight, certain design choices may reduce the likelihood of dissemination.

Figure 3. Subgroup analysis among trials with DMC: Forest plot of log hazard ratio from Cox PH model for predictors of linked PubMed publication. (A) Forest plot for CVD trials. (B) Forest plot for cancer trials. (C) Forest plot for COVID trials.

Figure 4 shows results from the subgroup of randomized trials. In cardiovascular studies, the presence of U.S.-based facilities (0.15; p = 0.04) and the use of intervention models other than crossover or single-group designs (0.38; p = 0.03) were positively associated with publication. In cancer trials, earlier phase designation was linked to more timely publication (0.38 for phase 2, p = 0.02; 0.60 for phase 3, p = 0.001). In contrast, inclusion of healthy volunteers in cardiovascular trials was associated with reduced publication likelihood (−0.56; p < 0.001), possibly reflecting differences in study aims or journal prioritization.

Figure 4. Subgroup analysis among trials with randomized allocation: Forest plot of log hazard ratio from Cox PH model for predictors of linked PubMed publication. (A) Forest plot for CVD trials. (B) Forest plot for cancer trials. (C) Forest plot for COVID trials.

We also examined a subgroup of U.S.-sponsored trials (Supplementary Figure S1). In cardiovascular studies, the presence of a DMC (0.33) and random allocation (0.40) were both positively associated with publication (both p < 0.01), as were higher enrollment (0.23), more facilities (0.11), and more secondary outcomes (0.11), all with p < 0.001. Conversely, having only a single facility was negatively associated with publication (−0.15; p < 0.10). In cancer studies, several factors were negatively associated with publication, including inclusion of healthy volunteers (−0.53), private U.S. sponsors (−0.42), and a primary purpose of basic science (−1.46), all at p < 0.10. On the other hand, trials offering expanded access were more likely to be published (0.71; p < 0.05). For COVID-19 studies, use of a crossover intervention model was strongly negatively associated with publication likelihood (−4.56; p < 0.05), potentially reflecting challenges in trial design appropriateness or operational feasibility during the early pandemic period.

Discussion

Main findings

In this large-scale analysis of over 11,870 interventional trials across CVD, cancer, and COVID-19, we found that stronger oversight, larger operational scale, and greater transparency were consistently linked to a higher probability of publication and a shorter time to publication in peer-reviewed journals. Trials with a DMC, randomization, robust blinding, larger enrollments, more study sites, more secondary outcomes, and posted results were more likely to be published sooner than those without these features. However, the strength and details of these patterns varied by context. In CVD trials, faster publication was linked to DMC presence, randomization, older adult inclusion, and higher masking, while privately funded, supportive care, and diagnostic trials lagged. In cancer, larger enrollment, more sites, expanded access, and result reporting promoted publication, whereas supportive care, basic science, and healthy volunteer trials were slower. For COVID-19, quicker publication was associated with larger enrollment and expanded access, but trials completed later in the pandemic period were published significantly more slowly, suggesting that the alignment of urgency, funding, and infrastructure that characterized the initial pandemic response diminished over time. Subgroup analysis of trials with a DMC showed that a greater number of secondary outcomes consistently predicted faster publication across CVD, cancer, and COVID-19, while single-arm design in cancer trials was linked to slower publication. In randomized trials, U.S. site involvement and use of an intervention model other than crossover or single-group designs were associated with faster publication in CVD. Phase 2 and 3 cancer studies were more likely to be published, whereas the inclusion of healthy volunteers in CVD was strongly linked to slower dissemination. The results show that even among rigorously designed studies, trial structure, geographic setting, and participant characteristics can meaningfully affect publication. Collectively, these results highlight that operational scale, methodological rigor, and proactive result reporting remain critical levers for shortening the time from trial completion to peer-reviewed evidence. The implications are that trials without these features are less likely to benefit the public good because the results do not make it into the public domain.

Relation to prior work

Our findings are broadly consistent with prior evidence showing that trial characteristics such as sample size, randomization, and funding source influence the likelihood and timeliness of publication [Reference Showell, Buckman and Berber11,Reference Ross, Tse, Zarin, Xu, Zhou and Krumholz13–Reference Zwierzyna, Davies, Hingorani and Hunter17]. Previous analyses have demonstrated that larger trials are generally more likely to be published but can sometimes face delays due to complex enrollment and analysis requirements [Reference Lin, Fuller and Verma18,Reference Speich, Gryaznov and Busse19]. Randomized controlled trials tend to be more publishable than non-randomized designs, although added complexity can extend timelines [Reference Rivero-de-Aguilar, Pérez-Ríos and Ruano-Raviña20]. Disease area matters as well; for instance, oncology trials have historically reported lower timely publication rates than other specialties [Reference Rivero-de-Aguilar, Pérez-Ríos and Ruano-Raviña20,Reference Chen, Desai and Ross21]. By comparing patterns across CVD, cancer, and COVID-19 trials, our study underscores both common predictors and how exceptional contexts (e.g., the urgency of a pandemic) can accelerate dissemination when scientific and operational momentum align.

Interpreting headline predictors

Key predictors, including DMC presence, randomization, larger enrollment, more sites, higher masking, and timely result posting, likely reflect broader institutional capacity and trial rigor, not isolated design choices that independently drive faster publication. Well-resourced sponsors and networks are better positioned to implement oversight, recruit diverse populations, measure more outcomes, and manage multi-site operations, all of which support stronger findings and smoother peer review. Thus, simply adding a DMC or masking to a small, under-resourced study may not accelerate publication without broader support. As prior research shows, complete result reporting, enrollment success, and trial phase or scope remain critical to producing publishable, practice-informing evidence.

Policy implications and the causality caveat

Our findings highlight several actionable areas to reduce publication delays and improve the dissemination of clinical trial evidence. Strengthening independent oversight, particularly through the presence of DMCs, and promoting rigorous design features such as randomization and masking are consistently associated with faster publication. These features likely signal methodological quality and facilitate smoother peer review, suggesting that enhancing trial rigor could accelerate dissemination. Additionally, supporting larger operational scale through adequate funding, multi-site collaboration, and robust infrastructure may help overcome logistical and institutional barriers that hinder timely reporting.

However, while these characteristics are associated with faster publication, the relationships are not necessarily causal. Trials with more rigorous designs and greater scale may benefit from being embedded in better-resourced institutions or sponsor networks with established pipelines for dissemination. These advantages may not be replicable through design choices alone. Indeed, our data show that trials conducted at single sites, lacking U.S. facilities, or funded by private or ambiguous sources were consistently slower to publish, pointing to deeper structural challenges related to infrastructure, visibility, and incentives. For example, industry-funded trials may prioritize regulatory milestones over peer-reviewed publication, while single-site studies may struggle with resource limitations, limited generalizability, or publication gatekeeping.

Therefore, improving publication rates requires more than mandating methodological features; it calls for investments in shared infrastructure, clearer expectations from sponsors, and accountability mechanisms for timely dissemination. Addressing these systemic challenges, particularly for under-resourced trials, is essential to ensure that evidence generation efforts translate into public health benefit. While methodological rigor is an important enabler, its effectiveness ultimately depends on the organizational quality, incentives, and collaborative ecosystems in which trials are conducted.

Why peer-reviewed publication still matters

Finally, this study reinforces why peer-reviewed journal publication continues to matter, even in an era of mandatory trial registration and result posting. Registries like ClinicalTrials.gov provide a baseline for transparency, but journal articles add critical value by subjecting findings to independent peer review, providing context and detailed analysis, and making results discoverable and citable within the broader evidence base. Publication helps integrate individual trials into systematic reviews, clinical guidelines, and health policy decisions. As our findings show, trials with stronger oversight, broader operational scope, and more rigorous designs were more likely to reach publication, underscoring that registries alone cannot fully substitute for high-quality, peer-reviewed dissemination.

Limitations and future directions

To link PubMed articles to ClinicalTrials.gov records, we used NCT identifiers as unique tags. While efficient, this approach can undercount publications, as some articles omit NCT numbers. A sensitivity check in CVD trials suggested that ∼19% of “unpublished” studies likely had publications without NCT links, modestly underestimating true rates. However, this nondifferential misclassification likely did not bias observed associations.

Additionally, we recognize that ClinicalTrials.gov may contain incomplete or inconsistently reported information about funding sources, which served as one of our key predictor variables. Prior studies [Reference Ross, Tse, Zarin, Xu, Zhou and Krumholz13,Reference Hartung, Zarin, Guise, McDonagh, Paynter and Helfand22] have documented substantial inconsistencies in the way sponsor and funding details are reported in the registry, particularly for trials with non-governmental or industry support. These discrepancies may result in the under-identification of private or governmental funding in some trials, which could attenuate observed associations between sponsor type and publication outcomes in our analysis. While we attempted to mitigate this issue by disaggregating funding into finer-grained subgroups (e.g., NIH, industry, network), some degree of misclassification is likely, and we explicitly acknowledge this as a limitation. Future efforts may benefit from triangulating registry data with sponsor disclosures in publications to enhance the accuracy of funding classification.

We also noted that testing multiple predictors across three Cox models may raise the possibility of spurious associations, but our analyses were descriptive rather than confirmatory, so we did not adjust for multiple comparisons. In addition, we modeled each disease area separately rather than using a pooled model, which limited formal interaction testing but allowed inclusion of richer, area-specific covariates given differing missingness patterns. Future work could explore integrated models, improved linkage methods, and targeted interventions to reduce publication delays across trial settings.

Conclusion

This study highlights the key operational and methodological factors that shape whether and how quickly interventional trials reach publication. Across more than 10,000 CVD, cancer, and COVID-19 trials, timely publication was strongly associated with DMC presence, methodological rigor (randomization and masking), and greater operational scale (large enrollment, multi-site participation, multiple outcome measures). Transparent practices, particularly the posting of trial results, also accelerated dissemination. Conversely, trials with private or other non-governmental funding and single-site settings experienced slower publication. Subgroup analysis showed that even among rigorous designs (DMC and randomized), specific structural and design choices influence dissemination speed and vary by disease. Collectively, these findings underscore the need to strengthen infrastructure, promote rigorous and transparent practices, and address structural barriers to support more equitable and timely dissemination. Our survival analysis reveals actionable opportunities to improve trial design, transparency, and the translation of findings into clinical and policy impact.

Supplementary material

The supplementary material for this article can be found at https://doi.org/10.1017/cts.2025.10201.

Data sharing statement

The data that support the findings of this study are available from the author Le Li or Haoyuan Wang upon reasonable request.

Code availability statement

The custom code that supports the findings of this study is available from the author Le Li or Haoyuan Wang upon reasonable request. The code is provided for academic, non-commercial use only.

Acknowledgments

The authors did not make use of an AI system/tool to assist with the drafting of this article.

Author contributions

Haoyuan Wang: Data curation, Formal analysis, Investigation, Methodology, Visualization, Writing – original draft; Le Li: Formal analysis, Investigation, Visualization, Writing – original draft; Chuan Hong: Conceptualization, Formal analysis, Investigation, Methodology, Project administration, Writing – original draft, Writing – review & editing; Rui Yang: Data curation, Formal analysis, Investigation, Visualization; Karen Chiswell: Formal analysis, Investigation, Project administration, Writing – original draft, Writing – review & editing; Sara B. Calvert: Project administration, Writing – review & editing; Lesley Curtis: Writing – review & editing; Ali B. Abbasi: Writing – review & editing; Scott Palmer: Writing – review & editing; Adrian F. Hernandez: Writing – review & editing; Frank W. Rockhold: Conceptualization, Methodology, Writing – review & editing; Christopher Lindsell: Conceptualization, Investigation, Project administration, Supervision, Writing – original draft, Writing – review & editing.

Funding statement

This research received no specific grant from any funding agency, commercial, or not-for-profit sectors.

Competing interests

The author(s) declare none.

Footnotes

Haoyuan Wang, Le Li and Chuan Hong contributed equally.

References

Fogel, DB. Factors associated with clinical trials that fail and opportunities for improving the likelihood of success: A review. Contemp Clin Trials Commun. 2018;11:156.
Prager, EM, Chambers, KE, Plotkin, JL, et al. Improving transparency and scientific rigor in academic publishing. Brain Behav. 2018;9:e01141.
He, Z, Tang, X, Yang, X, et al. Clinical trial generalizability assessment in the big data era: A review. Clin Transl Sci. 2020;13:675.
Fleming, TR. Data monitoring committees and capturing relevant information of high quality. Stat Med. 1993;12:565–570. doi: 10.1002/sim.4780120524.
Yue, L, Xing, S, Chen, J, Fu, T. TrialEnroll: Predicting clinical trial enrollment success with deep & cross network and large language models. In: Proceedings of the 15th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics (BCB ’24). 2024:1–9. doi: 10.1145/3698587.3701375.
Zheng, W, Peng, D, Xu, H, et al. Multimodal clinical trial outcome prediction with large language models. 2024. (http://arxiv.org/abs/2402.06512) Accessed January 23, 2025.
Bashir, R, Bourgeois, FT, Dunn, AG. A systematic review of the processes used to link clinical trial registrations to their published results. Syst Rev. 2017;6:117.
Scherer, RW, Huynh, L, Ervin, AM, Taylor, J, Dickersin, K. ClinicalTrials.gov registration can supplement information in abstracts for systematic reviews: A comparison study. BMC Med Res Methodol. 2013;13:79. doi: 10.1186/1471-2288-13-79.
Citrome, L. Beyond PubMed: Searching the “Grey Literature” for clinical trial results. Innov Clin Neurosci. 2014;11:42.
Venugopal, N, Saberwal, G. A comparative analysis of important public clinical trial registries, and a proposal for an interim ideal one. PLoS One. 2021;16:e0251191.
Showell, M, Buckman, S, Berber, S, et al. Publication bias in trials registered in the Australian New Zealand Clinical Trials Registry: Is it a problem? A cross-sectional study. PLoS One. 2023;18:e0279926.
Tasneem, A, Aberle, L, Ananth, H, et al. The database for aggregate analysis of ClinicalTrials.gov (AACT) and subsequent regrouping by clinical specialty. PLoS One. 2012;7:e33677. doi: 10.1371/journal.pone.0033677.
Ross, JS, Tse, T, Zarin, DA, Xu, H, Zhou, L, Krumholz, HM. Publication of NIH funded trials registered in ClinicalTrials.gov: Cross sectional analysis. BMJ. 2012;344:d7292. doi: 10.1136/bmj.d7292.
Chen, YP, Liu, X, Lv, JW, et al. Publication status of contemporary oncology randomised controlled trials worldwide. Eur J Cancer. 2016;66:17–25. doi: 10.1016/j.ejca.2016.06.010.
Saito, H, Gill, CJ. How frequently do the results from completed US clinical trials enter the public domain? A statistical analysis of the ClinicalTrials.gov database. PLoS One. 2014;9:e101826.
van Heteren, JAA, van Beurden, I, Peters, JPM, Smit, AL, Stegeman, I. Trial registration, publication rate and characteristics in the research field of otology: A cross-sectional study. PLoS One. 2019;14:e0219458.
Zwierzyna, M, Davies, M, Hingorani, AD, Hunter, J. Clinical trial design and dissemination: Comprehensive analysis of clinicaltrials.gov and PubMed data since 2005. BMJ. 2018;361:k2130. doi: 10.1136/bmj.k2130.
Lin, TA, Fuller, CD, Verma, V, et al. Trial sponsorship and time to reporting for Phase 3 randomized cancer clinical trials. Cancers. 2020;12:2636. doi: 10.3390/cancers12092636.
Speich, B, Gryaznov, D, Busse, JW, et al. Nonregistration, discontinuation, and nonpublication of randomized trials: A repeated metaresearch analysis. PLoS Med. 2022;19:e1003980.
Rivero-de-Aguilar, A, Pérez-Ríos, M, Ruano-Raviña, A, et al. Evidence of publication bias in multiple sclerosis clinical trials: A comparative analysis of published and unpublished studies registered in ClinicalTrials.gov. J Neurol Neurosurg Psychiatry. 2023;94:597–604. doi: 10.1136/jnnp-2023-331132.
Chen, R, Desai, NR, Ross, JS, et al. Publication and reporting of clinical trial results: Cross sectional analysis across academic medical centers. BMJ. 2016;352:i637. doi: 10.1136/bmj.i637.
Hartung, DM, Zarin, DA, Guise, JM, McDonagh, M, Paynter, R, Helfand, M. Reporting discrepancies between the ClinicalTrials.gov results database and peer-reviewed publications. Ann Intern Med. 2014;160:477–483.