Perceived benefits of open data are improving but scientists still lack resources, skills, and rewards

Interdisciplinary Studies

J. Borycz, R. Olendorf, et al.

This study by Joshua Borycz and colleagues examines how individual, social, and organizational factors shape researchers' willingness to share data. It finds that willingness to share correlates inversely with satisfaction with data-sharing resources.

Introduction

The paper examines how scientists’ attitudes toward open data sharing and reuse have evolved over the past decade and which factors drive those attitudes. Open data is increasingly viewed as central to addressing grand scientific challenges and improving the return on research investment, but adoption depends on cultural shifts among researchers, institutions, and funders. Prior DataONE surveys (2011, 2015, 2020) documented barriers such as lack of recognition, perceived risks, and insufficient resources, while also noting growing willingness to share. The authors seek to quantitatively connect these surveys to assess longitudinal change and to identify the influence of individual motivations, social norms, and organizational supports on data-sharing attitudes and behaviors, guided by the Theory of Planned Behavior and the Technology Acceptance Model. They hypothesize that a trustworthy and open research climate, perceived career benefits with reduced risks, and increased mandates, tools, and supports will be positively associated with improved data-sharing and reuse attitudes.

Literature Review

The study builds on substantial literature on open data’s impact on research quality, reproducibility, and scientific progress, and international initiatives like FAIR, GO FAIR, and DataONE that promote data-sharing infrastructure and practices. Prior work shows that lack of recognition and rewards is a key barrier; positive experiences with data reuse correlate with greater reuse; and institutional resources (training, repositories, metadata tools) significantly increase reuse. Theoretical grounding includes the Theory of Reasoned Action, Theory of Planned Behavior, and Technology Acceptance Model, emphasizing roles of social norms, perceived usefulness/benefits, ease-of-use/effort, and facilitating conditions (e.g., mandates, training, tools). Earlier DataONE surveys (2011, 2015, 2020) reported increasing willingness to share, ongoing perceived risks, and impediments stemming from lack of resources or perceived benefits. The current work integrates these strands to analyze individual, social, and organizational influences over time.

Methodology

Design: The authors integrate three DataONE surveys originally fielded circa 2009–2010 (published 2011), 2013–2014 (published 2015), and 2017–2018 (published 2018/2020). They select only questions that are comparable across all three surveys to construct a longitudinal dataset.

Data cleaning and harmonization: Minor wording and response-option differences across surveys were reconciled. For yes/no/"not sure" items, "not sure" was set to NA. For Likert items where both "neither agree nor disagree" and "not sure" appeared, the two categories were combined into "neither agree nor disagree." Respondents who skipped more than 5 questions in any survey were excluded. Demographic variables were harmonized as follows:
  • Regions: aggregated into six IUCN-based regions (Africa & Middle East, Asia & Southeast Asia, Australia & New Zealand, Europe & Russia, Latin America, USA & Canada).
  • Domains: clustered into Natural Sciences (e.g., biology, medicine, zoology), Physical Sciences (e.g., physics, chemistry, engineering), Information Science, Social Science, and Other.
  • Work sectors: Academic, Commercial, Government, Non-Profit, Other.
  • Funding sources: Corporation, Federal & national government, Private foundation, State/regional/local government, Other.
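
The recoding rules above can be sketched in a few lines of Python. This is only an illustration of the stated rules, not the authors' code (the study's analysis was done in R): the item names, the lower-casing of responses, and the choice to count skipped items before recoding "not sure" are all assumptions.

```python
# Hypothetical harmonization sketch; item names and response labels are illustrative.
NA = None  # stand-in for R's NA


def clean_binary(answer):
    """Recode a yes/no/"not sure" answer: "not sure" becomes NA."""
    if answer is None:
        return NA
    answer = answer.strip().lower()
    return NA if answer == "not sure" else answer


def clean_likert(answer):
    """Fold "not sure" into the neutral Likert category."""
    if answer is None:
        return NA
    answer = answer.strip().lower()
    return "neither agree nor disagree" if answer == "not sure" else answer


def harmonize(respondents, binary_items, likert_items, max_skipped=5):
    """Recode every respondent and drop those who skipped more than max_skipped items."""
    kept = []
    for row in respondents:
        # Assumption: skips are counted before recoding, so an explicit
        # "not sure" answer is not treated as a skipped question.
        skipped = sum(1 for item in binary_items + likert_items
                      if row.get(item) is None)
        if skipped > max_skipped:
            continue
        cleaned = dict(row)
        for item in binary_items:
            cleaned[item] = clean_binary(row.get(item))
        for item in likert_items:
            cleaned[item] = clean_likert(row.get(item))
        kept.append(cleaned)
    return kept
```

Here each respondent is a dictionary keyed by item name, and `binary_items` and `likert_items` are lists of those names. A missing key or a `None` value counts as a skipped question.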

Sample and variables: The cleaned dataset comprised 42 questions and 3,197 respondents (2011: 1,214; 2015: 539; 2020: 1,444). Four questions captured demographics; the remaining 38 measured attitudes and beliefs about data sharing via binary and Likert-scale responses.

Statistical analysis: Multiple Factor Analysis (MFA) was conducted in R using FactoMineR and factoextra, with respondents as rows and survey items as columns, to reduce dimensionality and identify latent dimensions explaining variance. The first two orthogonal dimensions exceeding the average explained variance threshold were retained and interpreted based on top contributing questions. ANOVA (R stats) tested effects of demographics (region, domain, work sector, funding) on MFA dimensions, followed by Tukey’s HSD for pairwise comparisons. Diagnostics included Bartlett’s test for homogeneity of variance and residual/QQ plots; non-homogeneous variances were noted, reflecting categorical response formats and sampling differences across surveys. All data and code are available on Zenodo (Olendorf et al., 2022).
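
One plausible reading of the "average explained variance threshold" is that a dimension is retained when its share of variance exceeds 100% divided by the number of dimensions. The sketch below shows that rule only. The eigenvalues are made up for illustration, and the function name is hypothetical; it is not part of FactoMineR or factoextra.

```python
def retained_dimensions(eigenvalues):
    """Return (dimension number, % variance) pairs above the average share.

    The threshold is the share each dimension would explain if variance
    were spread evenly across all dimensions: 100 / number of dimensions.
    """
    total = sum(eigenvalues)
    threshold = 100.0 / len(eigenvalues)
    shares = [100.0 * ev / total for ev in eigenvalues]
    kept = [(i + 1, share) for i, share in enumerate(shares) if share > threshold]
    return kept, threshold


# Made-up eigenvalues, purely for illustration (not values from the study).
example_eigenvalues = [3.0, 2.0, 1.0, 1.0, 1.0]
kept, threshold = retained_dimensions(example_eigenvalues)
for dim, share in kept:
    print(f"Dimension {dim}: {share:.1f}% (threshold {threshold:.1f}%)")
```

With many dimensions the threshold becomes small, which is how dimensions that each explain only a few percent of the variance can still exceed it.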

Key Findings
  • Two MFA dimensions exceeded the average explained variance: Dimension 1 (4.0%) interpreted as willingness to share; Dimension 2 (2.9%) interpreted as satisfaction with resources.
  • Dimension 1 (willingness to share) was driven primarily by items reflecting openness to sharing and reduced concerns about risks (e.g., willingness to place data in unrestricted repositories; agreements on use). Top contributing items showed negative correlations when expressing openness and positive correlations when expressing conditions/risks.
  • Dimension 2 (satisfaction with resources) was driven by items reflecting satisfaction with processes, long-term storage, repositories, and access to others’ data; it correlated negatively with agreement to satisfaction statements and positively with risk/condition statements.
  • Framework contributions: For Dimension 1, contributions were Individual 68.2%, Social 27.5%, Organizational 4.3%. For Dimension 2, contributions were Individual 44.1%, Social 25.0%, Organizational 30.9%.
  • Regional patterns: Australia & New Zealand (mean 0.288), USA & Canada (0.238), and Europe & Russia (−0.037) were more willing to share than Asia & Southeast Asia (−1.020) and Africa & Middle East (−0.939). USA & Canada showed significant increases 2011→2015 (−0.446 to 0.635, p<0.001) and 2011→2020 (−0.446 to 0.908, p<0.001); Europe & Russia increased 2011→2020 (−0.644 to 0.264, p<0.001). Satisfaction with resources was higher in Africa & Middle East (0.770), Asia & Southeast Asia (0.587), and Latin America (0.510) than in USA & Canada (−0.163) and Europe & Russia (−0.086). Across regions and surveys, willingness to share was inversely correlated with satisfaction with resources (R^2=0.80, p=0.017; F=15.67).
  • Domain patterns: Physical Sciences (0.291) and Information Science (0.660) were more willing to share than Natural Sciences (−0.128); Social Science (−0.592) was less willing than Physical, Natural, and Information Sciences. Willingness to share increased significantly over time in Physical Sciences (2011→2015: −0.566 to 0.462, p<0.01; 2011→2020: −0.566 to 0.585, p<0.001) and Natural Sciences (2011→2015: −0.485 to 0.179, p<0.01; 2011→2020: −0.485 to 0.216, p<0.001). Satisfaction with resources increased only in Social Science (2011→2020: −0.259 to 1.026, p<0.05).
  • Work sector: Government respondents (0.756) were more willing to share than Academic (−0.046) and Commercial (−0.264) respondents. Willingness to share rose in Government (2011→2015: −0.197 to 1.020, p<0.01; 2011→2020: −0.197 to 1.249, p<0.001) and Academia (2011→2015: −0.604 to 0.161, p<0.001; 2011→2020: −0.604 to 0.155, p<0.001). Satisfaction with resources was higher in Commercial (0.645) than in Government (−0.015) and Academic (−0.046), with no clear time trend.
  • Funding source: Federal & national funding (0.257) was associated with higher willingness to share than State/regional/local (−0.676), Corporate (−0.555), or Private foundation (−0.312) funding. Willingness to share increased among federally/nationally funded researchers (2011→2015: −0.323 to 0.605, p<0.001; 2011→2020: −0.323 to 0.596, p<0.001); corporate-funded researchers showed a marginal, non-significant increase (2011→2020: −1.305 to 0.318, p=0.055). Satisfaction with resources showed no significant overall change by funding source.
  • Overall pattern: Individual perceptions of benefit, risk, and effort chiefly drive willingness to share; satisfaction with resources is more strongly shaped by organizational and social conditions. Notably, higher willingness to share often coincided with lower satisfaction with resources across regions, suggesting that engagement exposes unmet resource needs.
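
The inverse relationship in the regional findings can be illustrated with the regional means quoted above. The sketch uses only the four regions for which both a willingness mean and a satisfaction mean are listed here, so it does not reproduce the reported R^2=0.80, which was computed on the authors' full data. The helper `f_statistic` applies the standard identity F = R^2(n−2)/(1−R^2) for a regression with one predictor; n = 6 (one point per region) is an assumption that this summary does not state.

```python
import math

# Regional means quoted in the findings: (willingness to share, satisfaction
# with resources). Only regions with both values listed are included.
regions = {
    "USA & Canada": (0.238, -0.163),
    "Europe & Russia": (-0.037, -0.086),
    "Asia & Southeast Asia": (-1.020, 0.587),
    "Africa & Middle East": (-0.939, 0.770),
}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def f_statistic(r_squared, n):
    """F statistic of a one-predictor linear regression fitted to n points."""
    return r_squared * (n - 2) / (1.0 - r_squared)


willingness = [w for w, _ in regions.values()]
satisfaction = [s for _, s in regions.values()]
r = pearson_r(willingness, satisfaction)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}")  # negative r means an inverse relationship

# Assumption: n = 6 regions. Compare the result with the reported F = 15.67.
print(f"F = {f_statistic(0.80, 6):.2f}")
```

Under the n = 6 assumption, R^2=0.80 gives F = 16, which is close to the reported 15.67 once rounding of R^2 is allowed for.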
Discussion

Findings support the hypothesized roles of individual, social, and organizational influences on open data attitudes. Willingness to share (Dimension 1) is most strongly associated with individual-level cost–benefit and risk assessments, also reflecting social norms of trust and credit within communities. Satisfaction with resources (Dimension 2) is more influenced by organizational supports such as training, repositories, and data policies, as well as social conditions. The inverse relationship between willingness to share and satisfaction with resources across regions suggests that as researchers become more engaged in sharing, they better recognize gaps in tools, training, and infrastructure. Policy environments appear influential: increases in government sectors and among federally funded researchers align with mandates and open government data policies (e.g., U.S. Open Data Policy, EU PSI). Domain differences are consistent with collaborative infrastructures and shared instruments in physical/natural sciences fostering openness, while social sciences face data type constraints and differing norms. Overall, the study indicates broad gains in willingness to share without commensurate improvements in perceived resource adequacy, highlighting a need to scale infrastructure and support to meet growing demand for open data practices.

Conclusion

The study quantitatively integrates three influential DataONE surveys to show that scientists’ willingness to share data has increased over the past decade, particularly in regions with strong collaboration networks, in government and academic sectors, and among researchers funded by federal or national agencies. Individual perceptions of benefits and reduced risks, along with trustworthy, credit-giving social norms, are key to positive attitudes. Organizational tools, training, repositories, and mandates enhance satisfaction with resources, but many researchers remain under-resourced relative to their growing openness. The authors recommend: (1) strengthening mandates and incentives alongside scalable training and infrastructure; (2) fostering collaborations with countries and communities with fewer resources to spread open data norms; and (3) designing future surveys with consistent theoretical frameworks and standardized questions to enable robust longitudinal analysis. Increasing investment in data-sharing resources is vital to translate willingness into actual practice.

Limitations
  • The three surveys were not originally designed as a longitudinal study or with a unified theoretical framework, limiting the strength of across-time inferences.
  • Question wording and response scales varied between survey waves; although harmonized, this introduces potential measurement inconsistencies.
  • Categorical response formats (yes/no, 5-point Likert) and sampling differences led to non-homogeneous variances in diagnostics (e.g., Bartlett’s tests), complicating some analyses.
  • Sampling proportions differed notably in 2020 across demographics, which could affect generalizability, though primary changes were observed between 2011 and 2015 when samples were more similar.
  • The first two MFA dimensions explained a modest share of variance, reflecting constraints of post hoc integration rather than purpose-built measurement.