Perceived benefits of open data are improving but scientists still lack resources, skills, and rewards

Interdisciplinary Studies

J. Borycz, R. Olendorf, et al.

This study by Joshua Borycz and colleagues examines scientists' data-sharing behavior in the context of global scientific challenges, showing how individual, social, and organizational factors influence researchers' willingness to share data. It reports that satisfaction with data-sharing resources correlates inversely with willingness to share, and considers the role that communication networks play in this dynamic.

Introduction
Open data practices are crucial for addressing global scientific challenges such as climate change and infectious diseases. Data-centered science, with its emphasis on open data, is increasingly recognized as fundamental to maximizing the return on research investment. Achieving it requires a cultural shift in scientific practice and a concerted effort from researchers, institutions, and funding sources. The impact of data sharing on research quality has been studied extensively and has motivated efforts such as the FAIR initiative and DataONE, which aim to improve data sharing practices and to understand scientists' attitudes. Previous DataONE surveys (2011, 2015, 2020) highlighted barriers to data sharing, including a lack of recognition, rewards, and resources. This paper analyzes these three surveys to quantitatively assess how scientists' attitudes toward data sharing and reuse have changed and to identify the factors driving those changes. It uses the Theory of Planned Behavior (TPB) and the Technology Acceptance Model (TAM) to analyze the influence of individual, social, and organizational factors on data sharing attitudes.
Literature Review
Existing research indicates that a lack of recognition and rewards for data publication is a major barrier to data sharing. Researchers' data reuse behavior correlates positively with their attitudes: those who find data reuse efficient are more likely to reuse shared data. The availability of institutional resources, such as training and data repositories, strongly influences data reuse rates. DataONE, established in 2009, aimed to change attitudes and behaviors by providing infrastructure and training, focusing on the biological and environmental sciences. The FAIR initiative offers a multifaceted approach to assessing data sharing and guiding training. Mandates from organizations such as the OSTP and NIH highlight the importance of understanding the impact of these initiatives. Three large DataONE surveys (2011, 2015, 2020) have significantly influenced the field, providing a benchmark for assessing data-sharing attitudes and behaviors. The first survey revealed challenges such as access to and preservation of data, alongside a willingness to share data under certain conditions. The later surveys showed increased willingness to share data, but also identified perceived risks, a persistent lack of resources, and a lack of perceived benefits as impediments. This paper aims to integrate and quantitatively analyze these three surveys.
Methodology
This study integrated data from three DataONE surveys (2011, 2015, 2020). Data cleaning involved harmonizing questions and answer types across surveys and resolving inconsistencies in wording and response options. Respondents who answered fewer than five questions were excluded. The cleaned dataset consisted of 42 questions and responses from 3197 individuals. Demographic information (region, domain, sector, funding) was included and standardized across surveys. Countries were grouped into six regions based on IUCN regions. Domains were categorized as natural science, physical science, information science, social science, or other. Work sectors were classified as academic, commercial, government, non-profit, or other, and funding agencies were grouped as corporation, federal/national government, private foundation, state/regional/local government, or other. Multiple Factor Analysis (MFA) was performed in R (FactoMineR and factoextra packages) to reduce the dimensionality of the survey data and identify the key factors explaining variance in responses. Analysis of Variance (ANOVA) and Tukey's HSD test were used to assess the effects of demographics on the identified factors. Data and code are available on Zenodo. Bartlett's test for homogeneity of variance and residual tests were conducted; non-homogeneous variance was observed, owing to the categorical nature of the response variables. Despite this, standardized residual plots generally showed zero correlation with the two primary dimensions of the MFA, indicating that the data likely originated from the same distributions.
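Two of the steps described above, the exclusion rule for sparse respondents and the one-way ANOVA on a factor score, can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' code (which was written in R and is available on Zenodo). The function names, the use of None for an unanswered question, and all example values are assumptions made for illustration.

```python
from statistics import mean

MIN_ANSWERED = 5  # threshold stated in the text: at least five answers


def keep_respondent(answers, min_answered=MIN_ANSWERED):
    """Return True if the respondent answered at least `min_answered` questions.

    `answers` maps question id -> response; None marks an unanswered
    question (an assumed encoding, not taken from the survey files).
    """
    answered = sum(1 for value in answers.values() if value is not None)
    return answered >= min_answered


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA.

    `groups` maps a demographic level (e.g. a work sector) to a list of
    scores on one dimension (e.g. an MFA coordinate).
    """
    all_values = [v for scores in groups.values() for v in scores]
    grand_mean = mean(all_values)
    group_means = {name: mean(scores) for name, scores in groups.items()}

    # Variation of group means around the grand mean, weighted by group size.
    ss_between = sum(
        len(scores) * (group_means[name] - grand_mean) ** 2
        for name, scores in groups.items()
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (v - group_means[name]) ** 2
        for name, scores in groups.items()
        for v in scores
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical respondents: the second one answered only three questions.
respondents = [
    {"q1": 4, "q2": 2, "q3": 5, "q4": 1, "q5": 3, "q6": None},
    {"q1": 4, "q2": None, "q3": 5, "q4": None, "q5": 3, "q6": None},
]
kept = [r for r in respondents if keep_respondent(r)]
print(len(kept), "of", len(respondents), "respondents kept")

# Hypothetical dimension scores by sector (placeholder numbers, not study data).
scores_by_sector = {
    "academic": [1.0, 2.0, 3.0],
    "government": [2.0, 3.0, 4.0],
    "commercial": [5.0, 6.0, 7.0],
}
f_stat, df_b, df_w = one_way_anova_f(scores_by_sector)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A significant F statistic only says that at least one group mean differs. A post-hoc test such as Tukey's HSD, as used in the study, is then needed to identify which pairs of groups differ.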
Key Findings
MFA of the survey data revealed two dominant dimensions explaining 4.0% and 2.9% of the variance respectively: 'willingness to share' and 'satisfaction with resources'. Dimension 1 ('willingness to share') was strongly associated with individual perceptions of career benefits and risks. Dimension 2 ('satisfaction with resources') was more significantly affected by social and organizational factors. Analysis of demographic impacts revealed regional variations in willingness to share and satisfaction with resources. Researchers from Australia & New Zealand, USA & Canada, and Europe & Russia demonstrated greater willingness to share than those from Asia & Southeast Asia and Africa & Middle East. However, satisfaction with resources was higher in the latter regions. A significant inverse correlation was found between willingness to share and satisfaction with resources across regions and surveys. Analyzing the data by scientific domain revealed that physical and information scientists showed higher willingness to share than natural scientists; social scientists displayed the lowest willingness. Over time, willingness to share increased significantly in physical and natural sciences. Satisfaction with resources significantly increased only for social scientists. Examination by work sector indicated that government employees were far more willing to share data than academics or those in the commercial sector. Government sector willingness to share increased significantly over time. Satisfaction with resources was higher in the commercial sector. Finally, analyzing funding sources showed that researchers funded by federal and national governments exhibited substantially higher willingness to share than those with other funding sources. Federal/national government funding showed a significant increase in willingness to share over time.
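The inverse relationship reported above is a correlation between two sets of group-level scores. As a minimal sketch of how such a relationship could be checked, the following computes a Pearson correlation coefficient between mean scores on the two dimensions. The function name is illustrative and the numbers are placeholders invented for the example, not values from the study.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Placeholder group means on the two dimensions (NOT results from the paper):
# one entry per hypothetical region-by-survey group.
willingness_to_share = [0.4, 0.3, 0.2, -0.3, -0.5]
satisfaction_with_resources = [-0.3, -0.2, -0.1, 0.4, 0.5]

r = pearson_r(willingness_to_share, satisfaction_with_resources)
print(f"r = {r:.2f}")  # a negative r indicates an inverse relationship
```

A coefficient near -1 would mean that groups scoring higher on willingness to share tend to score lower on satisfaction with resources. Whether such a correlation is statistically significant depends on the number of groups and would need a separate test.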
Discussion
The findings support the hypotheses derived from the TPB and TAM framework. The strong influence of individual perceptions of career benefits and risks on willingness to share aligns with H2. The significant impact of social influences on willingness to share supports H1, indicating that a trustworthy and open research climate encourages sharing. The positive association between organizational factors (training, tools) and satisfaction with resources confirms H3. The inverse relationship between willingness to share and satisfaction with resources suggests that practical experience with data-sharing tools and procedures is essential for understanding resource needs. Regional differences reflect variations in collaboration networks and data infrastructure. Domain differences likely reflect the nature of research, with more collaborative fields exhibiting higher willingness to share. Sectoral differences highlight the role of mandates and organizational culture in driving data sharing. Funding source effects underscore the influence of policies and expectations from funding agencies. The substantial increase in willingness to share since 2011 is likely attributable to the implementation of open data policies and the growth of international collaborations.
Conclusion
This study demonstrates the importance of individual, social, and organizational factors in shaping scientists' attitudes toward data sharing. Increased willingness to share data in recent years, particularly in government and academic sectors, suggests that mandates and social norms are effective. The variations in domain-specific uptake highlight the role of community norms and collaborative infrastructure. The inverse relationship between willingness to share and satisfaction with resources underscores the need for practical experience to inform resource requirements. Future research should focus on targeted interventions to address specific regional, domain, sectoral, and funding-related barriers to data sharing.
Limitations
The surveys were not originally designed for longitudinal analysis under this specific theoretical framework. The survey questions were broad, so their contributions to the MFA dimensions were less precise. Sampling differences between surveys may also have influenced the results. Future surveys should use a consistent theoretical framework and standardized questions over time.