Medicine and Health
Making Biomedical Research Software FAIR: Actionable Step-by-step Guidelines with a User-support Tool
B. Patel, S. Soundarajan, et al.
Research software is increasingly vital across scientific domains, including biomedical research, where software underpins data collection, analysis, modeling, and AI/ML applications. Surveys indicate high reliance on software (e.g., 92% of academics use software; 69% say research would not be practical without it; 56% develop their own software). Despite the 2016 FAIR data principles, multiple groups have shown the original principles do not directly capture software-specific traits (e.g., dependencies, versioning). Reformulated principles tailored to research software emerged via Lamprecht et al. and, subsequently, the FAIR4RS Working Group, culminating in FAIR4RS v1.0 (2022). However, FAIR4RS remains aspirational and lacks concrete implementation guidance. Prior actionable efforts based on original FAIR (e.g., five recommendations for FAIR software) do not fully map to FAIR4RS. The absence of practical, step-by-step guidance hinders widespread FAIR adoption in research software, with the COVID-19 context highlighting reproducibility challenges when software is hard to find or cite. This work introduces the first minimal, actionable, step-by-step guidelines—FAIR-BioRS—focused on biomedical research software, derived from FAIR4RS v1.0 and a thorough review of current practices. To reduce burden on researchers and facilitate adoption, the authors also developed FAIRshare, a free, open-source tool that guides users through implementing FAIR-BioRS with user-friendly interfaces and automation.
The authors conducted a comprehensive review to identify practical recommendations aligned with FAIR4RS implementation. Sources included: published FAIR-for-software resources; references cited by those sources; FAIR4Software and FAIR4RS reading lists and Zenodo community page; PubMed literature (Jan 2015–Feb 2023); and additional studies known to the authors. English-language resources across domains were considered when relevant to research software reusability. After de-duplication, 261 records were screened, 70 full texts assessed, and 39 resources included. Data from the review were organized in a structured dataset (data.xlsx) and analyzed using a Jupyter notebook with pandas, Matplotlib, and seaborn to synthesize recommendations for each category (standards/best practices, metadata, licensing, repositories, registries).
The work proceeded in stages: (1) Derive concise, high-level instructions to fulfill each FAIR4RS v1.0 principle by analyzing the FAIR4RS publications and introduction manuscript; (2) Group overlapping instructions into five action-based categories: Develop with standards/best practices; Include metadata; Provide a license; Share in a repository; Register in a registry; (3) Define outstanding questions per category (e.g., which repositories/registries to use, what metadata to include); (4) Conduct a structured literature review and analyze current practices to answer these questions; (5) Organize the resulting recommendations into step-by-step guidelines that align with a typical software development lifecycle, producing the FAIR-BioRS guidelines (v2.0.0).
The guidelines prescribe concrete actions such as selecting a VCS platform, choosing an OSI-approved license (preferably MIT or Apache 2.0), maintaining README and CHANGELOG, including codemeta.json and CITATION.cff with specified minimum fields, archiving each release on Zenodo (or Figshare) and Software Heritage, optionally using deployment package registries (e.g., PyPI/CRAN), and registering on bio.tools (and optionally RRID).
To facilitate adoption, the authors developed FAIRshare, a cross-platform desktop application (Electron frontend with Vue.js; Python/Flask backend) that guides users through a workflow aligned with FAIR-BioRS, including forms to generate codemeta.json and CITATION.cff, license selection from OSI-approved options, DOI reservation via draft deposits on Zenodo/Figshare before publication, optional GitHub integration, and optional registration on bio.tools. FAIRshare pre-populates forms from existing codemeta.json or GitHub metadata and enforces essential fields identified by FAIR-BioRS.
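The files that the guidelines ask for in a project root (README, CHANGELOG, LICENSE, codemeta.json, CITATION.cff) lend themselves to a simple automated check. The following is a minimal sketch of such a check, not FAIRshare's implementation; the function name `missing_fair_files` and the matching rules (README, CHANGELOG, and LICENSE matched by file stem with any extension, the two metadata files matched by exact name) are assumptions made for illustration.

```python
from pathlib import Path

# Files named in the guidelines. Matching README/CHANGELOG/LICENSE by stem
# (so README.md or LICENSE.txt also count) is an assumption of this sketch.
STEM_MATCHED = ("README", "CHANGELOG", "LICENSE")
EXACT_NAMES = ("codemeta.json", "CITATION.cff")


def missing_fair_files(project_root):
    """Return the expected files that are absent from the project root."""
    root = Path(project_root)
    names = [p.name for p in root.iterdir() if p.is_file()]
    stems = {Path(name).stem.upper() for name in names}
    missing = [stem for stem in STEM_MATCHED if stem not in stems]
    missing += [name for name in EXACT_NAMES if name not in names]
    return missing
```

Calling `missing_fair_files(".")` from a repository root returns an empty list when all five files are present. The check only looks at file presence; it does not validate file contents.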
- The authors distilled FAIR4RS principles into five actionable categories and produced the first minimal, step-by-step FAIR-BioRS guidelines (v2.0.0) for biomedical research software, aligning concretely with all FAIR4RS requirements (crosswalk in Supplementary Table 2).
- Literature review outcomes (n=39 included studies):
  • Category 1 (Standards/best practices): 21 resources suggested best practices. Recommendations: use VCS platforms (GitHub/Bitbucket/GitLab), include code-level documentation where needed, record dependencies per language conventions (e.g., requirements.txt, package.json, DESCRIPTION), follow language-specific style guides (e.g., PEP 8, Google R Style Guide), and adopt community data/IO standards via FAIRsharing when applicable. Containerization discussed but excluded from minimal guidelines.
  • Category 2 (Metadata): 24 resources recommended software metadata approaches. Strong consensus around CodeMeta (codemeta.json, JSON-LD) and widespread support for Citation File Format (CITATION.cff). Also recommend human-readable documentation (README; CHANGELOG with Keep a Changelog; Semantic Versioning). Minimum fields were specified for codemeta.json (e.g., name, description, identifier, authors/affiliations, keywords, programmingLanguage, datePublished, dateModified, license) and CITATION.cff (authors/affiliations, abstract, identifiers, keywords, license, date-released).
  • Category 3 (Licensing): 18 resources emphasized OSI-approved open-source licenses; permissive MIT or Apache 2.0 encouraged. Include full license terms in LICENSE in the project root.
  • Category 4 (Repositories): 27 resources suggested repositories; Zenodo most common, with Figshare and Software Heritage also recommended. Zenodo assigns DOIs per release and a concept DOI; >80% of >50,000 software DOIs were registered via Zenodo. Deployment repositories (e.g., PyPI, CRAN, Conda, Dockstore) increase reuse but may not provide persistent identifiers. Recommendations: archive each version on Zenodo (preferred) or Figshare with source code, metadata files, executables, and sample data; also archive in Software Heritage to obtain SWHIDs across granular levels. No community consensus yet on a single identifier (DOI vs SWHID).
  • Category 5 (Registries): 8 resources suggested registries; bio.tools recommended for life sciences with biotoolsSchema/EDAM; optional RRID registration. Update registry metadata as versions evolve.
- The FAIR-BioRS guidelines detail six steps from selecting VCS and licensing, through documentation and metadata, to sharing/archiving and registering.
- FAIRshare implements a guided workflow that: pre-populates metadata; generates codemeta.json and CITATION.cff; assists license selection; reserves DOIs in draft deposits before publication (addressing FAIR F3); supports both local folders and GitHub repositories; and offers optional bio.tools registration.
- Community engagement informed v2.0.0 (BOSC, SciCodes, AI-READI), and alignment with NIH best practices was noted.
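The minimum metadata fields listed in the findings above can be illustrated with a short script. This is a minimal sketch under stated assumptions, not output from FAIRshare: the helper names `missing_fields` and `build_codemeta` are hypothetical, the "authors/affiliations" item is mapped to CodeMeta's `author` property and to CFF's `authors` key, and the `@context` value assumes CodeMeta 2.0. All values passed in the usage note are placeholders.

```python
import json

# Minimum fields as summarized above; the author-related key names are
# assumptions of this sketch (CodeMeta uses "author", CFF uses "authors").
CODEMETA_MIN_FIELDS = (
    "name", "description", "identifier", "author", "keywords",
    "programmingLanguage", "datePublished", "dateModified", "license",
)
CFF_MIN_FIELDS = (
    "authors", "abstract", "identifiers", "keywords", "license", "date-released",
)


def missing_fields(record, required):
    """Return the required keys that are absent or empty in the record."""
    return [key for key in required if not record.get(key)]


def build_codemeta(fields):
    """Serialize a codemeta.json document, refusing incomplete metadata."""
    missing = missing_fields(fields, CODEMETA_MIN_FIELDS)
    if missing:
        raise ValueError(f"missing codemeta fields: {missing}")
    document = {
        "@context": "https://doi.org/10.5063/schema/codemeta-2.0",
        "@type": "SoftwareSourceCode",
    }
    document.update(fields)
    return json.dumps(document, indent=2)
```

A caller would pass a dictionary such as `{"name": "example-tool", "author": [{"@type": "Person", "givenName": "Ada", "familyName": "Example"}], ...}` and write the returned string to codemeta.json. The same `missing_fields` helper can check a parsed CITATION.cff against `CFF_MIN_FIELDS`; writing the YAML file itself is left out because the Python standard library has no YAML serializer.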
The work addresses a critical gap by translating aspirational FAIR4RS principles into concrete, minimal steps tailored for biomedical research software, facilitating reproducibility and reuse. The FAIR-BioRS guidelines align with NIH recommendations and were iteratively refined through community feedback (e.g., BOSC, SciCodes, AI-READI). Key impacts include standardized use of machine- and human-readable metadata (CodeMeta and CFF), explicit guidance on licensing and repositories, and pragmatic registration practices (bio.tools, optional RRID). FAIRshare enhances adoption by integrating guidance, automation, and repository/registry interactions, notably enabling DOI reservation prior to publication to include identifiers in metadata (supporting FAIR F3). The authors identify major ecosystem gaps that hinder uniform implementation: lack of consensus on development standards and documentation practices; fragmentation across metadata schemas/ontologies (CodeMeta, CFF, Bioschemas, EDAM/biotoolsSchema); insufficiently consolidated domain data format standards; and absence of a single, agreed persistent identifier for software (DOI vs SWHID vs biotoolsID vs RRID). They plan ongoing revisions of FAIR-BioRS, more domain/tool-specific guidelines (e.g., Python packages, notebooks), and enhancements to FAIRshare (Software Heritage integration, RRID portal, support for Bitbucket/GitLab, automated validation of standards, NLP-assisted metadata).
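Reserving a DOI before publication, so that the identifier can be written into the metadata files, might look roughly like the sketch below. It is not taken from FAIRshare's source code. It assumes Zenodo's REST deposit endpoint (`/api/deposit/depositions`), bearer-token authentication, and a `prereserve_doi` entry in the draft's metadata; these details should be checked against the current Zenodo developer documentation. The function names are hypothetical, and the request is only constructed, never sent.

```python
import json
import urllib.request

# Assumed endpoint for creating draft deposits on Zenodo.
ZENODO_DEPOSIT_URL = "https://zenodo.org/api/deposit/depositions"


def build_draft_request(token, api_url=ZENODO_DEPOSIT_URL):
    """Build (but do not send) a POST that would create an empty draft deposit."""
    return urllib.request.Request(
        api_url,
        data=json.dumps({}).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",
        },
        method="POST",
    )


def extract_reserved_doi(deposition):
    """Return the pre-reserved DOI from a draft deposit response, or None."""
    return deposition.get("metadata", {}).get("prereserve_doi", {}).get("doi")
```

In use, the request would be sent with `urllib.request.urlopen`, the JSON response parsed, and the value returned by `extract_reserved_doi` copied into codemeta.json and CITATION.cff before the deposit is published.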
This paper delivers the first minimal, actionable guidelines (FAIR-BioRS v2.0.0) to operationalize the FAIR4RS principles for biomedical research software and a complementary tool (FAIRshare) that guides and automates implementation. Together, they lower barriers to making software findable, accessible, interoperable, and reusable by prescribing concrete steps for development practices, metadata, licensing, archiving, and registry registration, and by integrating these steps into a user-friendly workflow. The authors foresee continued evolution based on community feedback and emerging standards, development of more specific guidelines by software type, and expanded FAIRshare capabilities to further streamline and validate FAIR practice adoption, ultimately accelerating reproducible and reusable biomedical research.
- Lack of community consensus on development standards/best practices and documentation approaches complicates universal guidance.
- Fragmentation and overlap among metadata schemas and ontologies (CodeMeta, CFF, Bioschemas, EDAM/biotoolsSchema) hinder a single, unified software metadata approach.
- Limited agreement on standard file formats across biomedical data types challenges interoperability of software inputs/outputs; FAIRsharing can be overwhelming to navigate.
- No consensus on a single identifier for software (DOI vs SWHID vs biotoolsID vs RRID), impacting citation and tracking.
- FAIRshare currently focuses on end-of-cycle publication workflows; assisting from the outset of development (e.g., continuous guidance and validation within VCS platforms) is planned but not yet implemented.

