
Open science and data sharing in cognitive neuroscience with MouseBytes and MouseBytes+
S. Memar, E. Jiang, et al.
MouseBytes is a web-based repository, developed by Sara Memar and her colleagues, for sharing and analyzing rodent cognitive data. The platform visualizes touchscreen-based behavioral data, integrates with complementary neuro-technologies through MouseBytes+, and adheres to the FAIR data principles.
Introduction
The paper addresses the gap in open, standardized sharing of rodent cognitive behavioral data compared to other neuroscience fields like neuroimaging and genomics. Traditional rodent behavioral assays lack standardization and automation, hindering reproducibility and data reuse. Touchscreen-automated cognitive testing offers standardized tasks and outputs compatible with FAIR data principles. The purpose of the study is to present MouseBytes, a FAIR-compliant, web-based platform for storing, sharing, visualizing, and analyzing touchscreen-based rodent cognitive data, and MouseBytes+, which integrates complementary neuro-technologies with behavior. The platforms aim to enhance data reusability, transparency, and multi-modal integration to advance understanding of brain-behavior relationships and support Open Science.
Literature Review
The authors situate MouseBytes within the broader Open Science movement and FAIR Data Principles, noting successful data-sharing platforms in neuroimaging (e.g., the PRIMatE Data Exchange, PRIME-DE) and genomics (e.g., ENCODE, ArrayExpress, BioGRID). They highlight the lack of similar standardization and sharing in rodent behavioral research, which they attribute to heterogeneous protocols and outputs. Touchscreen technology, adopted across hundreds of labs, provides automated, standardized tasks analogous to human cognitive tests (e.g., PAL, LD, TUNL, rCPT, 5-CSRTT). These tasks yield high-resolution behavioral measures and can be synchronized with neuro-technologies such as fiber photometry, miniscopes, optogenetics, MRI, PET, and OMICs. Prior work introduced MouseBytes as a pipeline and database used in a multi-site Alzheimer’s disease mouse cohort study, motivating the expanded architecture and functionalities described here.
Methodology
System design and data flow: MouseBytes is a web application connected to a database of touchscreen-derived cognitive data. Data are collected globally using touchscreen systems and exported primarily as XML files (from ABET II) to ensure inclusion of all features and metadata; CSV exports are supported but may omit features depending on user selection. Uploaded files undergo automated, task-specific quality control (QC) to detect software issues, human errors, rodent under-performance, duplication, and compatibility with experiment/task definitions. QC outcomes are surfaced to users via an Upload Log dashboard for error handling.
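The kind of automated QC described above could be sketched roughly as follows. This is a minimal illustration in Python, not the platform's actual implementation: the XML element names (Session, AnimalID, Task, Date, TrialsCompleted), the run_qc function, and the minimum-trial threshold are all hypothetical, because the ABET II export schema and the real QC rules are not given here.

```python
import hashlib
import xml.etree.ElementTree as ET

# Hypothetical required fields; the real ABET II schema is not described in the text.
REQUIRED_FIELDS = ("AnimalID", "Task", "Date", "TrialsCompleted")


def run_qc(xml_text, expected_task, seen_hashes, min_trials=1):
    """Return a list of QC error messages for one uploaded session file."""
    errors = []

    # Duplication check: identical file content has been uploaded before.
    digest = hashlib.sha256(xml_text.encode("utf-8")).hexdigest()
    if digest in seen_hashes:
        errors.append("duplicate upload")
    seen_hashes.add(digest)

    # Software/export issue: the file is not well-formed XML.
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError:
        errors.append("malformed XML")
        return errors

    # Human error: required metadata missing or empty.
    values = {name: (root.findtext(name) or "").strip() for name in REQUIRED_FIELDS}
    for name, value in values.items():
        if not value:
            errors.append(f"missing field: {name}")

    # Compatibility with the task declared for the experiment.
    if values["Task"] and values["Task"] != expected_task:
        errors.append(f"task mismatch: expected {expected_task}, got {values['Task']}")

    # Under-performance: too few completed trials (threshold is an assumption).
    trials = values["TrialsCompleted"]
    if trials.isdigit() and int(trials) < min_trials:
        errors.append("too few trials completed")

    return errors
```

An empty list means the file passed every check; a non-empty list is the sort of content an upload log could show to the user for correction.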
Metadata and data curation: Authenticated users create Experiments with required metadata (title, dates, cognitive task, species, PI and institution, public/private status, task battery, DOI, description). Sub-experiments capture variables such as age, intervention, light cycle, housing, stimuli. Animal records include ID, sex, strain, genotype. Metadata are primarily entered via controlled vocabularies (drop-downs) for consistency. Public services (Data Lab, Data Visualization, Search) enable open access; user-only services manage uploads and private data.
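To illustrate why controlled vocabularies keep metadata consistent, here is a small sketch that rejects free-text values outside a fixed option set. The class names, field names, and option sets are hypothetical stand-ins for the drop-downs; the platform's real vocabularies and database schema are not listed here.

```python
from dataclasses import dataclass

# Hypothetical controlled vocabularies standing in for the drop-down options.
SEX_OPTIONS = {"Male", "Female"}
TASK_OPTIONS = {"PAL", "PVD", "5-CSRTT", "CPT"}


@dataclass(frozen=True)
class Animal:
    animal_id: str
    sex: str
    strain: str
    genotype: str

    def __post_init__(self):
        # Reject any value that is not in the controlled vocabulary.
        if self.sex not in SEX_OPTIONS:
            raise ValueError(f"sex must be one of {sorted(SEX_OPTIONS)}")


@dataclass(frozen=True)
class Experiment:
    title: str
    task: str
    species: str
    pi: str
    institution: str
    is_public: bool = False  # assumption: new experiments start private

    def __post_init__(self):
        if self.task not in TASK_OPTIONS:
            raise ValueError(f"task must be one of {sorted(TASK_OPTIONS)}")
```

Because validation happens at construction time, a record with a misspelled task or sex never reaches storage, so later filters on those fields match every relevant record.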
Data discovery and extraction: The Search page lists all datasets/experiments with key metadata and a persistent identifier (DOI via DataCite) for public datasets. Filters include status, task, age, strain, genotype. Data Lab provides query-based extraction of open datasets under CC0 license, with filters for task, experiment title, task-specific features, PI, site, intervention, age, sex, strain, genotype. Users can extract trial-level or aggregated data (mean, standard error, count, sum), export to CSV, and generate unique links for sharing or embedding in publications. Private data can be shared via unique links at the owner’s discretion.
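The aggregated statistics named above (mean, standard error, count, sum) can be computed from trial-level rows along these lines. This is a sketch using only the Python standard library; the aggregate function and the row field names are illustrative and do not reflect Data Lab's actual query interface.

```python
import math
import statistics
from collections import defaultdict


def aggregate(rows, group_key, value_key):
    """Group trial-level rows and return mean, standard error, count, and sum per group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_key]].append(row[value_key])

    summary = {}
    for group, values in groups.items():
        n = len(values)
        # Standard error of the mean = sample standard deviation / sqrt(n).
        # It is undefined for a single observation, so report None in that case.
        sem = statistics.stdev(values) / math.sqrt(n) if n > 1 else None
        summary[group] = {
            "mean": statistics.fmean(values),
            "sem": sem,
            "count": n,
            "sum": sum(values),
        }
    return summary
```

For example, two rows with accuracy 80.0 and 90.0 in the same group give a mean of 85.0, a standard error of 5.0, a count of 2, and a sum of 170.0.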
Visualization and analytics: MouseBytes integrates TIBCO Spotfire for interactive dashboards linked to the SQL Server data source. Users select a cognitive task to view key metrics and apply filters (site, strain, genotype, sex, age). Owners can visualize private datasets when authenticated, enabling comparison with public data upon upload.
MouseBytes+ complementary repositories: MouseBytes+ serves as a structured repository for complementary data modalities (e.g., fiber photometry, electrophysiology, miniscopes, MRI, PET, OMICs), software/scripts, and media. Each repository entry includes searchable metadata (Author, PIs, Title, Date, Keywords, DOI, URLs, Status). Within entries, users create typed uploads (Dataset—Touchscreen, Fiber Photometry, MRI Imaging, Miniscope; Software; Video/Audio; Other) with descriptions and dataset-specific metadata (Cognitive Task, Species, Sex, Strain, etc.) and attach files. Public repositories expose file lists and downloads; links to associated MouseBytes experiments are displayed.
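One way to picture a repository entry with typed uploads and searchable metadata is the sketch below. The class layout, the search function, and the rule that search returns only public entries are assumptions made for illustration; only the upload type names come from the description above.

```python
from dataclasses import dataclass, field
from enum import Enum


class UploadType(Enum):
    # Upload types named in the text above.
    DATASET_TOUCHSCREEN = "touchscreen"
    DATASET_FIBER_PHOTOMETRY = "fiber photometry"
    DATASET_MRI = "mri imaging"
    DATASET_MINISCOPE = "miniscope"
    SOFTWARE = "software"
    VIDEO_AUDIO = "video/audio"
    OTHER = "other"


@dataclass
class Upload:
    upload_type: UploadType
    description: str
    files: list = field(default_factory=list)


@dataclass
class RepositoryEntry:
    title: str
    author: str
    keywords: list
    is_public: bool
    uploads: list = field(default_factory=list)


def search(entries, term):
    """Return public entries whose title or keywords contain `term` (case-insensitive)."""
    needle = term.lower()
    return [
        e for e in entries
        if e.is_public
        and (needle in e.title.lower() or any(needle in k.lower() for k in e.keywords))
    ]
```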
Open Science integration and sustainability: Public datasets receive DOIs via DataCite and are indexed for discovery. MouseBytes integrates with the Canadian Open Neuroscience Platform (CONP) via DataLad to mirror public, published datasets in CSV format; updates post-publication trigger automated re-synchronization. Two on-premises servers (Linux and Windows) are maintained by Western Technology Services; SSL ensures data-in-transit security. A scientific consulting board advises on dataset standardization and governance.
Architecture and technologies: Client side uses Angular (TypeScript, HTML templates, Angular Material, HTTP services). Server side uses ASP.NET Web API with .NET Core; API controllers handle business logic and communicate with a Microsoft SQL Server 2017 database via a data access layer. Visualization is delivered via Spotfire, and data formats emphasize XML for completeness. Source code is available on GitHub (Rodent-Cognition-Core/CBAS) under GPL 3.0.
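The layering described above (client request, API controller, data access layer, database) can be sketched in a few lines. This illustration uses Python with an in-memory SQLite database as a stand-in for ASP.NET and SQL Server; the animal table, the class names, and the sample rows are hypothetical and are not taken from the CBAS source code.

```python
import sqlite3


class AnimalRepository:
    """Data access layer: the only code that issues SQL."""

    def __init__(self, conn):
        self.conn = conn

    def by_genotype(self, genotype):
        cur = self.conn.execute(
            "SELECT animal_id, sex, strain, genotype FROM animal "
            "WHERE genotype = ? ORDER BY animal_id",
            (genotype,),
        )
        cols = [c[0] for c in cur.description]
        return [dict(zip(cols, row)) for row in cur.fetchall()]


class AnimalController:
    """API-controller layer: validates the request and shapes the response."""

    def __init__(self, repository):
        self.repository = repository

    def get(self, genotype):
        if not genotype:
            return {"status": 400, "body": {"error": "genotype is required"}}
        return {"status": 200, "body": self.repository.by_genotype(genotype)}


def make_demo_connection():
    """Build an in-memory database with a hypothetical animal table and placeholder rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE animal (animal_id TEXT, sex TEXT, strain TEXT, genotype TEXT)"
    )
    conn.executemany(
        "INSERT INTO animal VALUES (?, ?, ?, ?)",
        [("M1", "Male", "C57BL/6J", "WT"), ("M2", "Female", "C57BL/6J", "MUT")],
    )
    return conn
```

Keeping SQL inside the repository class means the controller can be tested, or the database swapped, without touching request-handling logic, which is the usual motivation for a separate data access layer.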
Key Findings
Community adoption and usage metrics: MouseBytes holds data from more than 3,000 individual mice across public and private datasets. Its web pages have received more than 30,000 views since February 2019, and the homepage about 13,000 visits from about 4,600 individual users. Datasets have been downloaded 868 times, and about 1,200 unique data-access links have been generated since June 2020. The platform hosts 88 datasets across 10 cognitive tasks: 57 private (ongoing) and 31 public, many of the latter linked to publications via DOIs. Thirteen laboratories (17 PIs; 69 registered users) have uploaded datasets.
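A quick arithmetic check puts the figures above in proportion. The derived ratios below are simple calculations from the quoted numbers, two of which are approximate in the source, and are not statistics reported by the authors.

```python
# Figures quoted in the text above.
private_datasets = 57
public_datasets = 31
total_datasets = 88
homepage_visits = 13_000   # approximate
individual_users = 4_600   # approximate

# The private and public counts should add up to the stated total.
assert private_datasets + public_datasets == total_datasets

public_share = public_datasets / total_datasets
visits_per_user = homepage_visits / individual_users

print(f"public share of datasets: {public_share:.0%}")    # prints 35%
print(f"homepage visits per user: {visits_per_user:.1f}")  # prints 2.8
```

So roughly a third of hosted datasets are public, and the average visitor has opened the homepage a little under three times.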
Data composition: >90% of data points come from four tasks—5-CSRTT (5C), Pairwise Visual Discrimination (PVD), Paired Associates Learning (PAL), and Continuous Performance Task (CPT)—with >40,000 data points each for 5C and PVD, >25,000 for PAL, and >15,000 for CPT. Frequently studied models include APP, presenilin, tau, and TDP-43 mutant mice. Age distribution skews younger: about half the data points in 5C, PVD, and PAL are from 3–6 month-old mice; CPT has a more even 3–13 month distribution. Sex distribution: in 5C, PVD, and PAL, male:female ≈ 6:4; CPT is ~equal.
Platform capabilities and compliance: MouseBytes provides FAIR-compliant data management with DOIs for public datasets, standardized metadata, public access to data under CC0 for reuse, and automated QC. MouseBytes+ enables integration of complementary neuro-technologies with behavior. Data visualization supports interactive analysis for both public and authenticated private datasets. Integration with CONP expands discoverability and access.
Impact: The platforms facilitate reproducibility, cross-lab comparisons, rapid meta-analyses, and link primary datasets directly to figures via unique URLs. They lower barriers to collaboration and support Open Science mandates from journals and funders.
Discussion
The work demonstrates that standardized touchscreen-based behavioral data, paired with robust metadata and automated QC, can be shared, visualized, and reanalyzed at scale, addressing historical barriers to data reuse in rodent behavioral neuroscience. MouseBytes operationalizes FAIR principles through persistent identifiers, standardized metadata, open licensing (for public datasets), and interoperable formats, directly supporting reproducibility and transparency. By enabling immediate comparison of new (including private) datasets against aggregated public data, the platform helps researchers assess robustness across age, sex, strain, genotype, site, and task features.
MouseBytes+ extends the platform to multi-modal integration, allowing complementary datasets (fiber photometry, miniscopes, electrophysiology, MRI, PET, OMICs) and code/media to be organized, linked via DOIs, and searched in one place. This integration is crucial for connecting neural activity, structure, and behavior and for fostering cross-disciplinary analyses. The linkage to the Touchscreen Cognition community (SOPs, training, forums) further strengthens user engagement and standardization, aligning with INCF guidance for FAIR repositories.
Collectively, the platforms address the need for scalable, standardized, and open infrastructure in rodent cognitive research, enabling broader meta-analyses, facilitating compliance with data-sharing policies, and democratizing access to curated datasets.
Conclusion
MouseBytes and MouseBytes+ provide a comprehensive, FAIR-aligned ecosystem for storing, sharing, visualizing, and integrating touchscreen-based rodent cognitive data with complementary neuro-technologies. The platforms deliver standardized metadata, automated QC, DOIs for public datasets, open licensing, interactive analytics, and cross-repository integration (CONP), thereby advancing reproducibility and data reuse. Usage metrics indicate growing community adoption across multiple labs, tasks, and disease models.
Future directions include continued development of governance and standardization with a scientific consulting board, expansion of multi-modal data types and analytics, deeper integration with community resources, and sustained scaling to support increasingly complex, high-dimensional datasets across disorders and species.
Limitations