Introduction
The exponential growth of the scientific literature makes systematic reviews and meta-analyses, which are crucial for providing comprehensive overviews of a field, increasingly hard to conduct. Traditional manual screening of titles and abstracts is time-consuming and error-prone, and it is inefficient because the data are heavily imbalanced: only a small fraction of the retrieved studies is relevant. As publication volumes grow, this manual process becomes unsustainable. Machine learning, and active learning in particular, offers a way out by prioritizing the most likely relevant studies for manual review. Existing tools, however, are often closed-source and inflexible in the concepts and models they support. This paper introduces ASReview, an open-source, flexible, and transparent machine learning-aided pipeline designed to overcome these limitations: active learning focuses researchers' effort on the most relevant studies, while the open-source licence invites community collaboration and continuous improvement.
Literature Review
The paper surveys existing tools that use active learning for systematic reviewing and highlights their limitations: many are closed-source, and most are restricted to a single classifier or a fixed workflow. A table compares the tools on their machine learning algorithms, active learning features, and privacy policies, exposing how little transparency and flexibility most of them offer. These shortcomings motivate the development of ASReview, which addresses them directly.
Methodology
ASReview's pipeline begins with the researcher's own literature search. The resulting records (titles and abstracts) are uploaded into the ASReview software, and the researcher supplies prior knowledge (a few known relevant and irrelevant examples) to train an initial machine learning model. The active learning cycle then commences: ASReview presents a record, the researcher labels it as relevant or irrelevant, and the model retrains on the enlarged labelled set. This cycle repeats until a user-defined stopping criterion is met; a minimal sketch of the loop follows below.

The system is modular: researchers can choose among classifiers (Naive Bayes, SVM, neural network, logistic regression, random forests, LSTM-base, LSTM-pool), feature extractors (TF-IDF, Embedding-IDF, Sentence BERT, Doc2Vec, Embedding-LSTM), query strategies (certainty-based, uncertainty-based, random, mixed), and balance strategies for the heavily imbalanced data (dynamic resampling, undersampling, simple). The software reads multiple file formats (RIS, CSV, XLSX, XLS) and offers advanced options such as weighting titles and abstracts independently. The core is designed to be easily extensible, so third parties can contribute modules that improve its capabilities, and although the focus is on systematic reviews, ASReview can process any text source.

The software provides three modes. Oracle mode supports interactive screening for a real review; Simulation mode replays the active learning cycle on a fully labelled dataset to evaluate performance; Exploration mode serves educational purposes. All data remain on the user's computer, with no third-party access, and the design prioritizes the transparency and reproducibility central to open science.
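To make the cycle concrete, the sketch below implements such a loop using scikit-learn's TfidfVectorizer and MultinomialNB as stand-ins for ASReview's TF-IDF and Naive Bayes defaults, together with certainty-based querying. The toy records, the prior knowledge, and the exhaustive stopping rule are assumptions for illustration only, not ASReview's implementation.

```python
# Minimal sketch of an ASReview-style screening loop, using scikit-learn
# stand-ins (TF-IDF features, naive Bayes classifier, certainty-based
# querying). Toy records, prior knowledge, and the "screen everything"
# stopping rule are illustrative assumptions, not ASReview's internal code.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB

records = [
    "active learning for abstract screening in systematic reviews",
    "machine learning to prioritize relevant citations for screening",
    "a cohort study of cardiovascular outcomes in older adults",
    "deep learning for image segmentation in radiology",
    "text classification methods for citation screening in reviews",
    "randomized trial of a dietary intervention for hypertension",
]
true_labels = [1, 1, 0, 0, 1, 0]   # hidden ground truth; revealed one record at a time

X = TfidfVectorizer().fit_transform(records)
labeled = {0: 1, 5: 0}             # prior knowledge: one relevant, one irrelevant
pool = [i for i in range(len(records)) if i not in labeled]

while pool:                        # real use: a user-defined stopping criterion
    idx = sorted(labeled)
    model = MultinomialNB().fit(X[idx], [labeled[i] for i in idx])
    # Certainty-based query: present the record most likely to be relevant.
    probs = model.predict_proba(X[pool])[:, 1]
    pick = pool[int(np.argmax(probs))]
    labeled[pick] = true_labels[pick]   # stands in for the researcher's judgment
    pool.remove(pick)
    print(f"screened record {pick}: label={labeled[pick]}")
```

In ASReview itself, the same loop is driven interactively in Oracle mode or replayed non-interactively against a labelled dataset via the asreview simulate command-line entry point.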
Key Findings
Simulation studies on four labelled datasets demonstrated ASReview's effectiveness. The work saved over sampling (WSS) at 95% recall averaged 83%, ranging from 67% to 100% across datasets, a substantial efficiency gain over screening in random order (a worked example of the metric follows below). After reviewing only 10% of the abstracts, between 70% and 100% of the relevant references had already been found. User experience testing combined unstructured interviews with a systematic test, gathering feedback from diverse user groups: academics, medical guideline developers, and pharmaceutical reviewers. Participants rated the software's usability highly (7.9 out of 10 on average), praising its helpfulness, accessibility, clarity, and ease of use. This feedback prompted several software updates, including a revised graphical user interface and improved documentation. Together, the simulation results and user feedback show that ASReview substantially reduces the time and effort required for systematic reviews without compromising quality or transparency.
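For context, WSS at recall level r measures how much less screening is needed, relative to screening in random order, at the point where a fraction r of the relevant records has been found. A short worked check, with counts invented for illustration (they are not from the paper):

```python
# Work saved over sampling at recall level r, as commonly defined:
#   WSS@r = (1 - n_screened / n_total) - (1 - r)
# The counts below are invented for illustration; they are not from the paper.
def wss(n_total: int, n_screened: int, recall: float = 0.95) -> float:
    """Fraction of screening work saved, versus random order, at the given recall."""
    return (1 - n_screened / n_total) - (1 - recall)

# E.g., if 95% of the relevant records are found after screening 1,200 of
# 10,000 abstracts, the work saved matches the paper's reported average:
print(wss(10_000, 1_200))   # (1 - 0.12) - 0.05 = 0.83
```

Under this formulation, reaching 95% recall after screening only 12% of a collection corresponds to the 83% average work saved reported above.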
Discussion
ASReview successfully addresses the limitations of existing systematic review tools by providing an open-source, flexible, and transparent platform that utilizes active learning. The significant work saved and high user satisfaction demonstrated in the simulation and user experience testing validate ASReview's effectiveness. The open-source nature ensures continuous improvement through community contributions, crucial in addressing the increasing volume of scientific literature. ASReview’s flexibility in model choices allows researchers to optimize the system for specific needs.
Conclusion
ASReview offers a significant advance for conducting efficient and transparent systematic reviews and meta-analyses. Its open-source nature, combined with the efficiency gains demonstrated in simulation and the high usability scores from user testing, makes it a valuable tool for researchers. Future work could focus on improving error rate estimation, expanding benchmarks to applications beyond systematic reviews, and integrating ASReview with tools that automate other stages of the review process, such as data extraction and bias assessment. Further research could also explore performance across different document lengths and domain-specific terminologies, as well as alternative methods for selecting prior knowledge when expert knowledge is unavailable.
Limitations
While ASReview substantially reduces screening workload, it does not eliminate the need for manual review. Accurate estimation of the error rate (the proportion of relevant records missed at the point of stopping) remains a challenge, which makes it difficult to decide when to stop screening; a common heuristic workaround is sketched below. The tool automates only the screening stage of a systematic review, so integration with tools for other stages is needed for end-to-end automation. Performance may vary with the specific dataset and research question. Finally, the choice of prior knowledge affects model performance, and alternative selection methods may be needed when expert knowledge is lacking.
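Since the stopping criterion is left to the user, a heuristic sometimes used in practice is to stop once a run of consecutive records has been labelled irrelevant. The sketch below illustrates that idea only; neither the rule nor the threshold comes from the paper.

```python
# Hypothetical stopping heuristic (not prescribed by the paper): halt once
# `patience` consecutive records have been labelled irrelevant. The threshold
# is an assumption the reviewer must choose and justify.
def should_stop(labels_in_order: list[int], patience: int = 50) -> bool:
    """labels_in_order: 1 = relevant, 0 = irrelevant, in screening order."""
    if len(labels_in_order) < patience:
        return False
    return sum(labels_in_order[-patience:]) == 0

print(should_stop([1, 1, 0, 0, 0], patience=3))   # True: last 3 were irrelevant
```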