
Interdisciplinary Studies
Human-machine-learning integration and task allocation in citizen science
M. Ponti and A. Seredko
Marisa Ponti and Alena Seredko examine how tasks are allocated in citizen science projects that combine human effort with AI capabilities. The study highlights the tension between efficient automation and meaningful volunteer engagement, raising questions about who gets to do what in these collaborative efforts.
~3 min • Beginner • English
Introduction
The paper addresses how tasks are currently distributed among citizen scientists, domain experts, and AI computational technologies within citizen science projects. With rapid advances in machine learning (ML) and neural network-based approaches being integrated into citizen science workflows (e.g., image recognition and consensus algorithms), questions arise about how humans and machines can complement each other and how such distributions affect volunteer engagement. Prior debates in human-automation interaction emphasize optimizing function allocation for efficiency (e.g., Fitts’ HABA–MABA principles), but concerns persist that AI may deskill or diminish the qualitative experience of participation. The study aims to map task distribution, required skills, and research activities where each actor contributes, asking: (1) What tasks do citizen scientists, experts, and computational technologies perform? (2) What skills are needed for these tasks? (3) In which research activities (data collection, processing, analysis) are these tasks performed?
Literature Review
Task allocation has long been examined in human-automation interaction, cognitive engineering, human factors, and HCI. Fitts’ (1951) HABA–MABA list, despite critiques, remains influential as a heuristic for assigning functions to humans or machines. More recent perspectives (e.g., Tausch and Kluge, 2020) call for approaches that also consider human experience and empowerment, not only efficiency. Policy and research (e.g., STOA 2021) highlight that AI can both upgrade skills and deskill work, impacting autonomy and job quality. Within citizen science, concerns were voiced that fast progress in ML (e.g., CNNs, GANs) could reduce authentic roles for volunteers and disengage them if algorithms can carry out core tasks. Previous work (Wiggins and Crowston, 2012) studied task distribution to participants but not the distribution among experts, citizen scientists, and AI. The authors adopt and generalize Franzoni and Sauermann’s (2014) framework—task nature (complexity and structure) and required skills—to include computational technologies as task performers. The paper also draws on literatures on crowdsourcing, interdependence in collaborative work, and automation suitability criteria (Brynjolfsson and Mitchell, 2017) to interpret observed allocations.
Methodology
Design: Integrative literature review following PRISMA principles to allow systematic yet flexible inclusion of studies.
Data sources and timeframe: Web of Science, Scopus, and ACM Digital Library; English-language, peer-reviewed journal articles published up to July 2020; preprints excluded. Searches and synthesis conducted from April 1 to September 30, 2020.
Search procedures:
- Procedure 1: Terms included “citizen science” OR “citizen scientist*” AND AI/ML-related terms (e.g., “artificial intelligence”, “machine learning”, “supervised/unsupervised learning”, “reinforcement learning/algorithm”, “deep learning”, “neural network*”, “transfer learning”). Results: 170 records (Web of Science 83; Scopus 86; ACM 1); 99 unique.
- Procedure 2: To capture citizen science games that are often not labeled as “citizen science” in abstracts, searched 36 game titles combined with AI/ML terms (with an additional step for ‘Neo’, ‘Turbulence’, and ‘The Cure’ that included the keyword ‘game’). Results: 28 records; 20 unique; 17 not covered by Procedure 1.
Screening and eligibility: After duplicate removal, 116 records were screened by a single reviewer on title, abstract, and keywords and, where needed, at full text; all 116 were assessed at full text. Sixty-six were excluded per predefined criteria (e.g., not related to AI computational technologies; not citizen science; citizen science and AI not examined in combination; simulated data; irrelevant topics), leaving 50 papers in the qualitative synthesis.
Data extraction and synthesis: For each included paper, recorded bibliographic data, research field, project type, aims, computational technologies used, and tasks assigned to citizens and experts. Tasks were mapped to research activities (data collection, processing, analysis) and classified by the adapted framework’s two dimensions: (1) nature of the task (interdependence/structure), and (2) required skills (common, specialized, expert; for AI: recognition, prediction). Skill types were often inferred from task descriptions due to infrequent explicit reporting.
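As a concrete illustration of this coding scheme, here is a minimal sketch of how an extracted record and the framework's two dimensions could be represented; the class and field names are hypothetical and not taken from the authors' instrument.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Union

class Activity(Enum):
    DATA_COLLECTION = "data collection"
    DATA_PROCESSING = "data processing"
    DATA_ANALYSIS = "data analysis"

class HumanSkill(Enum):
    COMMON = "common"
    SPECIALIZED = "specialized"
    EXPERT = "expert"

class AISkill(Enum):
    RECOGNITION = "recognition"
    PREDICTION = "prediction"

@dataclass
class CodedTask:
    """One task from a reviewed paper, coded on the framework's two dimensions."""
    performer: str                         # "citizen", "expert", or "AI"
    activity: Activity                     # research activity the task belongs to
    well_structured: bool                  # nature of the task: structure
    interdependence: str                   # "low", "medium", or "high"
    skill: Union[HumanSkill, AISkill]      # required skill, by performer type

@dataclass
class ReviewedPaper:
    """Bibliographic and coding record for one included paper."""
    title: str
    year: int
    research_field: str
    technologies: List[str] = field(default_factory=list)
    tasks: List[CodedTask] = field(default_factory=list)
```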
Key Findings
Dataset characteristics: 50 papers (2011–2020), with most (35/50) published 2018–2020, indicating growing interest in integrating AI/ML with citizen science. Research areas included astronomy/astrophysics (n=16), ecology/biodiversity/conservation (n=23), biology (n=4), environment (n=3), archeology (n=1), neuroinformatics/imaging/medicine (n=1), wildlife recording (n=1), seismology (n=1).
Citizen scientists:
- Predominant tasks: Data collection (e.g., photos, audio/video; sometimes passive sensing with devices/apps) and data processing/classification (e.g., identifying/counting objects, choosing labels from predefined taxonomies, describing features). Additional roles: proposing new taxonomies/classes, validating algorithm outputs, puzzle-solving in games, completing training.
- Nature/skills: Mostly low-complexity, well-structured, independent subtasks requiring common skills (e.g., taking photos, simple identifications). Some specialized skills/training needed in specific contexts (e.g., EyeWire 3D neuron mapping; field protocols). In games like EteRNA, citizens solved 2D puzzles—moderate complexity with some interdependence and specialized visualization/manipulation skills; citizen strategies informed algorithm development.
Experts:
- Tasks: Curate and preprocess data; create gold-standard datasets; design training sets (including pseudo-absences, augmentation, synthetic data); train/calibrate/validate ML models; recruit, train, and support volunteers; set consensus thresholds and label aggregation rules (a minimal aggregation sketch follows this list); evaluate predictive accuracy and compare modeling approaches; conduct field validation with volunteers.
- Nature/skills: Well-structured tasks at medium/high interdependence, requiring specialized/expert domain skills; experts often highly interdependent (e.g., gold-standard creation, comparative model evaluation) and act as human-in-the-loop for ML.
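The consensus thresholds and label-aggregation rules mentioned in the Tasks item above can be illustrated with a minimal majority-vote sketch; the 0.7 threshold and the routing of unresolved items to expert review are illustrative assumptions, not rules reported in the reviewed projects.

```python
from collections import Counter
from typing import List, Optional

def aggregate_labels(votes: List[str], consensus_threshold: float = 0.7) -> Optional[str]:
    """Return the majority label if its vote share meets the consensus threshold,
    otherwise None so the item can be routed to an expert for review."""
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) >= consensus_threshold else None

# Example: volunteer classifications of two images
print(aggregate_labels(["cat", "cat", "cat", "dog", "cat"]))  # "cat" (share 0.8 >= 0.7)
print(aggregate_labels(["cat", "dog", "bird"]))               # None (no consensus)
```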
AI computational technologies:
- Skills grouped as recognition (classification, detection, clustering, counting) and prediction (e.g., environmental conditions, bias/error mitigation, species distribution, learning from player moves).
- Tasks: Predominantly data processing (classification/detection/clustering/counting) and some data analysis (e.g., evaluating annotation consistency, predicting distributions/conditions, correcting sensor/data biases). Typically well-structured with high interdependence (sequential/reciprocal with human contributions); a minimal routing sketch illustrating this interdependence follows this list.
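To make this interdependence concrete, the sketch below trains on human-labeled data, accepts confident machine predictions, and returns low-confidence items to people for validation; the scikit-learn RandomForestClassifier and the 0.9 confidence cutoff are stand-ins, not the models or thresholds used in the reviewed studies.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

def train_and_route(X_labeled, y_labeled, X_unlabeled, confidence=0.9):
    """Train on human-labeled data, auto-accept confident predictions,
    and route low-confidence items back to volunteers or experts."""
    model = RandomForestClassifier(n_estimators=200, random_state=0)
    model.fit(X_labeled, y_labeled)

    proba = model.predict_proba(X_unlabeled)       # class probabilities per item
    best = proba.max(axis=1)                       # confidence of the top class
    labels = model.classes_[proba.argmax(axis=1)]  # predicted labels

    accepted = np.where(best >= confidence)[0]     # machine-labeled items
    for_review = np.where(best < confidence)[0]    # items returned to humans
    return labels, accepted, for_review
```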
Illustrative quantitative results:
- Supernova Hunters: Deep Embedded Clustering (DEC) used to pre-group images; volunteer labeling effort reduced to about 18% of the standard image-by-image approach (see the clustering sketch after this list).
- EteRNA: EternaBrain CNN trained on expert/player moves achieved 51% base prediction accuracy and 34% location prediction accuracy; players outperformed prior algorithms in discovering RNA design rules.
- AirBeam: Automated ML temporal adjustment mitigated sensor bias during high humidity.
- Species distribution modeling: CNN, SVM, and ensemble methods applied to citizen data combined with environmental covariates yielded viable predictive performance; citizen-derived training data found effective when curated and combined with expert gold standards.
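As a rough sketch of the pre-grouping idea behind the Supernova Hunters result, the code below clusters image feature vectors and selects one representative per cluster for volunteer labeling; k-means stands in for Deep Embedded Clustering here, and the cluster count is an arbitrary illustrative value, not the project's actual pipeline.

```python
import numpy as np
from sklearn.cluster import KMeans

def pregroup_for_labeling(features: np.ndarray, n_clusters: int = 50):
    """Cluster feature vectors and pick one exemplar per cluster,
    so volunteers label exemplars instead of every image."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(features)
    representatives = []
    for c in range(n_clusters):
        members = np.where(km.labels_ == c)[0]
        # exemplar = cluster member closest to the centroid
        dists = np.linalg.norm(features[members] - km.cluster_centers_[c], axis=1)
        representatives.append(members[dists.argmin()])
    return km.labels_, representatives
```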
Overall patterns:
- Citizens: concentrated in low-complexity, well-structured tasks requiring common skills; occasional moderate-complexity tasks in game contexts.
- Experts: medium/high-complexity, well-structured, interdependent tasks leveraging expert/specialized skills; central in creating gold standards and validating ML.
- AI: excels at well-structured, high-interdependence recognition/prediction tasks; effectiveness depends on high-quality labeled data and expert oversight.
- The distribution suggests task polarization, with risks for volunteer engagement if roles become overly routine or are automated away.
Discussion
The authors propose a project classification matrix based on task nature (interdependence/structure) and skill requirements (human: common/specialized/expert; AI: recognition/prediction). Mapping reviewed cases shows:
- Citizens predominantly occupy low-complexity, well-structured tasks; in some game contexts they perform moderately complex, collaborative problem-solving.
- Experts conduct medium/high-complexity, well-structured tasks with substantial interdependence (e.g., collaborative gold-standard creation, cross-model evaluation), operating as human-in-the-loop for ML systems.
- AI systems perform well-structured, interdependent tasks, often sequentially or reciprocally linked with human work (training on human-labeled data; humans validate and refine models).
Mechanisms underlying allocation align with ML suitability criteria (Brynjolfsson and Mitchell, 2017): tasks mapping well-defined inputs to outputs with clear feedback and tolerance for some error are highly automatable (e.g., image classification). Conversely, tasks requiring extended reasoning, common-sense knowledge, or explanation are less amenable to current ML.
Implications: While ML scales processing and can reduce volunteer burden, the observed polarization risks disengagement if citizen roles become trivialized or displaced. Designing human–machine systems in citizen science must balance efficiency with meaningful engagement, potentially reallocating citizens to validation, complex annotation, or exploratory/problem-solving roles. Experts remain essential to curate data, ensure quality, interpret model outputs, and close the loop in active/human-in-the-loop learning.
The matrix helps diagnose where projects sit and guides rebalancing toward tasks that enhance learning, inclusion, and authentic participation while leveraging computational strengths.
Conclusion
This paper synthesizes 50 citizen science studies integrating ML/AI to depict how tasks are distributed among citizens, experts, and computational technologies. Using an adaptation of Franzoni and Sauermann’s framework, it shows: (1) citizens typically perform low-complexity, well-structured tasks requiring common skills; (2) experts undertake well-structured, medium/high-complexity, interdependent tasks requiring specialized/expert skills; (3) AI systems perform mostly well-structured, interdependent recognition and prediction tasks, trained and validated with human input. The authors introduce a classification matrix to characterize project profiles and highlight complementarity and interdependence between humans and AI.
Future research directions include: expanding evidence beyond peer-reviewed journals and English-language sources; incorporating project registries; examining which tasks resist full automation and why; studying engagement impacts across diverse volunteer groups; and designing task allocations that advance both efficiency and democratization goals by keeping citizens meaningfully in the loop.
Limitations
- Scope and sources: Limited to peer-reviewed journal articles in English indexed in Web of Science, Scopus, and ACM Digital Library; preprints and grey literature excluded; potential undercoverage of projects (e.g., non-academic or non-published initiatives).
- Sampling: Did not search citizen science project catalogs (e.g., SciStarter); likely bias toward successful projects. Game sampling relied on a convenience list (Baert, 2019), introducing selection bias.
- Timeframe: Publications up to July 2020; subsequent developments not captured.
- Screening: A single reviewer conducted screening and initial parsing, which may introduce selection bias.
- Reporting: Skill requirements often inferred from task descriptions due to limited explicit reporting in primary studies.