Ten simple rules for navigating AI in science

Interdisciplinary Studies

A. Crilly, A. Malivert, et al.

Abstract not provided. This research originates from a multidisciplinary Imperial College team.
Introduction

The paper addresses how scientists can effectively navigate and apply artificial intelligence (AI) across scientific disciplines. It highlights the rapid impact of AI and machine learning (ML) on science, citing landmark successes such as AlphaFold for protein structure prediction, deep reinforcement learning for fusion plasma control, and the growing use of large language models in research workflows. The authors define AI broadly to include ML, deep learning, statistical methods, and generative algorithms, and emphasize that their guidance is method-agnostic. They motivate the need for clear problem framing, understanding of data, and appropriate algorithm selection, while recognizing challenges in explainability, interpretability, robustness, and reproducibility. The paper targets early-career STEM researchers with some coding experience and proposes ten practical rules organized along the typical flow of scientific exploration—from framing problems and selecting algorithms, to coding practices, ethical considerations, and assessing robustness.

Key Findings

The paper presents ten practical rules for applying AI in scientific research:

  1. Frame your scientific question: Clearly define objectives; understand data characteristics, cleaning, preparation, and formatting; choose supervised vs. unsupervised approaches accordingly; consider causal inference and appropriate performance metrics beyond accuracy.
  2. Learn varying terminology: Master AI jargon and cross-disciplinary terms; recognize synonymy (e.g., sequential Monte Carlo (SMC) vs. particle filters) and polysemy (e.g., the multiple meanings of "bias"); use seminars and reading groups to build fluency.
  3. Do not reinvent the wheel: Reuse existing methods, libraries, model zoos, and foundation models; leverage transfer learning and data augmentation to reduce data and compute needs; engage with experts across fields.
  4. Invest time in your code: Use version control (e.g., Git), virtual environments and containerization (Conda/Docker), and modular, documented code; track assumptions and settings; improve skills via courses, competitions, and open-source contributions.
  5. Bear in mind the FAIR principles: Make code, data, and model weights as open and reusable as possible (e.g., Zenodo, OSF); employ MLOps tools (e.g., Weights & Biases) for reproducible model development and tracking.
  6. Start small and simple: Establish baselines (e.g., random guessing, random forests, support vector machines, XGBoost); begin with simpler, non-deep models before escalating complexity; tune hyperparameters systematically; focus initially on core trends.
  7. Start with synthetic data: Use synthetic datasets to diagnose model vs. data issues; conduct self-consistent tests (e.g., parameter recovery); apply data augmentation to expand training sets and reduce overfitting.
  8. Incorporate additional knowledge: Embed physical laws, symmetries, logic rules, and knowledge graphs to improve robustness, trustworthiness, and data efficiency; align with causal modeling when appropriate.
  9. Aim for explainable, interpretable, and trustworthy AI: Use feature-attribution (e.g., SHAP), saliency maps, and neuron activation analyses; address reproducibility challenges (random seeds, hardware variability) by sharing models, parameters, and hyperparameters.
  10. Keep your aim in mind: Balance metric optimization with scientific rigor and real-world relevance; assess generalizability and distribution shift (e.g., domain adaptation, post-calibration); quantify aleatoric and epistemic uncertainties for scientifically interpretable results.
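Rules 1 and 6 warn against relying on accuracy alone and recommend trivial baselines. A minimal sketch of why, using a hypothetical imbalanced toy dataset (all values illustrative, not from the paper): a majority-class baseline scores highly on raw accuracy while balanced accuracy exposes it as no better than chance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Imbalanced toy labels: roughly 95% negative, 5% positive.
y_true = (rng.random(1000) < 0.05).astype(int)

# A "majority class" baseline that always predicts negative.
y_pred = np.zeros_like(y_true)

accuracy = (y_pred == y_true).mean()

# Balanced accuracy: mean of per-class recalls.
recall_pos = (y_pred[y_true == 1] == 1).mean()  # recall on the rare class
recall_neg = (y_pred[y_true == 0] == 0).mean()  # recall on the common class
balanced_accuracy = (recall_pos + recall_neg) / 2

print(f"accuracy          = {accuracy:.2f}")           # high, yet meaningless
print(f"balanced accuracy = {balanced_accuracy:.2f}")  # 0.50: chance level
```

Any real model should clear this baseline on the metric that matches the scientific question, not just on raw accuracy.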
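Rule 7's self-consistency test can be sketched as a parameter-recovery check: generate synthetic data from known ground-truth parameters, fit the same model family, and confirm the truth is recovered. The slope, intercept, and noise level below are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(42)

# Known ground-truth parameters (hypothetical values for this sketch).
true_slope, true_intercept = 2.5, -1.0

# Synthetic data: linear trend plus Gaussian noise.
x = rng.uniform(0, 10, size=500)
y = true_slope * x + true_intercept + rng.normal(0, 0.5, size=500)

# Fit the same model family and check that we recover the parameters.
fit_slope, fit_intercept = np.polyfit(x, y, deg=1)

print(f"recovered slope     = {fit_slope:.3f}  (truth {true_slope})")
print(f"recovered intercept = {fit_intercept:.3f}  (truth {true_intercept})")
```

If recovery fails even on clean synthetic data, the problem lies in the model or the fitting pipeline rather than in the real dataset.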
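Rule 9 recommends feature-attribution methods such as SHAP. As a dependency-free stand-in, the same idea can be sketched with permutation importance (a different but related technique): shuffle one feature at a time and measure how much the model's score drops. The synthetic dataset, in which only feature 0 carries signal, is an assumption of this sketch.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic regression in which only feature 0 carries signal.
X = rng.normal(size=(400, 3))
y = 3.0 * X[:, 0] + rng.normal(0, 0.1, size=400)

# Ordinary least-squares fit.
coef, *_ = np.linalg.lstsq(X, y, rcond=None)

def r2(X, y, coef):
    """Coefficient of determination for a linear model."""
    resid = y - X @ coef
    return 1.0 - resid.var() / y.var()

base = r2(X, y, coef)

# Permutation importance: shuffle one feature at a time and record
# how much the score drops; irrelevant features barely move it.
drops = []
for j in range(X.shape[1]):
    Xp = X.copy()
    Xp[:, j] = rng.permutation(Xp[:, j])
    drops.append(base - r2(Xp, y, coef))
    print(f"feature {j}: score drop = {drops[-1]:.3f}")
```

A large drop for feature 0 and near-zero drops elsewhere confirm the model relies on the scientifically meaningful input rather than on noise.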
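Rule 10's epistemic uncertainty can be estimated with a bootstrap ensemble, one common approach among several: train many models on resampled data and treat their disagreement as uncertainty, which should grow sharply outside the training distribution. The sine-curve data, polynomial model, and query points below are illustrative choices for this sketch.

```python
import numpy as np

rng = np.random.default_rng(7)

# Training data covers only x in [0, 5]; we will also predict beyond it.
x = rng.uniform(0, 5, size=200)
y = np.sin(x) + rng.normal(0, 0.1, size=200)

# Bootstrap ensemble of cubic polynomial fits.
x_query = np.array([2.5, 8.0])  # in-distribution vs. extrapolation
preds = []
for _ in range(50):
    idx = rng.integers(0, len(x), size=len(x))       # resample with replacement
    c = np.polyfit(x[idx], y[idx], deg=3)            # refit on the resample
    preds.append(np.polyval(c, x_query))
preds = np.array(preds)

# Ensemble spread as a rough epistemic-uncertainty estimate:
# it blows up outside the training range.
spread = preds.std(axis=0)
print(f"spread at x=2.5 (in-range):     {spread[0]:.3f}")
print(f"spread at x=8.0 (extrapolated): {spread[1]:.3f}")
```

Reporting such spread alongside point predictions makes clear where a model is interpolating and where it is merely guessing, which is the distinction Rule 10 asks researchers to keep in view.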
Discussion

The paper situates its guidance within a rapidly evolving AI landscape, cautioning that AI can accelerate discovery yet risk creating an illusion of understanding that hinders true scientific progress. It encourages critical appraisal of robustness, explainability, and reproducibility, especially when chasing cutting-edge methods. The authors acknowledge that terminology and tooling change quickly, which pressures researchers to stay current while maintaining skepticism and rigor. Although selective in covered concepts and examples, the ten rules offer a practical framework to help researchers navigate AI’s complexities responsibly and reproducibly, fostering mindful, explainable use of AI in science.

Conclusion

This work contributes a concise, practice-oriented framework—ten rules—for integrating AI into scientific research responsibly and effectively. It emphasizes clear problem framing, careful data handling, principled model selection, robust coding and sharing practices aligned with FAIR, the use of baselines and synthetic data, incorporation of prior knowledge, and attention to explainability, reproducibility, and uncertainty quantification. Future directions include deeper methodological development and guidance on AI ethics and governance, standardized reproducibility protocols across hardware and software stacks, improved tools for interpretability and uncertainty estimation, and strategies to manage distribution shifts between simulation and real-world deployments.

Limitations

The paper is intentionally concise and selective, leaving out many nuances of AI ethics and methodological detail. Specific tools and libraries cited may become outdated as the field evolves. The guidance is high-level and not domain- or method-specific, so it may require adaptation to particular scientific contexts. The article does not present empirical benchmarks or quantitative evaluations of the recommended practices.
