Engineering and Technology

npj Computational Materials

Automated pipeline for superalloy data by text mining

W. Wang, X. Jiang, et al.

Weiren Wang and colleagues present a natural language processing pipeline that extracts chemical composition and property data for superalloys from the scientific literature. The pipeline yields 2531 records from 14,425 articles, and these records are used to build a predictive model of γ′ solvus temperature with a relative error of 0.81%.

Abstract

Data provides a foundation for machine learning, which has accelerated data-driven materials design. The scientific literature contains a large amount of high-quality, reliable data, but automatically extracting data from the literature continues to be a challenge. We propose a natural language processing pipeline to capture both chemical composition and property data that allows analysis and prediction of superalloys. Within 3 h, 2531 records with both composition and property are extracted from 14,425 articles, covering γ′ solvus temperature, density, solidus, and liquidus temperatures. A data-driven model for γ′ solvus temperature is built to predict unexplored Co-based superalloys with high γ′ solvus temperatures within a relative error of 0.81%. We test the predictions via synthesis and characterization of three alloys. A web-based toolkit is provided as an online open-source platform and is expected to serve as the basis for a general method to search for targeted materials using data extracted from the literature.
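To make the idea of a record "with both composition and property" concrete, here is a minimal Python sketch. It is not the authors' pipeline, and the abstract does not describe their implementation. The regular expressions, the function names (`parse_composition`, `extract_record`, `relative_error`), the "Base-9Al-9.8W" composition notation, and the assumption that the base element makes up the balance to 100 are all illustrative assumptions. The example sentence is invented and is not data from the paper.

```python
import re

# Assumed notation: base element first, then "-<amount><element>" pairs,
# e.g. "Co-10Al-5W". Real articles use many other conventions.
COMPOSITION_RE = re.compile(r"\b([A-Z][a-z]?)((?:-\d+(?:\.\d+)?[A-Z][a-z]?)+)")
PAIR_RE = re.compile(r"-(\d+(?:\.\d+)?)([A-Z][a-z]?)")
# Assumed notation for a temperature value: "1000 °C" or "1273 K".
TEMPERATURE_RE = re.compile(r"(\d+(?:\.\d+)?)\s*(°C|K)\b")


def parse_composition(text):
    """Return {element: amount} for the first composition found, else None."""
    match = COMPOSITION_RE.search(text)
    if match is None:
        return None
    parts = {el: float(amount) for amount, el in PAIR_RE.findall(match.group(2))}
    # Assumption: the base element is the balance to 100.
    parts[match.group(1)] = round(100.0 - sum(parts.values()), 6)
    return parts


def extract_record(sentence):
    """Pair a composition with a temperature from one sentence, else None."""
    composition = parse_composition(sentence)
    temperature = TEMPERATURE_RE.search(sentence)
    if composition is None or temperature is None:
        return None
    value, unit = float(temperature.group(1)), temperature.group(2)
    kelvin = value + 273.15 if unit == "°C" else value
    return {"composition": composition, "temperature_K": kelvin}


def relative_error(predicted, measured):
    """Relative error of a prediction against a measured value."""
    return abs(predicted - measured) / abs(measured)


# Invented sentence for illustration only; not taken from the paper.
sentence = "The γ′ solvus temperature of the Co-10Al-5W alloy is 1000 °C."
record = extract_record(sentence)
```

A sentence that lacks either a composition or a temperature returns `None`, which mirrors the abstract's emphasis on records that carry both. The 0.81% figure in the abstract corresponds to `relative_error(...)` returning roughly 0.0081.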

Publisher

npj Computational Materials

Published On

Jan 19, 2022

Authors

Weiren Wang, Xue Jiang, Shaohan Tian, Pei Liu, Depeng Dang, Yanjing Su, Turab Lookman, Jianxin Xie

DOI

https://doi.org/10.1038/s41524-021-00687-2
