Accelerating materials language processing with large language models

Computer Science

J. Choi and B. Lee

Discover how Jaewoong Choi and Byungju Lee leverage generative pre-trained transformers to revolutionize materials language processing. Their innovative approach not only overcomes traditional challenges but also achieves remarkable results in various knowledge-intensive tasks with minimal dataset requirements.

~3 min • Beginner • English
Abstract
Materials language processing (MLP) can facilitate materials science research by automating the extraction of structured data from research papers. Despite the existence of deep learning models for MLP tasks, there are ongoing practical issues associated with complex model architectures, extensive fine-tuning, and substantial human-labelled datasets. Here, we introduce the use of large language models, such as generative pretrained transformer (GPT), to replace the complex architectures of prior MLP models with strategic designs of prompt engineering. We find that in-context learning of GPT models with few- or zero-shot examples can provide high-performance text classification, named entity recognition and extractive question answering with limited datasets, demonstrated for various classes of materials. These generative models can also help identify incorrectly annotated data. Our GPT-based approach can assist materials scientists in solving knowledge-intensive MLP tasks, even if they lack relevant expertise, by offering MLP guidelines applicable to any materials science domain. In addition, the outcomes of GPT models are expected to reduce the workload of researchers, such as manual labelling, by producing an initial labelling set and verifying human annotations.
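As a rough illustration of the in-context learning idea described in the abstract, the sketch below assembles a few-shot prompt for a materials named entity recognition task. The example sentences, entity labels, and the `build_prompt` helper are illustrative assumptions for this page, not the authors' actual prompts or label schema.

```python
# Minimal sketch of few-shot prompt construction for a materials NER task.
# The demonstration examples and label set below are hypothetical, not from the paper.

FEW_SHOT_EXAMPLES = [
    ("LiFePO4 cathodes were synthesized via a solid-state reaction.",
     "MATERIAL: LiFePO4; METHOD: solid-state reaction"),
    ("The perovskite film was annealed at 150 C for 10 minutes.",
     "MATERIAL: perovskite; CONDITION: 150 C, 10 minutes"),
]

def build_prompt(query_sentence: str) -> str:
    """Assemble a few-shot in-context learning prompt for an LLM.

    Each demonstration pairs a sentence with its labelled entities;
    the query sentence is appended with an empty 'Entities:' slot
    for the model to complete.
    """
    lines = ["Extract materials-science entities from each sentence.", ""]
    for sentence, entities in FEW_SHOT_EXAMPLES:
        lines.append(f"Sentence: {sentence}")
        lines.append(f"Entities: {entities}")
        lines.append("")
    lines.append(f"Sentence: {query_sentence}")
    lines.append("Entities:")
    return "\n".join(lines)

prompt = build_prompt("NMC811 electrodes were coated with Al2O3.")
```

The resulting string would be sent to a GPT-style model; a zero-shot variant simply omits the demonstration pairs, leaving only the instruction and the query sentence.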
Publisher
Communications Materials
Published On
Feb 15, 2024
Authors
Jaewoong Choi, Byungju Lee
Tags
large language models
generative pre-trained transformers
materials language processing
text classification
named entity recognition
zero-shot learning
research workload reduction