Abstract
This paper introduces ProtGPT2, a language model trained on protein sequences that generates novel, de novo protein sequences. The generated proteins exhibit natural amino acid propensities and are largely globular. Sequence searches indicate that the model samples unexplored regions of protein space, and AlphaFold predictions show well-folded structures with complex topologies not found in current databases. ProtGPT2 generates sequences rapidly and is publicly available.
Publisher
Nature Communications
Published On
Jul 27, 2022
Authors
Noelia Ferruz, Steffen Schmidt, Birte Höcker
Tags
ProtGPT2
language model
protein sequences
novel proteins
AlphaFold
amino acid propensities
protein structure