This paper evaluates 20 computational metrics for assessing the quality of enzyme sequences produced by three generative approaches: ancestral sequence reconstruction (ASR), a generative adversarial network (GAN), and a protein language model. More than 500 natural and generated sequences were expressed and purified to benchmark the metrics against in vitro enzyme activity. The authors developed a computational filter, COMPSS, that improved experimental success rates by 50-150%. The results provide a benchmark for generative models and help researchers select active variants for experimental testing in protein engineering.
Publisher
Nature Biotechnology
Published On
Apr 23, 2024
Authors
Sean R. Johnson, Xiaozhi Fu, Sandra Viknander, Clara Goldin, Sarah Monaco, Aleksej Zelezniak, Kevin K. Yang
Tags
enzyme sequences
computational metrics
generative models
protein engineering
experimental success rates