Engineering and Technology
Nature Communications
Exploiting redundancy in large materials datasets for efficient machine learning with less data
K. Li, D. Persaud, et al.
Research by Kangming Li, Daniel Persaud, Kamal Choudhary, Brian DeCost, Michael Greenwood, and Jason Hattrick-Simpers shows that up to 95% of the data in large materials datasets can be eliminated without sacrificing prediction accuracy. The study challenges the conventional assumption that more data always yields better machine learning models: because much of the data is redundant, less can be more.
Related Publications
Explore these studies to deepen your understanding
Adjacent work that informs or extends this paper's methodology and findings.
Engineering and Technology
Topographic design in wearable MXene sensors with in-sensor machine learning for full-body avatar reconstruction
H. Yang, J. Li, et al.
Computer Science
Reliability of Supervised Machine Learning Using Synthetic Data in Health Care: Model to Preserve Privacy for Data Sharing
D. Rankin, M. Black, et al.
Computer Science
MD-HIT: Machine learning for material property prediction with dataset redundancy control
Q. Li, N. Fu, et al.
Chemistry
Representation of molecular structures with persistent homology for machine learning applications in chemistry
J. Townsend, C. P. Micucci, et al.

