Food Science and Technology
Predicting multiple taste sensations with a multiobjective machine learning method
L. Androutsos, L. Pallante, et al.
The study addresses the challenge of predicting multiple taste sensations of chemical compounds from their molecular structures. Although taste and smell jointly guide food selection, and taste itself comprises five basic qualities (sweet, bitter, sour, salty, umami), existing computational tools focus predominantly on single tastes (especially sweet and bitter) and on binary classification. Real-world foods often exhibit complex blends of tastes, yet simultaneous multi-taste prediction remains underexplored. The authors aim to develop a multi-class machine learning predictor, VirtuousMultiTaste, that classifies compounds as bitter, sweet, umami, or other, leveraging physicochemical features to improve understanding of taste–structure relationships and to support applications in food design and multi-sensory perception analysis.
Prior work includes numerous ML-based tools for predicting individual tastes: for bitterness (e.g., BitterX, BitterPredict, e-Bitter, iBitter-SCM, BERT4Bitter, iBitter-Fuse, QSTR-based methods), for sweetness (e.g., e-Sweet, Predisweet, BoostSweet, BitterSweetForest, BitterSweet, VirtuousSweetBitter), and several umami predictors (iUmami-SCM, UMPred-FRL, VirtuousUmami, Umami-MRNN, Umami-BERT). Early models used MLR and SVM for binary classification, later outperformed by tree-based methods (RF, AdaBoost) and neural networks, which also support multi-class settings given sufficient data. Multiclass and multi-label methods have seen applications in food/agriculture (e.g., tea sample classification, wine aging discrimination, raw food classification), but comprehensive multi-taste prediction and intensity estimation remain limited. This gap motivates a unified multi-class predictor that can handle multiple taste sensations concurrently.
Data curation: Public datasets with experimentally verified tastes were aggregated across nine labels (sweet, bitter, non-sweet, umami, tasteless, sour, salty, multitaste, other) from VirtuousSweetBitter resources, UMP442 (umami/non-umami), and ChemTastesDB. Because data for sour (38 compounds) and salty (12) were scarce, these tastes were not modeled separately, and 'multitaste' was excluded. The modeled classes were Sweet, Bitter, Umami, and Other (the latter aggregating tasteless and miscellaneous tastes). The initial dataset comprised 6309 compounds with SMILES (2741 sweet, 2549 bitter, 238 umami, 781 other).

Structure standardization and filtering: The ChEMBL Structure Pipeline was applied for structure checking, standardization, and parent-structure generation; incorrect SMILES and duplicates were removed, yielding 4717 compounds (1904 sweet, 1937 bitter, 227 umami, 649 other).

Train/test split: The training subset included 360 sweet, 360 bitter, 227 umami, and 360 other compounds; the umami class was oversampled by 133 compounds using an AdaBoost-based procedure to balance all classes at 360. The held-out test set contained 3377 compounds (1544 sweet, 1577 bitter, 289 other; no umami).

Feature engineering: 1613 2D Mordred descriptors were computed per compound. Features with >30% missing values were removed; remaining missing values were imputed via kNN imputation (k=20), and features were normalized to [0,1].

Dimensionality reduction and statistics: Normality was assessed with the Shapiro–Wilk test; given non-normal distributions, Kruskal–Wallis tests with Benjamini–Hochberg FDR correction (q<0.05) identified 1306 significant descriptors. PCA on all features and on the significant subset illustrated limited linear separability; pairwise taste-vs-rest comparisons used Mann–Whitney tests, and the top differentiating features per taste were visualized (supplementary).
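The descriptor preprocessing steps (drop features with >30% missing values, kNN-impute with k=20, normalize to [0,1]) can be sketched as follows. This is a simplified stdlib illustration, not the authors' code; the `preprocess` function and its mean-of-neighbors imputation rule are assumptions for illustration:

```python
import math

def preprocess(X, max_missing=0.30, k=20):
    """Drop features with > max_missing missing fraction, kNN-impute the
    rest, then min-max normalize each kept feature to [0, 1].
    X: list of rows (descriptor vectors); missing values are None."""
    n = len(X)
    # 1. keep features with at most 30% missing values
    keep = [j for j in range(len(X[0]))
            if sum(X[i][j] is None for i in range(n)) / n <= max_missing]
    X = [[row[j] for j in keep] for row in X]
    m = len(keep)

    # Euclidean distance over the features both rows have observed
    def dist(a, b):
        s, c = 0.0, 0
        for u, v in zip(a, b):
            if u is not None and v is not None:
                s += (u - v) ** 2
                c += 1
        return math.sqrt(s / c) if c else float("inf")

    # 2. kNN imputation: fill a missing cell with the mean of that feature
    #    over the k nearest rows that observed it (a kept feature always
    #    has enough observed rows by construction)
    filled = [row[:] for row in X]
    for i, row in enumerate(X):
        for j, v in enumerate(row):
            if v is None:
                neigh = sorted((dist(row, other), other[j])
                               for o, other in enumerate(X)
                               if o != i and other[j] is not None)[:k]
                filled[i][j] = sum(val for _, val in neigh) / len(neigh)

    # 3. min-max normalization per feature
    for j in range(m):
        col = [filled[i][j] for i in range(n)]
        lo, hi = min(col), max(col)
        rng = (hi - lo) or 1.0  # constant columns map to 0
        for i in range(n):
            filled[i][j] = (filled[i][j] - lo) / rng
    return filled, keep
```

In practice the paper's pipeline computes the 1613 Mordred descriptors first; scikit-learn's `KNNImputer` and `MinMaxScaler` implement steps 2 and 3 at scale.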
Model construction: An ensemble dimensionality-reduction and model-selection pipeline used a heuristic multi-objective Pareto-based evolutionary optimization to (a) select an optimal feature subset, (b) choose the classifier (SVM vs. RF), and (c) tune hyperparameters (C and gamma for SVM; number of trees for RF). Objectives and weights: Selected Features Minimization (1), ACC (10), F1 (10), F2 (1), Precision (1), Recall (10), AUC (1), Number of SVs/Trees Minimization (1), Manhattan Distance (1). The evolutionary algorithm used a population of 100 for up to 200 generations across 10 independent runs, with convergence observed around generation 50. Stratified 10-fold cross-validation on the training data guided selection.

Class imbalance handling: Adaptive Boosting (AdaBoost) served as a pre-processing step during cross-validation, increasing weights for the minority (umami) class and generating synthetic copies to balance classes across folds.

Selected model: Random Forest outperformed SVM in the multi-objective optimization. A final RF model with 95 trees and 15 selected descriptors (from ATS autocorrelation and related Mordred families) offered the best trade-off between performance and simplicity. Explainability: SHAP (TreeExplainer) provided per-class feature attributions. The 15 final features: ATSC0c, ATSC0se, AATS0i, ATSC1p, AATSC2se, AATSC0m, AATSC1Z, AATSC2are, AATSC1pe, SpDiam_A, ATSC1c, ATSC1se, ATSC1Z, ATSC1m, ATSC4s. Correlations among these features were analyzed.

External screening and deployment: After standardization and descriptor computation, the trained model screened five external databases (FooDB, FlavorDB, PhenolExplorer, Natural Product Atlas, PhytoHub) to estimate taste distributions. A web platform (Ionic front end, Flask back end, REST API) accepts SMILES, FASTA, InChI, SMARTS, or PubChem names and returns SMILES, 2D depictions, class predictions, and downloadable descriptor sets. Coffee and chocolate compositions (from FooDB) were also analyzed via the platform.
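The Pareto-based selection among candidate pipelines rests on non-dominance over the objective vector: a candidate survives only if no other candidate is at least as good on every objective and strictly better on one. A minimal sketch, assuming all objectives have been oriented for maximization (minimization objectives negated); the helper names `dominates` and `pareto_front` are illustrative, not from the paper:

```python
def dominates(a, b):
    """True if objective vector a Pareto-dominates b (all objectives
    oriented for maximization): a is no worse everywhere and strictly
    better somewhere."""
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))

def pareto_front(population):
    """Return the non-dominated subset of a list of objective vectors,
    e.g. (ACC, F1, Recall, -n_features, ...) tuples per candidate."""
    return [p for p in population
            if not any(dominates(q, p) for q in population if q is not p)]
```

The paper additionally weights objectives (e.g. ACC, F1, and Recall at 10 vs. 1 for the rest), which biases which region of the front the evolutionary search emphasizes; the exact scalarization scheme is not reproduced here.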
- The multi-objective optimization selected Random Forest (95 trees) with 15 features as the best-performing and most parsimonious model.
- Cross-validation (10-fold, training set): ACC 76.54% ± 1.0; F1 76.58% ± 1.0; F2 76.61% ± 1.01; Precision 76.92% ± 1.05; Recall 76.64% ± 1.0; AUC 0.92 ± 0.02. ROC AUC per class (CV): Bitter 0.92; Sweet 0.92; Other 0.90; Umami 1.00; micro/macro average 0.94.
- Test set (n=3377; 1577 bitter, 1544 sweet, 289 other): ACC 71.76%; F1 74.32%; F2 73.10%; Precision 78.98%; Recall 71.76%; AUC 0.87. ROC AUC per class (test): Bitter 0.89; Sweet 0.86; Other 0.86; micro/macro average 0.87. Reported per-class recalls (test): Bitter 74.50%; Sweet 69.23%; Other 70.24%.
- Feature importance: 15 top Mordred descriptors are predominantly ATS autocorrelation descriptors weighted by electronegativity, charge, mass, polarizability, ionization potential, atomic number, intrinsic state, plus SpDiam_A, aligning with physical properties relevant to tastant–receptor interactions.
- External database screening predictions: • FooDB (n=69,309): Bitter 14,693; Sweet 5,375; Umami 3,149; Other 46,092. • FlavorDB natural ligands (n=2,599): Bitter 778; Sweet 1,661; Umami 29; Other 131. • PhenolExplorer (n=489): Bitter 365; Sweet 23; Umami 9; Other 92. • Natural Product Atlas (n=32,491): Bitter 26,653; Sweet 2,019; Umami 1,880; Other 1,939. • PhytoHub (n=1,746): Bitter 1,213; Sweet 228; Umami 62; Other 243.
- Food examples (from FooDB compositions via platform): Coffee predicted compounds—Bitter 130; Sweet 44; Umami 4; Other 14. Chocolate—Bitter 96; Sweet 33; Umami 4; Other 13.
- Benchmarking: In comparisons against RF, XGBoost, and SVM pipelines built with mRMR feature selection, VirtuousMultiTaste achieved superior metrics across the board. On an external set of 869 compounds (409 bitter, 460 sweet) excluding training overlaps, VirtuousMultiTaste reached ~83% across metrics for bitter, outperforming VirtuousSweetBitter (~80%) and BitterSweet (~77%); for sweet, performance was intermediate but satisfactory. For umami, cross-validation comparisons with VirtuousUmami showed similar ACC (~96%) and AUC (>96%). Three recently validated umami peptides (FR-9, FE-5, EK-5) were correctly predicted as umami.
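Most of the retained features are (centered) Moreau–Broto autocorrelation descriptors, which correlate an atomic property with itself across topological distances in the molecular graph. A minimal stdlib sketch of a centered autocorrelation at lag k, under one common convention (sum over unordered atom pairs at shortest-path distance k); Mordred's ATSC descriptors implement the same idea with specific atomic weightings (charge, electronegativity, mass, etc.):

```python
from collections import deque

def topo_dist(adj, src):
    """BFS shortest-path lengths from atom src over a molecular graph
    given as an adjacency list {atom: [neighbors]}."""
    d = {src: 0}
    q = deque([src])
    while q:
        u = q.popleft()
        for v in adj[u]:
            if v not in d:
                d[v] = d[u] + 1
                q.append(v)
    return d

def atsc(adj, weights, k):
    """Centered Moreau-Broto autocorrelation at topological lag k:
    sum of (w_i - mean(w)) * (w_j - mean(w)) over unordered atom pairs
    (i, j) whose shortest-path distance is exactly k."""
    n = len(weights)
    mean = sum(weights) / n
    c = [w - mean for w in weights]  # centered atomic property
    total = 0.0
    for i in range(n):
        d = topo_dist(adj, i)
        for j in range(i + 1, n):
            if d.get(j) == k:
                total += c[i] * c[j]
    return total
```

For example, on a 3-atom chain 0-1-2 the lag-2 term pairs only the two terminal atoms, which is why low-lag ATSC descriptors capture local charge/electronegativity contrasts relevant to tastant–receptor contacts.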
The study fills a notable gap by enabling simultaneous prediction of multiple tastes (bitter, sweet, umami vs other) using a unified, interpretable, and efficient model. By combining statistical filtering, 2D Mordred descriptors, and multi-objective evolutionary optimization, the approach balances performance with simplicity (15 features, 95-tree RF). Robust cross-validation and stable performance across similarity quartiles suggest a broad applicability domain, likely due to the diverse training set including three fundamental tastes and a heterogeneous 'other' class capturing substantial chemical space. Feature analysis indicates that properties tied to charge distribution, electronegativity, and polarizability are central to taste recognition, aligning with receptor-binding mechanisms. Comparative evaluations show that VirtuousMultiTaste matches or exceeds specialized tools in their domains while offering the advantage of multi-taste capability. External screenings reveal plausible taste distributions (e.g., predominance of bitter compounds in natural products), and food composition analyses (coffee, chocolate) align with expected sensory profiles, illustrating practical applicability for screening and hypothesis generation. However, the tool predicts tastes of individual compounds rather than holistic food taste, which depends on concentrations and multisensory factors. Overall, the findings demonstrate that a parsimonious, explainable multi-class model can effectively predict multiple tastes and support data-driven food design and discovery.
VirtuousMultiTaste is a multi-class machine learning predictor that identifies bitter, sweet, and umami tastes versus other tastes from molecular structure, combining a hybrid heuristic optimization with a Random Forest classifier and a compact, interpretable 15-feature set. It delivers strong cross-validated and external test performance, outperforms or matches specialized predictors, generalizes across chemical similarity ranges, and is deployed via a user-friendly web interface with code and data available. Future work includes expanding to all five basic tastes by adding sour and salty classes, improving explainability through simpler descriptors or mappings to structural motifs, and developing holistic models that integrate concentrations and multisensory factors to predict overall food sensory profiles, with applications in nutrition, precision medicine, and food product engineering.
- Limited number of experimentally confirmed umami compounds constrained training and external evaluation (no umami in the external test set), potentially affecting generalizability for umami.
- Sour and salty classes were excluded due to data scarcity, limiting coverage of the full taste spectrum.
- The ‘Other’ class aggregates diverse tastes (including tasteless), introducing heterogeneity that may blur class boundaries.
- Reliance on 2D descriptors may miss information present in 3D conformations; although 2D performed well, some interactions are 3D-dependent.
- The model predicts the taste of isolated compounds, not overall food taste, which depends on concentrations, matrix effects, processing, and multisensory inputs.
- Some per-class performance metrics on the test set could not be fully assessed for umami due to the absence of umami samples in the held-out test set.