Predicting and improving complex beer flavor through machine learning

Food Science and Technology

M. Schreurs, S. Piampongsant, et al.

This fascinating study by Michiel Schreurs, Supinya Piampongsant, Miguel Roncoroni and colleagues delves into the world of beer, analyzing 250 commercial Belgian beers to uncover the chemistry behind flavor and consumer appreciation. Using gradient boosting machine learning models, it not only predicts complex flavor features from chemical profiles but also identifies key compounds that can be added to improve existing beers. Cheers to science!

Introduction
Predicting and understanding food perception and appreciation is a major challenge in food science. Accurate modeling could revolutionize food production and consumption, with applications in quality control, product fingerprinting, counterfeit detection, spoilage detection, and new product development. Accurate predictive models would also improve existing food assessment methods, potentially replacing costly and time-consuming sensory panels with objective, quantitative data, and could help consumers understand their own preferences. Despite this potential, predicting food flavor and appreciation from chemical properties remains elusive, especially for complex products. The immense number of flavor-active chemicals, their varying structures and concentrations, and the complexity of sensory perception (including non-linear, concentration-dependent synergistic and antagonistic effects) all pose significant challenges. Sensory analysis is further complicated by the variability and expense of trained tasting panels, while public consumer reviews, though they offer large datasets, are susceptible to biases (e.g., price, cult status, psychological conformity). Classical multivariate statistics and machine learning have been used to predict flavor, but mostly for single compounds, neglecting the complex interactions within a food matrix. Classical statistical methods often require large sample sizes and are sensitive to outliers, which makes them poorly suited to analyzing hundreds of interacting compounds. This study aims to overcome these limitations by applying advanced machine learning techniques to a large, comprehensive dataset of beer.
Literature Review
Previous studies have attempted to predict beer flavor and popularity from limited sets of chemical measurements and flavor compounds. These studies lacked the scale and scope needed to capture the complex relationships between beer chemistry and sensory perception. Earlier work with classical multivariate statistics and machine learning mostly predicted the organoleptic properties of single compounds and ignored interactions between them. Linear models and Partial Least Squares Regression (PLSR) are common choices, but they struggle with non-linear relationships and with interactions among many compounds. Large public databases of consumer reviews present an opportunity, but their biases need to be carefully considered.
Methodology
This study used 250 commercial Belgian beers across 22 styles. For each beer, 226 chemical properties were measured, including alcohol content, iso-alpha acids, pH, sugar concentration, and over 200 flavor compounds. The measurements combined headspace gas chromatography with flame ionization and flame photometric detection (HS-GC-FID/FPD), headspace solid-phase microextraction gas chromatography-mass spectrometry (HS-SPME GC-MS), discrete photometric and enzymatic analysis, and near-infrared (NIR) analysis. A trained 16-person tasting panel evaluated 50 sensory attributes on a 7-point scale, and data from over 180,000 consumer reviews were collected from RateBeer. Ten machine learning models were trained to predict both the trained panel scores and the RateBeer appreciation scores from the chemical profiles: linear regression, lasso regression, PLSR, AdaBoost, Extra Trees, Gradient Boosting, Random Forest, XGBoost, support vector regression (SVR), and an artificial neural network (ANN). Model performance was evaluated using the coefficient of determination (R²) on a held-out test dataset. Feature importance was assessed using impurity-based importance scores and SHAP values. Finally, validation experiments spiked commercial beers with the identified key compounds to assess the effect on appreciation.
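The train, predict, and score-by-R² loop described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: the data are synthetic, the three features stand in for chemical measurements, and gradient boosting is re-implemented with one-split decision stumps using only the Python standard library. All function and variable names are hypothetical.

```python
import random

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def fit_stump(X, residuals):
    """Return the (feature, threshold, left_mean, right_mean) split with lowest squared error."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [r for row, r in zip(X, residuals) if row[j] <= thr]
            right = [r for row, r in zip(X, residuals) if row[j] > thr]
            if not left or not right:
                continue
            lm = sum(left) / len(left)
            rm = sum(right) / len(right)
            sse = sum((r - lm) ** 2 for r in left) + sum((r - rm) ** 2 for r in right)
            if best is None or sse < best[0]:
                best = (sse, j, thr, lm, rm)
    return best[1:]

def fit_gbr(X, y, n_rounds=40, learning_rate=0.2):
    """Gradient boosting for squared loss: each stump is fitted to the current residuals."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [t - p for t, p in zip(y, pred)]
        j, thr, lm, rm = fit_stump(X, residuals)
        stumps.append((j, thr, lm, rm))
        pred = [p + learning_rate * (lm if row[j] <= thr else rm)
                for row, p in zip(X, pred)]
    return base, stumps, learning_rate

def predict_gbr(model, X):
    """Sum the shrunken stump outputs on top of the constant base prediction."""
    base, stumps, lr = model
    return [base + lr * sum(lm if row[j] <= thr else rm for j, thr, lm, rm in stumps)
            for row in X]

# Synthetic stand-in for chemical profiles (assumption: not real beer data).
# Feature 0 has a threshold effect, feature 1 a linear effect, feature 2 is noise.
rng = random.Random(0)
X = [[rng.random() for _ in range(3)] for _ in range(160)]
y = [(2.0 if row[0] > 0.5 else 0.0) + row[1] + rng.gauss(0.0, 0.1) for row in X]

X_train, y_train = X[:120], y[:120]
X_test, y_test = X[120:], y[120:]

model = fit_gbr(X_train, y_train)
r2_gbr = r2_score(y_test, predict_gbr(model, X_test))

# Reference point: always predict the training mean.
train_mean = sum(y_train) / len(y_train)
r2_baseline = r2_score(y_test, [train_mean] * len(y_test))

print(f"R2 on test set, boosted stumps: {r2_gbr:.3f}")
print(f"R2 on test set, mean baseline:  {r2_baseline:.3f}")
```

Scoring on samples the model never saw during fitting is what makes R² a measure of predictive power rather than of fit. In practice an established library implementation would replace the hand-written stumps.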
Key Findings
The Gradient Boosting Regressor (GBR) model consistently outperformed the other models, achieving higher R² values for predicting both trained panel scores and RateBeer appreciation scores. GBR models were better at predicting taste than aroma. The model revealed strong correlations between specific chemical compounds and sensory attributes, confirming some expected relationships (e.g., iso-alpha acids and bitterness). It also identified unexpected correlations, highlighting the complexity of flavor perception. The RateBeer data, while biased, provided valuable complementary information, especially for overall appreciation and for basic flavor attributes like bitterness and sweetness. The most predictive parameters for beer appreciation were ethyl acetate, ethanol, protein level, and lactic acid. Some unexpected compounds, like methanethiol and ethyl phenyl acetate (often associated with beer staling), were identified as contributing positively to appreciation at moderate concentrations. In the validation experiments, increasing the concentration of key compounds significantly improved the overall appreciation of both alcoholic and non-alcoholic beers by trained panelists.
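The Methodology section names SHAP values as one of the tools for ranking predictive parameters such as these. SHAP values build on Shapley values, which split the difference between a model's prediction and a reference prediction across the input features. The sketch below computes exact Shapley values by enumerating feature orderings. It assumes a single all-zero reference point and a made-up three-feature model with invented coefficients. It is not the study's model and does not use the SHAP library, which relies on faster algorithms and a background dataset.

```python
from itertools import permutations
from math import factorial

def toy_model(x):
    """Hypothetical scoring function with one interaction term (coefficients are invented)."""
    compound_a, compound_b, compound_c = x
    return 2.0 * compound_a + 1.0 * compound_b + 0.5 * compound_a * compound_c

def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x, relative to a single baseline point.

    For every ordering of the features, switch them one at a time from the
    baseline value to the observed value and credit each feature with the
    change in output it causes. Averaging over all orderings gives the
    Shapley value.
    """
    n = len(x)
    phi = [0.0] * n
    for order in permutations(range(n)):
        current = list(baseline)
        prev = f(current)
        for i in order:
            current[i] = x[i]
            new = f(current)
            phi[i] += new - prev
            prev = new
    return [p / factorial(n) for p in phi]

x = [1.0, 2.0, 4.0]          # observed (made-up) concentrations
baseline = [0.0, 0.0, 0.0]   # reference point (assumption)
phi = shapley_values(toy_model, x, baseline)

for name, value in zip(["compound_a", "compound_b", "compound_c"], phi):
    print(f"{name}: {value:+.2f}")
```

The attributions sum to the difference between the prediction at `x` and the prediction at the baseline, and the interaction term is shared equally between the two features involved. Enumerating every ordering is only practical for a handful of features, which is why dedicated approximations exist for models with hundreds of inputs.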
Discussion
This study demonstrates the power of combining large-scale chemical and sensory data with advanced machine learning to predict and improve complex food flavor. GBR models outperform traditional statistical methods by capturing non-linear relationships between compounds and sensory perception. The integration of consumer review data, despite its inherent biases, provides a valuable addition, especially for overall appreciation. The identification of unexpected compounds as drivers of appreciation highlights the limitations of relying solely on established knowledge of flavor chemistry. The validation experiments support the model's predictions, showing that targeted manipulation of specific compound concentrations can improve appreciation.
Conclusion
This study successfully used machine learning to predict beer flavor and consumer appreciation from chemical profiles, outperforming conventional statistical methods. The key compounds identified as drivers of appreciation can be used to improve beer flavor. Future work should focus on expanding the dataset to include a wider range of beer styles and consumer demographics, and on developing methods to better distinguish truly causative compounds from merely correlated ones. This approach can be extended to other complex food and beverage products.
Limitations
The study's limitations include the focus on Belgian beers, the potential for biases in the consumer review data, and the inability of the models to capture the negative impact of excessively high concentrations of certain compounds. In addition, the GBR model tends to prioritize the largest main effect among correlated variables, potentially obscuring the importance of other co-correlated factors. The dataset lacked demographic information on tasters; including such information could improve model accuracy. Finally, while the models are good predictors, they do not reveal causal relationships, which is why validation experiments are needed.