Fast and accurate machine learning prediction of phonon scattering rates and lattice thermal conductivity

Z. Guo, P. R. Chowdhury, et al.

Ziqi Guo and colleagues develop machine learning surrogates that predict phonon scattering rates and lattice thermal conductivity with near first-principles accuracy, overcoming the high skewness of scattering-rate data and the complex contributions of individual processes. The resulting leap in computational efficiency paves the way for large-scale thermal transport informatics.

Introduction
The study addresses the challenge of predicting lattice thermal conductivity (κ) across materials both accurately and efficiently. Experimental measurements are slow and costly, while first-principles Boltzmann transport equation (BTE) calculations require explicit treatment of three-phonon (3ph) and four-phonon (4ph) scattering and are computationally prohibitive, especially for complex materials and dense q-meshes. Although including both 3ph and 4ph scattering is now recognized as necessary for agreement with experiments, the associated computational cost limits broad application. Recent end-to-end machine learning models that map structural descriptors directly to κ provide only rough estimates with substantial errors. The authors propose a different ML strategy: building surrogate deep neural network models that predict phonon-level scattering rates, thereby preserving the physics of phonon transport, and using them to compute relaxation times and κ with high accuracy at far lower computational cost. The research hypothesis is that carefully engineered descriptors, data transformations that address the skewed targets, and targeted loss weighting, combined with transfer learning between 3ph and 4ph, can yield phonon-scattering-rate surrogates that reproduce first-principles accuracy while drastically accelerating κ prediction.
Literature Review
Foundational work established the phonon BTE and three-phonon scattering theory (Peierls; Maradudin and Fein), and subsequent ab initio implementations enabled first-principles κ predictions (Broido et al.). Four-phonon scattering theory was formulated more recently (Feng et al.), and its importance has been validated experimentally. Several codes exist for κ prediction, notably ShengBTE, almaBTE, Phono3py, and FourPhonon. However, both experiments and first-principles computations, particularly those including 4ph scattering, remain expensive. Prior ML efforts have built end-to-end models that predict κ from structural information; while promising for screening, they lack first-principles-level accuracy, with typical errors around ±30% and, in some cases, beyond ±100%. Transfer learning has been applied to thermal materials to leverage proxy properties and mixed low- and high-fidelity data. This work builds on these insights by targeting the phonon-scattering level rather than direct κ prediction, aiming to combine physical fidelity with ML efficiency.
Methodology
Overall workflow: The standard phonon BTE workflow identifies allowed 3ph and 4ph scattering processes from the phonon dispersion and phase space, computes process-level scattering rates Γ^{3ph}_{λλ'λ''} and Γ^{4ph}_{λλ'λ''λ'''}, sums over processes to obtain per-mode relaxation times τ^{3ph}_λ and τ^{4ph}_λ, combines them via Matthiessen's rule, τ_λ^{-1} = (τ^{3ph}_λ)^{-1} + (τ^{4ph}_λ)^{-1}, and integrates over modes to yield κ and its spectral contributions. The most expensive step is the evaluation of individual Γ for the vast number of allowed processes (on the order of 10^6 for 3ph and 10^11 for 4ph in Si). The authors replace this explicit Γ evaluation with deep neural network (DNN) surrogates.

Surrogate models and descriptors: Separate DNNs are trained for 3ph and 4ph scattering for each material (Si, MgO, LiCoO2). Each allowed scattering process is fully determined by its participating phonons, so the input descriptor for a process concatenates, for each phonon, its frequency ω, wave vector k, eigenvector e, and group velocity v (all obtained from lattice dynamics by solving the dynamical matrix). For materials with 2 atoms per primitive cell (Si, MgO), the descriptor dimensions are 57 (3ph) and 76 (4ph); for LiCoO2, with 4 atoms per primitive cell, they are 93 (3ph) and 124 (4ph).

Data generation and sampling: Datasets are generated with ShengBTE and its FourPhonon module at 300 K. For 3ph-only calculations: q-meshes of N = 28 (Si), 20 (MgO), and 10 (LiCoO2), with a unity broadening factor. For 3ph+4ph: N = 16 (Si), 15 (MgO), and 10 (LiCoO2), with broadening factors of 0.1 (Si, MgO) and 0.01 (LiCoO2). Isotopic scattering is included, and unphysical negative Γ values are removed. To balance modes with different numbers of allowed processes, a fixed number of processes is sampled per mode, 2,000 for 3ph and 20,000 for 4ph, to form the training sets, which constitute a small fraction of the total phase space.

Target transformation and loss weighting: Because Γ spans many orders of magnitude with a long tail toward zero, direct training is biased and violates the physical low-frequency scaling. The target is therefore transformed as −log10(Γ), which improves the scaling behavior and reduces skewness. However, errors on high-Γ processes disproportionately affect κ, so target-value-based loss weights are introduced to emphasize high-scattering-rate processes. The weighting form depends on the transformed Γ and was tuned to generalize across materials and across both 3ph and 4ph; it improves prediction of the processes most influential for the total scattering and κ.

DNN architecture and training: Models are implemented in TensorFlow, with four hidden layers of sizes 1000, 1000, 1000, and 10 using ReLU activations; a single linear output neuron predicts the transformed Γ. The mean squared error loss (with target-based weights) is minimized with Adam at a mini-batch size of 2048, with early stopping. During inference, large batches (up to 2^20 descriptors) are used, subject to memory limits. For 4ph, mode-by-mode evaluation with file I/O avoids exceeding memory constraints; τ^{4ph} values are computed and memory is freed per mode.

Transfer learning: Because 4ph scattering involves one additional phonon, the 3ph and 4ph models have different input dimensions. To enable transfer, a "virtual phonon" (dummy zeros) is appended to the 3ph inputs so that the 3ph model's input dimensionality matches that of the 4ph model.
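To make the dimension matching concrete, here is a minimal NumPy sketch of the zero-padding, assuming a per-phonon feature layout of ω, k, v, and the real and imaginary eigenvector components (19 features per phonon for a 2-atom cell, consistent with the 57 → 76 dimensions quoted above); the exact feature ordering is an assumption, not taken from the paper.

```python
import numpy as np

N_ATOMS = 2                                   # atoms per primitive cell (Si, MgO)
FEATS_PER_PHONON = 1 + 3 + 3 + 6 * N_ATOMS    # omega + k + v + Re/Im(e) = 19 (assumed layout)

def pad_with_virtual_phonon(x3ph: np.ndarray) -> np.ndarray:
    """Append one all-zero 'virtual phonon' to each 3ph descriptor so its
    length matches the 4ph descriptor (3*19 = 57 -> 4*19 = 76)."""
    zeros = np.zeros((x3ph.shape[0], FEATS_PER_PHONON))
    return np.concatenate([x3ph, zeros], axis=1)

batch = np.random.rand(8, 3 * FEATS_PER_PHONON)   # 8 dummy 3ph descriptors
print(pad_with_virtual_phonon(batch).shape)       # (8, 76)
```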
The modified 3ph model is trained normally and then used as a warm-start initialization (weights and biases) for the 4ph model, which is subsequently trained on 4ph data. This exploits the similarity between 3ph and 4ph scattering physics to improve 4ph predictions and reduce the required amount of training data.

κ computation and evaluation: The surrogate-predicted Γ (3ph and/or 4ph as appropriate) are used to compute per-mode τ and then κ via the BTE workflow in ShengBTE, under the same numerical settings as used for data generation. Results are averaged over six surrogates trained with different random splits. Accuracy is reported as R^2 for scattering rates or relaxation times and as MAPE for κ. Computational cost is measured as CPU (and, where used, GPU) time, with comparisons against analytical calculations (ShengBTE+FourPhonon) executed on Purdue RCAC clusters.
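The architecture and warm-start procedure lend themselves to a compact sketch. Below is a minimal, illustrative TensorFlow/Keras version: the layer sizes, loss, optimizer, batch size, early stopping, and weight transfer follow the description above, but the exponential sample-weight form, the array names (x3ph_padded, y3ph, x4ph, y4ph), and hyperparameters such as epochs and patience are placeholders, not the authors' published code.

```python
import numpy as np
import tensorflow as tf

def build_surrogate(input_dim: int) -> tf.keras.Model:
    """Surrogate DNN as described above: hidden layers of 1000, 1000,
    1000, and 10 (ReLU) and one linear output predicting -log10(Gamma)."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(input_dim,)),
        tf.keras.layers.Dense(1000, activation="relu"),
        tf.keras.layers.Dense(1000, activation="relu"),
        tf.keras.layers.Dense(1000, activation="relu"),
        tf.keras.layers.Dense(10, activation="relu"),
        tf.keras.layers.Dense(1),  # linear output
    ])

def sample_weights(y: np.ndarray) -> np.ndarray:
    """Target-value-based weights emphasizing high-Gamma processes
    (i.e., small -log10(Gamma)). This exponential form is illustrative
    only; the paper tunes its own weighting function."""
    return np.exp(-y)

early_stop = tf.keras.callbacks.EarlyStopping(patience=10,
                                              restore_best_weights=True)

# Train the padded 3ph model (76-dim inputs after the virtual phonon).
model_3ph = build_surrogate(input_dim=76)
model_3ph.compile(optimizer="adam", loss="mse")
# model_3ph.fit(x3ph_padded, y3ph, sample_weight=sample_weights(y3ph),
#               validation_split=0.1, batch_size=2048, epochs=500,
#               callbacks=[early_stop])

# Warm-start the 4ph model from the trained 3ph weights, then fine-tune.
model_4ph = build_surrogate(input_dim=76)
model_4ph.set_weights(model_3ph.get_weights())  # copy weights and biases
model_4ph.compile(optimizer="adam", loss="mse")
# model_4ph.fit(x4ph, y4ph, sample_weight=sample_weights(y4ph),
#               validation_split=0.1, batch_size=2048, epochs=500,
#               callbacks=[early_stop])
```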
Key Findings
- Three-phonon surrogates: For Γ^{3ph}, R^2 on test data is 0.922 (Si), 0.891 (MgO), and 0.477 (LiCoO2). For τ^{3ph}, R^2 improves to 0.968 (Si), 0.957 (MgO), and 0.945 (LiCoO2), indicating error cancellation upon summation over processes and the correct low-frequency scaling behavior. κ^{3ph} predictions have relative errors below about 3% and match the spectral trends across frequency.
- Computational speedups for 3ph: the end-to-end κ^{3ph} workflow is accelerated by approximately 2.85× (Si), 2.5× (MgO), and 3.57× (LiCoO2). On a per-process basis, surrogate Γ evaluations are on average two orders of magnitude faster than analytical calculations.
- Four-phonon surrogates: τ^{4ph} predictions achieve very high accuracy, with R^2 of 0.994 (Si), 0.995 (MgO), and 0.979 (LiCoO2). Combining surrogate τ^{4ph} with analytical τ^{3ph} via Matthiessen's rule (see the sketch after this list), the cumulative κ^{3ph+4ph} spectra closely match analytical results.
- κ^{3ph+4ph} totals (W/(m·K)), averaged over six runs: Si 120.5 ± 0.2 vs. analytical 120.6 (MAPE 0.09%); MgO 42.32 ± 0.10 vs. 42.2 (MAPE 0.36%); LiCoO2 6.812 ± 0.288 vs. 6.619 (MAPE 4.46%). All errors are under 5%, within typical experimental uncertainties (~10%).
- Computational speedups for 3ph+4ph: the total κ^{3ph+4ph} workflow is accelerated by about 64.3× (Si), 69.9× (MgO), and 17.1× (LiCoO2). Per-process τ^{4ph} evaluation remains roughly two orders of magnitude faster than the analytical route.
- Transfer learning from 3ph to 4ph: warm-starting from a modified 3ph model with a virtual phonon reduces the κ^{3ph+4ph} MAPE by 66.7% (Si), 75.0% (MgO), and 55.8% (LiCoO2) compared with 4ph surrogates trained from scratch on the same data. With reduced 4ph training data (e.g., 0.3% of the phase space), transfer learning substantially narrows the gap to models trained on 3% of the 4ph phase space, enabling significant data and time savings.
- Comparison with end-to-end κ predictors: the surrogate approach keeps relative errors within 5%, whereas end-to-end models exhibit typical errors around ±30% that sometimes exceed ±100%.
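The τ-combination step referenced above follows the standard single-mode RTA expression κ_xx = (1/(V N_q)) Σ_λ C_λ v_{λ,x}^2 τ_λ with Matthiessen's rule 1/τ_λ = 1/τ^{3ph}_λ + 1/τ^{4ph}_λ. A schematic NumPy sketch, assuming per-mode arrays in SI units; the function and variable names, and the restriction to the xx component, are illustrative rather than the paper's implementation:

```python
import numpy as np

KB = 1.380649e-23       # Boltzmann constant, J/K
HBAR = 1.054571817e-34  # reduced Planck constant, J*s

def mode_heat_capacity(omega: np.ndarray, T: float) -> np.ndarray:
    """Per-mode heat capacity C_lambda (J/K) from Bose-Einstein statistics."""
    x = HBAR * omega / (KB * T)
    return KB * x**2 * np.exp(x) / np.expm1(x)**2

def kappa_rta_xx(omega, vx, tau_3ph, tau_4ph, volume, n_q, T=300.0):
    """kappa_xx under the single-mode RTA: combine 3ph and 4ph lifetimes
    with Matthiessen's rule, then sum C_lambda * v_x^2 * tau_lambda over
    modes and normalize by cell volume and q-point count."""
    tau = 1.0 / (1.0 / tau_3ph + 1.0 / tau_4ph)  # Matthiessen's rule
    return np.sum(mode_heat_capacity(omega, T) * vx**2 * tau) / (volume * n_q)
```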
Discussion
By targeting the phonon-scattering level instead of directly mapping structure to κ, the proposed ML surrogates retain the essential physics of thermal transport, outputting process- and mode-level quantities (Γ, τ) that are valuable beyond κ prediction, for example for optical linewidths and for applications such as thermal barrier coatings and radiative cooling. This physics-informed approach delivers quantitative accuracy (≤5% error), in contrast to the rough estimates of prior end-to-end ML models. The dramatic speedups arise from fast DNN forward passes and batched evaluation of many processes, without requiring specialized GPU acceleration (CPU-only runs remain strongly accelerated). The framework can be extended, for example via classification models that prune unimportant scattering processes before κ evaluation, further boosting efficiency. Transfer learning leverages the similarity between 3ph and 4ph scattering to improve 4ph predictions and reduce data needs; similar strategies could be explored between materials with related structures. Some discrepancies remain, such as a slight underprediction of κ^{3ph+4ph} for Si relative to experiment due to neglecting phonon renormalization at finite temperature, suggesting that future work include renormalization effects. Addressing memory constraints (mode-by-mode 4ph evaluation) and reimplementing in more efficient languages could yield further acceleration. Harmonizing descriptors across materials with different primitive-cell sizes is another avenue to broaden cross-material transfer learning.
Conclusion
The authors develop deep-learning surrogate models that predict three- and four-phonon scattering rates and relaxation times with near first-principles fidelity, enabling accurate lattice thermal conductivity calculations with errors below about 3% for κ^{3ph} and 5% for κ^{3ph+4ph}. The approach yields speedups of up to roughly 3.6× for 3ph workflows and roughly 70× for 3ph+4ph workflows compared with analytical BTE computations, while preserving detailed physical information. Transfer learning from 3ph to 4ph further enhances accuracy and reduces data requirements. These surrogates overcome key computational bottlenecks of high-order phonon scattering and pave the way for large-scale, high-confidence thermal transport informatics and materials design. Future work should incorporate phonon renormalization, optimize memory use and implementation efficiency, and develop descriptors that facilitate transfer across materials with differing unit-cell complexities.
Limitations
- The relaxation time approximation (RTA) assumes Umklapp-dominated scattering, and phonon renormalization at finite temperature is neglected; together these can contribute to slight underprediction of κ relative to experiment (e.g., for Si).
- Memory constraints necessitate mode-by-mode evaluation and file I/O for 4ph, reducing the potential speed gains; evaluating larger batches would accelerate the workflow further.
- The current Python implementation is less efficient than compiled languages; reimplementation could yield additional acceleration.
- Descriptor dimensionality depends on the number of atoms per primitive cell, complicating transfer learning across materials with different eigenvector dimensions; improved, size-agnostic descriptor schemes are needed.
- Training data generation still requires a fraction of the analytical calculations; although small, this is an overhead in the total workflow.