
High-throughput prediction of the carrier relaxation time via data-driven descriptor
Z. Zhou, G. Cao, et al.
Zizhen Zhou, Guohua Cao, Jianghui Liu, and Huijun Liu present a data-driven descriptor that predicts the carrier relaxation time in tetradymite compounds. The descriptor uses only elemental properties, needs no additional first-principles calculations, and applies to compounds with diverse stoichiometries.
Introduction
Substantial advances have been made in screening or discovering functional materials via high-throughput computational methods integrating physics, statistics, computer science, and artificial intelligence. Over the last decade, such approaches have been extensively applied in materials science, including identifying photovoltaic absorbers, predicting compound stability, and screening topological insulators. Among various applications, the search for high-performance thermoelectric materials is particularly important due to energy and environmental challenges. While prior high-throughput efforts targeted low thermal conductivity, another key parameter is the power factor, for which a central difficulty is evaluating the carrier relaxation time owing to complex scattering mechanisms. Electron–acoustic phonon interactions typically dominate in the operating temperature range of thermoelectrics. Within the effective mass approximation, the deformation potential (DP) theory provides an analytical estimate near band edges, and more complete, energy-dependent relaxation times can be obtained from electron–phonon coupling (EPC) calculations via Wannier interpolation. However, both require complex and computationally intensive first-principles workflows, becoming prohibitive for large unit cells. In this work, a high-throughput investigation combined with compressed sensing (SISSO) is employed to predict carrier relaxation time for tetradymite compounds. First-principles band structures are computed for a small set with integer stoichiometry, relaxation times τcal are obtained via DP theory focusing on dominant electron–acoustic phonon scattering, and systems are split into normal insulators (NIs) and topological insulators (TIs) due to their distinct band-edge dispersions and scattering phase spaces. For each class, SISSO is used to construct an interpretable descriptor τpre. 
The resulting descriptors agree closely with the DP-calculated values τcal (Pearson correlation >90%) and can be generalized to predict relaxation times for tetradymites with fractional stoichiometry, with selective first-principles cross-checks confirming the accuracy.
Literature Review
The study builds on extensive high-throughput materials discovery efforts applied to photovoltaics, stability prediction, and topological insulators, and on thermoelectric research emphasizing low thermal conductivity screening. For relaxation time estimation, two principal theoretical frameworks are cited: the deformation potential (DP) theory by Bardeen and Shockley within the effective mass approximation for near-edge scattering, and full electron–phonon coupling (EPC) calculations using Wannier interpolation that yield energy- and k-resolved τ but are computationally heavy. Prior experimental reports rarely measure τ directly; instead, τ is inferred from electrical transport (conductivity or mobility) fits. The topological character of several tetradymites (e.g., Bi2Se3, Bi2Te3, Sb2Te3) is known to alter band-edge dispersions and EPC phase space compared to normal insulators, motivating separate treatment of NIs and TIs. Compressed sensing and SISSO have been established as powerful tools to identify low-dimensional, physically meaningful descriptors from large feature spaces, even with relatively small training sets, making them apt for this problem.
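To make the descriptor-search idea concrete, here is a toy sketch of the screening (SIS) step only: candidate features are built by applying simple operators to primary features and then ranked by the magnitude of their Pearson correlation with the target. This is not the SISSO code; the function names, the reduced operator set, and the numbers are illustrative assumptions, and the sparsifying step that picks the final low-dimensional descriptor is omitted.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((u - mx) * (v - my) for u, v in zip(x, y))
    sxx = sum((u - mx) ** 2 for u in x)
    syy = sum((v - my) ** 2 for v in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant feature carries no correlation information
    return sxy / math.sqrt(sxx * syy)


def build_features(primary):
    """Expand primary features with +, -, *, / on pairs and log on singles.

    Assumes all values are positive so that division and log are defined.
    """
    feats = dict(primary)
    for (na, a), (nb, b) in combinations(primary.items(), 2):
        feats[f"({na}+{nb})"] = [u + v for u, v in zip(a, b)]
        feats[f"({na}-{nb})"] = [u - v for u, v in zip(a, b)]
        feats[f"({na}*{nb})"] = [u * v for u, v in zip(a, b)]
        feats[f"({na}/{nb})"] = [u / v for u, v in zip(a, b)]
    for name, values in primary.items():
        feats[f"log({name})"] = [math.log(u) for u in values]
    return feats


def sis(feats, target, keep):
    """Return the names of the `keep` features most correlated with the target."""
    ranked = sorted(feats, key=lambda n: abs(pearson(feats[n], target)), reverse=True)
    return ranked[:keep]


# Made-up toy data (not from the paper): the target is exactly a*b.
primary = {"a": [1.0, 2.0, 3.0, 4.0, 5.0], "b": [2.0, 1.0, 4.0, 3.0, 6.0]}
target = [u * v for u, v in zip(primary["a"], primary["b"])]
feats = build_features(primary)
top = sis(feats, target, keep=3)
```

In this toy case the product feature ranks first because it reproduces the target exactly. With real data the ranking only narrows the candidate pool, and a separate sparse-regression step chooses among the retained features.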
Methodology
Data generation and labeling: A combinatorial set of tetradymite compounds was constructed by placing group-VA elements (As, Sb, Bi) on the A/B cation sites and group-VIA elements (S, Se, Te) on the C/D/E anion sites of the rhombohedral (quintuple-layer) structure. Among the 243 integer-stoichiometry candidates, 85 NIs and 67 TIs were predicted to be mechanically stable and were used for high-accuracy calculations.
First-principles calculations: Density functional theory (VASP) with the PBE-GGA exchange-correlation functional and explicit spin–orbit coupling was used. van der Waals interactions between quintuple layers were included via optB86b-vdW. Computational settings: 550 eV plane-wave cutoff; 13×13×13 Monkhorst–Pack k-mesh; energy convergence of 1e-6 eV; forces below 0.01 eV/Å.
Relaxation time from DP theory: Assuming dominant acoustic phonon scattering, τcal was computed from the three-dimensional DP expression τcal = 2√(2π) C ħ^4 / [3 (kB T m*)^(3/2) E^2], where C is the elastic modulus, E is the deformation potential constant, and m* is the carrier effective mass. C and E were obtained from the optimized structures plus ±1% and ±2% lattice deformations. The focus was on the p-type relaxation time near the band edge at room temperature.
Dataset split: Because of band inversion and the resulting different band-edge dispersions in TIs, the τcal values were split into NI and TI training sets.
Feature construction for SISSO: Elemental properties of the constituent atoms were used as primary features: atomic mass m, p-orbital radius r (Å), and Pauling electronegativity χ (see Table 1 values). Site-specific values (A, B, C, D, E), their averages (ave), and mean square errors (e.g., mrmsE) were included. A nonlinear feature space was generated with operators H = {1, +, −, ×, /, exp, log, …} up to complexity 3, producing >10^2 candidate features. The SIS phase retained the top 80,000 features by correlation magnitude; the sparsifying operator then selected the optimal low-dimensional descriptors.
Descriptor training: Separate SISSO models were trained for NIs (85 samples) and TIs (67 samples) to map features to τpre.
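As a rough numerical illustration of the DP step, the sketch below evaluates the standard three-dimensional Bardeen–Shockley form τ = 2√(2π) C ħ^4 / [3 (kB T m*)^(3/2) E^2] in SI units. The function name, argument names, and the input values in the example call are assumptions chosen for illustration; they are not data from the study.

```python
import math

# CODATA 2018 constants (SI units)
HBAR = 1.054571817e-34   # reduced Planck constant, J s
K_B = 1.380649e-23       # Boltzmann constant, J/K
M_E = 9.1093837015e-31   # electron rest mass, kg
EV = 1.602176634e-19     # one electronvolt in joules


def dp_relaxation_time(c_gpa, e_dp_ev, m_eff, temperature=300.0):
    """Acoustic-phonon-limited relaxation time (seconds) from DP theory.

    c_gpa       : elastic modulus C in GPa
    e_dp_ev     : deformation potential constant E in eV
    m_eff       : effective mass m* in units of the electron mass
    temperature : temperature in K
    """
    c = c_gpa * 1e9          # GPa -> Pa (J/m^3)
    e_dp = e_dp_ev * EV      # eV -> J
    m = m_eff * M_E          # -> kg
    numerator = 2.0 * math.sqrt(2.0 * math.pi) * c * HBAR**4
    denominator = 3.0 * (K_B * temperature * m) ** 1.5 * e_dp**2
    return numerator / denominator


# Hypothetical inputs, for illustration only.
tau_s = dp_relaxation_time(c_gpa=50.0, e_dp_ev=10.0, m_eff=0.2)
tau_fs = tau_s * 1e15
```

The scaling is visible directly in the function: τ grows linearly with C, falls as E^-2, and falls as m*^-3/2, which is the dependence the descriptor analysis relies on.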
Generalization to fractional stoichiometry: For alloyed systems with fractional site occupancy, site-specific properties were computed as occupancy-weighted averages (virtual crystal approximation) for the A/B (As, Sb, Bi) and C/D/E (S, Se, Te) sites. These averaged features were fed into the trained descriptors to predict τpre for arbitrary compositions, enabling fast screening of over 16 million tetradymites by sampling stoichiometries on a 1/6 grid.
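The occupancy-weighted averaging and the 1/6 grid can be sketched as follows. The atomic masses and Pauling electronegativities are standard tabulated values and may differ in the last digit from the paper's Table 1; the helper names are hypothetical, and treating the five sites as independently sampled is an assumption about how the grid was built.

```python
from fractions import Fraction
from itertools import product

# Standard atomic weights (u) and Pauling electronegativities.
ATOMIC_MASS = {"As": 74.922, "Sb": 121.760, "Bi": 208.980,
               "S": 32.06, "Se": 78.971, "Te": 127.60}
PAULING_CHI = {"As": 2.18, "Sb": 2.05, "Bi": 2.02,
               "S": 2.58, "Se": 2.55, "Te": 2.10}


def site_average(occupancy, table):
    """Occupancy-weighted average of an elemental property on one site."""
    total = sum(occupancy.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("site occupancies must sum to 1")
    return sum(frac * table[el] for el, frac in occupancy.items())


def grid_occupancies(elements, steps=6):
    """All ways to fill one site with `elements` in increments of 1/steps."""
    options = []
    for counts in product(range(steps + 1), repeat=len(elements)):
        if sum(counts) == steps:
            options.append({el: Fraction(c, steps)
                            for el, c in zip(elements, counts) if c})
    return options


# Example: a cation site shared equally by Sb and Bi.
mixed_mass = site_average({"Sb": 0.5, "Bi": 0.5}, ATOMIC_MASS)
mixed_chi = site_average({"Sb": 0.5, "Bi": 0.5}, PAULING_CHI)

# Occupancy choices per site, and combinations if all five sites are independent.
cation_options = grid_occupancies(("As", "Sb", "Bi"))
anion_options = grid_occupancies(("S", "Se", "Te"))
n_combinations = len(cation_options) ** 2 * len(anion_options) ** 3
```

Under the independence assumption each site has 28 occupancy choices, so five sites give 28^5 = 17,210,368 combinations. That is the same order of magnitude as the more than 16 million compounds quoted; the exact count in the study may reflect symmetry or filtering that is not described here.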
Key Findings
- Two compact, physically interpretable SISSO descriptors were identified to predict p-type relaxation time near the band edge at room temperature, separately for NIs and TIs, using only elemental properties (m, χ, r):
• NIs (85 samples): τpre shows excellent agreement with τcal (Pearson 94%). The dominant term depends on atomic masses, effectively captured by 7.8×10^-3 × [(mA + mB + mE − mave) √(mC mD)] − 0.6 × [(χAs − χC)(χave − χO)]/(mC mD) − 52.9. Analysis shows τ increases with average atomic mass mave via reduced effective mass m* (τ ∝ m*^-3/2).
• TIs (67 samples): τpre vs τcal correlation 92%. The leading dependence involves the mean square error of masses (mrmsE) and electronegativity/radius contrasts: τpre = 79.0 × {[(ln mse)(χA − χB) − (rC/rE)] − 3.0×10^-3 × [(mm/mse)(mse χave − χE)]} + 20.3. Larger mrmsE correlates with weaker anion–cation p-orbital hybridization, narrower bandwidth, larger m*, and lower τ.
- Validation vs experiment: For five tetradymites with available transport-based τexp, τcal and τpre agree with each other and are larger than τexp, consistent with experimental data taken at higher carrier concentrations (10^18–10^20 cm^-3) that increase scattering relative to near-edge calculations (10^17–10^18 cm^-3).
- Physical insights:
• NIs: τpre strongly correlates with mave; heavier compositions tend to have larger lattice constants and smaller m*, increasing τ. Higher anion electronegativity reduces lattice constants, increasing m* and decreasing τ.
• TIs: τpre mainly governed by mrmsE and p-orbital radius/electronegativity mismatches that modulate hybridization and m*.
- Generalization to fractional stoichiometry: For 13 NIs and 13 TIs with medium supercells and fractional compositions, τpre matches τcal with Pearson 96% (NIs) and 93% (TIs). Additional comparisons with τexp for five fractional systems show consistent trends.
- Large-scale screening: Sampling the stoichiometry in increments of 1/6 allowed >16 million tetradymites to be screened. The distributions show that most NIs have τpre ≈ 60–70 fs, while most TIs fall in the 30–40 fs range. Notably, >2×10^5 NIs have τpre >120 fs (favorable for bulk TE performance). Conversely, ~5×10^5 TIs have τpre between 0 and 10 fs, and >20,000 TIs show ultralow τpre <0.5 fs (e.g., Bi2S1.667Se0.5Te0.833 ≈ 0.3 fs; Sb0.5Bi1.5S1.5Se0.833Te0.667 ≈ 0.2 fs; BiSbS2.167Te0.833 ≈ 0.2 fs). Such a low bulk τ, combined with a long surface-state relaxation time (e.g., ~550 fs), can yield large surface/bulk τ ratios (~1000) and enable high ZT (~2) in thin films.
- Computational efficiency: The descriptors require no additional first-principles input beyond elemental properties, enabling rapid high-throughput prediction for both integer and fractional stoichiometries.
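The thin-film argument in the screening results is simple arithmetic, sketched below with the three TI examples and the approximate values quoted above (bulk τpre in fs, a surface-state relaxation time of about 550 fs, and the 0.5 fs cutoff). The function and variable names are illustrative assumptions.

```python
TAU_SURFACE_FS = 550.0  # approximate surface-state relaxation time quoted in the text

# Predicted bulk relaxation times (fs) for the three TI examples quoted in the text.
ti_examples_fs = {
    "Bi2S1.667Se0.5Te0.833": 0.3,
    "Sb0.5Bi1.5S1.5Se0.833Te0.667": 0.2,
    "BiSbS2.167Te0.833": 0.2,
}


def select_ultralow(taus_fs, threshold_fs=0.5):
    """Keep compounds whose predicted bulk relaxation time is below the cutoff."""
    return {name: tau for name, tau in taus_fs.items() if tau < threshold_fs}


def surface_to_bulk_ratio(tau_bulk_fs, tau_surface_fs=TAU_SURFACE_FS):
    """Ratio of surface-state to bulk relaxation time."""
    if tau_bulk_fs <= 0:
        raise ValueError("bulk relaxation time must be positive")
    return tau_surface_fs / tau_bulk_fs


ratios = {name: surface_to_bulk_ratio(tau)
          for name, tau in select_ultralow(ti_examples_fs).items()}
```

At the 0.5 fs cutoff itself the ratio is 550 / 0.5 = 1100, so every compound below the cutoff exceeds the ~1000 ratio mentioned in the text.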
Discussion
The work addresses the central challenge of rapidly estimating carrier relaxation time in complex thermoelectric tetradymites by deriving interpretable, low-cost descriptors. Splitting the dataset into NIs and TIs is essential because SOC-induced band inversion in TIs alters band-edge dispersions and scattering phase space. For NIs, the descriptor captures an inverse relationship between effective mass and average atomic mass, with electronegativity effects modulating lattice constants and thus m*. For TIs, variations in mass dispersion (mrmsE) and anion–cation orbital property mismatches govern hybridization strength, bandwidth, m*, and τ. The strong correlations between τpre and τcal (≥92%) and consistent trends with τexp demonstrate that the descriptors capture the dominant physics of acoustic phonon scattering near the band edge. The framework enables compositionally guided tuning of τ to optimize power factor and overall thermoelectric performance, and identifies composition regions with exceptionally high or low τ suited for bulk or thin-film TI applications, respectively.
Conclusion
Using SISSO with DP-theory-based training data, the study delivers two simple, physically grounded descriptors for rapid prediction of the p-type carrier relaxation time at room temperature in tetradymite NIs and TIs. The models achieve high accuracy (Pearson 94% for NIs, 92% for TIs), generalize to fractional stoichiometries with strong validation, and enable screening of >16 million compounds at negligible computational cost. The descriptors elucidate the key governing factors (average atomic mass, electronegativity contrasts, and p-orbital radii dispersion) and offer practical levers to tune τ for enhanced thermoelectric performance. Because the descriptors rely only on intrinsic elemental properties, the approach is readily adaptable to other thermoelectric families (e.g., half-Heuslers, rocksalt chalcogenides). Future work could extend to energy-dependent τ via EPC-informed training, incorporate temperature and carrier concentration dependencies, and explore n-type carriers and additional scattering mechanisms.
Limitations
- Scope: Predictions target p-type relaxation time near band edges at room temperature and typical low carrier concentrations (≈10^17–10^18 cm^-3). Trends may deviate at higher carrier concentrations common in optimized thermoelectrics; the authors note the descriptors may not be suitable for heavily doped systems.
- Training physics: τcal used for training comes from deformation potential theory (acoustic phonon scattering) rather than full EPC, potentially omitting other scattering channels (impurities, grain boundaries, polar optical phonons) present in experiments.
- Topology-specific models: Separate descriptors are required for NIs and TIs; misclassification of topology could degrade predictions.
- Experimental comparison: Direct experimental τ data are scarce; available τexp are inferred from transport fits and at higher carrier concentrations, complicating direct quantitative comparison.
- Temperature/generalizability: Temperature dependence beyond room temperature and n-type carriers were not addressed in the presented results.