Introduction
The search for high-temperature superconductors is a significant challenge in condensed matter physics. The enormous size of the compositional phase space makes a comprehensive search with existing theoretical tools impractical. Traditional methods rely on deep chemical understanding to identify promising candidates and thereby narrow the search space, whereas machine learning (ML) offers a powerful alternative: it can uncover correlations within a higher-order parameter space and act as a predictive tool for exploring new materials. The recent discovery of hydrogen-rich materials exhibiting near-room-temperature superconductivity under high pressure has fueled this research. Techniques such as crystal structure prediction (CSP) are crucial for interpreting experimental results, but calculating the critical superconducting transition temperature (Tc) ab initio can be computationally expensive and time-consuming (days to months), depending on the complexity of the material and the need to account for anharmonicity. The difficulty increases significantly for ternary and higher-order hydrides, which may offer a pathway to ambient-condition superconductivity. This research is therefore motivated by the need for fast and accurate machine learning methods for predicting superconducting properties, leveraging existing superconductivity research and experimental data (e.g., the Supercon database).

Existing ML models based on stoichiometry (e.g., using atomic mass, charge, and number of atoms as descriptors) achieve good regression performance, but they lack the three-dimensional structural information needed to account for the pressure-dependent structural changes that influence Tc; they often struggle with data sets in which Tc varies significantly with pressure for the same composition. Recent efforts have incorporated three-dimensional atomic structure through smooth overlap of atomic positions (SOAP) descriptors, which improves performance but may overlook similarities between polymorphs or neglect properties that arise from the full periodic structure, such as phonon dispersion.

This paper aims to develop a novel representation of structural properties for machine learning models to improve predictive accuracy and expand the capabilities of materials discovery. The representation is designed to differentiate between polymorphs and to offer physical interpretability of the structural properties, thereby overcoming the limitations of existing approaches.
Literature Review
Existing literature highlights the challenges of, and previous attempts at, using machine learning to predict superconducting Tc. Composition-based models, which utilize atomic properties such as mass, charge, and number of atoms, have shown promise, but they ignore the three-dimensional structural information crucial for understanding the behavior of superconductors under pressure. More recent work has incorporated structural descriptors such as SOAP to capture the local atomic environment, improving predictive capability; however, these methods may still not fully capture global structural features and the interplay between atomic arrangement and superconducting properties. This research aims to improve on these approaches by introducing a novel structural representation based on the periodic mass and charge distribution, leading to more accurate and comprehensive prediction of Tc.
Methodology
This study uses a machine learning approach to predict the superconducting transition temperature (Tc) of materials based on their structural and compositional properties. The methodology consists of the following steps:
**Data Collection and Preprocessing:**
1. The Supercon database, containing thousands of superconducting Tc values along with chemical compositions, was used as the primary data source. Data cleaning was performed to handle ambiguities and errors, starting from a previously cleaned version of the dataset. Entries were grouped by composition, and the mean and variance of the reported Tc values were calculated for each composition. Compositions with multiple reported temperatures were divided into two sets by a variance cutoff (σ²_cut = 2); the lower-variance set, labeled with the averaged Tc, was used for model training (a pandas-style sketch of this grouping step follows the list).
2. The Materials Project database was used to match compositions from the Supercon dataset with corresponding structures, resulting in 2454 structures. For low-variance compositions, the averaged Tc values were assigned to the corresponding structures. For high-variance compositions, several labeling strategies were considered, including exclusion; averaging was chosen for this analysis.
3. Theoretical high-pressure hydride superconductors not present in Supercon were also included, with Tc labels taken as the average over the reported μ* bounds from Eliashberg-equation calculations.
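As a minimal sketch of the grouping and variance split in step 1, the snippet below uses pandas; the column names (`composition`, `Tc`) and the treatment of single-report compositions as zero-variance are assumptions, not details from the paper.

```python
import pandas as pd

SIGMA2_CUT = 2.0  # Tc variance cutoff used to split the composition groups

def split_by_variance(supercon: pd.DataFrame):
    """Group cleaned Supercon entries by composition and split on Tc variance.

    Assumes columns named 'composition' and 'Tc'; compositions with a single
    reported Tc (undefined variance) are treated as zero-variance.
    """
    stats = supercon.groupby("composition")["Tc"].agg(
        Tc_mean="mean", Tc_var="var", n_reports="count"
    )
    stats["Tc_var"] = stats["Tc_var"].fillna(0.0)
    low_var = stats[stats["Tc_var"] <= SIGMA2_CUT]    # labeled with the averaged Tc
    high_var = stats[stats["Tc_var"] > SIGMA2_CUT]    # excluded or relabeled separately
    return low_var, high_var
```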
**Structural Representation:**
The researchers developed a novel structural representation using Fourier analysis. The atoms are described as fields of atomic mass and charge, enabling a nuanced description of the atomic properties as local regions of charge or mass. A Gaussian density for a single atom is defined, and its periodic counterpart is obtained by convolving it with a Dirac comb, thereby incorporating cell periodicity. The Fourier transform of this periodic density is computed to express the field in terms of Fourier coefficients. A similar approach using the von Mises distribution is explored. This representation converts the atomic positions in a lattice to two 3D grids of complex coefficients. These coefficients, along with categorical and numerical data (including the number of each species in the unit cell, chemical descriptors, and lattice parameters), serve as input for the machine learning model.
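The exact conventions and normalization are not spelled out in this summary, so the following NumPy sketch should be read as one plausible realization of the idea: the Fourier coefficients of a unit-cell-periodic sum of atom-centered Gaussians (a single Gaussian convolved with a Dirac comb), weighted by atomic mass or charge. The function name, the `sigma` width parameter, and the chosen Fourier convention are illustrative assumptions.

```python
import numpy as np

def fourier_coefficients(lattice, frac_coords, weights, sigma, n_max):
    """Fourier coefficients c_hkl of a lattice-periodic field built by placing a
    Gaussian of width sigma, weighted by atomic mass or charge, on every atom.

    lattice     : 3x3 array whose rows are the lattice vectors a1, a2, a3
    frac_coords : (N, 3) fractional atomic positions
    weights     : (N,)   atomic masses or charges
    """
    A = np.asarray(lattice, dtype=float)
    B = np.linalg.inv(A).T                     # rows are reciprocal vectors b1, b2, b3
    volume = abs(np.linalg.det(A))

    idx = np.arange(n_max + 1)                 # 0 <= h, k, l <= n_max
    hkl = np.stack(np.meshgrid(idx, idx, idx, indexing="ij"), axis=-1)   # (n, n, n, 3)

    g2 = np.sum((hkl @ B) ** 2, axis=-1)       # |G_hkl|^2 in Cartesian coordinates
    form = np.exp(-2.0 * np.pi**2 * sigma**2 * g2)        # FT of the Gaussian envelope

    # Structure-factor-like phase sum over atoms: exp(-2*pi*i*(h*x + k*y + l*z))
    phase = np.exp(-2j * np.pi * np.einsum("abcm,nm->abcn", hkl, np.asarray(frac_coords)))
    return (form[..., None] * phase) @ np.asarray(weights, dtype=float) / volume
```

Calling this once with atomic masses and once with charges yields the two 3D grids of complex coefficients described above.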
**Model Development:**
1. The model has two input branches: one for the 3D spatial data (the Fourier coefficients) and one for the numerical data. A complex-valued convolutional neural network (CVCNN) was used to process the spatial data, although real-valued alternatives (taking the modulus, real part, or imaginary part of the coefficients) were also evaluated (a Keras-style sketch of this two-branch layout follows the list).
2. The numerical data passes through four fully connected layers. The spatial data passes through two convolutional layers, each with 32 filters of 3×3×3 kernels, 2×2×2 average pooling, and 50% dropout; Cartesian ReLU and absolute-value activation functions are applied. The spatial output is flattened and concatenated with the numerical-branch output.
3. The combined output passes through four more fully connected ReLU layers with 50% dropout and then to a final single-neuron output that predicts Tc. The Keras Huber loss with delta = 7 was used during training to mitigate the effect of outliers. An 80/20 train-test split was used, and the best model was selected by validation loss over 300 training epochs.
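As a rough illustration of this two-branch architecture, the sketch below uses standard (real-valued) Keras layers on the modulus of the coefficients, which the text notes was evaluated as an alternative to the complex-valued network. The layer widths, optimizer, and input sizes are assumptions; the two-branch layout, 3×3×3 kernels, 2×2×2 average pooling, 50% dropout, and Huber loss with delta = 7 follow the description above.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def build_model(grid_shape=(14, 14, 14, 2), n_numeric=32):
    # Spatial branch: |c_hkl| grids (two channels: mass and charge); the shape
    # assumes the 0 <= h, k, l <= 13 cutoff quoted under "Model Parameters".
    spatial_in = layers.Input(shape=grid_shape, name="fourier_grid")
    x = spatial_in
    for _ in range(2):
        x = layers.Conv3D(32, (3, 3, 3), padding="same", activation="relu")(x)
        x = layers.AveragePooling3D((2, 2, 2))(x)
        x = layers.Dropout(0.5)(x)
    x = layers.Flatten()(x)

    # Numerical branch: species counts, chemical descriptors, lattice parameters.
    numeric_in = layers.Input(shape=(n_numeric,), name="numeric")
    y = numeric_in
    for _ in range(4):
        y = layers.Dense(64, activation="relu")(y)

    # Merge, four more ReLU layers with dropout, then a single-neuron Tc output.
    z = layers.Concatenate()([x, y])
    for _ in range(4):
        z = layers.Dense(128, activation="relu")(z)
        z = layers.Dropout(0.5)(z)
    out = layers.Dense(1, name="Tc")(z)

    model = Model(inputs=[spatial_in, numeric_in], outputs=out)
    model.compile(optimizer="adam", loss=tf.keras.losses.Huber(delta=7.0))
    return model
```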
**Model Parameters:**
The following parameters were used: index cutoff δ = 13, with reciprocal-lattice indices restricted to 0 ≤ h, k, l, and scaling factors γ_e,m = {0.0025, 0.25} for the charge and mass fields. Alternative values were explored, but the chosen parameters produced accurate models. A complex-valued neural network was used to handle the complex input from the Fourier coefficients.
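How the scaling factors enter the representation is not fully specified here; one plausible reading, assuming γ_e and γ_m simply rescale the charge and mass weights before the transform so the two channels have comparable magnitudes, is sketched below. It reuses the hypothetical `fourier_coefficients` helper from the earlier sketch, and the Gaussian width is likewise an arbitrary assumption.

```python
import numpy as np

GAMMA_E, GAMMA_M = 0.0025, 0.25   # scaling factors gamma_{e,m} quoted above
DELTA = 13                        # index cutoff: 0 <= h, k, l <= DELTA

def featurize(lattice, frac_coords, charges, masses, sigma=0.5):
    """Two-channel complex grid fed to the network (assumed layout: charge, mass)."""
    charge_grid = fourier_coefficients(
        lattice, frac_coords, GAMMA_E * np.asarray(charges), sigma, DELTA
    )
    mass_grid = fourier_coefficients(
        lattice, frac_coords, GAMMA_M * np.asarray(masses), sigma, DELTA
    )
    return np.stack([charge_grid, mass_grid], axis=-1)   # shape (14, 14, 14, 2)
```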
Key Findings
The model achieved a high R-squared value of 0.9429 on the validation set, demonstrating strong predictive performance across the entire prediction space, even without excluding high-variance Tc data. It accurately predicted Tc for various high-temperature superconductors, outperforming composition-based models, particularly for high-pressure hydrides. For example, the model predicted a Tc of 237 K for P63/mmc YH9 at 255 GPa and 249 K for Fm3m LaH10 at 150 GPa, in close agreement with the measured values of 237 K and 249 K, respectively. Predictions for several hydrides (LaH10, YH10, CSH7) at pressures above 250 GPa also showed strong agreement with DFT calculations. Performance on Li-Mg-H superconductors was less accurate, possibly because their extreme Tc values lie at the edge of the training data.

The speed of the model allows rapid exploration of the compositional and structural space. Morphisms of base structures were created by varying lattice parameters and swapping atomic species. Using Im3m H3S and Fm3m MgH13 as base structures, various morphisms were generated and their Tc values predicted, identifying Fm3m LiMgH12 and AlH13 (predicted Tc above 350 K) and Im3m PH3, BH3, and SeH3 as potential high-Tc superconductors; the SeH3 prediction was further validated using Quantum Espresso. Using Fd3m Li2MgH16 as a base, morphism exploration identified BaH32N5, LaH32N5, BaH32N4O, and LaH32N4O as potential high-Tc superconductors (Tc > 350 K), and a further exploration of XYH36 morphisms identified LiLaH36, ZrH36Cl, and TeH36N as potential room-temperature superconductors (Tc > 400 K). None of these candidates were captured by composition-based models.

Analysis of lattice scaling revealed a superconducting dome, similar to experimentally observed pressure-dependent Tc behavior, and different polymorphs showed distinct responses to scaling, revealing insights into the relationship between structure and Tc. Analysis of P3m1 Li2MgH16 highlighted the importance of 2D H-H structural motifs in driving Tc. More broadly, the structural representation enabled physical insight into the structure-Tc relationship, particularly the pressure-dependent superconducting dome effect.
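As a rough sketch of the morphism procedure described above (species swaps combined with lattice scaling as a proxy for pressure), the snippet below uses pymatgen; the file name, the specific swap, and the `featurize`/`model` objects standing in for the paper's trained pipeline are all placeholders.

```python
from pymatgen.core import Structure

def generate_morphs(base: Structure, swaps: dict, scales=(0.90, 0.95, 1.00, 1.05)):
    """Enumerate simple morphisms of a base structure: isotropic lattice scaling
    (a stand-in for applied pressure) combined with atomic species swaps."""
    morphs = []
    for scale in scales:
        s = base.copy()
        s.scale_lattice(base.volume * scale**3)   # rescale the unit-cell volume
        if swaps:
            s.replace_species(swaps)              # e.g. {"S": "P"} on an H3S cell
        morphs.append(s)
    return morphs

# Usage sketch -- featurize() and model are placeholders for the trained pipeline:
# base = Structure.from_file("H3S_Im-3m.cif")
# candidates = generate_morphs(base, {"S": "P"})
# tc_predictions = model.predict([featurize(s) for s in candidates])
```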
Discussion
The results demonstrate the effectiveness of incorporating structural information into machine learning models for predicting Tc. The model’s superior performance over composition-based models highlights the importance of considering three-dimensional structural details. The identification of novel high-Tc superconductors, not predicted by composition-based models, validates the model's ability to explore the vast compositional space efficiently. The observation of a superconducting dome effect and the insights into structural motifs contribute to a deeper understanding of the underlying physics of superconductivity. The study’s limitations include the potential bias introduced by the data selection and preprocessing steps, and the reliance on existing DFT calculations for the theoretical hydride superconductors. Future improvements could involve using more sophisticated structural descriptors and exploring larger datasets to improve model accuracy and generalizability.
Conclusion
This research successfully demonstrates the use of machine learning with a novel structural representation for predicting superconducting transition temperatures. The model's accuracy, speed, and ability to identify novel potential superconductors make it a valuable tool for materials discovery. Future work should focus on expanding the dataset, refining the structural descriptors, and integrating the model with crystal structure prediction tools for a more efficient and automated screening process. Further experimental validation of the predicted superconductors is crucial.
Limitations
The study's limitations include the potential bias from data selection and preprocessing. The reliance on existing DFT calculations for theoretical hydrides might introduce uncertainties. The model's performance on outlier materials (e.g., the Li-Mg-H hydrides) suggests the need for further model refinement or the inclusion of more diverse data in future studies.