
Embedding physics domain knowledge into a Bayesian network enables layer-by-layer process innovation for photovoltaics
Z. Ren, F. Oviedo, et al.
This study integrates physics domain knowledge into a Bayesian network to optimize gallium arsenide solar cell growth, identifying layer-specific performance bottlenecks and reducing the reliance on traditional experimental methods. It was conducted by Zekun Ren, Felipe Oviedo, Maung Thway, Siyu I. P. Tian, Yue Wang, Hansong Xue, Jose Dario Perea, Mariya Layurova, Thomas Heumueller, Erik Birgersson, Armin G. Aberle, Christoph J. Brabec, Rolf Stangl, Qianxiao Li, Shijing Sun, Fen Lin, Ian Marius Peters, and Tonio Buonassisi.
Introduction
Process optimization is crucial for maximizing the performance of novel materials and devices, particularly in photovoltaics where numerous process variables influence device efficiency. Traditional black-box optimization methods (e.g., Design of Experiments, Grid Search, Bayesian Optimization, Particle Swarm Optimization) systematically modify selected variables within a defined range to map the system's response surface and find an optimum. While cost-effective and potentially ideal for self-driving laboratories, these methods have limitations. The achievable performance improvement is restricted by the pre-selected variables and their ranges, artificially constraining the parameter space. Moreover, understanding the root causes of underperformance is limited, often requiring secondary characterization or numerous combinatorial variations of base samples. In contrast, Bayesian inference coupled with a physics-based forward model and rapid current-voltage measurements provides a statistically rigorous approach to identify performance bottlenecks in early-stage photovoltaic devices. Recent work also highlights the promise of combining physical insights with machine-learning models for energy materials development.
This study focuses on optimizing the growth temperature profile of a gallium arsenide (GaAs) solar cell grown by metal-organic chemical vapor deposition (MOCVD). Growth temperature is a critical and challenging parameter in III-V film deposition, impacting growth rate, surface morphology, dopant incorporation, and defect formation. Some other parameters, such as precursor flow rate and growth pressure, are closely related to one another (approximately through the ideal gas law), but the relationships between the remaining process variables (precursor type, carrier gas flow rate) and material properties are less clear and potentially tool-specific. Growth temperature is therefore used as the primary optimization variable, while the machine-learning models are intended to generalize to other process variables.
GaAs solar cells consist of several layers: a window layer, an emitter, a base (the bulk absorber), and a back surface field (BSF). Optimizing material properties for each layer and interface is essential for maximum performance. An experienced researcher might optimize each layer individually by mapping process variables to material properties, which requires fabricating many auxiliary samples under various conditions and applying secondary characterization techniques such as secondary ion mass spectrometry (SIMS) and time-resolved photoluminescence (TR-PL). These techniques are significantly more costly than current-voltage (JV) measurements, the primary indicator of solar cell performance. The same problem arises when optimizing other multi-layer energy systems and semiconductors.
To address this challenge, the researchers combine machine-learning techniques to infer the effects of process variables on different device layers, avoiding expensive characterization by using automated JV measurements at multiple illumination intensities (JVi). A neural-network "surrogate" model mimics the numerical device-physics model and runs >100x faster than the numerical solver, significantly accelerating calculations. The surrogate is embedded in a two-step Bayesian inference method (a Bayesian network, or hierarchical Bayes) with physically constrained relations linking process variables to the properties of each layer.
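The two-step inference can be written compactly. What follows is a generic hierarchical-Bayes factorization, not the paper's own equation; the symbols are notation assumed here for illustration: $T$ is the growth temperature, $\theta$ the parameters of the process-to-property relation $g$, $m$ the material properties of the layers, $f$ the device model (or its surrogate), and $J$ the measured JVi curves.

```latex
\begin{aligned}
m &= g(T;\,\theta)
  && \text{step 1: process variable to material properties} \\
P(\theta \mid J, T) &\propto P\!\left(J \mid f\big(g(T;\,\theta)\big)\right) P(\theta)
  && \text{step 2: material properties to device measurements}
\end{aligned}
```

Because the likelihood is evaluated through $f$ many times during sampling, the speed of $f$ largely determines whether the inference is practical, which is why a fast surrogate matters.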
Literature Review
The paper reviews existing literature on process optimization in photovoltaics, highlighting the limitations of traditional black-box optimization methods and the potential benefits of incorporating physics-based models. It cites several publications demonstrating the successful application of Bayesian optimization and machine learning in materials discovery and inverse design. Specifically, the authors reference works on Bayesian optimization for chemistry, accelerated discovery of large electrostrains in BaTiO3-based piezoelectrics, accelerated search for BaTiO3-based piezoelectrics, and multifunctional structural design of graphene thermoelectrics. The review also touches on the use of machine learning to accelerate materials discovery in the Harvard Clean Energy Project. Furthermore, the authors acknowledge previous research on rapid photovoltaic device characterization through Bayesian parameter estimation, which informs their approach to linking process variables to material properties and device performance. Together, these references motivate the need for more efficient and interpretable optimization techniques in photovoltaic device manufacturing and position the proposed approach relative to existing methods.
Methodology
The core methodology involves constructing a Bayesian network to link process variables with material and device properties in GaAs solar cells. The optimization targets individual material properties to maximize overall device performance. The Bayesian network consists of four parts:
1. **Parameterization of process variables by embedding physics knowledge:** Physics-based constraints are imposed to couple process variables (e.g., growth temperature) to material bulk and interface properties (e.g., lifetime). A modified Arrhenius equation (Eq. [2] of the paper) links Zn doping levels to growth temperature, which constrains the variable space and improves the convergence of the Bayesian inference. Similar parameterizations are used for Si doping concentration, bulk minority carrier lifetime (τ), and front and back surface recombination velocities (SRVs).
2. **Inference of material and device properties from device measurements:** The Bayesian network infers underlying material properties from JV curves to trace performance issues to specific material or interface properties. A numerical device-physics model (PC1D) links inferred properties to solar cell parameters (JV characteristics, quantum efficiency, conversion efficiency). However, numerical simulation is computationally expensive, so a surrogate model replaces it.
3. **Replacement of numerical solver with a robust neural network surrogate model:** A deep neural network surrogate model replaces the computationally intensive PC1D model. This surrogate model consists of two parts: (1) a denoising autoencoder (AE) to reconstruct noise-free JV curves from noisy experimental data, and (2) a regression model to predict JV curves from underlying material properties. The AE is trained on 20,000 simulated JV curves with added Gaussian noise to mimic experimental noise. The regression model maps material descriptors to the latent space of the AE, from which the JV curves are predicted. Running on a GPU, the surrogate model is 130 times faster than the PC1D numerical solver.
4. **Optimizing solar cells using Bayesian network inferred results:** A hierarchical Bayesian inference procedure (Eq. [1] of the paper) integrates the two inference steps. A Markov Chain Monte Carlo (MCMC) method samples the posterior distribution of the latent parameters, enabling efficient exploration of the parameter space. The optimal growth temperature profile is then identified by grid search, maximizing the desired material properties (lifetime) and minimizing the undesired ones (SRVs).
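The inference chain described in the four parts above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: it uses the standard (not the modified) Arrhenius form, centered at the 630 °C baseline; `toy_surrogate` is a hypothetical linear stand-in for the neural-network surrogate; the sweep temperatures, the "true" parameters, and the noise level are invented; and the sampler is a plain random-walk Metropolis scheme rather than whatever MCMC variant the study used.

```python
import math
import random

K_B = 8.617333e-5   # Boltzmann constant in eV/K
T_REF_C = 630.0     # reference temperature: the baseline growth temperature


def arrhenius_log(temp_c, log_ref, e_a):
    """ln(property) at temp_c for a standard Arrhenius law centered at T_REF_C."""
    t, t_ref = temp_c + 273.15, T_REF_C + 273.15
    return log_ref - (e_a / K_B) * (1.0 / t - 1.0 / t_ref)


def toy_surrogate(log_prop):
    """Hypothetical stand-in for the neural-network surrogate (NOT the real model):
    maps one log material property to one scalar device observable."""
    return 0.9 + 0.02 * log_prop


def log_posterior(theta, temps_c, observed, sigma):
    """Gaussian log-likelihood plus a flat prior inside assumed bounds."""
    log_ref, e_a = theta
    if not (30.0 < log_ref < 50.0 and 0.0 < e_a < 3.0):
        return -math.inf
    resid = (y - toy_surrogate(arrhenius_log(t, log_ref, e_a))
             for t, y in zip(temps_c, observed))
    return -0.5 * sum((r / sigma) ** 2 for r in resid)


def metropolis(log_post, start, step, n_steps, rng):
    """Random-walk Metropolis sampler; returns the list of visited states."""
    current, current_lp = list(start), log_post(start)
    chain = []
    for _ in range(n_steps):
        proposal = [c + rng.gauss(0.0, step) for c in current]
        proposal_lp = log_post(proposal)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        chain.append(tuple(current))
    return chain


TEMPS_C = [530.0, 580.0, 630.0, 650.0, 680.0]  # assumed points within 530-680 degC
TRUE_THETA = (41.4, 1.0)                       # invented "ground truth" for the demo
SIGMA = 0.002                                  # assumed measurement noise

rng = random.Random(0)
observed = [toy_surrogate(arrhenius_log(t, *TRUE_THETA)) + rng.gauss(0.0, SIGMA)
            for t in TEMPS_C]
chain = metropolis(lambda th: log_posterior(th, TEMPS_C, observed, SIGMA),
                   start=(41.0, 0.5), step=0.05, n_steps=20000, rng=rng)
samples = chain[5000:]                         # discard burn-in
post_mean = [sum(s[i] for s in samples) / len(samples) for i in range(2)]
print("posterior mean (ln property at 630 degC, Ea in eV):", post_mean)
```

Sampling the two Arrhenius parameters instead of one free property value per temperature is what the physics constraint buys: five measurements constrain two numbers rather than five. Centering the law at a reference temperature is a design choice made here to reduce the correlation between the two parameters, which helps a simple random-walk sampler mix.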
Key Findings
The researchers fabricated five batches of GaAs solar cells (four cells per batch), each at a different constant growth temperature in the range 530-680 °C, to explore the relationship between growth temperature and material properties. The Bayesian network successfully inferred material properties (doping concentration, bulk lifetime, SRVs) as a function of growth temperature. The logarithms of the p-type (Zn) doping level, the n-type (Si) doping level, and the front SRV were almost linear in 1/T, in good agreement with the Arrhenius equation, while bulk lifetime and rear SRV exhibited nonlinear relationships. Independent SIMS and TR-PL measurements validated the material properties inferred by the Bayesian network. Notably, each recombination parameter reached its optimum (maximum bulk lifetime, minimum front SRV, minimum rear SRV) at a different growth temperature, suggesting that optimizing each layer separately could improve overall device performance.
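The "almost linear in 1/T" observation corresponds to a straight-line fit of ln(property) against 1/T, whose slope gives an apparent activation energy. The sketch below shows that check on synthetic data; the prefactor, the activation energy, and the individual temperatures are invented for illustration and are not values from the study, and the function name `fit_arrhenius` is hypothetical.

```python
import math

K_B = 8.617333e-5  # Boltzmann constant in eV/K


def fit_arrhenius(temps_c, values):
    """Least-squares line through ln(value) versus 1/T (T in kelvin).

    Returns (ln_prefactor, activation_energy_eV, r_squared) for the model
    value = A * exp(-Ea / (kB * T)).
    """
    xs = [1.0 / (t + 273.15) for t in temps_c]
    ys = [math.log(v) for v in values]
    n = len(xs)
    x_mean, y_mean = sum(xs) / n, sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # equals -Ea / kB
    intercept = y_mean - slope * x_mean  # equals ln(A)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - y_mean) ** 2 for y in ys)
    return intercept, -slope * K_B, 1.0 - ss_res / ss_tot


# Synthetic, noise-free data generated from an exact Arrhenius law (invented numbers).
temps_c = [530.0, 580.0, 630.0, 650.0, 680.0]
prefactor, e_a_true = 1.0e23, 0.8
doping = [prefactor * math.exp(-e_a_true / (K_B * (t + 273.15))) for t in temps_c]

ln_a, e_a, r2 = fit_arrhenius(temps_c, doping)
print(f"Ea = {e_a:.3f} eV, ln(A) = {ln_a:.2f}, R^2 = {r2:.4f}")
```

An R² close to 1 indicates Arrhenius-like behavior; a property such as bulk lifetime, which the study found to be nonlinear in 1/T, would show a visibly poorer fit. The sign of the fitted activation energy depends on whether the property rises or falls with temperature.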
Based on these insights, a new temperature profile was designed for the GaAs devices, and an additional MOCVD experiment was conducted using the optimized profile. This led to a 6.5% relative increase in average AM1.5G efficiency compared to the baseline (630 °C). JV and EQE measurements showed that both Jsc and Voc contributed to the efficiency improvement, with improved photo-response at wavelengths <820 nm, indicating a significant reduction in recombination at the front and bulk layers. The Bayesian inference confirmed improvements in material properties (lower SRVs, higher τ) in the optimized cells. Because the improvement required only a single temperature sweep of five MOCVD runs, plus one run to validate the optimized profile, the approach substantially reduces time and cost compared to traditional approaches. The authors also show that replacing the Arrhenius equation parameterization with a black-box regression (kernel ridge regression) yields comparable accuracy in mapping temperature to material properties, but loses the interpretability offered by the physical model.
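The step from inferred properties to a layer-by-layer temperature profile amounts to a small grid search per layer. The sketch below illustrates the idea only: every number in `inferred` is an invented placeholder (not an inferred or measured value from the study), the mapping from properties to layers in `LAYER_OBJECTIVES` is an assumption, and the efficiencies passed to `relative_gain` are hypothetical figures chosen to show how a relative gain differs from an absolute one.

```python
# INVENTED placeholder values per growth temperature (degC); not data from the study.
inferred = {
    530: {"bulk_lifetime_ns": 1.0, "front_srv_cm_s": 9.0e4, "rear_srv_cm_s": 6.0e4},
    580: {"bulk_lifetime_ns": 2.5, "front_srv_cm_s": 3.0e4, "rear_srv_cm_s": 4.0e4},
    630: {"bulk_lifetime_ns": 4.0, "front_srv_cm_s": 5.0e4, "rear_srv_cm_s": 1.0e4},
    650: {"bulk_lifetime_ns": 5.0, "front_srv_cm_s": 7.0e4, "rear_srv_cm_s": 2.0e4},
    680: {"bulk_lifetime_ns": 3.0, "front_srv_cm_s": 9.5e4, "rear_srv_cm_s": 5.0e4},
}

# Assumed mapping: which property governs which region, and whether to
# maximize it (lifetime) or minimize it (surface recombination velocity).
LAYER_OBJECTIVES = {
    "window/emitter interface": ("front_srv_cm_s", min),
    "bulk absorber": ("bulk_lifetime_ns", max),
    "BSF/base interface": ("rear_srv_cm_s", min),
}


def layer_temperature_profile(inferred, objectives):
    """Grid search: pick, for each layer, the temperature with the best property."""
    profile = {}
    for layer, (prop, pick) in objectives.items():
        profile[layer] = pick(inferred, key=lambda temp: inferred[temp][prop])
    return profile


def relative_gain(baseline, optimized):
    """Relative improvement, e.g. 0.065 for a 6.5% relative gain."""
    return (optimized - baseline) / baseline


profile = layer_temperature_profile(inferred, LAYER_OBJECTIVES)
for layer, temp in profile.items():
    print(f"{layer}: grow at {temp} degC")

# Hypothetical efficiencies: 20.0% -> 21.3% absolute is a 6.5% relative gain.
print(f"relative gain: {relative_gain(20.0, 21.3):.3f}")
```

With these placeholder numbers, each layer ends up with a different preferred temperature, which is the situation that makes a single constant growth temperature a compromise.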
Discussion
The study demonstrates an approach to optimizing the fabrication process of GaAs solar cells by integrating physics-based domain knowledge into a Bayesian network. The key contribution is the ability to identify and address layer-specific performance bottlenecks without relying on extensive secondary characterization or multiple fabrication iterations. The neural network surrogate model significantly accelerates the computation, making the approach practical for real-world applications. The 6.5% relative efficiency improvement exceeds what traditional grid search optimization achieved, demonstrating the effectiveness of the layer-by-layer optimization strategy. The results also show a significant reduction in the time and cost of photovoltaic device optimization. The methodology is potentially generalizable to other solar cell materials and multi-layer systems where physics-based or black-box relations exist between process variables, material properties, and device performance. The surrogate model could also serve as a faster alternative to conventional numerical models in closed-loop black-box optimization.
Conclusion
This research presents a Bayesian network approach for optimizing the layer-by-layer growth process of GaAs solar cells, resulting in a 6.5% relative efficiency improvement in just six MOCVD experiments. The integration of physics-informed relations and a fast neural network surrogate model significantly reduces time and cost compared to traditional methods. If it generalizes to other materials and systems, the approach could accelerate materials and device development in other fields. Future research could apply it to other photovoltaic technologies and extend it to optimize multiple process parameters simultaneously.
Limitations
While the study demonstrates significant progress, a few limitations should be considered. The accuracy of the inferred material properties relies heavily on the accuracy of the surrogate model and the initial parameterization of the Bayesian network. The study focuses primarily on growth temperature, and other process parameters may also play a significant role in overall device performance. The generalizability to other material systems might require modifications to the surrogate model and the physics-based constraints. Furthermore, the applicability to industrial-scale production needs further investigation, as the specific experimental setup might need adjustments for large-scale manufacturing.