Introduction
Tissue dynamics are crucial for various physiological functions and clinical diagnoses. High-resolution 3D imaging of these dynamics in real time, however, remains challenging due to limitations in accessibility and temporal/spatial resolution. This is particularly true for vocal-fold vibration during phonation, a critical aspect of voice production and diagnosis. Current methods, such as strobolaryngoscopy and high-speed endoscopy, often provide only 2D views, neglecting the important vertical component of vocal fold motion. While techniques like high-speed camera tracking, high-frame-rate ultrasound, and optical coherence tomography have been explored, they compromise on temporal or spatial resolution. Physics-informed neural networks (PINNs) offer a promising approach for reconstructing 3D fields from sparse measurements by incorporating physical laws into the network training. However, traditional PINNs struggle with the complexities of 3D flow-structure interactions (FSI) inherent in tissue dynamics. Four difficulties stand out: scalability to large 3D problems, capturing the non-smoothness of FSI interfaces in soft tissues, handling complex temporal dynamics, and establishing correspondence between network predictions and 2D imaging data. This study addresses these challenges by developing a hybrid PINN algorithm to reconstruct high-resolution 3D vocal-fold motion from 2D endoscopic imaging.
Literature Review
The paper reviews existing methods for capturing 3D tissue dynamics, highlighting the limitations of current imaging techniques in achieving real-time, high-resolution measurements, especially in applications such as vocal fold vibration analysis. It emphasizes the challenges in capturing the vertical component of vocal fold motion and the compromises made in terms of temporal and spatial resolution. The authors also discuss the potential of physics-informed neural networks (PINNs) as a novel approach to reconstruct 3D tissue dynamics from sparse or indirect measurements. However, they also highlight the limitations of traditional PINNs in handling the complexities of 3D flow-structure interactions, particularly in soft tissues. The review positions their proposed hybrid PINN algorithm as a solution to address these limitations.
Methodology
The researchers designed a hybrid PINN algorithm that integrates a recurrent neural network (RNN) model of 3D continuum soft tissue with a differentiable fluid solver. This approach tackles several key challenges:
1. **Scalability:** The algorithm leverages prior knowledge in solid mechanics by projecting the governing equation onto a reduced eigen space, significantly decreasing computational costs. Only the most important eigenmodes are used, minimizing error while improving training efficiency.
2. **Nonlinearity of FSI:** The algorithm handles the nonlinearity of flow-structure interaction through a novel loss function constructed purely from the residuals of the solid mechanics governing equations. A fully differentiable numerical fluid solver, integrated into the neural network, efficiently computes the fluid loading terms, ensuring end-to-end differentiability.
3. **Temporal Dependence:** A Long Short-Term Memory (LSTM)-based recurrent encoder-decoder network, connected to a fully connected neural network (FCNN), captures the temporal dependencies in the flow-structure interaction. This LSTM architecture learns the time history of modal coefficients, improving predictive accuracy for complex temporal dynamics.
4. **Correspondence between Data and Prediction:** The algorithm addresses the lack of direct correspondence between 2D imaging data and 3D network predictions by constructing a loss function based on the residuals of the projected 2D profiles, effectively assimilating the indirect measurements.
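The reduced eigenspace idea in item 1 can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a linear, undamped toy system (a chain of equal masses and springs with fixed ends) in place of the 3D tissue model, and the names `modal_basis`, `project`, and `reconstruct` are hypothetical. The point is only that a displacement field can be represented by a few modal coefficients instead of every nodal value.

```python
import numpy as np

def modal_basis(K, m_diag, n_modes):
    """Lowest n_modes mass-normalised modes of K phi = w^2 M phi (M diagonal)."""
    inv_sqrt_m = 1.0 / np.sqrt(m_diag)
    # Symmetric standard eigenproblem: A = M^(-1/2) K M^(-1/2)
    A = K * inv_sqrt_m[:, None] * inv_sqrt_m[None, :]
    evals, evecs = np.linalg.eigh(A)  # eigenvalues in ascending order
    Phi = inv_sqrt_m[:, None] * evecs[:, :n_modes]
    return evals[:n_modes], Phi

def project(Phi, m_diag, u):
    """Modal coefficients q = Phi^T M u."""
    return Phi.T @ (m_diag * u)

def reconstruct(Phi, q):
    """Displacement field recovered from modal coefficients."""
    return Phi @ q

# Toy system (an assumption for illustration): 50 masses, fixed ends.
n = 50
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)
m_diag = np.full(n, 2.0)

# A displacement made of the first and third mode shapes of this chain.
x = np.linspace(0.0, 1.0, n + 2)[1:-1]
u = np.sin(np.pi * x) + 0.2 * np.sin(3.0 * np.pi * x)

def relative_error(n_modes):
    _, Phi = modal_basis(K, m_diag, n_modes)
    u_hat = reconstruct(Phi, project(Phi, m_diag, u))
    return np.linalg.norm(u - u_hat) / np.linalg.norm(u)

err1 = relative_error(1)  # one mode misses the third-mode content
err3 = relative_error(3)  # three modes capture this field
_, Phi3 = modal_basis(K, m_diag, 3)
```

In this toy case 50 nodal unknowns collapse to 3 coefficients. In a setting like the one summarized above, a network would predict the time history of such coefficients rather than the full field, which is where the computational saving would come from.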
The algorithm was validated using synthetic data generated from a high-fidelity simulation of vocal fold dynamics in a canine larynx model reconstructed from MRI scans. The model included major cartilages, intrinsic muscles, and vocal-fold tissues. Twenty time-labeled 2D glottal shapes extracted from a simulation cycle served as training data for the PINN. The accuracy of the 3D deformation fields, aerodynamics (mean and maximum flow rate, intraglottal pressure), and acoustics (sound pressure level (SPL) and acoustic power) was evaluated. The algorithm was further validated using experimental data from excised pigeon syringes, focusing on comparing predicted acoustic quantities (SPL and acoustic power) to experimental measurements. The entire algorithm, from data input to loss function calculation, is implemented in a differentiable programming framework, allowing for end-to-end training and efficient gradient backpropagation.
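The projected-profile loss from item 4 of the list above can be sketched as follows. This is a simplified illustration and not the paper's formulation: it assumes the medial surface of one fold is stored on a structured grid as `x_medial[i_y, i_z]` (medial-lateral position at each anterior-posterior station and height), and that a view from above sees only the most medial point at each station. The function names are hypothetical.

```python
import numpy as np

def projected_edge(x_medial):
    """2D glottal edge seen from above: most medial point over height, per station."""
    return x_medial.min(axis=1)

def profile_loss(x_medial_pred, edge_observed):
    """Mean squared mismatch between the projected prediction and the 2D edge."""
    edge_pred = projected_edge(x_medial_pred)
    return float(np.mean((edge_pred - edge_observed) ** 2))

# Toy surface (an assumption for illustration): 5 stations by 4 heights.
y = np.linspace(0.0, 1.0, 5)
z = np.linspace(0.0, 1.0, 4)
Y, Z = np.meshgrid(y, z, indexing="ij")
x_true = 0.1 + 0.05 * np.sin(np.pi * Y) + 0.02 * (Z - 0.5) ** 2

edge_obs = projected_edge(x_true)                   # stands in for a segmented image
loss_exact = profile_loss(x_true, edge_obs)         # perfect prediction
loss_shift = profile_loss(x_true + 0.1, edge_obs)   # uniform 0.1 lateral offset
```

Because the projection discards height, different 3D surfaces can produce the same 2D edge. A data term of this kind therefore cannot fix the vertical motion on its own, which is consistent with the summary's emphasis on combining it with the physics residual.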
Key Findings
The algorithm demonstrated high accuracy in reconstructing 3D vocal fold dynamics. For the synthetic canine larynx data, the prediction error of the 3D vocal fold shapes over one vibration cycle was between 2.0% and 5.1%, with a mean of 3.8% and a standard deviation of 0.97%. The prediction errors for key aerodynamic and acoustic quantities were all within 5%, highlighting the accuracy of the algorithm in predicting various aspects of vocal fold behavior. The results demonstrate accurate prediction of 3D vibratory dynamics, including vertical velocity contours. Comparisons between predicted and simulated data showed excellent agreement for glottal flow rate, intraglottal pressure distribution, peak flow rate, mean flow rate, mean intraglottal pressure, SPL, and acoustic power. Validation against experimental data from excised pigeon syringes, while lacking full 3D dynamic data, showed good agreement between predicted and measured SPL and acoustic power, indicating accurate prediction of syringeal dynamics. The hybrid PINN effectively captures the 3D nature of the vocal fold/syringeal vibrations, providing insights into the 3D shapes and associated flow rate dynamics. The study confirms the potential of this algorithm for predicting multiple physical quantities (tissue stress, glottal flow rate) typically difficult or impossible to measure experimentally.
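For readers unfamiliar with the acoustic metric, the sketch below shows the standard definition of sound pressure level, SPL = 20 log10(p_rms / p_ref) with p_ref = 20 µPa in air, together with a simple percentage error. The summary does not state how the study computed either quantity, so both functions and the synthetic test signal are illustrative assumptions.

```python
import numpy as np

P_REF = 20e-6  # standard reference pressure in air, in pascals

def sound_pressure_level(pressure):
    """SPL in dB from a sampled acoustic pressure signal in pascals."""
    p_rms = np.sqrt(np.mean(np.square(pressure)))
    return 20.0 * np.log10(p_rms / P_REF)

def percent_error(predicted, reference):
    """Absolute error of a scalar prediction, as a percentage of the reference."""
    return 100.0 * abs(predicted - reference) / abs(reference)

# Synthetic signal (an assumption): 100 Hz tone, 1 Pa amplitude, 1 s at 10 kHz.
fs = 10_000
t = np.arange(fs) / fs
p = np.sin(2.0 * np.pi * 100.0 * t)

spl = sound_pressure_level(p)       # about 91 dB for a 1 Pa amplitude tone
err = percent_error(96.0, 100.0)    # a 4% error would fall within a 5% bound
```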
Discussion
The findings demonstrate that the proposed hybrid PINN algorithm successfully reconstructs high-resolution 3D tissue dynamics from sparse 2D imaging data. The algorithm's ability to accurately predict not only 3D shapes but also various aerodynamic and acoustic quantities is significant. This overcomes limitations of traditional imaging techniques, which often capture only 2D information and compromise on resolution or data quantity. The integration of physics-based models and data-driven learning strategies enhances the accuracy and generalizability of the predictions. The successful application to both synthetic and experimental data suggests a broad applicability of this method in various fields related to soft tissue dynamics. The algorithm's potential to infer otherwise unmeasurable physical quantities, such as glottal flow rate and vocal fold stresses, opens new avenues for research and diagnosis in voice disorders.
Conclusion
This study introduces a novel hybrid PINN algorithm capable of accurately reconstructing 3D soft tissue dynamics from limited 2D imaging data. The algorithm's success in both synthetic and experimental validations, coupled with its capacity to predict a range of unmeasurable physical quantities, positions it as a powerful tool for advancing research in fields involving 3D flow-structure interactions. Future work could include expanding the algorithm to infer material properties, incorporating more comprehensive flow solvers, utilizing multimodal inputs, and conducting further validation studies on human laryngeal data.
Limitations
The current algorithm requires a priori knowledge of material properties for eigenmode computation, limiting its direct application to in vivo studies. The use of a 1D flow model for glottal aerodynamics simplifies the flow dynamics, which may not be suitable for all applications. While the study shows promising results, further validation against diverse experimental and clinical datasets is needed to establish its robustness and generalizability across different tissue types and physiological conditions. The algorithm's reliance on accurate 2D profile segmentation from imaging data also affects the overall accuracy.