Introduction
Inaccurate gestational age (GA) estimation hinders evidence-based pregnancy care, particularly in low- and middle-income countries (LMICs). Accurate GA is vital for individual clinical decision-making and population-level analyses of infant mortality and morbidity factors like preterm birth. Currently, first-trimester fetal crown-rump length (CRL) measurement by ultrasound is the gold standard, but many women in LMICs seek antenatal care later in pregnancy. Relying on the last menstrual period (LMP) is unreliable due to inaccurate recall or irregular cycles. Second and third-trimester GA estimation often utilizes symphysis-fundal height (SFH) or ultrasound-based fetal biometry, but these methods are flawed because they assume average fetal size. This assumption leads to increased error as pregnancy progresses due to normal fetal size variation and pathological growth aberrations (small for gestational age (SGA) and large for gestational age (LGA)). This study proposes machine learning as an alternative, leveraging its effectiveness in ultrasound image analysis tasks like image registration, classification, and regression. Existing automated methods that derive biometric measures from ultrasound planes often produce uncertainties similar to clinical measurements or are limited to single fetal planes or videos. The study aims to determine if machine learning can accurately estimate GA using only image characteristics from standard ultrasound planes (head circumference (HC), abdominal circumference (AC), and femur length (FL)) in the second and third trimesters without relying on size-based information.
Literature Review
The introduction adequately reviews the existing literature on gestational age estimation, highlighting the limitations of current methods, particularly in later pregnancy. It discusses the challenges of relying on LMP and the inherent inaccuracies of biometry-based approaches due to normal fetal size variability and the presence of SGA and LGA. The review also touches upon the successful applications of machine learning in medical image analysis, positioning the current study within this context while also noting the limitations of current automated fetal biometric measurement methods.
Methodology
The study developed a machine learning model using ultrasound images from two independent datasets: the INTERGROWTH-21st Fetal Growth Longitudinal Study (FGLS) for training and internal validation, and the INTERBIO-21st Fetal Study for external validation. The ground truth GA was based on a reliable LMP confirmed by first-trimester CRL, and the model was blinded to it during validation. All measurement and scale information was removed from the images, and preprocessing also covered artifact removal, resizing, and intensity normalization. The network was a convolutional neural network (CNN) modified from ResNet-50. Each single-plane model (HC, AC, FL) was first pre-trained with a Consistent Ordinal RAnk Logits (CORAL) loss, a "binned" classification loss that retains ordinal information and avoids the instability of training a network from scratch with a regression loss. The plane-specific pre-trained models then served as automated feature extractors: the MultiPlane model concatenated their final layers and was fine-tuned with an L1 loss. INTERGROWTH-21st data were split 75%, 15%, and 10% for training, validation, and testing, respectively, while INTERBIO-21st data were used solely for external validation to assess generalizability. Performance was evaluated by mean absolute error (MAE) and by the percentage of estimates within ±7 days of the gold standard. Sub-analyses were conducted on SGA and LGA fetuses, and performance was compared with existing biometry-based methods (Hadlock and INTERGROWTH-21st).
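The "binned" ordinal loss described above can be illustrated with a minimal, dependency-free sketch of the general CORAL formulation: a rank among K bins is encoded as K-1 binary targets ("is the rank greater than k?"), one shared logit plus K-1 bias terms gives the probability for each threshold, and the loss is the sum of the binary cross-entropies. The weekly bin width, the 13⁺⁰ week starting point for the bins, and all function names are assumptions made for illustration; the summary does not state the binning the authors used.

```python
import math

def ga_to_rank(ga_days, min_days=91, bin_width=7, num_bins=30):
    """Map GA in days to an ordinal bin index.

    Assumption: weekly bins starting at 13+0 weeks (91 days), which gives
    30 bins up to 42+0 weeks. The study's actual binning is not given here.
    """
    rank = (ga_days - min_days) // bin_width
    return max(0, min(num_bins - 1, rank))

def coral_targets(rank, num_bins):
    """Extended binary labels: target k is 1 if rank > k, for k = 0..num_bins-2."""
    return [1.0 if rank > k else 0.0 for k in range(num_bins - 1)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def coral_loss(shared_logit, biases, targets):
    """Sum of binary cross-entropies over the K-1 threshold tasks."""
    loss = 0.0
    for bias, target in zip(biases, targets):
        p = sigmoid(shared_logit + bias)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        loss -= target * math.log(p) + (1.0 - target) * math.log(1.0 - p)
    return loss

def coral_predict(shared_logit, biases):
    """Predicted rank = number of thresholds with P(rank > k) above 0.5."""
    return sum(1 for bias in biases if sigmoid(shared_logit + bias) > 0.5)

# Toy example with 5 bins (4 thresholds) and hand-picked, illustrative biases.
biases = [2.0, 1.0, -1.0, -2.0]
print(coral_targets(2, 5))         # binary encoding of rank 2
print(coral_predict(0.0, biases))  # rank implied by the threshold probabilities
```

Because every threshold shares the same logit and only the bias differs, the threshold probabilities stay ordered whenever the biases are ordered, which is how this kind of loss keeps the ordinal structure that a plain multi-class loss would discard.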
Key Findings
The MultiPlane model estimated GA more accurately than existing methods, especially in later pregnancy. In the INTERGROWTH-21st internal validation set, it achieved an MAE of 3.5 days across all gestational ages (13⁺⁰ to 42⁺⁰ weeks), with 90.7% of estimates within ±7 days of the gold standard. In the second trimester (18⁺⁰ to 27⁺⁶ weeks), the MAE was 3.0 days, with 94.5% of estimates within ±7 days. In the third trimester (28⁺⁰ to 42⁺⁰ weeks), the MAE was 4.3 days, with 85.6% within ±7 days. External validation on the INTERBIO-21st dataset yielded an MAE of 4.1 days across all gestational ages, with 85.1% of estimates within ±7 days, and an MAE of 3.7 days with 88.1% within ±7 days in the second trimester. The MultiPlane model significantly outperformed the biometry-based methods (Hadlock and INTERGROWTH-21st), particularly beyond 32 weeks' gestation. It was also robust for growth-aberrant fetuses: for SGA newborns, the MAE was 3.7 days in the second trimester and 4.7 days in the third; for LGA newborns, it was 4.6 and 5.1 days, respectively. Sub-analysis by site in the INTERBIO-21st study showed consistently high performance across all sites, with differences in MAE that were not clinically significant.
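The two reported metrics are simple to compute once GA is expressed in days. The sketch below shows MAE and the share of estimates within ±7 days, plus a helper for the weeks⁺days notation used above. The function names and the toy predictions are illustrative assumptions, not values from the study.

```python
def weeks_days_to_days(weeks, days=0):
    """Convert the weeks+days notation (e.g. 27+6) to a day count."""
    return weeks * 7 + days

def mean_absolute_error(predicted, actual):
    """Mean absolute difference between estimated and gold-standard GA, in days."""
    return sum(abs(p - a) for p, a in zip(predicted, actual)) / len(actual)

def fraction_within(predicted, actual, tolerance_days=7):
    """Share of estimates whose error is at most tolerance_days."""
    hits = sum(1 for p, a in zip(predicted, actual) if abs(p - a) <= tolerance_days)
    return hits / len(actual)

def in_second_trimester(ga_days):
    """Second-trimester window as defined in this summary: 18+0 to 27+6 weeks."""
    return weeks_days_to_days(18) <= ga_days <= weeks_days_to_days(27, 6)

# Toy data (made up for illustration): gold-standard and estimated GA in days.
actual = [103, 150, 200]
predicted = [100, 150, 210]

print(mean_absolute_error(predicted, actual))
print(fraction_within(predicted, actual))

# Restrict to second-trimester scans before scoring, as in the sub-analyses.
pairs = [(p, a) for p, a in zip(predicted, actual) if in_second_trimester(a)]
print(len(pairs))
```

Stratifying by the gold-standard GA rather than by the estimate keeps the trimester groups independent of the model's own errors.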
Discussion
The study successfully demonstrates that a machine learning model can accurately estimate fetal GA using only image characteristics from standard ultrasound planes, without relying on biometric measurements. This addresses the limitations of current biometry-based methods, particularly the increased error in later pregnancy. The superior performance of the MultiPlane model, even in SGA and LGA cases, highlights its potential to improve GA estimation accuracy and inform clinical decision-making. The findings are significant because they provide a more accurate and robust method for GA estimation, especially in LMICs where access to first-trimester scans is limited and reliance on LMP is problematic. The model's real-time inference capability also suggests practical applicability.
Conclusion
This study introduces a novel, accurate, and robust method for estimating fetal gestational age using machine learning and standard ultrasound images. The MultiPlane model significantly outperforms existing biometry-based methods, particularly in late pregnancy and in cases of growth aberrations. This has substantial implications for improving obstetric care, especially in resource-limited settings. Future research should focus on validating the model's performance with different ultrasound machines and exploring its integration with automated plane detection methods to enhance its usability.
Limitations
The study's data were acquired using the same type of ultrasound machine, potentially limiting the generalizability of the model to other machines. Although the model performed well in SGA and LGA cases, further analysis is needed to fully understand its performance across a broader range of fetal abnormalities. The model's reliance on the availability of standard anatomical planes obtained by trained sonographers also represents a potential implementation challenge in resource-constrained settings.