Introduction
The development of non-invasive diagnostic methods is crucial in modern medicine. Exhaled breath analysis is a promising avenue because sample collection is comfortable and readily accessible. While some non-invasive breath tests are established (e.g., the 13C-urea breath test for Helicobacter pylori), diagnosing high-mortality diseases like lung cancer often relies on procedures such as biopsy and low-dose computed tomography (LDCT), both of which carry significant drawbacks. Early-stage lung cancer is frequently asymptomatic, which makes early detection challenging. The research therefore focuses on developing a non-invasive and accurate method for lung cancer diagnosis using exhaled breath analysis, aiming to improve early detection rates and minimize the need for invasive techniques. Gas chromatography-mass spectrometry (GC-MS) was selected as the analytical method because it can analyze breath samples both qualitatively and quantitatively, identifying potential volatile organic compound (VOC) biomarkers that could differentiate between lung cancer patients and healthy individuals. The study acknowledges the variability of lung cancer and the potential confounding effects of comorbidities and treatment, which are investigated as factors that may influence the results.
Literature Review
The literature review highlights the existing research on exhaled breath analysis for lung cancer diagnosis, noting inconsistencies in findings across studies. Differences in analytical conditions, participant groups, biomarker selection, and machine learning algorithms employed have contributed to the variability in results. Previous studies have explored exhaled breath analysis using various techniques, such as GC-MS, electronic noses, and ion mobility spectrometry. The lack of consistency emphasizes the need for standardized methodologies and a thorough investigation of confounding factors, including the influence of treatment and comorbidities.
Methodology
The study involved 112 lung cancer patients and 120 healthy volunteers. Exhaled breath samples were collected in pre-cleaned Tedlar bags, with precautions taken to minimize sample contamination. Phenol and N,N-dimethylacetamide were excluded due to potential contamination from the sampling bags. Samples were analyzed using GC-MS, with VOCs preconcentrated using Tenax TA sorbent tubes. A total of 205 VOCs were identified. Statistical analysis focused on VOC ratios rather than absolute levels to mitigate the effects of individual VOC variability; the most frequently occurring compounds were used as denominators to avoid division by zero. Spearman's rank correlation test was used to assess correlations between VOC ratios and factors such as disease status, tumor localization (central vs. peripheral), histological tumor type (ranked by malignancy), TNM stage, treatment status (before vs. during chemotherapy), and comorbidities. A power analysis determined a required sample size of 221, and 232 samples were collected. The dataset was split into training (70%) and test (30%) sets. Gradient boosted decision trees (GBDT) and artificial neural networks (ANN) were used to build diagnostic models, and 3-fold cross-validation was applied to assess model robustness. The importance of different VOC ratios in the predictive models was assessed using GBDT.
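The ratio and correlation steps described above can be illustrated with a minimal sketch. It is not the study's code: the function names, the peak-area values, and the disease-status labels are all hypothetical, and the choice to treat a missing compound as a ratio of zero is an assumption made only for this example.

```python
from collections import Counter


def most_frequent_compounds(samples, top_n=1):
    """Return the compounds detected (peak area > 0) in the most samples."""
    counts = Counter(
        name for sample in samples for name, area in sample.items() if area > 0
    )
    return [name for name, _ in counts.most_common(top_n)]


def voc_ratios(sample, denominators):
    """Peak-area ratios of each compound to each denominator compound.

    A denominator that is absent from the sample is skipped, so no
    division by zero can occur.
    """
    ratios = {}
    for den in denominators:
        den_area = sample.get(den, 0.0)
        if den_area <= 0:
            continue
        for name, area in sample.items():
            if name != den:
                ratios[f"{name}/{den}"] = area / den_area
    return ratios


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; returns NaN if either input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    if sx == 0 or sy == 0:
        return float("nan")
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical peak areas (arbitrary units) and disease labels (1 = patient).
samples = [
    {"acetone": 100.0, "isoprene": 50.0, "toluene": 5.0},
    {"acetone": 80.0, "isoprene": 20.0},
    {"acetone": 120.0, "toluene": 12.0},
    {"acetone": 90.0, "isoprene": 45.0, "toluene": 0.0},
]
status = [1, 0, 1, 0]

denominators = most_frequent_compounds(samples, top_n=1)
all_ratios = [voc_ratios(s, denominators) for s in samples]
# Assumption for this sketch: an undetected compound contributes a ratio of 0.
toluene_ratio = [r.get("toluene/acetone", 0.0) for r in all_ratios]
rho = spearman_rho(toluene_ratio, status)
```

Choosing the most frequently detected compound as the denominator means the ratio is defined for nearly every sample, which is the motivation the text gives for this design. The sketch reports only the correlation coefficient; a significance test, as used in the study, would need an additional step.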
Key Findings
The GC-MS analysis identified 205 VOCs in the exhaled breath samples. Several ratios involving 2-heptanone correlated significantly with treatment status (before or during chemotherapy). No significant correlations were found between VOC profiles and comorbidities (anemia, acute cerebrovascular accident), although the benzaldehyde/acetonitrile and benzaldehyde/2,3-butanedione ratios correlated with obesity. 1-Pentanol and several ratios involving it correlated with tumor localization. Statistically significant correlations were found between TNM stage and certain VOC peak areas and ratios. No individual VOC correlated significantly with histological tumor type; however, some ratios (predominantly involving dimethyl trisulfide and 3-heptanone) did. Several VOCs and ratios differed significantly between lung cancer patients and healthy individuals. Diagnostic models were created using GBDT and ANN. On the test dataset, ANN models performed better than GBDT models, reaching 82-88% sensitivity and 80-86% specificity. GBDT analysis revealed the importance of different VOC ratios in the diagnostic models, with dimethyl trisulfide/dimethyl disulfide and isoprene/acetone being the most important. The toluene/acetonitrile, hexane/acetonitrile, and pentanal/isoprene ratios correlated significantly with disease status, consistent with previous research.
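For readers unfamiliar with the two performance measures quoted above, the following minimal sketch shows how sensitivity and specificity are computed from true and predicted labels. The function name and the label vectors are hypothetical and are not taken from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels (1 = patient, 0 = healthy).

    Sensitivity = TP / (TP + FN): share of patients correctly identified.
    Specificity = TN / (TN + FP): share of healthy subjects correctly identified.
    """
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical test-set labels: 4 patients followed by 5 healthy volunteers.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
```

Reporting both values matters for a diagnostic test because a model can score well on one while failing on the other, for example by labeling every subject as a patient.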
Discussion
The study's findings contribute to the understanding of exhaled breath VOCs as potential lung cancer biomarkers, acknowledging the influence of various factors. The observed correlations between VOC profiles and treatment status, tumor characteristics, and comorbidities highlight the complexity of developing a robust diagnostic tool. The strong performance of ANN models in distinguishing between lung cancer patients and healthy individuals using selected VOC ratios is promising. The study's focus on VOC ratios, which are less susceptible to individual variation than absolute VOC levels, provides a more reliable approach. However, the impact of treatment and comorbidities needs further investigation to refine the diagnostic model's accuracy and generalizability. The differences in VOC profiles observed in relation to tumor localization and TNM stage warrant further research to confirm these findings and understand their mechanistic basis.
Conclusion
This research demonstrated that exhaled breath analysis, using specific VOC ratios, holds promise for non-invasive lung cancer diagnosis. The ANN-based diagnostic model showed high sensitivity and specificity. The study also highlighted the significant influence of comorbidities and treatment status on exhaled breath VOC profiles, emphasizing the need for careful consideration of these factors in future diagnostic model development. Further research, employing larger and more diverse cohorts, is necessary to validate these findings and improve the model's clinical utility.
Limitations
The study has several limitations. The effect of other lung comorbidities was not studied, and the number of patients with certain comorbidities was relatively low, which limits the robustness of the analysis of their influence. Not all possible comorbidities were considered. While efforts were made to control for confounding factors (e.g., age, smoking), residual confounding effects are still possible. Further research with larger, more diverse populations and a broader range of comorbidities is needed to confirm and expand these findings.