Non-Invasive Lung Cancer Diagnostics through Metabolites in Exhaled Breath: Influence of the Disease Variability and Comorbidities

A. Z. Temerdashev, E. M. Gashimova, et al.

This study examines non-invasive lung cancer diagnosis using volatile organic compounds (VOCs) present in exhaled breath. Using gas chromatography-mass spectrometry, the researchers identified 205 VOCs and found statistically significant correlations between VOC ratios and disease status, tumor characteristics, treatment status, and comorbidities. Diagnostic models built on 12 VOC ratios reached test-set sensitivities of 82–88% and specificities of 80–86%, while showing how patient variability complicates lung cancer screening. This research was conducted by A. Z. Temerdashev, E. M. Gashimova, V. A. Porkhanov, I. S. Polyakov, D. V. Perunov, and E. V. Dmitrieva.

Introduction
The study addresses the urgent need for accurate, simple, and non-invasive diagnostic methods for lung cancer, a leading cause of mortality that is typically detected via invasive or harmful procedures such as low-dose computed tomography and biopsy. Exhaled breath analysis offers a painless, easily repeatable source of metabolic information. The research evaluates volatile organic compounds (VOCs) in exhaled breath using GC-MS to develop diagnostic models for lung cancer and investigates how disease variability (tumor stage, histology, localization), treatment status, and non-pulmonary comorbidities influence breath metabolite profiles, with the goal of improving robustness and clinical utility of breath-based diagnostics.
Literature Review
Prior work demonstrates promise for exhaled breath analysis in detecting lung cancer using various analytical platforms (GC-MS, ion mobility spectrometry, PTR-MS) and sensor-based electronic noses, yet reported biomarker panels and model performances vary widely across studies due to differing cohorts, sampling and analytical conditions, and algorithms. Some studies explored treatment effects mainly around surgery; others reported variable associations of VOCs with tumor stage or histology. The authors previously proposed using VOC ratios rather than absolute peak areas to enhance robustness across analytical conditions and cohorts. This study extends that approach to a larger cohort and systematically examines confounders, including non-pulmonary comorbidities and tumor characteristics, which are often underexplored.
Methodology
Participants: The study enrolled 232 subjects (112 lung cancer patients and 120 healthy volunteers). Healthy status was based on an annual physical exam and lung fluorography to exclude lung pathologies and inflammation; lung cancer diagnoses were biopsy-confirmed. Patients with other lung comorbidities were excluded. Most patients were under treatment (chemotherapy n=88, immunotherapy n=7, targeted therapy n=1); the remaining patients provided samples before treatment. All participants gave informed consent.
Sampling: Mixed expiratory breath was collected into 5-L Tedlar bags pre-cleaned with nitrogen. Ambient air was sampled on the same day to account for exogenous compounds. Participants fasted overnight, and active smokers were sampled ≥2.5 h after smoking. To standardize sampling without causing discomfort, a 10-s breath-hold and deep breathing were used, but exact flow rate and anatomic dead space were not controlled; the identical procedure was applied to both cohorts. After a 10-min rest, subjects exhaled repeatedly into the bag until it was filled. Samples were stored in the bags for ≤6 h. Based on prior work, phenol and N,N-dimethylacetamide were excluded as potential biomarkers because their levels increased in the bags after 2 h.
Preconcentration and GC-MS: VOCs were preconcentrated by passing 0.5 L of breath at 200 mL/min through Tenax TA (0.4 g, 60–80 mesh) sorbent tubes, then analyzed using a Chromatec Crystal 5000.2 GC coupled to a Chromatec MSD (EI, 70 eV) with a two-stage thermal desorber (TD2). Column: Supelco Supel-Q PLOT (30 m × 0.32 mm × 15 µm). Carrier gas: helium at 1.30 mL/min. Oven program: 50°C start; 10°C/min to 150°C; 6°C/min to 220°C; 4°C/min to 250°C. Compounds were identified using analytical standards where available, and otherwise by the NIST 2017 library, accepting spectra with a match factor ≥85%. Detailed TD and MS settings are reported (e.g., injector 250°C, source 200°C, transfer line 250°C, full scan 29–250 amu; TD valve 150°C, desorption 250°C; trap −10 to 250°C).
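As a quick sanity check on the timings implied by these settings, the sketch below works out how long it takes to load 0.5 L of breath at 200 mL/min and how long the three oven ramps last. The function name `ramp_minutes` is hypothetical, and because the summary does not state any isothermal hold times, the result covers the ramps only and should be read as a lower bound on the actual run time.

```python
def ramp_minutes(start_c, segments):
    """Total time spent ramping, given (rate in °C/min, target in °C) segments.

    Hold times are not given in the summary, so they are assumed to be zero here.
    """
    total = 0.0
    temp = start_c
    for rate_c_per_min, target_c in segments:
        total += (target_c - temp) / rate_c_per_min
        temp = target_c
    return total


# Oven program from the text: 50°C start, then three ramps.
oven_ramp_min = ramp_minutes(50, [(10, 150), (6, 220), (4, 250)])

# Sorbent tube loading: 0.5 L (500 mL) of breath at 200 mL/min.
loading_min = 500 / 200

print(f"ramp time: {oven_ramp_min:.2f} min, tube loading: {loading_min:.1f} min")
```

Under these assumptions the ramps alone take a little over 29 minutes per sample, and loading each sorbent tube takes 2.5 minutes.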
Data processing and statistics: Chromatograms were collected in full-scan mode and quantified via extracted ion chromatograms. Room air peak areas were subtracted from breath peak areas, and negative values were set to zero. Only VOCs with breath peak areas ≥20% above ambient air and present in >50% of samples were included in statistics. Ratios were calculated using frequently occurring VOCs (e.g., acetone, isoprene, and dimethyl sulfide at 100% occurrence; the top 10 most frequent VOCs served as denominators) to minimize division by zero and enhance robustness. Normality was tested with the Kolmogorov–Smirnov test; because distributions were non-normal, Spearman’s rank correlation (α = 0.05) was used to assess associations between VOCs/ratios and disease status, tumor localization (central vs peripheral), histological malignancy ordering (squamous cell carcinoma < adenocarcinoma < small cell carcinoma), and TNM stage. A preliminary power analysis indicated that N=221 was needed for 85% power at a correlation of 0.2 (α=0.05); the study included N=232.
Machine learning: The dataset was split into 70% training and 30% test sets, with 3-fold cross-validation. Two classifiers were built using the same 12 selected VOC ratios as inputs: gradient boosted decision trees (GBDT) and an artificial neural network (ANN; a multilayer perceptron with one hidden layer of 5 neurons, BFGS training, and an output layer with two neurons representing the class labels). Sensitivity and specificity were computed for the training and test sets.
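The preprocessing steps described above (ambient subtraction, the occurrence filter, and ratio calculation) can be sketched in plain Python as follows. This is an illustration rather than the authors' code: the function names, the toy peak areas, and the exact reading of the 20% rule (breath area at least 1.2 times the ambient area in a given sample) are assumptions.

```python
def subtract_ambient(breath, ambient):
    """Subtract room-air peak areas from breath peak areas; negatives become zero."""
    return {voc: max(area - ambient.get(voc, 0.0), 0.0) for voc, area in breath.items()}


def is_detected(breath_area, ambient_area, min_excess=0.20):
    """Assumed reading of the 20% rule: breath exceeds ambient by at least 20%."""
    return breath_area > 0 and breath_area >= (1.0 + min_excess) * ambient_area


def frequent_vocs(samples, min_occurrence=0.50):
    """VOCs detected in more than `min_occurrence` of the (breath, ambient) pairs."""
    counts = {}
    for breath, ambient in samples:
        for voc, area in breath.items():
            if is_detected(area, ambient.get(voc, 0.0)):
                counts[voc] = counts.get(voc, 0) + 1
    return sorted(voc for voc, n in counts.items() if n / len(samples) > min_occurrence)


def voc_ratios(corrected, numerators, denominators):
    """Pairwise ratios; pairs with a zero denominator are skipped, not divided."""
    ratios = {}
    for num in numerators:
        for den in denominators:
            if num != den and corrected.get(den, 0.0) > 0:
                ratios[f"{num}/{den}"] = corrected.get(num, 0.0) / corrected[den]
    return ratios


# Toy (breath, ambient) peak areas for three hypothetical subjects.
samples = [
    ({"acetone": 900.0, "isoprene": 400.0, "hexane": 50.0},
     {"acetone": 100.0, "isoprene": 0.0, "hexane": 45.0}),
    ({"acetone": 700.0, "isoprene": 300.0, "hexane": 80.0},
     {"acetone": 100.0, "isoprene": 0.0, "hexane": 20.0}),
    ({"acetone": 500.0, "isoprene": 250.0, "hexane": 10.0},
     {"acetone": 100.0, "isoprene": 50.0, "hexane": 30.0}),
]

kept = frequent_vocs(samples)                   # VOCs passing the occurrence filter
corrected = [subtract_ambient(b, a) for b, a in samples]
ratios = voc_ratios(corrected[0], kept, kept)   # ratios for the first subject
print(kept, ratios)
```

Restricting denominators to VOCs that occur in nearly every sample, as the study does, keeps most ratios defined; the explicit zero check here covers the remaining cases.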
Key Findings
- Cohort and VOC detection: 112 lung cancer patients and 120 healthy volunteers were analyzed. A total of 205 VOCs were identified; those present in >50% of samples were used for statistical analyses (frequency table provided). Frequently detected VOCs included isoprene, acetone, dimethyl sulfide (100% in both groups), and others.
- Treatment effects: Ratios involving 2-heptanone were modestly but significantly associated with treatment status (before vs under chemotherapy), e.g., 2-heptanone/1-methylthiopropane (r = −0.196), 2-heptanone/1-methylthiopropene (r = −0.206), 2-heptanone/dimethyl disulfide (r = −0.202).
- Comorbidities: No significant associations were found with anemia or acute cerebrovascular accident. Obesity correlated with benzaldehyde/acetonitrile (r = 0.237) and benzaldehyde/2,3-butanedione (r = 0.240). Diabetes correlated with 2-butanone (r = 0.245) and benzaldehyde ratios (e.g., benzaldehyde/dimethyl sulfide r = 0.230). Chronic heart failure and hypertension both showed significant negative correlations with toluene and several toluene-based ratios (e.g., toluene r = −0.220 and −0.268; toluene/acetonitrile r = −0.196 and −0.237; toluene/isoprene r = −0.214 and −0.257).
- Tumor localization: Significant correlations were found for 1-pentanol (r = 0.222), 1-pentanol/2,3-butanedione (r = 0.262), 1-pentanol/isoprene (r = 0.210), 1-pentanol/acetone (r = 0.193), dimethyl disulfide/acetonitrile (r = 0.196), and 2-butanone/isoprene (r = 0.191).
- TNM stage: Significant correlations with TNM stage included 2,3-butanedione (r = 0.343), octane (r = 0.272), and multiple octane-based ratios (up to r = 0.375), while dimethyl trisulfide (DMTS) and its ratios were negatively correlated (e.g., DMTS r = −0.235; DMTS/acetone r = −0.272; DMTS/2-pentanone r = −0.279). Benzaldehyde/acetonitrile (r = 0.249) and 2,3-butanedione/2-pentanone (r = 0.380) were also associated.
- Histological type (ordered by malignancy): No single VOC peak area correlated significantly, but several ratios did, including octane/acetone (r = 0.207), multiple 3-heptanone ratios (r ≈ 0.229–0.235), and DMTS-related ratios (r ≈ 0.199–0.256).
- Disease status (LC vs healthy): Significant correlations were observed for peak areas of acetone (r = −0.163), 1-methylthiopropene (r = 0.140), 2-pentanone (r = 0.244), hexane (r = −0.287), toluene (r = 0.249), pentanal (r = −0.254), and dimethyl trisulfide (r = 0.260). Many ratios differed; the 12 with the highest discriminative power were selected for modeling, including hexane/2-pentanone (r = −0.309), pentanal/2-pentanone (r = −0.346), 2-butanone/2-pentanone (r = −0.320), dimethyl trisulfide/dimethyl disulfide (r = 0.271), acetonitrile/acetone (r = −0.269), isoprene/acetone (r = 0.227), and others. None of the selected ratios correlated significantly with age within groups.
- Machine learning performance (3-fold):
  • GBDT training: sensitivity 92–96%, specificity 82–92%; test: sensitivity 77–88%, specificity 68–81%.
  • ANN training: sensitivity 87–89%, specificity 75–85%; test: sensitivity 82–88%, specificity 80–86% (best overall test accuracy).
  Variable importance from GBDT highlighted dimethyl trisulfide/dimethyl disulfide and isoprene/acetone as most influential, and hexane/2-pentanone as least influential.
- Robust biomarkers: Consistent disease-status correlations with toluene/acetonitrile (positive), hexane/acetonitrile (negative), and pentanal/isoprene (negative) replicated prior findings, supporting their robustness.
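For readers less familiar with the reported metrics, the sketch below shows how sensitivity and specificity are computed from true and predicted labels. The labels are invented for illustration (1 = lung cancer, 0 = healthy) and do not reproduce any fold from the study; the function name `sensitivity_specificity` is likewise hypothetical.

```python
def sensitivity_specificity(y_true, y_pred, positive=1):
    """Return (sensitivity, specificity) for binary labels.

    Sensitivity = TP / (TP + FN): share of patients correctly flagged.
    Specificity = TN / (TN + FP): share of healthy subjects correctly cleared.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return tp / (tp + fn), tn / (tn + fp)


# Invented example: 8 patients (7 detected) and 10 healthy subjects (8 cleared).
y_true = [1] * 8 + [0] * 10
y_pred = [1] * 7 + [0] + [0] * 8 + [1] * 2

sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity = {sens:.3f}, specificity = {spec:.3f}")
```

Reporting both values per fold, as the study does, shows whether a model trades missed cancers for false alarms or the reverse, which a single accuracy figure would hide.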
Discussion
The study confirms that exhaled breath VOC profiling can discriminate lung cancer patients from healthy individuals but highlights that variability in patient characteristics and clinical factors influences VOC signatures and model performance across studies. Treatment status, particularly chemotherapy, modestly affects certain 2-heptanone ratios, suggesting that models may need to account for ongoing therapy. Non-pulmonary comorbidities (notably hypertension, diabetes, and chronic heart failure) significantly alter VOC profiles, especially toluene and related ratios, and should be considered to avoid confounding. Tumor localization, TNM stage, and histological malignancy also relate to specific VOCs or ratios (e.g., positive associations of octane-based ratios and 2,3-butanedione with advanced TNM stages; negative associations of dimethyl trisulfide and its ratios), indicating pathophysiological links and potential for disease characterization. Using VOC ratios rather than absolute abundances enhances robustness to analytical variability, and model importance analyses consistently emphasize dimethyl trisulfide/dimethyl disulfide and isoprene/acetone as key discriminators. Compared with the authors’ earlier, smaller-cohort models that reported >90% sensitivity and specificity, the present models show slightly lower performance but greater reliability due to a larger cohort, a stricter test split, fasting standardization, and balanced smoking status across groups. Age differences between groups did not confound the selected ratio biomarkers. Overall, incorporating clinical covariates and focusing on robust ratios improves the generalizability of breath-based lung cancer diagnostics.
Conclusion
Exhaled breath analysis by GC-MS with VOC ratio-based modeling enables non-invasive discrimination of lung cancer. The study demonstrates that non-pulmonary comorbidities (especially chronic heart failure and hypertension), treatment status, tumor localization, TNM stage, and histological malignancy introduce systematic variability in VOC profiles. Accounting for these factors and excluding affected parameters improves diagnostic model robustness. Among considered features, dimethyl trisulfide/dimethyl disulfide and isoprene/acetone ratios consistently contribute to classification, and ANN models achieved test sensitivities of 82–88% and specificities of 80–86%. Future work should expand cohorts, systematically stratify by comorbidities and treatments (including other lung comorbidities), validate biomarker ratios longitudinally, and pursue standardization of sampling and analysis to facilitate clinical translation.
Limitations
- Comorbidities: Only select non-pulmonary comorbidities were assessed; other lung comorbidities were excluded and not analyzed, and the proportion of patients with comorbidities was relatively low, potentially limiting detection of additional associations.
- Sampling standardization: Flow rate, anatomic dead space, and expiratory parameters were not controlled to avoid patient discomfort; although procedures were the same across cohorts, residual variability may remain.
- Sampling bag artifacts: Phenol and N,N-dimethylacetamide increased during storage and were excluded; other potential bag-related effects cannot be fully ruled out despite ≤6 h storage.
- Group differences: Healthy volunteers were significantly younger than patients; while selected ratios did not correlate with age within groups, residual confounding is possible.
- External validity: Single-center sampling environments and regional cohorts may limit generalizability; ambient air subtraction mitigates but does not eliminate exogenous influences.
- Analytical identification: Some VOC identifications relied on library matches (≥85% match) without standards, which may introduce annotation uncertainty.