Selection, optimization and validation of ten chronic disease polygenic risk scores for clinical implementation in diverse US populations

N. J. Lennon, L. C. Kottyan, et al.

This study from the eMERGE Network describes how polygenic risk scores (PRSs) for ten chronic diseases were selected, optimized and validated for clinical use. It presents a framework for returning PRS-based genome-informed risk assessments to a diverse population of 25,000 adults and children, and it addresses the main obstacles to using PRSs equitably in personalized medicine.

Introduction
The study addresses how to select, optimize, validate and implement polygenic risk scores (PRSs) for multiple common chronic diseases in diverse clinical populations. While PRSs can improve disease risk prediction and enable earlier prevention or detection, most GWAS discovery cohorts are Eurocentric, leading to reduced predictive performance in individuals with non-European ancestry and risking exacerbation of health disparities. The eMERGE Network designed a prospective, pragmatic clinical study to return PRS-based genome-informed risk assessments to 25,000 participants (children and adults, ages 3–75) across multiple health systems, with enriched recruitment of underrepresented populations. The goals include identifying PRSs suitable for clinical use across ancestries, calibrating scores to genetic ancestry distributions to ensure equitable classification into high/not-high risk, integrating with clinical reporting workflows compliant with CLIA regulations, and evaluating early implementation outcomes in a primary care context.
Literature Review
The paper reviews challenges and advances in PRS development and clinical translation: the rapid increase in available PRSs; known declines in prediction accuracy with increasing genetic distance from discovery populations; improvements via multi-ancestry development and validation; and integration with clinical and environmental factors and monogenic risk to enhance prediction. It references prior clinical and direct-to-participant implementations, consumer offerings, and calls for large, multicenter pragmatic studies to assess patient and provider responses to PRS information in primary care. It also notes concerns such as potential health disparities, genetic determinism, stigmatization, and the need for robust regulatory and communication frameworks.
Methodology
Study design and selection: The eMERGE Network used a multistage process, aligned with best practices, to audit, evaluate, and select PRSs. Starting from 23 candidate conditions, it assessed population health relevance (prevalence, heritability), feasibility (diverse validation datasets, EHR phenotyping), clinical actionability, and translatability. After iterative steering committee reviews (2020–2021), ten conditions were finalized for clinical implementation: asthma, atrial fibrillation, breast cancer, chronic kidney disease (CKD), coronary heart disease (CHD), hypercholesterolemia, obesity/BMI, prostate cancer, type 1 diabetes (T1D), and type 2 diabetes (T2D). Conditions such as abdominal aortic aneurysm and colorectal cancer were removed due to data limitations (for example, missing key EHR covariates or incomplete multi-ancestry validation). Some PRSs were adopted as published; five underwent optimization (for example, the CKD score added APOL1 risk genotypes to the polygenic component to improve prediction in African ancestry cohorts).
Validation datasets and criteria: Validation emphasized performance across four ancestry groups (African/African American, Asian, European, Hispanic/Latino), seeking statistically significant odds ratios in at least two ancestries. Data sources included eMERGE I–III, UK Biobank, and the Million Veteran Program. Each condition team reported discovery/validation sources, availability of multi-ancestry PRSs, thresholds for high-risk classification (percentiles), discrimination (AUC), and effect sizes (ORs) comparing high versus not-high PRS groups.
Genotyping and scoring pipeline: Samples were genotyped on the Illumina Global Diversity Array (GDA). Data were phased with Eagle2 and imputed with Minimac4 against a 1000 Genomes reference panel that the Broad curated to remove mis-genotyped sites. The PRS pipeline used PLINK2 to compute each raw score as the sum of effect-weighted allele dosages.
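
As an illustration of the raw score described above, the following minimal sketch computes a sum of effect-weighted dosages with NumPy. It is not the PLINK2 pipeline used in the study. The function name `raw_prs`, the array layout, and the toy numbers are assumptions made for this example, and it assumes fully imputed dosages with no missing values.

```python
import numpy as np


def raw_prs(dosages, weights):
    """Return one raw polygenic score per sample.

    dosages: (n_samples, n_variants) effect-allele dosages in [0, 2]
             (hypothetical layout; assumes no missing values).
    weights: (n_variants,) per-variant effect sizes, e.g. log odds ratios.
    """
    dosages = np.asarray(dosages, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if dosages.ndim != 2 or weights.ndim != 1:
        raise ValueError("expected a 2-D dosage matrix and a 1-D weight vector")
    if dosages.shape[1] != weights.shape[0]:
        raise ValueError("number of variants differs between dosages and weights")
    # Sum over variants of (effect weight x dosage) for every sample.
    return dosages @ weights


# Toy data: two samples, three variants (values are made up for illustration).
toy_dosages = [[0.0, 1.0, 2.0],
               [2.0, 0.0, 1.5]]
toy_weights = [0.5, -0.2, 0.1]
toy_scores = raw_prs(toy_dosages, toy_weights)
```

Raw scores of this kind are not directly comparable across ancestries, which is why the study calibrates them before classifying anyone as high risk.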
Site representation, repeatability, reproducibility, and score concordance with PCR-free 30X WGS were validated. Input material (blood versus saliva) performance was assessed using matched pairs.
Ancestry calibration: To mitigate ancestry-dependent mean and variance differences in PRS distributions, a PCA-based calibration model was implemented, extending prior work to model both ancestry-dependent means and variances. Training and testing used All of Us (AoU) GDA-imputed cohorts with balanced representation across 1000 Genomes super-populations, including many admixed individuals. Calibration parameters were fit by maximizing likelihood in the training set and validated in a held-out test set, yielding approximately standard normal distributions across ancestries and removing admixture-fraction dependencies.
CLIA validation and reporting: A CLIA laboratory (Clinical Research Sequencing Platform, LLC at the Broad Institute) implemented end-to-end validation: 70 diverse Coriell cell lines with WGS truth for accuracy; 20 matched blood–saliva pairs; three-sample six-replicate runs for reproducibility; and verification of clinical validity using eMERGE I–III cohorts where feasible. Standard operating procedures, data review metrics (score z-range, batch PCA outlier plots, control monitoring), and sample fingerprinting were established. A software pipeline generated PDF and JSON clinical reports with age/sex filters and condition-specific return policies. Results returned were qualitative (high risk vs not high risk); PRS z-scores were included only for CHD and breast cancer in sections used for integrated models (for example, BOADICEA).
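
The ancestry calibration step described above can be sketched as a small maximum-likelihood fit. The summary does not give the model's functional form, so this sketch assumes one: the PRS mean is linear in the principal components and the log of the PRS variance is also linear in them. The function names, the optimizer choice, and the log-variance clipping are likewise assumptions for illustration, not the study's implementation.

```python
import numpy as np
from scipy.optimize import minimize


def _design(pcs):
    """Prepend an intercept column to the principal-component matrix."""
    pcs = np.asarray(pcs, dtype=float)
    return np.column_stack([np.ones(pcs.shape[0]), pcs])


def fit_calibration(prs, pcs):
    """Fit mean and log-variance coefficients by maximum likelihood.

    Assumed model: prs ~ Normal(X @ alpha, exp(X @ beta)), X = [1, PCs].
    """
    X = _design(pcs)
    y = np.asarray(prs, dtype=float)
    k = X.shape[1]

    # Start from ordinary least squares for the mean and a constant variance.
    alpha0, *_ = np.linalg.lstsq(X, y, rcond=None)
    beta0 = np.zeros(k)
    beta0[0] = np.log(np.var(y - X @ alpha0) + 1e-12)

    def nll(theta):
        # Negative Gaussian log-likelihood (constant term dropped).
        alpha, beta = theta[:k], theta[k:]
        log_var = np.clip(X @ beta, -50.0, 50.0)  # numerical safety only
        resid = y - X @ alpha
        return 0.5 * np.sum(log_var + resid ** 2 * np.exp(-log_var))

    def grad(theta):
        alpha, beta = theta[:k], theta[k:]
        log_var = np.clip(X @ beta, -50.0, 50.0)
        resid = y - X @ alpha
        inv_var = np.exp(-log_var)
        g_alpha = -X.T @ (resid * inv_var)
        g_beta = 0.5 * X.T @ (1.0 - resid ** 2 * inv_var)
        return np.concatenate([g_alpha, g_beta])

    result = minimize(nll, np.concatenate([alpha0, beta0]),
                      jac=grad, method="L-BFGS-B")
    return result.x[:k], result.x[k:]


def calibrate(prs, pcs, alpha, beta):
    """Convert raw scores to z-scores using the fitted mean and variance."""
    X = _design(pcs)
    y = np.asarray(prs, dtype=float)
    return (y - X @ alpha) / np.exp(0.5 * (X @ beta))
```

In use, `fit_calibration` would be run once on a training cohort and `calibrate` applied to held-out or clinical samples. If the assumed model fits, the resulting z-scores should be close to standard normal regardless of where a sample sits in PC space.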
Key Findings
- Ten PRSs were implemented clinically (asthma, atrial fibrillation, breast cancer, CKD, CHD, hypercholesterolemia, obesity/BMI, prostate cancer, T1D, T2D) after multistage evaluation and optimization where needed.
- Technical validation of PRS pipeline (array-imputed vs WGS): Pearson correlations ranged from 93.0% (breast cancer) to 99.5% (obesity, T1D), with 100% repeatability and low reproducibility variability (z-score SD as low as 0.0001). Site missingness was generally low (for example, 0.32% breast cancer; up to 2.97% for prostate cancer and T1D).
- Imputation pipeline performance: Sensitivity >97% and specificity >99% for SNPs across ancestries; INDEL sensitivity >94.9% and specificity >98.6%. Blood and saliva inputs performed equivalently. Batch size ≥10 samples maintained high imputation accuracy; call rates <95% degraded performance.
- Ancestry calibration using AoU: Calibrated z-scores achieved approximately standard normal distributions across ancestries, improving both cross-ancestry comparability and within-ancestry calibration for admixed individuals by modeling ancestry-dependent means and variances.
- Condition-specific effects (odds ratios for high PRS vs not high PRS, by ancestry; examples):
  - Prostate cancer: European OR 12.97 (95% CI 7.29–20.40); African American OR 20.45 (10.77–38.83).
  - CKD: European OR 3.6 (3.11–4.17); Hispanic OR 4.93 (2.46–9.89); Asian OR 3.81 (1.91–7.59); African American OR 2.66 (2.01–3.51).
  - Hypercholesterolemia: European OR 4.16 (2.59–6.44); African American OR 3.16 (1.92–5.01); Asian OR 3.75 (3.15–4.42); Hispanic OR 4.02 (2.72–5.83).
  - T1D: European OR 4.21 (3.66–4.84); African American OR 2.55 (2.09–3.11); Asian OR 4.58 (4.00–5.23).
  - Breast cancer: European OR 2.47 (2.20–2.77); African American OR 1.61 (1.38–1.87); Asian OR 2.22 (1.99–2.47); Hispanic OR 2.05 (1.10–3.83).
  - Atrial fibrillation: European OR 2.32 (2.07–2.61); African American OR 2.19 (1.38–3.38); Hispanic OR 2.27 (1.09–4.50).
  - CHD: European OR 2.3 (2.07–2.56); African American OR 1.68 (1.39–2.03); Hispanic OR 2.16 (1.47–3.19).
  - Asthma: European OR 1.95 (1.43–2.65); African American OR 1.83 (1.24–2.70); Hispanic OR 3.12 (1.32–7.44).
  - T2D: Hispanic OR 6.87 (3.11–15.15); European OR 3.67 (3.57–3.76); African American OR 2.95 (2.60–3.30); Asian OR 4.58 (4.00–5.23) where available.
  - Obesity/BMI: ORs to be reported separately by GIANT.
- Early implementation (first 2,500 participants; 64.5% female; median age 51): 515 (20.6%) had high PRS for at least one condition; 64 (2.6%) for two; 2 (0.08%) for three. High-PRS individuals spanned the genetic ancestry spectrum. Observed vs expected high-PRS counts were largely consistent per condition; a notable deviation was T2D (observed 71 vs expected 50; P=0.03, Holm–Šidák adjusted).
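
To make the odds ratios above concrete, the sketch below shows one common way to compute an unadjusted odds ratio and a Wald 95% confidence interval from a 2×2 table of high versus not-high PRS against case status. The summary does not state how the study's estimates were derived (they may come from covariate-adjusted models), so this is illustrative only. The function name and the counts are hypothetical.

```python
import math


def odds_ratio_ci(high_cases, high_controls, nothigh_cases, nothigh_controls,
                  z=1.96):
    """Unadjusted odds ratio with a Wald confidence interval.

    Counts form a 2x2 table of PRS group (high / not high) by disease status.
    z=1.96 gives an approximate 95% interval.
    """
    cells = (high_cases, high_controls, nothigh_cases, nothigh_controls)
    if min(cells) <= 0:
        raise ValueError("all four cells must be positive for a Wald interval")
    odds_ratio = (high_cases * nothigh_controls) / (high_controls * nothigh_cases)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(sum(1.0 / c for c in cells))
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, not taken from the study.
example_or, example_lo, example_hi = odds_ratio_ci(30, 70, 100, 800)
```

An interval whose lower bound stays above 1 corresponds to the statistically significant association that the validation criteria required in at least two ancestry groups.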
Discussion
The work demonstrates a rigorous end-to-end framework to select, calibrate, validate, and clinically return multi-ancestry PRSs across ten common diseases, addressing a central barrier to equitable deployment: cross-ancestry portability and calibration. By prioritizing diverse validation, modeling ancestry-dependent PRS distributions, and establishing CLIA-compliant pipelines and reporting, the study provides a blueprint for responsible implementation. The results show robust technical performance, significant disease associations across multiple ancestries, and feasibility of PRS reporting in primary care settings at scale. This approach lays groundwork for integrated risk assessment (combining PRSs with monogenic variants, family history, and clinical factors) and for evaluating downstream clinical actions and outcomes. Remaining challenges include ensuring representativeness of training/validation cohorts, communicating probabilistic risk to patients and providers, and coupling risk return with effective prevention/early detection strategies to realize clinical benefit.
Conclusion
This study advances clinical implementation of PRSs by: (1) creating a multi-stage selection and optimization process that yields ten clinically actionable PRSs; (2) developing and validating a PCA-based ancestry calibration modeling both mean and variance, improving equity and portability; (3) establishing a CLIA-validated, reproducible genotyping–imputation–scoring pipeline with clinical reporting; and (4) demonstrating early real-world deployment to 2,500 participants. Future work should standardize thresholds for defining high PRS, expand age-based absolute risk estimation for more phenotypes, broaden multi-ancestry discovery/validation datasets, and develop best-practice guidelines for integrating PRSs into clinical workflows, communication, and decision support. Prospective outcomes from the full 25,000-participant cohort will inform utility, behavior change, and potential health impact.
Limitations
Key limitations include: (1) potential participation and selection biases in large biobanks used for validation (for example, UK Biobank), affecting generalizability; (2) reduced PRS accuracy with genetic distance from Eurocentric GWAS discovery cohorts, despite multi-ancestry validation and calibration; (3) limited availability of age-of-onset and incidence data to convert PRS to absolute, age-based risks across many phenotypes; (4) challenges in communicating and interpreting PRS results by providers and patients, and risks of genetic determinism or stigmatization; (5) clinical benefits depend on effective, accessible prevention and early detection strategies after result return; and (6) for some conditions or ancestries, validation datasets remain limited, and some odds ratios were not estimable (n.d.) or pending publication (for example, obesity/BMI).