Polygenic risk scores (PRSs), which aggregate the effects of many genetic variants, offer potential for predicting individual disease predisposition. However, their clinical application, particularly in diverse populations, faces significant hurdles. Existing PRSs, often developed from predominantly European ancestry populations, show reduced predictive accuracy in other ancestries, potentially exacerbating health disparities. The Electronic Medical Records and Genomics (eMERGE) Network, a multicenter consortium, aims to address these challenges through a large-scale study involving 25,000 participants of diverse ancestry. This paper focuses on the selection, optimization, validation, and clinical implementation of PRSs for ten common diseases, along with the development of a robust framework for reporting results to both healthcare providers and patients. The study's primary outcome is the number of new healthcare actions taken following the return of the genome-informed risk assessment.
Literature Review
The paper reviews existing literature on PRS development and application, highlighting the limitations of Eurocentric PRSs in diverse populations. It cites studies demonstrating the reduced predictive accuracy of such scores in non-European ancestries and the potential for exacerbating health disparities. The authors discuss the challenges of combining genomic and non-genomic information for improved risk prediction, optimizing models for diverse populations and age groups, and effectively communicating this information to clinicians and patients. Although commercial platforms offer PRS calculations, little information is available on clinical implementation considerations, particularly in primary care settings. The authors emphasize the need for large-scale studies that assess how patients and providers interact with PRSs in real-world clinical settings.
Methodology
The eMERGE Network employed a multi-stage process for PRS selection and optimization. Initially, 23 conditions were considered based on prevalence, heritability, evidence for PRS performance, clinical actionability, and data availability. A comprehensive literature review and an evaluation matrix were used to assess analytical viability, feasibility, clinical actionability, and translatability for each condition's PRS. The process prioritized PRSs validated across four ancestry groups (European, African, Hispanic, and Asian), with emphasis on underrepresented groups. Eleven conditions were excluded for reasons including low prevalence, lack of diverse validation datasets, and absence of validated algorithms for case/control identification. The remaining twelve conditions underwent further optimization to refine performance across ancestries, and ten were ultimately selected. These ten conditions were implemented in a clinical laboratory pipeline involving genotyping, phasing, imputation, PRS calculation, and ancestry calibration using a modified method trained on the All of Us Research Program cohort. A comprehensive validation study using whole genome sequencing data confirmed the pipeline’s accuracy and reproducibility. Finally, a clinical report generation and review pipeline was created to ensure regulatory compliance and facilitate return of results to providers.
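To make the "PRS calculation" step concrete, the sketch below shows the generic form of a polygenic score: a weighted sum of effect-allele dosages. It is an illustration only, not the eMERGE pipeline; the function name `raw_prs`, the toy dosages, and the weights are all hypothetical.

```python
import numpy as np


def raw_prs(dosages: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Return one raw score per individual.

    dosages: (n_individuals, n_variants) effect-allele dosages in [0, 2],
             e.g. as produced by imputation.
    weights: (n_variants,) per-variant effect sizes (e.g. log odds ratios).
    """
    dosages = np.asarray(dosages, dtype=float)
    weights = np.asarray(weights, dtype=float)
    if dosages.ndim != 2 or dosages.shape[1] != weights.shape[0]:
        raise ValueError("dosage columns must match the number of weights")
    # Weighted sum across variants for each individual.
    return dosages @ weights


# Toy data (hypothetical): two individuals, three variants.
dosages = np.array([[0.0, 1.0, 2.0],
                    [1.0, 1.0, 0.0]])
weights = np.array([0.10, -0.05, 0.20])
scores = raw_prs(dosages, weights)
```

Raw scores of this kind are not directly comparable across ancestries, which is why a calibration step follows in the pipeline described above.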
Key Findings
From an initial 23 conditions, ten were selected for clinical implementation: asthma, atrial fibrillation, breast cancer, chronic kidney disease, coronary heart disease, hypercholesterolemia, obesity, prostate cancer, type 1 diabetes, and type 2 diabetes. For each condition, the odds ratio associated with a high PRS versus a not-high PRS was statistically significant in at least two ancestry groups. A population-based z-score calibration method, trained on a diverse subset of the All of Us Research Program cohort, was developed to account for ancestry-dependent differences in PRS distributions so that risk categorization is consistent across ancestries. The clinical laboratory pipeline was rigorously validated, showing high accuracy (Pearson correlation above 0.93) and reproducibility. Among the first 2,500 participants processed, 20.6% had a high PRS for at least one condition, and the observed numbers of high-risk assessments were largely consistent with the expected numbers. High-PRS participants were observed across the spectrum of genetic ancestries.
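The idea behind a population-based z-score calibration can be sketched as follows. This is a minimal illustration under stated assumptions, not the network's actual method: it assumes ancestry is summarized by principal components (PCs), models both the mean and the variance of the raw score as linear functions of those PCs in a reference cohort, and uses a hypothetical cutoff of z > 1.645 (roughly the top 5%) to flag "high" scores. All names and the synthetic data are invented for the example.

```python
import numpy as np


def fit_calibration(ref_prs, ref_pcs):
    """Fit linear models for the mean and variance of the PRS given PCs."""
    X = np.column_stack([np.ones(len(ref_prs)), ref_pcs])
    beta_mean, *_ = np.linalg.lstsq(X, ref_prs, rcond=None)
    resid_sq = (ref_prs - X @ beta_mean) ** 2
    # Regress squared residuals on PCs to estimate ancestry-dependent variance.
    beta_var, *_ = np.linalg.lstsq(X, resid_sq, rcond=None)
    return beta_mean, beta_var


def calibrate(prs, pcs, beta_mean, beta_var, min_var=1e-8):
    """Convert raw scores to z-scores using the fitted mean/variance models."""
    X = np.column_stack([np.ones(len(prs)), pcs])
    mean = X @ beta_mean
    var = np.clip(X @ beta_var, min_var, None)  # guard against non-positive fits
    return (prs - mean) / np.sqrt(var)


# Synthetic reference cohort (hypothetical): one PC shifts both the mean
# and the variance of the raw score.
rng = np.random.default_rng(0)
n = 5000
pc1 = rng.uniform(0.0, 1.0, size=n)
raw = 2.0 * pc1 + rng.normal(size=n) * np.sqrt(1.0 + 0.5 * pc1)

beta_mean, beta_var = fit_calibration(raw, pc1)
z = calibrate(raw, pc1, beta_mean, beta_var)

HIGH_Z_CUTOFF = 1.645  # hypothetical threshold, about the top 5% of a normal
is_high = z > HIGH_Z_CUTOFF
```

After calibration, the z-scores are approximately centered and scaled regardless of where an individual falls along the PC, so a single cutoff flags a similar fraction of people across ancestries. In practice the threshold would be chosen per condition rather than fixed as it is here.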
Discussion
This study makes a significant contribution by demonstrating a practical approach for implementing PRSs in a diverse clinical setting. To address the limitations of Eurocentric PRSs, the study optimized and validated scores across multiple ancestries, a step that may help avoid widening health disparities. The rigorous validation of the clinical laboratory pipeline and the development of a standardized reporting framework provide a blueprint for other initiatives. While the study addresses many challenges, further research is needed to fully understand the clinical utility of PRSs, including the development of age-based absolute risk estimates and the evaluation of the impact on patient outcomes. Challenges such as participation bias in training datasets and the need for improved communication strategies for both providers and patients remain. Data from the 25,000 eMERGE study participants are expected to provide valuable insights into long-term harms and benefits.
Conclusion
The eMERGE Network's work represents a substantial advancement in the clinical implementation of PRSs for diverse populations. The rigorous selection, optimization, validation, and implementation process, combined with the development of robust reporting tools, provides a valuable model for future efforts. Ongoing research will further refine PRS applications and address remaining challenges, ultimately improving healthcare for all.
Limitations
The study acknowledges limitations such as potential participation bias in the All of Us Research Program cohort used for ancestry calibration. The generalizability of the findings may be limited by the specific conditions and populations included in the study. Furthermore, the long-term clinical utility of PRSs and their impact on health outcomes require further investigation. Although the study will add data to complement existing risk stratification approaches, it will not answer all questions regarding the implementation and benefit of this technology.