
Shared genetic architectures of educational attainment in East Asian and European populations
T. Chen, J. Kim, et al.
This study presents the first large-scale genome-wide association study (GWAS) of educational attainment in East Asian individuals and finds a high genetic correlation with European populations. Conducted by researchers including Tzu-Ting Chen and Jaeyoung Kim, it emphasizes the importance of diverse ancestry in unraveling the genetic factors influencing education.
Introduction
Educational attainment (EduYears), a heritable trait, serves as a proxy for cognitive ability and is linked to various health and social outcomes. It is moderately heritable: twin studies suggest a heritability of about 40%, and previous GWAS report SNP-based heritability of about 20%. Previous GWAS meta-analyses of EduYears focused primarily on European ancestry populations and identified numerous associated loci, but their generalizability to non-European populations remained limited. This study aimed to address this lack of diversity by conducting the first large-scale GWAS in East Asian populations and performing a cross-ancestry meta-analysis with European data. The specific objectives were to identify genomic loci for EduYears across populations, investigate its biological basis in East Asians, examine the shared genetic architecture between East Asian and European populations, and demonstrate the advantages of cross-population analysis in polygenic prediction and fine-mapping. The study's importance lies in its potential to reduce social and health disparities by improving the understanding of EduYears and its impact on socioeconomic and health outcomes in understudied populations.
Literature Review
Previous genome-wide association studies (GWAS) have identified numerous genetic variants associated with educational attainment, primarily in individuals of European ancestry. The largest meta-analysis to date, involving approximately 3 million individuals of European ancestry, identified thousands of independent genome-wide significant loci. However, these findings have limited generalizability to other populations because the samples studied lack diversity. The absence of diverse representation in genetic studies of complex traits can lead to inaccurate generalizations and may contribute to health and social disparities. This study addresses this limitation by focusing on a large sample of East Asian individuals and conducting a cross-ancestry analysis to compare the genetic architecture of educational attainment between East Asian and European populations. The moderate heritability of educational attainment, as shown in twin studies and GWAS, provides a strong basis for investigating the genetic components contributing to its variation across different populations. The relationship between educational attainment and various health and social outcomes further emphasizes the importance of understanding its genetic underpinnings.
Methodology
This study involved two main stages: a large-scale GWAS in East Asian populations and a cross-ancestry meta-analysis combining East Asian and European data. The East Asian GWAS used data from the Taiwan Biobank (TWB) and the Korean Genome and Epidemiology Study (KoGES), encompassing a total of 176,400 individuals. Rigorous quality control (QC) procedures were applied to the genotype data, including filtering for call rate, removing duplicated variants and monomorphic variants, checking for consistency between reported sex and genetic sex, and excluding population outliers through principal component analysis. Genotype imputation was performed using the 1000 Genomes Project Phase 3 East Asian data as a reference panel. Educational attainment was assessed using self-reported years of education, which were mapped to the International Standard Classification of Education (ISCED) categories for comparability. Genetic association analyses were conducted using a two-step whole-genome regression model in Regenie, adjusting for several covariates including birth year, sex, and principal components. The cross-ancestry meta-analysis integrated the East Asian GWAS summary statistics with publicly available summary statistics from a large-scale European GWAS of educational attainment, using METAL for inverse-variance-weighted fixed-effects meta-analysis. The study also employed several further analytical approaches: LD score regression (LDSC), expression quantitative trait loci (eQTL) mapping, MAGMA gene-set analysis, stratified LDSC, LDSC-SEG analysis, GSA-SNP2 pathway enrichment analysis, multi-ancestry meta-analysis (MAMA), and polygenic score (PGS) prediction using PRS-CS and PRS-CSx. In particular, SuSiEx was used for both within-population and cross-population fine-mapping to refine the identified genetic loci. The study drew on multiple publicly available datasets, including GTEx, Franke laboratory data, Cahoy et al. data, and the Roadmap Epigenomics and ENCODE projects, and used data from the UK Biobank and NIA-LOAD for polygenic prediction.
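The inverse-variance-weighted fixed-effects step named above can be illustrated for a single variant. The following is a minimal sketch of the general technique, not the METAL implementation or the authors' pipeline; the function name `ivw_meta` and the input values are hypothetical and chosen only for illustration.

```python
import math


def ivw_meta(betas, ses):
    """Combine per-cohort effect estimates for one variant using
    inverse-variance weights (fixed-effects model)."""
    if len(betas) != len(ses) or not betas:
        raise ValueError("betas and ses must be non-empty and of equal length")
    # Each cohort is weighted by the inverse of its sampling variance.
    weights = [1.0 / (se * se) for se in ses]
    total_weight = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / total_weight
    se = math.sqrt(1.0 / total_weight)
    z = beta / se
    # Two-sided p-value under a standard normal null distribution.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return beta, se, z, p


# Hypothetical effect sizes and standard errors (not taken from the study).
beta, se, z, p = ivw_meta([0.020, 0.030], [0.010, 0.010])
print(f"beta={beta:.4f} se={se:.4f} z={z:.2f} p={p:.2e}")
```

Because the weights are inverse variances, a cohort with a smaller standard error (typically the larger cohort) contributes more to the pooled estimate. In a real analysis, alleles would also need to be aligned to the same reference across cohorts before combining.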
Key Findings
The East Asian GWAS identified seven genome-wide significant loci associated with educational attainment, all of which had been previously reported in European ancestry GWAS. The SNP-based heritability estimates were similar in the Taiwan Biobank (9.7%), KoGES (8.7%), and the combined East Asian meta-analysis (9.0%), and also comparable to estimates from European populations (10.7%). High genetic correlation was observed within the East Asian population (r = 0.87) and between East Asian and European populations (r = 0.87), suggesting a substantial overlap in the genetic architecture of educational attainment across these populations. The cross-ancestry meta-analysis, combining East Asian and European data, identified a larger number of genome-wide significant loci (102), all previously reported in European GWAS. Multi-ancestry meta-analysis (MAMA) revealed 94 independent genome-wide significant SNPs, two of which were novel. Functional enrichment analyses, including eQTL mapping and gene-set analysis, identified several genes and pathways potentially involved in the biological mechanisms underlying the association between genetic variants and educational attainment, many of which are related to brain development and function. Cross-population fine-mapping using SuSiEx resulted in increased posterior inclusion probabilities for several SNPs compared to within-population fine-mapping, suggesting that combining diverse populations enhances fine-mapping resolution. Polygenic prediction analyses demonstrated that cross-population polygenic scores outperformed ancestry-matched scores in predicting educational attainment in independent East Asian cohorts, highlighting the benefits of using diverse populations in PGS development. LDSC analyses revealed significant genetic correlations between EduYears and several socioeconomic and health-related traits, with consistent directions of effects across East Asian and European populations.
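The scoring step behind polygenic prediction can be sketched as a weighted sum of allele dosages, evaluated by the variance in the phenotype that the score explains. This is only an illustration under simplifying assumptions: it does not reproduce PRS-CS or PRS-CSx, which estimate the per-variant weights themselves, and it omits covariate adjustment. All names and numbers below are hypothetical.

```python
def polygenic_score(dosages, weights):
    """Weighted sum of effect-allele dosages (0 to 2) for one individual."""
    if len(dosages) != len(weights):
        raise ValueError("dosages and weights must have equal length")
    return sum(d * w for d, w in zip(dosages, weights))


def r_squared(x, y):
    """Squared Pearson correlation between scores x and phenotype y."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical per-variant weights and genotype dosages (not study data).
weights = [0.5, -0.25, 0.1]
genotypes = [
    [2, 0, 1],
    [1, 1, 0],
    [0, 2, 2],
]
edu_years = [16, 14, 12]  # hypothetical phenotype values

scores = [polygenic_score(g, weights) for g in genotypes]
r2 = r_squared(scores, edu_years)
print(scores, round(r2, 3))
```

Comparing scores built from different weight sets (for example, ancestry-matched versus cross-population weights) on the same held-out individuals is the kind of comparison the prediction results above describe.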
Discussion
The findings of this study strongly support a substantial degree of shared genetic architecture for educational attainment between East Asian and European populations. The high genetic correlation and transferability ratios observed between these populations suggest that many of the genetic variants influencing educational attainment are common across ancestries. The success of cross-population fine-mapping in increasing the posterior inclusion probabilities for candidate causal variants supports the hypothesis that incorporating diverse ancestry data enhances the resolution of genetic fine-mapping studies. Similarly, the improved predictive performance of cross-population polygenic scores highlights the value of diverse datasets in polygenic score development. The identified genes and pathways implicated in educational attainment provide valuable insights into the underlying biological mechanisms, many of which align with previously identified genes related to brain development and cognitive function. These results underscore the importance of including diverse populations in genetic studies to improve the accuracy and generalizability of findings and to avoid biases in understanding the genetic architecture of complex traits. The consistent genetic correlations observed between educational attainment and various socioeconomic and health outcomes across populations strengthen the link between these factors.
Conclusion
This study provides the first large-scale GWAS of educational attainment in an East Asian population, complemented by a powerful cross-ancestry meta-analysis with European data. The high genetic correlation and substantial transferability of loci between East Asian and European populations highlight the shared genetic architecture of this complex trait. The study demonstrates the benefits of incorporating diverse populations in genetic analyses, improving both fine-mapping resolution and polygenic prediction. Future research should focus on expanding the sample size in East Asian populations and incorporating additional diverse ancestry groups to further refine the understanding of the genetic basis of educational attainment and its relationship with various health and social outcomes. Further investigation into the identified novel loci could provide important insights into the underlying biology of this trait.
Limitations
One limitation is the use of self-reported educational attainment, which may not perfectly capture the actual years of education. Differences in compulsory education years between Taiwan and South Korea might also limit the phenotypic variation in EduYears. Another limitation is the smaller sample size of the East Asian GWAS relative to the large European GWAS, which gives it less power for locus discovery. This difference in power explains why the East Asian GWAS identified no loci beyond those already reported in European GWAS. Larger future samples will likely reveal more genetic signals associated with EduYears in East Asian populations. The reliance on publicly available summary statistics for the European GWAS also introduces potential limitations related to data access and quality control procedures. Lastly, because EduYears is a proxy phenotype, understanding its genetic basis may not immediately lead to direct clinical applications.