Introduction
Reading Disability (RD) is a prevalent neurocognitive disorder characterized by significant difficulties primarily in word reading. It shows substantial clinical and genetic overlap with other neurodevelopmental disorders and affects academic success and employment prospects, with lifelong consequences. RD is considered a complex polygenic trait influenced by numerous genetic factors. These factors were initially explored through family-based linkage and fine-mapping association studies, which paved the way for more powerful genome-wide association studies (GWAS). Many earlier GWAS results fell short of genome-wide significance, largely because of limited sample sizes. With the formation of larger consortia and increased sample sizes, significant associations are emerging, such as those observed with *RPL7P34*, *MIR924HG*, and *DOCK7* for reading or reading-related skills. A recent study using the 23andMe cohort identified 42 loci associated with self-reported dyslexia.
However, the functional mechanisms by which these associated genetic variants contribute to RD remain largely unknown. A significant challenge in post-GWAS research is connecting single nucleotide polymorphisms (SNPs) to risk genes and to their downstream influence on cellular and brain function. Many associated variants reside in non-coding regions (introns or intergenic regions) and do not directly alter the protein code. These variants are often found in regulatory elements, suggesting a role in the regulation of gene expression. The genes they influence can be located millions of base pairs away, and regulatory elements are tissue- and cell type-specific, both of which complicate the interpretation of GWAS findings. Determining the relevant cell types is crucial for understanding the genetic and molecular mechanisms underlying reading ability and disability, but this information is largely absent.
This study aimed to bridge this knowledge gap by using linkage disequilibrium score regression (LDSC) to examine whether GWAS heritability is enriched in specific adult and fetal brain cell types. This approach combines GWAS summary statistics and gene expression data to interpret GWAS results and prioritize cell types for further functional investigation.
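For orientation, the regression that underlies partitioned (stratified) LDSC is usually written in the form below. This is the general form from the LDSC literature, not a statement of the exact model specification used in this study, and the symbols are introduced here only for illustration.

$$
\mathbb{E}\left[\chi^2_j\right] = N \sum_{c} \tau_c \, \ell(j, c) + N a + 1
$$

Here $\chi^2_j$ is the GWAS association statistic of SNP $j$, $N$ is the GWAS sample size, $\ell(j, c)$ is the LD score of SNP $j$ with respect to annotation $c$ (a cell-type gene set or a baseline annotation), $\tau_c$ is the per-SNP heritability contribution of annotation $c$, and $a$ captures confounding bias. In a cell-type analysis, the question is whether $\tau_c$ for the cell-type annotation is greater than zero after conditioning on the baseline annotations.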
Literature Review
Neuroimaging studies have implicated various cortical regions in reading processes, including the anterior system (Broca's area in the inferior frontal gyrus) and posterior systems (dorsal parietotemporal and ventral occipitotemporal systems). While neurons are generally considered integral to reading, the specific neuronal subtypes involved remain unclear. A prominent theory, the disrupted neuronal migration (DNM) hypothesis, suggests that risk variants contribute to subtle disruptions in neuronal migration, leading to altered connectivity of brain regions associated with language. This hypothesis is supported by post-mortem studies showing left-hemisphere polymicrogyria in individuals with RD, as well as findings of heterotopias, dysplasias, and dyslamination. Genetic studies have also provided some support for this hypothesis.
Both glutamatergic (excitatory) and GABAergic (inhibitory) cortical neurons are likely candidates, and glial cells might also play a role. Excitatory-inhibitory imbalances have been implicated in RD and related traits. Magnetic resonance spectroscopy (MRS) studies have shown increased cortical glutamate in RD, with higher concentrations correlating with lower reading skills. Elevated glutamate levels have also been observed in autism spectrum disorder (ASD) and ADHD, leading to the neural noise hypothesis, which proposes that increased glutamate results in neural hyperexcitability. The neural noise hypothesis and the DNM hypothesis are interconnected, because proper neuronal migration is necessary for functional cortical circuits and balanced excitatory-inhibitory synaptic connections. A disruption in either migration or excitatory-inhibitory balance can negatively affect the other.
Methodology
This study employed linkage disequilibrium score regression (LDSC) to analyze the enrichment of GWAS heritability in specific brain cell types. The LDSC 'Partitioning Heritability' function integrates GWAS summary statistics, gene expression data, and baseline annotations that serve as controls. It assesses whether SNPs near the genes highly expressed in a cell type contribute significantly to the overall SNP heritability of a trait.
The study used a meta-analysis of two GWAS datasets for word reading (n=5054), combining a family-based sample from Toronto (n=624) and a population-based sample from Philadelphia (PNC, n=4430). Imputation was performed to infer unobserved genotypes. Quality control involved removing SNPs with low imputation quality, SNPs out of Hardy-Weinberg equilibrium, and SNPs with low minor allele frequency (MAF). Only individuals of European ancestry were included. GWAS was conducted using linear mixed models, and the meta-analysis was performed using METAL. Summary statistics for ADHD, educational attainment, and cognitive ability were downloaded from the Psychiatric Genomics Consortium website.
Preprocessing for LDSC included removing SNPs with low MAF and low imputation quality scores. SNPs in the major histocompatibility complex (MHC) region were removed because of high linkage disequilibrium. The LDSC script 'munge_sumstats.py' prepared the summary statistics for analysis. Heritability (h²) and its standard error were estimated using the LDSC script 'ldsc.py'.
Gene expression datasets from the Allen Brain Bank (ABB), the Kriegstein lab, and the Shen lab were used, comprising single-nucleus RNA sequencing (snRNA-seq), single-cell RNA-seq (scRNA-seq), and bulk RNA-seq data, respectively. These datasets encompassed various neural cell types in fetal and adult brains. To prepare the data for LDSC, annotation files were created using the genes in the top decile of mean expression or of specificity for each cell type.
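The top-decile selection described above can be sketched as follows. This is a minimal illustration, not the study's pipeline: the function names, the toy expression values, and the definition of specificity (a gene's expression in one cell type divided by its total across all cell types) are assumptions made for the example.

```python
def specificity(mean_expr):
    """mean_expr: {gene: {cell_type: mean expression}}.
    Returns the same shape, with each value divided by the gene's total
    expression across all cell types (an assumed specificity definition)."""
    spec = {}
    for gene, by_type in mean_expr.items():
        total = sum(by_type.values())
        if total > 0:  # genes with no expression anywhere are skipped
            spec[gene] = {ct: v / total for ct, v in by_type.items()}
    return spec


def top_decile_genes(scores, cell_type, n_bins=10):
    """Return the genes ranked in the top 1/n_bins by score for one cell type."""
    ranked = sorted(scores, key=lambda g: scores[g][cell_type], reverse=True)
    n_keep = max(1, len(ranked) // n_bins)  # keep at least one gene
    return ranked[:n_keep]


# Hypothetical toy data: ten genes, two cell types (values are made up).
toy = {
    f"GENE{i}": {"excitatory": float(i), "astrocyte": float(10 - i)}
    for i in range(1, 11)
}
spec = specificity(toy)
excitatory_set = top_decile_genes(spec, "excitatory")
astrocyte_set = top_decile_genes(spec, "astrocyte")
```

Each resulting gene set would then be turned into one annotation for the regression; ranking by mean expression instead of specificity only requires passing the unnormalized table to 'top_decile_genes'.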
The EWCE R package was used to process the single-cell data, while mean expression and specificity were calculated directly in R for the bulk RNA-seq data. Gene coordinates (hg19) extended by ±100 kb were added to the annotation files. LD scores were computed using 'ldsc.py', and LD score regressions were run to partition SNP heritability for each cell type. False Discovery Rate (FDR) correction was applied separately for each RNA-seq dataset.
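Two of the steps above, extending gene coordinates by 100 kb and correcting p-values separately within each dataset, can be sketched as follows. This is an illustration under stated assumptions: the source does not name the FDR procedure, so Benjamini-Hochberg is assumed here; the function names, the 0-based clamping of coordinates, and the p-values are hypothetical.

```python
def pad_window(start, end, flank=100_000):
    """Extend a gene interval by `flank` bp on each side, clamping the start at 0."""
    return max(0, start - flank), end + flank


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def fdr_by_dataset(results):
    """results: {dataset: {cell_type: p}} -> {dataset: {cell_type: q}}.
    Each dataset is corrected on its own, so the number of tests m is the
    number of cell types in that dataset only."""
    out = {}
    for dataset, by_type in results.items():
        names = list(by_type)
        q = benjamini_hochberg([by_type[n] for n in names])
        out[dataset] = dict(zip(names, q))
    return out


# Made-up p-values for two hypothetical datasets (not results from the study).
toy_p = {
    "dataset_A": {"type_1": 0.01, "type_2": 0.04, "type_3": 0.03, "type_4": 0.5},
    "dataset_B": {"type_1": 0.02, "type_2": 0.2},
}
q_values = fdr_by_dataset(toy_p)
```

Correcting per dataset means a dataset with many cell types does not tighten the threshold for a dataset with few, at the cost of not controlling the FDR across all datasets jointly.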
Key Findings
Heritability (h²) estimates were approximately 0.25 for word reading, 0.24 for ADHD, 0.11 for educational attainment, and 0.18 for cognitive ability.
Analysis of adult cortical cells (ABB dataset) at the major cell type level showed significant enrichment for excitatory and inhibitory neurons for word reading. Educational attainment and cognitive ability showed enrichment for excitatory and inhibitory neurons, astrocytes, and oligodendrocytes. For ADHD, no significant enrichment was observed at the major cell type level.
Analysis of adult cortical cell subclasses (ABB dataset) revealed significant enrichment for word reading in specific excitatory (L6b FEZF2, L5/6 NP FEZF2, IT LINC00507 THEMIS RORB, L6 CT FEZF2, L5/6 IT Car3 THEMIS) and inhibitory (VIP, SST, PVALB) neurons. Educational attainment and cognitive ability showed enrichment across multiple excitatory and inhibitory subclasses, astrocytes, and oligodendrocytes. For ADHD, one subclass (L4 IT RORB) showed significant enrichment.
Analysis of fetal cortical cells (Kriegstein dataset) showed multiple significant cell types for educational attainment and cognitive ability, including inhibitory neurons from the MGE region. Analysis of fetal cortical cells (Shen dataset) showed significant enrichment for intermediate progenitor cells, excitatory and inhibitory neurons, and radial glia for educational attainment and cognitive ability.
In summary, the study identified significant enrichment for adult excitatory and inhibitory neurons for word reading, with specific subclasses highlighted. For ADHD, enrichment was found in a specific adult excitatory neuron subclass. Educational attainment and cognitive ability showed enrichment in both adult and fetal cell types, encompassing various neuronal subtypes, astrocytes, oligodendrocytes, intermediate progenitor cells, and radial glial cells.
Discussion
This study used LDSC to identify brain cell types associated with word reading and related traits, leveraging GWAS summary statistics and gene expression data. The use of datasets sampled from various cortical regions in both adult and fetal brains, along with different RNA sequencing methods (snRNA-seq, scRNA-seq, bulk RNA-seq), allowed for a comprehensive investigation. The findings for word reading and ADHD are consistent with previous research that reported increased cortical glutamate levels and interpreted them as evidence of excitatory-inhibitory imbalances in these disorders. The identification of specific neuronal subclasses provides targets for future research, particularly research using stem cell-derived neurons. The greater number of significant cell types identified for educational attainment and cognitive ability than for word reading and ADHD likely reflects the larger sample sizes of the respective GWAS. The results support and extend previous findings on cell type enrichment in educational attainment, cognitive ability, and ADHD, and provide novel information on fetal cell types.
Conclusion
This study provides insight into the neurobiological basis of word reading and related traits by identifying specific brain cell types associated with genetic variation. The findings highlight the role of excitatory and inhibitory neurons in reading and in related disorders, particularly ADHD. The study also reveals novel information about fetal cell types contributing to educational attainment and cognitive ability. Future research should focus on validating these findings in larger cohorts and exploring the functional mechanisms underlying these cell type-specific genetic associations. Expanding the investigation to other brain regions and using more comprehensive gene expression datasets could offer further insights.
Limitations
The study's power to detect cell type enrichment was influenced by the sample sizes of the GWAS datasets, with smaller sample sizes limiting the identification of significant cell types for word reading and ADHD compared to educational attainment and cognitive ability. The number of cells/nuclei sequenced for some datasets also affected the analysis, leading to the exclusion of some cell types due to low coverage and reduced power. The focus on cortical samples limits the understanding of cell types in other brain regions. The use of combined analyses across various developmental periods in fetal samples may have obscured findings for specific developmental stages. The moderate sample size for the word reading meta-analysis warrants cautious interpretation.