Veterinary Science
Bayesian model and selection signature analyses reveal risk factors for canine atopic dermatitis
K. Tengvall, E. Sundström, et al.
The study investigates the complex genetic architecture underlying canine atopic dermatitis (AD), a chronic inflammatory and pruritic skin disease often with early onset and clinical overlap with human AD. Prior evidence indicates a highly polygenic basis with environmental and epigenetic factors. In humans, the strongest genetic association is with Filaggrin (FLG) in the epidermal differentiation complex (EDC), and large GWAS/meta-analyses have identified 30+ risk loci. In dogs, breed predisposition (e.g., Golden and Labrador retrievers, German shepherd dog, West Highland white terrier) suggests heritable components, with multiple loci reported but limited replication and functional validation. Traditional single-variant GWAS may miss polygenic architectures due to stringent multiple testing and LD structure. The authors hypothesize that numerous small-to-moderate effect variants contribute to canine AD and aim to map these using a Bayesian mixture model (BayesR) and to detect selection signatures (XP-EHH) that may have driven accumulation of risk/protective variants within breeds.
Background work shows: (1) In human AD, FLG within the EDC is a major risk locus, and multi-ancestry GWAS/meta-analyses have identified 25–36 loci with FLG as the strongest signal. (2) In canine AD, breed-specific GWAS have implicated loci in GSD (CFA27), WHWT (CFA3 and CFA17), and GR (CFA3), with genes related to innate/adaptive immunity and skin barrier formation, but replication and functional validation remain limited, and within-breed genetic heterogeneity is evident. (3) Methodological considerations: conventional GWAS (e.g., LMM) test variants one at a time, are conservative after multiple-testing correction, may not account adequately for LD, and can overestimate the effects of variants that reach significance. Bayesian mixture models (e.g., BayesR) estimate variant effects jointly, accommodate LD, reduce false negatives, and give less biased effect estimates, which suits polygenic traits. (4) Dog breeds have undergone strong artificial selection, leading to selective sweeps and potential hitchhiking or pleiotropy that influences disease risk; XP-EHH can detect such recent selection signals. Together, these lines of evidence motivate using BayesR and XP-EHH to uncover polygenic risk and selection-related accumulation of AD risk alleles.
Study populations and phenotyping: Four predisposed breeds were analyzed: Labrador retriever (LR), Golden retriever (GR), German shepherd dog (GSD), and West Highland white terrier (WHWT). After QC and relatedness filtering, the analysis included 321 LR (178 cases, 143 controls), 256 GR (143 cases, 113 controls), 219 GSD (106 cases, 113 controls), and 235 WHWT (137 cases, 98 controls). Stringent AD case definitions emphasized an IgE-mediated association with environmental allergens and excluded purely food-induced AD. Controls were ≥5 years old without skin/immunologic disorders. For UK LR/GR, owner questionnaires were used with strict inclusion/exclusion criteria. Breed-type metadata (kennels, merits, coat color) supported substructure assignments in LR (gundog vs common type) and GSD (working vs show type).
Genotyping and imputation: Dogs were genotyped on Illumina CanineHD 170K or 230K arrays. Data were merged on canFam3.1 coordinates and quality-controlled per breed in PLINK (typical thresholds: --geno 0.05, --mind 0.05, --maf 0.05). Related individuals were filtered using KING, and PCA was run with GENESIS/SNPRelate to mitigate relatedness and population structure. Imputation used SHAPEIT2 pre-phasing and IMPUTE2 with a reference panel of 435 purebred dogs; per-chromosome internal cross-validation showed 96.2–98.8% concordance. A mask-impute validation (3.36% of SNPs) yielded 99.5% concordance across breeds (MAF-unfiltered 99.4–99.6%). Post-imputation per-breed datasets underwent additional LD pruning and QC.
Bayesian association analysis: BayesR (300,000 iterations, 100,000 burn-in, run 5 times) modeled variant effects via mixtures N(0,0), N(0,0.0001^2), N(0,0.001^2), N(0,0.01^2). Fixed covariates included significant PCs (PC1–2 for LR and WHWT; PC1–3 for GR and GSD) and, in GSD, also -log10(IgA) and -log10(Age). Effect variants were defined as those with a mean absolute effect size ≥1.0×10^-4. Associated loci were regions harboring effect variants with <1 Mb inter-variant distance; each locus was represented by its top effect variant.
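The locus definition above (keep variants at or above the effect-size cutoff, chain them while neighbours on the same chromosome are <1 Mb apart, and let the top variant represent each locus) can be sketched as follows. This is a minimal illustration and not the authors' code: the `Variant` record and the `group_loci` function are hypothetical names, and the input is assumed to be per-variant mean absolute effect sizes already averaged across BayesR runs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """Hypothetical record for one variant (not a BayesR output format)."""
    chrom: str
    pos: int        # base-pair position
    effect: float   # mean absolute effect size (assumed pre-averaged over runs)


def group_loci(variants, min_effect=1.0e-4, max_gap=1_000_000):
    """Group effect variants into loci.

    A variant is an "effect variant" if |effect| >= min_effect. Consecutive
    effect variants on the same chromosome join the same locus when they are
    less than max_gap bp apart.
    """
    kept = sorted(
        (v for v in variants if abs(v.effect) >= min_effect),
        key=lambda v: (v.chrom, v.pos),
    )
    groups = []
    for v in kept:
        last = groups[-1] if groups else None
        if last and last[-1].chrom == v.chrom and v.pos - last[-1].pos < max_gap:
            last.append(v)       # extend the current locus
        else:
            groups.append([v])   # start a new locus
    return [
        {
            "chrom": g[0].chrom,
            "start": g[0].pos,
            "end": g[-1].pos,
            "n_variants": len(g),
            "top": max(g, key=lambda v: abs(v.effect)),  # locus representative
        }
        for g in groups
    ]
```

Because variants are chained by the distance to their nearest kept neighbour, a locus can span more than 1 Mb in total, which fits the multi-variant locus reported for LR on CFA17.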
A per-dog risk index summed risk genotypes across LR loci (0, 0.5, or 1 per locus).
Selection scans: XP-EHH (rehh) compared extended haplotype homozygosity between cases and controls within each breed. Candidate regions under selection were defined using 1 Mb windows with 0.1 Mb overlap, requiring ≥2 markers with XP-EHH -log10(p) ≥4. Integrated haplotype homozygosity (iHH) and EHH decay were inspected around top selection variants. Association within extended regions was tested with PLINK (--assoc for the allelic chi-square test and --logistic with covariates).
Functional annotation and cross-species context: Genes within ±1 Mb of BayesR effect variants (BayesR regions) and within XP-EHH regions were compiled (UCSC canFam4). Gene set enrichment was tested in STRING using Homo sapiens and Canis lupus familiaris backgrounds. Variants were intersected with evolutionary conservation (phyloP across 240 mammals), ATAC-seq peaks (BarkBase), ENCODE candidate cis-regulatory elements (cCREs), and GeneHancer (via liftOver to hg38). Canine liver TADs were used as proxies to delineate putative regulatory neighborhoods. The human GWAS Catalog was queried for overlap with dermatitis/eczema/psoriasis traits.
Long-read sequencing (targeted follow-up of the LR chr17 locus): Oxford Nanopore Technologies (ONT) whole-genome long-read sequencing was performed on two LR AD cases (heterozygous for risk alleles at chr17 effect variants) and two controls (homozygous non-risk). Reads were base-called with Guppy and aligned to canFam4 with minimap2; SNVs and indels were called with Clair3, structural variants were detected with Sniffles, and phasing was done with WhatsHap. Variants sharing the risk-allele pattern and variants in LD (r^2>0.8) with BayesR effect variants were cataloged; overlaps with ATAC-seq, ENCODE cCREs, and GeneHancer were assessed. Regions of homozygosity/low heterozygosity were identified via phased read analysis.
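The window rule for candidate selection regions (at least 2 markers with -log10(p) ≥4 inside a 1 Mb window) can be illustrated with a short sketch. It is not rehh's implementation: `candidate_windows` and `merge_windows` are hypothetical helpers, and because this summary does not state how the 0.1 Mb overlap setting maps to the distance between window starts, the step is exposed as a parameter and set to 0.1 Mb purely as an assumption.

```python
def candidate_windows(markers, window_bp=1_000_000, step_bp=100_000,
                      threshold=4.0, min_extreme=2):
    """Scan one chromosome for windows enriched in extreme markers.

    markers: iterable of (position_bp, neg_log10_p) for a single chromosome.
    step_bp: distance between window starts (an assumption, see lead-in).
    Returns a list of (start, end, n_extreme) with half-open windows [start, end).
    """
    markers = sorted(markers)
    if not markers:
        return []
    last_pos = markers[-1][0]
    hits = []
    start = 0
    while start <= last_pos:
        end = start + window_bp
        n_extreme = sum(
            1 for pos, score in markers
            if start <= pos < end and score >= threshold
        )
        if n_extreme >= min_extreme:
            hits.append((start, end, n_extreme))
        start += step_bp
    return hits


def merge_windows(windows):
    """Join overlapping or touching qualifying windows into regions."""
    merged = []
    for start, end, _ in sorted(windows):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged
```

A single extreme marker never qualifies on its own under this rule, which is the point of the ≥2 marker requirement: isolated outliers are not reported as candidate regions.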
- Using BayesR, 15 AD-associated loci were identified: 11 in LR, 1 in GR, 1 in GSD, and 2 in WHWT. Notable LR loci included: CFA34 (top variant near ARL14), CFA4 (near ITGA1/ISL1), CFA36 (near UBE2E3/ITGA4), and a multi-variant locus on CFA17 (57.6–59.7 Mb) syntenic to the human EDC/FLG region. GR had an associated locus on CFA23 (SCN5A/ACVR2B region), GSD on CFA9 (ABCA9 intronic), and WHWT on CFA10 (HMGA2/LLPH region) and CFA15 (C4orf45 region).
- Risk index and variance explained: In LR, the mean risk index was significantly higher in cases (mean 15.6) than controls (13.1) with p=1.52×10^-22 (t=10.6, n=321). The LR risk index explained 26.4% of AD variance; modeling each locus separately yielded 32.8% total variance explained (largest locus CFA3 explaining 7.3%). PCs contributed 9.3% (risk index model) and 5.2% (separate loci). In GR and GSD, their single loci each explained 2.3% of variance; in WHWT, the two loci explained 16.1%.
- XP-EHH selection signals (8 candidate regions total):
  • LR: A strong selection signal on CFA3 spans TBC1D1; the top intronic selection variant chr3:74,218,744 (allele C) showed longer iHH in cases (618 kb) than controls (205 kb). Within the extended haplotype, AD-associated variants (chr3:assocA–D) formed a high-frequency risk haplotype; the CCG haplotype (chr3:assocA–C) explained 7.6% of AD variance and 18.4% of PC1 variance; chr3:sel explained 4.4% of AD variance and 25.1% of PC1 variance. Homozygous CCG occurred in 72.5% of cases vs 47.6% of controls; in common-type LR, CCG frequency was significantly associated with AD (χ^2=17.5, p=2.81×10^-5; n=245).
  • GSD: A broad selection signal on CFA19 across LRP1B (1078 selection variants; top chr19:44,248,511 in intron 1) showed longer iHH for allele T in controls (6.47 Mb) than in cases (3.18 Mb), indicating selection in controls. Association near LRP1B was strong in unadjusted tests but disappeared after adjusting for PCs and covariates, consistent with breed-type stratification. The T allele at chr19:sel was more frequent in working-type GSD and was associated with AD status within working-type GSD (χ^2=5.21, p=0.0224; n=107). An additional GSD selection region on CFA20 neighbored LRP1B/KYNU.
  • WHWT: Four XP-EHH regions were detected, on CFA1 (near SMOC2/THBS2), CFA10, CFA26, and CFA32 (near TET2/CXXC4).
- LR chromosome 17 locus and human concordance: The LR CFA17 locus (effect variants in two clusters at 57.6–58.2 Mb and 59.1–59.7 Mb) extends >3 Mb via LD to within ~0.5 Mb of the canine EDC (chr17:61–62 Mb). Two human AD GWAS signals overlapped this region (near BCL9; between effect variants chr17:h and chr17:i). ONT sequencing in LR identified: 486 variants in LD (r^2>0.8) with effect variants (238 heterozygous in cases and homozygous in controls), 26 overlapping canine ATAC-seq peaks (4 also overlapping cCRE and GeneHancer), 133 novel variants (9 in ATAC-seq peaks; 3 overlapping ATAC+cCRE+GeneHancer), and 65 structural variants across chr17:55–65 Mb. A 106 kb homozygous block in cases spanned VPS45 and the region upstream of OTUD7B; additional homozygous regions were observed near BCL9 and within ACP6.
- Pathways and cross-species relevance: STRING enrichment highlighted integrin alpha (ITGA) family proximity (ITGA1, ITGA4, ITGA9) among BayesR loci and the MyD88-dependent Toll-like receptor signaling pathway (MYD88, IRAK3, TNIP1 in BayesR; TLR1/6/10 in XP-EHH). Several candidate genes overlap human GWAS for AD/eczema/psoriasis or related traits: OTUD7B, PDE4DIP, CIART, MRPS21, SEMA6C (LR chr17 region), CLNK and WDR1 (LR chr3), TNIP1 (LR chr4), TLR1/10 (LR chr3 XP-EHH), LRP1B (GSD chr19; asthma/eczema), KYNU (psoriasis).
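The risk-index comparison reported in the results (a per-locus score of 0, 0.5, or 1 summed per dog, then compared between cases and controls) can be sketched as below. Two points are assumptions rather than details from the study: the mapping of 0, 1, and 2 risk-allele copies to scores of 0, 0.5, and 1, and the use of Welch's t statistic, since this summary reports only t and n without naming the test variant. The function names and the toy genotypes in the usage lines are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

# Assumed mapping: number of risk-allele copies at a locus -> per-locus score.
SCORE = {0: 0.0, 1: 0.5, 2: 1.0}


def risk_index(risk_allele_counts):
    """Sum per-locus scores for one dog (one 0/1/2 count per locus)."""
    return sum(SCORE[c] for c in risk_allele_counts)


def welch_t(a, b):
    """Welch's t statistic for two independent samples (unequal variances)."""
    va, vb = variance(a), variance(b)  # sample variances (n - 1 denominator)
    return (mean(a) - mean(b)) / sqrt(va / len(a) + vb / len(b))


# Toy genotypes at three loci, for illustration only (not study data).
cases = [[2, 2, 1], [2, 1, 1], [2, 2, 2]]
controls = [[1, 0, 1], [0, 1, 2], [2, 1, 1]]

case_idx = [risk_index(g) for g in cases]
control_idx = [risk_index(g) for g in controls]
print(case_idx, control_idx, round(welch_t(case_idx, control_idx), 3))
```

A positive t here means the case group carries a higher mean risk index than the control group, which is the direction reported for LR.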
The study addresses the polygenic nature of canine AD by leveraging a Bayesian framework that models many small-to-moderate effect variants concurrently, revealing 15 loci across four predisposed breeds. These findings substantiate that AD risk in dogs arises from the aggregate effect of multiple loci, some converging on pathways implicated in skin barrier function and immune signaling (integrins, MyD88-dependent TLR signaling). The LR chr17 locus overlaps the region syntenic to the human EDC/FLG, aligning canine and human AD genetics and strengthening the translational relevance of dog models. Long-read sequencing and regulatory annotation support putative functional elements within this locus impacting nearby immune/skin genes (e.g., OTUD7B, VPS45, ECM1). Selection analyses provide a population-genetic context: in LR, a case-enriched selection signal at TBC1D1 (linked to body weight across species) co-localizes with an AD risk haplotype, consistent with hitchhiking or pleiotropy whereby selection for conformation/body type in common-type LR increased AD risk allele frequencies. Conversely, in GSD, selection in working-type controls across LRP1B (with neighboring KYNU) suggests selection on traits unrelated to AD could have inadvertently altered the local haplotype carrying AD-related variants, potentially yielding protective effects. Breed substructures (type splits) therefore shape disease allele distributions and must be accounted for in association models. Overall, the results connect breed-specific selection histories with AD risk architecture and identify cross-species convergent loci/pathways, providing mechanistic hypotheses for future functional validation and advancing comparative dermatogenetics.
The authors identified 15 AD-associated loci across four dog breeds using a Bayesian mixture model and detected eight candidate selection regions via XP-EHH. A Labrador retriever locus on chromosome 17 overlaps the canine region syntenic to the human EDC/FLG locus, which supports a shared genetic etiology between canine and human AD. Selection signals implicate TBC1D1 in LR cases and LRP1B (near KYNU) in GSD controls, indicating that breed-type selection may have increased or decreased AD risk via hitchhiking or pleiotropy. Pathway analyses highlight the involvement of integrin and MyD88-dependent TLR signaling. These findings refine the genetic landscape of canine AD and motivate cross-species comparisons and functional follow-up. Future work should expand cohorts, integrate denser genomic and multi-omic data, and experimentally validate candidate regulatory variants and haplotypes to resolve causal mechanisms and inform risk prediction and therapeutic strategies.
- Effect-size thresholding: The chosen BayesR cutoff (|effect| ≥1×10^-4) is somewhat arbitrary; it likely limited discovery to higher-effect signals in breeds where only 1–2 loci were detected, while lowering thresholds risks false positives.
- Sample size and substructure: Per-breed cohorts (~200–400 dogs) limit power for small-effect variants; pronounced breed-type substructure (LR common vs gundog; GSD show vs working) can confound associations despite PC adjustment.
- Imputation and reference: Although validation concordance was high, imputation may miss breed-unique haplotypes not captured by the reference panel.
- Tissue/context for TADs and regulatory annotation: Canine liver TADs were used as proxies due to data availability; skin-specific chromatin architecture might differ. Cross-species lifts (to hg38) and ATAC/cCRE overlaps suggest function but are indirect.
- Functional validation: Causal variants and mechanisms remain to be experimentally validated; many signals are in noncoding regions and in LD blocks.
- Generalizability: Findings may be breed- and subtype-specific and not fully generalizable across all dog populations or mixed-breed dogs.