Functional annotations of three domestic animal genomes provide vital resources for comparative and agricultural research

C. Kern, Y. Wang, et al.

This research from Colin Kern, Ying Wang, and their colleagues at the University of California, Davis, examines how gene regulatory elements relate to phenotypic variation in key agricultural species. It reports conservation of regulatory elements across species and provides a resource for comparative epigenomics and agricultural research.

Introduction
The study addresses how regulatory elements (REs) in genomes of key domesticated animals (chicken, pig, cattle) are organized and conserved, and how these elements relate to complex traits important for agriculture. Given that most causal variants for complex traits reside in non-coding regulatory regions, the research aims to functionally annotate REs across tissues and compare their conservation and function across vertebrates (including human and mouse). The work is motivated by the need to improve food production sustainability and understand genotype-to-phenotype links, particularly under challenges like climate change and pandemics. The central questions are the extent of positional and functional conservation of REs across large evolutionary distances and how tissue-specific regulatory logic is preserved.
Literature Review
Previous comprehensive projects, notably ENCODE in human and mouse, established catalogs of promoters, enhancers, and other regulatory features and highlighted their roles in health, disease, and evolution. Prior comparative studies indicated that specific sequences and genomic positions of enhancers show limited conservation, whereas some regulatory properties and trans-acting circuitry can be conserved across species. Studies in non-model organisms and birds examined regulatory sequence evolution and lineage-specific regulatory innovation. However, broad comparative epigenomic analyses across mammals and birds with harmonized assays and tissues remained sparse, leaving open questions about conservation of epigenomic states, transcription factor (TF) occupancy, and enhancer–gene targeting relationships over deep evolutionary time.
Methodology
Study design and samples: Eight tissues (liver, lung, spleen, skeletal muscle, subcutaneous adipose, cerebellum, brain cortex, hypothalamus) were collected from sexually mature male individuals of chicken (Gallus gallus), pig (Sus scrofa), and cattle (Bos taurus). Two biological replicates were processed per species (16 tissue samples per species). Tissues were flash frozen and stored at −80 °C.
Assays: Six epigenomic data types were generated per tissue: ChIP-seq for H3K4me3, H3K27ac, H3K4me1, H3K27me3, and CTCF, plus open chromatin via DNase-seq (chicken) or ATAC-seq (pig, cattle). RNA-seq provided the transcriptomes. In total, 240 ChIP-seq libraries produced 5.02B (chicken), 4.28B (pig), and 6.81B (cattle) reads; DNase-seq (15 libraries) totaled 0.81B reads; ATAC-seq totaled 1.04B reads (pig, 16 samples) and 1.19B reads (cattle, 15 samples).
Data processing and QC: Reads were trimmed (Trim Galore), aligned (RNA-seq: STAR; ChIP-seq: BWA-MEM), filtered (MAPQ ≥30), and deduplicated (Picard). Peak calling used MACS2 (narrow marks q≤0.01; broad marks q≤0.05). Quality metrics (NRF, PBC1/2, NSC, RSC, JSD, FRiP) followed ENCODE guidelines. Data reproducibility was assessed via hierarchical clustering of read depth for ChIP/open chromatin and PCA for RNA-seq.
Chromatin state modeling and RE identification: ChromHMM was trained across marks, tissues, and species to produce a 14-state model. Active states (1–6, 8, 9, 11) were merged per tissue and across tissues to define regulatory elements, which were annotated by activity and genomic context: TSS-proximal (within 2 kb of a protein-coding TSS), genic (gene body, excluding TSS-proximal), and intergenic. Enrichment profiles for histone marks and accessibility were computed around RE centers. Open chromatin was identified from combined replicate DNase/ATAC alignments.
Comparative analysis across species: Human and mouse datasets for matching tissues/stages from ENCODE were processed with the same pipeline. RE coordinates were mapped pairwise across genomes using the Ensembl Compara amniota multiple alignment (Ensembl v99). An RE was considered positionally conserved if its mapped coordinates overlapped a target-species RE by ≥1 bp. Epigenomic conservation was assessed by whether the mapped coordinates showed RE activity in the target species. Evolutionary distances were obtained from TimeTree. KEGG pathway enrichment (DAVID) was performed on genes with promoters conserved across species.
TF footprinting and motif enrichment: Footprints were called using HINT (DNase bias correction for chicken; ATAC mode for pig/cattle). Validation with CTCF ChIP-seq confirmed a high fraction of CTCF footprints. Motif enrichment within ±10 bp of footprints used the HOMER vertebrate motif database; enrichment and TF gene expression patterns were compared across tissues and species.
TAD prediction and enhancer–gene pairing: TADs were predicted per species using pooled CTCF ChIP-seq peaks with motif orientation to infer loops (FIMO; loop merging per Oti et al.). Predicted TADs covered 82% (chicken), 91% (pig), and 92% (cattle) of the genomes. Genes and REs with low variance were filtered out, using an expression or H3K27ac variance threshold that approximates a housekeeping-gene filter. Within each TAD, Spearman correlations were computed between H3K27ac signal at REs and TMM-normalized gene expression across samples; pairs with BH-adjusted p<0.05 were retained as putative RE–gene links. Correlations were compared against naïve nearest/overlapping-gene assignments. PCA on H3K27ac at REs targeting one-to-one orthologs assessed tissue versus species clustering.
Trait variant overlap: Dairy cattle GWAS and QTL-derived imputed variants were lifted to ARS-UCD1.2 and intersected with cattle REs (BEDTools). P-value distributions inside vs outside REs were compared, and enrichment of variant categories (e.g., geQTL, mQTL, sQTL, eeQTL) within REs was quantified (Fisher exact tests).
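The enhancer–gene pairing step (Spearman correlation within a TAD, followed by Benjamini-Hochberg adjustment) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the signal and expression values are invented toy numbers for one hypothetical TAD, the names (`link_res_to_genes`, `re1`, `geneA`, `geneB`) are made up for illustration, and an exact permutation p-value is swapped in for whatever test the study used so that the example needs only the standard library. The permutation approach is only practical for a handful of samples.

```python
from itertools import permutations


def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            out[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return out


def pearson(x, y):
    """Pearson correlation; inputs are assumed to have non-zero variance."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def exact_p(x, y):
    """Two-sided exact permutation p-value for Spearman's rho (small n only)."""
    rx, ry = ranks(x), ranks(y)
    observed = abs(pearson(rx, ry))
    hits = total = 0
    for perm in permutations(ry):
        total += 1
        if abs(pearson(rx, perm)) >= observed - 1e-12:
            hits += 1
    return hits / total


def bh_adjust(pvals):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvals)
    adjusted = [0.0] * m
    running_min = 1.0
    by_p_desc = sorted(range(m), key=lambda i: pvals[i], reverse=True)
    # Walk from the largest p-value down, enforcing monotonicity.
    for position, i in enumerate(by_p_desc):
        rank = m - position
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def link_res_to_genes(re_signal, gene_expr, alpha=0.05):
    """Keep RE-gene pairs in one TAD whose BH-adjusted p-value is below alpha."""
    pairs = [(r, g) for r in re_signal for g in gene_expr]
    pvals = [exact_p(re_signal[r], gene_expr[g]) for r, g in pairs]
    return [pair for pair, q in zip(pairs, bh_adjust(pvals)) if q < alpha]


# Toy data for one hypothetical TAD: six samples, one RE, two genes.
re_signal = {"re1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]}  # H3K27ac signal
gene_expr = {
    "geneA": [2.1, 3.9, 6.2, 8.1, 9.8, 12.5],  # tracks the RE signal
    "geneB": [5.0, 1.0, 4.0, 2.0, 6.0, 3.0],   # unrelated to the RE signal
}
links = link_res_to_genes(re_signal, gene_expr)
print(links)  # prints [('re1', 'geneA')]
```

In this sketch the adjustment is applied across the pairs of a single TAD; whether the study adjusted per TAD or across all tested pairs is not stated in this summary.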
Key Findings
- Generated a harmonized, multi-assay epigenomic and transcriptomic resource across eight tissues in chicken, pig, and cattle: 240 ChIP-seq libraries (5.02B reads chicken; 4.28B pig; 6.81B cattle), 15 DNase-seq libraries (0.81B reads), and ATAC-seq (1.04B pig; 1.19B cattle reads), with high reproducibility and ENCODE-grade QC.
- ChromHMM 14-state models annotated functional states across genomes: the proportion of the genome with any functional state was 53% (chicken), 40% (pig), 31% (cattle), varying by tissue.
- RE composition: Chickens had approximately half as many genic/intergenic enhancers as mammals, while TSS-proximal RE counts were similar. The majority of active REs overlapped open chromatin in active tissues: 75±12% (chicken), 75±12% (pig), 69±15% (cattle). Among expressed genes (≥1 CPM): 70% (chicken of 11,476), 79% (pig of 12,203), 78% (cattle of 13,074) had active TSS-proximal REs.
- Positional and epigenomic conservation: Mapping rates of REs decreased with evolutionary distance; intergenic enhancers mapped least frequently at all distances. Among mapped elements, epigenomic conservation averaged 77±8% for promoters and 33±8.1% for enhancers (genic and intergenic). Conservation rates declined minimally with evolutionary distance.
- Conserved core REs: Identified 9,458 REs conserved across mammals. Across all five species (including chicken), 3,153 promoters and 1,452 enhancers were conserved, indicating substantial conservation over >300 million years. Conserved enhancers were rarely tissue-specific; conserved promoters were enriched for KEGG pathways tied to basic metabolic processes.
- TF footprinting: 26 TF motifs exhibited conserved tissue-specific enrichment across species (including human and mouse). Examples: FOXA2 and HNF1B enriched/expressed in liver; SIX1 in skeletal muscle; ETS1 and FLII associated with spleen-specific targets.
- TADs and enhancer–gene predictions: Predicted TADs: chicken 1,302 (82% coverage; median 355,527 bp), pig 2,574 (91%; median 385,854 bp), cattle 3,248 (92%; median 370,785 bp). Correlation-based RE–gene links within TADs: chicken 29,526 pairs (10,937 REs; 5,519 genes; mean genes/RE 2.7; REs/gene 2), pig 58,523 (31,735 REs; 8,233 genes; 1.8; 3.8), cattle 28,849 (16,348 REs; 7,113 genes; 1.8; 4.1). Maximum predicted targets per enhancer: pig 33, chicken 23, cattle 22. Predicted pairs improved correlation patterns versus naïve nearest/overlap assignments (activating marks more positive; H3K27me3 more negative).
- Cross-species regulatory similarity: PCA of H3K27ac at REs targeting orthologous genes clustered samples by tissue rather than species, indicating conserved tissue-specific regulatory programs despite low positional conservation.
- Complex trait relevance: In dairy cattle GWAS, SNPs within REs showed p-value distributions skewed to greater significance for milk protein, fat, and volume. Variant categories were enriched in REs, with geQTLs appearing ~2.5× more frequently in REs than uncategorized SNPs (Fisher exact p<1e-5), supporting regulatory functionality of the annotated elements.
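The variant-enrichment result in the last finding rests on a 2×2 comparison of SNP counts inside and outside REs. A minimal sketch of that kind of test follows, using only the Python standard library. The counts are invented for illustration and are not from the study, the function names are hypothetical, and reading "fold enrichment" as the ratio of in-RE fractions between the two SNP groups is an assumption about how the ~2.5× figure was derived.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher exact p-value for enrichment in the top-left cell.

    Table layout:              in RE   outside RE
        category of interest     a         b
        comparison SNPs          c         d
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    # Hypergeometric tail: probability of seeing a or more category SNPs in REs.
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return tail / comb(n, col1)


def fold_enrichment(a, b, c, d):
    """Fraction of category SNPs in REs divided by that of comparison SNPs."""
    return (a / (a + b)) / (c / (c + d))


# Hypothetical counts: 30 of 100 category SNPs fall in REs,
# versus 120 of 900 comparison SNPs.
a, b, c, d = 30, 70, 120, 780
fold = fold_enrichment(a, b, c, d)
p_value = fisher_exact_greater(a, b, c, d)
print(f"fold enrichment = {fold:.2f}, one-sided p = {p_value:.2e}")
```

The one-sided form is used here because the question is specifically whether a category is over-represented in REs; a two-sided test would also count depletion as evidence.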
Discussion
The study demonstrates that while enhancer sequences and positions show limited conservation across distant vertebrates, the functional properties of regulatory elements—such as tissue-specific TF motif enrichment and enhancer–gene regulatory relationships—are conserved. A core set of promoters and enhancers is maintained across mammals and birds and is associated with essential metabolic pathways. Tissue clustering of H3K27ac at REs targeting orthologous genes further supports conserved regulatory logic. Chickens possess fewer enhancers overall yet achieve a comparable number of enhancer–gene interactions, suggesting higher multifunctionality per enhancer relative to mammals. The resource enhances interpretation of genetic variants associated with complex agricultural traits by prioritizing those located within functionally annotated REs and linking them to putative target genes. Collectively, these findings address the central question of conservation in regulatory logic across large evolutionary distances and provide practical avenues to connect genotype to phenotype in agricultural species.
Conclusion
This work delivers a harmonized multi-tissue functional annotation of chicken, pig, and cattle genomes and integrates comparative analyses with human and mouse. Key contributions include: a high-quality epigenomic resource; evidence for a conserved core of regulatory elements and conserved tissue-specific regulatory features across vertebrates; and a predictive framework linking enhancers to target genes within TADs. The resource improves prioritization of causative variants for complex traits (e.g., GWAS in dairy cattle). Future directions include expanding to additional tissues and developmental stages, incorporating female samples, applying single-cell technologies, and generating Hi-C maps to refine 3D genome architecture and enhancer–promoter interactions.
Limitations
- Positional conservation assessments rely on genome alignments and may miss functional conservation where sequence alignment is poor, particularly for distal enhancers.
- Open chromatin assays differed by species (DNase-seq in chicken vs ATAC-seq in pig/cattle), introducing assay-specific biases in footprint detection and cross-species comparisons.
- TADs were predicted using CTCF ChIP-seq rather than measured by Hi-C; while CTCF-based predictions are informative, they may not capture all chromatin interactions.
- Sample set limited to adult male individuals, eight tissues, and two biological replicates per species; results may not generalize to females, other tissues, or developmental stages.
- Correlation-based enhancer–gene links, while improved over nearest-gene methods, remain associative and require experimental validation.