Epigenomic data analysis

Uncover epigenetic mechanisms of gene regulation in development and disease.

Epigenomics characterizes chromatin state down to individual chemical modifications. Epigenetic changes to DNA and its associated proteins affect gene expression and may lead to altered cellular states, including disease.

We analyze a wide range of epigenomic sequencing data to gain a deeper understanding of intracellular molecular mechanisms and to identify disease biomarkers.

Below we discuss common epigenomic data types and analyses, and present some of our past work involving epigenomic data analysis. To discuss your epigenomic bioinformatics needs, just leave us a message.

Epigenomic assays

High-throughput assays for epigenomic profiling are numerous, and new protocols are being developed continuously. The most common epigenomic assays focus on DNA methylation, DNA-binding proteins, histone modifications, chromatin accessibility or the 3D conformation of the chromatin.

  • DNA methylation. Assays based on bisulphite-treated DNA identify methylation events at single-base resolution. Such assays use next-generation sequencing (whole-genome or reduced-representation bisulphite sequencing) or microarrays. An alternative approach, MeDIP-sequencing, relies on immunoprecipitation and has lower resolution. Of note, long-read Oxford Nanopore sequencing can detect methylated CpGs directly.

  • Transcription factor binding and histone modifications. Assays that identify DNA-bound proteins such as transcription factors, as well as chemical modifications of histone proteins, rely on antibodies. ChIP-sequencing is the most common method, but newer alternatives with better resolution have been developed. These include ChIP-exo, ChIPmentation, CUT&RUN and CUT&Tag.

  • Chromatin accessibility. The gold standard assay for mapping regions of open chromatin is ATAC-sequencing. ATAC-seq has largely replaced previous methods such as DNase-seq and FAIRE-seq.

  • Chromatin conformation. The three-dimensional conformation of chromatin has gained particular appreciation in the study of gene regulation, and chromatin conformation data are also commonly used in de novo genome assembly. Chromatin conformation assays can be used to study the physical interactions between genes and their distal regulatory elements, as well as the proteins that cause such looping of the chromatin. Hi-C is a typical assay for the former, while ChIA-PET can be applied to the latter.

To study the epigenome's direct effect on gene expression, epigenomic measurements are often complemented with RNA-sequencing experiments in the same setting.

Single-cell ATAC-sequencing, in particular, is increasingly performed as a co-assay with single-cell RNA-sequencing. This yields gene expression and chromatin accessibility profiles from the same individual cells.

Peak calling and annotation

The analysis workflow for most sequencing-based epigenomic data (particularly ChIP-seq, ATAC-seq and related experiments) involves identifying, annotating and analysing peaks, that is, genomic regions with a signal of interest.

The raw sequencing reads are first quality-controlled and aligned to a reference genome, after which possible control libraries (pre-IP input and IP with non-specific antibody, in the case of ChIP-seq) are used to normalize the read coverage signal.

Peaks in the signal are identified using a peak caller. This phase may require careful parameter tuning to suit the protocol used.
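
As a rough illustration of the idea, the sketch below scales a binned treatment signal to the control library size and reports runs of consecutive bins that exceed a fold-enrichment threshold. It is a toy example on made-up counts, not a substitute for a dedicated peak caller, which would model the background statistically. The function name, bin size, threshold and pseudocount are all assumptions chosen for illustration.

```python
def call_peaks(treatment, control, bin_size=100, min_fold=1.5, pseudocount=1.0):
    """Return (start, end) coordinates of runs of enriched bins.

    treatment, control: read counts per genomic bin (same length).
    A bin is enriched when the depth-scaled treatment/control ratio
    is at least min_fold.
    """
    # Scale the treatment library to the sequencing depth of the control.
    scale = sum(control) / sum(treatment)
    peaks, run_start = [], None
    for i, (t, c) in enumerate(zip(treatment, control)):
        fold = (t * scale + pseudocount) / (c + pseudocount)
        if fold >= min_fold:
            if run_start is None:
                run_start = i  # open a new candidate peak
        elif run_start is not None:
            peaks.append((run_start * bin_size, i * bin_size))
            run_start = None
    if run_start is not None:  # close a peak that reaches the last bin
        peaks.append((run_start * bin_size, len(treatment) * bin_size))
    return peaks


# Hypothetical read counts in eight 100 bp bins.
treatment = [10, 12, 60, 80, 11, 9, 50, 8]
control = [30, 30, 30, 30, 30, 30, 30, 30]
peaks = call_peaks(treatment, control)
print(peaks)
```

Lowering or raising min_fold changes how many bins are merged into peaks, which is the kind of parameter tuning referred to above.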

To enable further analysis, peaks are annotated with relevant information such as read statistics and nearby or overlapping features, including genes, regulatory elements and binding motifs.

Annotating peaks with genes enables gene set enrichment analyses for further interpretation of downstream effects.
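
A minimal sketch of gene annotation, assuming peaks and transcription start sites (TSSs) on a single chromosome: each peak is assigned the gene whose TSS lies closest to the peak midpoint. The gene names, coordinates and the annotate_peak helper are hypothetical, and real annotation would also consider strand, multiple transcripts and other feature types.

```python
from bisect import bisect_left

# Hypothetical TSS positions on one chromosome: (position, gene name).
TSS = sorted([(1_000, "GENE_A"), (5_000, "GENE_B"), (12_000, "GENE_C")])
TSS_POSITIONS = [pos for pos, _ in TSS]


def annotate_peak(start, end):
    """Return (nearest gene, distance from the peak midpoint to its TSS)."""
    midpoint = (start + end) // 2
    i = bisect_left(TSS_POSITIONS, midpoint)
    # Only the TSS just left and just right of the midpoint can be nearest.
    candidates = TSS[max(i - 1, 0): i + 1]
    pos, gene = min(candidates, key=lambda t: abs(t[0] - midpoint))
    return gene, abs(pos - midpoint)


peaks = [(900, 1_100), (4_200, 4_400), (11_000, 11_500)]
annotations = [annotate_peak(start, end) for start, end in peaks]
print(annotations)
```

The resulting gene list is what a gene set enrichment analysis would take as input.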

Exploratory analysis

Annotated peaks across the sample set are visualized using PCA (and UMAP or t-SNE algorithms for single-cell data) and heatmaps. These visualizations help in optimizing the peak calling process and answer questions such as:

  • Do the biological replicates resemble each other with regard to their epigenomic profiles?
  • Do distinct sample groups (e.g., different tissues, treatments or time points) form separate clusters?
  • Are there outlier samples?
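
One simple way to approach the first two questions is a sample-by-sample correlation matrix of peak signal, which is also what a correlation heatmap displays. The sketch below uses made-up normalized signal values for five peaks in four samples; the sample names and values are assumptions for illustration only.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical normalized signal at the same five peaks in each sample.
signal = {
    "control_rep1": [5.0, 1.0, 8.0, 2.0, 7.0],
    "control_rep2": [5.5, 1.2, 7.6, 2.1, 6.8],
    "treated_rep1": [1.0, 6.0, 2.0, 7.5, 1.5],
    "treated_rep2": [1.3, 6.4, 1.8, 7.0, 1.9],
}

names = list(signal)
corr = {(a, b): pearson(signal[a], signal[b]) for a in names for b in names}

for a in names:
    print(a, [round(corr[(a, b)], 2) for b in names])
```

In this toy data the replicates correlate strongly with each other and the two conditions do not, which is the pattern expected from a well-behaved experiment. A sample that correlates poorly with its own group would be a candidate outlier.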

Differential peak analysis

To compare conditions, the identified peaks can be compared statistically or, more commonly, differential peaks can be called directly from the respective read coverage signals.

As in differential gene expression analysis, differential peak analysis yields estimates of effect size and statistical significance. These statistics can be visualized as a volcano plot.
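
The sketch below shows how the two statistics map onto volcano plot coordinates: the log2 fold change on the x-axis and the negative log10 p-value on the y-axis. The per-peak means and p-values are invented, the cutoffs are arbitrary assumptions, and in practice the p-values would come from a dedicated statistical model and be corrected for multiple testing.

```python
from math import log2, log10

# Hypothetical per-peak results: mean normalized signal in conditions A and B,
# and a p-value from a differential test run elsewhere.
results = [
    {"peak": "peak_1", "mean_a": 10.0, "mean_b": 80.0, "pvalue": 1e-6},
    {"peak": "peak_2", "mean_a": 50.0, "mean_b": 55.0, "pvalue": 0.4},
    {"peak": "peak_3", "mean_a": 64.0, "mean_b": 8.0, "pvalue": 1e-4},
    {"peak": "peak_4", "mean_a": 5.0, "mean_b": 40.0, "pvalue": 0.2},
]


def volcano_point(result, lfc_cutoff=1.0, p_cutoff=0.01):
    """Convert one result into volcano plot coordinates and a label."""
    lfc = log2(result["mean_b"] / result["mean_a"])
    if result["pvalue"] < p_cutoff and abs(lfc) >= lfc_cutoff:
        label = "gained" if lfc > 0 else "lost"
    else:
        label = "unchanged"
    return {
        "peak": result["peak"],
        "x": lfc,                        # effect size
        "y": -log10(result["pvalue"]),   # significance
        "label": label,
    }


points = [volcano_point(r) for r in results]
for p in points:
    print(p["peak"], round(p["x"], 2), round(p["y"], 2), p["label"])
```

Note that peak_4 has a large fold change but is labelled unchanged, because its p-value does not pass the cutoff.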

As genome-wide epigenomic measurements yield a continuous signal across the genome, such analyses may also focus on specific regions of interest, such as promoters or known binding sites of a protein of interest. Density heatmaps are used to visualize the signal at sites of interest in different conditions.

Furthermore, overlapping binding motifs at the peaks can be statistically compared between conditions and visualized as volcano plots.

Transcription factor binding site analyses

ChIP-seq and related protocols can be used to identify transcription factor (TF) binding sites across the genome. Such assays rely on antibodies specific to the protein of interest, and this approach thus enables identifying binding sites of just one TF. ATAC-seq data, on the other hand, can be used to identify binding sites of all DNA-bound proteins in parallel, through an analysis called TF footprinting.

In TF footprinting, narrow drops in the chromatin accessibility signal are interpreted as protein binding sites. The identity of the TF may be indirectly inferred from binding motifs. Coupled with RNA-seq data, TF footprinting can be used to study the combined effects of TFs on gene expression in a very high-throughput manner.
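
To make the notion of a "narrow drop" concrete, the following sketch scans a per-base accessibility signal for short windows whose mean signal is well below that of the flanking bases. It is a deliberately simplified illustration on made-up values: the window width, flank size and depth threshold are assumptions, and dedicated footprinting tools additionally correct for enzyme sequence bias and match footprints to motifs.

```python
def find_footprints(signal, width=3, flank=3, min_depth=0.6):
    """Return non-overlapping (start, end) windows that look like footprints.

    Depth is 1 - (mean signal in window / mean signal in both flanks),
    so 0 means no drop and 1 means the signal vanishes completely.
    """
    candidates = []
    for start in range(flank, len(signal) - width - flank + 1):
        end = start + width
        center_mean = sum(signal[start:end]) / width
        flanks = signal[start - flank:start] + signal[end:end + flank]
        flank_mean = sum(flanks) / len(flanks)
        if flank_mean == 0:
            continue  # closed chromatin, nothing to compare against
        depth = 1 - center_mean / flank_mean
        if depth >= min_depth:
            candidates.append((depth, start, end))

    # Keep the deepest windows first and drop any that overlap a kept one.
    footprints = []
    for depth, start, end in sorted(candidates, reverse=True):
        if all(end <= s or start >= e for s, e in footprints):
            footprints.append((start, end))
    return sorted(footprints)


# Hypothetical per-base cut counts within an accessible region.
signal = [10, 10, 10, 10, 2, 1, 2, 10, 10, 10, 10, 10]
footprints = find_footprints(signal)
print(footprints)
```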

DNA methylation data analysis

The analysis of DNA methylation data starts with the quality control and alignment of sequencing reads (or QC and normalization of array data), and proceeds to calling the methylated sites.

Detected methylated sites are used to identify larger regions of DNA methylation or differentially methylated regions (DMRs) between samples. These regions can be annotated in the same way as peaks in other epigenomic data.
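
A minimal sketch of these two steps, assuming bisulphite sequencing counts for two samples: the methylation level (beta value) of each CpG is the fraction of methylated reads, and neighbouring CpGs with a consistent, large difference between samples are merged into a DMR. The read counts, thresholds and function names are hypothetical, and real DMR callers use replicates and statistical testing rather than fixed cutoffs.

```python
def beta(methylated, unmethylated):
    """Fraction of methylated reads at one CpG, or None without coverage."""
    total = methylated + unmethylated
    return methylated / total if total else None


def call_dmrs(a, b, min_delta=0.3, max_gap=200, min_cpgs=3):
    """Merge nearby CpGs with a large same-direction beta difference (b - a).

    Returns (start, end, mean delta) for each region with >= min_cpgs sites.
    """
    dmrs, run = [], []

    def flush():
        if len(run) >= min_cpgs:
            mean_delta = sum(d for _, d in run) / len(run)
            dmrs.append((run[0][0], run[-1][0], mean_delta))
        run.clear()

    for pos in sorted(set(a) & set(b)):
        beta_a, beta_b = beta(*a[pos]), beta(*b[pos])
        if beta_a is None or beta_b is None:
            continue  # no coverage in one of the samples
        delta = beta_b - beta_a
        if abs(delta) < min_delta:
            flush()  # an unchanged CpG ends the current region
            continue
        too_far = run and pos - run[-1][0] > max_gap
        flipped = run and (delta > 0) != (run[-1][1] > 0)
        if too_far or flipped:
            flush()
        run.append((pos, delta))
    flush()
    return dmrs


# Hypothetical CpG position -> (methylated reads, unmethylated reads).
sample_a = {100: (18, 2), 150: (17, 3), 210: (19, 1), 900: (10, 10), 950: (2, 18)}
sample_b = {100: (4, 16), 150: (5, 15), 210: (3, 17), 900: (11, 9), 950: (3, 17)}
dmrs = call_dmrs(sample_a, sample_b)
print(dmrs)
```

Here the three CpGs at positions 100 to 210 lose methylation in sample B and form one DMR, while the two distal CpGs are essentially unchanged.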

Possible downstream analyses for DNA methylation data include:

  • Integration with gene expression data. When RNA-seq or other gene expression data is available from the same setting, the association of promoter methylation and gene expression can be studied.
  • Epigenetic biomarker discovery. DNA methylation data from patient samples enables the discovery of clinically relevant epigenetic markers.
  • Biological age analysis. Epigenetic models of biological aging have been developed for DNA methylation data. Such models can be used to estimate the biological, as opposed to chronological, age of an individual or specific tissue within an individual.
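
As an illustration of the biological age analysis mentioned in the list above, the sketch below applies a linear model to CpG methylation levels. It assumes the simplest possible form, an intercept plus a weighted sum of beta values. The CpG identifiers, weights and intercept are entirely hypothetical and do not correspond to any published clock.

```python
# Hypothetical model: CpG identifier -> weight (years per unit of beta value).
CLOCK_WEIGHTS = {
    "cg_hypothetical_1": 40.0,
    "cg_hypothetical_2": -25.0,
    "cg_hypothetical_3": 10.0,
}
CLOCK_INTERCEPT = 30.0


def predict_age(betas):
    """Predicted age = intercept + sum(weight * beta) over the model's CpGs."""
    missing = [cpg for cpg in CLOCK_WEIGHTS if cpg not in betas]
    if missing:
        raise KeyError(f"missing CpGs required by the model: {missing}")
    return CLOCK_INTERCEPT + sum(w * betas[cpg] for cpg, w in CLOCK_WEIGHTS.items())


# Hypothetical methylation levels for one individual.
sample_betas = {
    "cg_hypothetical_1": 0.5,
    "cg_hypothetical_2": 0.2,
    "cg_hypothetical_3": 0.8,
}
chronological_age = 50.0

predicted_age = predict_age(sample_betas)
age_acceleration = predicted_age - chronological_age
print(predicted_age, age_acceleration)
```

The difference between predicted and chronological age is often the quantity of interest, since it indicates whether a sample appears older or younger than expected.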

Integrating RNA-seq and epigenomic data

Performing RNA-seq and epigenomic sequencing (such as ChIP-seq or ATAC-seq) on the same samples enables integrative analyses to study gene regulatory programs genome-wide.

Regulatory connections can be identified between enhancers and their target genes, as well as transcription factors and their targets, building on evidence from both gene expression and the epigenomic status of regulatory elements.
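
One common heuristic for such connections, sketched here under simplifying assumptions, is to link an enhancer to a gene when the enhancer lies within a distance window of the gene's transcription start site and its accessibility correlates with the gene's expression across samples. The element names, coordinates, values and cutoffs below are invented for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


# Hypothetical measurements from the same five samples.
enhancers = {
    "enh_1": {"pos": 10_000, "accessibility": [1.0, 2.0, 3.0, 4.0, 5.0]},
    "enh_2": {"pos": 60_000, "accessibility": [5.0, 3.0, 4.0, 1.0, 2.0]},
    "enh_3": {"pos": 900_000, "accessibility": [1.1, 2.1, 2.9, 4.2, 5.0]},
}
genes = {
    "GENE_X": {"tss": 50_000, "expression": [2.0, 4.1, 6.2, 7.9, 10.0]},
}


def link_enhancers(enhancers, genes, max_distance=100_000, min_r=0.8):
    """Return (enhancer, gene, r) for nearby, positively correlated pairs."""
    links = []
    for gene_name, gene in genes.items():
        for enh_name, enh in enhancers.items():
            if abs(enh["pos"] - gene["tss"]) > max_distance:
                continue  # too far away to be considered
            r = pearson(enh["accessibility"], gene["expression"])
            if r >= min_r:
                links.append((enh_name, gene_name, r))
    return links


links = link_enhancers(enhancers, genes)
for enh_name, gene_name, r in links:
    print(enh_name, gene_name, round(r, 3))
```

In this toy data only enh_1 is linked: enh_2 is nearby but anti-correlated with expression, and enh_3 correlates well but lies outside the distance window.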

Meet some of our epigenomics experts

I am a senior bioinformatician and molecular biologist with several years of experience in generating and analysing different types of next-generation sequencing data.

I have used a wide range of bioinformatics tools and have in-depth knowledge of different types of genomics data, including DNA methylation (sequencing and Illumina arrays), whole-genome sequencing, RNA-seq, single-cell RNA-seq, ATAC-seq and ChIP-seq. I have worked with human, rat, mouse and zebrafish data, as well as cell line data. During my academic work I applied pathway and network analysis methods to identify changes associated with disease and drug exposure.

My background in molecular biology coupled with my enthusiasm and expertise in bioinformatics put me in a unique position to understand and analyse large biological datasets.

Joana Viana, Scientific Project Manager, Genevia Technologies Oy

I am a clinically-trained senior bioinformatics scientist with 10 years of experience in veterinary medicine, molecular biology, and functional genomics. My research background includes human and animal genetics, epigenetics, neurobiology, and microbiology.

I have expertise analyzing a variety of next-generation sequencing data types, including transcriptomics (RNA-seq, sc/snRNA-seq), epigenomics (WGBS, RRBS, ONT), metagenomics (16S, ONT), and de novo genome assembly and annotation.

I have worked with organisms across the tree of life, including viruses, bacteria, insects, fish, wild mammals, and humans.

Nicole Flack, Scientific Project Manager, Genevia Technologies Oy

I am an experienced biologist/bioinformatician specialized in mapping and functionally interrogating DNA regulatory elements and their target genes in normal cell development and disease using high-throughput genomics and genome editing tools.

I have over 7 years of experience in profiling accessible chromatin using high-throughput methods (DNase I-seq, ATAC-seq, ChIP-seq, CUT&RUN) and in transcriptomic profiling (RNA-seq) along hematopoietic development. I also have 4+ years of experience in generating and analyzing single-cell multi-omic data (10X Genomics scRNA-seq, scATAC-seq) using a variety of computational tools.

Additionally, my all-around experience in life science data analysis and statistical implementation includes cancer database mining, mutational signature analysis, survival analysis, and machine learning applications.

Grigorios Georgolopoulos, Bioinformatics Team Lead, Genevia Technologies Oy

References and case studies

Selected publications from our customers

  • Peeters, J. G. C. et al. (2024). Hyperactivating EZH2 to augment H3K27me3 levels in regulatory T cells enhances immune suppression by driving early effector differentiation. Cell reports, 43(9), 114724. Advance online publication. https://doi.org/10.1016/j.celrep.2024.114724
  • Ness, C. et al. (2021). Integrated differential DNA methylation and gene expression of formalin-fixed paraffin-embedded uveal melanoma specimens identifies genes associated with early metastasis and poor prognosis. Experimental eye research, 203, 108426. https://doi.org/10.1016/j.exer.2020.108426
  • Tarkkonen, K. et al. (2017). Comparative analysis of osteoblast gene expression profiles and Runx2 genomic occupancy of mouse and human osteoblasts in vitro. Gene, 626, 119–131. https://doi.org/10.1016/j.gene.2017.05.028

Selected publications from our team

  • Onfray, C. et al. (2024). Unraveling hallmark suitability for staging pre- and post-implantation stem cell models. Cell reports, 43(5), 114232. https://doi.org/10.1016/j.celrep.2024.114232
  • Aakula, A. et al. (2023). RAS and PP2A activities converge on epigenetic gene regulation. Life science alliance, 6(5), e202301928. https://doi.org/10.26508/lsa.202301928
  • Zijlmans, D. W. et al. (2022). Integrated multi-omics reveal polycomb repressive complex 2 restricts human trophoblast induction. Nature cell biology, 24(6), 858–871. https://doi.org/10.1038/s41556-022-00932-w
  • Rodriguez-Martinez, A. et al. (2022). Novel ZNF414 activity characterized by integrative analysis of ChIP-exo, ATAC-seq and RNA-seq data. Biochimica et biophysica acta. Gene regulatory mechanisms, 1865(3), 194811. Advance online publication. https://doi.org/10.1016/j.bbagrm.2022.194811
  • Pekkarinen, M. et al. (2022). Integrative DNA methylation analysis of pediatric brain tumors reveals tumor type-specific developmental trajectories and epigenetic signatures of malignancy. bioRxiv 2022.03.14.483566; doi: https://doi.org/10.1101/2022.03.14.483566
  • Taavitsainen, S. et al. (2021). Single-cell ATAC and RNA sequencing reveal pre-existing and persistent cells associated with prostate cancer relapse. Nature communications, 12(1), 5307. https://doi.org/10.1038/s41467-021-25624-1
  • Georgolopoulos, G. et al. (2021). Discrete regulatory modules instruct hematopoietic lineage commitment and differentiation. Nature communications, 12(1), 6790. https://doi.org/10.1038/s41467-021-27159-x
  • Cilenti, F. et al. (2021). A PGE2-MEF2A axis enables context-dependent control of inflammatory gene expression. Immunity, 54(8), 1665–1682.e14. https://doi.org/10.1016/j.immuni.2021.05.016
  • Garcia-Manteiga, J. M. et al. (2021). Identification of differential DNA methylation associated with multiple sclerosis: A family-based study. Journal of neuroimmunology, 356, 577600. https://doi.org/10.1016/j.jneuroim.2021.577600
  • Rajamäki, K. et al. (2021). Genetic and Epigenetic Characteristics of Inflammatory Bowel Disease-Associated Colorectal Cancer. Gastroenterology, 161(2), 592–607. https://doi.org/10.1053/j.gastro.2021.04.042
  • Kukkonen, K. et al. (2021). Chromatin and Epigenetic Dysregulation of Prostate Cancer Development, Progression, and Therapeutic Response. Cancers, 13(13), 3325. https://doi.org/10.3390/cancers13133325
  • Linna-Kuosmanen, S. et al. (2021). NRF2 is a key regulator of endothelial microRNA expression under proatherogenic stimuli. Cardiovascular research, 117(5), 1339–1357. https://doi.org/10.1093/cvr/cvaa219
  • Verta, J. P. et al. (2021). Genetic Drift Dominates Genome-Wide Regulatory Evolution Following an Ancient Whole-Genome Duplication in Atlantic Salmon. Genome biology and evolution, 13(5), evab059. https://doi.org/10.1093/gbe/evab059

  • ENCODE Project Consortium et al. (2020). Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature, 583(7818), 699–710. https://doi.org/10.1038/s41586-020-2493-4
  • Lu, Y. et al. (2020). Interleukin-33 Signaling Controls the Development of Iron-Recycling Macrophages. Immunity, 52(5), 782–793.e5. https://doi.org/10.1016/j.immuni.2020.03.006
  • Policicchio, S. et al. (2020). Genome-wide DNA methylation meta-analysis in the brains of suicide completers. Translational psychiatry, 10(1), 69. https://doi.org/10.1038/s41398-020-0752-7
  • Viiri, L. E. et al. (2019). Extensive reprogramming of the nascent transcriptome during iPSC to hepatocyte differentiation. Scientific reports, 9(1), 3562. https://doi.org/10.1038/s41598-019-39215-0
  • Bénéchet, A. P. et al. (2019). Dynamics and genomic landscape of CD8+ T cells undergoing hepatic priming. Nature, 574(7777), 200–205. https://doi.org/10.1038/s41586-019-1620-6
  • Laing, L. V. et al. (2018). Sex-specific transcription and DNA methylation profiles of reproductive and epigenetic associated genes in the gonads and livers of breeding zebrafish. Comparative biochemistry and physiology. Part A, Molecular & integrative physiology, 222, 16–25. https://doi.org/10.1016/j.cbpa.2018.04.004
  • Moreau, P. R. et al. (2018). Transcriptional Profiling of Hypoxia-Regulated Non-coding RNAs in Human Primary Endothelial Cells. Frontiers in cardiovascular medicine, 5, 159. https://doi.org/10.3389/fcvm.2018.00159
  • Viana, J. et al. (2017). Schizophrenia-associated methylomic variation: molecular signatures of disease and polygenic risk burden across multiple brain regions. Human molecular genetics, 26(1), 210–225. https://doi.org/10.1093/hmg/ddw373
  • Bouvy-Liivrand, M. et al. (2017). Analysis of primary microRNA loci from nascent transcriptomes reveals regulatory domains governed by chromatin architecture. Nucleic acids research, 45(17), 9837–9849. https://doi.org/10.1093/nar/gkx680
  • Hannon, E. et al. (2016). Methylation QTLs in the developing brain and their enrichment in schizophrenia risk loci. Nature neuroscience, 19(1), 48–54. https://doi.org/10.1038/nn.4182
  • Hannon, E. et al. (2016). An integrated genetic-epigenetic analysis of schizophrenia: evidence for co-localization of genetic associations and differential DNA methylation. Genome biology, 17(1), 176. https://doi.org/10.1186/s13059-016-1041-x
  • Laing, L. V. et al. (2016). Bisphenol A causes reproductive toxicity, decreases dnmt1 transcription, and reduces global DNA methylation in breeding zebrafish (Danio rerio). Epigenetics, 11(7), 526–538. https://doi.org/10.1080/15592294.2016.1182272
  • Kumsta, R. et al. (2016). Severe psychosocial deprivation in early childhood is associated with increased DNA methylation across a region spanning the transcription start site of CYP2E1. Translational psychiatry, 6(6), e830. https://doi.org/10.1038/tp.2016.95
  • Pidsley, R. et al. (2014). Methylomic profiling of human brain tissue supports a neurodevelopmental origin for schizophrenia. Genome biology, 15(10), 483. https://doi.org/10.1186/s13059-014-0483-2
  • Viana, J. et al. (2014). Epigenomic and transcriptomic signatures of a Klinefelter syndrome (47,XXY) karyotype in the brain. Epigenetics, 9(4), 587–599. https://doi.org/10.4161/epi.27806

Contact us

Leave your email address here with a brief description of your needs, and we will contact you to get things moving forward!

Antti Ylipää, CEO, co-founder, Genevia Technologies Oy, +358 40 747 7672