Epigenomic data analysis
Uncover epigenetic mechanisms of gene regulation in development and disease.
Epigenomics characterizes the chromatin state down to minuscule chemical modifications. Epigenetic changes to the DNA and associated proteins affect gene expression and may lead to altered cellular states, including diseases.
We analyze a wide range of epigenomic sequencing data in order to gain deeper understanding of intra-cellular molecular mechanisms and to identify biomarkers for diseases.
Below we discuss common epigenomic data types and analyses, and present some of our past work involving epigenomic data analysis. To discuss your epigenomic bioinformatics needs, just leave us a message.
High-throughput assays for epigenomic profiling are numerous, and new protocols are being developed continuously. The most common epigenomic assays focus on DNA methylation, DNA-binding proteins, histone modifications, chromatin accessibility or the 3D conformation of the chromatin.
DNA methylation. DNA methylation assays based on bisulphite-treated DNA identify methylation events at the highest resolution. Such assays use next-generation sequencing (whole-genome or reduced representation bisulphite sequencing) or microarrays. An alternative approach, MeDIP-sequencing, relies on immunoprecipitation and suffers from lower resolution.
Transcription factor binding and histone modifications. Assays that identify DNA-bound proteins such as transcription factors, as well as chemical modifications to the histone proteins, make use of antibodies. ChIP-sequencing is the most common method, but newer alternatives with better resolution have been developed. These include ChIP-exo, ChIPmentation, CUT&RUN and CUT&Tag.
Chromatin accessibility. The gold standard assay for mapping regions of open chromatin is ATAC-sequencing. ATAC-seq has largely replaced previous methods such as DNase-seq and FAIRE-seq.
Chromatin conformation. The importance of the chromatin's three-dimensional conformation has gained particular appreciation recently. Chromatin conformation assays are used to study the physical interactions between genes and their distal regulatory elements as well as the proteins that cause such looping of the chromatin. Hi-C is a typical assay for the former, while ChIA-PET can be applied to the latter.
To study the epigenome's direct effect on gene expression, epigenomic measurements are often complemented with RNA-sequencing experiments in the same setting.
Single-cell experiments, particularly single-cell ATAC-sequencing, are increasingly performed as co-assays with single-cell RNA-sequencing. This yields gene expression and chromatin accessibility profiles from the same individual cells.
Peak calling and annotation
The analysis workflow for most sequencing-based epigenomic data (particularly ChIP-seq, ATAC-seq and related experiments) involves identifying, annotating and analysing peaks, or genomic regions with signal of interest.
The raw sequencing reads are first quality-controlled and aligned to a reference genome, after which possible control libraries (pre-IP input and IP with non-specific antibody, in the case of ChIP-seq) are used to normalize the read coverage signal.
Peaks in the signal are identified using a peak caller tool. This phase may require careful parameter tuning to adapt the analysis to the protocol used.
To enable further analysis, peaks are annotated with relevant information such as read statistics, and near or overlapping features such as genes, regulatory elements and binding motifs.
Annotating peaks with genes enables gene set enrichment analyses for further interpretation of downstream effects.
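As a minimal sketch of the gene annotation step, the Python snippet below assigns each peak to the gene with the closest transcription start site (TSS). The function name, the input layout and the gene names are illustrative assumptions, and real projects typically rely on dedicated annotation tools and full gene models rather than a hand-written lookup.

```python
from bisect import bisect_left


def nearest_tss(peaks, tss_by_chrom):
    """Annotate each peak with the gene whose TSS is closest to the peak midpoint.

    peaks: list of (chrom, start, end) tuples.
    tss_by_chrom: dict chrom -> list of (position, gene), sorted by position.
    Returns (chrom, start, end, gene, distance) tuples, where distance is the
    TSS position minus the peak midpoint. Gene and distance are None when the
    chromosome has no annotated TSS.
    """
    annotated = []
    for chrom, start, end in peaks:
        tss_list = tss_by_chrom.get(chrom, [])
        if not tss_list:
            annotated.append((chrom, start, end, None, None))
            continue
        mid = (start + end) // 2
        positions = [pos for pos, _ in tss_list]
        i = bisect_left(positions, mid)
        # Only the TSS just left and just right of the midpoint can be nearest.
        candidates = tss_list[max(i - 1, 0):i + 1]
        pos, gene = min(candidates, key=lambda t: abs(t[0] - mid))
        annotated.append((chrom, start, end, gene, pos - mid))
    return annotated


# Hypothetical toy annotation and peaks, for illustration only.
tss = {"chr1": [(1000, "GENE_A"), (5000, "GENE_B")]}
peaks = [("chr1", 900, 1300), ("chr1", 4000, 4400), ("chr2", 10, 50)]
for row in nearest_tss(peaks, tss):
    print(row)
```

The resulting gene lists can then be passed on to a gene set enrichment analysis.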
Annotated peaks across the sample set are visualized using PCA (and UMAP or t-SNE algorithms for single-cell data) and heatmaps. These visualizations help in optimizing the peak calling process and answer questions such as:
- Do the biological replicates resemble each other with regard to their epigenomic profiles?
- Do distinct sample groups (e.g., different tissues, treatments or time points) form separate clusters?
- Are there outlier samples?
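A quick numerical companion to these visualizations is a pairwise correlation of per-peak signal between samples. The sketch below assumes a hypothetical dictionary of normalized signal values over a shared peak set. The sample names and numbers are made up for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def sample_correlations(signal):
    """signal: dict sample -> per-peak values, with peaks in the same order."""
    names = sorted(signal)
    return {
        (a, b): pearson(signal[a], signal[b])
        for i, a in enumerate(names)
        for b in names[i + 1:]
    }


# Hypothetical signal over four shared peaks.
signal = {
    "ctrl_rep1": [10, 50, 5, 80],
    "ctrl_rep2": [12, 48, 6, 75],
    "treated_rep1": [60, 8, 70, 9],
}
for pair, r in sample_correlations(signal).items():
    print(pair, round(r, 3))
```

Replicates are expected to correlate strongly with each other, while a sample that correlates poorly with every other member of its group is a candidate outlier.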
Differential peak analysis
To compare different conditions, the identified peaks can be statistically compared — or, more commonly, differential peaks can be directly called from the respective read coverage signals.
Similar to differential gene expression analysis, differential peak analysis yields estimates of effect size and statistical significance for each peak. These statistics can be visualized as a volcano plot.
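To illustrate how the two axes of a volcano plot are derived, the sketch below computes a log2 fold change from mean signal in two conditions and a -log10 of the multiple-testing-adjusted p-value. It assumes the raw p-values come from a dedicated differential analysis tool. The function names and the pseudocount are illustrative choices, and the Benjamini-Hochberg procedure is used here as one common adjustment method.

```python
from math import log2, log10


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def volcano_points(mean_a, mean_b, pvalues, pseudocount=1.0):
    """Return (log2 fold change of B over A, -log10 adjusted p) per peak.

    Assumes all p-values are greater than zero.
    """
    padj = benjamini_hochberg(pvalues)
    return [
        (log2((b + pseudocount) / (a + pseudocount)), -log10(q))
        for a, b, q in zip(mean_a, mean_b, padj)
    ]


# Hypothetical mean signal per peak in two conditions, with raw p-values.
points = volcano_points([3, 20, 50], [15, 22, 12], [0.001, 0.6, 0.01])
for x, y in points:
    print(round(x, 2), round(y, 2))
```

Peaks far from zero on the x-axis and high on the y-axis are those with both a large effect and strong statistical support.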
As genome-wide epigenomic measurements yield a continuous signal across the genome, such analyses may also focus on specific regions of interest, such as promoters or known binding sites of a protein of interest. Density heatmaps are used to visualize the signal at sites of interest in different conditions.
Furthermore, overlapping binding motifs at the peaks can be statistically compared between conditions and visualized as volcano plots.
Transcription factor binding site analyses
ChIP-seq and related protocols can be used to identify transcription factor (TF) binding sites across the genome. Such assays rely on antibodies specific to the protein of interest, and this approach thus enables identifying binding sites of just one TF. ATAC-seq data, on the other hand, can be used to identify binding sites of all DNA-bound proteins in parallel, through an analysis called TF footprinting.
In TF footprinting, narrow drops in the chromatin accessibility signal are interpreted as protein binding sites. The identity of the TF may be indirectly inferred from binding motifs. Coupled with RNA-seq data, TF footprinting can be used to study the combined effects of TFs on gene expression in a very high-throughput manner.
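The idea of a footprint as a narrow dip flanked by accessible chromatin can be sketched with a simple score: mean signal in the flanks minus mean signal in the centre. This is a toy illustration with assumed window sizes and a made-up signal, not the scoring scheme of any particular footprinting tool. Real analyses also correct for enzyme sequence bias.

```python
def footprint_scores(signal, center_width=6, flank_width=5):
    """Score every candidate footprint window along a per-base signal.

    Returns (start, score) tuples, where start is the first base of the
    centre window and score is mean flank signal minus mean centre signal.
    """
    scores = []
    last_start = len(signal) - center_width - flank_width
    for start in range(flank_width, last_start + 1):
        left = signal[start - flank_width:start]
        center = signal[start:start + center_width]
        right = signal[start + center_width:start + center_width + flank_width]
        flank_mean = (sum(left) + sum(right)) / (2 * flank_width)
        center_mean = sum(center) / center_width
        scores.append((start, flank_mean - center_mean))
    return scores


# Hypothetical accessibility signal with a 6 bp dip starting at index 10.
toy_signal = [10] * 10 + [2] * 6 + [10] * 10
best_start, best_score = max(footprint_scores(toy_signal), key=lambda t: t[1])
print(best_start, best_score)
```

High-scoring windows would then be intersected with known binding motifs to suggest which TF occupies the site.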
DNA methylation data analysis
The analysis of DNA methylation data starts with the quality control and alignment of sequencing reads (or QC and normalization of array data), and proceeds to calling the methylated sites.
Detected methylated sites are used to identify larger regions of DNA methylation or differentially methylated regions (DMRs) between samples. These regions can be annotated in the same way as peaks in other epigenomic data.
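One simple way to go from individual sites to regions is to merge runs of neighbouring CpGs that change in the same direction. The sketch below does this on hypothetical per-CpG methylation differences. The thresholds and the function name are illustrative assumptions, and established DMR callers use statistical models rather than fixed cut-offs.

```python
def call_dmrs(cpgs, min_delta=0.2, max_gap=500, min_cpgs=3):
    """Merge consecutive differential CpGs into candidate DMRs.

    cpgs: list of (chrom, position, delta) sorted by chrom and position,
    where delta is the methylation difference between two groups.
    Returns (chrom, start, end, n_cpgs, mean_delta) tuples.
    """
    dmrs, run = [], []

    def close_run():
        # Keep the current run only if it contains enough CpGs.
        if len(run) >= min_cpgs:
            mean_delta = sum(d for _, _, d in run) / len(run)
            dmrs.append((run[0][0], run[0][1], run[-1][1], len(run), mean_delta))
        run.clear()

    for chrom, pos, delta in cpgs:
        differential = abs(delta) >= min_delta
        extends = (
            differential
            and bool(run)
            and chrom == run[-1][0]
            and pos - run[-1][1] <= max_gap
            and (delta > 0) == (run[-1][2] > 0)
        )
        if not extends:
            close_run()
        if differential:
            run.append((chrom, pos, delta))
    close_run()
    return dmrs


# Hypothetical per-CpG methylation differences (group B minus group A).
cpgs = [
    ("chr1", 100, 0.30), ("chr1", 150, 0.25), ("chr1", 220, 0.40),
    ("chr1", 300, 0.05), ("chr1", 2000, -0.30), ("chr1", 2100, -0.35),
]
print(call_dmrs(cpgs))
```

In this toy input the first three CpGs form one candidate region, while the last two are too few to pass the minimum CpG count.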
Possible downstream analyses for DNA methylation data include:
- Integration with gene expression data. When RNA-seq or other gene expression data is available from the same setting, the association of promoter methylation and gene expression can be studied.
- Epigenetic biomarker discovery. DNA methylation data from patient samples enables discovering clinically relevant epigenetic markers.
- Biological age analysis. Epigenetic models of biological aging have been developed for DNA methylation data. Such models can be used to estimate the biological, as opposed to chronological, age of an individual or specific tissue within an individual.
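To make the biological age analysis more concrete: many epigenetic age models are, at their core, weighted sums of methylation levels at selected CpG sites. The sketch below shows that general form only. The CpG identifiers, coefficients and intercept are hypothetical placeholders, and published clocks differ in their CpG sets and may apply additional transformations.

```python
def predict_epigenetic_age(betas, coefficients, intercept):
    """Linear age model: intercept + sum(coefficient * beta) over model CpGs.

    betas: dict CpG id -> methylation level between 0 and 1 for one sample.
    coefficients: dict CpG id -> model weight (hypothetical values here).
    """
    missing = [cpg for cpg in coefficients if cpg not in betas]
    if missing:
        raise ValueError(f"sample lacks model CpGs: {missing}")
    return intercept + sum(w * betas[cpg] for cpg, w in coefficients.items())


# Hypothetical two-site model and sample, for illustration only.
model = {"cg_site_1": 40.0, "cg_site_2": -25.0}
sample = {"cg_site_1": 0.5, "cg_site_2": 0.2, "cg_site_3": 0.9}
print(predict_epigenetic_age(sample, model, intercept=30.0))
```

Comparing the predicted value with the chronological age of the donor gives an estimate of age acceleration or deceleration.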
Performing RNA-seq and epigenomic sequencing (such as ChIP-seq or ATAC-seq) on the same samples enables integrative analyses to study gene regulatory programs genome-wide.
Regulatory connections can be identified between enhancers and their target genes, as well as transcription factors and their targets, building on evidence from both gene expression and the epigenomic status of regulatory elements.
Meet some of our epigenomics experts
I am an experienced biologist/bioinformatician specialized in mapping and functionally interrogating DNA regulatory elements and their target genes in normal cell development and disease using high-throughput genomics and genome editing tools.
I have over 7 years of experience in profiling the accessible chromatin using high-throughput methods (DNase I-seq, ATAC-seq, ChIP-seq, CUT&RUN) and transcriptomic profiling (RNA-seq) along hematopoietic development. I also have 4+ years of experience in implementing and analyzing single-cell multi-omic data (10X Genomics scRNA-seq, scATAC-seq) using a variety of computational tools.
Additionally, my all-around experience in life science data analysis and statistical implementation includes cancer database mining, mutational signature analysis, survival analysis, and machine learning applications.
I specialize in gene and genome regulation, particularly in immunology, cancer research, DNA repair and cellular senescence.
For over 10 years, I have developed and applied computational pipelines to analyze data from transcriptomic and epigenomic sequencing assays, including scRNA-seq, scATAC-seq, spatial transcriptomics, ChIP-seq, RNA-seq, GRO-seq, ATAC-seq, CAGE-seq, XR-seq, DRIP-seq, BLISS-seq, Damage-seq, INI-seq, and Hi-C.
I have enjoyed working in multidisciplinary teams — as a bioinformatician, postdoc researcher, head of a single cell NGS bioinformatics facility and, most recently, as a project manager at Genevia.
I am a senior bioinformatician and molecular biologist with several years of experience in generating and analysing different types of Next Generation Sequencing data.
I have used a large range of bioinformatics tools and have in-depth knowledge of different types of genomics data, including DNA methylation (sequencing and Illumina arrays), whole-genome sequencing, RNA-seq, single-cell RNA-seq, ATAC-seq and ChIP-seq. I have worked with human, rat, mouse, zebrafish data and cell line data. During my academic work I have applied pathway and network analysis methods to identify changes with disease and drug exposure.
My background in molecular biology coupled with my enthusiasm and expertise in bioinformatics put me in a unique position to understand and analyse large biological datasets.
References and case studies
Selected publications from our customers
- Ness, C. et al. (2021). Integrated differential DNA methylation and gene expression of formalin-fixed paraffin-embedded uveal melanoma specimens identifies genes associated with early metastasis and poor prognosis. Experimental eye research, 203, 108426. https://doi.org/10.1016/j.exer.2020.108426
- Tarkkonen, K. et al. (2017). Comparative analysis of osteoblast gene expression profiles and Runx2 genomic occupancy of mouse and human osteoblasts in vitro. Gene, 626, 119–131. https://doi.org/10.1016/j.gene.2017.05.028
Selected publications from our team
- Liakos, A. et al. (2023). Enhanced frequency of transcription pre-initiation complexes assembly after exposure to UV irradiation results in increased repair activity and reduced probabilities for mutagenesis. Nucleic acids research, 51(16), 8575–8586. https://doi.org/10.1093/nar/gkad593
- Simigdala, N. et al. (2023). Loss of Kmt2c in vivo leads to EMT, mitochondrial dysfunction and improved response to lapatinib in breast cancer. Cellular and molecular life sciences : CMLS, 80(4), 100. https://doi.org/10.1007/s00018-023-04734-7
- Aakula, A. et al. (2023). RAS and PP2A activities converge on epigenetic gene regulation. Life science alliance, 6(5), e202301928. https://doi.org/10.26508/lsa.202301928
- Armaka, M. et al. (2022). Single-cell multimodal analysis identifies common regulatory programs in synovial fibroblasts of rheumatoid arthritis patients and modeled TNF-driven arthritis. Genome medicine, 14(1), 78. https://doi.org/10.1186/s13073-022-01081-3
- Zijlmans, D. W. et al. (2022). Integrated multi-omics reveal polycomb repressive complex 2 restricts human trophoblast induction. Nature cell biology, 24(6), 858–871. https://doi.org/10.1038/s41556-022-00932-w
- Fanourgakis, S. et al. (2022). Histone H2Bub1 dynamics in the 5' region of active genes are tightly linked to the UV-induced transcriptional response. Comput. Struct. Biotechnol. J. https://doi.org/10.1016/j.csbj.2022.12.013
- Rodriguez-Martinez, A. et al. (2022). Novel ZNF414 activity characterized by integrative analysis of ChIP-exo, ATAC-seq and RNA-seq data. Biochimica et biophysica acta. Gene regulatory mechanisms, 1865(3), 194811. Advance online publication. https://doi.org/10.1016/j.bbagrm.2022.194811
- Voda, A. et al. (2022). Alternative paths to immune activation: the role of costimulatory risk genes for polygenic inflammatory disease in T helper cells. bioRxiv 2022.11.23.517727; https://doi.org/10.1101/2022.11.23.517727
- Pekkarinen, M. et al. (2022). Integrative DNA methylation analysis of pediatric brain tumors reveals tumor type-specific developmental trajectories and epigenetic signatures of malignancy. bioRxiv 2022.03.14.483566; https://doi.org/10.1101/2022.03.14.483566
- Taavitsainen, S. et al. (2021). Single-cell ATAC and RNA sequencing reveal pre-existing and persistent cells associated with prostate cancer relapse. Nature communications, 12(1), 5307. https://doi.org/10.1038/s41467-021-25624-1
- Georgolopoulos, G. et al. (2021). Discrete regulatory modules instruct hematopoietic lineage commitment and differentiation. Nature communications, 12(1), 6790. https://doi.org/10.1038/s41467-021-27159-x
- Rajamäki, K. et al. (2021). Genetic and Epigenetic Characteristics of Inflammatory Bowel Disease-Associated Colorectal Cancer. Gastroenterology, 161(2), 592–607. https://doi.org/10.1053/j.gastro.2021.04.042
- Kukkonen, K. et al. (2021). Chromatin and Epigenetic Dysregulation of Prostate Cancer Development, Progression, and Therapeutic Response. Cancers, 13(13), 3325. https://doi.org/10.3390/cancers13133325
- Linna-Kuosmanen, S. et al. (2021). NRF2 is a key regulator of endothelial microRNA expression under proatherogenic stimuli. Cardiovascular research, 117(5), 1339–1357. https://doi.org/10.1093/cvr/cvaa219
- Verta, J. P. et al. (2021). Genetic Drift Dominates Genome-Wide Regulatory Evolution Following an Ancient Whole-Genome Duplication in Atlantic Salmon. Genome biology and evolution, 13(5), evab059. https://doi.org/10.1093/gbe/evab059
- ENCODE Project Consortium et al. (2020). Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature, 583(7818), 699–710. https://doi.org/10.1038/s41586-020-2493-4
- Liakos, A. et al. (2020). Continuous transcription initiation guarantees robust repair of all transcribed genes and regulatory regions. Nature communications, 11(1), 916. https://doi.org/10.1038/s41467-020-14566-9
- Morianos, I. et al. (2020). Activin-A limits Th17 pathogenicity and autoimmune neuroinflammation via CD39 and CD73 ectonucleotidases and Hif1-α-dependent pathways. Proceedings of the National Academy of Sciences of the United States of America, 117(22), 12269–12280. https://doi.org/10.1073/pnas.1918196117
- Lu, Y. et al. (2020). Interleukin-33 Signaling Controls the Development of Iron-Recycling Macrophages. Immunity, 52(5), 782–793.e5. https://doi.org/10.1016/j.immuni.2020.03.006
- Policicchio, S. et al. (2020). Genome-wide DNA methylation meta-analysis in the brains of suicide completers. Translational psychiatry, 10(1), 69. https://doi.org/10.1038/s41398-020-0752-7
- Viiri, L. E. et al. (2019). Extensive reprogramming of the nascent transcriptome during iPSC to hepatocyte differentiation. Scientific reports, 9(1), 3562. https://doi.org/10.1038/s41598-019-39215-0
- Laing, L. V. et al. (2018). Sex-specific transcription and DNA methylation profiles of reproductive and epigenetic associated genes in the gonads and livers of breeding zebrafish. Comparative biochemistry and physiology. Part A, Molecular & integrative physiology, 222, 16–25. https://doi.org/10.1016/j.cbpa.2018.04.004
- Moreau, P. R. et al. (2018). Transcriptional Profiling of Hypoxia-Regulated Non-coding RNAs in Human Primary Endothelial Cells. Frontiers in cardiovascular medicine, 5, 159. https://doi.org/10.3389/fcvm.2018.00159
- Viana, J. et al. (2017). Schizophrenia-associated methylomic variation: molecular signatures of disease and polygenic risk burden across multiple brain regions. Human molecular genetics, 26(1), 210–225. https://doi.org/10.1093/hmg/ddw373
- Bouvy-Liivrand, M. et al. (2017). Analysis of primary microRNA loci from nascent transcriptomes reveals regulatory domains governed by chromatin architecture. Nucleic acids research, 45(17), 9837–9849. https://doi.org/10.1093/nar/gkx680
- Lavigne, M. D. et al. (2017). Global unleashing of transcription elongation waves in response to genotoxic stress restricts somatic mutation rate. Nature communications, 8(1), 2076. https://doi.org/10.1038/s41467-017-02145-4
- Hannon, E. et al. (2016). Methylation QTLs in the developing brain and their enrichment in schizophrenia risk loci. Nature neuroscience, 19(1), 48–54. https://doi.org/10.1038/nn.4182
- Hannon, E. et al. (2016). An integrated genetic-epigenetic analysis of schizophrenia: evidence for co-localization of genetic associations and differential DNA methylation. Genome biology, 17(1), 176. https://doi.org/10.1186/s13059-016-1041-x
- Laing, L. V. et al. (2016). Bisphenol A causes reproductive toxicity, decreases dnmt1 transcription, and reduces global DNA methylation in breeding zebrafish (Danio rerio). Epigenetics, 11(7), 526–538. https://doi.org/10.1080/15592294.2016.1182272
- Kumsta, R. et al. (2016). Severe psychosocial deprivation in early childhood is associated with increased DNA methylation across a region spanning the transcription start site of CYP2E1. Translational psychiatry, 6(6), e830. https://doi.org/10.1038/tp.2016.95
- Pidsley, R. et al. (2014). Methylomic profiling of human brain tissue supports a neurodevelopmental origin for schizophrenia. Genome biology, 15(10), 483. https://doi.org/10.1186/s13059-014-0483-2
- Viana, J. et al. (2014). Epigenomic and transcriptomic signatures of a Klinefelter syndrome (47,XXY) karyotype in the brain. Epigenetics, 9(4), 587–599. https://doi.org/10.4161/epi.27806
Leave your email address here with a brief description of your needs, and we will contact you to get things moving forward!