Regulatory genomics

  • Vavouri lab
Campus ICO-Germans Trias i Pujol

Office 1-5 (first floor)

Germans Trias i Pujol Research Institute (IGTP)
Muntanya Building
Can Ruti Campus
Ctra de Can Ruti, Camí de les Escoles s/n
08916 Badalona, Barcelona, Spain



We are a group of computational biologists seeking to understand how gene expression is affected by mutations, drug treatments and environmental exposure. Our research approach involves analysing high-throughput gene expression and related data with a view to testing or proposing hypotheses about molecular mechanisms and shedding light on the interaction between environment, genotype and phenotype. We work with genomic data from public databases, as well as data generated in collaboration with labs with complementary expertise.


Regulation of gene expression is the fine-tuning of the synthesis of the functional product of genes and is one of the most fundamental processes in life. It is the process that makes different cell types have different properties and differentiates unhealthy from healthy cells. Gene expression is regulated by internal signals (the activity of other genes, mutations, etc.) and by external signals (diet, temperature, pharmacological therapies, etc.). 

Since 2011, we have been studying different aspects of gene regulation, including chromatin, DNA methylation and small non-coding RNAs. We have also been looking at how different types of exposure and drug treatments affect gene expression. We have a long-standing interest in the non-coding part of the genome, specifically non-coding sequence elements that affect gene regulation.

Our current research

Our research focuses on three main areas. Firstly, we study the effect of the environment on gene expression changes that are transmitted from parents to their offspring. We want to understand how information about our exposure to different environments may be encoded in molecules other than DNA inside germ cells and transmitted between generations. Transmission of non-genetic information can influence an individual’s phenotype or disease risk. We would like to find out which molecules in the germline carry such information.

Secondly, we work on non-coding RNAs and other non-coding elements that influence gene expression. We are interested in which non-coding elements affect gene expression and how. These include distal enhancers, small non-coding RNAs and transposable elements. Most genetic variation between individuals occurs within the non-coding parts of our genomes. We want to understand which of these variants influence gene expression and, potentially, phenotype or disease risk.

Finally, we want to understand how epigenetic drugs affect gene expression and chromatin in different genomic contexts. Epigenetic drugs currently used in the clinic include those for the treatment of patients with acute myeloid leukaemia and myelodysplastic syndrome. Our work involves analysing data from experiments on human cell lines. A more in-depth understanding of the effects of these drugs and how they work may lead to improved or more personalized medicine in the future.

Our goals

Our aim is to contribute to a better understanding of gene regulation and of the consequences of drug treatments and inter-individual genetic variation for gene expression. Although most of our research is based on data from animal model organisms or cell lines, we hope that, in the long term, the knowledge acquired will improve our understanding of human biology.

Extensive aberrant gene expression and genome deregulation are extremely common in cancer, especially haematological forms, and treatments targeting gene regulation pathways are being used for haematological malignancies.

Last, but not least, we hope that the data we generate and the analysis methods we develop serve as useful tools for the wider research community. 


2018 - present

Teaching collaborator on the Màster de bioinformàtica i bioestadística de la calidad, Open University of Catalonia/University of Barcelona


2016 IED Program I3 award

2015 Catalan Predoctoral Fellowship (AGAUR) to Eduard Casas

2015 Group favorably evaluated by the Ramon y Cajal I3 Programme

2014 Spanish National Postdoctoral Fellowship (MINECO) to Yulia Medvedeva

2014 Max Planck - Prince of Asturias Award Mobility Grant to Eduard Casas

2014 Group recognized by the Catalan Research Agency (AGAUR)

2014 EpiGeneSys Travel Fellowship to Eduard Casas

2013 Elected Associate Member of the EU Network of Excellence EpiGeneSys

2011 Spanish National Research Grant (MICINN)

2011 European Reintegration Grant - Framework Programme 7

2010 Ramon y Cajal Award to Tanya Vavouri


Selected publications

Klosin A, Reis K, Hidalgo-Carcedo C, Casas E, Vavouri T, Lehner B

Impaired DNA replication derepresses chromatin and generates a transgenerationally inherited epigenetic memory.

Sci Adv Aug 2017, 3 (8) e1701143. Epub 16 Aug 2017
Impaired DNA replication is a hallmark of cancer and a cause of genomic instability. We report that, in addition to causing genetic change, impaired DNA replication during embryonic development can have major epigenetic consequences for a genome. In a genome-wide screen, we identified impaired DNA replication as a cause of increased expression from a repressed transgene in Caenorhabditis elegans. The acquired expression state behaved as an "epiallele," being inherited for multiple generations before fully resetting. Derepression was not restricted to the transgene but was caused by a global reduction in heterochromatin-associated histone modifications due to the impaired retention of modified histones on DNA during replication in the early embryo. Impaired DNA replication during development can therefore globally derepress chromatin, creating new intergenerationally inherited epigenetic expression states.
Klosin A, Casas E, Hidalgo-Carcedo C, Vavouri T, Lehner B

Transgenerational transmission of environmental information in C. elegans.

Science 21 Apr 2017, 356 (6335) 320-323.
The environment experienced by an animal can sometimes influence gene expression for one or a few subsequent generations. Here, we report the observation that a temperature-induced change in expression from a Caenorhabditis elegans heterochromatic gene array can endure for at least 14 generations. Inheritance is primarily in cis with the locus, occurs through both oocytes and sperm, and is associated with altered trimethylation of histone H3 lysine 9 (H3K9me3) before the onset of zygotic transcription. Expression profiling reveals that temperature-induced expression from endogenous repressed repeats can also be inherited for multiple generations. Long-lasting epigenetic memory of environmental change is therefore possible in this animal.
Pantano L, Jodar M, Bak M, Ballescà JL, Tommerup N, Oliva R, Vavouri T

The small RNA content of human sperm reveals pseudogene-derived piRNAs complementary to protein-coding genes.

RNA Jun 2015, 21 (6) 1085-95. Epub 22 Apr 2015
At the end of mammalian sperm development, sperm cells expel most of their cytoplasm and dispose of the majority of their RNA. Yet, hundreds of RNA molecules remain in mature sperm. The biological significance of the vast majority of these molecules is unclear. To better understand the processes that generate sperm small RNAs and what roles they may have, we sequenced and characterized the small RNA content of sperm samples from two human fertile individuals. We detected 182 microRNAs, some of which are highly abundant. The most abundant microRNA in sperm is miR-1246 with predicted targets among sperm-specific genes. The most abundant class of small noncoding RNAs in sperm are PIWI-interacting RNAs (piRNAs). Surprisingly, we found that human sperm cells contain piRNAs processed from pseudogenes. Clusters of piRNAs from human testes contain pseudogenes transcribed in the antisense strand and processed into small RNAs. Several human protein-coding genes contain antisense predicted targets of pseudogene-derived piRNAs in the male germline and these piRNAs are still found in mature sperm. Our study provides the most extensive data set and annotation of human sperm small RNAs to date and is a resource for further functional studies on the roles of sperm small RNAs. In addition, we propose that some of the pseudogene-derived human piRNAs may regulate expression of their parent gene in the male germline.
Öst A, Lempradl A, Casas E, Weigert M, Tiko T, Deniz M, Pantano L, Boenisch U, Itskov PM, Stoeckius M, Ruf M, Rajewsky N, Reuter G, Iovino N, Ribeiro C, Alenius M, Heyne S, Vavouri T, Pospisilik JA

Paternal diet defines offspring chromatin state and intergenerational obesity.

Cell 4 Dec 2014, 159 (6) 1352-64.
The global rise in obesity has revitalized a search for genetic and epigenetic factors underlying the disease. We present a Drosophila model of paternal-diet-induced intergenerational metabolic reprogramming (IGMR) and identify genes required for its encoding in offspring. Intriguingly, we find that as little as 2 days of dietary intervention in fathers elicits obesity in offspring. Paternal sugar acts as a physiological suppressor of variegation, desilencing chromatin-state-defined domains in both mature sperm and in offspring embryos. We identify requirements for H3K9/K27me3-dependent reprogramming of metabolic genes in two distinct germline and zygotic windows. Critically, we find evidence that a similar system may regulate obesity susceptibility and phenotype variation in mice and humans. The findings provide insight into the mechanisms underlying intergenerational metabolic reprogramming and carry profound implications for our understanding of phenotypic variation and evolution.
Blay N, Casas E, Galvan-Femenia I, Graffelman J, de Cid R, Vavouri T

Assessment of kinship detection using RNA-seq data

bioRxiv 13 Feb 2019, online preprint. Epub 13 Feb 2019
Analysis of RNA sequencing (RNA-seq) data from related individuals is widely used in clinical and molecular genetics studies. Sample labelling mistakes are estimated to affect more than 4% of published samples. Therefore, as a method of data quality control, a way to reconstruct pedigrees from RNA-seq data would be useful for confirming the expected relationships. Currently, reconstruction of pedigrees is based mainly on SNPs or microsatellites, obtained from genotyping arrays, whole genome sequencing and whole exome sequencing. Potential problems with using RNA-seq data for kinship detection are the low proportion of the genome that it covers, the highly skewed coverage of exons of different genes depending on expression level and allele-specific expression. In this study we assess the use of RNA-seq data to detect kinship between individuals, through pairwise identity-by-descent (IBD) estimates. First, we obtained high quality SNPs after successive filters to minimize the effects due to allelic imbalance as well as errors in sequencing, mapping and genotyping. Then, we used these SNPs to calculate pairwise IBD estimates. By analysing both real and simulated RNA-seq data we show that it is possible to identify up to second degree relationships using RNA-seq data of even low to moderate sequencing depth.

Current projects

The evolution of new PIWI-interacting RNAs in mammals

Project leader: Tanya Vavouri
Start date: 01/06/2020
End date: 31/05/2024

Previous projects

PIWI-interacting RNAs - their evolution and their role in epigenetic inheritance and cancer

Project leader: Tanya Vavouri
Start date: 01/01/2016
End date: 31/12/2018

Grants to encourage the stable incorporation of PhD holders (IED).

Project leader: Tanya Vavouri
Start date: 01/01/2017
End date: 31/12/2018