Genome-wide discovery of human heart enhancers
Leelavati Narlikar, Noboru J Sakabe, Alexander A Blanski, Fabio E Arimura, John M Westlund, Marcelo A Nobrega and Ivan Ovcharenko
National Center for Biotechnology Information (NCBI), NLM, NIH, and University of Chicago
The various organogenic programs deployed during embryonic development rely on the precise
expression of a multitude of genes in time and space. Identifying the cis-regulatory elements
responsible for this tightly orchestrated regulation of gene expression is an essential step in
understanding the genetic pathways involved in development. We describe a strategy to
systematically identify tissue-specific cis-regulatory elements that share combinations of sequence
motifs. Using heart development as an experimental framework, we employed a combination of Gibbs
sampling and linear regression to build a classifier that identifies heart enhancers based on the
presence and/or absence of various sequence features, including known and putative transcription factor (TF) binding
specificities. In distinguishing heart enhancers from a large pool of random noncoding sequences, the
performance of our classifier is vastly superior to four commonly used methods, with an accuracy
reaching 92% in cross-validation. Furthermore, most of the binding specificities learned by our
method resemble the specificities of TFs widely recognized as key players in heart development and
differentiation, such as SRF, MEF2, ETS1, SMAD, and GATA. Using our classifier as a predictor, a
genome-wide scan identified over 40,000 novel human heart enhancers. Although the classifier used
no gene expression information, these novel enhancers are strongly associated with genes expressed
in the heart. Finally, in vivo tests of our predictions in mouse and zebrafish achieved a validation rate
of 62%, significantly higher than expected by chance. These results support the existence of
underlying cis-regulatory codes dictating tissue-specific transcription in mammalian genomes and
validate our enhancer classifier strategy as a method to uncover these regulatory codes.
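The general idea of scoring a sequence by its motif content and classifying it with a linear model can be illustrated with a minimal sketch. This is not the program described above: the motif strings are fixed, illustrative consensus instances rather than specificities learned by Gibbs sampling, the fit is plain ordinary least squares, and the toy sequences, function names, and the 0.5 threshold are all assumptions made for the example.

```python
import numpy as np

# Illustrative consensus instances only (assumption); the method described
# above learns binding specificities from data instead of fixing them.
MOTIFS = {
    "GATA": "AGATAA",
    "MEF2": "CTAAAAATAG",
    "SRF_CArG": "CCATATATGG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def count_sites(seq, motif):
    """Count exact (possibly overlapping) motif matches on both strands."""
    def count(s, m):
        return sum(1 for i in range(len(s) - len(m) + 1) if s[i:i + len(m)] == m)

    total = count(seq, motif)
    rc = revcomp(motif)
    if rc != motif:  # count palindromic motifs only once
        total += count(seq, rc)
    return total


def featurize(seqs):
    """Build a matrix of [intercept, motif counts...] for each sequence."""
    counts = np.array(
        [[count_sites(s, m) for m in MOTIFS.values()] for s in seqs], dtype=float
    )
    return np.hstack([np.ones((len(seqs), 1)), counts])


def fit(seqs, labels):
    """Fit ordinary least-squares weights on 0/1 labels."""
    weights, *_ = np.linalg.lstsq(
        featurize(seqs), np.asarray(labels, dtype=float), rcond=None
    )
    return weights


def predict(weights, seqs, threshold=0.5):
    """Label a sequence 1 when its linear score reaches the threshold."""
    return (featurize(seqs) @ weights >= threshold).astype(int)


# Hypothetical toy data: "positives" carry at least one motif, "negatives" none.
TOY_POS = ["ACGTAGATAACGT", "TTCTAAAAATAGCC", "GGCCATATATGGAA", "TTATCTGGCTAAAAATAG"]
TOY_NEG = ["ACGTACGTACGT", "GGGGCCCCGGGG", "TTTTGGGGTTTT", "ACACACACACAC"]

weights = fit(TOY_POS + TOY_NEG, [1] * len(TOY_POS) + [0] * len(TOY_NEG))
print(predict(weights, ["CCAGATAACC", "CCCCCCCC"]))
```

A real classifier of this kind would score sequences with position weight matrices rather than exact strings, use far more features, and be evaluated by cross-validation against background noncoding sequences, as the text describes.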
Click here to download the program.