Epigenomes or The How of Cardiovascular Disease
Post date: May 29, 2018 7:55:21 PM
The tracks of the train point where it is going; they also reveal where it has been. The tracks of the leopard in the snow do only the latter. Which is the case for epigenetic processes in cardiovascular development and disease? Chromatin, the what of epigenetics, enables bespoke storage and retrieval of genetic information across the hundreds of cell types in a multicellular organism. Chromatin is perhaps the largest, in terms of physical size and list of component parts, and most functionally diverse molecular structure in existence (ex uno plurimum, as it were), enabling the vast diversity of physical form observed in the multicellular world.
Epigenesis connotes change and resistance to change. It implies directionality. In development, the setting in which epigenesis has been most extensively examined, the task of epigenetic processes is to guide the unidirectional differentiation of cells. In adulthood, epigenetic mechanisms provide stability, maintaining the blueprints laid down in development and resisting environmental stresses and stochastic intracellular changes. In cardiovascular disease, an area in which epigenetic processes are increasingly appreciated to operate and in which insights into human health are the most actionable, the task, teleologically, of chromatin is not all that clear. Are epigenetic processes at work in the cell during disease always combating the insult, attempting to restore and preserve healthy adult phenotypes? Have epigenetic processes been hijacked by evolution to contribute to cell survival decisions? What are the operational principles of epigenetics and how are they distinguished from gene regulation?
Because epigenetics can be defined as the integration of genetic information and environmental cues into a non-genetic substrate in a manner that is transgenerationally heritable, the modern epidemic of cardiovascular disease—which has arisen over the past half century (without sufficient time for substantive changes in human genetic evolution) and which afflicts diverse human populations with stark variability—has produced a surge of epigenetic studies, almost none of which definitively demonstrate integration of genetic and environmental cues into a non-genetic substrate in a manner that is transgenerationally heritable. Transgenerational inheritance of chromatin features like DNA methylation or histone modification conflates two separate concepts: epigenetics and Lamarckian evolution (or the inheritance of acquired traits). In a slightly less narrow and certainly more widely investigated definition, epigenetic processes are those that can be proven to persist in a differentiating cell lineage and/or in adult somatic cells under the constant task of remaining the right cell type…what might be thought of as epigenetic conveyance within a single organism. Colloquial use of the term epigenetics is broader still and might be defined something like: a process or feature that modifies gene expression. For the ensuing discussions, epigenetics refers to processes that include DNA modification, RNA and DNA-binding proteins (and their modification), with an emphasis on determining the extent to which these factors are durable and persistent in time.
Words matter. We ought not just go around calling anything that has to do with DNA “epigenetics.” Such considerations have fundamental implications for what is written in articles and textbooks, for what is being taught and for how things are investigated in the laboratory. The New Oxford American Dictionary goes with: relating to or arising from nongenetic influences on gene expression. The Oxford English Dictionary defines epigenetics as: of or pertaining to, or of the nature of epigenesis; in turn defined as: the formation of an organic germ as a new product; theory of epigenesis: the theory that the germ is brought into existence (by successive accretions), and not merely developed, in the process of reproduction. The OED also cites, helpfully perhaps, G. H. Lewes in Hist. Philos. (1867) Proleg. as stating “With Mind, as with Body, there is not preformation or pre-existence, but evolution and epigenesis.” Among Waddington’s statements on the matter, the OED highlights: “The science concerned with the causal analysis of development” from his 1952 treatise. The meaning of words is not fixed, however, and authority in their usage must arise from cogent argument as well as consensus, neither alone being sufficient. For this essay, shall we consider the epigenome to consist of the genome, plus everything that can bind to and modify it?
Without proteins, DNA is incapable of anything too interesting in eukaryotes, functionally or structurally: there are no stable higher order structures of DNA that form in vivo independent of histones or other chromatin binding proteins. Histone fold containing proteins are found in Archaea, and, although the mandatory heteromultimerization into nucleosomes that defines eukaryotic chromatin is absent in Archaea, structural studies indicate that DNA wraps around histone-like proteins in Archaea in a manner remarkably similar to that in eukaryotes (Mattiroli, Bhattacharyya et al. 2017), suggesting that the last common ancestor to utilize these proteins to package DNA is quite ancient.
Epigenesis connotes specialization. Visualize Professor Waddington’s drawing of a contoured surface on which a ball rolls downhill towards the viewer into one trough or another: a powerful image of cellular differentiation and the epigenetic processes that underpin it. (Incidentally, an even more prescient visual metaphor can be found a few pages later in Waddington’s 1957 treatise (Waddington 1957)[p. 36], in which he shows us what the underside of the Waddington landscape should look like, referring to the “genes and their chemical tendencies” that determine the contours of the surface and, by extrapolation, the range of possible paths the differentiating cell can take, as well as what gets them there.) The key concepts of epigenetics and cellular specification are contained in this model: some type of magical entropic force (the gravity equivalent) guides a ball down one of the valleys once it has committed to said valley; the energy required to remain in the valley is less than that required to jump from one valley to another; the valleys themselves are created by Waddington’s “chemical tendencies” affecting the genes, which we now know to include proteins, post-translational modifications of these proteins, modification to DNA, RNAs and the resulting quaternary structure of chromatin; and the movement is usually unidirectional, that is, towards a more differentiated state, although recent observations suggest that diseased cells adopt epigenomic features of more primitive cells as part of the disease process.
In studying chromatin there is a catch-22: if you trust the proteins, RNAs and modifications that have been studied longer, you have the weight of more publications to support your claim and the availability of reagents to perform genome-wide analyses. If you look for a new molecule, you may uncover a dramatic phenotype but will lack both the reagents and the weight of previous literature to test or propose universal function. Many investigators trust knockouts plus cell free reconstitution of a molecular event as the sine qua non for determining what’s real. Others trust genome wide measurements and demonstration across cell types, organ systems and organisms. Regarding this latter approach, modern techniques for analysis of chromatin structure provide complementary information about epigenomic processes. There are measurements that directly report chromatin structure, such as EM, FISH and X-ray crystallography. Then there are methods that indirectly report features of molecular interactions when coupled with DNA sequencing: chromatin immunoprecipitation (ChIP), the chromatin capture family (e.g. Hi-C, 3/4/5C) and the chromatin accessibility family (e.g. ATAC, DNase). Imaging based approaches tend to ignore locus and sequence specificity in favor of high resolution (X-ray and EM) or to target individual loci (FISH) at the loss of universality and resolution. Sequencing based approaches are inherently sequence specific and have the capacity to be genome wide but they do not reveal the actual 3D structure of chromatin. Instead one achieves a precise record of who is interacting with whom, a sort of 2D molecular dataset, that can then be used to reassemble a 3D representation of the epigenome.
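To make the idea of a “2D molecular dataset” concrete, here is a minimal Python sketch of how pairwise contacts from a chromatin capture experiment might be tallied into a symmetric matrix. The function name `contact_matrix`, the bin size and the toy read pairs are all hypothetical and chosen only for illustration; real pipelines include mapping, filtering and normalization steps that are not shown.

```python
from typing import List, Tuple


def contact_matrix(pairs: List[Tuple[int, int]],
                   region_length: int,
                   bin_size: int) -> List[List[int]]:
    """Count contacts between fixed-size genomic bins within one region."""
    n_bins = (region_length + bin_size - 1) // bin_size  # ceiling division
    matrix = [[0] * n_bins for _ in range(n_bins)]
    for pos_a, pos_b in pairs:
        i, j = pos_a // bin_size, pos_b // bin_size
        matrix[i][j] += 1
        if i != j:
            matrix[j][i] += 1  # mirror the count so the matrix stays symmetric
    return matrix


# Hypothetical toy data: each tuple is one ligated read pair (two bp coordinates).
toy_pairs = [(1_000, 42_000), (2_500, 41_000), (12_000, 13_500), (30_000, 44_000)]
toy_matrix = contact_matrix(toy_pairs, region_length=50_000, bin_size=10_000)
```

The matrix records which bins were found together and how often; it says nothing directly about where those bins sit in space, which is why any 3D representation built from it is an inference rather than an observation.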
Chromatin may be viewed as the cell’s brain: it holds memory, gives commands, processes inputs and responds to stress. It coordinates the other functions of the cell in these scenarios. Like the brain, chromatin performs decision making tasks by integrating multiple inputs to create an appropriate cellular phenotype. Chromatin is the organelle that brings order to the various other organelles and processes in the cell, levying control through transcriptional regulation. Chromatin is necessary for multicellularity, to get the same blueprint to do different things. Let us next examine how chromatin accomplishes this feat in the cardiovascular system and consider implications for disease.
The protein components of chromatin
The epigenome has two fundamental components: (1) the structural features of chromatin and (2) the enzymes, RNAs, small molecules and processes that modify these features. For example, a nucleosome is the chromatin structural unit, but an ATP-dependent chromatin remodeler is an enzyme complex that modifies this structural unit. Here is how you make a nucleosome (Kornberg 1977): bind histone H3 to histone H4 and combine two of these dimers to form a tetramer; then combine this tetramer with histone H2A and histone H2B, twice, and embrace this octamer in turn with ~145-147 bp of DNA. The crystal structure of the nucleosome (Luger, Mader et al. 1997) revealed this octameric protein complex to be assembled through histone fold dimers of H2A with H2B and H3 with H4. The intervening years have seen the discovery of multiple histone isoform variants that can exert structural changes at the atomic level on the functional properties of the nucleosome: one example is histone H2A.Z, an H2A variant associated with active transcription, which operates by altering the interfaces between H2A-H2B dimers, as well as amongst the entire tetramer, partially destabilizing the nucleosome core (thereby facilitating nucleosome eviction during transcription) (Suto, Clarkson et al. 2000). Another example is centromeric histone H3, called CENP-A, which replaces H3 in centromeric nucleosomes, altering the interactions amongst histones but maintaining the octameric structure (Sekulic, Bassett et al. 2010). Mammalian genomes encode multiple versions of core histones, with each family hosting multiple variants of unknown significance. These histone isoform variants have been shown to participate in development and disease through altered roles in DNA replication, chromosome packaging and DNA accessibility (Buschbeck and Hake 2017, Talbert and Henikoff 2017).
Proteomics experiments have demonstrated variable expression of these isoforms in different cell types, including in the heart (Franklin, Zhang et al. 2011), and the stoichiometry of histones changes with disease, such as in pressure overload hypertrophy (Franklin, Chen et al. 2012). In general, however, the nucleosome is amongst the most conserved protein structures known.
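The assembly recipe described above (two H3-H4 dimers forming a tetramer, then H2A-H2B added twice) can be written down as a small Python sketch. The names `Dimer` and `assemble_octamer` are hypothetical, and swapping a variant into both copies at once is a simplifying assumption made for illustration, not a claim about how variant nucleosomes are built in the cell.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Dimer:
    """A histone fold dimer, e.g. H3 with H4 or H2A with H2B."""
    histones: Tuple[str, str]


def assemble_octamer(h3: str = "H3", h2a: str = "H2A") -> Counter:
    """Follow the assembly order in the text; return histone copy numbers."""
    h3_h4 = Dimer((h3, "H4"))
    h2a_h2b = Dimer((h2a, "H2B"))
    tetramer = [h3_h4, h3_h4]                 # two H3-H4 dimers
    octamer = tetramer + [h2a_h2b, h2a_h2b]   # plus H2A-H2B, twice
    return Counter(h for dimer in octamer for h in dimer.histones)


WRAPPED_DNA_BP = (145, 147)  # range of wrapped DNA quoted in the text

canonical = assemble_octamer()
centromeric = assemble_octamer(h3="CENP-A")   # CENP-A in place of H3
variant_h2a = assemble_octamer(h2a="H2A.Z")   # H2A.Z in place of H2A
```

In each case the particle still holds eight histones; only the identity of one family changes, which mirrors the point that variants alter interfaces while the octameric architecture is retained.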
Where nucleosomes bind along the genome is influenced (certainly in reconstituted systems but likely also in vivo to some degree) by primary DNA sequence, with poly dA:dT rich regions being comparably depleted of nucleosomes, due in part to the biophysical properties of these regions that resist the bending necessary for nucleosome binding (Segal and Widom 2009). The supra-nucleosome structures chromatin may form will be discussed in detail below, but an additional class of chromatin structural element includes non-nucleosome chromatin structural proteins, such as histone H1 family proteins, CTCF, and high mobility group (HMG) proteins. Histone H1, the so-called linker histone (histone H5 being a variant found in avian erythrocytes), is not part of the nucleosome core particle, but may participate in the formation of higher order chromatin fibers through interactions with nucleosome histones and DNA (Crane-Robinson 2016). Structures of nucleosomes with H1 (which has been termed the chromatosome (Simpson 1978)) show the latter nestled just outside of the core (Allan, Hartman et al. 1980, Ramakrishnan, Finch et al. 1993), making independent contacts with DNA, but whether linker histones are obligatory components of a structural feature in eukaryotic chromatin remains a matter of debate. Observations from a variety of eukaryotic systems implicate linker histone H1 proteins in regulating wide ranging actions of chromatin including repair/stability, replication, and transcription (Fyodorov, Zhou et al. 2017). Whether some or all of these functions depend on actions of H1 on chromatin structure is unknown: few of the studies directly measure chromatin structure or accessibility concomitant with interrogation of H1’s role in higher level phenotypes, or vice versa. CTCF and HMG proteins each have likewise been attributed a wide range of possible functions, based mostly on gain/loss of function studies in cell systems.
As a general mechanism, CTCF binds DNA and facilitates chromatin looping (i.e. the formation of semi-stable long range intra-chromosomal interactions, spanning kilobases) (Phillips and Corces 2009); specificity may arise, or be imparted, on this action in different cell types to facilitate regulatory interactions. HMG proteins, of which there are many families, bind and stabilize distinct structural features in DNA, thereby contributing to higher order chromatin anatomy and perhaps assemblies of nucleosomes. No consensus is forthcoming, however, about the existence or structural features of repeating units of chromatin endowed by HMG family proteins.
How nucleosomes position along chromatin is largely specified, in multicellular eukaryotes, by ATP-dependent chromatin remodeling enzymes and the trio of protein classes responsible for the so-called histone code or epigenetic language, those being writers, erasers and readers, which add, remove or interpret histone modifications, respectively. Although all these processes exist in other organelles and compartments (indeed many purported histone modifiers in fact interface with non-histone, non-nuclear proteins), the allegorical “grammar” terminology adopted for chromatin PTMs has been particularly helpful in conceptualizing experiments into how gene expression is regulated and particularly effective in biasing the interpretation of results.
Since their discovery and association with transcription or its inhibition, histone modifications have been the source of innumerable question-begging experiments and seemingly endless rounds of debate regarding their relationship to gene regulation (Stedman and Stedman 1950, Turner 1991, Strahl and Allis 2000). Histone post-translational modifications were originally described in the 1960s (Allfrey, Faulkner et al. 1964), but only with the advent of chromatin immunoprecipitation and DNA sequencing (ChIP-seq, and prior to that ChIP-chip, where the second chip referred to the microarray chip used to identify the genomic region) could the occupancy of these modifications be correlated, in a genome-wide manner in a single experiment, with gene transcription. Histone H3 has been particularly well characterized and shown to exhibit conserved regulatory behavior across species and cell type, with consensus existing about the roles of H3K27me3 in reversible gene silencing, H3K27ac in gene activation, H3K4me3 in gene activation, and H3K9me3 in more lasting gene silencing and heterochromatin. Modifications on other histones, such as K20me on histone H4 (associated with gene silencing), also exhibit evolutionarily conserved relationships to gene regulation. Histone modifications are associated with a variety of processes not related directly to transcription including replication, DNA repair, mitosis, and cell division (Zhao and Garcia 2015, Buschbeck and Hake 2017, Talbert and Henikoff 2017), although these processes tend to be better understood in model organisms like yeast and not extensively studied in mammalian systems in vivo.
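The consensus associations listed above can be held in a simple lookup table, sketched here in Python. The table contents come from the paragraph; the function name `summarize` and the labels “mixed” and “unknown” are hypothetical conveniences, and the whole thing encodes correlations, not rules of causation.

```python
# Associations as summarized in the text; these are correlations, not rules.
MARK_ASSOCIATION = {
    "H3K27me3": "repressive",  # reversible gene silencing
    "H3K9me3": "repressive",   # more lasting silencing and heterochromatin
    "H4K20me": "repressive",   # associated with gene silencing
    "H3K27ac": "active",       # gene activation
    "H3K4me3": "active",       # gene activation
}


def summarize(marks):
    """Label a collection of marks 'active', 'repressive', 'mixed' or 'unknown'."""
    labels = {MARK_ASSOCIATION[m] for m in marks if m in MARK_ASSOCIATION}
    if not labels:
        return "unknown"   # no mark in the table was recognized
    if len(labels) == 2:
        return "mixed"     # both activating and silencing marks present
    return labels.pop()


promoter_like = summarize({"H3K4me3", "H3K27ac"})
opposing = summarize({"H3K4me3", "H3K27me3"})
unstudied = summarize({"H3K36me3"})  # not in the table above
```

A mark that falls outside the table returns “unknown”, which is a fair summary of the situation for the many modifications that lack reagents.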
That consistent results are observed across labs and cell systems from experiments with the subset of histone PTMs for which antibodies are commercially available strongly supports the prevailing dogma about the correlation (some positive, some negative, some exhibiting combinatorial complexity) between histone modifications and gene expression. The histone code hypothesis (discussed in greater detail below), as originally articulated by Strahl and Allis (Strahl and Allis 2000), states that a prescribed transcriptional or other genomic regulatory response arises, or is evoked, from a certain combination of histone PTMs in vivo. This hypothesis is intuitive and thus has been quite influential in driving the chromatin biology field for the last two decades. The histone code idea also establishes a framework in which to examine large amounts of ChIP-seq data, inspecting for regions of correlation between histone marks and gene expression. The resolution of most ChIP-seq “peaks” is ~200-1,000 bp and is highly dependent on the informatics tools used to map reads and subsequently call peaks (Zentner and Henikoff 2014). Because protein occupancy on the genome makes a much smaller footprint, it is consequently often unknown exactly where a protein binds; the less commonly applied ChIP-exo (Rhee and Pugh 2011) and X-ChIP-seq (Skene and Henikoff 2015), achieving ~25-50 bp resolution, are exceptions. Thus, one does not know whether histone PTMs occupy the same nucleosome, which would be required for a histone PTM “reader” protein to distinguish a full, accurate code from a partial and/or inaccurate code. A partial exception: chromatin digestion to single nucleosomes coupled with mass spectrometry proved coexistence of modifications on the same particle in an antibody independent manner, which incidentally revealed that bivalency can occur at the single nucleosome level (i.e. different histone H3 copies in the same complex have opposing PTMs, one active and the other repressive), so-called asymmetrically modified nucleosomes (Voigt, LeRoy et al. 2012). From this information we will then want to know: how does a protein complex that acts across the genome exert effects on a specific subset of genes in different cell types, and what do we learn about the modus operandi of the ATP-dependent chromatin remodeler, or of any of the writer, eraser and reader proteins, from a knockout experiment, as is often utilized to investigate epigenetic modifiers in cardiovascular pathophysiology?
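The resolution problem is a matter of simple arithmetic, sketched below in Python. The peak widths and the ~25-50 bp figure come from the text; the nucleosome repeat length of 200 bp (core DNA plus linker) is an assumed round number, and the function names are hypothetical.

```python
ASSUMED_REPEAT_BP = 200  # assumption: ~147 bp core DNA plus linker, rounded


def nucleosomes_spanned(peak_width_bp: int,
                        repeat_length_bp: int = ASSUMED_REPEAT_BP) -> float:
    """Approximate number of nucleosome repeats covered by one peak."""
    return peak_width_bp / repeat_length_bp


def narrower_than_one_repeat(peak_width_bp: int,
                             repeat_length_bp: int = ASSUMED_REPEAT_BP) -> bool:
    """True if the peak is narrower than a single nucleosome repeat."""
    return peak_width_bp < repeat_length_bp


broad_peak = nucleosomes_spanned(1000)   # upper end of typical ChIP-seq peaks
narrow_peak = nucleosomes_spanned(200)   # lower end of typical ChIP-seq peaks
exo_like = nucleosomes_spanned(50)       # upper end of the ~25-50 bp methods
```

Under these assumptions a typical peak covers between one and five nucleosome repeats, so two marks with overlapping peaks need not share a particle. Even a peak narrower than one repeat is a population average across many cells, which is why the mononucleosome mass spectrometry result cited above carries weight that peak overlap alone does not.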
Histone writers and erasers are signal transduction processes that sometimes use histones as substrates. The concept of the reader, however, is an indispensable component of epigenetic language and one without obvious equal outside the context of chromatin (scaffolding proteins in signaling networks are a similar but distinct concept). To work as advertised (that is, to distinguish target region alpha from non-target beta, to bind alpha, and then to do something), readers must be able to simultaneously recognize more than one histone PTM. Let us go down to the single molecule level here for a second: this means that if the code requires two modifications, call them X and Y, to be present to generate output, say transcription, when bound by a reader, the reader must bind if and only if X and Y are both present on the same protein complex. The reader must therefore be endowed with intramolecular (or intermolecular, if the reader is a multiprotein complex) anatomical domains enabling: binding to X; binding to Y; cooperative structural changes that occur only when X and Y are bound and that in turn do something (e.g. binding of a nucleosome remodeler, binding of an RNA polymerase, or the prohibition of such events); and safeguard features that prevent the doing of something in the absence of binding of both X and Y. A reader may accomplish these tasks on neighboring histones in the same nucleosome (in other words, if two modifications to histone H3, for example, specify an action distinct from the situation where either is present alone, then these may occur on different copies of the H3 present in the same nucleosome octamer), but this feat would require a still longer range inter/intramolecular communication with the above referenced considerations. Evidence definitively demonstrating any of these three tactics (same molecule, different copies of same protein in same nucleosome, or neighboring nucleosomes) exists (Sabari, Zhang et al. 2017) but is slim, and none of the tactics has been shown to operate in a mammalian tissue specific manner. Thus, widespread evidence is lacking for epigenetic readers acting according to the aforementioned principles in the cardiovascular system. Progress has recently been made in the area of heart failure, however, with the example of the BET bromodomain protein BRD4, whose inhibition with the small molecule JQ1 prevents hypertrophy and associated transcriptional changes (Anand, Brown et al. 2013, Spiltoir, Stratton et al. 2013) and whose pharmacological inhibition reverses some of the fibrotic and deleterious structural remodeling in the wake of infarction or pressure overload (having no effect on physiological hypertrophy) (Duan, McMahon et al. 2017).
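The “if and only if X and Y” requirement placed on a reader can be stated as a short Python sketch. It models a nucleosome as a list of histone copies, each carrying a set of PTMs, and covers two of the three tactics discussed above (same molecule, and different copies within the same nucleosome); the neighboring nucleosome case is omitted for brevity. The function name `reader_fires`, the `scope` labels and the placeholder marks are hypothetical.

```python
from typing import FrozenSet, List

Histone = FrozenSet[str]     # PTMs carried by one histone copy
Nucleosome = List[Histone]   # the histone copies within one particle


def reader_fires(nucleosome: Nucleosome, x: str, y: str,
                 scope: str = "same_copy") -> bool:
    """Return True only when both marks are present within the given scope."""
    if scope == "same_copy":
        # Both marks must sit on one and the same histone molecule.
        return any(x in copy and y in copy for copy in nucleosome)
    if scope == "same_nucleosome":
        # Marks may sit on different copies within the same particle.
        marks = set().union(*nucleosome)
        return x in marks and y in marks
    raise ValueError(f"unknown scope: {scope}")


both_on_one_copy = [frozenset({"X", "Y"}), frozenset()]
asymmetric = [frozenset({"X"}), frozenset({"Y"})]   # one mark per copy
only_x = [frozenset({"X"}), frozenset({"X"})]
```

For example, `reader_fires(asymmetric, "X", "Y")` is false under the default scope but true with `scope="same_nucleosome"`, while `only_x` never triggers the reader. The sketch captures only the logic; the structural demands on a real reader (cooperative conformational change and safeguards against partial binding) are exactly what the logic leaves out.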
Chromatin is often viewed as though all processes are active in disease: if observed, a protein, a modification, a structural feature is there for a reason. But what if we instead assume the exact opposite: that all chromatin features are reactive and that nothing is causative? In this scenario we acknowledge, rather than ignore, the precise temporal, intermolecular events that must be (and very, very rarely are) demonstrated to be true before something can be said to be actively involved in, say, transcription, and since in most cases this confirmatory evidence is lacking, we build a model based on data we do have, instead of based on data we think we should be able to get (some day). The data we have: (i) histone modifications and combinations of modifications correlate with transcriptional behaviors; (ii) genetic or pharmacologic manipulation of chromatin machinery causes disease; (iii) development is associated with decreased potentiality, in terms of gene expression and chromatin marks, of the genome and disease is associated with a partial recovery of potentiality. In this model, there is no first mover at the molecular level and the histone PTMs observed on chromatin are neither consequence nor cause of gene expression changes: they are coincident, although certainly not unrelated.
The modifications of histones, then. All four core histones are extensively post-translationally modified, with different histones (and different modifications) studied to varying extents and degrees of rigor and functional depth. The bane of accurate and reproducible proteomic mass spectrometry is that a truly unbiased experiment to detect PTMs will identify heaping loads of them, with no certainty about which ones are important, versus otiose signaling noise. The most common answer to the question of which PTMs are important for chromatin regulation is some variation on “only the ones shown to be functionally involved in a phenotype.” Gould and Lewontin borrowed the term spandrel (Gould and Lewontin 1979) (those ornate painted or chiseled thingamajigs on the arches of Renaissance buildings that were architecturally necessary but only later took on a decorative role) to refer to a biological feature that is a necessary component of something else but that itself is not directly evolutionarily selected for. This concept may inform thinking about evolution of histone marks: some keep chromatin open and others keep it closed. Chromatin accessibility was actively selected for, by virtue of its relationship to cellular plasticity and overt phenotypes, and histone marks were brought along in the process. Often the PTMs deemed most tantalizing are those previously published on in lower organisms and/or those for which commercially available, good antibodies exist (let’s set aside for now the problems with what the term “good” connotes versus what it should actually mean in this usage, and just say that good antibodies are those that produce a single band in a test cell lysate and in some fortunate cases have been shown to bind a recombinant version of the protein against which they were raised; we can mitigate this concern, however (Rothbart, Dickson et al. 2015)).
In the chromatin world, these PTMs and the associated antibodies are the ones getting the most game time in ChIP-seq experiments. When examining a new PTM, no good antibody, and/or no tractable cell system for transgenesis of a tagged surrogate, means no genome-wide address book, which means, for the present, the given modification is assumed to be noise. That histone modifications cannot be unequivocally examined by genetic approaches (i.e. by mutating a residue in the gene and studying the protein) is the reason why they are alternatively so frustrating and interesting.
The first attempts to understand proteomic remodeling of the cardiomyopathic nucleus were conducted in hamsters (Liew and Sole 1978), decades before the word ‘proteomics’ was even coined and during the pre-dawn of two dimensional electrophoresis’s era as a common lab technique. These early studies did not directly measure histone PTMs (although the PTMs likely were reflected in the 2D gel patterns reported in these investigations), owing to the absence of accurate and quantitative mass spectrometry of proteins and PTMs. More recently, quantitative analysis of nuclear proteins (Kislinger, Cox et al. 2006) or chromatin-associated proteins in the heart described the proteins regulating cardiac epigenomes (Frazer, Eskin et al. 2007) and revealed changes in histone protein isoform stoichiometry (Franklin, Chen et al. 2012) in the setting of pressure overload hypertrophy, yet these studies did not characterize histone PTMs. Renal nuclear proteomes have also been explored (Pickering, Grady et al. 2016), with physiological implications for cardiovascular disease. A field unto itself, mass spectrometry-based identification of PTMs on histones wrestles with unique challenges (in contrast to PTM analyses on non-histone proteins), including: histones are small in size and contain few tryptic peptides (the species of peptide generated by trypsin digestion, the most common form of proteolytic treatment prior to peptide-based mass spectrometry); histone isoforms of the same family differ by only a few amino acids, further complicating accurate attribution of a PTM to one isoform and not another similar family member; and the modified species of the protein often represents a subset of the total pool of the protein, necessitating enrichment to achieve identification (this problem is not unique to histones but, along with other considerations, is exacerbated when analyzing them).
Across eukaryotes, ~500 individual post-translational modifications have been identified just on the core nucleosome particle (Zhao and Garcia 2015). Very few of these have been identified, much less quantified and proficiently interrogated, in the cardiovascular system (most studies, to our knowledge, of histone PTMs in the cardiovascular system rely on commercially available antibodies). Histone PTMs thus represent a large unexplored area of basic cardiovascular research in chromatin. Recent studies have employed large libraries of histone modifications, in some cases representing >100 different nucleosome combinations (note: these are discrete, tailored nucleosomes with known histone post-translational modifications, in contrast to the total list of histone modifications, for which it is rarely known whether the modifications are happening on the same nucleosome, let alone in the same copy of the histone, with the exception of proteomic studies on intact proteins (Zheng, Huang et al. 2016)), to investigate the ability of particular histone PTM combinations to influence the activity of human ISWI ATP-dependent chromatin remodelers (Dann, Liszczak et al. 2017). These libraries (Nguyen, Bittova et al. 2014) allow for the effects of individual recombinant nucleosomes with designer histone modifications (and combinations of modifications) to be tested in vitro for their ability to influence a host of chromatin properties, such as the binding preferences for transcription factors or chromatin modifying enzymes.
Actions of histone modifying enzymes are manifold, dynamic and correlated with transcription
Investigations into the enzymes responsible for depositing and removing histone modifications have demonstrated striking phenotypes in a range of cardiovascular syndromes. Histone deacetylases (HDACs) are one of the most extensively studied families of histone-modifying enzymes in the cardiovascular system. HDACs consist of four families, each with distinct isoforms which in turn have distinct histone and—in some cases—non-histone targets, distinct cellular locations and distinct biological functions. Inhibition of HDACs has been shown pharmacologically (e.g. with trichostatin A or valproic acid) to prevent proliferation of vascular smooth muscle cells (Okamoto, Fujioka et al. 2006, Kong, Fang et al. 2009)(with implications for atherosclerosis (Zheng, Zhou et al. 2015)), to attenuate hypertension (Cardinale, Sriramula et al. 2010), to ameliorate ischemic/reperfusion injury and post-ischemic remodeling (Lee, Lin et al. 2007, Zhao, Cheng et al. 2007, Granger, Abdullah et al. 2008, Aune, Herr et al. 2014)and to block cardiac hypertrophy in the setting of heart failure (Kook, Lepore et al. 2003, McKinsey and Olson 2004, Cao, Wang et al. 2011). Molecular dissection of these phenomena, particularly in the setting of cardiac growth have revealed HDACs to be powerful hypertrophic modulators: loss of HDAC9 leads to prodigious cardiac growth (Zhang, McKinsey et al. 2002), HDAC4 and 5 regulate CamKII-dependent gene regulation (Zhang, Kohlhaas et al. 2007, Backs, Backs et al. 2008), HDAC2 regulates GSK3beta-Akt-dependent fetal gene activation in hypertrophy (Trivedi, Luo et al. 2007), to name just a few examples. Indeed, the literature on the role of HDAC targeting by drugs or genetic manipulation in the cardiovascular system is sufficiently vast to devour entire, authoritative reviews articles (McKinsey and Olson 2004, McKinsey 2011, Gillette and Hill 2015, Zheng, Zhou et al. 2015).
Histone acetyltransferases (HATs; the writer to the HDAC eraser) also constitute a large family of genes and have numerous nuclear and non-nuclear substrates (with type A being nuclear and type B being non-nuclear, mainly cytoplasmic) and have been characterized with varying degrees of specificity in cardiac and vascular cells (Pons, de Vries et al. 2009, Wang, Miao et al. 2014, Gillette and Hill 2015). One of the best characterized HATs is p300 which, in addition to its common residence at enhancer elements, has been shown to regulate genes that inhibit endothelial cell inflammation in the setting of atherosclerosis (Zhang, Qiu et al. 2011)and to attenuate salt-induced hypertensive heart failure (Morimoto, Sunagawa et al. 2008)and agonist-induced cardiac hypertrophy (Gusterson, Jazrawi et al. 2003)(the males absent on the first [MOF] HAT has similar anti-hypertrophic actions when overexpressed in the mouse (Qiao, Zhang et al. 2014)).
The histone methyltransferase (HMT) SET domain containing 2 (SETD2) was recently shown to be essential for myoblast differentiation in a process involving histone H3K36 trimethylation (Yi, Tao et al. 2017). Smyd1, a member of the SET and MYND domain containing (Smyd) HMT family (which has 5 isoforms of varying tissue expression) that is restricted to striated muscle, has been shown to participate in cardiac phenotype through loss of function studies in the adult heart, which resulted in hypertrophy and dilation and de-repression of some cardiac disease genes (Franklin, Kimball et al. 2016). Smyd1 localizes to the nucleus and interacts with chromatin in the adult heart (Franklin, Kimball et al. 2016) and has been shown to regulate H3K4me3, an activating mark, in reconstituted systems (Tan, Rotllant et al. 2006). Mice with Smyd1 depletion in adulthood nevertheless exhibited sustained H3K4me3 levels (Franklin, Kimball et al. 2016), supporting the existence of alternative substrates for Smyd1 and clearly indicating the existence of alternative proteins capable of maintaining H3K4me3 in cardiac myocytes. A substantial portion of the protein is also non-nuclear, and this non-nuclear population has been implicated in adult cardiac function (along with another family member, Smyd2 (Voelkel, Andresen et al. 2013)) and heart development (Gottlieb, Pierce et al. 2002, Just, Meder et al. 2011, Rasmussen, Ma et al. 2015). A different SET family protein, G9a (also known as euchromatic histone lysine methyltransferase 2, EHMT2), has been shown by loss of function studies to play a key role in adult cardiac phenotype: inducible MerCreMer depletion resulted in cardiac hypertrophy, modestly depressed ejection fraction and fibrosis through a mechanism that involves targeted derangement of multiple histone methylation marks (including H3K9me2/3 and H3K27me3) at genes involved in cardiac function (Papait, Serio et al. 2017).
On the removal side of methylation, genetic loss or augmentation of the histone demethylase JMJD2A respectively blocked or exacerbated cardiac hypertrophy in mice (Zhang, Chen et al. 2011). A recent pharmacological study showed that inhibition of histone methylation at H3K9 with the compound chaetocin (which targets the enzyme [SU(VAR)3-9] responsible for conversion of H3K9me2 to H3K9me3) attenuates some aspects of salt-induced cardiac dysfunction (survival rate, fractional shortening and fibrosis)—while not affecting pathologic gene regulation and only modestly impacting hypertrophic growth—in part through diminished H3K9me3 (and presumably less silencing) at repetitive elements and mitochondrial genes (Ono, Kamimura et al. 2017).
The polycomb repressive complex (PRC) is one of the best studied gene silencing complexes (responsible for H3K27me3 deposition) and has been implicated, usually through the actions and/or genetic disruption of one or more of its components, in a wide variety of higher phenotypes in mammals. In the cardiovascular system, the Ezh subunits 1 and 2 were shown to be differentially involved in cardiac development and regeneration: both were necessary for normal development, with Ezh1 but not 2 being required for neonatal heart regeneration and with Ezh1 but not 2 being capable, via overexpression, of promoting regeneration in the hearts of mice aged outside the established neonatal regenerative period (Ai, Yu et al. 2017), suggesting that features of myocyte proliferative/regenerative plasticity may be revived through histone modifying enzymes. Ezh2 stabilizes forming blood vessels in the mouse embryo (Delgado-Olguin, Dang et al. 2014) and pharmacologic inhibition of Ezh2 (and some H3K27me3 target loci) improved outcomes in hind limb ischemia (Mitic, Caporali et al. 2015). Some naturally occurring compounds target enzymes that modify histones (and non-histone proteins containing lysine residues) and have substantial in vivo benefit in conditions such as cancer (Nian, Delage et al. 2009, Kim, Bisson et al. 2016). Since many of these compounds are present in dietary selections shown epidemiologically to promote cardiovascular health (such as cruciferous vegetables), part of this effect may come from their promotion, at the subcellular level and across organ systems, of favorable epigenomic health.
While the histone isoforms and even specific residues targeted by individual enzymes have often been worked out in reconstituted systems in vitro, such information is almost universally lacking from such studies in animal models of the cardiovascular system (this observation is also true, incidentally, for most studies of acetylating/deacetylating and methylating/demethylating enzymes in non-cardiovascular systems when examined at the organ level in animal models). It is also important to note that, in apparent disavowal of their given names, many of these histone-modifying enzymes target non-histone substrates that have no obvious or even implied relationship to gene expression or epigenetics yet exhibit powerful effects on complex organ level phenotypes (cardiovascular examples include Smyd on titin (Voelkel, Andresen et al. 2013), HDACs on myofilaments (Jeong, Lin et al. 2018), and HDAC family members of the Sirtuin class, which regulate many substrates in mitochondria and cytosol (Matsushima and Sadoshima 2015)).
As a result, the rubric from model systems and reconstitution experiments is often imported wholesale to the interpretation of results from animal studies in which enzymes are targeted pharmacologically or genetically in the absence of unequivocal evidence that the enzymes are actually operating as advertised and having their effects in the complex disease scenario through (and only through) the established molecular mechanisms demonstrated in yeast, cell culture or test tube. This leads to a couple of interesting, as-yet unanswered and intimately related questions about chromatin modifying enzymes in the cardiovascular system. First, what are the principles that allow for coordination of the various writers, readers and erasers in the given cell type at any time (by this it is meant: how do the histone modifiers themselves get turned on or off, up or down, and when turned on, how do they compete for influence over gene expression in a reproducible manner)? And second, how does the cadre of expressed-at-any-given-time enzymes decide which nucleosomes (and thereby, which genes) to modify (i.e. how is targeting accomplished, since most histone modifying enzymes do not have DNA sequence targeting motifs)?
One approach to answer these questions would be to identify all the histone modifying enzymes expressed in the heart or vascular cell type of interest, to characterize their changes in expression and localization in disease models, and to perform both gain and loss of function genetic studies to reveal changes in intermediate endpoints like gene expression, metabolism and cell survival, while simultaneously, in the same disease models, performing genome-wide ChIP-seq for all the proteins (concomitant with RNA-seq and chromatin accessibility assays, as non-biased, global, output-assaying measurements of the cumulative effects on the epigenome) in the normal and diseased tissue and then attempting to integrate all of these data into some rigorous model that incorporates both the gain/loss of function phenotypic observations and the truly massive amount of genome-wide occupancy data to predict, or at least in silico replicate, the intermediate phenotypes in a reproducible manner, thereby divining principles for how all the histone modifying enzymes together create the gene expression profiles and ultimate cellular phenotypes of health and disease, which frankly gives me the fantods just thinking about it. Another approach may be to focus on intermediate indices of epigenomic function like accessibility and structure, designing interventions that modulate these indices.
Chromatin remodelers use ATP to actively reposition nucleosomes
ATP-dependent chromatin remodeling enzyme complexes use the energy from ATP to translocate DNA through the nucleosome. That is, the ATP-dependent chromatin remodelers reposition nucleosomes along the genome according, in part, to cues harbored in the spectrum of histone tail PTMs, thereby enabling fundamental genomic processes like transcription, nucleosome assembly/disassembly, mitosis, meiosis and chromosome segregation. One of the most studied of these complexes is the SWItch/Sucrose Non-Fermentable (SWI/SNF) complex, originally identified in yeast (and known to have >10 protein components) and its mammalian cousin the BAF (brahma-associated factor) complex (itself composed of >10 protein components, some of which exhibit tissue specific expression). Models for the actions of these remodelers are informed by protein crystallography studies, in vitro biochemical assays and ChIP-seq-based genome wide measurements; they are thus highly developed and intricate, and can help to explain the observed dynamism of chromatin in development, between cell types and in stimulus-response (Ho and Crabtree 2010, Narlikar, Sundaramoorthy et al. 2013, Clapier, Iwasa et al. 2017). Because the presence of bivalent marks at a given locus is necessary but not sufficient to specify a bivalent locus, recent studies have focused on evaluating the role of ATP-dependent chromatin remodelers in sealing the deal (Harikumar and Meshorer 2015). Studies in the cardiovascular system have examined the role of these complexes in tissue level phenotypes by genetic manipulation of the ATPase subunits of the BAF complex, Brm (Brahma) or Brg1 (Brahma-related gene 1, a.k.a. Smarca4).
For example, genetic disruption of either of these molecules alone had no effect on retinal angiogenesis in neonates, exercise-induced angiogenesis in adult skeletal muscle, or tumor angiogenesis, whereas mice with disruption of both Brm and Brg1 after birth exhibited fatal vascular malfunction in the heart and gut during the early postnatal period (Wiley, Muthukumar et al. 2015) (similar context dependent functional redundancy, and lack thereof, was observed between Brm and Brg1 in the vascular endothelium, wherein disruption of both proteins in endothelial cells was required to observe tissue level defects (Willis, Homeister et al. 2012)). Brg1 has been shown to be involved in zebrafish myocyte proliferation and cardiac regeneration (Xiao, Gao et al. 2016) and in mesoderm, and hence cardiomyocyte, differentiation in cell culture, in part by modulating enhancer activity (Alexander, Hota et al. 2015), whereas both Brg1 and Smarcd3 (a.k.a. BAF60c, another BAF complex member) are required for normal heart development in mouse (Lickert, Takeuchi et al. 2004, Hang, Yang et al. 2010, Takeuchi, Lou et al. 2011) (incidentally, Brg1 was found to be down-regulated in adult murine hearts and re-expressed, concomitant with the myosin heavy chain isoform switch [alpha to beta] associated with cardiac pathology; blocking Brg1 upregulation in the adult prevented this molecular event and attenuated hypertrophy (Hang, Yang et al. 2010)). Early formation of vasculature and erythropoiesis in mouse is dependent on Brg1 but not Brm in hematopoietic and endothelial cells (Griffin, Brennan et al. 2008), implying functional distinction between complexes seeded with these different ATPases during cardiovascular development (a similar conclusion was made from genetic disruption in smooth muscle cells (Zhang, Chen et al. 2011)), a functional involvement that may extend into adulthood in the setting of endothelial injury and presumably disease (Fang, Chen et al. 2013).
DNA methylation is stable, but not immutable, and can regulate gene expression and phenotype
The process most unabashedly epigenetic in higher organisms is DNA methylation, common in vertebrate development where it reinforces cell fate decisions and controls imprinting, or the dependence of gene/protein expression (and associated phenotypes) on whether a given version of a gene is expressed from the maternal or paternal allele. Unlike histone modifications, which can occur on any number of different amino acids apparently without heed to locale (that is, without clear DNA consensus motifs), DNA methylation targets a single residue, cytosine, usually in a single context (that is, when followed by a guanine, so called CpG dinucleotides; recent evidence suggests, however, that this too may be an oversimplification, as non-CpG DNA methylation, so-called CpH methylation [where the H connotes A, T or C], occurs in some cells (Guo, Su et al. 2014), such as neurons [where it may account for 25% of methylation], although this has only begun to be explored in the cardiovascular system (Nothjunge, Nuhrenberg et al. 2017, Zhang, Wu et al. 2017)). Also contrasting with the plethora of proteins controlling histone PTM, DNA methylation is directly added or removed by a narrow suite of enzymes: maintenance (DNA methyltransferase 1, DNMT1) and de novo (DNMT3a and DNMT3b; DNMT2 modifies RNA and DNMT3L is a catalytically inactive regulatory component of the methylation machinery) methyltransferases, which establish methylation patterns after mitosis and replication, and alter the pattern of methylation during organismal development and disease, respectively. Demethylation of DNA occurs in part via non-enzymatic means during replication, as well as during normal and pathological conditions in non-dividing cells. Conversion of 5-methylcytosine (5mC) to 5-hydroxymethylcytosine (5hmC) is catalyzed by the ten-eleven translocation (TET) family of methylcytosine dioxygenases, an active, selective process.
5hmC is a less stable modification, prone to non-enzymatic conversion to (unmodified) cytosine, and has thus been proposed as a molecular beacon of genes switching from off to on. DNA methylation patterns are erased and reestablished transgenerationally, but this process appears to involve faithful perpetuation of methylation marks along the genetic lineage, i.e. from parent to progeny (further evidence of this phenomenon can be seen in inbred mouse strains, whose DNA methylation landscapes are epigenetically preserved within a genetic lineage, whilst being stably distinct between lineages (Orozco, Morselli et al. 2015, Chen, Orozco et al. 2016)).
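The sequence contexts defined above (CpG, a cytosine followed by a guanine; CpH, a cytosine followed by A, T or C) can be made concrete with a short sketch. It is only an illustration on a made-up sequence: the function name and the example string are hypothetical, and it counts contexts on a single strand without reference to any real genome or methylation data.

```python
from collections import Counter


def cytosine_contexts(seq: str) -> Counter:
    """Count cytosines on one strand by the base that follows them.

    'CpG' = C followed by G; 'CpH' = C followed by A, T or C.
    A cytosine at the very end of the sequence has no following base
    and is skipped, as is a cytosine followed by an ambiguous base (N).
    """
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - 1):
        if seq[i] != "C":
            continue
        following = seq[i + 1]
        if following == "G":
            counts["CpG"] += 1
        elif following in "ATC":
            counts["CpH"] += 1
    return counts


# Hypothetical 12-base example sequence (not from any real locus).
example = "ACGTCCATCGGC"
print(cytosine_contexts(example))
```

In a real analysis the same classification would be applied to both strands of a reference genome, so that each measured cytosine can be assigned to the CpG or CpH pool before methylation levels are compared.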
Bisulfite sequencing is the method for unequivocal determination of methylation status (note: all methods for DNA methylation analysis are in symbiosis with dynamic suites of informatics tools (Teschendorff and Relton 2018)). Briefly, treatment of DNA with bisulfite converts unmethylated cytosines to uracil, leaving methylated (including hydroxymethylated) cytosines unmolested. Subsequent sequencing allows determination of methylation status: uracil (read as thymine after amplification) indicates unmethylated; cytosine, methylated. Broadly construed, DNA methylation in CpG islands (regions of genome with high frequencies of the dinucleotide, which incline towards promoters) and shores (areas around said islands) tends to be associated with gene silencing (Jones 2012). Conversely, the bodies of mRNA-encoding genes tend to be methylated, without an established correlation allowing prediction of expression. Mammalian genomes exhibit enrichment of CpGs in exons and promoters (versus intronic and intergenic regions) regardless of their methylation status. Leveraging this principle of genomic cartography, the technique of reduced representation bisulfite sequencing (RRBS) was invented to interrogate CpG methylation status near regulatory regions of genes. Treatment of genomic DNA with a restriction enzyme (e.g. MspI) that cuts around CpG rich regions, coupled with bisulfite conversion, enables concentration of maximal sequencing firepower on the region of the genome hypothesized to be most centrally involved in transcription and thereby phenotype. In an RRBS experiment, ~1-3 million CpGs are commonly measured (depending on species and sequencing depth), representing ~10% of all CpGs in the genome; the CpGs are preferentially sampled, however, from regulatory regions like CpG islands and shores in promoters and genes.
RRBS delivers, therefore, a good bang for the buck in terms of single base resolution methylation information and potential for disease insights. Other methods for large scale analysis of DNA methylation include methylation immunoprecipitation (which has been used to identify methylation dependent regulation of atherosclerotic risk in humans (Aavik, Lumivuori et al. 2015)) and DNA methylation arrays (notably the Illumina 450K arrays), the latter of which have been extensively deployed in humans to characterize methylation patterns associated with a host of pathophysiological conditions including cancer (Sandoval, Heyn et al. 2011), high blood pressure (Richard, Huan et al. 2017), body mass index and obesity (Aslibekyan, Demerath et al. 2015, Mendelson, Marioni et al. 2017), atrial fibrillation (Lin, Yin et al. 2017), inflammation (Ligthart, Marzi et al. 2016), and death (Chen, Marioni et al. 2016). Perhaps because of cost and technical demands, RRBS (or much more expensive whole genome bisulfite sequencing) has been applied to a smaller list of diseases. Such data from mice, however, show that DNA methylation plays a powerful role in heritable differences in response to metabolic syndrome (Orozco, Morselli et al. 2015) and may contribute to catecholamine induced cardiac pathology (Chen, Orozco et al. 2016). DNA methylation and/or hydroxymethylation abnormalities have been found in animal (Gilsbach, Preissl et al. 2014, Greco, Kunderfranco et al. 2016) and human (Movassagh, Choy et al. 2011, Movassagh, Vujic et al. 2011) heart failure, associated with changes in expression of pathologic genes.
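The logic of the bisulfite read-out described above can be sketched in a few lines. This is a toy model rather than a real pipeline: it assumes reads are already aligned, full length and from the same strand as the reference, and the function names and the 8-base reference are hypothetical. Production tools also handle both strands, incomplete conversion and sequencing error, none of which is modeled here.

```python
def bisulfite_convert(seq: str, methylated: set[int]) -> str:
    """Simulate bisulfite treatment plus amplification on one strand.

    Unmethylated C is read as T; C at a position listed in `methylated`
    (0-based) stays C. Hydroxymethylated C would also stay C, so the
    two marks cannot be told apart by this read-out.
    """
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq.upper())
    )


def call_methylation(reference: str, reads: list[str]) -> dict[int, float]:
    """Per reference cytosine, the fraction of reads showing C rather than T."""
    levels = {}
    for i, base in enumerate(reference.upper()):
        if base != "C":
            continue
        c_count = sum(1 for read in reads if read[i] == "C")
        t_count = sum(1 for read in reads if read[i] == "T")
        if c_count + t_count:
            levels[i] = c_count / (c_count + t_count)
    return levels


# Hypothetical reference with cytosines at positions 1 and 5.
reference = "ACGTTCGA"
# Three simulated molecules: one methylated at both sites, two at site 1 only.
molecules = [{1, 5}, {1}, {1}]
reads = [bisulfite_convert(reference, m) for m in molecules]
print(reads)
print(call_methylation(reference, reads))
```

The per-site fraction is what is usually reported as the methylation level of a CpG: a population average over the molecules (and therefore cells) that were sequenced.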
Work from mouse cardiomyocytes suggests that DNA methylation largely obeys chromatin structural features of A/B compartmentalization (itself defined based on gene density, histone marks and other features of open chromatin; see section below on chromatin structure), wherein dynamics of DNA methylation during lineage commitment are enriched in A (active) compartments and genetic disruption of DNA methylation (via DNMT3a and 3b knockout) does not alter compartmentalization (Nothjunge, Nuhrenberg et al. 2017). This observation supports a passive relationship between DNA methylation and chromatin structure, at least in the formation phase.
Another peculiar involvement of DNA methylation in cardiovascular disease was recently unearthed by studies of clonal hematopoiesis, the process whereby hematopoietic cells acquiring advantage-conferring somatic mutations outcompete other, non-mutated cells in an individual’s resident population, such that the mutation-bearing population becomes the dominant if not exclusive source of hematopoietic cells in the individual. Blood cancer was found to be tightly linked with clonal hematopoiesis arising in individuals with mutations in genes encoding DNA methylation machinery (DNMT3a [a de novo methyltransferase] and TET2 [a methylcytosine dioxygenase that initiates demethylation] specifically; incidentally, the third gene found highly associated with this phenotype was ASXL1, a PRC2-associated protein) (Genovese, Kahler et al. 2014). Because of the increased propensity for cardiovascular disease in these patients, another group subsequently examined whether clonal hematopoiesis, driven by TET2, may be a heretofore unrecognized risk factor for atherosclerosis (Fuster, MacLauchlan et al. 2017). Remarkably, loss of TET2 in bone marrow (and bone marrow-derived cells in adult mice) was sufficient to increase atherosclerotic plaque size in plaque-prone Ldlr knockout mice.
Global effects on DNA methylation in blood or other tissues were not examined, and whether mutation of the methyltransferase DNMT3a (or ASXL1, for that matter) gives similar results in terms of clonal expansion and atherosclerosis remains to be tested. Nevertheless, this set of observations revealed an anomalous source of cardiovascular risk and an unexpected set of molecular circumstances for how DNA methylation in blood may correspond to end organ (e.g. heart or vessel) risk. DNA methylation in blood is the result of the balance of adding and removing enzymes in that tissue, which in turn can be affected by mutation; in the scenario in which mutations in DNA methylation enzymes perturb or eliminate function, large scale DNA methylation changes result in a pro-atherogenic phenotype in blood by targeted alterations in select genes and/or by wholesale changes in blood cell methylation.
What is the import of the observation that methylation patterns are associated with complex human phenotypes? One method through which DNA methylation has its molecular effects is to reinforce prevailing chromatin landscapes by preventing accessibility and facilitating compaction (in promoters, as mentioned above, and in X chromosome inactivation, where it conspires with histone H3K27me3 to silence expression of one of the two X chromosomes in females (Shen, Matsuno et al. 2008, Lee 2012)); another is to favor relaxing of chromatin and transcription (as in gene bodies). These opposing effects must require the intervention of discriminating factor(s), perhaps including methyl-CpG binding proteins, but this field currently wants for established rules and actors. Another mode of action is through trans effects, whereby a methylation event can regulate the expression of a gene in a distal region of the genome (i.e. far away from the actual CpG in question). By “regulate,” it is meant here that the methylation event is shown to correlate, with genome-wide statistical significance, with the expression of a gene, making it a methylation quantitative trait locus. This regulation may take the form of enhancer element formation/modification (discussed below) or other as-yet uncharacterized chromatin structure effect. Studies from human cardiac development reveal an enrichment of regulatory elements, including DNA methylation sites, in regions of genetic variation associated with heart disease (Gilsbach, Schwaderer et al. 2018), supporting a molecular link between chromatin regulation and genetic variation in the context of pathological phenotypes.
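The operational sense of "regulate" used above, a correlation between methylation at a CpG and expression of a gene that survives genome-wide multiple-testing correction, can be sketched as follows. Everything here is an assumption for illustration: the data are invented, the names are hypothetical, Pearson correlation and a Bonferroni correction are just one simple choice, and real association studies typically use regression models with covariates.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient; assumes neither vector is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def scan_pairs(methylation, expression):
    """Correlate every CpG with every gene; strongest |r| first."""
    results = [
        (cpg, gene, pearson_r(m, e))
        for cpg, m in methylation.items()
        for gene, e in expression.items()
    ]
    return sorted(results, key=lambda hit: abs(hit[2]), reverse=True)


def bonferroni_threshold(alpha, n_tests):
    """Per-test p-value cutoff that keeps the family-wise error rate at alpha."""
    return alpha / n_tests


# Invented data: methylation fraction and expression across five samples.
methylation = {
    "cpg_A": [0.1, 0.3, 0.5, 0.7, 0.9],
    "cpg_B": [0.5, 0.4, 0.6, 0.5, 0.5],
}
expression = {
    "gene_X": [9.0, 7.0, 5.0, 3.0, 1.0],
    "gene_Y": [2.0, 2.5, 1.5, 2.2, 2.1],
}

hits = scan_pairs(methylation, expression)
for cpg, gene, r in hits:
    print(f"{cpg} vs {gene}: r = {r:+.3f}")
print("per-test cutoff:", bonferroni_threshold(0.05, len(hits)))
```

The cutoff shows why the scale matters: testing 2 million CpGs against even a few thousand transcripts gives billions of pairs, so only very strong associations clear a genome-wide threshold.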
Another way to ask this question is: what is the right scale to analyze the effects of DNA methylation (or any chromatin feature, for that matter) on phenotype? One way, call it the molecular pathway model, is to envision every one of the 2 million CpGs measured in an RRBS experiment or the 21 million CpGs in the mouse genome as independent actors, each pursuing a discrete molecular mechanism (each one like a light switch) wherein methylation at the residue causes one chain of events, and lack of methylation another, and together, these 2 million CpGs (or 21 million, depending on how we do the experiment) all send independent and unique and important signals that, somehow, are all integrated into a coherent cellular response at any given time. Another, non-competing way to think about these observations, call it the no molecular pathways model, is to envision the 2 or 21 million CpGs as readouts of a global change in the nature of the chromatin environment, shifting towards a more or less transcriptionally pliant state and serving (the methylation and other chromatin marks are, in this model) like tethers to bring together certain parts of the genome in 3D, keeping others away from each other (note: differentially methylated regions, rather than individual sites, have been posited to be functional units through which this modification operates). The prevalence of cancer (Jones, Issa et al. 2016) and congenital heart disease (Zaidi, Choi et al. 2013) in humans is associated with mutations in genes encoding proteins that modify chromatin, such as histone modifying enzymes (writers, erasers and readers). Furthermore, these complex diseases are often associated with global changes in DNA methylation without apparent plans for targeting genes known to induce cancer or cardiovascular disease. In some cases, the genetic or epigenetic lesion occurs in a gene whose aberrant function can exert a dominant role in disease pathogenesis.
In other cases, these epigenomic changes may instead be general hallmarks of perturbed cellular function, whereby the normal parameter space for gene expression is expanded, facilitating dysfunction of multiple cellular processes. For virtually all observations on histone PTMs and DNA methylation in cardiovascular disease, the train or snow leopard question remains unanswered.
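The idea noted above, that differentially methylated regions rather than individual sites may be the functional unit, implies a simple grouping step: neighboring differentially methylated CpGs are merged into regions. A minimal sketch follows, with the caveat that the function name, the 500 bp gap and the three-site minimum are arbitrary assumptions chosen for illustration, not established thresholds, and that published region-calling methods use statistical smoothing rather than a fixed gap.

```python
def merge_into_regions(sites, max_gap=500, min_sites=3):
    """Group CpG positions (one chromosome) into candidate regions.

    Consecutive sites no more than `max_gap` bases apart join the same
    cluster; clusters with fewer than `min_sites` sites are discarded.
    Returns a list of (start, end, number_of_sites) tuples.
    """
    regions = []
    current = []
    for pos in sorted(sites):
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_sites:
                regions.append((current[0], current[-1], len(current)))
            current = []
        current.append(pos)
    if len(current) >= min_sites:
        regions.append((current[0], current[-1], len(current)))
    return regions


# Hypothetical coordinates of CpGs that differ between two conditions.
differential_sites = [100, 250, 400, 5000, 9000, 9100, 9300, 9350]
print(merge_into_regions(differential_sites))
```

Under the light-switch view each of the eight sites would be interpreted on its own; under the regional view the isolated site at 5000 is set aside and the analysis proceeds on two clusters.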
Transcribed genomic regions that do not encode proteins modulate the epigenome
The now quotidian usage of RNA-seq to explore transcriptomes has led to a fascinating discovery: most of the genome is transcribed, if only a small portion of that transcriptome encodes mRNA destined for translation, with intriguing differences in this non-coding transcriptome across cell types and following pathological insult. Noncoding RNA biology is a specialized discipline unto itself, with new species of RNA—ascribed really cool and sometimes bizarre functions—identified seemingly endlessly, and will not be extensively reviewed herein (excellent reviews on the roles of various noncoding RNAs in cardiovascular biology have emerged (Mathiyalagan, Keating et al. 2014, Uchida and Dimmeler 2015, Bar, Chatterjee et al. 2016, Poller, Dimmeler et al. 2017)). Of particular interest to chromatin biology, however, is the concept that long noncoding RNAs (lncRNAs) may participate in gene regulation by modulating chromatin structure.
lncRNAs have been proposed as a potential solution to the Big Hole in the Theory of Chromatin Function, which is that it remains unknown how different chromatin marks are deposited, specifically and reproducibly, across the genome. A general definition of a lncRNA is an RNA greater than 200 nucleotides with no discernible open reading frames (an exception being the presence in some lncRNAs of ORFs which have been shown to produce micropeptides that go on to regulate key intracellular processes in cardiovascular cells (Magny, Pueyo et al. 2013, Anderson, Anderson et al. 2015)). One of the best-studied lncRNAs is Xist, which is centrally involved in X chromosome inactivation. Xist is transcribed from and acts in cis to silence the X chromosome through a process that recruits, via direct binding of Xist to the proteins, the PRC2 complex and YY1. A depositor of histone H3K27me3 silencing marks and a transcriptional repressor, respectively, these proteins in turn compact the X chromosome and prevent further transcription (Lee 2012, Engreitz, Pandya-Jones et al. 2013). This model, in which a lncRNA binds to a specific region of chromatin and recruits histone modifying enzymes, is appealing because it solves the problem of DNA sequence recognition, of which many histone modifiers are incapable. Another well-characterized lncRNA that binds PRC2 subunits is Hotair, involved in gene silencing in mammals and shown to regulate chromatin in trans outside of the context of X inactivation (Rinn, Kertesz et al. 2007). These studies led to a gold rush on PRC2-interacting lncRNAs, which have been estimated to range in number from hundreds to thousands in mice (Zhao, Ohsumi et al. 2010) and humans (Khalil, Guttman et al. 2009). The general properties, if they exist, through which these lncRNAs couple PRC2 to chromatin are the focus of continued investigation (Beltran, Yates et al. 2016, Tu, Yuan et al. 2017).
It may be that the genes for these molecules are distributed across the genome and the lncRNAs in turn all act in a local manner to recruit and modulate chromatin machinery to a given gene expression environment (Joung, Engreitz et al. 2017). Yet there are clear limitations were the cell to attempt to repeat the Xist process with other lncRNAs: physiological transcriptional profiles in adult cells do not involve turning on or off entire chromosomes, with genes temporally co-regulated often residing on different chromosomes (each of which, in this model, would require its own lncRNA, although 3D chromatin environments may allow transcription factories to form, bringing multiple mRNA coding genes into a neighborhood governed by a single lncRNA) and beset by numerous histone modifications (meaning multiple protein factors are in play). Reflecting this fact, the spectrum of lncRNA functions has expanded (Lee 2012, Bohmdorfer and Wierzbicki 2015) to include actions in trans (i.e. targeting other chromosomes) as well as cis to enhance transcription, to block it, to scaffold chromatin interactions, and to aggregate microRNAs (thereby making them unavailable to regulate mRNAs).
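The working definition of a lncRNA given above (longer than 200 nucleotides, with no discernible open reading frame) can be expressed as a simple filter. The sketch below is hedged accordingly: the 200-nucleotide length comes from the text, but the 300-nucleotide ORF cutoff, the function names and the toy sequences are assumptions made here for illustration, only the forward strand is scanned, and, as the micropeptide exception shows, a short ORF does not prove a transcript is non-coding.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def longest_orf_nt(seq: str) -> int:
    """Length in nucleotides (ATG through stop, inclusive) of the longest
    forward-strand open reading frame; 0 if none is found."""
    seq = seq.upper().replace("U", "T")
    best = 0
    for start in range(len(seq) - 2):
        if seq[start:start + 3] != "ATG":
            continue
        # Walk codon by codon from this ATG until the first in-frame stop.
        for pos in range(start, len(seq) - 2, 3):
            if seq[pos:pos + 3] in STOP_CODONS:
                best = max(best, pos + 3 - start)
                break
    return best


def looks_like_lncrna(seq: str, min_length: int = 200, max_orf_nt: int = 300) -> bool:
    """True if the transcript exceeds `min_length` nucleotides and its longest
    ORF is shorter than `max_orf_nt` (an assumed, adjustable cutoff)."""
    return len(seq) > min_length and longest_orf_nt(seq) < max_orf_nt


# Toy transcripts, not real genes.
no_orf = "A" * 250
coding_like = "ATG" + "GCT" * 120 + "TAA" + "A" * 50
print(looks_like_lncrna(no_orf), looks_like_lncrna(coding_like))
```

A filter of this kind only nominates candidates; whether a transcript acts through chromatin, as discussed for Xist and Hotair, is a separate experimental question.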
Initial investigations of lncRNAs in the heart revealed involvement in developmental growth and maturation. Fendrr binds both PRC2 and the activating complex Trithorax group/MLL in mesoderm, its depletion leading to impaired cardiac and chest wall development (Grote, Wittler et al. 2013). Braveheart, another mesoderm associated lncRNA, binds the Suz12 subunit of PRC2 and is required for proper differentiation of embryonic stem cells into cardiac precursors (Klattenhoff, Scheuermann et al. 2013). Also involved in cardiac development is the lncRNA Upperhand, which regulates the Hand2 locus in cis by facilitating chromatin modifications (i.e. super enhancer maintenance) and RNA pol II elongation (Anderson, Anderson et al. 2016). Other lncRNAs have been discovered to play a role in disease-associated gene regulation. Chaer binds the Ezh2 subunit of PRC2 and its genetic manipulation leads to alteration in H3K27me3 levels around pathologic genes and cardiac hypertrophy in the mouse (Wang, Zhang et al. 2016). Some lncRNAs appear to exert their effects on gene regulation through interaction with chromatin remodeling complexes, as is the case for Mantis, a lncRNA discovered in macaque and shown to regulate endothelial angiogenesis in a manner involving interaction with BRG1 (Leisegang, Fork et al. 2017). Similarly, the cardiac-specific lncRNA Myheart binds and inhibits the actions of BRG1, thereby regulating expression of myosin heavy chain, along with other genes. Myheart is downregulated by pressure overload stress and its transgenic restoration protects against overload induced hypertrophy (Han, Li et al. 2014). Interestingly, lncRNAs like Malat1 in vascular tissues (Michalik, You et al. 2014) and Chast in cardiac tissues (Viereck, Kumarswamy et al. 2016) are nuclear localized and regulate expression of nearby genes, although it remains to be tested whether they accomplish these actions through recruitment of chromatin complexes.
In the field of cholesterol metabolism, two lncRNAs have recently been discovered that exert powerful effects on lipid levels and atherosclerosis in vivo: LeXis (Sallam, Jones et al. 2016), expressed in the liver, directly controls genes involved in cholesterol biosynthesis, consequently modulating plasma cholesterol levels, and MeXis (Sallam, Jones et al. 2018), expressed in macrophages, regulates genes involved in cholesterol efflux (both lncRNAs were found to operate through chromatin based on subcellular localization, accessibility assays and transcriptional regulation).
Discovery analyses in a mouse model of pressure overload revealed the totality of cardiac lncRNAs (Matkovich, Edwards et al. 2014), determining their extent of enrichment in this tissue when compared to tissue of distinct developmental origin (liver and skin) and determining changes between embryonic, adult and diseased non-coding transcriptomes (nota bene: only a few of the developmentally silenced lncRNAs were re-expressed with disease, in contrast to the “fetal gene program” (Rajabi, Kassiotis et al. 2007) documented for mRNAs). Likewise in myocardial infarction, the recovery/injury/remodeling period was found, in mice, to be associated with changes in lncRNA expression (incidentally, the lncRNAs were also found to reside near chromatin marks associated with transcriptional enhancement), some of which (the lncRNAs) were subsequently shown to modulate expression of mRNA-encoding genes known to participate in basic cardiac function (Ounzain, Micheletti et al. 2015). Studies from humans have charted differences in lncRNA expression between fetal and adult cardiomyocytes, linking their expression with known enhancer marks associated with protein-coding RNA transcription (e.g. H3K4 methylation) (He, Hu et al. 2016).
What is known about lncRNAs in the cardiovascular system is that they can be cell type specific, often lack extensive sequence conservation across species (although they may be conserved at the level of secondary structure), can regulate transcription (probably mostly in cis) and can correlate with histone PTMs, binding some of the histone modifying complexes, PRC2 in particular. What is unknown is: to what extent they can act at a distance (beyond, say, a few kilobases from their own site of transcription); what role chromatin structure plays in coordinating such actions (although there is intriguing evidence from X inactivation, where lncRNA function has been perhaps most exhaustively investigated, that local chromatin environment, rather than, say, DNA consensus motifs distributed across the chromosome, facilitates Xist binding, PRC recruitment and inactivating activity (Engreitz, Pandya-Jones et al. 2013)); whether they bind directly to chromatin and/or DNA; and whether they are sufficient to coordinate the locus specific activities of chromatin modifiers through a model in which multiple lncRNA genes, by virtue of their evolutionary population at key sites across the genome, establish local neighborhoods of regulation at which they recruit (or repel, according to wont) histone modifiers and polymerase machinery.
Perhaps one of these paradigms explains how specificity is endowed to chromatin structure: (1) cell type specific transcription factors control subordinate RNAs and proteins that, when synthesized, bind to specific loci, which in turn reprogram chromatin remodeling machinery and nucleosomes; (2) cell type specific chromatin remodeling proteins bind specific DNA sequences and remodel chromatin accordingly; (3) DNA methylation machinery has as-yet unidentified cell type specific regulatory cofactors that induce the right gene expression and proscribe the wrong; (4) lncRNAs act either via an ability to bind DNA (perhaps via triple helix formation (Felsenfeld, Davies et al. 1957, Buske, Mattick et al. 2011)), thereby demarcating some loci for further action, or merely via their own transcription, which creates regions of RNA pol II activity in which neighboring mRNA-encoding genes may be transcribed.
Specialized regions of DNA and associated proteins function as transcriptional enhancers
Enhancers are regions of DNA that promote the transcription of other regions of DNA (Calo and Wysocka 2013, Shlyueva, Stampfel et al. 2014, Heinz, Romanoski et al. 2015). A contemporary synthesis on how this works: specific histone PTMs (e.g. histone H3K4me1 and H3K27ac; some histone isoforms, such as H3.3 and H2A.Z, contribute to enhancer activity; enhancers are now thus commonly identified by genome-wide ChIP-seq experiments) decorate regions of DNA that need not be (although may be; see below) themselves transcribed, which in turn recruit enhancer associated proteins (e.g. lineage relevant transcription factors, RNA pol II and co-activator proteins, such as p300 and Mediator) and interact in three dimensions with the genes whose expression they…enhance. This region of DNA, the appropriately demarcated histones and any associated proteins together constitute the enhancer, which is often “validated” as such by showing that (a) its genetic disruption interferes with expression of its target gene and/or (b) the enhancer DNA sequence can drive developmental and lineage appropriate transcription in a cell- or organism-based reporter assay. In the absence of chromatin conformation data, enhancers are usually assumed (and tested) to regulate the nearest downstream gene. Somewhat counterintuitively, then, enhancers tend to reside in areas of relative nucleosome depletion (not, strictly speaking, in areas devoid of nucleosomes), such that enhancers can be identified by open chromatin assays (e.g. DNase I hypersensitivity or ATAC) followed by DNA sequencing. A subgroup of enhancers, called super enhancers (Hnisz, Abraham et al. 2013), has been classified based on the observation that the aforementioned enhancer features at times occur multiple times in close proximity to each other.
Super enhancers can exhibit augmented transcriptional activation potential and thus may represent a distinct structural property of cell type-specific chromatin (Hnisz, Shrinivas et al. 2017). Further specification of enhancer behavior includes delineation of poised (those ready for promoting transcription of their targets) versus active (those actually so promoting) enhancers, which can be distinguished by the presence of silencing histone marks (and the enzymes that deposit them) at poised enhancers and their absence (concomitant with the presence of greater levels of RNA pol II) at active enhancers (Calo and Wysocka 2013, Shlyueva, Stampfel et al. 2014, Heinz, Romanoski et al. 2015).
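The poised versus active distinction can be read as a simple rule over overlapping ChIP-seq peaks. The Python sketch below is illustrative only. The interval representation, the function names, the use of H3K4me1 peaks as candidate regions and the idea of passing in a single set of silencing-mark peaks (H3K27me3 would be one plausible choice) are all assumptions, not the procedure of any study cited here. Real analyses rely on dedicated peak callers and statistical thresholds.

```python
from typing import Dict, List, Tuple

# (chromosome, start, end) with half-open coordinates; a hypothetical format.
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True when two intervals share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def has_mark(region: Interval, peaks: List[Interval]) -> bool:
    """True when any peak of a given mark overlaps the region."""
    return any(overlaps(region, peak) for peak in peaks)


def classify_enhancers(
    h3k4me1_peaks: List[Interval],
    h3k27ac_peaks: List[Interval],
    silencing_peaks: List[Interval],
) -> Dict[Interval, str]:
    """Label each candidate region (an H3K4me1 peak) by the marks it carries.

    Assumed rule: a silencing mark means "poised"; otherwise H3K27ac means
    "active"; anything else is left "unclassified".
    """
    labels: Dict[Interval, str] = {}
    for region in h3k4me1_peaks:
        if has_mark(region, silencing_peaks):
            labels[region] = "poised"
        elif has_mark(region, h3k27ac_peaks):
            labels[region] = "active"
        else:
            labels[region] = "unclassified"
    return labels
```

Checking the silencing mark first reflects the description above, in which poised enhancers are recognized by the presence of silencing marks and active enhancers by their absence. RNA pol II occupancy could be added as a further input in the same way.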
It has more recently become apparent that some enhancers (alternatively known as transcriptional regulatory elements), as well as other genomic features originally thought to regulate only by binding proteins and thereby forming or contributing to the formation of permissive or restrictive nuclear environs, may themselves be transcribed (Kim, Hemberg et al. 2010) and may thus operate in the RNA form. It could be that this transcription is a goal-directed process in the normal way we think about RNA doing things in the cell: the enhancer RNAs (eRNAs) may have gene regulatory or other functions. It may also be that eRNA synthesis is a by-product of enhancer DNA in close apposition to churning transcription factories and serves no subsequent end, and/or that its very transcription serves the end of keeping a transcription factory churning and poised for ready enlistment in production of other RNAs that do serve subsequent ends (as RNAs). This fascinating concept of chromatin biology is an active area of investigation (Danko, Hyland et al. 2015). Studies have begun to emerge examining the role of enhancer transcription in select cardiovascular processes such as cardiac conduction (Yang, Nadadur et al. 2017) and endothelial cell stress response (Hogan, Whalen et al. 2017). Active endothelial cell enhancers, defined by the presence of H3K4me2 and H3K27ac (plus some eRNA transcription), exhibited altered transcription factor binding in human aortic endothelial cells following exposure to oxidized phospholipids (Hogan, Whalen et al. 2017). Intriguingly, SNPs associated with cardiovascular disease were overrepresented in these enhancers, suggesting a molecular scale explanation for how the former influence transcription and phenotype, a property that may be a common feature of enhancers across cell types and species (Cheng, Ma et al. 2014).
p300 occupancy has been used to identify enhancers in the developing mouse heart at embryonic day 11.5, many of which were found to exhibit tissue specific activity (Blow, McCulley et al. 2010). A similar approach was used to characterize enhancers in fetal and adult human heart tissue (of note, 48% of the enhancers were the same in fetal and adult human hearts; when comparing fetal mouse to fetal human, the overlap was 21%) (May, Blow et al. 2012), revealing functional elements that may participate in human cardiac gene regulation. Angiotensin II-induced vascular growth, a key component of atherosclerosis, was found to proceed via dynamic utilization of super enhancers in human cells through a process that involves complex interplay amongst noncoding RNAs, chromatin readers and transcriptional machinery (Das, Senapati et al. 2017). A theme of developmental processes being redeployed (not wholesale, but in a selective manner) in the disease setting may also play out for enhancers: regulatory elements marked by H3K27ac are specified in part by the master cardiac transcription factor GATA4 during development and some of these regions, devoid of GATA4 in healthy adult heart, are revisited by the protein upon pathologic stimulation, contributing to disease-associated gene expression (He, Gu et al. 2014). Pursuing this concept more directly in a complementary model (which, incidentally, also provides a resource of ChIP-seq data for other non-enhancer-associated histone marks in the normal adult and pressure overloaded adult heart), it has also been shown that cardiac enhancers undergo altered regulation by disease-associated transcription factors following pathologic stress (Papait, Cattaneo et al. 2013). The histone modification reader BRD4 binds super enhancers that are associated with cardiac disease genes.
Interestingly, this process is finely tuned to differentially modulate association of BRD4 with these disease genes while leaving housekeeping genes unaffected, a process controlled in part by miRNA-dependent titration of BRD4 levels (Stratton, Lin et al. 2016).
Aside: what is meant by epigenomics of disease
Since the advent of the central dogma of molecular biology, the following cognitive bias has been pervasive: genetic lesions disable proteins, which in turn muck up cellular and then organ level processes. What if disease, and even normal function, were examined according to groups of molecules at the center, moving out from there to understand function? Here’s why one might entertain this thought experiment: among the numerous genes/proteins/RNAs/other molecules implicated by genetic loss/gain of function in animal models of cardiovascular disease, a minority have been shown to be causative, using a human genetics definition of the term, in human disease. This does not mean that the animal data is wrong. It does suggest, however, that implication of a pathway in animal work is not evidence of potential clinical relevance in humans. The rarity with which mouse studies are translated to humans supports this interpretation. Like most common diseases, cardiovascular disease is a lowest energy state assumed by a perturbed cell/organ (“lowest energy” here used as a borrowed phrase from thermodynamics or protein chemistry, not to mean something like “how much ATP is being consumed”), resulting from changes in metabolism, structure, cell fate or, related to all these, gene expression, which is underpinned by chromatin. Multigenic diseases, including most cardiovascular diseases, may not be “diseases” at all: these “syndromes,” rather, are the manifestation of a deleterious low energy state for a cell and organ, based on the convergence of genetic programming and environmental perturbation. A low energy state not conducive to (healthy) life. The lesson from studies of genetic variability in the onset of cardiovascular disease in animal models (Bennett, Farber et al. 2010, Rau, Wang et al. 2015, Wang, Rau et al. 2016) is not that driver or master regulator or causative genes are identified from network analysis.
The lesson is that genetic variability creates many paths to the same lowest energy state, and not because the pathways converge on a single molecular integrator (indeed, few of the proteins shown by gain/loss of function studies to be involved in heart failure, for example, are found to be altered in the same manner in a genetically heterogeneous population subject to an environmental stimulus (Karbassi, Monte et al. 2016)). The pathways, to the extent they exist, do not converge: they just all lead to the same place. Such a consideration is of particular relevance for cardiovascular chromatin, which must integrate genetic and environmental factors. More evidence that cardiovascular disease as a label obfuscates the clinical presentation and treatment: particularly in the elderly (>75), 41.7% of women and 40.7% of men with coronary artery disease lack traditional risk factors (even amongst the “young” [≤45], ~10% have no risk factors and 41.9% of women and 48.0% of men have just one risk factor) (Khot, Khot et al. 2003). Do these syndromes, these deleterious low energy states, exist at the subcellular level? Put another way, what would diseased chromatin look like?
Some insight might come from the observation that all cells of the same type (that is, those with the same textbook names) are not identical. The rapidly dividing phenotype of cancer has allowed researchers in this field to identify epigenetic clones (Flavahan, Gaskell et al. 2017): lineages of cells outwardly genetically identical that differ based on semi-stable, transmissible (through mitosis) chromatin features. Not all cardiac myocytes and vascular smooth muscle cells are the same, which means that although these cells do not proliferate and differentiate like cancerous cells do, it is reasonable to hypothesize that developmentally endowed epigenetic clones exist and contribute to organ level phenotypes in the adult. Indeed, distinct clonal populations (arising from a common progenitor in development, rather than from a resident adult stem cell) of cardiac cells contribute to different anatomical and functional features of the adult organ (Meilhac, Esner et al. 2004, Moretti, Caron et al. 2006, Domian, Chiravuri et al. 2009, Kuhn and Wu 2010, Spater, Hansson et al. 2014, Li, Miao et al. 2016), and recent single cell studies have revealed these distinct myocyte populations to indeed exhibit distinct transcriptomes (DeLaughter, Bick et al. 2016, See, Tan et al. 2017). Epigenetic dissection of these populations may well reveal epigenetic clonality to be an underlying process contributing to this observation. The chromatin accessibility assay ATAC-seq, which reveals areas of open chromatin, has been applied in a single cell format to a lymphoblastoid cell line, identifying subpopulations of cells based on chromatin accessibility (Buenrostro, Wu et al. 2015).
That such variability exists in post-mitotic, healthy adult cells in the cardiovascular system remains to be demonstrated, but the observation of transcriptome variability in these cells, and the presence of chromatin accessibility variability in cells otherwise phenotypically similar, makes such a conjecture not unwarranted. Chromatin maintains the lowest energy states of the genetic material. But then, development, adulthood and disease are all states of equally low energy, otherwise these states could not be reproducibly attained. The higher, unstable energy states are only transitory. If epigenetic functional stability is directly related to disease, then more stable chromatin should be less susceptible to disease. End of aside.
Some considerations about chromatin based on things—some strange—it can do
It would be imprudent to examine chromatin in cellular function and disease without addressing the concept of causality in biology. Do epigenetic changes cause disease? By this we usually mean: if we induce the change, do we get disease and if we block the change, do we block the disease? This rubric is a vestige of pharmacological approaches to examining cellular processes whereby if a drug inhibiting a factor X was used to block a cellular process then factor X was reasoned to cause said process (i.e. to be necessary for it). If you use an activator of X and you get the process back, then X is also sufficient. What is the purpose of trying to define the function of every molecule in the cell? To assign each a name, a personality, a job description? What if the aim of ‘omics is to forget that genes and proteins have been given names and personalities by humans and instead just attempt to divine the principles that govern their actions? Epigenomics and proteomics are powerful tools to analyze chromatin. They create and amplify, however, a dichotomous relationship in our understanding of how chromatin works in vivo, that is, of how the totality of proteins that may enter the nucleus may regulate the genome. We know where many proteins bind the genome based on ChIP-seq experiments. But proteomics tells us that there are many more proteins in the nucleus (perhaps ~1000), which themselves are then subjected to PTM, about whose genome binding we know jack squat. We need a conceptual shift in the way we are thinking about chromatin: ChIP’ing a thousand proteins and maybe tens of thousands of modified proteins all in the same cells or tissues before and after some intervention is a nonsensical exercise, no matter how cheap sequencing gets. It is worth mentioning that there are two fundamentally distinct ways ‘epigenetic’ studies proceed.
In one approach, epigenetic factors are cellular components that, like any other garden variety pathway involved in signaling, metabolism, protein turnover, stress response, and cell survival/death, participate as components of the cell’s molecular toolkit. That is, they (i.e. studies treating chromatin without any special considerations) lack any extra, special distinguishing behaviors versus these other proteins whose actions do not, by definition, imply persistence in time. The other way in which epigenetic processes are studied is in the context of inheritance, either through cell division, cell differentiation and/or organismal reproduction. The vast majority of published studies, though they may espouse the latter, definitively test only the former. To put a finer point on it: there is a dearth of information on how ostensibly universal principles of chromatin regulation (such as opening up chromatin by histone acetylation or modulating gene expression by DNA methylation) operate at multiple individual loci, and an absence of evidence as to whether such modifications persist through time. The prevailing belief system has arisen from studies that either (i) focus on a single locus or (ii) study the entire genome, which respectively lack universal proof or finite, comprehensive molecular details (contrasted with, say, the central dogma, which is invariant across cell type). Approaches examining behaviors of chromatin that expand the concept of rote transcriptional control offer glimpses into a more interconnected role for chromatin in cellular processes.
Another (somewhat briefer) aside: what happens when a myocyte has multiple nuclei
It also appears to matter how many nuclei the cardiomyocyte has in terms of its ability to proliferate. The percentage of mononuclear diploid cardiomyocytes was found to correlate directly with the degree of myocyte proliferation and heart function following myocardial infarction in a panel of inbred mouse strains (Patterson, Barske et al. 2017). Who is in charge when more than one brain is present in a single cell? Anecdotally, different nuclei in the same cardiomyocyte behave differently according to various immunostaining-based measurements of gene and protein localization. If binuclear cardiomyocytes never divide and mononuclear cardiomyocytes divide rarely, would it not be extraordinarily informative to know the molecular differences between a diploid nucleus in a mononuclear cell and the two diploid nuclei in a binuclear cardiomyocyte? And what about between the two diploid nuclei in the binuclear cardiomyocytes themselves, if they could be laser capture micro-dissected, or by other means extracted, from the cell and kept distinct from each other for analysis. How this might work with single cell ‘omics: identifying and exploiting a marker that distinguished two nuclei from each other in cardiomyocytes—itself having been identified by a single cell experiment and then validated using immunofluorescence as exhibiting always decoration of just one nucleus per binucleated cell and never the nuclei of a mononuclear cell—in a nuclear sorting exercise prior to single cell epigenomics and transcriptomics to ask i) whether the nuclei were distinguished from each other in terms of their RNA products; and ii) whether differences in chromatin features between the two diploid nuclei in the same binuclear cardiomyocyte were part of the explanation for transcriptome differences, if observed, and/or for differences in the potentiality of the host cells. End of brief aside.
Chromatin can act like a stress sensor complex, wherein there is no single factor controlling changes in disease associated gene expression. Some investigators have described excitation-transcription coupling, with the term specifically applied to local calcium signals around the nucleus (as distinguished from global calcium transients involved in myofilament contraction) inducing local CaMKII activation and HDAC mobilization (Wu, Zhang et al. 2006). What if this observation is evidence of a more generalized, myocyte specific sensory apparatus on chromatin that distinguishes local calcium signaling, such as that involved in pathologic gene activation, from calcium involved in contraction, which is nonetheless critical in influencing gene expression (e.g. sustained faster heart rates require greater turnover of proteins and thus transcripts)? Various pathological cell states, including cardiovascular disease and cancer, have been characterized by global changes (e.g. those revealed by a total cell lysate western blot or genome-wide ChIP-seq signal) in histone modification. One possible reason for this unexpected observation was found to include regulation of cellular acidity (McBrian, Behbahan et al. 2013): global histone acetylation responds to perturbation of cellular pH (lower pH leads to less histone acetylation) and cells respond to modulation of histone acetylation by modulating pH, a sort of acetate capacitance system on chromatin to attenuate large swings in cellular acidity (Kurdistani 2014). Histone PTMs can thus function as acute sensors of cellular metabolite levels. Combined metabolomic and proteomic studies reveal that abundance of short chain acyl-CoA donors directly, although not indiscriminately, influences the modification of histone tails in human cells in culture (Simithy, Sidoli et al. 2017).
As the guts of the organelle, it should be unsurprising that chromatin controls nuclear morphology, yet the extent to which epigenomic marks influence nuclear structure is fascinating. Aberrations in nuclear rigidity and structural integrity are associated with diseases like cancer and progeria, some of which are driven by so-called laminopathies, arising from malfunction of nuclear lamina proteins. Cardiomyopathies resulting from mutations in lamin A/C are one of the best studied groups of genetic diseases in clinical cardiology, and have led to clinical trials, although in this context the effects on chromatin structure (or vice versa) are unclear: aberrant nuclear morphology, the blebbing of the nuclear membrane due to impaired laminar network architecture, is a hallmark of these diseases (Worman, Fong et al. 2009). Association of chromatin with the lamina appears to be essential for nuclear structure and disrupting this interaction has detrimental effects on nuclear integrity (Schreiner, Koo et al. 2015), particularly in cells subject to mechanical force. These actions are coupled to the mechanisms known to regulate chromatin’s role in gene expression, as supported by the observation that histone PTM influences nuclear rigidity and membrane integrity (Stephens, Liu et al. 2018).
An unexpected non-nuclear, signaling behavior of not just chromatin modifying proteins but actual intact multimolecular slabs of chromatin has been observed in cancer (Dou, Ghosh et al. 2017): cytoplasmic chromatin fragments (evaginated nuclear membrane containing DNA and nucleosomes decorated with heterochromatin marks) can induce inflammation and cell death through cytoplasmic signaling and circle-back transcriptional regulation. Get a load of this even weirder story: rod and cone cells are terminally differentiated, specialized components of the retina, the light sensitive component of the vertebrate eye. Evolution has hijacked chromatin in rods (but not cones) to serve the transcriptionally unrelated function of focusing light in the retinas of nocturnal but not diurnal mammals (that is, the organization of DNA in the nucleus forms a physical lens) (Solovei, Kreysing et al. 2009), thereby providing a meta-function in service of that specific cell’s raison d’être.
Lastly, a process only slightly nuanced from the model of histone PTMs specifying transcriptomes, and a bridge to the next section about models of chromatin function: different structural units of chromatin establish distinct transcriptional environments. It can be helpful to think of chromatin itself as a transcription factor, or transcriptional processor. Transcription does not happen willy-nilly throughout the nucleus, but rather is localized to transcription factories (Edelman and Fraser 2012), or areas designated to different forms of transcription, such as rRNAs and house-keeping sorts of protein-coding mRNAs, separated from stimulus responsive genes and furthermore from transcriptionally silent regions. This happens on a nuclear scale, as reflected by the observation that transcription tends to happen toward the center of the nucleus whereas the periphery is an area of gene silencing. Another scale of transcription factor sort of activity is sub-chromosomal, in the form of chromatin looping, the formation of short- and long-range interactions to facilitate gene activation (i.e. enhancer elements) or repression (i.e. insulators or boundary elements). This is also an appropriate place to note the connection between chromatin looping on the gene scale and transcription, wherein looping would bring together transcription start and end sites in three dimensions to facilitate efficient cycling of machinery like polymerases and transcription factors (Hi-C data supports this concept for a cohort of genes). This principle has been supported with ChIP-seq data from rat hearts (Sayed, He et al. 2013), in which different transcriptional activation profiles (pause-release and de novo recruitment) have been described in the setting of pressure overload hypertrophy, along with accumulation of RNA pol II at transcription end sites, perhaps reflective of gene looping.
Structure-function features of chromatin
In 1950, the year Ezzard Charles defeated Joe Louis to retain the heavyweight belt and Merv Griffin sang I’ve Got a Lovely Bunch of Coconuts, Edgar Stedman and Ellen Stedman communicated (Stedman and Stedman 1950) in a letter to the editor their observation that cells of different type had different histone features (as determined by the arginine content of their nuclei). In discussing this observation, they speculate:
“The demonstration…that some of the basic proteins [i.e. histones] present in cell nuclei are certainly cell-specific leads to the hypothesis that one of their physiological functions is to act as gene suppressors.”
Open any molecular biology textbook today and you will find it: the diagram that starts with a mitotic chromosome at one end and the DNA double helix at the other, in between displaying intermediates including a single nucleosome, the “beads on a string” tract of several nucleosomes, maybe a cluster of nucleosomes representing the 30-nm fiber, and various artful representations whose resemblance vacillates between spaghetti noodles and heads of broccoli. Genomic structure, as we know it. What is the evidence that chromatin is inherently ordered above the level of the nucleosome (where data exists to the atomic—that is several angstrom—level (Luger, Mader et al. 1997)) and below the level of the chromosome (where chromosome painting, closer to the scale of micrometers, demonstrates compartmentalization (Cremer and Cremer 2001))? The goal of chromatin structural studies is to determine: what are the structural features between these scales (angstroms for the nucleosome [8 proteins plus ~150 bp of DNA] and micrometers for the chromosome [thousands of proteins, millions of bases]) and at what scale(s) are structural features functionally important? One of the supposed advantages of disorder, for example in protein structure, is the increased likelihood of random interactions. If you were going to build a dynamic and protean multimolecular complex like chromatin, disorder would be an attractive quality: it should be fluid, but not randomly associated, stable, but not immutable, with sufficient complexity to enable a wide range of behaviors but with a defined set of molecular agents to enable reproducibility. The spaghetti and broccoli metaphors specify purely random interactions driven by physical chemical properties of essentially indistinguishable chromosomes versus ordered, hierarchical interactions arising from preferential proximity or lack thereof in 3D, respectively. 
These models have limitations in accommodating the aforementioned desirable features in chromatin, and indeed recent studies have indicated that chromatin is probably best represented by something incorporating features of both models. Next consider the pattern of chromatin observed in various cells of the cardiovascular lineage with a commonplace method such as DAPI labeling: while the pattern of staining is not random, there is no obvious reproducible pattern within a class of cells (and not shared between two classes) to which a functional consequence can be intuited (for a nifty exception, see (Solovei, Kreysing et al. 2009)), in contrast to chromosome patterns in mitosis/meiosis which definitively exhibit such tell-tale architecture. Fluorescence in situ hybridization (FISH) experiments clearly demonstrated spatial segregation of chromosomes into territories (Cremer and Cremer 2001), whilst also clearly demonstrating the lack of a chain of command amongst the units of DNA. Recent higher resolution electron microscopy-based imaging of chromatin shows that its structure, in both interphase and mitotic cells, rarely achieves a scale greater than 24 nanometers in diameter (for reference, the nucleosome diameter is ~11 nm), with distinctions in arrangement between such cells coming from the density of compaction, which the authors interpret to be evidence of an absence of repeating, stable, hierarchical structure (Ou, Phan et al. 2017). How do these observations hold up in analyses of individual genes and with respect to histone post-translational modifications?
In the cardiovascular arena, combination of Dam-ID and LaminB ChIP-seq (to identify loci associated with the nuclear periphery) and FISH was used to demonstrate that differentiation in the myocyte lineage involves precise reorganization of expressed genes away from the myocyte nuclear membrane, itself found to be decorated with the silencing mark H3K9me2 and to a lesser degree by H3K9me3 (other silencing marks H3K27me2/3 and H4K20me2/3 were not found enriched at the periphery in skeletal myoblasts) (Poleshko, Shah et al. 2017).
A prediction of a non-hierarchical model of chromatin is that the size of structural elements should be normally distributed. For chromatin interactions detected by Hi-C, for example in cardiac myocytes, this prediction has been shown to be true: the number of interactions plotted per locus follows a normal distribution, where most locus bins (bin size=5kb) have the same number of interactions (~2500) and a small number have very few or very many interactions (see Supplemental Figure 1a in (Rosa-Garrido, Chapski et al. 2017), for an example from cardiac myocytes). Chromatin structural topology is not scale free. Instead, the vast majority of loci interact with a median number of other loci, and no privileged structural behavior can be assigned to the regions with large interactions (as would be a prediction of a scale free or hierarchical topology).
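One way to probe this kind of prediction is to count interactions per bin and ask whether the distribution is roughly symmetric about its center or dominated by a heavy tail of hubs. The Python sketch below is a toy illustration under stated assumptions, not the analysis performed in the cited study: the input format (pairs of genomic positions on one chromosome), the function names and the use of skewness as a summary statistic are all hypothetical choices.

```python
from collections import Counter
from statistics import mean, pstdev
from typing import Iterable, List, Tuple


def interactions_per_bin(
    contacts: Iterable[Tuple[int, int]], bin_size: int = 5000
) -> Counter:
    """Count how many contacts touch each fixed-size genomic bin.

    Each contact is a (position_a, position_b) pair; a contact whose two
    ends fall in the same bin is counted once for that bin.
    """
    counts: Counter = Counter()
    for pos_a, pos_b in contacts:
        bin_a, bin_b = pos_a // bin_size, pos_b // bin_size
        counts[bin_a] += 1
        if bin_b != bin_a:
            counts[bin_b] += 1
    return counts


def skewness(values: List[float]) -> float:
    """Population skewness: near 0 for a symmetric distribution,
    strongly positive when a few values are far larger than the rest."""
    center = mean(values)
    spread = pstdev(values)
    if spread == 0:
        return 0.0
    return mean([((v - center) / spread) ** 3 for v in values])
```

Under these assumptions, a skewness near zero together with a mean close to the median would be consistent with the roughly normal picture described above, whereas a scale free topology would be expected to show a pronounced positive skew driven by a few highly connected bins.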
Additional insights from the explosion of chromatin capture techniques to determine endogenous interactions have been informative (Dekker and Mirny 2016, Schmitt, Hu et al. 2016). These studies have characterized features of chromatin organization that are conserved across species and cell type (note: method development for analysis of chromatin capture data is ongoing and the interpretation evolves with it (Davies, Oudelaar et al. 2017)): topologically associated domains (TADs) are regions of chromatin with privileged local interaction (but a TAD is a statistical phenomenon resulting from interactions…it is not a structural feature, like an alpha helix or a square or a hand); TAD boundaries, as nominally implied, demarcate regions of the genome where intrachromosomal interactions switch from interacting in one direction (say 5’ biased) to the opposite; distinct regions are insulated against expression, in part by chromatin structural proteins and histone modifications; short and long range interactions reproducibly form (i.e. the structure is not random). The boundaries represent epigenomic cornerstones, directing interactions of nearby DNA and proteins in alternating directions, probably through the binding of chromatin structural proteins like CTCF. Cohesin and CTCF knockout animals and cells indicate these proteins are involved in TAD maintenance (Nora, Goloborodko et al. 2017, Rosa-Garrido, Chapski et al. 2017, Schwarzer, Abdennur et al. 2017), but these proteins alone are not the whole story: embryonic stem cells (Nora, Goloborodko et al. 2017) with only 4% normal CTCF protein levels still exhibited TADs and cardiomyocytes (Rosa-Garrido, Chapski et al. 2017) with 20% normal CTCF protein levels exhibited sparse, minor changes in TAD boundaries and strength. These studies also show that TAD formation/maintenance and A/B compartmentalization can be decoupled experimentally, as loss of cohesin or CTCF did not affect A/B compartmentalization.
“A” compartments have more genes and are defined by having fewer interactions than would be expected for a given distance (“B” compartments have more), indicating less compact chromatin. Eu- and heterochromatin marks dominate in A and B compartments, respectively.
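The phrase "than would be expected for a given distance" can be made concrete with an observed/expected calculation. What follows is a minimal sketch, not the procedure used in the cited studies: it assumes a small symmetric contact matrix filled with made-up numbers, takes the mean count along each diagonal as the expectation at that bin separation, and divides each observed entry by it. The function names and the toy matrix are illustrative assumptions.

```python
def expected_by_distance(matrix):
    """Mean contact count at each bin separation d (the d-th diagonal)."""
    n = len(matrix)
    expected = []
    for d in range(n):
        diagonal = [matrix[i][i + d] for i in range(n - d)]
        expected.append(sum(diagonal) / len(diagonal))
    return expected


def observed_over_expected(matrix):
    """Divide every entry by the expectation for its separation |i - j|."""
    expected = expected_by_distance(matrix)
    n = len(matrix)
    return [
        [
            matrix[i][j] / expected[abs(i - j)] if expected[abs(i - j)] else 0.0
            for j in range(n)
        ]
        for i in range(n)
    ]


# Hypothetical 4-bin contact matrix (symmetric, made-up counts).
toy = [
    [10, 4, 2, 1],
    [4, 10, 6, 2],
    [2, 6, 10, 4],
    [1, 2, 4, 10],
]

oe = observed_over_expected(toy)
for row in oe:
    print([round(value, 2) for value in row])
```

Entries below 1 mark bin pairs that contact each other less often than is typical at that separation, and entries above 1 mark pairs that contact each other more often. Compartment calling in practice builds further steps on top of a matrix like this one.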
Chromatin structure is dynamic during the cell cycle. The predominance of longer range interactions present in post-mitotic cells is rapidly lost upon entrance into G1, followed by further depletion throughout S and G2, in favor of shorter range, local interactions. This process abruptly reverses itself, with a return of TADs and long-range interactions following nuclear division (Nagano, Lubling et al. 2017). Mitotic chromosomes lack TADs and chromatin neighborhoods, instead exhibiting a uniform, homogeneous pattern of hierarchical interactions (Naumova, Imakaev et al. 2013). A similar observation was made for oocytes in metaphase II: an absence of TADs and chromatin neighborhoods in these cells persisted in the zygote, with long range chromatin interactions manifesting at the 8-cell and inner cell mass stages (Du, Zheng et al. 2017). Interestingly, physical segregation of alleles was seen to persist until the 8-cell stage as well (Du, Zheng et al. 2017), even after the formation of long range chromatin contacts, suggesting that chromatin structure is an emergent property of an allele, can vary between alleles and thus may participate in allelic inheritance.
Chromatin capture techniques tell you who interacts with whom, but that’s not a structure. Traditional structural analyses of molecules seek a form. Modern attempts seek multiple forms representing different states which are taken to demonstrate how the molecule functions. This will not work for chromatin: the DNA component is too large, and the protein and RNA components too multifarious, to reasonably conceive of a single, or even a couple, fixed states of conformation at 3D atomic resolution. Probabilistic modeling, however, has been used to reveal 3D organizational principles from Hi-C datasets, the goal here being not a structure per se, but a population-based representation of the structural features of the chromosomes as they associate in the nucleus (Tjong, Li et al. 2016). Using datasets from human lymphoblastoid cells, this approach can measure and schematize genome structure to reveal inter-chromosome surfaces of apposition and to detect new anatomical properties of the nucleus, such as the physical clustering of centromeres of different chromosomes and the anatomical positioning of euchromatin and heterochromatin pockets with respect to other nuclear landmarks. Unlike traditional FISH or chromosome painting, in such an exercise one can know (depending on the resolution afforded by a sequencing experiment, so on the scale of 10^3 to 10^6 bases at present) which loci are responsible for a given anatomical feature and in what part of the nucleus this feature tends to occur relative to other features.
In 1964, the year Cassius Clay defeated Sonny Liston to become heavyweight champion and the Stones released Heart of Stone, Allfrey, Faulkner and Mirsky reported (Allfrey, Faulkner et al. 1964) evidence connecting histone acetylation and methylation with RNA synthesis (i.e., transcription). Commensurate with these observations, they advanced the following prescient hypothesis (emphasis added):
“…histone effects on nuclear RNA metabolism may involve more than a simple inhibition of RNA synthesis…more subtle mechanisms may exist which permit both inhibition and reactivation of RNA production at different loci along the chromosome.”
In the discussion, they go further:
“It may be suggested that DNA-histone binding, alterable by acetylation of the histone, can influence the rate of RNA synthesis. This would allow a means of switching-on or -off RNA synthesis at different times, and at different loci of the chromosomes.”
The only thing lacking from the chromatin structural models described in the preceding paragraphs is the set of rules since divined from ChIP-seq and single locus analyses of histone isoforms, PTMs and chromatin remodelers. The histone code (Strahl and Allis 2000) idea—that regions of the epigenome are marked with histone isoforms and PTMs that specify transcriptional behavior—is a modern iteration of the conjecture summarized from Allfrey, Faulkner and Mirsky above. The histone code has become a ready-to-hand tool for chromatin interrogation, shaping how studies are designed and interpreted.
But the histone code has limitations. As discussed elsewhere in this essay, hundreds of histone PTMs are now known to exist. Also, the histone code and related ideas (Chen, Monte et al. 2012) lack an underpinning in mathematical logic, a limitation addressed by investigators who have used dry lab approaches to characterize chromatin states (Ernst and Kellis 2017) or rules of nucleosome positioning (Segal and Widom 2009) that reconcile ChIP-seq and chromatin accessibility data with genome sequence and transcription. Apart from the accordant histone and chromatin binding proteins associated with different flavors of chromatin, how distinct chromatin domains form, in a physical sense, is not completely understood. Recent evidence from Drosophila and human chromatin suggests that heterochromatin domains composed of H3K9me3-marked nucleosomes, heterochromatin protein 1 (HP-1) and DNA can exhibit phase separation behavior. This may link the domain-scale and molecular-scale properties of heterochromatin foci, which can display both liquid and stable phase properties (Strom, Emelyanov et al. 2017), a phenomenon supported on a broader scale by contemporaneous studies (Hnisz, Shrinivas et al. 2017). The way forward is to integrate structural studies from imaging and sequencing techniques with genome occupancy studies from sequencing techniques to build a new model governed by principles that incorporate all these sets of data (or at least that are not incompatible with large swaths of them).
How the epigenome integrates genetics and environment in disease or chromatin structure-function changes underlying organ malfunction
Histones were originally identified as inhibitors of transcription. This concept remains a kernel of chromatin theory: heterochromatin increases during differentiation and loss of pluripotency, and some diseases, notably cancer, have been found to be associated with a more euchromatic environment. Because they change concomitant with gene expression and phenotype, chromatin modifications are ipso facto taken as responsible for the unidirectional progression of cell fate commitment in the cardiovascular system. Another hypothesis cached therein is that histone PTMs and other chromatin marks stabilize cell identity. Regarding this argument, here is a premise that should be rejected: identification of a chromatin modifying enzyme in the heart or vasculature whose genetic manipulation impairs or reverses developmental state is a necessary and sufficient condition to prove a role for chromatin in deciding and/or stabilizing cell fate. Here is another such premise of tenuous utility: if chromatin modifications stabilize cell phenotype, then reprogramming strategies that restore pluripotency (e.g., iPS) or directly convert one cell type to another (Ieda, Fu et al. 2010, Song, Nam et al. 2012) must do so by wholesale reprogramming of chromatin (although these processes do, no doubt about it, reprogram histone post-translational modifications and DNA methylation at cardiac genes (Liu, Chen et al. 2016)). iPS-derived cells coaxed toward a cardiovascular lineage acquire regulatory elements (e.g., histone post-translational modifications on regulatory elements nearby lineage appropriate genes) reminiscent of their endogenous counterparts (Zhao, Shao et al. 2017) and yet studies from non-cardiovascular tissues have shown that iPS-derived cells retain some epigenetic memories (e.g., DNA methylation) from their cells of origin, which, perhaps not surprisingly, is also the case for cardiovascular cells derived in cell culture from developmental precursors (Tompkins, Jung et al. 2016).
Cardiac cells exhibiting progenitor-like behavior isolated from adult hearts indeed exhibit, commensurate with transcriptome changes, DNA methylation changes in genes associated with the mature cardiomyocyte lineage vis-à-vis adult cardiomyocytes lacking such progenitor-like behavior (Zhang, Zhong et al. 2015).
lncRNA expression changes in hypertrophied mouse hearts did not reveal a global embryonic non-coding transcriptome in this disease model: only 17 (out of a total of 321 [117 of which were cardiac enriched] lncRNAs in adult heart) exhibited altered expression, of which 13 were present in embryonic hearts (Matkovich, Edwards et al. 2014). Cell culture studies of distinct stages of cardiac lineage commitment explored the changes in chromatin marks associated with this process (Paige, Thomas et al. 2012, Wamstad, Alexander et al. 2012). No universal laws relating the timing of histone modification changes to transcription (i.e., who moves first) were uncovered; however, general features of heterochromatic mark (H3K27me3) loss were observed around genes that were expressed (genes never expressed in the cardiac lineage, in contrast, retained abundant H3K27me3 through differentiation and never gained activating marks like H3K4me3), and genes that would be expressed in subsequent stages of development were sometimes (although not always) enriched with H3K4me1 (a so-called ‘poised enhancer’ mark) prior to acquisition of H3K4me3 and RNA pol II concomitant with expression. If one were so inclined, the following observations might be taken as evidence that chromatin becomes more plastic in the setting of cardiac pathology: stimulation of neonatal rat ventricular myocytes with isoproterenol leads to decreased density of chromatin as measured by histone H3 immunolabeling and super resolution microscopy (Mitchell-Jordan, Chen et al. 2012); pressure overload hypertrophy is associated with a decrease in total histone H3K9me3 and increase in total H3K4me3 (as detected by western blotting) as well as a decrease in the linker histone H1 to core (measured by H4) ratio (Franklin, Chen et al. 2012); the chromatin structural proteins HMGB2 and CTCF are associated with cardiac phenotype across genetically variable mouse strains, and HMGB2 can modulate cardiomyocyte hypertrophy and chromatin accessibility in cardiac myocytes (Monte, Rosa-Garrido et al. 2016); and loss of CTCF (which induces cardiac dysfunction) or pressure overload hypertrophy is accompanied by a global decrease in genomic interactions detected by Hi-C (Rosa-Garrido, Chapski et al. 2017).
Every cell in the body must remember to remain the cell that it is. Here is what the cells of the cardiovascular system must also remember: exercise, virtuous diet, tobacco use, carbohydrate levels, fatty acid levels, toxins, viruses, illicit drug use, air pollution, stress, elation, sadness. The spectrum of cardiovascular disease symptoms, and the observation that two patients indistinguishable from each other based on symptoms will exhibit starkly different responses to standard medical therapy, has been attributed to gene by environment interactions and resulted in the common assumption that as much as 50% of common cardiovascular disease is heritable. Genome-wide association studies (GWAS), which identify association of genetic variation with a trait, have been used in a variety of cardiovascular conditions (Kessler, Vilne et al. 2016, Dodoo and Benjamin 2017), with relative success (e.g., PCSK9 [the gene for which housed just one of hundreds of genome-wide significant SNPs associated with coronary artery disease traits like blood lipid levels], a recently emergent therapeutic target for coronary artery disease, was identified by GWAS (Myocardial Infarction Genetics, Kathiresan et al. 2009) [although it had previously been mapped by linkage analysis in families affected by hypercholesterolemia (Abifadel, Varret et al. 2003)]) and—from a clinical perspective—unclear actionability (e.g., GWAS in heart failure has revealed few variants (Rau, Lusis et al. 2015) and none in as-yet therapeutically viable genes/proteins). Meta-analysis of heart failure GWAS studies recently uncovered a novel risk allele associated with mortality and located in a noncoding enhancer region (Smith, Felix et al. 2016). Interestingly, DNA methylation signatures at this locus in blood were correlated with allergic sensitization, potentially hinting at a gene-environment interaction leaving an epigenetic signature.
A similar observation of epigenetic risk conferred in a cell type specific manner by marks present across different tissues was recently reported in the context of dilated cardiomyopathy (Meder, Haas et al. 2017).
Inheritance of non-genetic factors—more specifically, inheritance of acquired traits—transgenerationally has gained renewed interest in the recent past. Unequivocal determination of transgenerational inheritance of acquired traits is tricky (Bohacek and Mansuy 2015) and many purported examples are hotly debated. Epigenetic landscapes are inherited from both parents and arise following fertilization and morula formation. Some of these differences in chromatin architecture are strictly genetic, which is to say, they can be explained by differences in DNA sequence (i.e., which alleles were inherited from which parent). But some of the differences are according to Hoyle transgenerationally epigenetic: they arise from differences in DNA methylation and in some cases differences in histone modifications, such as the ability of maternal-specified histone H3K27me3 to modulate paternal allele specific gene expression independent of DNA methylation (Inoue, Jiang et al. 2017). These heritable, non-DNA features can in turn specify chromatin accessibility in progeny, so-called “protein imprinting.” C. elegans, which lacks DNA methylation, can pass epigenetic cues through the germline (Greer, Maures et al. 2011)—including those triggered by heat shock and persisting for 14 generations—via histone H3K9me3 (Klosin, Casas et al. 2017). In mice, liver (Orozco, Rubbi et al. 2014) and heart (Chen, Orozco et al. 2016) DNA methylation patterns are transgenerationally heritable and contribute to gene expression. Moreover, epigenome-wide association analyses in liver demonstrate (Orozco, Morselli et al. 2015) DNA methylation-dependent—and sequence variation independent—associations with clinical traits important for cardiovascular disease such as insulin level, as well as other ‘omics endpoints, providing proof of concept for population level epigenomic regulation of complex disease traits through the actions of DNA methylation variation to control phenotype, presumably through effects on chromatin structure or accessibility.
Something ostensibly heritable through cell division or meiosis may masquerade as epigenetic and/or may even be of chromatin origin but may in fact proceed via a genetic means. Consider the following example from Turner (Turner 2011): a given locus of DNA is occupied by nucleosomes that have, according to established credo, been tweaked by post-translational modification that favors association of DNA methylation machinery (DNMT3a or 3b, for example, suitable for deposition of new methylation on cytosine), that in turn methylates CpGs in the locus (that is, a new, potentially epigenetic mark, is deposited) that, over time, succumb to non-enzymatic deamination (methyl-cytosine does) to thymidines, thus changing the DNA sequence that, if occurring in a mitotically capable cell, would be passed to cellular progeny and if occurring in germ cells, could be passed to organismal progeny, and that could be further amplified—the original chromatin permutation, the resulting DNA methylation changes and the resulting genetic modification could—by in turn further rejigging the locus via direct influence on nucleosome positioning, creation, ablation or modulation of consensus motifs for protein binding, and/or additional histone post-translational modification that, together, could alter the accessibility of the locus to influence its transcription, in addition to the potential effects of the mutation to alter the sequence of resultant RNA and, when relevant, protein product. This example shows how an epigenetic change might induce a genetic one, but it is important to note that epigenetic changes themselves can induce phenotypic changes, on which selection acts, underscoring the difficulty of unequivocal experimental dissection of solely epigenetic inheritance from parent to progeny—cell or organism.
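The central step of Turner's example, a methylated cytosine deaminating to a thymine and thereby erasing the CpG that carried the mark, can be sketched in a few lines. This is a toy illustration under stated assumptions: the 13-base sequence, the choice of which CpG is methylated, and the function names are all hypothetical, and no real locus is modeled.

```python
def cpg_positions(seq):
    """Return 0-based indices of the C in every CpG dinucleotide."""
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]


def deaminate(seq, methylated):
    """Convert methylated cytosines at the given indices to thymine (C -> T)."""
    bases = list(seq)
    for i in methylated:
        if bases[i] != "C":
            raise ValueError(f"position {i} is {bases[i]!r}, not a cytosine")
        bases[i] = "T"
    return "".join(bases)


# Hypothetical locus with three CpG sites (at indices 2, 6 and 10).
locus = "ATCGGACGTTCGA"
before = cpg_positions(locus)

# Assume the CpG at index 6 was methylated and later deaminated.
mutated = deaminate(locus, [6])
after = cpg_positions(mutated)

print(locus, before)
print(mutated, after)
```

After the change the sequence itself differs (the CG at index 6 is now TG), so the locus has one fewer site available for methylation: what began as a chromatin-level mark is now carried forward as a DNA sequence difference.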
Maternal smoking, for example, is known to induce widespread DNA methylation differences in newborns, some of which persist into the offspring’s adulthood, including in genes known to be associated with smoking-related birth defects (Joubert, Felix et al. 2016), although whether these effects are due to natal exposure (which would render them not epigenetic in the Lamarckian sense) remains unknown. There have been limited experimental studies directly testing the role of non-genetic inheritance of cardiovascular risk in animal models (e.g., DNA methylation dependent target gene expression in the context of offspring ischemic injury (Patterson, Xiao et al. 2012)) and none, to our knowledge, conclusively demonstrating epigenetic inheritance of cardiovascular risk in the absence of in utero exposure. In vitro fertilization experiments using gametes from obese or normal weight parents (induced by dietary modification) and surrogate, normal chow fed, mothers revealed that a propensity for increased body weight can be inherited by non-genetic means (Huypens, Sass et al. 2016). Metabolic gene regulation has been shown to be a modifiable, and subsequently heritable, feature, in that mice fed low protein diets passed hepatic gene expression profiles transgenerationally through the paternal germ line (Carone, Fauquier et al. 2010). Genetic variability contributes to chromatin accessibility (measured via FAIRE-seq) in the basal state and following complex metabolic changes, such as those accompanying high fat diet (Leung, Parks et al. 2014). Human studies in ethnically diverse populations have revealed DNA methylation variation associated with nicotine and alcohol dependence (and the co-dependency between these forms of addiction) (Xu, Wang et al. 2017) although no evidence of inheritance and/or precedence of the phenotypes by the epigenetic features was demonstrated.
In addition to the obvious prognostic and diagnostic potential of genomics and epigenomics measurements in the clinical arena (the practical considerations of which are discussed in detail elsewhere (MacRae, Roden et al. 2016, Wang, Aboulhosn et al. 2016)), it is noteworthy that tools for reducing epigenomic treatment to practice have begun to emerge. Modification of the CRISPR/Cas9 gene targeting system, involving an inactive Cas9 nuclease (so-called dCas9) fused to DNMT3a or Tet1 and combined with guide RNAs to localize the complex, can induce altered DNA methylation of specific loci in somatic cells in vivo commensurate with desired changes in gene expression (Liu, Wu et al. 2016) (techniques for remodeling chromatin loops with designer CRISPR tools have also emerged (Morgan, Mariano et al. 2017)). Such approaches may enable targeting of entire transcriptomes, rather than individual molecules, in a gene therapy workflow that at once provides both specificity and temporal tuning.
It occurs to me that another definition of epigenomics would be: the molecular features that make one unit of a living being the same unit tomorrow that it is today. Because different cells with the same DNA have different regimes of RNA and protein and are capable of quite different things, if a unifying model for chromatin regulation does not yet exist, it is necessary to invent one.
Tom Vondriska, May 2018
A vigorously edited adaptation of this essay was published, through kind invitation, in the journal Circulation Research (2018;122:1586-1607) in collaboration with Manuel Rosa Garrido and Douglas Chapski in May 2018. I also thank Emma Monte for feedback on this essay.
Extraordinarily complex things can emerge without design. Yet, the question of how the same four nucleic acid building blocks and much of the same chromatin packaging toolkit (made from protein and RNA) present in all eukaryotes endow Saccharomyces cerevisiae with the ability to metabolize sugar into alcohol, Solanum lycopersicum with the ability to photosynthesize and Homo sapiens with the ability to execute complex behaviors such as speech, chess (or 围棋) and abstract reasoning, remains largely unanswered.
The goals of this essay are to stimulate discussion about the concepts of chromatin biology and epigenomics and to provide a thorough review of chromatin-related investigations in the cardiovascular system. In further support of the latter objective, Supplemental Table 1 provides a detailed compendium of published work implicating chromatin modifiers in the cardiovascular system, including references and notes on phenotypes and model systems.
If the last eukaryotic common ancestor had histones and used them to package DNA—which seems likely given observations in Archaea by K. Luger—this would mean the nucleosome protein complexes we study today have ~2 billion years of evolutionary refinement behind them. While the fundamental binding of DNA by histones has changed little in that time, it is interesting to note that archaeal histone-like proteins are smaller than their eukaryotic relatives and lack basic tails, which in eukaryotes are unstructured and beset by post-translational modifications which are, in turn, associated with highly specialized transcriptional responses that can differ between cells, as discussed in detail elsewhere in this piece. It is thus reasonable to hypothesize that chromatin structure evolved long ago and has not much changed, whereas the regulation of this structure has evolved in the realm of eukaryotes and not in Archaea, paralleling evolution of organelles, multicellularity and transcriptome expansion in the former and not the latter.
Resolution in terms of chromatin structure can mean in 3D space, as is acquired with imaging techniques and which includes direct visualization in the nuclear environment (higher resolution here meaning a greater ability to measure where something is in 3D with respect to something else and/or to learn the anatomic features of the chromatin) but which treats DNA in a democratized fashion, which is to say, you do not know which regions you are looking at, although you may be able to visualize them in great detail. Resolution can also refer to the scale of locus specificity at which a chromatin feature is determined, say for instance if FISH determines two chromosomes to be nearer to each other in the nucleus than either is to a third (this would be relatively low resolution) or when a chromatin capture experiment determines, with accuracy of several tens or hundreds of base pairs, the physical proximity of regions of the genome not in close contiguous proximity along the same chromosome.
ChIP-seq protocols require optimization for target and source material and thus often vary considerably across labs. Most experiments examining chromatin employ a fixation step, often with formaldehyde. For these reasons, it would not be unreasonable to worry that grand conclusions about chromatin biology from ChIP-seq experiments, and other epigenome-wide methods, may be contingent on techniques. Allaying this concern somewhat is the observation that the data from different labs around the world deploying ChIP-seq experiments on different cell types often produce similar observations: for example, activating marks tend to reside in genes that are transcribed and so on (cognitive biases in data analysis and presentation notwithstanding). Indeed, general patterns of accessibility vary little, at a low-resolution glance, between cells. FISH experiments can further dampen such concerns, for example in the confirmation of DAM-ID experiments by co-labeling the nuclear membrane and a gene of interest, as was recently demonstrated in transition of stem cells to the cardiac lineage (Poleshko A, et al. Cell. 2017;171:573.), although FISH also usually involves fixation. An additional piece of evidence suggesting that the uniform aspect of epigenomic experiments across cells (that is, the fixation) is not inducing an artificial property of chromatin is the observation that one of those features—the TAD—is not universally detected following Hi-C experiments (i.e., you can perform Hi-C on genomic DNA and, under the right conditions, not detect TADs, either following genetic fiddling with chromatin structure proteins [Rao SSP, et al. Cell. 2017;171:305; Nora EP, et al. Cell. 169:930; Schwarzer W, et al. Nature. 2017;551:51] or during meiosis [Du Z et al. Nature. 2017;547:232]), implying these features are not artifacts of current analytical approaches.
Take for instance the bivalent histone modifications H3K4me3 and H3K27me3: in C57BL/6 mouse heart, 32,461 H3K4me3 autosomal peaks were called, whereas 14,915 H3K27me3 peaks were called (4,055 H3K27me3 and 6,172 H3K4me3 peaks overlap; the difference in these numbers results from the differing peak widths between the datasets [this is also reflected in the different total numbers of peaks called; H3K4me3 peaks are narrower]). In the liver, for comparison, those numbers are 32,690 and 19,829, respectively (ENCODE datasets from Bing Ren lab, UCSD). How do the writers, erasers and readers distinguish between these sites?
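Why the overlap count depends on which mark is used as the starting point can be seen in a toy interval calculation. The sketch below is illustrative only: the coordinates are invented, the function name is hypothetical, and it makes no claim about how the ENCODE peaks above were actually intersected.

```python
def count_overlapping(query, reference):
    """Count intervals in `query` that overlap at least one interval in `reference`.

    Intervals are half-open (start, end) tuples on a single chromosome.
    """
    return sum(
        1
        for q_start, q_end in query
        if any(q_start < r_end and r_start < q_end for r_start, r_end in reference)
    )


# Hypothetical peaks: narrow ones standing in for H3K4me3,
# broad ones standing in for H3K27me3.
narrow_peaks = [(100, 200), (300, 400), (900, 1000)]
broad_peaks = [(50, 450), (2000, 3000)]

narrow_hits = count_overlapping(narrow_peaks, broad_peaks)
broad_hits = count_overlapping(broad_peaks, narrow_peaks)

print("narrow peaks overlapping a broad peak:", narrow_hits)
print("broad peaks overlapping a narrow peak:", broad_hits)
```

A single broad peak can cover several narrow ones, so more narrow peaks than broad peaks are counted as overlapping, the same direction of asymmetry as the 6,172 versus 4,055 figures in the text. The nested loop is fine at this scale; genome-wide peak sets would call for a sorted sweep or a dedicated interval tool.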
A less common, but more interesting, answer is: all of them. This hypothesizes that every scale of biological (and indeed physical) information that can be measured is important, somehow privileged, by its existence, versus biological and physical things that do not exist. If humans can perceive it, it needs to be there. A variation of everything happens for a reason, this answer is more likely to be given verbally (as opposed to in the discussion section of a paper or the background and preliminary data sections of a grant application) and perhaps after the day’s lectures are done and libations are being imbibed, followed up by some variation of yeah, well but we will never be able to figure it all out anyway, so.
Note well an example: the silencing mark histone H3 lysine 27 trimethylation, which enriches around gene regulatory regions, is represented at the time of this writing by 403 ChIP-seq datasets in ENCODE and ~2000-3000 publications in PubMed (an exact count of publications is tough given cavalier nomenclature); those numbers for H3 K27 acetylation, a euchromatic mark (note: on the same residue of the same protein), are 384 and ~500, respectively. In contrast, a recently identified succinylation on human histone H3 K79, also found to enrich in the transcription start sites of active genes, has one publication and one ChIP-seq dataset (Wang Y, et al. Nature. 2017;552:273).
Virtually every form of PTM that has been identified on any protein has been identified on a histone, meaning a comprehensive list—forget a comprehensive discussion—of these modifications is a lengthy endeavor unto itself (e.g., Zhao and Garcia. Cold Spring Harbor Perspect Biol. 2015;7:a025064).
Brahma is a Hindu god associated with creation—in apposition, balance, or partnership perhaps, with gods of preservation (Vishnu) and destruction (Shiva; New Oxford American Dictionary).
This type of experimental design, while producing interpretable results, can represent attempted avoidance of cognitive dissonance by stacking the deck to confirm a hypothesis (“looking for the keys under the street lamp”). In the context of the broader question of how does DNA methylation regulate chromatin structure?, RRBS is probably inadequate for this purpose, as it ignores a large physical portion of the genome based on the—not unreasonable—assumption that it does not code mRNA. Whole genome bisulfite sequencing of large groups of animals with known genetic relatedness will be necessary to definitively test whether all DNA methylation behaves, in terms of inheritance, like that in CpG islands and what, if any, role is played by intergenic DNA methylation in chromatin structure, mRNA transcription and other trans effects on genome function.
RNA interference (RNAi; a rich field unto itself) operates to silence transcription through multiple mechanisms. One mechanism in lower organisms involves facilitation of heterochromatin via direct interaction of noncoding small RNAs with chromatin modifying enzymes (and a host of regulatory proteins, plus an RNA-dependent RNA polymerase, necessary for RNAi-mediated chromatin silencing, that is not present in mammals) at defined heterochromatin loci, a process most well-studied in Schizosaccharomyces pombe and not well understood in mammals (Martienssen and Moazed D. Cold Spring Harb Perspect Biol. 2015;7).
To unpack the BHitToCF a little further: the same genetic material exists in all cells and while the primary sequence of DNA provides cues for nucleosome positioning and transcriptional activity, it is insufficient to explain chromatin landscapes from an empirical standpoint (different cells have different transcriptomes, ergo transcriptional regulation is highly specialized). The histone modifying enzymes themselves are often multiprotein complexes but have not been shown to engage cell type specific (and/or stimulus-specific) subunits that have distinct DNA binding activity. Many histone modifications are shared across different cell types. Chromatin environments (so called gene expression neighborhoods or transcription factories) as defined by chromosome capture experiments may contribute to made-to-order transcriptomes, but many of these features (such as chromatin loops and TADs) are also shared between different cell types. lncRNAs have the distinguishing feature that they may, by virtue of their own transcription, create a microenvironment that facilitates a chromatin feature (Engreitz JM, et al. Science. 2013;341:1237973), which in turn may regulate a gene expression program. But then, how did the polymerase get to the lncRNA gene in the first place (and what keeps it away from other lncRNAs that govern transcription programs in other cell types)? At present, theories in this area are not much better than turtles all the way down.
Let’s get it out of the way that there is nothing wrong with the central dogma as it applies to the flow of information in the context of inheritance. It is as close as one can get to a biological law, and was articulated by Crick as “the detailed residue-by-residue transfer of information” from nucleic acids to nucleic acids, or from nucleic acids to proteins (Nature. 1970;227:561). But its common distillation to “DNA to mRNA to protein” as a linear process is a cognitive bias nonetheless and thus deserves skepticism in terms of the universality (or lack thereof) of its usefulness across biological scales for explaining cellular networks and mechanisms of disease.
Diseases are statistical phenomena based on symptom accretion and prevalence within a population. Some physicians, particularly in the rare disease community where precedent can be of little use for diagnosis and treatment of genetic conditions, consider that there may be “only patients and no diseases” (Manish Butte, UCLA, in conversation).
Why do so many (published) genetic gain/loss of function models in the heart result in some type of hypertrophy and/or dilated cardiomyopathy? Because they all work “in the same molecular pathway” or because the realm of possible entropic destinations for the adult heart is fixed (and quite small, in terms of organ and cell level phenotypes)?
In non-infectious disease scenarios like most forms of cardiovascular disease, metabolic disease, cancer and many neurological disorders, it does not really make sense to isolate a causative agent, although much of molecular biology has attempted to isolate, in a sense, causative genes and to use them like infectious agents (sometimes, as with viral delivery, quite literally). While this approach has been very useful in identifying things that can cause disease in animals and cells, it has been less useful in identifying things that do cause disease in people. In multigenic diseases, the same gene is often not the one (or not the only one) behaving badly in patients presenting with the same symptoms. Apologies to R. H. H. Koch.
It is tempting to speculate that epigenetic processes may be the molecular substrate of organismal memory, that is, the chemical messages the brain’s cells use to remember things. Distributed networks of cells, and their associated privileged connections (or lack thereof) are widely held to be critical for this process, but the molecular agents are not universally agreed upon. Organismal (and thus cellular [and thus molecular]) memory does not require active chemistry. Example: Caenorhabditis elegans and Milnesium tardigradum can remain in suspended animation for years, even sometimes “remembering” simple tasks performed before the long nap. Electrical circuits seem an odd choice (electric shock usually does not induce a proverbial tabula rasa) and other cellular processes seem to fail a sniff test because they are either too stable (e.g. DNA itself) or too labile (e.g. membrane potentials, intracellular calcium stores, signaling networks).
Translation: in conversation with Drs. Mark Sussman, Elaheh Karbassi and others, and unpublished observations.
The text preceding this quote contains an early articulation of a general theory of chromatin:
“It has always been a puzzle to us…how the physiological functions of cell nuclei in the same organism can differ—as presumably they do differ—from one cell-type to another when they all contain identical chromosomes and hence identical genes. The physiological functions of the nuclei are presumably due to the genes which they contain; they should, therefore, be identical in all nuclei of a given organism. If, however, it is postulated that nuclei contain some mechanisms for the suppression of the activities of particular genes, or groups of genes, and that this mechanism is specific for each cell type, these difficulties disappear.”
It is worth reflecting that this conjecture, a stunningly accurate description of the modern understanding of chromatin function, was written three years before the publication of the structure of DNA and at a time when there was still active debate about the role of DNA as the genetic material (Griffiths AJF, et al. An introduction to genetic analysis. 7th Ed. W.H. Freeman, New York, 2000).
Metaphor extension: Hi-C gives you the equivalent of an Excel spreadsheet of interactions…a list of all the segments of each piece of spaghetti and information on which other segments of which other pieces it interacts with (and which ones it does not). Models like those developed by F. Alber and others give you the equivalent of a navigable 3D model of the bowl of pasta.
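To make the spreadsheet half of the metaphor concrete, here is a minimal sketch in Python. Everything in it is a hypothetical illustration: the segment labels, the contact counts, and the helper names `build_contact_map` and `partners` are invented for this note and do not correspond to any real Hi-C file format or analysis package. It shows only the list-of-pairs view (which segment touches which, and how often); it does not attempt the 3D modeling step.

```python
from collections import defaultdict

# Hypothetical toy data: (segment_a, segment_b, contact_count).
# Labels and counts are invented for illustration only.
contacts = [
    ("chr1:0", "chr1:1", 12),
    ("chr1:0", "chr2:3", 4),
    ("chr1:1", "chr2:3", 7),
]


def build_contact_map(pairs):
    """Turn a flat list of pairwise contacts into a symmetric lookup table."""
    cmap = defaultdict(dict)
    for a, b, n in pairs:
        # A contact between a and b is also a contact between b and a.
        cmap[a][b] = cmap[a].get(b, 0) + n
        cmap[b][a] = cmap[b].get(a, 0) + n
    return cmap


def partners(cmap, segment):
    """List the segments that touch `segment`, most frequent contact first."""
    return sorted(cmap.get(segment, {}).items(), key=lambda kv: -kv[1])


contact_map = build_contact_map(contacts)
print(partners(contact_map, "chr1:0"))
```

A segment that never appears in the list simply returns an empty result, which is the "and which ones it does not" part of the spreadsheet.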
In contrast with what Heidegger viewed as “present-at-hand” tools, which have no preordained function (Wheeler, Michael, "Martin Heidegger", The Stanford Encyclopedia of Philosophy (Fall 2017 Edition), Edward N. Zalta [ed.]), chromatin models, the histone code being archetypical, take as fixed certain rules about the relationship between nucleosomes, histone modifications, chromatin structure and gene expression. Such assumptions may be unavoidable, but they should allow that the model may “break.”
Just because a human study does not result in clinically actionable intel does not mean that it is not instructive regarding basic human biology. In the case of heart failure, the human data may be telling us (again) that single genes are not the answer to a new therapy that will work on a population scale.
As noted by Professor Turner, all of these processes have been experimentally demonstrated independently in higher eukaryotes, including mammals, although the precise manner in which they might combine remains to be universally demonstrated. That is to say: it remains unknown how frequently environmental (read: non-genetic) factors use chromatin as a substrate to tamper with the genetic code in a heritable manner across the genome. This ignorance is even deeper, but perhaps more tantalizing, in the context of higher traits like those commonly investigated in cardiovascular disease, in which environmental factors (that is, modifiable CVD risk factors) have, over the course of industrialization of much of the world, come to exert stronger and more persistent bombardment of the epigenome. Still less is known about whether this process works in a Darwinian, selectable manner.
The source of man’s essence has been the subject of intense thought and debate throughout recorded history. Modern science, as contrasted with ancient science which was not divorced from philosophy, has largely reduced this question to one of physical, chemical relationships, regarding supervenience as indisputable truth. To even suggest otherwise would certainly raise eyebrows and torpedo grants. The basic question of essence is useful, however, in framing the analysis of complex physiological systems, be they organisms, organs or cells: What is the nature of identity? What are the principles that govern persistence of that identity through time? This essay endeavors, on some level, to answer these questions in physical, chemical terms for the cardiovascular system.