In molecular biology and genetics, transcriptional regulation is the means by which a cell regulates the conversion of DNA to RNA (transcription), thereby orchestrating gene activity. A single gene can be regulated in a range of ways, from altering the number of copies of RNA that are transcribed to controlling when the gene is transcribed. This control allows the cell or organism to respond to a variety of intra- and extracellular signals. Examples include producing the mRNAs that encode enzymes needed to adapt to a change in food source, producing the gene products involved in cell cycle specific activities, and producing the gene products responsible for cellular differentiation in multicellular eukaryotes.
The regulation of transcription is a vital process in all living organisms. It is orchestrated by transcription factors and other proteins working in concert to finely tune the amount of RNA being produced through a variety of mechanisms. Prokaryotic and eukaryotic organisms have very different strategies for controlling transcription, but some important features are conserved between the two. Most important is the idea of combinatorial control: any given gene is likely controlled by a specific combination of factors. In a hypothetical example, the factors A and B might regulate a distinct set of genes from the combination of factors A and C. This combinatorial nature extends to complexes of far more than two proteins, and allows a very small subset (less than 10%) of the genome to control the transcriptional program of the entire cell.
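The hypothetical A/B/C example above can be sketched in a few lines of Python. The factor names, the gene names, and the rule that a gene is on only when all of its required factors are present are illustrative assumptions, not real biology; the point is only that a small set of factors can specify distinct gene sets when used in different combinations.

```python
from itertools import combinations

# Hypothetical regulatory table: each gene is switched on only when
# every factor in its required set is present (illustrative names only).
REQUIRED_FACTORS = {
    "gene1": frozenset({"A", "B"}),
    "gene2": frozenset({"A", "B"}),
    "gene3": frozenset({"A", "C"}),
    "gene4": frozenset({"B", "C"}),
}


def active_genes(present_factors):
    """Return the sorted genes whose required factors are all present."""
    present = set(present_factors)
    return sorted(g for g, need in REQUIRED_FACTORS.items() if need <= present)


if __name__ == "__main__":
    # Each pair of factors turns on a different set of genes.
    for pair in combinations("ABC", 2):
        print(pair, active_genes(pair))
```

In this toy table, three factors taken two at a time already distinguish three gene sets, and no single factor alone activates anything.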
Much of the early understanding of transcription came from prokaryotic organisms, although the extent and complexity of transcriptional regulation is greater in eukaryotes. Prokaryotic transcription is governed by three main sequence elements:
· Promoters are DNA elements, located directly upstream of the gene, that bind RNA polymerase and other proteins to initiate transcription.
· Operators are stretches of DNA recognized by repressor proteins, whose binding inhibits the transcription of the gene.
· Positive control elements are DNA sequences bound by activator proteins, whose binding incites higher levels of transcription.
While these means of transcriptional regulation also exist in eukaryotes, the transcriptional landscape is significantly more complicated, both by the number of proteins involved and by the presence of introns and the packaging of DNA around histones into chromatin.
The transcription of a basic prokaryotic gene depends on the strength of its promoter and the presence of activators or repressors. In the absence of other regulatory elements, a promoter’s sequence-based affinity for RNA polymerase varies, which results in the production of different amounts of transcript. The variable affinity of RNA polymerase for different promoter sequences is related to regions of consensus sequence upstream of the transcription start site. The more nucleotides of a promoter that agree with the consensus sequence, the stronger the affinity of the promoter for RNA polymerase is likely to be.
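As a rough illustration of the consensus idea, the sketch below counts how many positions of a promoter’s two hexamers agree with the commonly cited E. coli sigma-70 consensus sequences at the -35 (TTGACA) and -10 (TATAAT) regions. The scoring function and its name are assumptions made for illustration; real promoter strength also depends on spacer length, flanking sequence, and bound regulators, none of which are modeled here.

```python
# Commonly cited E. coli sigma-70 consensus hexamers (DNA, coding strand).
CONSENSUS_35 = "TTGACA"
CONSENSUS_10 = "TATAAT"


def matches(hexamer, consensus):
    """Count positions at which a hexamer agrees with the consensus."""
    if len(hexamer) != len(consensus):
        raise ValueError("hexamer and consensus must have equal length")
    return sum(a == b for a, b in zip(hexamer.upper(), consensus))


def consensus_score(box35, box10):
    """Fraction (0 to 1) of the 12 consensus positions that match."""
    total = matches(box35, CONSENSUS_35) + matches(box10, CONSENSUS_10)
    return total / (len(CONSENSUS_35) + len(CONSENSUS_10))


if __name__ == "__main__":
    # A perfect match versus a hypothetical promoter with three mismatches.
    print(consensus_score("TTGACA", "TATAAT"))
    print(consensus_score("TTTACA", "TATGTT"))
```

Under this simplified score, a promoter with more matching positions is predicted to bind RNA polymerase more tightly; it is a ranking heuristic, not a measurement of affinity.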
In the absence of other regulatory elements, the default state of a prokaryotic gene is the “on” configuration, resulting in the production of some amount of transcript. This means that transcriptional regulation, in the form of protein repressors and positive control elements, can either increase or decrease transcription. Repressors often physically occupy the promoter location, occluding RNA polymerase from binding. Alternatively, a repressor and polymerase may bind to the DNA at the same time, with a physical interaction between the repressor and the polymerase preventing the opening of the DNA for access to the minus strand for transcription. This strategy of control is distinct from eukaryotic transcription, whose basal state is off and where the co-factors required for transcription initiation are highly gene dependent.
Sigma factors are specialized bacterial proteins that bind to RNA polymerases and orchestrate transcription initiation. Sigma factors act as mediators of sequence-specific transcription, such that a single sigma factor can be used for transcription of all housekeeping genes, or of a suite of genes the cell expresses in response to an external stimulus such as stress.
The added complexity of generating a eukaryotic cell carries with it an increase in the complexity of transcriptional regulation. Eukaryotes have three RNA polymerases, known as Pol I, Pol II, and Pol III. Each polymerase has specific targets and activities, and is regulated by independent mechanisms. There are a number of additional mechanisms through which polymerase activity can be controlled. These mechanisms can be generally grouped into three main areas:
· Control over polymerase access to the gene. This is perhaps the broadest of the three control mechanisms. It includes the functions of histone remodeling enzymes, transcription factors, enhancers and repressors, and many other complexes.
· Productive elongation of the RNA transcript. Once polymerase is bound to a promoter, it requires another set of factors to allow it to escape the promoter complex and begin successfully transcribing RNA.
· Termination of the polymerase. A number of factors have been found to control how and when termination occurs, which dictates the fate of the RNA transcript.
All three of these systems work in concert to integrate signals from the cell and change the transcriptional program accordingly.
While in prokaryotic systems the basal transcription state can be thought of as nonrestrictive (that is, “on” in the absence of modifying factors), eukaryotes have a restrictive basal state which requires the recruitment of other factors in order to generate RNA transcripts. This difference is largely due to the compaction of the eukaryotic genome by winding DNA around histones to form higher order structures. This compaction makes the gene promoter inaccessible without the assistance of other factors in the nucleus, and thus chromatin structure is a common site of regulation. Similar to the sigma factors in prokaryotes, the general transcription factors (GTFs) are a set of factors in eukaryotes that are required for all transcription events. These factors are responsible for stabilizing binding interactions and opening the DNA helix to allow the RNA polymerase to access the template, but generally lack specificity for different promoter sites. A large part of gene regulation occurs through transcription factors that either recruit or inhibit the binding of the general transcription machinery and/or the polymerase. This can be accomplished through close interactions with core promoter elements, or through the long-distance enhancer elements.
Once a polymerase is successfully bound to a DNA template, it often requires the assistance of other proteins in order to leave the stable promoter complex and begin elongating the nascent RNA strand. This process is called promoter escape, and is another step at which regulatory elements can act to accelerate or slow the transcription process. Similarly, protein and nucleic acid factors can associate with the elongation complex and modulate the rate at which the polymerase moves along the DNA template.
At the level of chromatin state
In eukaryotes, genomic DNA is highly compacted in order to fit into the nucleus. This is accomplished by winding the DNA around octamers of histone proteins, which has consequences for the physical accessibility of parts of the genome at any given time. Significant portions are silenced through histone modifications, and thus are inaccessible to the polymerases or their cofactors. The highest level of transcription regulation occurs through the rearrangement of histones in order to expose or sequester genes, because these processes can render entire regions of a chromosome inaccessible, as occurs in imprinting.
Histone rearrangement is facilitated by post-translational modifications to the tails of the core histones. A wide variety of modifications can be made by enzymes such as the histone acetyltransferases (HATs), histone methyltransferases (HMTs), and histone deacetylases (HDACs), among others. These enzymes can add or remove covalent modifications such as methyl groups, acetyl groups, phosphates, and ubiquitin. Histone modifications serve to recruit other proteins, which can either increase the compaction of the chromatin and sequester promoter elements, or increase the spacing between histones and allow the association of transcription factors or polymerase on open DNA. For example, H3K27 trimethylation by the polycomb complex PRC2 causes chromosomal compaction and gene silencing. These histone modifications may be created by the cell, or inherited in an epigenetic fashion from a parent.
Through transcription factors and enhancers
Transcription factors are proteins that bind to specific DNA sequences in order to regulate the expression of a given gene. The power of transcription factors resides in their ability to activate and/or repress wide repertoires of downstream target genes. The fact that these transcription factors work in a combinatorial fashion means that only a small subset of an organism’s genome encodes transcription factors. Transcription factors function through a wide variety of mechanisms. Often they are at the end of a signal transduction pathway that functions to change something about the factor, like its subcellular localization or its activity. Post-translational modifications to transcription factors located in the cytosol can cause them to translocate to the nucleus, where they can interact with their corresponding enhancers. Others are already in the nucleus, and are modified to enable the interaction with partner transcription factors. Some post-translational modifications known to regulate the functional state of transcription factors are phosphorylation, acetylation, SUMOylation and ubiquitylation. Transcription factors can be divided into two main categories: activators and repressors. While activators can interact directly or indirectly with the core machinery of transcription through enhancer binding, repressors predominantly recruit co-repressor complexes, leading to transcriptional repression by chromatin condensation of enhancer regions. A repressor may also function by direct competition against a given activator: overlapping DNA-binding motifs for activators and repressors induce a physical competition to occupy the binding site. If the repressor has a higher affinity for its motif than the activator, transcription is effectively blocked in the presence of the repressor. Tight regulatory control is achieved by the highly dynamic nature of transcription factors.
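The competition between an activator and a repressor for an overlapping site can be illustrated with a simple equilibrium occupancy model. This is a generic textbook-style sketch rather than a model taken from the text: it assumes a single site that can be bound by the activator or the repressor but never both, free concentrations and dissociation constants (Kd) expressed in the same units, and purely hypothetical parameter values.

```python
def site_occupancy(activator_conc, activator_kd, repressor_conc, repressor_kd):
    """Equilibrium probabilities for one site with mutually exclusive binding.

    Returns (p_empty, p_activator, p_repressor).
    A lower Kd means a higher affinity for the site.
    """
    if activator_kd <= 0 or repressor_kd <= 0:
        raise ValueError("dissociation constants must be positive")
    a = activator_conc / activator_kd   # statistical weight: activator bound
    r = repressor_conc / repressor_kd   # statistical weight: repressor bound
    z = 1.0 + a + r                     # sum of weights over the three states
    return 1.0 / z, a / z, r / z


if __name__ == "__main__":
    # Equal concentrations, but the repressor binds ten times more tightly.
    p_empty, p_act, p_rep = site_occupancy(10.0, 10.0, 10.0, 1.0)
    print(f"empty={p_empty:.3f} activator={p_act:.3f} repressor={p_rep:.3f}")
```

With equal concentrations, the factor with the lower Kd occupies the site most of the time, which is the sense in which a higher-affinity repressor blocks the activator.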
Many different mechanisms exist to control whether a transcription factor is active. These mechanisms include control over protein localization and control over whether the protein can bind DNA. An example is the protein HSF1, which remains bound to Hsp70 in the cytosol and is only translocated into the nucleus upon cellular stress such as heat shock. Thus, the genes under the control of this transcription factor remain untranscribed unless the cell is subjected to stress.
Enhancers or cis-regulatory modules/elements (CRM/CRE) are non-coding DNA sequences containing multiple activator and repressor binding sites. Enhancers range from 200 bp to 1 kb in length and can be either proximal (upstream of the promoter on the 5′ side, or within the first intron of the regulated gene) or distal (in introns of neighboring genes, or in intergenic regions far away from the locus). Through DNA looping, active enhancers contact the promoter in a manner that depends on the promoter specificity of the core DNA-binding motif. Promoter-enhancer dichotomy provides the basis for the functional interaction between transcription factors and the core transcriptional machinery to trigger RNA Pol II escape from the promoter. Although one might expect a 1:1 enhancer-promoter ratio, studies of the human genome predict that an active promoter interacts with 4 to 5 enhancers. Similarly, enhancers can regulate more than one gene without linkage restriction and are said to “skip” neighboring genes to regulate more distant ones. Although infrequent, transcriptional regulation can involve elements located on a different chromosome from the one where the promoter resides. Proximal enhancers or promoters of neighboring genes can serve as platforms to recruit more distal elements.
POST-TRANSCRIPTIONAL REGULATION/RNA PROCESSING
The eukaryotic pre-mRNA undergoes extensive processing before it is ready to be translated. The additional steps involved in eukaryotic mRNA maturation create a molecule with a much longer half-life than a prokaryotic mRNA. Eukaryotic mRNAs last for several hours, whereas the typical prokaryotic mRNA lasts no more than five seconds.
Pre-mRNAs are first coated in RNA-stabilizing proteins; these protect the pre-mRNA from degradation while it is processed and exported out of the nucleus. The three most important steps of pre-mRNA processing are the addition of stabilizing and signaling factors at the 5′ and 3′ ends of the molecule, and the removal of intervening sequences that do not specify the appropriate amino acids. In rare cases, the mRNA transcript can be “edited” after it is transcribed.
While the pre-mRNA is still being synthesized, a 7-methylguanosine cap is added to the 5′ end of the growing transcript by a 5′-to-5′ triphosphate linkage. This moiety protects the nascent mRNA from degradation. In addition, initiation factors involved in protein synthesis recognize the cap to help initiate translation by ribosomes.
Figure: Capping of the pre-mRNA involves the addition of 7-methylguanosine (m7G) to the 5′ end. The cap protects the 5′ end of the primary RNA transcript from attack by ribonucleases and is recognized by eukaryotic initiation factors involved in assembling the ribosome on the mature mRNA prior to initiating translation.
WHY 5′ CAPPING?
1. It adds stability to mRNA
2. It helps in nuclear transport
3. It promotes translation
4. It helps in 5′ proximal intron excision
3′ Poly-A Tail
While RNA polymerase II is still transcribing downstream of the proper end of a gene, the pre-mRNA is cleaved by an endonuclease-containing protein complex between an AAUAAA consensus sequence and a GU-rich sequence. This releases the functional pre-mRNA from the rest of the transcript, which is still attached to the RNA polymerase. An enzyme called poly(A) polymerase (PAP) is part of the same protein complex that cleaves the pre-mRNA, and it immediately adds a string of approximately 200 A nucleotides, called the poly(A) tail, to the 3′ end of the just-cleaved pre-mRNA. The poly(A) tail protects the mRNA from degradation, aids in the export of the mature mRNA to the cytoplasm, and binds proteins involved in initiating translation.
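The cleavage and polyadenylation step can be sketched as simple string operations on an RNA sequence. This is only a toy model: the function name, the fixed 20-nucleotide distance between the AAUAAA signal and the cut site, and the choice to use the last AAUAAA in the transcript are assumptions made for illustration, and the GU-rich downstream element is not checked. The tail length of 200 follows the approximate figure given above.

```python
POLY_A_SIGNAL = "AAUAAA"


def cleave_and_polyadenylate(pre_mrna, cleavage_offset=20, tail_length=200):
    """Cleave downstream of the last AAUAAA signal and append a poly(A) tail.

    cleavage_offset is an assumed number of nucleotides between the end of
    the signal and the cut site; tail_length is the number of A's appended.
    """
    signal_start = pre_mrna.rfind(POLY_A_SIGNAL)
    if signal_start == -1:
        raise ValueError("no AAUAAA signal found")
    # The cut site cannot lie beyond the end of the transcript.
    cut = min(signal_start + len(POLY_A_SIGNAL) + cleavage_offset, len(pre_mrna))
    # Everything downstream of the cut (still attached to the polymerase
    # in the cell) is discarded; the tail is added to the new 3' end.
    return pre_mrna[:cut] + "A" * tail_length


if __name__ == "__main__":
    # Made-up transcript: body, signal, spacer, GU-rich region, extra sequence.
    transcript = "GGCUAC" + "AAUAAA" + "C" * 20 + "GUGUGUGU" + "CCCC"
    processed = cleave_and_polyadenylate(transcript)
    print(len(transcript), len(processed))
```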
Eukaryotic genes are composed of exons, which correspond to protein-coding sequences (ex-on signifies that they are expressed), and intervening sequences called introns (int-ron denotes their intervening role), which may be involved in gene regulation but are removed from the pre-mRNA during processing. Intron sequences in mRNA do not encode functional proteins.
Discovery of Introns
The discovery of introns came as a surprise to researchers in the 1970s who expected that pre-mRNAs would specify protein sequences without further processing, as they had observed in prokaryotes. The genes of higher eukaryotes very often contain one or more introns. While these regions may correspond to regulatory sequences, the biological significance of having many introns or having very long introns in a gene is unclear. It is possible that introns slow down gene expression because it takes longer to transcribe pre-mRNAs with lots of introns. Alternatively, introns may be nonfunctional sequence remnants left over from the fusion of ancient genes throughout evolution. This is supported by the fact that separate exons often encode separate protein subunits or domains. For the most part, the sequences of introns can be mutated without ultimately affecting the protein product.
All introns in a pre-mRNA must be completely and precisely removed before protein synthesis. If the process errs by even a single nucleotide, the reading frame of the rejoined exons would shift, and the resulting protein would be dysfunctional. The process of removing introns and reconnecting exons is called splicing. Introns are removed and degraded while the pre-mRNA is still in the nucleus. Splicing occurs by a sequence-specific mechanism that ensures introns will be removed and exons rejoined with the accuracy and precision of a single nucleotide. The splicing of pre-mRNAs is conducted by complexes of proteins and RNA molecules called spliceosomes.
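To illustrate why a single-nucleotide splicing error is so damaging, the sketch below translates a toy message after correct splicing and after splicing that is off by one nucleotide in either direction. The exon and intron sequences are invented for the example, and the helper names are assumptions; the codon table is the standard genetic code.

```python
from itertools import product

BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built by enumerating codons in UCAG order.
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate from the first AUG until a stop codon or the end of the message."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        aa = CODON_TABLE[mrna[i:i + 3]]
        if aa == "*":  # stop codon
            break
        protein.append(aa)
    return "".join(protein)


def splice(pre_mrna, intron_start, intron_end):
    """Remove pre_mrna[intron_start:intron_end] and join the flanking exons."""
    return pre_mrna[:intron_start] + pre_mrna[intron_end:]


if __name__ == "__main__":
    exon1, intron, exon2 = "AUGGCUGAA", "GUAAGUCUAACAG", "CUGAAAUGGUAA"
    pre = exon1 + intron + exon2
    print(translate(splice(pre, 9, 22)))  # precise removal of the intron
    print(translate(splice(pre, 9, 21)))  # one intron nucleotide left behind
    print(translate(splice(pre, 9, 23)))  # one exon nucleotide removed
```

In this toy case, leaving one intron nucleotide behind changes every amino acid after the junction and reads through the original stop codon, while removing one extra nucleotide creates a premature stop codon at the junction.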
Pre-mRNA splicing involves the precise removal of introns from the primary RNA transcript. The splicing process is catalyzed by large complexes called spliceosomes, whose actions result in the splicing together of the two exons and the release of the intron in a lariat form.
Each spliceosome is composed of five subunits called snRNPs (for small nuclear ribonucleoproteins, pronounced “snurps”). Each snRNP is itself a complex of proteins and a special type of RNA found only in the nucleus called snRNA (small nuclear RNA). Spliceosomes recognize the 5′ end of the intron because introns almost always start with the nucleotides GU, and they recognize the 3′ end of the intron because introns almost always end with the nucleotides AG. The spliceosome cleaves the pre-mRNA’s sugar-phosphate backbone at the G that starts the intron and then covalently attaches that G to an internal A nucleotide within the intron. Then the spliceosome connects the 3′ end of the first exon to the 5′ end of the following exon, cleaving the 3′ end of the intron in the process. This results in the splicing together of the two exons and the release of the intron in a lariat form.
Figure: The snRNPs of the spliceosome were left out of this figure, but it shows the sites within the intron whose interactions are catalyzed by the spliceosome. Initially, the conserved G which starts an intron is cleaved from the 3′ end of the exon upstream of it and the G is covalently attached to an internal A within the intron. Then the 3′ end of the just-released exon is joined to the 5′ end of the next exon, cleaving the bond that attaches the 3′ end of the intron to its adjacent exon. This both joins the two exons and removes the intron in lariat form.
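The two-step splicing mechanism described above can be sketched as a function that checks the GU and AG boundary nucleotides and the internal branch-point A before joining the exons. This is a simplified model under stated assumptions: the function name and its index-based interface are hypothetical, the caller supplies the splice sites and branch point rather than having them discovered from sequence context, and the lariat is returned as a plain linear string because a string cannot represent the looped structure.

```python
def splice_intron(pre_mrna, five_prime_site, three_prime_site, branch_index):
    """Model the two splicing steps for a single intron.

    five_prime_site: index of the G that starts the intron.
    three_prime_site: index just past the AG that ends the intron.
    branch_index: index of the internal branch-point A (supplied by the caller).
    Returns (joined_exons, excised_intron).
    """
    intron = pre_mrna[five_prime_site:three_prime_site]
    if not intron.startswith("GU"):
        raise ValueError("intron does not start with GU")
    if not intron.endswith("AG"):
        raise ValueError("intron does not end with AG")
    if not five_prime_site < branch_index < three_prime_site - 2:
        raise ValueError("branch point must lie inside the intron")
    if pre_mrna[branch_index] != "A":
        raise ValueError("branch point must be an A")
    # Step 1: cut at the 5' splice site; in the cell, the intron's first G
    # is then attached to the branch-point A, forming the lariat loop.
    exon1 = pre_mrna[:five_prime_site]
    # Step 2: cut at the 3' splice site and join exon 1 to exon 2.
    exon2 = pre_mrna[three_prime_site:]
    return exon1 + exon2, intron


if __name__ == "__main__":
    # Made-up pre-mRNA: exon 1, a 13-nucleotide intron, exon 2.
    pre = "AUGGCUGAA" + "GUAAGUCUAACAG" + "CUGAAAUGGUAA"
    mrna, lariat = splice_intron(pre, 9, 22, 18)
    print(mrna)
    print(lariat)
```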