From Genes to Proteins

From Here to There 

If you have been reading this far, you may be thinking that DNA is a bunch of chemicals. You also might be thinking about who would win in a fight between a monkey with a knife and a dog with a laser. But, that is not relevant. Like the bunch of so-called "chemicals" in shampoo that add shine to your hair, or the bunch of chemicals in soft drinks that energize you for playing video games, chemicals can do a lot of things. 

DNA encodes information that determines how we look, what we can do, and how we live. If DNA were an instruction manual on how to make a human being, each page would be a gene. Genes are pieces of information that encode proteins that perform specific functions in a cell. Genes work individually or together to make us what we are, and understanding how genes work tells us more about ourselves. 

The Sex Lives of Peas and Flies 

Genes are the basic unit of inheritable information. Things like eye color, hair color, and blood type are examples of inheritable information. B-T-dubs, inheritable is the same as heritable; it is like flammable and inflammable. Genetics, or the study of the role of genes in heredity, began with the work of Austrian monk Gregor Mendel. Being a monk, Mendel had a lot of free time, so he passed his days in solitude growing pea plants. While growing certain plants, he noticed that he could predict what characteristics the offspring would have depending on how he mated the parent plants. Plant mating is exactly what you think it is, except in plants.

Mendel studied these characteristics, or traits, and found that certain traits, like flower color, seed color, seed shape, pod shape, and pod color, could be inherited from previous generations. This discovery led to the development of the field of genetics and introduced us to the concept of heredity, or the inheritance of traits that come from our parents.

If you thought that Gregor Mendel was a strange guy, you should meet the next major figure in genetics: Thomas Hunt Morgan. While Mendel liked to study pea plants, Morgan studied fruit flies. The more you study biology, the more you will learn that we have garnered a great deal of information from the weirdest sources, such as fruit flies, pea plants, jellyfish, worms, corn, and yeast. Morgan was the first person to show that genes are found on chromosomes. This discovery eventually led to the "one gene, one protein" hypothesis (first proposed as "one gene, one enzyme"). This hypothesis implies that each gene makes one protein, and that protein plays some role in the observable traits we see in offspring. We see you.



Like ice cream, genes come in many different flavors, and each flavor of a gene is called an allele. Unfortunately, alleles are lousy as ice cream flavors, and in the simplest cases come in only two kinds: dominant and recessive. The dominant allele is the flavor you always taste when it is in the cup, like chocolate ice cream, while the recessive allele only comes through when nothing dominant is there to cover it up, like the ever-coveted rhubarb mint chocolate chip flavor. Dominant does not mean more common, though; a recessive allele can be the more frequent one in a population. When parents mate, each provides one allele for each gene.



The combination of alleles in an individual is called a genotype. A genotype is either homozygous, if it has two dominant or two recessive alleles, or heterozygous, if it has one dominant and one recessive allele. We can easily predict the possible genotypes of offspring with a Punnett square, a grid that pairs each allele from one parent with each allele from the other. It is not a nerd who makes lousy puns, or a square punster (so punny).

Phenotype, or the trait we actually observe, is determined by genotype. Typically, any gene that has a dominant and a recessive allele will display the dominant phenotype if the genotype is homozygous dominant or heterozygous. The recessive phenotype is only seen when two recessive alleles are inherited.
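To see how a Punnett square turns parental genotypes into predictions, here is a minimal Python sketch. It assumes the usual textbook shorthand, which is not spelled out above: an uppercase letter stands for the dominant allele and a lowercase letter for the recessive one. The function names `punnett_square` and `phenotype` are made up for this illustration.

```python
from collections import Counter
from itertools import product


def punnett_square(parent1, parent2):
    """Count offspring genotypes from two parental genotypes such as 'Aa'."""
    # Each parent passes on one of its two alleles; pair up every combination.
    # Sorting puts the uppercase (dominant) letter first, so 'aA' becomes 'Aa'.
    offspring = ("".join(sorted(a + b)) for a, b in product(parent1, parent2))
    return Counter(offspring)


def phenotype(genotype):
    """Dominant phenotype if at least one dominant (uppercase) allele is present."""
    return "dominant" if any(allele.isupper() for allele in genotype) else "recessive"


# Cross two heterozygous parents.
square = punnett_square("Aa", "Aa")
print(square)

# Tally the phenotypes that those genotypes would display.
phenotypes = Counter()
for genotype, count in square.items():
    phenotypes[phenotype(genotype)] += count
print(phenotypes)
```

Crossing two heterozygous parents this way gives the classic 1:2:1 split of genotypes (AA, Aa, aa) and a 3:1 split of dominant to recessive phenotypes.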

The most notable exception to this two-allele rule would be blood type, where there are three alleles:
  • IA (A)
  • IB (B)
  • IO (O), often written as i

Take that, Genetics. Alleles A and B are called "codominant" because, if you inherit one of each, both show up and your blood type is AB. O is the recessive allele, so A paired with O gives type A, B paired with O gives type B, and only two copies of O give type O.
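The blood type rules are compact enough to write down as a lookup. The sketch below is only an illustration; the function name `blood_type` and the single-letter allele labels "A", "B", and "O" are assumptions made for the example.

```python
def blood_type(allele1, allele2):
    """Return the ABO blood type for a pair of alleles, each 'A', 'B', or 'O'."""
    alleles = {allele1, allele2}
    if alleles == {"A", "B"}:
        return "AB"  # codominant: both alleles show up
    if "A" in alleles:
        return "A"   # AA or AO, since O is recessive
    if "B" in alleles:
        return "B"   # BB or BO, since O is recessive
    return "O"       # only two O alleles give type O


for pair in [("A", "B"), ("A", "O"), ("B", "B"), ("O", "O")]:
    print(pair, "->", blood_type(*pair))
```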

Structure of Genes (Transcription) 

It is kind of amazing that a sequence of nucleotides that looks random can make a specific protein, which itself looks like just a random sequence of amino acids. However, for this event to happen, there must be an order to things. A method to the madness, if you will. Similar to the idea that every game you play needs rules, all proteins/genes must follow a certain structure and process. A gene is composed of a series of codons, or three-base sequences that each tell the cell to add a certain amino acid. All of the codons for a protein, read in order, make up a sequence called an open reading frame. The open reading frame is basically the sequence of the protein, written in nucleotides. In order to make the protein, a gene first needs to be transcribed, which is done by an enzyme that makes RNA. This enzyme, called RNA polymerase, needs proteins called transcription factors to bring it to the DNA.



Transcription factors function to recruit RNA polymerase to DNA by first binding DNA, and then binding RNA polymerase. Transcription factors are like wingmen bringing DNA and RNA polymerase together. They bind to a sequence of nucleotides just before the open reading frame, called the promoter. Once the right combination of transcription factors binds DNA, RNA polymerase comes in and transcription ensues. The minimum unit for a gene is a promoter and an open reading frame.
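The "promoter plus open reading frame" idea can be sketched in a few lines of Python. This is a toy model, not real molecular biology: the promoter motif `TATAAT`, the function name `transcribe`, and the example sequences are all assumptions for illustration, and real promoters are messier, with transcription starting a short distance downstream rather than immediately after the motif.

```python
PROMOTER = "TATAAT"  # toy promoter motif; an assumption for this sketch


def transcribe(dna):
    """Return the mRNA for the coding-strand DNA after the promoter, or None."""
    start = dna.find(PROMOTER)
    if start == -1:
        return None  # no promoter, so RNA polymerase is never recruited
    downstream = dna[start + len(PROMOTER):]
    # mRNA reads like the coding strand, with uracil (U) in place of thymine (T).
    return downstream.replace("T", "U")


print(transcribe("GGCTATAATATGGCCTAA"))  # has a promoter
print(transcribe("GGCATGGCCTAA"))        # no promoter
```

The first sequence contains the promoter, so everything after it is copied into RNA; the second has no promoter and is never transcribed.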




Three kinds of players control how much transcription happens:
  • Activators
  • Repressors (aka The Jerks)
  • Enhancers (aka Lazy Activators)


Transcription factors that promote transcription are called "activators." There are some shady proteins that bind DNA like other transcription factors but actually stop or reduce transcription. These are called "repressors," but they should be called "jerks." These proteins bind DNA and prevent RNA polymerase from binding DNA, or they can get in the way of RNA polymerase and derail it. Not very nice, if you ask us.

There are also stretches of DNA called "enhancers" that can sit upstream of the promoter, downstream of it, or even within the gene itself. They only function to raise the chance that transcription starts, by giving activator proteins an extra place to bind. Enhancers are basically lazy activators.

Bacteria and eukaryotes like being different, and so do their genes. Eukaryotic genes are laid out in a series of exons (nucleic acid sequences that will end up in the mature mRNA molecule) that are separated by sequences of DNA called introns (nucleic acid sequences that will eventually be removed). Once the first RNA copy of the gene, called pre-mRNA, is made, the introns are cut, or spliced, out and the exons are glued together. Fancy word, right there.

Note: Bacteria generally do not have introns: they have uninterrupted open reading frames. This situation is good for bacteria because they can make proteins quickly, with no splicing step. However, they lack the ability to make different proteins from the same gene by leaving out one or more exons, a trick called alternative splicing.
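Splicing is easy to picture as string surgery. In the minimal sketch below, the transcript, the exon coordinates, and the function name `splice` are all made up for illustration; exons are written in uppercase and introns in lowercase only to make them easy to spot.

```python
def splice(pre_mrna, exons):
    """Join the exon slices of a pre-mRNA; everything between them is dropped."""
    return "".join(pre_mrna[start:end] for start, end in exons)


# Made-up transcript: three exons (uppercase) separated by two introns (lowercase).
pre_mrna = "AUGGCC" + "guaagu" + "GAAUUU" + "guccag" + "CCCUAA"
exons = [(0, 6), (12, 18), (24, 30)]  # (start, end) positions of each exon

# Keep every exon.
full = splice(pre_mrna, exons)
# Leave out the middle exon to get a different splice product.
skipped = splice(pre_mrna, [exons[0], exons[2]])

print(full)
print(skipped)
```

Both products come from the same gene, but leaving out the middle exon gives a shorter mRNA and therefore a different protein. Bacteria, with no introns to remove, cannot play this trick.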

Say What? (Translation)

If you have ever watched the hit (term used loosely here, at least in reference to quality) Nicole Kidman film, The Interpreter, you know that translation is hard, especially when you are getting shot at. Cellular translation is like real-life translation: you take one language, and you make sense of it in another language—but cells need to do it without the help of Google Translate.

For the cell, the two languages we are dealing with are nucleic acid sequences and amino acid sequences. In the previous sections, we have shown that there is a lot of information in the specific order, or sequence, of nucleotides, and this information can be translated into sequences of amino acids. We are well on our way to ProteinLand.



Amino acid sequences are important because they determine the shape of a protein, and the shape determines what the protein can do. Depending on the sequence of amino acids, also referred to as the primary structure, you can get completely different functions for proteins. These protein functions can range from making fireflies glow to digesting lactose so that you do not get diarrhea every time you eat pizza and ice cream. Protein function is determined largely by the sequence of amino acids, which is determined by the sequence of nucleotides in a gene. It all comes together.



As you may have guessed, doing this whole translation business correctly the first time is important, and the cell has devised a brilliant system to translate: codons.

Codons are triplets of nucleotides (so cute) that determine which amino acid is added to the protein next. To "read" a DNA sequence, look at groups of 3 letters at a time to see what each one codes for. One concern in all translations is mistranslating something, like when KFC translated "Finger-lickin' good!" to "Eat your fingers off!" in Chinese. Imagine if that happened with your proteins! Instead of being able to digest pizza, you would get diarrhea and glow like a firefly. That is not as awesome as it sounds.
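Reading three letters at a time can be sketched in Python like so. The codon table is the standard genetic code packed into one string of one-letter amino acid codes. The function name `translate` is made up for this example, and the sketch relies on two details not covered above: translation begins at the start codon AUG and ends at a stop codon.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for all 64 codons, in UCAG order; '*' is a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Read mRNA three letters at a time, from the first AUG to the first stop."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, no protein
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break  # stop codon: the protein is finished
        protein.append(amino_acid)
    return "".join(protein)


print(translate("AUGGCCGAAUUUUAA"))  # prints MAEF
```

Here AUG, GCC, GAA, and UUU are read as methionine (M), alanine (A), glutamate (E), and phenylalanine (F), and UAA tells the cell to stop.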



Here is the part where we admire codons for their brilliance. Because most amino acids are encoded by more than one codon, some errors in the transcription and DNA replication processes do no harm at all. Errors that do not affect the amino acid sequence are called "synonymous" mutations because the amino acid sequence is identical even though the nucleic acid sequence is different. Errors that change the amino acid sequence, including substitutions, insertions, and deletions, are called "nonsynonymous" mutations. Smart, those codons.
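The difference between the two kinds of mutation can be checked with a small sketch. It handles only single-letter substitutions within one codon, uses the standard genetic code, and the function name `classify_substitution` is an assumption made for this illustration.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for all 64 codons, in UCAG order; '*' is a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_substitution(codon, position, new_base):
    """Label a single-base change in one codon as synonymous or nonsynonymous."""
    mutant = codon[:position] + new_base + codon[position + 1:]
    same_amino_acid = CODON_TABLE[codon] == CODON_TABLE[mutant]
    return "synonymous" if same_amino_acid else "nonsynonymous"


# GCC -> GCA: both codons mean alanine, so the protein is unchanged.
print(classify_substitution("GCC", 2, "A"))
# GAA -> GUA: glutamate becomes valine, so the protein changes.
print(classify_substitution("GAA", 1, "U"))
```

Changes at the third position of a codon are the ones most likely to be synonymous, which is where much of the error tolerance described above comes from.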

Transcription and translation are the processes that make proteins, and the protein that is made from a certain gene generally is the "trait" that geneticists define. Oftentimes, traits are not as easy to perceive as hair color, or even blood type, and many of these unobservable traits have some involvement in diseases. For more detail on making proteins from genes, see the Transcription and Translation unit.