Basic concepts of molecular biology
Gabriella Trucco
Email: gabriella.trucco@unimi.it
Life
The main actors in the chemistry of life are two kinds of molecules:
proteins
nucleic acids
Proteins: many different kinds (structural proteins, enzymes, …)
Nucleic acids: two kinds, ribonucleic acid (RNA) and deoxyribonucleic acid (DNA)
Proteins
A protein is a chain of simpler molecules called amino acids
Amino acid: one central carbon atom bonded to
a hydrogen atom
an amino group (NH2)
a carboxyl group (COOH)
a side chain, which differs between amino acids
Proteins
There are 20 common amino acids (aa’s); two systems of abbreviations are used:
3-letter-code and 1-letter-code. We usually use the 1-letter-code.
alanine Ala A
arginine Arg R
asparagine Asn N
aspartic acid Asp D
cysteine Cys C
glutamine Gln Q
glutamic acid Glu E
glycine Gly G
histidine His H
isoleucine Ile I
leucine Leu L
lysine Lys K
methionine Met M
phenylalanine Phe F
proline Pro P
serine Ser S
threonine Thr T
tryptophan Trp W
tyrosine Tyr Y
valine Val V
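The two abbreviation systems can be mapped onto each other mechanically. The following is a minimal sketch, assuming the table above is stored as a Python dictionary; the names `AA_CODES` and `three_to_one` are hypothetical and chosen only for illustration.

```python
# Three-letter to one-letter codes, copied from the table above.
AA_CODES = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def three_to_one(residues):
    """Convert a list of 3-letter codes into a 1-letter-code string."""
    return "".join(AA_CODES[r] for r in residues)


print(three_to_one(["Met", "Ala", "Trp"]))  # prints MAW
```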
Proteins
Peptide bonds
Backbone
Primary, secondary, tertiary, and quaternary structures of proteins
DNA: nucleotides
Double chain
Single chain: strand
Basic unit (nucleotide):
Sugar
Phosphate group
Nitrogenous base
4 bases: A, C, G, T (adenine, cytosine, guanine, thymine)
(bases, nucleotides)
DNA: nucleotides
5’ ...AACAGTACCATGCTAGGTCAATCGA...3’
3’ ...TTGTCATGGTACGATCCAGTTAGCT...5’
orientation (read from 5’ to 3’ end)
length measured in bp (base pairs)
double stranded, the two strands are antiparallel
A - T and C - G complementary (Watson-Crick pairs)
DNA as string of letters, each letter representing a base.
"string-view" of DNA: one of the strings on top of the other
reverse complementation: (ACCTG)rc = CAGGT
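Reverse complementation follows directly from the two rules above (Watson-Crick pairing and antiparallel strands): complement every base, then read the result backwards. A minimal sketch, assuming the input is an uppercase string over A, C, G, T; the names `COMPLEMENT` and `reverse_complement` are illustrative.

```python
# Watson-Crick pairs: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(dna):
    """Return the opposite strand, read in its own 5' to 3' direction."""
    return "".join(COMPLEMENT[base] for base in reversed(dna))


print(reverse_complement("ACCTG"))  # prints CAGGT

# The two strands of the string view above are reverse complements:
top = "AACAGTACCATGCTAGGTCAATCGA"      # written 5' to 3'
bottom = "TTGTCATGGTACGATCCAGTTAGCT"   # written 3' to 5'
print(reverse_complement(top) == bottom[::-1])  # prints True
```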
RNA: nucleotides
Like DNA, except:
4 characters: A C U G: adenine, cytosine, uracil, guanine (U instead of T)
Genes
Certain contiguous stretches along DNA encode information for building proteins, but others do not
A stretch that encodes a protein is known as a gene
Protein: chain of amino acids
Triplets of nucleotides specify each amino acid
Each nucleotide triplet is called a codon
Genetic code: table that gives the correspondence between each
possible triplet and each amino acid
The genetic code
Degeneracy of the genetic code: 64 codons but only 20 aa’s plus the stop signal, so most aa’s are encoded by more than one codon
Silent mutations: if the third position of a codon mutates, this often does not alter the aa
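The degeneracy can be checked by counting. The sketch below builds the standard genetic code as a dictionary, using the conventional T, C, A, G ordering of the table, and counts how many codons map to each amino acid. The names `CODON_TABLE`, `degeneracy`, and `is_silent` are hypothetical, and `*` is used here as a marker for stop codons.

```python
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Number of codons per amino acid ('*' marks a stop codon).
degeneracy = Counter(CODON_TABLE.values())
print(len(CODON_TABLE))   # prints 64
print(degeneracy["L"])    # prints 6 (leucine)
print(degeneracy["M"])    # prints 1 (methionine)
print(degeneracy["*"])    # prints 3 (stop codons)


def is_silent(codon_a, codon_b):
    """True if both codons encode the same amino acid."""
    return CODON_TABLE[codon_a] == CODON_TABLE[codon_b]


print(is_silent("GGT", "GGC"))  # prints True: both encode glycine
print(is_silent("ATG", "ATA"))  # prints False: methionine vs isoleucine
```

The last line shows why the statement says "often" rather than "always": a change in the third position can still alter the amino acid.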
The central dogma of molecular biology
How the information in the DNA results in proteins
Promoter – AUG
Transcription: a copy of the gene is made on an RNA molecule (messenger RNA, or mRNA).
The resulting RNA has exactly the same sequence as one of the strands of the gene, but with U substituted for T
The strand identical to the mRNA is called the coding strand
The other strand (the one used as the template for transcription) is called the template strand
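In the string view, the relation between the two strands and the mRNA can be sketched as follows. This is an illustration under the assumption that both strands are given as uppercase strings written 5' to 3'; the function names are hypothetical and the example gene fragment is made up.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def mrna_from_coding(coding):
    """The mRNA equals the coding strand with U substituted for T."""
    return coding.replace("T", "U")


def mrna_from_template(template):
    """The template strand is the reverse complement of the coding
    strand, so recover the coding strand first, then substitute U for T."""
    coding = "".join(COMPLEMENT[base] for base in reversed(template))
    return coding.replace("T", "U")


# Made-up example: both strands of the same short stretch, each 5' to 3'.
print(mrna_from_coding("ATGGCC"))    # prints AUGGCC
print(mrna_from_template("GGCCAT"))  # prints AUGGCC
```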
The central dogma of molecular biology
Protein synthesis takes place inside cellular structures called ribosomes
tRNA: transfer RNA
Translation: tRNAs make the connection between a codon and the specific amino acid this codon codes for.
Each tRNA molecule has, on one side, a conformation that has high affinity for a specific codon and, on the other side, a conformation that binds easily to the corresponding amino acid.
As the messenger RNA passes through the ribosome, a tRNA matching the current codon binds to it, bringing the corresponding amino acid
When a STOP codon appears, no tRNA associates with it, and the synthesis ends
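As a string operation, translation amounts to reading the mRNA codon by codon and looking each codon up in the genetic code until a stop codon is reached. The sketch below is a simplification under stated assumptions: it starts reading at the first character of the given string and uses the standard genetic code; `RNA_CODON_TABLE` and `translate` are hypothetical names, and `*` marks stop codons.

```python
from itertools import product

BASES = "UCAG"
# Standard genetic code, codons ordered UUU, UUC, UUA, UUG, UCU, ...
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
RNA_CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate an mRNA string codon by codon, ending at a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = RNA_CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: no tRNA matches, synthesis ends
            break
        protein.append(amino_acid)
    return "".join(protein)


print(translate("AUGGCCUGGUAA"))  # prints MAW
```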
Strings in molecular biology
Many other problems in molecular biology can be modelled by strings (e.g. gene order, haplotypes, …)
Strings (also called sequences) are finite sequences of characters over an alphabet Σ
DNA (characters: nucleotides) Σ = {A,C,G,T}
RNA (characters: nucleotides) Σ = {A,C,G,U}
Proteins (characters: amino acids) Σ = {A,C,D,E,F,…,W,Y}
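The alphabets listed above translate directly into sets of characters, and checking whether a string is a valid sequence of a given type becomes a subset test. A minimal sketch; the names `DNA`, `RNA`, `PROTEIN`, and `is_string_over` are chosen for illustration, and the protein alphabet is taken to be the 20 one-letter amino acid codes.

```python
DNA = set("ACGT")
RNA = set("ACGU")
PROTEIN = set("ACDEFGHIKLMNPQRSTVWY")  # the 20 one-letter amino acid codes


def is_string_over(sequence, alphabet):
    """True if every character of the sequence belongs to the alphabet."""
    return set(sequence) <= alphabet


print(is_string_over("AACAGTACC", DNA))  # prints True
print(is_string_over("AACAGUACC", DNA))  # prints False: U is not a DNA character
print(is_string_over("AACAGUACC", RNA))  # prints True
```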
Reading frames
Reading frame: one of the three possible ways of grouping bases to form
codons in a DNA or RNA sequence
3 different reading frames for translation: The DNA sequence
5’ ...TATTCGAATCGGC...3’
can be translated in 3 different ways, leading to different aa sequences
Exercise
Translate this DNA sequence according to the 3 different reading frames:
5’ ...TATTCGAATCGGC...3’
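To check an answer to this kind of exercise, the three reading frames can be enumerated programmatically. The following is a sketch, assuming the standard genetic code, reading only the given strand from 5' to 3', and ignoring incomplete codons at the end; `DNA_CODON_TABLE` and `translate_frames` are hypothetical names, and `*` would mark a stop codon.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
DNA_CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_frames(dna):
    """Return the translations of the three reading frames of one strand.

    Frame k (k = 0, 1, 2) starts grouping codons at position k;
    trailing bases that do not fill a codon are dropped.
    """
    frames = []
    for offset in range(3):
        codons = [dna[i:i + 3] for i in range(offset, len(dna) - 2, 3)]
        frames.append("".join(DNA_CODON_TABLE[c] for c in codons))
    return frames


for offset, protein in enumerate(translate_frames("TATTCGAATCGGC")):
    print(f"frame {offset + 1}: {protein}")
```

Running the script prints one amino acid string per frame, in 1-letter code, which can be compared with a translation done by hand.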