
PART V Analysis of Genetic Information

CHAPTER

Digital Analysis of DNA

http://www.accessexcellence.org/RC/AB/IE/Ethical_Issues_of_the_HGP.php https://genographic.nationalgeographic.com/

Copyright The McGraw-Hill Companies, Inc. Permission required to reproduce or display Hartwell et al., 4th ed., Chapter 9

The human karyotype: Banding distinguishes the chromosomes


Photos (upper) and ideograms (lower) of stained human chromosomes at metaphase
Autosomes are numbered in order of descending length
Short arm is "p"
Long arm is "q"
Fig. 10.2a, b
Copyright The McGraw-Hill Companies, Inc. Permission required to reproduce or display Hartwell et al., 4th edition, Chapter 10

The genome contains distinct types of gene organization


Gene families

Closely-related genes (paralogs) that are members of multi-gene families


Can be clustered together or dispersed on several chromosomes
Example in human genome: olfactory receptor (OR) genes arose from multiple duplication events followed by divergence to create 1000 paralogous genes (see Fig 10.11)
Other examples: genes that encode histones, hemoglobins (see Chapter 9), immunoglobulins, actins, collagens, and heat-shock proteins

The genome contains distinct types of gene organization (cont)


Gene-rich regions
Chromosomal regions that have many more genes than expected from average gene density over the entire genome
Example in human genome: class III region of the major histocompatibility complex (Fig. 10.12)

Gene deserts
Regions of >1 Mb that have no identifiable genes
Gene deserts make up 3% of the human genome

Do they exist simply because the genes are hard to identify (e.g. big genes)?

Biological significance of gene-rich regions and gene deserts is not known



The information gained from DNA is one-dimensional and digital


Biological information is encoded in the nucleotide sequence of DNA and each unit of information is discrete

DNA sequence can be handled by computers


Automated DNA sequencers can sequence about 10^6 base pairs/day
New technologies can sequence even more DNA per day


How to analyze a simple biological system

1. Restriction enzymes
2. Gel electrophoresis
3. Molecular cloning: isolate, amplify and purify fragments
4. Use of probes to identify similar sequences
5. Polymerase Chain Reaction (PCR)
6. Sequencing


Recombinant DNA Technology Tools

Characterizing DNA molecules through analysis of their sequences, not through a phenotype

1. Ligases
2. Polymerases
3. Replication
4. Hybridization of complementary single-stranded sequences

Where do we begin?

1. A genomic library
a collection of clones containing every sequence in the whole genome
Digest with restriction enzymes
Ligate
Transform

2. A cDNA library
DNA copied from all of the RNA transcripts in the tissue/cell of interest

Libraries are collections of cloned fragments


Genomic library
Long-lived collection of cellular clones that contains copies of every sequence in the whole genome inserted into a suitable vector
cDNA library
Long-lived collection of cellular clones that contains copies of every mRNA expressed in a particular tissue or condition inserted into a suitable vector
Series of in vitro reactions used to make cDNA copies of mRNA

Genomic libraries
Complete genomic library
Collection of clones that contain one copy of every sequence in the entire genome
Genomic equivalent: number of clones in a perfect library
To determine the number of clones needed, divide the length of the genome by the average size of insert fragments

Impossible to obtain a perfect library


Usually libraries are made that have four to five genomic equivalents
Gives an average of four or five clones for each locus (95% probability that each locus is present at least once)
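The clone-count arithmetic above can be sketched in Python. The 3 billion bp genome and 20 kb insert size are figures used elsewhere in these slides. The probability function assumes that clones sample the genome independently and uniformly at random, which is an idealization not stated in the slides, and the function names are illustrative only.

```python
import math

def genomic_equivalent(genome_bp: int, insert_bp: int) -> int:
    """Clones in a 'perfect' library: genome length / average insert size."""
    return math.ceil(genome_bp / insert_bp)

def prob_locus_present(n_clones: int, genome_bp: int, insert_bp: int) -> float:
    """Chance that a given locus is in at least one clone.

    ASSUMPTION: every clone is an independent, uniformly random draw
    from the genome. Real libraries show cloning bias, so this is an
    idealized estimate only.
    """
    fraction = insert_bp / genome_bp          # share of the genome in one clone
    return 1.0 - (1.0 - fraction) ** n_clones

GENOME_BP = 3_000_000_000   # genome size used elsewhere in the slides
INSERT_BP = 20_000          # 20 kb inserts, as on the comparison slide

one_equivalent = genomic_equivalent(GENOME_BP, INSERT_BP)
for coverage in (1, 4, 5):
    n = coverage * one_equivalent
    p = prob_locus_present(n, GENOME_BP, INSERT_BP)
    print(f"{coverage} genomic equivalent(s) = {n:,} clones, P(locus present) = {p:.3f}")
```

Under this idealized model a single genomic equivalent contains a given locus only about 63% of the time, which is why several equivalents are needed. The idealized figures for four to five equivalents come out above the 95% quoted on the slide; the sketch makes no attempt to model cloning bias.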

A comparison of genomic and cDNA libraries


Random 100 kb genomic fragment

Clones from a genomic library with 20 kb inserts that are homologous to this region

Clones from cDNA libraries


Converting RNA transcripts to cDNA: Obtaining mRNA from red blood cell precursors
Eukaryotic mRNAs have poly-A tails at the 3' end

mRNAs are purified by affinity to oligo(dT): single-stranded DNA fragments of 20 nucleotides made of dT only


Converting RNA transcripts to cDNA (cont): Synthesis of hybrid cDNA-mRNA molecule


In vitro synthesis using reverse transcriptase (an RNA-dependent DNA polymerase) + dATP + dGTP + dTTP + dCTP
Prime DNA synthesis using oligo(dT)


Creating the second DNA strand complementary to the first cDNA strand
mRNA is digested with RNase
3' end of cDNA folds back and acts as a primer for second-strand synthesis
In the presence of dNTPs and DNA polymerase, the first cDNA strand acts as a template for synthesis of the second cDNA strand
Double-stranded cDNA can be cloned into a plasmid
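The sequence relationships between the mRNA and the two cDNA strands can be sketched as follows. This is a sequence-level illustration only: the transcript is made up, the function names are hypothetical, and the hairpin priming step is not modeled.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}

def first_strand_cdna(mrna: str) -> str:
    """First cDNA strand, written 5'-to-3'.

    Reverse transcriptase copies the mRNA template into DNA, so the
    product is the reverse complement of the mRNA with T in place of U.
    """
    as_dna = mrna.upper().replace("U", "T")
    return "".join(COMPLEMENT[base] for base in reversed(as_dna))

def second_strand_cdna(first_strand: str) -> str:
    """Second cDNA strand: reverse complement of the first strand."""
    return "".join(COMPLEMENT[base] for base in reversed(first_strand.upper()))

# Hypothetical transcript: a short coding region followed by a poly-A tail
mrna = "AUGGCCUAA" + "A" * 7
first = first_strand_cdna(mrna)    # starts with the oligo(dT)-primed T run
second = second_strand_cdna(first)

print("mRNA         5'", mrna, "3'")
print("first cDNA   5'", first, "3'")
print("second cDNA  5'", second, "3'")
```

The second strand has the same sequence as the mRNA, with T instead of U, which is why a double-stranded cDNA clone carries the coding information of the original transcript.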

Fig. 9.8c, d

Restriction enzymes fragment the genome at specific sites


Each restriction enzyme recognizes a specific sequence of bases anywhere within the genome
Cuts sugar-phosphate backbones of both strands
Restriction fragments are generated by digestion of DNA with restriction enzymes

Hundreds of restriction enzymes now available

Recognition sites for restriction enzymes are usually 4-8 bp of double-strand DNA (see Table 9.1)
Often palindromic: base sequences of each strand are identical when read 5'-to-3'
Each enzyme cuts at the same place relative to its specific recognition sequence (Figure 9.2)
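A short Python sketch of the two ideas above: checking that a recognition site is palindromic, and cutting a sequence at a fixed position relative to each site. EcoRI (G^AATTC) is used as the example enzyme; the test sequence is made up and the function names are illustrative.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, written 5'-to-3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def is_palindromic(site: str) -> bool:
    """True if both strands of the site read the same 5'-to-3'."""
    return site.upper() == reverse_complement(site)

def digest(seq: str, site: str, cut_offset: int) -> list[str]:
    """Cut the top strand of a linear molecule at every occurrence of `site`.

    `cut_offset` is the number of bases into the site at which the top
    strand is cut (1 for EcoRI, G^AATTC).
    """
    seq, site = seq.upper(), site.upper()
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + cut_offset])
        start = pos + cut_offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments

ECORI_SITE, ECORI_CUT = "GAATTC", 1
# Made-up sequence containing two EcoRI sites
dna = "TT" + ECORI_SITE + "AAGG" + ECORI_SITE + "CC"

print(is_palindromic(ECORI_SITE))
print(digest(dna, ECORI_SITE, ECORI_CUT))
```

Only the top strand is tracked here, so the overhang on the complementary strand is implied rather than shown.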

Ten commonly used restriction enzymes


Restriction enzymes produce restriction fragments with either blunt or sticky ends
Blunt ends: cuts are straight through both DNA strands at the line of symmetry
Sticky ends: cuts are displaced equally on either side of the line of symmetry
Ends have either 5' overhangs or 3' overhangs


Different restriction enzymes produce fragments of different length


Average fragment length is 4^n, where n is the number of bases in the recognition site
4-base recognition site occurs every 4^4 bp; average restriction fragment size is 256 bp
3 billion bp genome/256 = about 12 million fragments

6-base recognition site occurs every 4^6 bp; average restriction fragment size is about 4100 bp (4.1 kb)
3 billion bp genome/4100 = about 700,000 fragments

8-base recognition site occurs every 4^8 bp; average restriction fragment size is about 65,500 bp (65.5 kb)
3 billion bp genome/65,500 = about 46,000 fragments
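The figures above follow from the 4^n rule and can be reproduced with a few lines of Python. The sketch assumes, as the slide does implicitly, that the four bases are equally frequent and independent at every position; the function name is illustrative.

```python
GENOME_BP = 3_000_000_000  # genome size used on the slide

def expected_fragment_length(site_length: int) -> int:
    """Average spacing of an n-base recognition site, assuming the four
    bases are equally likely and independent at each position."""
    return 4 ** site_length

for n in (4, 6, 8):
    avg = expected_fragment_length(n)
    fragments = GENOME_BP // avg
    print(f"{n}-base site: one site per {avg:,} bp, roughly {fragments:,} fragments")
```

The slide's values (12 million, 700,000 and 46,000 fragments) are rounded versions of these results.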

Gel electrophoresis distinguishes DNA fragments according to size

Preparing an agarose gel for electrophoresis


Gel electrophoresis distinguishes DNA fragments according to size (cont)


Load DNA samples into wells in gel, place gel in buffered aqueous solution, and apply electric current
Electrophoresis: movement of charged particles in an electric field
DNA has negative charge, so it moves toward the positive electrode


Gel electrophoresis distinguishes DNA fragments according to size (cont)


With linear DNA fragments, migration distance through gel depends on size
After electrophoresis, visualize DNA fragments by staining gel with fluorescent dye, and photograph gel under UV light

Determine size of unknown fragments by comparison to migration of DNA markers of known size
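Sizing by comparison with markers can be sketched as an interpolation. The example below assumes that, between neighbouring markers, log10(size) falls linearly with migration distance, which is a common working approximation rather than something stated in the slides. The ladder values and the function name are hypothetical.

```python
import math

def estimate_size(distance_mm: float, markers: list[tuple[float, int]]) -> float:
    """Estimate fragment size (bp) from migration distance.

    `markers` holds (distance_mm, size_bp) pairs for the size ladder.
    ASSUMPTION: log10(size) is linear in distance between adjacent markers.
    """
    ladder = sorted(markers)                       # shortest distance first
    for (d1, s1), (d2, s2) in zip(ladder, ladder[1:]):
        if d1 <= distance_mm <= d2:
            t = (distance_mm - d1) / (d2 - d1)
            log_size = math.log10(s1) + t * (math.log10(s2) - math.log10(s1))
            return 10 ** log_size
    raise ValueError("distance lies outside the marker range")

# Hypothetical ladder: (distance migrated in mm, known size in bp)
LADDER = [(10.0, 10_000), (20.0, 1_000), (30.0, 100)]

print(round(estimate_size(15.0, LADDER)))
```

Distances outside the ladder raise an error instead of being extrapolated, since the approximation is least reliable at the extremes of a gel.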


Different types of gels separate different-sized DNA molecules

Polyacrylamide gels (left) separate small fragments

Agarose gels (right) separate larger fragments


Restriction maps provide sequence-specific landmarks in the DNA terrain


Restriction maps show the relative orders and distances between multiple restriction sites
Construction of restriction map
Digest DNA sample with different restriction enzymes, single digests vs double digests
Run gel and determine fragment sizes for each digest
Deduce arrangement of restriction sites by process of elimination


Deducing a restriction map


(a) Do single and double digests with two restriction enzymes

(b) Load each digest into gel along with size markers

(c) Use process of elimination to derive the only possible arrangement that accounts for all the observed fragments
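The process of elimination in step (c) can be sketched as a brute-force search, assuming a linear DNA molecule and exact fragment sizes. The fragment sizes in the example are invented, and the function names are illustrative; this is not the procedure shown in the figure, only one way to mimic it.

```python
from itertools import permutations

def cut_positions(fragment_order):
    """Cut sites implied by laying the fragments end to end in this order."""
    positions, total = [], 0
    for size in fragment_order[:-1]:
        total += size
        positions.append(total)
    return positions

def fragments_from_cuts(cuts, length):
    """Sorted fragment sizes from cutting a linear molecule at `cuts`."""
    edges = [0] + sorted(cuts) + [length]
    return sorted(b - a for a, b in zip(edges, edges[1:]))

def deduce_maps(frags_a, frags_b, frags_double):
    """Try every ordering of the single-digest fragments and keep the
    arrangements whose predicted double digest matches the observed one."""
    length = sum(frags_a)
    target = sorted(frags_double)
    solutions = set()
    for order_a in set(permutations(frags_a)):
        for order_b in set(permutations(frags_b)):
            cuts_a, cuts_b = cut_positions(order_a), cut_positions(order_b)
            if fragments_from_cuts(cuts_a + cuts_b, length) == target:
                solutions.add((tuple(cuts_a), tuple(cuts_b)))
    return solutions

# Hypothetical 10 kb linear fragment, sizes in kb
maps = deduce_maps(frags_a=[3, 7], frags_b=[8, 2], frags_double=[3, 5, 2])
for sites_a, sites_b in sorted(maps):
    print("enzyme A sites:", sites_a, "enzyme B sites:", sites_b)
```

The search returns two answers that are mirror images of each other, because fragment sizes alone cannot tell which end of a linear molecule is which.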

Cloning fragments of DNA

Genomes of animals, plants, and microorganisms are too large to analyze using simple techniques such as gel electrophoresis and restriction mapping
Cloning is a means to purify a specific DNA fragment away from all other fragments, and make many identical copies of the fragment
The cloned fragment can then be analyzed by restriction mapping and DNA sequencing


pBR322 Cloning Vector


Two strategies to purify and amplify individual fragments of DNA


Molecular cloning

Purification and amplification of previously uncharacterized DNA


Cut DNA and insert fragments of specific sizes into vectors
Transport vector-insert molecules into living cells that make many copies of the recombinant vector
Clones have amplified sets of purified DNA molecules

Polymerase chain reaction

Purification and amplification of previously sequenced genomic regions

Hybridization is used to identify similar DNA sequences


Complementary single-stranded DNA or RNA will base pair and form stable double helices
Hybridization probes can be from cloned fragments of DNA, PCR products, or chemically synthesized
Probes are labeled with radioactive or fluorescent tag
Complementary region must be sufficiently long and accurate to produce a large enough number of H bonds
Cohesive force formed by large numbers of H bonds counteracts thermal forces that disrupt the double helix

Hybridization can be DNA/DNA, DNA/RNA, or RNA/RNA
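To make the point about length and accuracy concrete, the sketch below counts the hydrogen bonds in a DNA/DNA duplex, using 2 bonds per A-T pair and 3 per G-C pair. It handles DNA only (no U), treats any mismatch as contributing no bonds, and uses illustrative names and sequences.

```python
H_BONDS = {("A", "T"): 2, ("T", "A"): 2, ("G", "C"): 3, ("C", "G"): 3}

def duplex_h_bonds(probe: str, target: str) -> int:
    """Hydrogen bonds formed when `probe` pairs antiparallel with `target`.

    Both strands are written 5'-to-3'. Mismatched positions count as 0,
    which is a simplification.
    """
    if len(probe) != len(target):
        raise ValueError("probe and target must be the same length")
    pairs = zip(probe.upper(), reversed(target.upper()))
    return sum(H_BONDS.get(pair, 0) for pair in pairs)

perfect = duplex_h_bonds("GAATTC", "GAATTC")     # self-complementary site
mismatched = duplex_h_bonds("GAATTC", "GAATTG")  # one mismatch at the end
print(perfect, mismatched)
```

A longer perfectly matched region adds bonds in proportion to its length, while each mismatch removes the two or three bonds that pair would have contributed.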



How to make oligonucleotide probes for screening a library


Automated DNA synthesizer is used to synthesize specified oligonucleotides of defined length and sequence

Reverse translation: generating a degenerate DNA sequence that contains all possible codons for a specific amino acid sequence
Fig. 9.10b
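Reverse translation can be sketched with the standard genetic code. The code below builds a table of codons for each amino acid, counts how many different oligonucleotides could encode a peptide, and lists them. The peptides are arbitrary examples and the function names are illustrative.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons in the order TTT, TTC, TTA, TTG, TCT, ...
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODONS_FOR = {}
for codon, aa in zip(("".join(c) for c in product(BASES, repeat=3)), AMINO_ACIDS):
    CODONS_FOR.setdefault(aa, []).append(codon)

def degeneracy(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the peptide."""
    count = 1
    for aa in peptide.upper():
        count *= len(CODONS_FOR[aa])
    return count

def reverse_translate(peptide: str) -> list[str]:
    """Every DNA sequence that encodes the peptide (one-letter codes)."""
    choices = [CODONS_FOR[aa] for aa in peptide.upper()]
    return ["".join(codons) for codons in product(*choices)]

print(degeneracy("MWHK"))          # Met-Trp-His-Lys
print(reverse_translate("MK"))     # Met-Lys
```

Peptide stretches rich in methionine and tryptophan, which have one codon each, keep the probe mixture small; leucine, serine and arginine, with six codons each, expand it quickly.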

Southern blots allow visualization of rare DNA fragments in complex samples


Cut genomic DNA with restriction enzyme(s) and separate DNA fragments by electrophoresis on agarose gel


Southern blots allow visualization of rare DNA fragments in complex samples (cont)
DNA is transferred from gel to nitrocellulose membrane by blotting
DNA fragments on the membrane (blot) are in the same migration pattern as in the gel


Southern blots allow visualization of rare DNA fragments in complex samples (cont)
After electrophoresis, the gel is treated with NaOH to denature the DNA; after transfer, the blot is treated with UV light and high temperature to attach the single-stranded DNA to the membrane


Southern blots allow visualization of rare DNA fragments in complex samples (cont)
After hybridization of probe to the blot, autoradiography reveals fragments in restriction digests that have sequences complementary to the probe


PCR generates copies of target DNA


Polymerase chain reaction (PCR) first developed in 1985
Faster, less expensive, and more flexible way to amplify specific fragments of DNA than molecular cloning
Extremely efficient: can amplify DNA from a single cell or from some archaeological samples
Oligonucleotides are designed from previously known DNA sequence and serve as primers for DNA synthesis
Target sequence located between the primer sequences is exponentially amplified by 25-30 cycles of DNA synthesis

Two oligonucleotide primers (16-26 nt) are needed for PCR
Region between the two primers will be synthesized
One primer is complementary to one strand of DNA at one end of the target region
The other primer is complementary to the other strand of DNA at the other end of the target region
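The relationship between the two primers and the target can be sketched as follows. The template sequence is made up, the function names are illustrative, and practical considerations such as melting temperature and GC content are deliberately ignored.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Sequence of the opposite strand, written 5'-to-3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def design_primers(top_strand: str, start: int, end: int, primer_len: int = 20):
    """Primers flanking top_strand[start:end], both written 5'-to-3'.

    The forward primer copies the left end of the target on the top
    strand, so it base-pairs with the bottom strand. The reverse primer
    is the reverse complement of the right end, so it base-pairs with
    the top strand.
    """
    target = top_strand[start:end].upper()
    if len(target) < 2 * primer_len:
        raise ValueError("target is too short for two non-overlapping primers")
    forward = target[:primer_len]
    reverse = reverse_complement(target[-primer_len:])
    return forward, reverse

# Made-up 20-base target; 5-base primers keep the example readable
template = "ATGCCGTTAGCAAGGCTTAC"
forward, reverse = design_primers(template, 0, len(template), primer_len=5)
print("forward 5'", forward, "3'")
print("reverse 5'", reverse, "3'")
```

The default `primer_len` of 20 falls within the 16-26 nt range given on the slide; the 5-base primers in the example are for illustration only.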


PCR consists of repeated cycles of DNA synthesis, with three steps in each cycle


The three steps in each cycle of PCR

(1) Denature strands

(2) Base pairing of primers

(3) Polymerization from primers along templates


Exponential increase in the amount of target DNA during PCR
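The exponential increase can be put in numbers with a simplified doubling model. The `efficiency` parameter is an assumption added for illustration (1.0 means every template is copied in every cycle), and the model ignores the longer-than-target products made in the first cycles.

```python
def copies_after(cycles: int, starting_copies: int = 1, efficiency: float = 1.0) -> float:
    """Idealized number of target copies after a given number of cycles.

    Each cycle multiplies the copy number by (1 + efficiency), so perfect
    efficiency gives doubling per cycle.
    """
    return starting_copies * (1.0 + efficiency) ** cycles

for n in (1, 2, 3, 25, 30):
    print(f"{n:>2} cycles: {copies_after(n):,.0f} copies from one template")
```

With perfect doubling, the 25-30 cycles mentioned earlier take a single template molecule to between roughly 34 million and 1 billion copies.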


Some of the uses of PCR


PCR fragments can be labeled to produce hybridization probes and can be sequenced
Genotype detection and gene mapping
Determine evolutionary relationships of living and extinct species
Study genetic variation and changes in nucleotide sequence in groups of individuals over time

Detection of infectious diseases (e.g. HIV)


PCR Videos
http://www.youtube.com/watch?v=eEcy9k_KsDI&feature=player_embedded

http://www.youtube.com/watch?v=HMC7c2T8fVk&feature=player_embedded

http://www.youtube.com/watch?v=ZmqqRPISg0g&feature=player_embedded


Automated DNA sequencing


Each ddNTP is labeled with a different color fluorescent dye and all four are used in a single synthesis reaction

All four ddNTP reactions are run together in a single lane on a gel
After electrophoresis, fragments flow through a fluorescence detector and the color of the fragment is digitally recorded

Fluorescent bands in an automated sequencing gel

Each lane displays the sequence obtained from a separate DNA sample and primer
Each fragment has terminated with a specific ddNTP labeled with a specific fluorescence


Chromatogram and inferred DNA sequence from automated Sanger sequencing


Computer reads the sequence complementary to the template strand
Sequence is read from left to right (5'-to-3' synthesis from primer)
Ambiguity in sequence is recorded as "N"
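How a chromatogram becomes a string of bases, with "N" for ambiguous positions, can be illustrated with a toy base caller. Real base-calling software is considerably more sophisticated; the intensity values, the `min_ratio` threshold and the function name here are all hypothetical.

```python
CHANNELS = "ACGT"

def call_bases(peaks, min_ratio: float = 2.0) -> str:
    """Turn per-peak fluorescence intensities into a sequence read.

    `peaks` is a list of dicts mapping base -> intensity, in the order the
    fragments pass the detector. The brightest channel is called unless it
    is less than `min_ratio` times the runner-up, in which case the
    position is recorded as 'N'.
    """
    read = []
    for peak in peaks:
        ranked = sorted(CHANNELS, key=lambda base: peak.get(base, 0.0), reverse=True)
        best = peak.get(ranked[0], 0.0)
        runner_up = peak.get(ranked[1], 0.0)
        clear = best > 0 and best >= min_ratio * runner_up
        read.append(ranked[0] if clear else "N")
    return "".join(read)

# Hypothetical intensities for four consecutive peaks
peaks = [
    {"A": 900, "C": 40, "G": 30, "T": 20},
    {"A": 50, "C": 800, "G": 60, "T": 10},
    {"A": 400, "C": 30, "G": 380, "T": 20},   # two dyes nearly equal
    {"A": 10, "C": 20, "G": 30, "T": 700},
]
print(call_bases(peaks))
```

The third peak is called "N" because the A and G signals are too close to separate at the chosen threshold.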


Accumulation of genome sequence data


Parallel revolutions in acquisition of genome sequence and information technology
GenBank: first official open-access, online repository for DNA sequences (1982, National Institutes of Health)


Ultrahigh-throughput DNA sequencing


2008 - New generation of nanotechnology-based DNA sequencers

100 billion base pairs of sequence can be determined in a single experiment


Millions of DNA clones can now be sequenced simultaneously


Bioinformatics provides tools for visualizing functional features of genomes


Bioinformatics is the science of using computational tools to decipher biological information
1988: National Center for Biotechnology Information (NCBI) established
Oversees GenBank
Created additional public databases of biological information
Developed bioinformatic tools for analyzing, systemizing, and disseminating the data

RefSeq: species reference genome sequence; a single, complete, annotated version of the species genome
Is not from one individual, but is a composite from several individuals

New global tools of genomics can analyze thousands of genes rapidly


Schematic drawing of the components of a DNA chip

Hybridization of cDNAs made from cellular mRNAs to a DNA chip

Computerized analysis of chip hybridizations can be used to compare mRNA expression in two types of cells
Thousands of genes can be simultaneously analyzed

In this example, genes whose expression was altered by treatment with an experimental cancer drug were identified using a DNA chip

Visualizing genes of the human RefSeq genome with the UCSC Genome Browser

Fig. 9.16a

A 3 Mb region of human chromosome 7


From human RefSeq on NCBI Sequence Viewer
Between sequence positions 116,000,001 and 119,000,000
Shows locations of nine genes, including the CFTR gene

A gene desert
Fig. 9.16b


Visualization of a 540 kb region of human chromosome 7 containing the CFTR gene


From human RefSeq on NCBI Sequence Viewer
For each gene,
Exon/intron structure: blue boxes and connected lines
Spliced RNA products: red boxes
Protein coding sequences: black boxes

Fig. 9.16c

Whole-genome comparisons distinguish genomic elements conserved by natural selection


Charles Darwin proposed "descent with modification"
Genome sequencing of many species has shown that the DNA sequence undergoes descent with modification
Two perfectly matched 50 bp DNA sequences found in different species are almost certainly derived from an ancestral species
Probability of occurrence by chance = (0.25)^50 = 8 x 10^-31
DNA sequence conservation

Homologous sequences in two species that show evidence of being derived from a common ancestor
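The chance-match probability quoted for two 50 bp sequences can be checked directly. The calculation assumes each position matches independently with probability 0.25 (four equally likely bases); the function name is illustrative.

```python
def chance_match_probability(length_bp: int, p_base: float = 0.25) -> float:
    """Probability that two random sequences match at every position,
    assuming independent positions and equally likely bases."""
    return p_base ** length_bp

p = chance_match_probability(50)
print(f"{p:.1e}")
```

The result rounds to the 8 x 10^-31 figure on the slide, which is why a perfect 50 bp match is taken as evidence of common ancestry and not coincidence.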

An oligonucleotide array

DNA arrays have thousands of fragments of known nucleotide sequence spotted at precise locations on a solid support
Arrays can be hybridized with fluorescent or radioactive DNA or RNA probes


Two-color DNA microarrays can be used to determine relative expression of genes


Two cDNA samples with different fluorescence labels are mixed together and used as hybridization probes on a DNA array
Green label for cDNAs from normal yeast cells
Red label for cDNAs from mutant yeast cells


Relative expression levels of ~ 6000 yeast genes on a DNA microarray


Red spots represent mRNAs expressed at higher levels in mutant cells
Green spots represent mRNAs expressed at higher levels in normal cells
Yellow spots represent mRNAs expressed at equivalent amounts in normal and mutant cells
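The colour reading of a two-colour array can be sketched as a log-ratio classification. The two-fold cutoff, the gene names and the intensity values below are assumptions made for illustration; the slides do not specify a threshold.

```python
import math

def classify_spot(red: float, green: float, fold: float = 2.0) -> str:
    """Classify one spot from its red (mutant) and green (normal) signals.

    ASSUMPTION: a spot counts as red or green only when the signals differ
    by at least `fold`; otherwise it is reported as yellow.
    """
    if red <= 0 or green <= 0:
        raise ValueError("intensities must be positive")
    log_ratio = math.log2(red / green)
    threshold = math.log2(fold)
    if log_ratio >= threshold:
        return "red"       # higher in mutant cells
    if log_ratio <= -threshold:
        return "green"     # higher in normal cells
    return "yellow"        # similar in both

# Hypothetical spots: gene name -> (red intensity, green intensity)
spots = {"geneA": (800.0, 100.0), "geneB": (100.0, 800.0), "geneC": (500.0, 450.0)}
for gene, (red, green) in spots.items():
    print(gene, classify_spot(red, green))
```

Working with the log2 ratio treats a two-fold increase and a two-fold decrease symmetrically, which a raw ratio does not.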


An optical fiber approach to DNA array analyses

Instruments have 96 optical fibers of 1 mm diameter


The end of each fiber has 50,000 wells
Each well contains a different oligonucleotide on a bead

Used to interrogate target samples for SNPs or gene expression
Can analyze >10^6 SNPs a day


An optical fiber approach to DNA array analyses

Protein array of different types of protein kinases


Spectral array of different types of protein kinases
Radioactivity associated with each kinase after application of radioactive substrate


ChIP/chip analyses to identify protein-DNA interactions


Combination of genomic and proteomic approaches to measure protein-DNA interactions required for gene regulatory networks
Binding of transcription factors to cis-control DNA elements

Binding of complexes of activators and repressors to DNA


Chromatin immunoprecipitation (ChIP)
Used to identify all genomic sites at which a transcription factor in specific cell types can bind (Fig. 10.25)

Diagram of the ChIP/chip process


Cells are genetically engineered by adding tag sequences to the 5' or 3' end of the gene encoding the protein of interest
Isolate chromatin from engineered cells
Shear chromatin and precipitate protein-DNA complex with anti-tag antibody
PCR-amplify DNA with fluorescent label
Use red for experimental sample and green for control sample prepared from cells that lack tag sequence

Hybridize both probes to DNA array (chip)

