Professional Documents
Culture Documents
Project Rationale
Genotyping Strategies/Technical Leaps Data Management/Quality Control
SNP Genotyping
Matched
Probe and Target C All ele
Mis-Matched
T Allele
Heritability
Allele-Specific Hybridization
C Target C G
Hybridize
Fail to hybridize
Degrade
C G
Fail to degrade
Candidate Gene, Pathway, Genome 5-10 SNPs, 400 to 1,000, 10K, 500K
Polymerase Extension
Target
+ddC TP
C incorporated
C G
C Fails to incorporate
Oligonucleotide Ligation
C Target
Cost
Ligate
C G
Fail to ligate
Low
Medium
eg. SNPlex - Intermediate Multiplexing reduces costs - Genotype directly on genomic DNA - new paradigm for high throughput
384 - 1,536 SNPs - cost reductions based on scale (~0.08 - 0.15 per SNP/genotype) 300,000 to 500,000 SNPs defined format (~0.002 per SNP/ genotype) 10,000-20,000 SNPs - defined and custom formats (~0.03 per SNP/genotype)
$800,000 $>250,000
High
Taqman
Genotyping with fluorescence-based homogenous assays (single-tube assay) = 1 SNP/ tube
SNP 1252 - T
SNP 1252 - C
ZipCode 1
G C
P
Locus Specific Sequence
P
Ligation Product Formed
2. Clean-up
Detection
9. Characterize on Capillary Sequencer
Biotin
Univ. PCR Primer
4. Capture double stranded DNA- microtiter plate (Streptavidin) 5. Denature double stranded DNA 6. Wash away one strand 7. Zip Chute Hybridization
SNP 1
SNP 2
SNPlex Readout
ZipChuten N(n)
T
Position n
n ~ 48/lane ~2000 lanes/day
Chip Array
T A
~96,000 genotypes/day
ParAllele Affymetrix
Sentrix Platform
Sentrix 96 Multi-array Matrix matches standard microtiter plates (96 - 1536 SNPs/well) Up to ~140,000 assays per matrix
(1-20 nt gap)
SNP
Polymerase
Ligase
[T/C]
[T/A]
illumiCode Address Universal PCR Sequence 3
Amplification Template
illumiCode #561
A G
Universal Primer P3
BeadArray Reader
illumiCode #561
/\/\/\/
Confocal laser scanning system Resolution, 0.8 micron Two lasers 532, 635 nm Supports Cy3 & Cy5 imaging Sentrix Arrays (96 bundle) and Slides for 100k fixed formats
illumiCode #217
/\/\/\/
illumiCode #1024
/\/\/\/
A/A
T/T
C/T
Process Controls
Illumina Readout for Sentrix Array > 1,000 SNPs Assayed on 96 Samples
Mismatch
High AT/GC
Gender
Gap
First Hyb
Second Hyb
Contamination
Chip Array
ParAllele Affymetrix
Affymetrixs Chip
~650K SNPs
400 samples Call rate, accuracy LD
500K SNPs
a
T
SNP2
SNP3 - - - - -
SNP
SNP1
Signal Green/Red
SNP2
SNP3 - - - - - SNP
Additional Content
~8,000 nsSNPs ~1,500 tag SNPs selected from high density SNP data in the MHC region
0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0 >0 >0.1 >0.2 >0.3 >0.4 >0.5 Max r2 >0.6 >0.7 >0.8 >0.9
Percent
99.93% >99.99% 0.035% 99.69%
CEU, mean 0.87 median 0.97 CHB+JPT, mean 0.80 median 0.94 YRI, mean 0.57 median 0.55
0.7 0.6 0.5 0.4 0.3 0.2 0.1 0.0 >0 >0.1 >0.2 >0.3 >0.4 >0.5
2
CEU, mean 0.95 median 1.0 CHB+JPT, mean 0.93 median 1.0 YRI, mean 0.75 median 0.88
>0.6
>0.7
>0.8
>0.9
Max r
Replicate samples
CT 1 50 0 TT 0 0 25 Replicates can also detect sample handling errors
Wrong plate Plate rotation
Rep 1 CC Rep 2 CC 24 CT 0 TT 0
Expectations
p2 = frequency 11 2pq = frequency 12 q2 = frequency 22
HWE Example
Homozygote excess
Biologic
Population stratification Null allele
Technical
Nonspecific assays Duplicated regions
Technical
Allele dropout
Frequency Analysis
tagSNP Allele Frequency
100% 80%
IGAP
30% 20% 10% 0% 0% White European Black African 10% 20% 30% 40% 50% PGA
80%
100%
10