Qunyuan Zhang, Division of Statistical Genomics, Statistical Genetics Forum, March 10, 2008
What is R?
Runs on a wide variety of UNIX platforms, Windows and MacOS (interactive or batch mode)
Free and open source; can be downloaded from cran.r-project.org
Wide range of packages (base & contributed); novel methods available
Concise grammar & good structure (function, data object, methods and class)
Help
Slow; time and memory consuming (can be overcome by parallel computation and/or integration with C)
Popular
R Task Views
http://cran.r-project.org/web/views/
GenABEL
Aulchenko Y.S., Ripke S., Isaacs A., van Duijn C.M. GenABEL: an R package for genome-wide association analysis. Bioinformatics. 2007, 23(10):1294-6.
GenABEL: genome-wide SNP association analysis
A package for genome-wide association analysis between quantitative or binary traits and single-nucleotide polymorphisms (SNPs).
Version: 1.3-5
Depends: R (≥ 2.4.0), methods, genetics, haplo.stats, qvalue, MASS
Date: 2008-02-17
Author: Yurii Aulchenko, with contributions from Maksim Struchalin, Stephan Ripke and Toby Johnson
Maintainer: Yurii Aulchenko <i.aoultchenko at erasmusmc.nl>
License: GPL (≥ 2)
In views: Genetics
CRAN checks: GenABEL results
"geno.raw)
convert.snp.text(): from text file (GenABEL default format)
convert.snp.ped(): from Linkage, Merlin, Mach, and similar files
convert.snp.mach(): from Mach format
convert.snp.tped(): from PLINK TPED format
npsubtreated():
summary of SNP data (number of observed genotypes, call rate, allelic frequency, genotypic distribution, P-value of HWE test)
check.trait(): summary of phenotypic data and outlier check based on a specified p/FDR cut-off
check.marker(): SNP selection based on call rate, allele frequency and deviation from HWE
HWE.show(): shows HWE tables, chi-square and exact HWE P-values
perid.summary(): call rate and heterozygosity per person
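The per-SNP quantities listed above (call rate, allele frequency, genotype counts, HWE test) can be sketched in base R for a single marker. This is only an illustration of the arithmetic, not GenABEL code: the function name `snp_qc`, the 0/1/2 genotype coding and the example genotypes are assumptions, and a 1-d.f. chi-square test stands in for whatever HWE test a given package uses.

```r
# Genotypes for one SNP coded as copies of allele B (0 = AA, 1 = AB, 2 = BB);
# NA marks a failed genotype call. The values are made up for illustration.
geno <- c(0, 0, 0, 0, 1, 1, 1, 1, 2, 2, NA, NA)

# Hypothetical helper: basic QC summary for one polymorphic SNP.
snp_qc <- function(g) {
  obs <- g[!is.na(g)]
  n <- length(obs)
  counts <- c(AA = sum(obs == 0), AB = sum(obs == 1), BB = sum(obs == 2))
  p_b <- sum(obs) / (2 * n)  # frequency of allele B
  # Genotype counts expected under Hardy-Weinberg equilibrium
  expected <- n * c((1 - p_b)^2, 2 * p_b * (1 - p_b), p_b^2)
  chi2 <- sum((counts - expected)^2 / expected)
  list(
    call_rate = n / length(g),
    freq_b = p_b,
    counts = counts,
    hwe_chi2 = chi2,
    hwe_p = pchisq(chi2, df = 1, lower.tail = FALSE)  # 1-d.f. chi-square HWE test
  )
}

qc <- snp_qc(geno)
print(qc)
```

A marker filter in the spirit of check.marker() would then keep SNPs whose call rate, allele frequency and HWE P-value pass chosen thresholds.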
ibs(): matrix of average IBS for a group of people & a given set of SNPs
hom(): average homozygosity (inbreeding) for a set of people, across multiple markers
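To make "average IBS" and "average homozygosity" concrete, here is a small base-R sketch. It assumes genotypes coded as 0/1/2 copies of one allele and scores per-SNP IBS as 1, 0.5 or 0; the function name `average_ibs` and the data are hypothetical, and GenABEL's own ibs() and hom() may weight or report things differently.

```r
# Rows are people, columns are SNPs; genotypes are copies of allele B (0/1/2).
geno <- rbind(
  id1 = c(0, 1, 2, 1),
  id2 = c(0, 1, 2, 1),
  id3 = c(2, 1, 0, 1),
  id4 = c(1, NA, 2, 0)
)

# Hypothetical helper: pairwise IBS averaged over SNPs typed in both people.
average_ibs <- function(g) {
  n <- nrow(g)
  out <- matrix(NA_real_, n, n, dimnames = list(rownames(g), rownames(g)))
  for (i in seq_len(n)) {
    for (j in seq_len(n)) {
      # Per-SNP IBS proportion: 1 (same genotype), 0.5, or 0 (opposite homozygotes)
      out[i, j] <- mean(1 - abs(g[i, ] - g[j, ]) / 2, na.rm = TRUE)
    }
  }
  out
}

ibs_matrix <- average_ibs(geno)

# Proportion of homozygous genotypes per person across the typed markers
homozygosity <- rowMeans(geno != 1, na.rm = TRUE)

print(round(ibs_matrix, 3))
print(round(homozygosity, 3))
```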
SNP association test using GLM in R library
scan.glm("y~x1+x2+...+CRSNP", family = gaussian(), data, snpsubset, idsubset)
scan.glm("y~x1+x2+...+CRSNP", family = binomial(), data, snpsubset, idsubset)
scan.glm.2D(): 2-SNP interaction scan
Fast Scan (call C language)
ccfast(): case-control association analysis by computing chi-square test from 2x2 (allelic) or 2x3 (genotypic) tables
emp.ccfast(): genome-wide significance (permutation) for ccfast() scan
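The 2x2 (allelic) and 2x3 (genotypic) tables behind a case-control chi-square test can be built by hand for one SNP. The sketch below uses base R's chisq.test() on made-up counts; it illustrates the test only and is not how ccfast() is implemented.

```r
# Made-up genotype counts (AA, AB, BB) for cases and controls at one SNP.
genotypic <- rbind(
  cases    = c(AA = 30, AB = 50, BB = 20),
  controls = c(AA = 50, AB = 40, BB = 10)
)

# Collapse genotypes to allele counts: each person carries two alleles.
allelic <- cbind(
  A = 2 * genotypic[, "AA"] + genotypic[, "AB"],
  B = 2 * genotypic[, "BB"] + genotypic[, "AB"]
)

# 2x2 allelic table, 1 d.f. (no continuity correction)
allelic_test <- chisq.test(allelic, correct = FALSE)
# 2x3 genotypic table, 2 d.f.
genotypic_test <- chisq.test(genotypic)

print(allelic_test)
print(genotypic_test)
```

The allelic test treats the two alleles of a person as independent observations, which is one reason to check HWE before relying on it.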
qtscore(): association test (GLM) for a trait (quantitative or categorical)
emp.qtscore(): genome-wide significance (permutation) for qtscore() scan
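Permutation-based genome-wide significance, as provided by the emp.* functions, can be sketched with a max-statistic permutation in base R. Everything here is an assumption for illustration: the simulated data, the helper `snp_stats`, the use of n times the squared correlation as a 1-d.f. score-type statistic, and the choice of 200 permutations. GenABEL's own procedure may differ in detail.

```r
set.seed(42)
n <- 100
n_snps <- 20
# Simulated genotypes (0/1/2) and a trait with no true association
snps <- matrix(rbinom(n * n_snps, size = 2, prob = 0.4), nrow = n)
trait <- rnorm(n)

# Hypothetical helper: score-type statistic per SNP (n times squared correlation)
snp_stats <- function(y, g) as.vector(length(y) * cor(y, g)^2)

observed <- snp_stats(trait, snps)

# Shuffle the trait to break any SNP-trait link, keeping the SNP correlation
# structure intact, and record the largest statistic in each permuted scan.
n_perm <- 200
max_null <- replicate(n_perm, max(snp_stats(sample(trait), snps)))

# Genome-wide empirical P-value: how often a whole permuted scan produces a
# statistic at least as large as the one observed for this SNP.
p_genomewide <- sapply(observed, function(s) (1 + sum(max_null >= s)) / (n_perm + 1))

print(round(p_genomewide, 3))
```

Comparing each SNP with the maximum over the whole permuted scan is what makes the resulting P-values genome-wide rather than per-marker.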
mmscore(): score test for association between a trait and genetic polymorphism, in samples of related individuals (needs stratification variable; scores are computed within strata and then added up)
egscore(): association test, adjusted for possible stratification by principal components of the genomic kinship matrix (SNP correlation matrix)
scan.haplo.2D(): (haplo.stats package required)
Sliding window strategy
Posterior probabilities of haplotypes via EM algorithm
GLM-based score test for haplotype-trait association (Schaid DJ, Rowland CM, Tines DE, Jacobson RM, Poland GA. 2002. Score tests for association of traits with haplotypes when linkage phase is ambiguous. Am J Hum Genet 70: 425-434.)
scan.gwaa-class
Names:
snpnames: list of names of SNPs tested
P1df: p-values of 1-d.f. (additive or allelic) test for association
P2df: p-values of 2-d.f. (genotypic) test for association
Pc1df: p-values from the 1-d.f. test for association between SNP and trait; the statistic is corrected for possible inflation
effB: effect of the B allele in allelic test
effAB: effect of the AB genotype in genotypic test
effBB: effect of the BB genotype in genotypic test
Map: list of map positions of the SNPs
Chromosome: list of chromosomes the SNPs belong to
Idnames: list of subjects used in analysis
Lambda: inflation factor estimate, as computed using lower portion (say, 90%) of the distribution, and standard error of the estimate
Formula: formula/function used to compute p-values
Family: family of the link function / nature of the test
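The Lambda and Pc1df entries can be illustrated with a short base-R sketch. It is one plausible reading of "computed using lower portion (say, 90%) of the distribution": sorted observed 1-d.f. chi-square statistics are regressed through the origin on their expected quantiles, using only the lowest 90%. The function name `estimate_lambda` and the constructed statistics are assumptions, and this is not GenABEL's actual code.

```r
# Hypothetical helper: inflation factor from the lower portion of the scan.
estimate_lambda <- function(chi2, proportion = 0.9) {
  observed <- sort(chi2)
  # Quantiles expected under the null 1-d.f. chi-square distribution
  expected <- qchisq(ppoints(length(observed)), df = 1)
  keep <- seq_len(floor(proportion * length(observed)))
  # Regression through the origin: the slope is the inflation factor
  fit <- lm(observed[keep] ~ 0 + expected[keep])
  unname(coef(fit))
}

# Made-up scan: null quantiles inflated by a known factor of 1.2
chi2 <- 1.2 * qchisq(ppoints(1000), df = 1)

lambda <- estimate_lambda(chi2)

# Corrected 1-d.f. p-values: divide each statistic by lambda before testing
p_corrected <- pchisq(chi2 / lambda, df = 1, lower.tail = FALSE)

print(lambda)
```

Using only the lower part of the distribution keeps truly associated SNPs, which sit in the upper tail, from pulling the estimate upwards.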
GenABEL: Functions
descriptives.marker(): table of marker info
descriptives.trait(): table of trait info
descriptives.scan(): table of scan results
plot.scan.gwaa(): plot of scan results
plot.check.marker(): plot of marker data (QC etc.)
SNPassoc
An R package to perform whole genome association studies, Juan R. González, et al. Bioinformatics, 2007, 23(5):654-655
SNPassoc: SNPs-based whole genome association studies
This package carries out the most common analyses performed in whole genome association studies. These include descriptive statistics and exploratory analysis of missing values, calculation of Hardy-Weinberg equilibrium, analysis of association based on generalized linear models (for either quantitative or binary traits), and analysis of multiple SNPs (haplotype and epistasis analysis). Permutation tests and related tests (sum statistic and truncated product) are also implemented.
Version: 1.4-9
Depends: R (≥ 2.4.0), haplo.stats, survival, mvtnorm
Date: 2007-Oct-16
Author: Juan R González, Lluís Armengol, Elisabet Guinó, Xavier Solé, and Víctor Moreno
Maintainer: Juan R González <jrgonzalez at imim.es>
License: GPL version 2 or newer
URL: http://www.r-project.org and http://davinci.crg.es/estivill_lab/snpassoc
In views: Genetics
CRAN checks: SNPassoc results
info=map.table,
summary()
data=, model = (codominant, dominant, recessive, overdominant, log-additive or all), quantitative = , level = 0.95)
scanWGassociation(): only p-values
association(): only for selected SNPs; can do stratified and GxE interaction analyses
Results
Summary: a summary table by genes/chromosomes
WGstats: detailed output (case-control numbers, percentages, odds ratios / mean differences, 95% confidence intervals, P-value for the likelihood ratio test of association, AIC, etc.)
Pvalues: a table of p-values for each genetic model for each SNP
Plot: p-values on the -log scale, via plot.WGassociation()
Labels: returns the names of the SNPs analyzed
Haplotype Analysis
haplo.glm(): using the R package haplo.stats; association analysis of haplotypes with a response via GLM
haplo.interaction(): interactions between haplotypes (and covariates)