PLINK: Whole genome data analysis toolset
Epistasis
For population-based samples with a disease trait, it is possible to test for epistasis. The epistasis test can be either case-only or case-control. Either all pairwise combinations of SNPs can be tested (this is most likely not desirable, but it is computationally feasible using PLINK: the 4.5 billion two-locus tests generated from a 100K data set took just over 24 hours to run) or sets can be specified (e.g. to test only the most significant 100 SNPs against all other SNPs, or against themselves, etc). The output consists only of pairwise epistatic results that pass a specified significance threshold; in addition, for each SNP, a summary of all the pairwise epistatic tests is given (e.g. maximum test, proportion of tests significant at a certain threshold, etc). A similar methodology allows for testing of gene-environment interaction (for dichotomous environmental variables).


SNP x SNP epistasis

To test SNP x SNP epistasis, the command

plink --file mydata --epistasis

will send output to the files
     plink.epi-cc1
     plink.epi-cc2
where cc = case-control.

There are different modes for specifying which SNPs are tested:
ALL x ALL

plink --file mydata --epistasis

SET1 x SET1  { where epi.set contains only 1 set }

plink --file mydata --epistasis --set epi.set

SET1 x ALL  { where epi.set contains only 1 set } 

plink --file mydata --epistasis --set epi.set --set-by-all

SET1 x SET2  { where epi.set contains 2 sets }  

plink --file mydata --epistasis --set epi.set

For the 'symmetrical' cases (ALL x ALL and SET1 x SET1), only unique pairs are analysed.

For the other two cases (SET1 x ALL, SET1 x SET2), all pairs are analysed (e.g. PLINK will perform SNPA x SNPB as well as SNPB x SNPA, if A and B are in both SET1 and SET2). It will not, however, try to analyse SNPA x SNPA.
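To get a feel for how many tests each mode generates, the pairing rules described here can be sketched in a few lines of Python. This is only an illustration of the counting, not PLINK code; the function names and the toy SNP identifiers are made up for the example.

```python
def n_tests_symmetric(snps):
    """Unique unordered pairs within one collection (ALL x ALL or SET1 x SET1)."""
    n = len(set(snps))
    return n * (n - 1) // 2


def n_tests_asymmetric(left, right):
    """Ordered pairs (a, b) with a in left and b in right, skipping a == b
    (SET1 x ALL or SET1 x SET2)."""
    left, right = set(left), set(right)
    return len(left) * len(right) - len(left & right)


# Hypothetical toy data: five SNPs in total, two overlapping sets.
all_snps = ["snp1", "snp2", "snp3", "snp4", "snp5"]
set1 = ["snp1", "snp2"]
set2 = ["snp2", "snp3", "snp4"]

print("ALL x ALL  :", n_tests_symmetric(all_snps))
print("SET1 x SET1:", n_tests_symmetric(set1))
print("SET1 x ALL :", n_tests_asymmetric(set1, all_snps))
print("SET1 x SET2:", n_tests_asymmetric(set1, set2))
```

The symmetric count grows as n(n-1)/2, so it reaches the billions once n is around 100,000 SNPs; restricting one side to a small set keeps the number of tests roughly linear in the number of SNPs on the other side.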

The output can be controlled via

plink --file mydata --epistasis --epi1 0.0001

which means that only results significant at p <= 0.0001 are recorded. (This prevents too much output from being generated.) The output is in the form
     Col 1 : SNP 1
     Col 2 : SNP 2
     Col 3 : Interaction odds ratio 
     Col 4 : z-score
     Col 5 : p-value
The z-score is a test for a difference in SNP1-SNP2 association (odds ratio) between cases and controls (or in cases only).
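As an illustration of what a test for a difference in odds ratios can look like, the sketch below compares the log odds ratio from a 2x2 table in cases with the one from a 2x2 table in controls, using the usual large-sample variance 1/a + 1/b + 1/c + 1/d. This is a textbook construction written under the assumption that each group can be summarised by one 2x2 table of counts; it is not claimed to be PLINK's exact computation, and the function names and counts are hypothetical.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Return (log odds ratio, variance) for the 2x2 table [[a, b], [c, d]]."""
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d


def odds_ratio_difference_z(case_table, control_table):
    """z-score and two-sided p-value for log(OR_cases) - log(OR_controls)."""
    lor_case, var_case = log_odds_ratio(*case_table)
    lor_ctrl, var_ctrl = log_odds_ratio(*control_table)
    z = (lor_case - lor_ctrl) / math.sqrt(var_case + var_ctrl)
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided p under a standard normal
    return z, p


# Hypothetical counts: association between the two SNPs in cases, none in controls.
z, p = odds_ratio_difference_z((60, 40, 40, 60), (50, 50, 50, 50))
print(f"z = {z:.3f}, p = {p:.4f}")
```

A zero cell makes the odds ratio undefined, which is one reason a pair might not yield a valid test; the sketch simply raises an error in that situation.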

The second part of the output gives, for each SNP in SET1 (or in ALL if no sets were specified), the number of significant epistatic tests that SNP featured in (i.e. either with ALL other SNPs, with SET1, or with SET2). The threshold --epi2 determines what counts as significant:

plink --file mydata --epistasis --epi1 0.0001 --epi2 0.05

The output is
     Col 1 : Chromosome
     Col 2 : SNP
     Col 3 : # significant epistatic tests (p <= "--epi2" threshold)
     Col 4 : # of valid tests (i.e. non-zero allele counts, etc)
     Col 5 : proportion significant of valid tests
This will give a rough idea about the extent of epistasis and which SNPs seem to be interacting (although, of course, this is a naive statistic as we do not take LD into account -- i.e. Col 3 does not represent the number of *independent* epistatic results).
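The per-SNP summary amounts to a simple tally over the pairwise results. The sketch below shows the idea on made-up data; the input layout (tuples of chromosome, SNP, partner SNP and p-value, with None marking an invalid test) is an assumption chosen for the example and is not a PLINK file format.

```python
from collections import defaultdict


def summarise(results, epi2=0.05):
    """Tally, per SNP, the significant and valid tests and their ratio.

    results: iterable of (chrom, snp, partner, p); p is None for an invalid test.
    Returns rows of (chrom, snp, n_significant, n_valid, proportion).
    """
    counts = defaultdict(lambda: [0, 0])  # (chrom, snp) -> [n_sig, n_valid]
    for chrom, snp, _partner, p in results:
        entry = counts[(chrom, snp)]  # register the SNP even if the test is invalid
        if p is None:
            continue
        entry[1] += 1
        if p <= epi2:
            entry[0] += 1
    rows = []
    for (chrom, snp), (n_sig, n_valid) in counts.items():
        proportion = n_sig / n_valid if n_valid else float("nan")
        rows.append((chrom, snp, n_sig, n_valid, proportion))
    return rows


# Hypothetical pairwise results.
results = [
    (1, "snpA", "snpB", 0.01),
    (1, "snpA", "snpC", 0.20),
    (1, "snpA", "snpD", None),  # invalid test, e.g. a zero allele count
    (2, "snpB", "snpC", 0.50),
]
for row in summarise(results):
    print(row)
```

As the text notes, the count of significant tests is naive: partner SNPs in LD with each other will tend to be significant together, so the tally overstates the number of independent signals.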


Case-only epistasis

For case-only epistatic analysis,

plink --file mydata --epistasis --case-only

sends output to (co = case-only)
     plink.epi-co1
     plink.epi-co2
All other options are as described above.

Take note! Currently, in case-only analysis, all pairs of SNPs will be tested (i.e. regardless of whether they are near each other on the chromosome -- we probably want an option to just automatically skip SNPs that are too close, in terms of physical distance).
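Until such an option exists, nearby pairs can be screened out after the fact, since SNPs that are close together may be associated in cases simply because of LD. The sketch below is a post-processing illustration only: it is not a PLINK option, the function names and the positions lookup (SNP name to chromosome and base-pair position, as could be built from a map file) are hypothetical, and the 1,000,000 bp default is an arbitrary value chosen for the example.

```python
def is_far_apart(pos1, pos2, min_bp):
    """True if two (chromosome, bp) positions are on different chromosomes
    or at least min_bp base pairs apart."""
    chrom1, bp1 = pos1
    chrom2, bp2 = pos2
    return chrom1 != chrom2 or abs(bp1 - bp2) >= min_bp


def filter_pairs(pairs, positions, min_bp=1_000_000):
    """Keep only the SNP pairs whose members are far enough apart."""
    return [
        (a, b)
        for a, b in pairs
        if is_far_apart(positions[a], positions[b], min_bp)
    ]


# Hypothetical positions and pairs.
positions = {
    "snp1": (1, 100_000),
    "snp2": (1, 150_000),
    "snp3": (1, 5_000_000),
    "snp4": (2, 120_000),
}
pairs = [("snp1", "snp2"), ("snp1", "snp3"), ("snp1", "snp4")]
print(filter_pairs(pairs, positions))
```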


Gene-based tests

A gene-based test is available; it is performed using the statistical package R. The following command will automatically generate the script and data file required for the R analysis; if possible, it will also call R directly and start the analysis.

The --genepi command initiates this analysis. It is always necessary to specify a set-file (--set filename) which contains at least two sets (i.e. specifying the SNPs in two or more genes; the analysis is pairwise between pairs of genes, not pairs of SNPs).

plink --file mydata --genepi --R --set gene.set

sends output to
     plink.genepi-
     plink.epi-co2