PLINK: Whole genome data analysis toolset
Summary statistics
PLINK will generate a number of standard summary statistics that are useful for quality control (e.g. missing genotype rate, minor allele frequency, Hardy-Weinberg equilibrium failures and non-Mendelian transmission rates). These can also be used as thresholds for subsequent analyses (described in the next section).
Missing genotypes

To generate a list of genotyping/missing rate statistics:

plink --file data --missing

will create two files:
     plink.imiss
     plink.lmiss 
which detail missingness by individual and by SNP. For individuals, the format is:
     Family ID (FID)
     Individual ID (IID)
     Missing phenotype? (Y/N)
     Number of missing SNPs (N_MISS)
     Proportion of missing SNPs (%_MISS)
For each SNP, the format is:
	
     SNP ID (SNP)
     Chromosome (CHR)
     Number of individuals missing this SNP (N_MISS)
     Proportion of sample missing for this SNP (%_MISS)
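
As a rough illustration of how the two summaries relate to the underlying genotype data, here is a minimal Python sketch. It is not PLINK's code: the in-memory genotype matrix, the use of `None` for a missing call and the function name `missingness` are all assumptions made for this example.

```python
def missingness(genotypes):
    """Return (per_individual, per_snp) lists of (n_miss, proportion).

    genotypes: one row per individual, one entry per SNP.
    None marks a missing genotype (a hypothetical encoding).
    """
    n_ind = len(genotypes)
    n_snp = len(genotypes[0]) if n_ind else 0

    # Per-individual summary: missing SNPs out of all SNPs.
    per_individual = []
    for row in genotypes:
        n_miss = sum(1 for g in row if g is None)
        per_individual.append((n_miss, n_miss / n_snp if n_snp else 0.0))

    # Per-SNP summary: individuals missing this SNP out of all individuals.
    per_snp = []
    for j in range(n_snp):
        n_miss = sum(1 for row in genotypes if row[j] is None)
        per_snp.append((n_miss, n_miss / n_ind))

    return per_individual, per_snp


# Three individuals typed at four SNPs (made-up data).
data = [
    ["AA", "AG", None, "CC"],
    ["AG", None, None, "CT"],
    ["GG", "GG", "TT", "TT"],
]
by_individual, by_snp = missingness(data)
```

Here `by_individual` plays the role of the N_MISS and %_MISS columns of the per-individual file, and `by_snp` the same columns of the per-SNP file.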


Hardy-Weinberg Equilibrium

To generate a list of genotype counts and Hardy-Weinberg test statistics for each SNP, with a particular threshold:

plink --file data --hardy 0.01

will create a file:
     plink.hwe
For case/control samples, this file has the following format:
     SNP
     Genotype counts in cases [A: A11 A12 A22 ]
     Genotype counts in controls [U: A11 A12 A22 ]
     Whole-sample genotype counts [A+U: A11 A12 A22 ]
     Whole-sample HWE expected counts [E(A+U): A11 A12 A22 ]
     Whole sample H-W chi-square
     Whole sample H-W p-value
     Cases-only H-W chi-square
     Cases-only H-W p-value
     Controls-only H-W chi-square
     Controls-only H-W p-value
For quantitative traits, only the whole-sample results will be given.
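
The expected counts and chi-square statistic listed above can be reproduced by hand from the three genotype counts. The Python sketch below shows one way to do so, assuming the usual goodness-of-fit test with 1 degree of freedom; the function name `hwe_chi_square` is hypothetical and the code is not taken from PLINK.

```python
import math


def hwe_chi_square(n11, n12, n22):
    """Return (expected_counts, chi_square, p_value) for one SNP.

    n11, n12, n22: observed counts of the A11, A12 and A22 genotypes.
    """
    n = n11 + n12 + n22
    if n == 0:
        raise ValueError("no genotypes observed")

    p = (2 * n11 + n12) / (2 * n)  # frequency of allele 1
    q = 1.0 - p                    # frequency of allele 2

    # Genotype counts expected under Hardy-Weinberg equilibrium.
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n11, n12, n22)

    # Pearson chi-square; cells with zero expectation are skipped.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)

    # Upper tail of a chi-square distribution with 1 degree of freedom
    # (an assumption of this sketch).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return expected, chi2, p_value


# Made-up counts with a strong deficit of heterozygotes.
expected, chi2, p_value = hwe_chi_square(40, 20, 40)
```

With these counts the allele frequency is 0.5, the expected counts are 25, 50 and 25, and the statistic is 36, so the SNP would fail at any conventional threshold.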

In addition, the following output will appear in the terminal, detailing how many SNPs failed the Hardy-Weinberg test, for the sample as a whole, and (when PLINK has detected a disease phenotype) for cases and controls separately.
Writing Hardy-Weinberg test results to [ plink.hwe ]
22 markers failed HWE test ( p <= 0.05 )
        16 markers failed HWE test in cases
        12 markers failed HWE test in controls
WARNING! Currently everybody is included in the Hardy-Weinberg calculations. For family data, it would be better to consider only founders (i.e. independent genotypes). This option will be added in the future.

WARNING! Currently, the H-W test statistic is the standard contingency table chi-square statistic: it has been shown that exact tests have more desirable properties, particularly when one allele is rare. This alternate approach will be adopted in the future.


Allele frequency

To generate a list of minor allele frequencies (MAF) for each SNP:

plink --file data --freq

will create a file:
     plink.frq
with five columns:
     Chromosome
     SNP identifier
     Allele 1 code
     Allele 2 code
     Minor allele frequency
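
For reference, the minor allele frequency of a SNP follows directly from its genotype counts. The short Python sketch below shows the arithmetic; the function name `minor_allele_frequency` is hypothetical, and whether PLINK counts alleles in exactly this way (for example, which individuals it includes) is not stated here.

```python
def minor_allele_frequency(n11, n12, n22):
    """Return the minor allele frequency given the three genotype counts."""
    n_alleles = 2 * (n11 + n12 + n22)  # two alleles per genotyped individual
    if n_alleles == 0:
        raise ValueError("no genotypes observed")

    # Each 11 homozygote carries two copies of allele 1, each heterozygote one.
    freq1 = (2 * n11 + n12) / n_alleles

    # The minor allele is whichever of the two is less common.
    return min(freq1, 1.0 - freq1)


# Made-up counts: 10 A11, 40 A12 and 50 A22 individuals.
maf = minor_allele_frequency(10, 40, 50)
```

In this example allele 1 has 60 copies out of 200, so it is the minor allele with a frequency of 0.3.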


Mendel errors

To generate a list of Mendel errors, with summaries per family and per SNP:

plink --file data --mendel

will create files:
     plink.mendel
     plink.imendel
     plink.lmendel
The *.mendel file contains all Mendel errors (i.e. one line per error); the *.imendel file contains a summary of per-family error rates; the *.lmendel file contains a summary of per-SNP error rates.
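
To make the notion of a Mendel error concrete, the Python sketch below checks a single father-mother-child trio at one autosomal SNP. It is an illustration only, not PLINK's algorithm: the two-character genotype strings, the requirement that all three genotypes are non-missing and the function name `is_mendel_error` are assumptions.

```python
def is_mendel_error(father, mother, child):
    """Return True if the child's genotype cannot be inherited from the parents.

    Genotypes are two-character strings such as "AG" (a hypothetical
    encoding); the SNP is assumed autosomal with no missing calls.
    """
    c1, c2 = child
    # The child must receive one allele from each parent, in either order.
    consistent = (c1 in father and c2 in mother) or (
        c2 in father and c1 in mother
    )
    return not consistent


# Made-up trios: (father, mother, child).
trios = [
    ("AA", "AA", "AG"),  # G is present in neither parent
    ("AG", "GG", "AG"),  # A from the father, G from the mother
    ("AA", "GG", "AA"),  # the mother cannot transmit A
]
errors = [is_mendel_error(f, m, c) for f, m, c in trios]
```

Counting such inconsistencies by family and by SNP gives the kind of totals that the summary files are described as reporting.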

TODO Output from this option not yet fully implemented.


Pedigree errors

PLINK will spot some basic pedigree errors when performing a family-based test (--tdt option); otherwise, pedigree structure (family and individual ID) is completely ignored (i.e. all individuals are assumed to be unrelated).

For a more comprehensive evaluation of pedigree errors (invalid or incompletely specified pedigree structures), please use a different software package such as PEDSTATS or famtypes.