Genetic Power Calculator

S. Purcell & P. Sham, 2001-2009

This site provides automated power analysis for variance components (VC) quantitative trait locus (QTL) linkage and association tests in sibships, and for other common tests. Please send suggestions and comments to Shaun Purcell.

If you use this site, please reference the following Bioinformatics article:
Purcell S, Cherny SS, Sham PC. (2003) Genetic Power Calculator: design of linkage and association genetic mapping studies of complex traits. Bioinformatics, 19(1):149-150.

Modules

Case-control for discrete traits
Case-control for threshold-selected quantitative traits
QTL association for sibships and singletons

TDT for discrete traits
TDT and parenTDT with ascertainment
TDT for threshold-selected quantitative traits

Epistasis power calculator

QTL linkage for sibships

Probability Function Calculator

Instructions for power calculations
VC model calculations are based on formulae derived in Sham et al. (2000) [AJHG, 66, 1616-1630]. Users of this site who are unsure of the nature of the VC tests and power calculations are strongly advised to consult this article.

A genetic model for a single diallelic QTL is specified in terms of

In addition, the VC association test requires some extra parameters:

Given these parameters, the program outputs the expected non-centrality parameter (NCP) for the genetic model and the sample specification. The power to detect a QTL effect is given for several levels of type I error rate (alpha), including the 'user-defined' type I error rate. The sample size needed to attain the 'user-defined' power is also given for each alpha. Please note that the VC tests are one-sided tests.
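As an illustration of how an NCP relates to power and required sample size, the following is a minimal sketch, not the site's own code. It assumes a one-sided 1-df test whose signed statistic is approximately normal with mean sqrt(NCP) and unit variance, and that the NCP grows linearly with the number of sibships. The function names and the per-sibship NCP argument are hypothetical.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def power_one_sided(ncp, alpha):
    """Approximate power of a one-sided 1-df test with the given NCP."""
    z_crit = STD_NORMAL.inv_cdf(1.0 - alpha)
    return STD_NORMAL.cdf(math.sqrt(ncp) - z_crit)


def ncp_for_power(power, alpha):
    """NCP needed to reach the target power (assumes power > alpha)."""
    z_alpha = STD_NORMAL.inv_cdf(1.0 - alpha)
    z_power = STD_NORMAL.inv_cdf(power)
    return (z_alpha + z_power) ** 2


def required_sample_size(ncp_per_sibship, power, alpha):
    """Sibships needed, assuming the NCP scales linearly with sample size."""
    return math.ceil(ncp_for_power(power, alpha) / ncp_per_sibship)
```

With an NCP of 0 the power equals alpha, as expected for a test with no true effect; `required_sample_size` simply divides the NCP needed for the target power by the expected NCP contributed by one sibship.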


Instructions for VC QTL linkage conditional on trait

For a sample of sibships measured on a normally distributed quantitative trait, the expected contribution of each sibship to the sample non-centrality parameter (NCP) is calculated, conditional on the trait scores, the sample residual correlation and the QTL effect size.

Paste the trait scores into the text windows (trait scores only, whitespace-delimited, with no trailing whitespace). Also enter the residual sibling correlation (0-1) and the desired QTL variance (0-1).

The output consists of the trait scores with one additional column, placed first, that gives the expected contribution to the sample NCP. This is an index of the potential informativeness of that sibship: the higher the expected NCP, the more informative the sibship. Sibships can therefore be rank-ordered by this index for selective genotyping. The index can also be summed over all sibships to give the sample NCP, from which power can be calculated.
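The ranking and summing steps described above can be sketched as follows. This assumes the output has been saved as plain text with one sibship per line and the expected NCP in the first column. The function names are hypothetical and the numbers in `example_output` are made up purely for illustration; they are not results from the site.

```python
def parse_ncp_output(text):
    """Parse lines of 'ncp score1 score2 ...' into (ncp, scores) tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        ncp = float(fields[0])
        scores = [float(value) for value in fields[1:]]
        rows.append((ncp, scores))
    return rows


def rank_sibships(rows):
    """Most informative sibships (highest expected NCP) first."""
    return sorted(rows, key=lambda row: row[0], reverse=True)


def sample_ncp(rows):
    """Sum of the per-sibship contributions."""
    return sum(ncp for ncp, _ in rows)


# Made-up illustrative values: expected NCP, then two sibling trait scores.
example_output = """\
0.012 0.31 -0.45
0.184 2.10 -1.75
0.057 1.20 1.05
"""

rows = parse_ncp_output(example_output)
ranked = rank_sibships(rows)
total_ncp = sample_ncp(rows)
```

For selective genotyping one would take the top of `ranked`; `total_ncp` is the quantity from which sample power would then be derived.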


Instructions for discrete trait TDT power calculator

The main parameters that must be specified by the user are

The output gives the baseline genotypic risk r(aa) and the genotypic odds ratios for the 'Aa' and 'AA' genotypes (for rare diseases these will be very similar to the genotypic relative risks). The expected number of heterozygous (i.e. informative) parents per family is also given, along with the transmission probabilities of the two alleles to affected offspring. The deviation of these two probabilities from 50:50 is the basis for the TDT statistic.
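To show why the odds ratios track the relative risks for rare diseases, here is a minimal sketch of the baseline risk and odds ratio arithmetic. It assumes Hardy-Weinberg genotype frequencies and that the inputs are the high-risk allele frequency, the disease prevalence and the two genotypic relative risks; these inputs and all function names are assumptions for illustration, not a description of the site's internals.

```python
def genotype_risks(p, prevalence, grr_het, grr_hom):
    """Return (r_aa, r_Aa, r_AA) under Hardy-Weinberg proportions.

    p is the frequency of the high-risk allele A; grr_het and grr_hom are
    the risks of Aa and AA relative to the baseline genotype aa.
    """
    q = 1.0 - p
    f_aa, f_het, f_hom = q * q, 2.0 * p * q, p * p
    # Prevalence is the frequency-weighted average of the genotype risks.
    baseline = prevalence / (f_aa + f_het * grr_het + f_hom * grr_hom)
    r_het, r_hom = baseline * grr_het, baseline * grr_hom
    if r_het >= 1.0 or r_hom >= 1.0:
        raise ValueError("parameters imply a genotype risk of 1 or more")
    return baseline, r_het, r_hom


def odds_ratio(risk, baseline):
    """Odds of disease for a genotype relative to the baseline genotype."""
    return (risk / (1.0 - risk)) / (baseline / (1.0 - baseline))


r_aa, r_Aa, r_AA = genotype_risks(p=0.2, prevalence=0.01,
                                  grr_het=1.5, grr_hom=2.25)
or_Aa = odds_ratio(r_Aa, r_aa)
or_AA = odds_ratio(r_AA, r_aa)
```

With a prevalence of 1%, the odds ratios come out only slightly above the relative risks of 1.5 and 2.25; the gap widens as the disease becomes more common.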

Power is given for various values of alpha for the user-specified sample size. Also, required sample size is given for various levels of alpha for the user-specified power.

Instructions for discrete trait Case-Control power calculator

See the instructions above for discrete TDT for a description of most of the model parameters. One difference is that this procedure is for a marker B in linkage disequilibrium with the test locus A. To specify power at the test locus, set the LD measure (d-prime) to 1 and the allele frequencies of A and B equal.

As well as specifying the number of affected individuals (cases) the user must specify the control:case ratio. If this equals 1, then there are as many controls as cases. If this were 0.5, and there were 200 cases, there would be 100 controls, etc.

The output also gives the expected allele and genotype frequencies for cases and controls. A chi-squared test statistic (and associated power at alpha=0.05) is given for a test of Hardy-Weinberg equilibrium in cases and controls (the presence of H-W disequilibrium in cases but not controls can be indicative of an association).
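For readers unfamiliar with the Hardy-Weinberg test, the standard Pearson goodness-of-fit form applied to observed genotype counts can be sketched as below. This is the textbook 1-df test, shown only as an illustration; the site reports the statistic and power based on expected frequencies, and its exact computation is not documented here. The function names are hypothetical.

```python
import math


def hwe_chi_squared(n_AA, n_Aa, n_aa):
    """Pearson chi-squared statistic for departure from Hardy-Weinberg."""
    n = n_AA + n_Aa + n_aa
    p = (2 * n_AA + n_Aa) / (2 * n)  # estimated frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        raise ValueError("locus is monomorphic; the test is undefined")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_AA, n_Aa, n_aa)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi_squared_1df_p_value(statistic):
    """Upper-tail probability of a 1-df chi-squared variable."""
    return math.erfc(math.sqrt(statistic / 2.0))


stat_equilibrium = hwe_chi_squared(25, 50, 25)  # counts in exact HWE
stat_deficit = hwe_chi_squared(30, 40, 30)      # heterozygote deficit
p_deficit = chi_squared_1df_p_value(stat_deficit)
```

Running the same function separately on case counts and control counts would show the pattern described above: disequilibrium in cases but not in controls.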

Additionally, the haplotype frequencies and implied r-squared (measure of LD) are displayed, along with the D-prime measure input by the user.
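The relationship between D-prime, the haplotype frequencies and r-squared can be sketched with the usual two-locus definitions. The sketch assumes positive LD between allele A and allele B (so that D-prime scales the maximum positive D); the function names are hypothetical.

```python
def haplotype_frequencies(p_a, p_b, d_prime):
    """Return (haplotype frequency dict, D) for two biallelic loci.

    p_a and p_b are the frequencies of alleles A and B; d_prime is the
    proportion of the maximum possible positive LD that is present.
    """
    d_max = min(p_a * (1.0 - p_b), (1.0 - p_a) * p_b)
    d = d_prime * d_max
    haplotypes = {
        "AB": p_a * p_b + d,
        "Ab": p_a * (1.0 - p_b) - d,
        "aB": (1.0 - p_a) * p_b - d,
        "ab": (1.0 - p_a) * (1.0 - p_b) + d,
    }
    return haplotypes, d


def r_squared(p_a, p_b, d):
    """Squared correlation between the alleles at the two loci."""
    return d * d / (p_a * (1.0 - p_a) * p_b * (1.0 - p_b))


# Equal allele frequencies and D-prime = 1: marker and test locus coincide.
haps_same, d_same = haplotype_frequencies(0.2, 0.2, 1.0)
r2_same = r_squared(0.2, 0.2, d_same)

# Unequal frequencies: D-prime is still 1 but r-squared is well below 1.
haps_diff, d_diff = haplotype_frequencies(0.2, 0.5, 1.0)
r2_diff = r_squared(0.2, 0.5, d_diff)
```

The first case gives an r-squared of 1, which is why setting D-prime to 1 with equal allele frequencies corresponds to testing the causal locus directly. The second case shows that D-prime of 1 alone does not imply this.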

The rest of the output is as described above - note that the number of cases for a specific power refers to the number of affected individuals along with the appropriate number of controls as specified by the control:case ratio.

For this module, we also consider the performance of genotypic tests that assume either a recessive, dominant or general (2 df) model, as well as the standard allelic test (which is printed last). For most purposes, only this last allelic test (the label is highlighted in red) will be of interest.


Instructions for quantitative trait TDT power calculator

The specification of parameters is somewhat different in the case of quantitative traits. Firstly, effects are expressed in terms of variance components rather than genotypic relative risks. Secondly, it is possible to consider the scenario where the test locus is a marker in LD with the true QTL.

The total QTL variance should range between 0 and 1. For example, 0.05 represents a QTL that accounts for 5% of the phenotypic variance. For no dominance, set the dominance:additive QTL effects ratio to 0; for complete dominance, set it to 1. As well as the allele frequency for the biallelic QTL, one must specify the allele frequency for the biallelic marker and D-prime (the proportion of possible LD present) between marker and QTL.
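One way to see what the dominance:additive ratio implies is to split the total QTL variance into additive and dominance components under the standard biometrical model for a biallelic locus (genotypic values +a, d, -a). The sketch below assumes the ratio corresponds to d/a in that model, which is an assumption about the site's parameterization; the function name is hypothetical.

```python
def qtl_variance_components(total_qtl_variance, dominance_ratio, p):
    """Split total QTL variance into (additive, dominance) parts.

    p is the frequency of the increaser allele; dominance_ratio is d/a
    (0 for no dominance, 1 for complete dominance).
    """
    if not 0.0 < p < 1.0:
        raise ValueError("allele frequency must lie strictly between 0 and 1")
    q = 1.0 - p
    # Variances per unit a^2 under random mating:
    #   V_A = 2pq [a + d(q - p)]^2,  V_D = (2pq d)^2
    additive_unit = 2.0 * p * q * (1.0 + dominance_ratio * (q - p)) ** 2
    dominance_unit = (2.0 * p * q * dominance_ratio) ** 2
    a_squared = total_qtl_variance / (additive_unit + dominance_unit)
    return a_squared * additive_unit, a_squared * dominance_unit


va_none, vd_none = qtl_variance_components(0.05, 0.0, 0.3)
va_full, vd_full = qtl_variance_components(0.05, 1.0, 0.5)
```

With a ratio of 0 all of the QTL variance is additive. With complete dominance and an allele frequency of 0.5, two thirds of the QTL variance is additive and one third is dominance variance.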

If the test locus were the QTL, equate p and m1 and set d-prime to 1.

The case threshold refers to the value on a standard normal scale above which offspring are ascertained as cases. For example, if only individuals who score more than 2 standard deviations above the mean (assuming a normally distributed trait) are ascertained, then the case threshold equals 2. Always use the absolute threshold value, even if cases are defined as scoring less than a negative threshold.

The recombination fraction (0 for complete linkage to 0.5 for no linkage) is always required. Typically, this would be set near 0.


Instructions for quantitative trait Case-Control power calculator

See the quantitative TDT notes above for a description of the main parameters. The number of cases and the control:case ratio (see notes above) specify the sample size. The thresholds specify approximately where on a standard normal scale the cases and controls are sampled from. If cases are selected as scoring more than 2 standard deviations above the mean, the lower case threshold would be 2 and the upper case threshold could be set to something like 5 or 6 (setting it even higher would not influence the results, as individuals scoring that high would not usually be expected). If controls were sampled as being 'average', excluding extreme scorers, one might specify -1 and +1 as the lower and upper control thresholds. If as many controls as cases were sampled, the control:case ratio would be set to 1.
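The effect of these thresholds can be checked with the standard normal distribution function. The sketch below computes the proportion of the population falling between a lower and an upper threshold, assuming a standard normal trait; the function name is hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, sd 1


def proportion_between(lower, upper):
    """Proportion of a standard normal population scoring in (lower, upper)."""
    if lower >= upper:
        raise ValueError("lower threshold must be below upper threshold")
    return STD_NORMAL.cdf(upper) - STD_NORMAL.cdf(lower)


cases_upper_5 = proportion_between(2.0, 5.0)
cases_upper_6 = proportion_between(2.0, 6.0)
controls_average = proportion_between(-1.0, 1.0)
```

Cases selected above 2 standard deviations make up roughly 2.3% of the population whether the upper threshold is 5 or 6, which is why raising it further does not change the results. Controls sampled between -1 and +1 cover about 68% of the population.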

Various misc. utilities

The following utilities are undocumented and no longer supported:

Variance Components - Relative Risk Conversion

Two locus QTL linkage (means)

Two locus QTL linkage (effects)

Epistasis: genotypic means -> effects