Linkage & Linkage Disequilibrium

S. Purcell

Objective

This module aims to demonstrate the related phenomena of linkage and linkage disequilibrium (allelic association) between two loci.

Tutorial

In the module, an individual is represented as follows:

The circles represent the disease locus.

The squares represent the marker locus.

The left symbols represent the maternally inherited haplotype.

The right symbols represent the paternally inherited haplotype.

Each locus has two alleles: a red allele and a white allele.

In a more traditional notation, we would represent this as:

We specify the recombination fraction between the disease locus and the marker locus. This is the probability that a cross-over occurs between the two loci during meiosis.

If there is no recombination then the parent would transmit either:

(A) both their maternally inherited disease allele and their maternally inherited marker allele
(B) both their paternally inherited disease allele and their paternally inherited marker allele

If there is recombination then the parent would transmit either:

(A) their maternally inherited disease allele and their paternally inherited marker allele
(B) their paternally inherited disease allele and their maternally inherited marker allele

Recombination will occur with probability theta in any one meiosis.

Whether or not recombination has occurred, scenarios (A) and (B) are equally likely (50:50).
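The transmission rules above can be sketched in a few lines of Python. This is only an illustration, not the module's own code: the function name transmit, the tuple layout (disease allele, marker allele) and the rng argument are assumptions made for the example.

```python
import random

def transmit(maternal, paternal, theta, rng=random):
    """Simulate one meiosis in a parent.

    maternal, paternal: the parent's two haplotypes, each a
        (disease_allele, marker_allele) tuple (layout assumed for this sketch).
    theta: recombination fraction between the two loci.
    Returns (transmitted_haplotype, recombinant_flag).
    """
    # Recombination occurs with probability theta.
    recombinant = rng.random() < theta
    # Scenario (A) or (B) is chosen with equal probability.
    if rng.random() < 0.5:
        first, second = maternal, paternal   # scenario (A)
    else:
        first, second = paternal, maternal   # scenario (B)
    disease = first[0]
    # Without recombination the marker allele comes from the same haplotype;
    # with recombination it comes from the other one.
    marker = second[1] if recombinant else first[1]
    return (disease, marker), recombinant

haplotype, recombinant = transmit(("red", "red"), ("white", "white"), theta=0.1)
print(haplotype, recombinant)
```

With theta set to 0 the parent can only pass on one of her two original haplotypes; with theta set to 1 every transmitted haplotype is a recombinant one.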

Simulation

In the simulation we take one individual who has inherited:

a red-red disease-marker haplotype maternally
a white-white disease-marker haplotype paternally

The aim of the simulation is to track the transmission of these alleles through five generations of offspring.

We assume that every individual has two female offspring.

We only represent the females in this simulation. That is, we do not show the father's genotypes.

Recall that the left-hand symbols represent the maternally inherited haplotype. Whether or not we call an offspring a recombinant is therefore determined only by maternal recombination (in reality, we would also be interested in recombination during paternal meiosis).

The figure below represents a parent and her two offspring:

In this case, the parent has inherited both the red allele at the disease locus and the red allele at the marker locus from her mother (they are on the left-hand side).

Her first offspring has received the red allele at the disease locus but not the red allele at the marker locus, so we can infer that a recombination event occurred during meiosis. We represent this by placing a circle around the child.

The second offspring has inherited the red allele at both the disease locus and the marker, and so is not a recombinant.

In both cases, the offspring have of course also received a haplotype from their fathers. These are the right-hand symbols (note: as mentioned, the fathers are not represented in this simulation).

In this simple simulation, we assume that the red and white alleles each have a frequency of 50% at both the marker and the disease locus. We also assume that fathers are equally likely to transmit any of the four disease-marker haplotypes.
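The set-up described in this section can be sketched as a small Python program. It is a rough approximation of the module rather than its actual implementation: the names simulate_pedigree and maternal_gamete, the dictionary layout and the coding 1 = red, 0 = white are all assumptions made for the example.

```python
import random

RED, WHITE = 1, 0  # allele coding assumed for this sketch

def maternal_gamete(mother, theta, rng):
    """Return (haplotype, recombinant_flag) for one maternal meiosis."""
    first, second = mother["mat"], mother["pat"]
    if rng.random() < 0.5:          # scenario (A) or (B), 50:50
        first, second = second, first
    recombinant = rng.random() < theta
    haplotype = (first[0], second[1]) if recombinant else first
    return haplotype, recombinant

def simulate_pedigree(theta, generations=5, seed=None):
    """Track haplotypes (disease, marker) through the female line only."""
    rng = random.Random(seed)
    # Founder: red-red haplotype maternally, white-white paternally.
    founder = {"mat": (RED, RED), "pat": (WHITE, WHITE), "recombinant": False}
    pedigree = [[founder]]
    for _ in range(generations):
        daughters = []
        for mother in pedigree[-1]:
            for _ in range(2):      # every individual has two female offspring
                hap, rec = maternal_gamete(mother, theta, rng)
                # Fathers transmit any of the four haplotypes with equal chance.
                father_hap = (rng.choice((RED, WHITE)), rng.choice((RED, WHITE)))
                daughters.append({"mat": hap, "pat": father_hap, "recombinant": rec})
        pedigree.append(daughters)
    return pedigree

pedigree = simulate_pedigree(theta=0.1, seed=1)
print(len(pedigree[-1]))  # 32 individuals in the fifth generation
```

Note that the recombinant flag here records whether a cross-over truly occurred. When a mother carries the same allele on both haplotypes at either locus, such an event could not be inferred from the symbols alone.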

Linkage & Linkage Disequilibrium

Linkage

If we were able to observe this entire pedigree and track the transmission of disease and marker alleles with 100% accuracy, the recombination fraction could be estimated simply as the proportion of recombinants among all offspring (recombinants plus non-recombinants).
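As a small illustration of this estimate, the sketch below counts recombinants among a set of observed meioses. The function name estimate_theta and the simulated flags are assumptions made for the example; the figure of 62 meioses is simply 2 + 4 + 8 + 16 + 32, the number of offspring over five generations when every individual has two daughters.

```python
import random

def estimate_theta(recombinant_flags):
    """Estimate the recombination fraction as recombinants / all meioses."""
    flags = list(recombinant_flags)
    if not flags:
        raise ValueError("need at least one observed meiosis")
    return sum(flags) / len(flags)

# Hypothetical data: 62 meioses scored with perfect accuracy,
# each recombinant with an assumed true probability of 0.1.
rng = random.Random(0)
true_theta = 0.1
flags = [rng.random() < true_theta for _ in range(62)]
print(estimate_theta(flags))
```

With so few meioses the estimate can differ noticeably from the true value, which is one reason why real linkage studies need many informative meioses.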

In general, linkage arises solely from the genetic distance between two loci, and is estimated by inferring the amount of recombination between two loci.

As such, it is not a function of the frequencies of any specific alleles. Indeed, even for regions of the genome without any variation, there will still be linkage resulting from the relationship between genetic distance and recombination. The only difference is that we would never be able to detect it (for it would be impossible to assign a parental origin to any allele).

Linkage Disequilibrium

In contrast, linkage disequilibrium describes a correlation between specific alleles at two loci in a population sample.

This can arise because apparently unrelated individuals are in fact likely to be distantly related. Two loci that are very tightly linked, and therefore unlikely to be separated by recombination, may well be transmitted together from distant ancestors to apparently unrelated individuals. This will induce a correlation between alleles at the two loci.

Hopefully, the simulation will demonstrate this phenomenon.

Unlike linkage, linkage disequilibrium also depends on other factors such as allele frequency and how old the polymorphism is in terms of generations.
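To give a feel for the dependence on the number of generations, the sketch below uses the standard textbook result that, under random mating, the disequilibrium between two loci is expected to shrink by a factor of (1 - theta) each generation. This formula is not stated in the module itself and is brought in here as an assumption; the function name ld_remaining is likewise hypothetical.

```python
def ld_remaining(theta, generations):
    """Expected fraction of the initial disequilibrium left after a number
    of generations, assuming random mating: (1 - theta) ** generations."""
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta should lie between 0 and 0.5")
    return (1.0 - theta) ** generations

# Compare tightly linked, loosely linked and unlinked loci after 5 generations.
for theta in (0.001, 0.01, 0.1, 0.5):
    print(theta, round(ld_remaining(theta, 5), 4))
```

Only for very small theta does most of the original correlation survive many generations, which is why disequilibrium is informative about tight linkage in particular.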

Linkage disequilibrium (i.e. correlation between alleles at different loci) can, however, occur for reasons other than tight linkage. For instance, ethnic stratification can easily cause correlations between alleles at different loci. Imagine we observed the following allele counts at two loci in samples of Europeans and Asians:


200 European individuals (i.e. 400 alleles)
           |          Locus A
   Locus B |         A          a |     Total
-----------+----------------------+----------
         B |       160        160 |       320 
         b |        40         40 |        80 
-----------+----------------------+----------
     Total |       200        200 |       400 
          Pearson chi2(1) =   0.0000   Pr = 1.000


200 Asian individuals (i.e. 400 alleles)
           |          Locus A
   Locus B |         A          a |     Total
-----------+----------------------+----------
         B |       160         40 |       200 
         b |       160         40 |       200 
-----------+----------------------+----------
     Total |       320         80 |       400 
          Pearson chi2(1) =   0.0000   Pr = 1.000

If we analyse the two samples together, however, we find evidence of an association:


400 European and Asian individuals (i.e. 800 alleles)
           |          Locus A
   Locus B |         A          a |     Total
-----------+----------------------+----------
         B |       320        200 |       520 
         b |       200         80 |       280 
-----------+----------------------+----------
     Total |       520        280 |       800 
          Pearson chi2(1) =   7.8251   Pr = 0.005

The measure of linkage disequilibrium we use in this demonstration is the simple chi-squared for a 2-by-2 table.
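The chi-squared values in the three tables above can be reproduced with a short Python function. This is a minimal sketch using the usual shortcut formula for a 2-by-2 table without continuity correction; the function name chi2_2x2 is an assumption, and the p-value uses the fact that the upper tail of a chi-squared distribution with 1 degree of freedom equals erfc(sqrt(x / 2)).

```python
import math

def chi2_2x2(a, b, c, d):
    """Pearson chi-squared (1 df, no continuity correction) and p-value
    for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / denom
    p = math.erfc(math.sqrt(chi2 / 2))  # upper tail, 1 degree of freedom
    return chi2, p

european = chi2_2x2(160, 160, 40, 40)   # no association within the sample
asian = chi2_2x2(160, 40, 160, 40)      # no association within the sample
combined = chi2_2x2(320, 200, 200, 80)  # association appears when pooled
print(round(combined[0], 4), round(combined[1], 3))  # prints 7.8251 0.005
```

Each sample on its own gives a chi-squared of exactly zero, yet pooling them yields a clearly non-zero value, because the two samples differ in allele frequency at both loci.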

Summary

Linkage and linkage disequilibrium are related phenomena
Linkage mapping is based on inferring recombination in observed meioses
Linkage disequilibrium is based on the cumulative effect of recombination (or lack of it) in the unobserved meioses that track back to common ancestors
Linkage disequilibrium requires very tight linkage to be preserved over a number of generations
In the module, we nonetheless see that just one individual's haplotypes can induce a correlation in 32 fifth-generation relatives

Site created S.Purcell, last updated 7.10.2000