This section gives a rough flow chart of the main operations in PLINK. It is designed to show the order in which operations are performed (e.g. whether SNPs are excluded before or after files are merged) and the points at which PLINK halts (e.g. after certain commands), which means that some combinations of commands are not feasible.

Most of these steps are optional (i.e. they occur only if the corresponding command has been issued on the command line).

Parse command line for commands and options
Check version, unused options, warnings
Define chromosome set (human, or --mouse, --rice, etc)
Run ID-helper utility (--id-dict and --id-match), then QUIT
Run SNP-annotation (--lookup and --lookup-gene), then QUIT
Run compression/decompression utility (--compress and --decompress), then QUIT
Read input, either:
- Dummy dataset (--dummy), or
- Simulated dataset (--simulate), or
- Result files for meta-analysis (--meta-analysis), or
- Result files for gene-based report (--gene-report), or
- Result and annotation files (--annotate), or
- Maps for CNVs (--cfile, --cnv-list), or
- Binary fileset (--bfile), or
- PED fileset (--file), or
- LGEN fileset (--lfile), or
- Transposed fileset (--tfile), or
- Maps for generic variants (--gfile), or
- Map and dosage files (--dosage)
For commands not involving basic SNP or CNV data directly (e.g. --meta-analysis, --annotate, --dosage, --gene-report, etc), call the corresponding function directly, then QUIT
At this stage, the following filters apply directly when loading (Note: some other filters not mentioned below are done later, e.g. --snps, --extract, --remove, --filter-males):
- --chr
- --snp, --window
- --from, --to
- --from-kb, --to-kb, etc
Check for duplicate individual or SNP names
Merge one or more filesets (--merge, --bmerge, --merge-list)
Swap in alternate phenotype file (--pheno), or make a new phenotype (--make-pheno)
Remove individuals with missing phenotypes (--prune)
Update SNP information (--update-map)
Update FAM information (--update-ids, --update-sex, ...)
Update allele information (--update-alleles)
Flip strand (--flip)
Recode alleles 1234/ACGT (--alleleACGT, --allele1234)
Either, if (--exclude-before-extract), then
- extract any SNPs (--extract)
- then exclude any SNPs (--exclude)
otherwise
- exclude any SNPs (--exclude)
- then extract any SNPs (--extract)
Either, if (--keep-before-remove), then
- keep any individuals (--keep)
- then remove any individuals (--remove)
otherwise
- remove any individuals (--remove)
- then keep any individuals (--keep)
Filter SNPs based on attributes (--attrib)
Filter individuals based on attributes (--attrib-indiv)
Filter SNPs based on quality scores (--qual-scores)
Filter genotypes based on quality scores (--qual-geno-scores)
Random thinning of SNPs (--thin)
Read --genome-lists
Read list of obligatory missing genotypes (--oblig-missing)
Filter based on a variable (--filter)
Filter based on sex, phenotype, etc (--filter-males, --filter-cases, ...)
Read covariate file (--covar)
Read cluster file (--within)
Zero-out specific genotypes (--zero-cluster)
Process rare CNV data
- Read CNV list, map to genomic positions
- Filter on genes, sizes, types, etc (--cnv-intersect, --cnv-del, --cnv-kb, etc)
- Write back any genes, regions intersected (--cnv-report-regions)
- Filter CNVs based on frequency (--cnv-freq-exclude-above, etc)
- Report basic count of CNVs in LOG file
- Write a new CNV list, map file (--cnv-write, --cnv-make-map)
- Calculate per-individual CNV summary statistics
- Calculate per-position CNV summaries
- Make summary displays (--cnv-track, --cnv-seglist)
- Find overlapping CNVs as pools (--segment-group)
- Perform association / genome-wide burden test (--mperm, --cnv-indiv-perm)
- QUIT
Process generic variant data (--gfile)
- Read GVAR data (might be on top of existing, standard file)
- Calculate frequency statistics for each allele, CNP state
- Perform linear/logistic regression of phenotype on CNP states
- QUIT
Main SNP filters
- Count founders and nonfounders
- Calculate per-individual genotyping rate, remove individuals below threshold (--missing, --mind)
- Calculate allele frequencies, or read them from file (--read-freq)
- Determine per SNP missing genotype rate, after removing individuals, exclude below threshold (--geno)
- Determine minor (reference) allele
- List heterozygous haploid genotypes found, by default set to missing
- List SNPs with no founder genotypes observed
- Write allele frequencies to file (--freq)
- Calculate HWE statistics per SNP (--hardy, --hwe); after --hardy, then QUIT
- Report genotyping rate per SNP and per individual as calculated above (--missing)
- Remove SNPs below the MAF filter (--maf)
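
The ordering within the main SNP filters matters: per-SNP missingness is computed only after low-genotyping individuals have been removed. The following is a minimal Python sketch of that point on a toy dataset, not PLINK's actual implementation. The genotype table, the function names and the threshold values are all hypothetical, and the thresholds are treated as maximum allowed missing rates.

```python
# Toy genotype table (hypothetical data): one row per individual, one column
# per SNP; None marks a missing genotype call.
SNPS = ["snp1", "snp2", "snp3", "snp4"]
GENOTYPES = {
    "indA": [0, 1, 2, 1],
    "indB": [1, 1, 0, 2],
    "indC": [2, 0, 1, 1],
    "indD": [None, None, None, 1],
}


def missing_rate(calls):
    """Fraction of calls that are missing."""
    return sum(c is None for c in calls) / len(calls)


def filter_individuals(genotypes, mind):
    """Keep individuals whose missing rate does not exceed `mind` (assumed semantics)."""
    return {ind: calls for ind, calls in genotypes.items()
            if missing_rate(calls) <= mind}


def filter_snps(genotypes, snps, geno):
    """Keep SNPs whose missing rate, among the individuals given, does not exceed `geno`."""
    kept = []
    for j, snp in enumerate(snps):
        column = [calls[j] for calls in genotypes.values()]
        if missing_rate(column) <= geno:
            kept.append(snp)
    return kept


# Chart order: drop individuals first, then assess SNPs on who is left.
individuals_first = filter_snps(filter_individuals(GENOTYPES, mind=0.5), SNPS, geno=0.2)

# For contrast: assess SNPs on everyone, including the poorly genotyped indD.
snps_on_everyone = filter_snps(GENOTYPES, SNPS, geno=0.2)

print(individuals_first)
print(snps_on_everyone)
```

In this toy case, removing indD first lets all four SNPs pass the per-SNP threshold, whereas assessing SNPs on all four individuals would discard snp1 to snp3.
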
Re-report basic case/control counts to LOG
Re-specify reference alleles (--reference-allele)
Make family units, if needed; perform Mendel checks (--mendel, --me, --tdt, etc)
Reset pat and mat codes of non-founders if parents not present (--make-founders)
Perform sex-check (--check-sex)
Create pseudo case/control units from trio data (--tucc)
Write permuted phenotype file (--make-perm-pheno), then QUIT
Write table of SNPs/set scoring (--set-table), then QUIT
Write covariate file (--write-covar), then QUIT
Write cluster file (--write-cluster), then QUIT
Write snplist file (--write-snplist), then QUIT
Write binary fileset file (--make-bed), then QUIT
Write other file formats for genotype data (--recode, --recodeA, --list, --two-locus, etc), then QUIT
Create and output a SET file given ranges (--make-set), then QUIT
LD-based clumping of association results (--clump), then QUIT
Generate lists of SNPs tagging other SNPs (--show-tags), then QUIT
Generate haplotype blocks (--blocks), then QUIT
Determine if conditioning SNPs used (--condition)
Perform IBS, cluster analysis and MDS analysis (--cluster, --mds-plot, --neighbour), then QUIT
Test for differences in IBS between groups (--ibs-test), then QUIT
Calculate genome-wide IBS and IBD (--genome), then QUIT
Calculate F inbreeding statistic (--het)
Calculate runs of homozygosity (--homozyg), then QUIT
Perform LD-based pruning of SNPs (--indep, --indep-pairwise), then QUIT
Perform LD-based scan for strand flips (--flipscan), then QUIT
Calculate and display pairwise LD (--r2, --ld), then QUIT
General haplotype estimation (association, phase reports, frequencies) (--hap)
- Phasing
- Report haplotype frequencies
- Report haplotype phases
- Perform mis-hap test for non-random missingness
- Proxy association and imputation
- QUIT
SNP-by-SNP epistasis tests (--epistasis), then QUIT
Score per-individual risk profiles (--score), then QUIT
Run R-plugin on dataset (--R), then QUIT
For main association tests, loop over all phenotypes (--all-pheno)
- Perform association test (--mh, --model, --assoc, --fisher, --linear, --logistic, --homog, --qfam, --tdt, --poo, --dfam, --gxe, etc)
- Perform haplotype association test (--hap-assoc, --hap-tdt)
- Perform conditional haplotype test (--chap), then QUIT
- Perform --test-missing
- If specified, repeat the above tests with permuted datasets
- Go to next phenotype
Perform PLINK segmental sharing test
Definitely QUIT
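
The practical consequence of the "then QUIT" steps above is that, when several commands are combined in one run, nothing after the first halting command is reached. The following is a minimal Python sketch of that reading of the chart. It models the chart as written here, not PLINK's source code; the `FLOW` table holds only a handful of steps copied from the chart, and the function name `steps_run` is hypothetical.

```python
# A few steps copied from the flow chart, in chart order.
# Each entry is (command, halts_afterwards).
FLOW = [
    ("--freq", False),
    ("--hardy", True),
    ("--check-sex", False),
    ("--make-bed", True),
    ("--genome", True),
    ("--het", False),
    ("--r2", True),
    ("--assoc", False),
]


def steps_run(requested):
    """Return the requested commands in chart order, stopping after the first that halts."""
    executed = []
    for command, halts in FLOW:
        if command not in requested:
            continue  # optional step that was not asked for
        executed.append(command)
        if halts:
            break  # "then QUIT": later steps are never reached
    return executed


print(steps_run({"--assoc", "--make-bed", "--freq"}))
print(steps_run({"--het", "--assoc"}))
```

Under this model, combining --make-bed with --assoc never reaches the association step, because the chart lists the fileset write (and its QUIT) earlier, whereas --het and --assoc can run together because --het does not halt.
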

This document last modified Wednesday, 25-Jan-2017 11:39:26 EST