Gene reporting tool

The functions listed here are designed to provide a quick and easy way
to partition, gene by gene, any PLINK results file that indexes SNPs by
chromosome and base-pair position.

Basic usage

The basic command to produce a gene-centric report of single SNP results, for example from run1.assoc, is
 ./plink --gene-report run1.assoc --gene-list glist-hg18
which assumes that the file run1.assoc has a standard header row containing the fields 
CHR and BP, as it will if it was created by the PLINK --assoc command. 
The original genotype filesets do not need to be present when running this command.
The gene list, glist-hg18, should be a standard text file in the
following format: one row per gene, giving the chromosome, the start and stop
positions (base-pair) and then the gene name, e.g.
     7 20140803 20223538 7A5
     19 63549983 63556677 A1BG
     10 52236330 52315441 A1CF
     8 43266741 43337485 A26A1
     15 19305252 19336667 A26B1
     21 13904368 13935777 A26B3
     ...
These files are available for download from the resources section of this web-site.
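As an illustration of the gene-list format described above, here is a minimal Python sketch that reads such rows into records. It is not part of PLINK: the names Gene and parse_gene_list are hypothetical, and keeping the chromosome as text and skipping incomplete rows are assumptions made for the sketch.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    chrom: str   # kept as text (an assumption of this sketch)
    start: int   # start position in base pairs
    stop: int    # stop position in base pairs
    name: str


def parse_gene_list(lines):
    """Parse whitespace-delimited rows: chromosome, start, stop, gene name."""
    genes = []
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or incomplete rows (assumption)
        chrom, start, stop, name = fields[:4]
        genes.append(Gene(chrom, int(start), int(stop), name))
    return genes


# Rows copied from the example gene list above.
sample = """\
7 20140803 20223538 7A5
19 63549983 63556677 A1BG
10 52236330 52315441 A1CF
"""

genes = parse_gene_list(sample.splitlines())
for g in genes:
    print(g.name, g.chrom, g.start, g.stop)
```

A check of this kind can be useful before running a report, since a malformed row in the gene list is easier to spot here than in the report output.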
The --gene-report command generates a file
      plink.range.report
which simply takes the lines of the results file, and lists them by the genes specified in the gene-list file. The listing is
alphabetical by gene name. For example,
     ACO2 -- chr22:40195074..40254939 ( 59.865kb ) 
           DIST  CHR        SNP         BP   A1      F_A      F_U   A2        CHISQ            P           OR 
        13.22kb   22  rs2267435   40208294    3   0.3958   0.3537    1       0.3351       0.5627        1.197 
        24.84kb   22  rs2076196   40219909    1   0.3333   0.2683    3       0.8852       0.3468        1.364 
        57.13kb   22  rs1810460   40252200    4  0.04167  0.07317    2       0.8278       0.3629       0.5507 
     ADORA2A -- chr22:23153529..23168325 ( 14.796kb ) 
           DIST  CHR        SNP         BP   A1      F_A      F_U   A2        CHISQ            P           OR 
        11.14kb   22  rs5760423   23164672    4   0.4592   0.4024    3       0.5854       0.4442        1.261 
etc, which shows the lines of run1.assoc split by the genes the SNPs fall in. In this case, the first 
gene is ACO2; the location based on glist-hg18 is specified, along with the length. Then the 
SNPs within this gene are listed.  If genes overlap, then the SNPs will be listed more than once. If a SNP does 
not fall within any gene or region specified, then it will not be listed here. 
An extra first field, DIST, is added to each row; it gives the distance of the SNP from the start position of the gene. (Note: if
a border is added with --gene-list-border, see below, then DIST can be negative, indicating 
that the SNP lies before the actual start of the gene.)
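The grouping and the DIST field described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not PLINK's implementation: the function gene_report is hypothetical, interval ends are treated as inclusive, and genes containing no SNPs are left out. HYPO_REGION and rs_outside are invented entries, added to show an overlapping region and a SNP that falls in no gene; the other values come from the example report.

```python
def gene_report(genes, snps):
    """Group SNPs by gene and compute DIST in kb from the gene start.

    genes: list of (chrom, start, stop, name)
    snps:  list of (chrom, snp_id, bp)
    """
    report = {}
    # List genes alphabetically by name, as in the report.
    for chrom, start, stop, name in sorted(genes, key=lambda g: g[3]):
        hits = [(snp_id, (bp - start) / 1000)
                for snp_chrom, snp_id, bp in snps
                if snp_chrom == chrom and start <= bp <= stop]  # inclusive ends assumed
        if hits:  # omitting empty genes is an assumption of this sketch
            report[name] = hits
    return report


genes = [
    ("22", 40195074, 40254939, "ACO2"),
    ("22", 23153529, 23168325, "ADORA2A"),
    ("22", 40200000, 40210000, "HYPO_REGION"),  # hypothetical, overlaps ACO2
]
snps = [
    ("22", "rs2267435", 40208294),
    ("22", "rs2076196", 40219909),
    ("22", "rs1810460", 40252200),
    ("22", "rs5760423", 23164672),
    ("22", "rs_outside", 30000000),  # hypothetical, falls in no listed region
]

report = gene_report(genes, snps)
for name, hits in report.items():
    print(name)
    for snp_id, dist in hits:
        print(f"  {snp_id}  {dist}kb")
```

Here rs2267435 appears under both ACO2 and the overlapping hypothetical region, while rs_outside is not listed at all, matching the behaviour described in the text.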
Naturally, the regions listed in the --gene-list file do not have to correspond to actual genes -- for 
example, they might correspond to known linkage peaks, or regions with disease-related copy number variants, etc.

Other options

The following options modify this procedure:
     --pfilter 0.01
will list only SNPs with p-values less than 0.01. This requires that
the results file has a field labelled P in the header row.
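The p-value filter can be pictured as follows. This is a hedged Python sketch of the idea rather than PLINK's own code: the function pfilter is hypothetical, the results are assumed to be whitespace-delimited with a header row, and rows whose P entry is not a number are simply skipped (an assumption). The rows are taken from the example output above, and the threshold of 0.35 is purely illustrative.

```python
def pfilter(lines, threshold):
    """Keep rows of a results table whose P column is below threshold."""
    header = lines[0].split()
    if "P" not in header:
        raise ValueError("results file needs a field labelled P in the header row")
    p_col = header.index("P")
    kept = []
    for line in lines[1:]:
        fields = line.split()
        if len(fields) <= p_col:
            continue  # blank or short row
        try:
            p = float(fields[p_col])
        except ValueError:
            continue  # non-numeric entry; skipping it is an assumption
        if p < threshold:
            kept.append(fields)
    return header, kept


results = """\
 CHR        SNP         BP   A1      F_A      F_U   A2        CHISQ            P           OR
  22  rs2267435   40208294    3   0.3958   0.3537    1       0.3351       0.5627        1.197
  22  rs2076196   40219909    1   0.3333   0.2683    3       0.8852       0.3468        1.364
  22  rs1810460   40252200    4  0.04167  0.07317    2       0.8278       0.3629       0.5507
""".splitlines()

header, kept = pfilter(results, 0.35)
for row in kept:
    print(row[header.index("SNP")], row[header.index("P")])
```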
The additional command
     --gene-list-border 20
will add a 20kb border to the start and stop of each gene listed in the gene file.
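To make the effect of the border concrete, the following Python sketch widens a gene interval by 20kb on each side and shows how a SNP upstream of the gene picks up a negative DIST. The helper names add_border and dist_kb are hypothetical, 1kb is taken to be 1000 base pairs, the widened start is not clamped at zero, and the SNP position is invented for the example; the ADORA2A coordinates come from the report shown earlier.

```python
def add_border(start, stop, border_kb):
    """Widen an interval by border_kb kilobases on each side (1 kb = 1000 bp assumed)."""
    pad = border_kb * 1000
    return start - pad, stop + pad


def dist_kb(bp, gene_start):
    """Distance in kb from the original (unbordered) gene start."""
    return (bp - gene_start) / 1000


start, stop = 23153529, 23168325          # ADORA2A, from the example report
wide_start, wide_stop = add_border(start, stop, 20)

snp_bp = 23143529                          # hypothetical SNP, 10kb before the gene
in_original = start <= snp_bp <= stop
in_bordered = wide_start <= snp_bp <= wide_stop

print(wide_start, wide_stop)
print(in_original, in_bordered, dist_kb(snp_bp, start))
```

The SNP is outside the gene itself but inside the bordered interval, and its distance is measured from the actual gene start, which is why the value is negative.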
The additional command
     --gene-subset candidate.list
will make a report extracting only the genes listed
in candidate.list from the file specified by 
--gene-list.  For example, if the
file candidate.list contained two schizophrenia candidate
genes,
     DISC1
     COMT
then (assuming the genes listed here match a row in the gene-list file, glist-hg18) the command
     plink --gene-report run1.assoc \
           --gene-list glist-hg18 \
           --gene-subset candidate.list \
           --pfilter 0.05 \
           --gene-list-border 50
will only report nominally significant (P < 0.05) SNPs within or near
(+/- 50kb) these two genes.  This is designed to be a more convenient
way to quickly query a focussed set of genes, so one can keep only a
single, central gene-list file.