Meta-analysis
This page describes the basic meta-analysis functions in PLINK, in
which two or more result files can be combined in fixed-effects and
random-effects meta-analysis.
Basic usage
The basic command for meta-analysis is invoked as
plink --meta-analysis study1.assoc study2.assoc study3.assoc
PLINK expects each file to be a plain-text, rectangular white-space
delimited file, with a header row. PLINK will search the header row
for the columns:
SNP    SNP identifier
OR     Odds ratio (or BETA, etc.)
SE     Standard error of OR (or user-defined weight field)
P      (Optional) p-value from test
CHR    (Optional)
BP     (Optional)
A1     (Optional)
A2     (Optional)
HINT The SE field is added to the output of the standard --assoc, --mh,
--linear and --logistic tests, etc., if the --ci 0.95 option is
specified.
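To check a results file before passing it to --meta-analysis, the header search described above can be mimicked with a short script. This is only a sketch: the function name describe_header and the grouping into required, effect and optional columns are assumptions made for illustration, not PLINK's own code.

```python
import io

# Column names taken from the list above; the grouping is an assumption.
REQUIRED = ("SNP", "SE")
EFFECT = ("OR", "BETA")  # one effect-size column is expected
OPTIONAL = ("P", "CHR", "BP", "A1", "A2")


def describe_header(handle):
    """Read the header row of a white-space delimited results file.

    Returns (found, missing): found maps column name to its 0-based
    position, missing lists the required columns that were not seen.
    """
    header = handle.readline().split()
    found = {
        name: header.index(name)
        for name in REQUIRED + EFFECT + OPTIONAL
        if name in header
    }
    missing = [name for name in REQUIRED if name not in found]
    if not any(name in found for name in EFFECT):
        missing.append("OR/BETA")
    return found, missing


# Header and first row as in the s1.assoc example on this page.
sample = io.StringIO(
    "CHR SNP BP A1 F_A F_U A2 CHISQ P OR SE L95 U95\n"
    "22 rs915677 14433758 A 0.1522 0.1842 G 0.1538 0.695 0.7949 0.5862 0.252 2.508\n"
)
found, missing = describe_header(sample)
print(found)
print(missing)
```

For a file on disk, pass an open file object instead of the in-memory sample.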
For example, suppose we have two association files from independent
studies, s1.assoc and s2.assoc, and the first few rows of s1.assoc are
as follows:
CHR SNP BP A1 F_A F_U A2 CHISQ P OR SE L95 U95
22 rs915677 14433758 A 0.1522 0.1842 G 0.1538 0.695 0.7949 0.5862 0.252 2.508
22 rs140378 15251689 G 0.02083 0.04762 C 0.4988 0.48 0.4255 1.243 0.03719 4.869
22 rs131564 15252977 C 0.1522 0.2619 G 1.625 0.2024 0.5058 0.5401 0.1755 1.458
22 rs4010550 15274688 G 0.1364 0.275 A 2.495 0.1142 0.4163 0.5642 0.1377 1.258
22 rs5747361 15365080 0 0 0 G NA NA NA NA NA NA
22 rs2379981 15405346 G 0.02083 0 A 0.8848 0.3469 NA NA NA NA
...
The command
plink --meta-analysis s1.assoc s2.assoc
gives the following output
Performing meta-analysis of 2 files
Reading results from [ s1.assoc ] with 2680 read
Reading results from [ s2.assoc ] with 2655 read
2778 unique SNPs, 2557 in two or more files
Rejected 1911 SNPs, writing details to [ plink.prob ]
Writing meta-analysis results to [ plink.meta ]
In general, SNPs across two or more files do not need to be in the
same order; also, a SNP does not need to feature in all files. By
default, meta-analysis will be reported for any SNP in two or more
files.
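The matching rule in the paragraph above (order does not matter, a SNP need not be in every file, and by default a SNP must appear in two or more files) can be sketched as follows. The function names and the toy SNP lists are hypothetical; the report_all argument mirrors the report-all option described under Misc. options.

```python
from collections import defaultdict


def collate_snps(snps_by_file):
    """Map each SNP to the set of files it appears in.

    The order of SNPs within each file is irrelevant.
    """
    seen = defaultdict(set)
    for filename, snps in snps_by_file.items():
        for snp in snps:
            seen[snp].add(filename)
    return seen


def reportable(seen, report_all=False):
    """SNPs present in at least two files (or at least one if report_all)."""
    minimum = 1 if report_all else 2
    return sorted(snp for snp, files in seen.items() if len(files) >= minimum)


# Toy input (hypothetical): the two files list SNPs in different orders.
seen = collate_snps({
    "s1.assoc": ["rs915677", "rs140378", "rs131564"],
    "s2.assoc": ["rs131564", "rs915677", "rs4010550"],
})
print(len(seen), "unique SNPs,", len(reportable(seen)), "in two or more files")
```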
In this case, a number of SNPs are reported as being rejected from
meta-analysis. The reason for this is reported in the file
plink.prob
which lists the SNP, the file and the problem code, as follows:
BAD_CHR            Invalid chromosome code
BAD_BP             Invalid base-position code
BAD_ES             Invalid effect-size (e.g. OR)
BAD_SE             Invalid standard error
MISSING_A1         Missing allele 1 label
MISSING_A2         Missing allele 2 label
ALLELE_MISMATCH    Mismatching allele codes across files
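When many SNPs are rejected, a tally of the problem codes is a quick way to see what went wrong. The sketch below does not assume a particular column order in plink.prob; it only looks for one of the codes listed above on each line. The function name and the example lines are made up for illustration.

```python
from collections import Counter

# Problem codes as listed above.
KNOWN_CODES = {
    "BAD_CHR", "BAD_BP", "BAD_ES", "BAD_SE",
    "MISSING_A1", "MISSING_A2", "ALLELE_MISMATCH",
}


def tally_problems(lines):
    """Count how often each known problem code occurs, one count per line."""
    counts = Counter()
    for line in lines:
        for token in line.split():
            if token in KNOWN_CODES:
                counts[token] += 1
                break  # at most one code is counted per line
    return counts


# Hypothetical lines; the real layout of plink.prob may differ.
example = [
    "rs0000001 s1.assoc BAD_ES",
    "rs0000002 s1.assoc BAD_SE",
    "rs0000003 s2.assoc BAD_ES",
]
print(tally_problems(example))
```

To run this on real output, pass an open file, e.g. tally_problems(open("plink.prob")).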
The main output is in the file
plink.meta
for example,
CHR BP SNP A1 A2 N P P(R) OR OR(R) Q I
22 14433758 rs915677 A G 2 0.2217 0.2217 0.5823 0.5823 0.4184 0.00
22 15252977 rs131564 C G 2 0.2608 0.2608 0.6665 0.6665 0.4924 0.00
22 15274688 rs4010550 G A 2 0.298 0.3545 0.6748 0.6673 0.2489 24.79
22 15462210 rs11089263 A C 2 0.3992 0.3992 1.3108 1.3108 0.3600 0.00
22 15462259 rs11089264 A G 2 0.4719 0.4719 1.2606 1.2606 0.4079 0.00
22 15475051 rs2154615 T C 2 0.5518 0.5518 1.2876 1.2876 0.7534 0.00
22 15476541 rs5993628 A G 2 0.8014 0.8014 1.0948 1.0948 0.3380 0.00
22 15549842 rs2845362 C G 2 0.865 0.9789 0.9399 0.9854 0.1307 56.23
which has the following fields:
CHR      Chromosome code
BP       Basepair position
SNP      SNP identifier
A1       First allele code
A2       Second allele code
N        Number of valid studies for this SNP
P        Fixed-effects meta-analysis p-value
P(R)     Random-effects meta-analysis p-value
OR       Fixed-effects OR estimate
OR(R)    Random-effects OR estimate
Q        p-value for Cochran's Q statistic
I        I^2 heterogeneity index (0-100)
The effect (OR, or BETA in the case of a quantitative trait) is with
respect to the A1 allele (i.e. an OR greater than 1 implies that A1
increases risk relative to A2).
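To make the output fields more concrete, the sketch below computes the textbook versions of these quantities: an inverse-variance fixed-effects estimate, Cochran's Q, I^2, and a DerSimonian-Laird random-effects estimate, all on the log(OR) scale. It assumes that SE is the standard error of log(OR); this is consistent with the first row of s1.assoc above, where exp(log(0.7949) ± 1.96 × 0.5862) gives approximately 0.252 and 2.508, the reported L95 and U95. Whether PLINK uses exactly these formulas is an assumption, and the function name and input values are hypothetical.

```python
import math


def meta_analyse(odds_ratios, standard_errors):
    """Textbook fixed- and random-effects summary of per-study odds ratios.

    standard_errors are assumed to be on the log(OR) scale.
    """
    betas = [math.log(o) for o in odds_ratios]
    weights = [1.0 / se ** 2 for se in standard_errors]
    sum_w = sum(weights)

    # Fixed effects: inverse-variance weighted mean of log(OR).
    beta_f = sum(w * b for w, b in zip(weights, betas)) / sum_w
    se_f = math.sqrt(1.0 / sum_w)

    # Cochran's Q statistic and the I^2 index (0-100).
    q = sum(w * (b - beta_f) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i2 = max(0.0, 100.0 * (q - df) / q) if q > 0 else 0.0

    # DerSimonian-Laird estimate of the between-study variance.
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random effects: weights include the between-study variance.
    weights_r = [1.0 / (se ** 2 + tau2) for se in standard_errors]
    sum_wr = sum(weights_r)
    beta_r = sum(w * b for w, b in zip(weights_r, betas)) / sum_wr
    se_r = math.sqrt(1.0 / sum_wr)

    def two_sided_p(z):
        # Two-sided p-value from a standard normal z-score.
        return math.erfc(abs(z) / math.sqrt(2.0))

    return {
        "N": len(betas),
        "P": two_sided_p(beta_f / se_f),
        "P(R)": two_sided_p(beta_r / se_r),
        "OR": math.exp(beta_f),
        "OR(R)": math.exp(beta_r),
        "Q_stat": q,  # the statistic itself, not its p-value
        "I": i2,
    }


# Hypothetical per-study values for a single SNP.
print(meta_analyse([0.80, 0.45], [0.59, 0.55]))
```

Note that the Q column in plink.meta is a p-value, whereas Q_stat here is the raw statistic; converting it requires a chi-square tail probability with N-1 degrees of freedom (for example scipy.stats.chi2.sf, which is not used here to keep the sketch dependency-free).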
HINT If an input file is gzip-compressed and its name ends in the .gz
extension, PLINK will automatically decompress it (if compiled with
ZLIB support).
Misc. options
A number of options can be specified after the list of result
files. As --meta-analysis takes a variable number of files as
arguments, it is necessary to explicitly indicate that additional
options are specified, by a plus sign, as follows:
plink --meta-analysis s1.assoc s2.assoc + report-all
In this example, the report-all option means that even SNPs
that are only found in a single file are reported. A full list of
options is given here:
study         Collate study-specific effect estimates in plink.meta (F0, F1, ...)
no-map        Do not look for or use CHR/BP positions (e.g. if absent from files)
no-allele     Do not look for or use A1/A2 allele codes (e.g. if absent from files)
report-all    Report for SNPs seen only in a single file
logscale      Indicates that effects are already on log-scale (e.g. beta from logistic regression)
qt            Indicates that effects are from linear regression (i.e. not OR, do not take log)
Selecting subsets of SNPs: One can use the --extract
option as well as --chr, etc, to input and perform
meta-analysis only on certain subsets of SNPs.
HINT If performing meta-analysis on a large number of
large files (e.g. 10+ files of imputed results, each with over 2
million entries), one might need to perform this one chromosome at a
time, with the --chr option, as all the result files might
not fit in memory in one go otherwise.
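A chromosome-by-chromosome run can be scripted by generating one command line per chromosome. The sketch below only builds the argument lists; it assumes that --chr and --out may follow the file list and the "+" options, that chromosomes 1 to 22 are wanted, and that an output prefix such as meta_chr is acceptable. The function name is hypothetical.

```python
def per_chromosome_commands(files, options=(), chromosomes=range(1, 23),
                            prefix="meta_chr"):
    """Build one plink argument list per chromosome (nothing is executed)."""
    commands = []
    for chrom in chromosomes:
        argv = ["plink", "--meta-analysis", *files]
        if options:
            # Meta-analysis options follow the plus sign, as described above.
            argv += ["+", *options]
        # Restrict to one chromosome and give each run its own output name.
        argv += ["--chr", str(chrom), "--out", f"{prefix}{chrom}"]
        commands.append(argv)
    return commands


for argv in per_chromosome_commands(["s1.assoc", "s2.assoc"],
                                    options=["report-all"],
                                    chromosomes=[21, 22]):
    print(" ".join(argv))
```

Each list can then be passed to subprocess.run(argv, check=True) to execute the runs one after another.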