INRICH: interval-based enrichment analysis

The following table lists the command-line parameters for INRICH analysis. Options marked with an asterisk (*) are required. When an option takes a file as its argument, make sure the input file follows the formatting requirements listed in the Format (4th) column. All input file columns are tab-separated.

Option	Option name	Default value(s)	Format	Description
-2	test-type	no designation	-2 or no designation	Default: the enrichment statistic is calculated from the number of test intervals that overlap pathway genes. With -2: the enrichment statistic is calculated from the number of pathway genes that overlap test intervals.
-a*	test-interval-file		-a file-name (columns: chr, start_bp, stop_bp)	List of associated genomic interval regions from GWAS analysis
-b	background-gene-file		-b file-name (column: gene_id)	List of genes of special interest. Pathway analysis can be restricted to the genes listed in this file. For example, users may be interested in analyzing a subset of reference genes that are expressed only in a specific tissue type.
-c	pre-compute	true	-c	Turn off the precomputing option, which searches for acceptable positions for random interval generation before permutation. Precomputing speeds up the permutation procedure at the expense of increased memory usage. We therefore suggest turning off the precompute option (by specifying -c) when more than 1000 genomic intervals are examined or when memory allocation error messages appear.
-d	match-density	0.2	-d numeric float	Allow (-d) % SNP mapping density extension for random intervals
-e	match-genes	true	-e	If used, omit the random-set generation criterion that matches the number of overlapping genes for each interval with that of the original test regions
-f	random-seed	0	-f numeric integer	Seed for random number generation. Useful for repeating an analysis and getting the same output
-g*	reference-gene-file		-g file-name (columns: chr, start_bp, stop_bp, gene_id, gene_name)	List of reference genes
-h	positional clustering		-h top-N-closest-regions	Conduct a positional clustering test on the top-N closest regions
-i, -j	target size-filter	-i 2 -j 200	-i numeric integer -j numeric integer	Restrict analysis to gene sets with at least (-i) and at most (-j) genes
-k	compact	true	-k	Turn off the compact option that excludes non-genic test intervals
-m*	reference-SNP-file		-m file-name (columns: chr, bp)	List of reference SNPs included in the GWAS analysis
-n	top-N-regions	all	-n numeric integer	Restrict analysis to the top N intervals listed in the test input file
-o	output-root	inrich	-o alphanumeric string	Output file name
-p	display-p	0.05	-p numeric float	Display pathways with p-values below this threshold in the standard output and log file
-q	num-bootstrapping	1000	-q numeric integer	Number of permutations to carry out for correcting empirical pathway-level p-values
-r	num-replicates	5000	-r numeric integer	Number of permutations to carry out for calculating empirical pathway-level p-values
-t*	geneset-file		-t file-name (columns: gene_id, gene_set_id, gene_set_description)	List of genes and gene sets
-u	human GWAS data	true	-u	Designate that the test data are from a non-human genome
-w	bp-window	0	-w numeric integer	Extend gene regions to include (-w) bp up/downstream
-x	range-file		-x file-name (columns: chr, start_bp, stop_bp)	List of genomic ranges of special interest. Pathway analysis can be restricted to a subset of reference genes based on their genomic location specified in this file.
-z	min-obs threshold	2	-z numeric integer	Restrict analysis to gene sets with at least (-z) genes overlapping test intervals
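
Because every input file must be tab-separated and follow the column layout given in the Format column, it can help to check files before starting a run. The following Python sketch is not part of INRICH; the names `EXPECTED_COLUMNS`, `check_lines`, and `check_file` are hypothetical. It assumes that files have no header line, that blank lines can be skipped, and that base-pair positions are non-negative integers. None of these assumptions is stated in the table above, so adjust them to your data.

```python
# Column layout per file option, copied from the Format column of the table.
EXPECTED_COLUMNS = {
    "-a": ["chr", "start_bp", "stop_bp"],
    "-b": ["gene_id"],
    "-g": ["chr", "start_bp", "stop_bp", "gene_id", "gene_name"],
    "-m": ["chr", "bp"],
    "-t": ["gene_id", "gene_set_id", "gene_set_description"],
    "-x": ["chr", "start_bp", "stop_bp"],
}


def check_lines(option, lines):
    """Return a list of (line_number, message) problems for one input file."""
    columns = EXPECTED_COLUMNS[option]
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.rstrip("\r\n")
        if not line:
            continue  # assumption: blank lines are harmless and skipped
        fields = line.split("\t")
        if len(fields) != len(columns):
            problems.append(
                (number, f"expected {len(columns)} tab-separated columns, "
                         f"found {len(fields)}")
            )
            continue
        for name, value in zip(columns, fields):
            # assumption: base-pair positions are non-negative integers
            if name.endswith("bp") and not value.isdigit():
                problems.append(
                    (number, f"{name} is not a non-negative integer: {value!r}")
                )
    return problems


def check_file(option, path):
    """Check the file at `path` against the layout for `option` (e.g. "-a")."""
    with open(path, encoding="utf-8") as handle:
        return check_lines(option, handle)
```

For example, `check_file("-a", "intervals.txt")` (with a hypothetical file name) returns an empty list when every non-blank line has exactly three tab-separated columns with integer start and stop positions. A space-separated line is reported as having one column instead of three.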