<tool id="hgv_gpass" name="GPASS" version="1.0.0">
  <description>significant single-SNP associations in case-control studies</description>

  <command interpreter="perl">
    gpass.pl ${input1.extra_files_path}/${input1.metadata.base_name}.map ${input1.extra_files_path}/${input1.metadata.base_name}.ped $output $fdr
  </command>

  <inputs>
    <param name="input1" type="data" format="lped" label="Dataset"/>
    <param name="fdr" type="float" value="0.05" label="FDR"/>
  </inputs>

  <outputs>
    <data name="output" format="tabular" />
  </outputs>

  <requirements>
    <requirement type="binary">gpass</requirement>
  </requirements>

  <!-- we need to be able to set the seed for the random number generator
  <tests>
    <test>
      <param name='input1' value='gpass_and_beam_input' ftype='lped' >
        <metadata name='base_name' value='gpass_and_beam_input' />
        <composite_data value='gpass_and_beam_input.ped' />
        <composite_data value='gpass_and_beam_input.map' />
        <edit_attributes type='name' value='gpass_and_beam_input' />
      </param>
      <param name="fdr" value="0.05" />
      <output name="output" file="gpass_output.txt" />
    </test>
  </tests>
  -->

  <help>
**Dataset formats**

The input dataset must be in lped_ format, and the output is tabular_.
(`Dataset missing?`_)

.. _lped: ./static/formatHelp.html#lped
.. _tabular: ./static/formatHelp.html#tab
.. _Dataset missing?: ./static/formatHelp.html

-----

**What it does**

GPASS (Genome-wide Poisson Approximation for Statistical Significance)
detects significant single-SNP associations in case-control studies at a
user-specified FDR. Unlike previous methods, this tool can accurately
approximate the genome-wide significance and FDR of SNP associations, while
adjusting for millions of multiple comparisons, within seconds or minutes.

The program has two main functions:

1. Detect significant single-SNP associations at a user-specified false
   discovery rate (FDR).

   *Note*: a "typical" definition of FDR could be
   FDR = E(# of false positive SNPs / # of significant SNPs)

   This definition, however, is inappropriate for association mapping because
   SNPs are highly correlated. Our FDR is defined differently to account for
   SNP correlations, and thus yields a proper FDR in terms of the "proportion
   of false positive loci".

2. Approximate the significance of a list of candidate SNPs, adjusting for
   multiple comparisons. If you have isolated a few SNPs of interest and want
   to know their significance in a GWAS, you can supply the GWAS data and let
   the program specifically test those SNPs.


*Also note*: the SNPs in a study cannot be both too few and too clustered
in a local region. A few hundred SNPs, or tens of SNPs spread across different
regions, will be fine. The sample size cannot be too small either; around 100
or more individuals (cases and controls combined) will be fine. Otherwise, use
a permutation test.

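The size guidelines above can be checked before running the tool. What follows is a minimal sketch in Python and is not part of GPASS: the function names and the thresholds MIN_INDIVIDUALS and MIN_SNPS are hypothetical, chosen only to mirror the rough figures quoted above ("around 100" individuals, "a few hundred" SNPs). It assumes one SNP per non-empty line of the map file and one individual per non-empty line of the ped file, and it does not test whether the SNPs are clustered in one region.

```python
# Illustrative pre-flight check; not part of GPASS itself.
MIN_INDIVIDUALS = 100   # assumption: "around 100 or more individuals"
MIN_SNPS = 200          # assumption: stand-in for "a few hundred SNPs"


def count_records(path):
    """Count non-empty lines (one SNP per map line, one individual per ped line)."""
    with open(path) as handle:
        return sum(1 for line in handle if line.strip())


def size_warnings(n_snps, n_individuals):
    """Return a list of human-readable warnings; empty if the sizes look adequate."""
    warnings = []
    if MIN_INDIVIDUALS > n_individuals:
        warnings.append(
            "only %d individuals; consider a permutation test" % n_individuals
        )
    if MIN_SNPS > n_snps:
        warnings.append(
            "only %d SNPs; fine only if they are spread across regions" % n_snps
        )
    return warnings
```

A call such as size_warnings(count_records("study.map"), count_records("study.ped")) returns an empty list when both counts meet the assumed thresholds; the file names here are placeholders.
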
-----

**Example**

- input map file::

    1  rs0  0  738547
    1  rs1  0  5597094
    1  rs2  0  9424115
    etc.

- input ped file::

    1 1 0 0 1 1 G G A A A A A A A A A G A A G G G G A A G G G G G G A A A A A G A A G G A G A G A A G G A A G G A A G G A G A A G G A A G G A A A G A G G G A G G G G G A A A G A A G G G G G G G G A G A A A A A A A A
    1 1 0 0 1 1 G G A G G G A A A A A G A A G G G G G G A A G G A G A G G G G G A G G G A G A A G G A G G G A A G G G G A G A G G G A G A A A A G G G G A G A G G G A G A A A A A G G G A G G G A G G G G G A A G G A G
    etc.

- output dataset, showing significant SNPs and their p-values and FDR::

    #ID   chr   position   Statistics  adj-Pvalue  FDR
    rs35  chr1  136606952  4.890849    0.991562    0.682138
    rs36  chr1  137748344  4.931934    0.991562    0.795827
    rs44  chr2  14423047   7.712832    0.665086    0.218776
    etc.

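The output table can be post-processed with a few lines of code. The sketch below, in Python, assumes the output is whitespace-delimited with the six columns shown above and a header line starting with #. The function name parse_gpass_output and its default threshold are illustrative and not part of GPASS.

```python
# Illustrative post-processing of a GPASS-style output table; not part of GPASS.
def parse_gpass_output(lines, max_fdr=0.05):
    """Return rows whose FDR column is at most max_fdr.

    Assumes whitespace-delimited columns in the order
    ID, chr, position, Statistics, adj-Pvalue, FDR.
    """
    kept = []
    for line in lines:
        fields = line.split()
        # Skip blanks, the "#ID ..." header, and anything that is not a 6-column row.
        if len(fields) != 6 or fields[0].startswith("#"):
            continue
        snp_id, chrom, position, statistic, adj_p, fdr = fields
        if max_fdr >= float(fdr):
            kept.append({
                "id": snp_id,
                "chr": chrom,
                "position": int(position),
                "statistic": float(statistic),
                "adj_pvalue": float(adj_p),
                "fdr": float(fdr),
            })
    return kept
```

Applied to the three example rows above with max_fdr=0.25, this keeps only rs44, whose FDR is 0.218776. With the default of 0.05, none of the three rows pass.
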
-----

**Reference**

Zhang Y, Liu JS. (2010)
Fast and accurate significance approximation for genome-wide association studies.
Submitted.

  </help>
</tool>