<tool id="hgv_snpFreq" name="snpFreq" version="1.0.0">
  <description>significant SNPs in case-control data</description>

  <command interpreter="perl">
    snpFreq2.pl $input $group1_1 $group1_2 $group1_3 $group2_1 $group2_2 $group2_3 0.05 $output
  </command>

  <inputs>
    <param format="tabular" name="input" type="data" label="Dataset" />
    <param name="group1_1" label="Column with genotype 1 count for group 1" type="data_column" data_ref="input" />
    <param name="group1_2" label="Column with genotype 2 count for group 1" type="data_column" data_ref="input" />
    <param name="group1_3" label="Column with genotype 3 count for group 1" type="data_column" data_ref="input" />
    <param name="group2_1" label="Column with genotype 1 count for group 2" type="data_column" data_ref="input" />
    <param name="group2_2" label="Column with genotype 2 count for group 2" type="data_column" data_ref="input" />
    <param name="group2_3" label="Column with genotype 3 count for group 2" type="data_column" data_ref="input" />
  </inputs>

  <outputs>
    <data format="tabular" name="output" />
  </outputs>

  <requirements>
    <requirement type="binary">R</requirement>
  </requirements>

  <tests>
    <test>
      <param name="input" ftype="tabular" value="snpFreqInput.txt" dbkey="hg18" />
      <param name="group1_1" value="4" />
      <param name="group1_2" value="5" />
      <param name="group1_3" value="6" />
      <param name="group2_1" value="7" />
      <param name="group2_2" value="8" />
      <param name="group2_3" value="9" />
      <output name="output" file="snpFreqTestOut.txt" />
    </test>
  </tests>

  <help>

**Dataset formats**

The input is tabular_ and must include six columns of genotype counts
(three per group). The output is also tabular,
and includes all of the input data plus the additional columns described below.
(`Dataset missing?`_)

.. _tabular: ./static/formatHelp.html#tab
.. _Dataset missing?: ./static/formatHelp.html

-----

**What it does**

This tool performs a basic analysis of bi-allelic SNPs in case-control
data, using the R statistical environment and Fisher's exact test to
identify SNPs with a significant difference in the allele frequencies
between the two groups. R's "qvalue" package is used to correct for
multiple testing.

The input file includes counts for each genotype (AA, aa, Aa)
for each group at each SNP position. The assignment of codes (1, 2, 3)
to these genotypes is arbitrary, as long as it is consistent for both
groups. Any other input columns are ignored in the computation, but
are copied to the output. The output appends eight additional columns:
the minimum expected counts of the three genotypes for each group
(six columns), followed by the p-value and the q-value.

-----

**Example**

- input file::

    chr1 210 211 38 4 15 56 0 1 x
    chr1 228 229 55 0 2 56 0 1 x
    chr1 230 231 46 0 11 55 0 2 x
    chr1 234 235 43 0 14 55 0 2 x
    chr1 236 237 55 0 2 13 10 34 x
    chr1 437 438 55 0 2 46 0 11 x
    chr1 439 440 56 0 1 55 0 2 x
    chr1 449 450 56 0 1 13 20 24 x
    chr1 518 519 56 0 1 38 4 15 x

  Here the group 1 genotype counts are in columns 4-6, while those
  for group 2 are in columns 7-9.

  Note that the "x" column has no meaning. It was added to this example
  to show that extra columns can be included, and to make it easier
  to see where the new columns are appended in the output.

- output file::

    chr1 210 211 38 4 15 56 0 1 x 47 2 8 47 2 8 1.50219088598917e-05 6.32501425679652e-06
    chr1 228 229 55 0 2 56 0 1 x 55.5 0 1.5 55.5 0 1.5 1 0.210526315789474
    chr1 230 231 46 0 11 55 0 2 x 50.5 0 6.5 50.5 0 6.5 0.0155644201009862 0.00409590002657532
    chr1 234 235 43 0 14 55 0 2 x 49 0 8 49 0 8 0.00210854461554067 0.000739840215979182
    chr1 236 237 55 0 2 13 10 34 x 34 5 18 34 5 18 6.14613878554783e-17 4.31307984950725e-17
    chr1 437 438 55 0 2 46 0 11 x 50.5 0 6.5 50.5 0 6.5 0.0155644201009862 0.00409590002657532
    chr1 439 440 56 0 1 55 0 2 x 55.5 0 1.5 55.5 0 1.5 1 0.210526315789474
    chr1 449 450 56 0 1 13 20 24 x 34.5 10 12.5 34.5 10 12.5 2.25757007974134e-18 2.37638955762246e-18
    chr1 518 519 56 0 1 38 4 15 x 47 2 8 47 2 8 1.50219088598917e-05 6.32501425679652e-06
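
The six expected-count columns in the example output appear consistent with the usual independence formula (group total times genotype total, divided by the grand total). The sketch below, written in Python rather than the tool's own Perl and R, reproduces them for the first example row. The function name ``expected_counts`` is hypothetical, and the formula is an assumption inferred from the example, not taken from snpFreq2.pl. Both groups in the example contain 57 samples, so the example alone cannot rule out other formulas that agree when group sizes are equal.

```python
def expected_counts(group1, group2):
    """Expected genotype counts per group, assuming independence of group and genotype."""
    # Total count of each genotype across both groups.
    genotype_totals = [a + b for a, b in zip(group1, group2)]
    grand_total = sum(genotype_totals)
    rows = []
    for group in (group1, group2):
        group_total = sum(group)
        # Assumed formula: group total * genotype total / grand total.
        rows.append([group_total * g / grand_total for g in genotype_totals])
    return rows


# First row of the example input: group 1 in columns 4-6, group 2 in columns 7-9.
fields = "chr1 210 211 38 4 15 56 0 1 x".split()
group1 = [int(v) for v in fields[3:6]]
group2 = [int(v) for v in fields[6:9]]
print(expected_counts(group1, group2))
```

For this row the sketch prints ``[[47.0, 2.0, 8.0], [47.0, 2.0, 8.0]]``, which matches the six values appended after the "x" column in the first output row.
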

  </help>
</tool>