<tool id="hgv_snpFreq" name="snpFreq" version="1.0.0">
  <description>significant SNPs in case-control data</description>

  <command interpreter="perl">
    snpFreq2.pl $input $group1_1 $group1_2 $group1_3 $group2_1 $group2_2 $group2_3 0.05 $output
  </command>

  <inputs>
    <param format="tabular" name="input" type="data" label="Dataset" />
    <param name="group1_1" label="Column with genotype 1 count for group 1" type="data_column" data_ref="input" />
    <param name="group1_2" label="Column with genotype 2 count for group 1" type="data_column" data_ref="input" />
    <param name="group1_3" label="Column with genotype 3 count for group 1" type="data_column" data_ref="input" />
    <param name="group2_1" label="Column with genotype 1 count for group 2" type="data_column" data_ref="input" />
    <param name="group2_2" label="Column with genotype 2 count for group 2" type="data_column" data_ref="input" />
    <param name="group2_3" label="Column with genotype 3 count for group 2" type="data_column" data_ref="input" />
  </inputs>

  <outputs>
    <data format="tabular" name="output" />
  </outputs>

  <requirements>
    <requirement type="binary">R</requirement>
  </requirements>

  <tests>
    <test>
      <param name="input" ftype="tabular" value="snpFreqInput.txt" dbkey="hg18" />
      <param name="group1_1" value="4" />
      <param name="group1_2" value="5" />
      <param name="group1_3" value="6" />
      <param name="group2_1" value="7" />
      <param name="group2_2" value="8" />
      <param name="group2_3" value="9" />
      <output name="output" file="snpFreqTestOut.txt" />
    </test>
  </tests>

  <help>

**Dataset formats**

The input is tabular_, with six columns of genotype counts. The output is also tabular,
and includes all of the input data plus the additional columns described below.
(`Dataset missing?`_)

.. _tabular: ./static/formatHelp.html#tab
.. _Dataset missing?: ./static/formatHelp.html

-----

**What it does**

This tool performs a basic analysis of bi-allelic SNPs in case-control
data, using the R statistical environment and Fisher's exact test to
identify SNPs with a significant difference in the allele frequencies
between the two groups. R's "qvalue" package is used to correct for
multiple testing.

The input file includes counts for each allele combination (AA, aa, Aa)
for each group at each SNP position. The assignment of codes (1, 2, 3)
to these genotypes is arbitrary, as long as it is consistent for both
groups. Any other input columns are ignored in the computation, but
are copied to the output. The output appends eight additional columns,
namely the minimum expected counts of the three genotypes for each
group, the p-value, and the q-value.

-----

**Example**

- input file::

    chr1 210 211 38 4 15 56 0 1 x
    chr1 228 229 55 0 2 56 0 1 x
    chr1 230 231 46 0 11 55 0 2 x
    chr1 234 235 43 0 14 55 0 2 x
    chr1 236 237 55 0 2 13 10 34 x
    chr1 437 438 55 0 2 46 0 11 x
    chr1 439 440 56 0 1 55 0 2 x
    chr1 449 450 56 0 1 13 20 24 x
    chr1 518 519 56 0 1 38 4 15 x

  Here the group 1 genotype counts are in columns 4-6, while those
  for group 2 are in columns 7-9.

  Note that the "x" column has no meaning. It was added to this example
  to show that extra columns can be included, and to make it easier
  to see where the new columns are appended in the output.

- output file::

    chr1 210 211 38 4 15 56 0 1 x 47 2 8 47 2 8 1.50219088598917e-05 6.32501425679652e-06
    chr1 228 229 55 0 2 56 0 1 x 55.5 0 1.5 55.5 0 1.5 1 0.210526315789474
    chr1 230 231 46 0 11 55 0 2 x 50.5 0 6.5 50.5 0 6.5 0.0155644201009862 0.00409590002657532
    chr1 234 235 43 0 14 55 0 2 x 49 0 8 49 0 8 0.00210854461554067 0.000739840215979182
    chr1 236 237 55 0 2 13 10 34 x 34 5 18 34 5 18 6.14613878554783e-17 4.31307984950725e-17
    chr1 437 438 55 0 2 46 0 11 x 50.5 0 6.5 50.5 0 6.5 0.0155644201009862 0.00409590002657532
    chr1 439 440 56 0 1 55 0 2 x 55.5 0 1.5 55.5 0 1.5 1 0.210526315789474
    chr1 449 450 56 0 1 13 20 24 x 34.5 10 12.5 34.5 10 12.5 2.25757007974134e-18 2.37638955762246e-18
    chr1 518 519 56 0 1 38 4 15 x 47 2 8 47 2 8 1.50219088598917e-05 6.32501425679652e-06

  </help>
</tool>