<tool id="hgv_ldtools" name="LD" version="1.0.0">
  <description>linkage disequilibrium and tag SNPs</description>

  <command interpreter="bash">
    ldtools_wrapper.sh rsquare=$rsquare freq=$freq input=$input output=$output
  </command>

  <inputs>
    <param format="tabular" name="input" type="data" label="Dataset"/>
    <param name="rsquare" label="r&lt;sup&gt;2&lt;/sup&gt; threshold" type="float" value="0.64">
      <validator type="in_range" message="rsquare must be in range [0.00, 1.00]" min="0.00" max="1.00" />
    </param>
    <param name="freq" label="Minimum allele frequency threshold" type="float" value="0.00">
      <validator type="in_range" message="freq must be in range [0.00, 0.50]" min="0.00" max="0.50" />
    </param>
  </inputs>

  <outputs>
    <data format="tabular" name="output" />
  </outputs>

  <tests>
    <test>
      <param name="input" value="ldInput1.txt" />
      <param name="rsquare" value="0.64" />
      <param name="freq" value="0.00" />
      <output name="output" file="ldOutput1.txt" />
    </test>
  </tests>

  <help>
**Dataset formats**

The input and output datasets are tabular_.
(`Dataset missing?`_)

.. _tabular: ./static/formatHelp.html#tab
.. _Dataset missing?: ./static/formatHelp.html

-----

**What it does**

This tool analyzes patterns of linkage disequilibrium (LD) between
polymorphic sites in a locus. SNPs are grouped according to a threshold
level of LD, measured by r\ :sup:`2` (regardless of genomic position),
and a representative "tag SNP" is reported for each group.
The other SNPs in the group are in LD with the tag SNP, but not necessarily
with each other.

The underlying algorithm is the same as the one used in ldSelect (Carlson
et al. 2004). However, this implementation is designed to be much faster
and more efficient than ldSelect.

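The grouping step described above can be illustrated with a minimal Python sketch of greedy tag SNP selection in the spirit of ldSelect. It is not this tool's actual implementation: the function name ``greedy_tag_snps``, the dictionary layout of the pairwise r\ :sup:`2` values, the alphabetical tie-breaking, and the reporting of ungrouped SNPs as single-member groups are all assumptions made for illustration.

```python
def greedy_tag_snps(r2, threshold=0.64):
    """Greedily group SNPs around tag SNPs.

    r2 maps (snp_a, snp_b) tuples to a pairwise r^2 value.
    This input layout is hypothetical, chosen only for the sketch.
    """
    # Record, for every SNP, the partners that meet the threshold.
    partners = {}
    for (a, b), value in r2.items():
        partners.setdefault(a, set())
        partners.setdefault(b, set())
        if value >= threshold:
            partners[a].add(b)
            partners[b].add(a)

    unbinned = set(partners)
    groups = []
    while unbinned:
        # The tag is the SNP with the most partners that are still unbinned.
        # Iterating in sorted order breaks ties alphabetically (an assumption).
        tag = max(sorted(unbinned),
                  key=lambda s: len(partners[s].intersection(unbinned)))
        members = sorted(partners[tag].intersection(unbinned))
        groups.append((tag, members))
        unbinned.discard(tag)
        unbinned.difference_update(members)
    return groups


example = {
    ("rs1", "rs2"): 0.9,
    ("rs1", "rs3"): 0.7,
    ("rs2", "rs3"): 0.2,  # both are tagged by rs1, yet not in LD with each other
    ("rs3", "rs4"): 0.1,
}
print(greedy_tag_snps(example))
# prints [('rs1', ['rs2', 'rs3']), ('rs4', [])]
```
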
The input is a tabular file with genotype information for each individual
at each SNP site, in exactly four columns: site ID, sample ID, and the
two allele nucleotides.

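As a rough illustration of how the four-column input relates to the allele frequency threshold, the following Python sketch reads genotype rows and computes a minor allele frequency per site. The function name ``minor_allele_frequencies`` is hypothetical, and the sketch assumes whitespace-separated columns, biallelic sites, and no missing-data codes; how the tool itself handles these cases is not specified here.

```python
from collections import Counter, defaultdict


def minor_allele_frequencies(lines):
    """Return {site_id: minor allele frequency} from four-column rows:
    site ID, sample ID, allele 1, allele 2."""
    counts = defaultdict(Counter)
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed rows (an assumption)
        site, _sample, allele1, allele2 = fields
        counts[site].update([allele1, allele2])

    maf = {}
    for site, alleles in counts.items():
        total = sum(alleles.values())
        ranked = alleles.most_common(2)
        if len(ranked) == 2:
            # Second most common allele; at a biallelic site, the rarer one.
            maf[site] = ranked[1][1] / total
        else:
            maf[site] = 0.0  # only one allele observed in these samples
    return maf


rows = [
    "rs1 S1 G T",
    "rs1 S2 G G",
    "rs2 S1 A A",
    "rs2 S2 A A",
]
print(minor_allele_frequencies(rows))
# prints {'rs1': 0.25, 'rs2': 0.0}
```
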
-----

**Example**

- input file::

    rs2334386 NA20364 G T
    rs2334386 NA20363 G G
    rs2334386 NA20360 G G
    rs2334386 NA20359 G G
    rs2334386 NA20358 G G
    rs2334386 NA20356 G G
    rs2334386 NA20357 G G
    rs2334386 NA20350 G G
    rs2334386 NA20349 G G
    rs2334386 NA20348 G G
    rs2334386 NA20347 G G
    rs2334386 NA20346 G G
    rs2334386 NA20345 G G
    rs2334386 NA20344 G G
    rs2334386 NA20342 G G
    etc.

- output file::

    rs2238748 rs2793064,rs6518516,rs6518517,rs2283641,rs5993533,rs715590,rs2072123,rs2105421,rs2800954,rs1557847,rs807750,rs807753,rs5993488,rs8138035,rs2800980,rs2525079,rs5992353,rs712966,rs2525036,rs807743,rs1034727,rs807744,rs2074003
    rs2871023 rs1210715,rs1210711,rs5748189,rs1210709,rs3788298,rs7284649,rs9306217,rs9604954,rs1210703,rs5748179,rs5746727,rs5748190,rs5993603,rs2238766,rs885981,rs2238763,rs5748165,rs9605996,rs9606001,rs5992398
    rs7292006 rs13447232,rs5993665,rs2073733,rs1057457,rs756658,rs5992395,rs2073760,rs739369,rs9606017,rs739370,rs4493360,rs2073736
    rs2518840 rs1061325,rs2283646,rs362148,rs1340958,rs361956,rs361991,rs2073754,rs2040771,rs2073740,rs2282684
    rs2073775 rs10160,rs2800981,rs807751,rs5993492,rs2189490,rs5747997,rs2238743
    rs5747263 rs12159924,rs2300688,rs4239846,rs3747025,rs3747024,rs3747023,rs2300691
    rs433576 rs9605439,rs1109052,rs400509,rs401099,rs396012,rs410456,rs385105
    rs2106145 rs5748131,rs2013516,rs1210684,rs1210685,rs2238767,rs2277837
    rs2587082 rs2257083,rs2109659,rs2587081,rs5747306,rs2535704,rs2535694
    rs807667 rs2800974,rs756651,rs762523,rs2800973,rs1018764
    rs2518866 rs1206542,rs807467,rs807464,rs807462,rs712950
    rs1110661 rs1110660,rs7286607,rs1110659,rs5992917,rs1110662
    rs759076 rs5748760,rs5748755,rs5748752,rs4819925,rs933461
    rs5746487 rs5992895,rs2034113,rs2075455,rs1867353
    rs5748212 rs5746736,rs4141527,rs5748147,rs5748202
    etc.

-----

**Reference**

Carlson CS, Eberle MA, Rieder MJ, Yi Q, Kruglyak L, Nickerson DA. (2004)
Selecting a maximally informative set of single-nucleotide polymorphisms for
association analyses using linkage disequilibrium.
Am J Hum Genet. 74(1):106-20. Epub 2003 Dec 15.

  </help>
</tool>