root/galaxy-central/tools/human_genome_variation/gpass.xml @ 3

Revision 2, 4.1 KB (committer: hatakeyama, 14 years ago)

import galaxy-central

<tool id="hgv_gpass" name="GPASS" version="1.0.0">
  <description>significant single-SNP associations in case-control studies</description>

  <command interpreter="perl">
    gpass.pl ${input1.extra_files_path}/${input1.metadata.base_name}.map ${input1.extra_files_path}/${input1.metadata.base_name}.ped $output $fdr
  </command>

  <inputs>
    <param name="input1" type="data" format="lped" label="Dataset"/>
    <param name="fdr" type="float" value="0.05" label="FDR"/>
  </inputs>

  <outputs>
    <data name="output" format="tabular" />
  </outputs>

  <requirements>
    <requirement type="binary">gpass</requirement>
  </requirements>

  <!-- we need to be able to set the seed for the random number generator
  <tests>
    <test>
      <param name='input1' value='gpass_and_beam_input' ftype='lped' >
        <metadata name='base_name' value='gpass_and_beam_input' />
        <composite_data value='gpass_and_beam_input.ped' />
        <composite_data value='gpass_and_beam_input.map' />
        <edit_attributes type='name' value='gpass_and_beam_input' />
      </param>
      <param name="fdr" value="0.05" />
      <output name="output" file="gpass_output.txt" />
    </test>
  </tests>
  -->

  <help>
**Dataset formats**

The input dataset must be in lped_ format, and the output is tabular_.
(`Dataset missing?`_)

.. _lped: ./static/formatHelp.html#lped
.. _tabular: ./static/formatHelp.html#tab
.. _Dataset missing?: ./static/formatHelp.html

-----

**What it does**

GPASS (Genome-wide Poisson Approximation for Statistical Significance)
detects significant single-SNP associations in case-control studies at a
user-specified FDR.  Unlike previous methods, this tool can accurately
approximate the genome-wide significance and FDR of SNP associations, while
adjusting for millions of multiple comparisons, within seconds or minutes.

The program has two main functions:

1. Detect significant single-SNP associations at a user-specified false
   discovery rate (FDR).

   *Note*: a "typical" definition of FDR could be::

       FDR = E(# of false positive SNPs / # of significant SNPs)

   This definition, however, is poorly suited to association mapping, because
   SNPs are highly correlated.  Our FDR is defined differently to account for
   SNP correlations, and thus yields a proper FDR in terms of the
   "proportion of false positive loci".

2. Approximate the significance of a list of candidate SNPs, adjusting for
   multiple comparisons. If you have isolated a few SNPs of interest and want
   to know their significance in a GWAS, you can supply the GWAS data and let
   the program specifically test those SNPs.
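
The distinction drawn in the note above, between counting false positive SNPs and counting false positive loci, can be illustrated with a minimal Python sketch. This is not the GPASS computation: the function names, the ``locus`` labels, the ``is_false`` flags, and the data are all hypothetical, and the sketch computes only the realized false discovery proportion for one made-up result set (the FDR is the expectation of that proportion).

```python
def snp_level_fdp(hits):
    """Fraction of significant SNPs that are false positives."""
    if not hits:
        return 0.0
    return sum(1 for h in hits if h["is_false"]) / len(hits)


def locus_level_fdp(hits):
    """Fraction of significant loci that are false positives.

    Assumption for this sketch: a locus counts as false only if every
    significant SNP assigned to it is false.
    """
    loci = {}
    for h in hits:
        loci.setdefault(h["locus"], []).append(h["is_false"])
    if not loci:
        return 0.0
    return sum(1 for flags in loci.values() if all(flags)) / len(loci)


# Made-up data: one true locus tagged by four correlated SNPs,
# plus one false locus tagged by a single SNP.
hits = [
    {"snp": "rsA1", "locus": "A", "is_false": False},
    {"snp": "rsA2", "locus": "A", "is_false": False},
    {"snp": "rsA3", "locus": "A", "is_false": False},
    {"snp": "rsA4", "locus": "A", "is_false": False},
    {"snp": "rsB1", "locus": "B", "is_false": True},
]
print(snp_level_fdp(hits))    # 0.2
print(locus_level_fdp(hits))  # 0.5
```

In this toy case the correlated SNPs at the true locus dilute the SNP-level proportion, even though half of the reported loci are false.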


*Also note*: the number of SNPs in a study cannot be both too small and, at
the same time, too clustered in a local region. A few hundred SNPs, or tens of
SNPs spread across different regions, will be fine. The sample size cannot be
too small either; around 100 or more individuals (cases and controls combined)
will be fine. Otherwise, use permutation testing.

-----

**Example**

- input map file::

    1  rs0  0  738547
    1  rs1  0  5597094
    1  rs2  0  9424115
    etc.

- input ped file::

    1 1 0 0 1  1  G G  A A  A A  A A  A A  A G  A A  G G  G G  A A  G G  G G  G G  A A  A A  A G  A A  G G  A G  A G  A A  G G  A A  G G  A A  G G  A G  A A  G G  A A  G G  A A  A G  A G  G G  A G  G G  G G  A A  A G  A A  G G  G G  G G  G G  A G  A A  A A  A A  A A
    1 1 0 0 1  1  G G  A G  G G  A A  A A  A G  A A  G G  G G  G G  A A  G G  A G  A G  G G  G G  A G  G G  A G  A A  G G  A G  G G  A A  G G  G G  A G  A G  G G  A G  A A  A A  G G  G G  A G  A G  G G  A G  A A  A A  A G  G G  A G  G G  A G  G G  G G  A A  G G  A G
    etc.
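
To make the layout of the two input files concrete, the following Python sketch splits one line of each. It assumes the standard linkage-format conventions: four whitespace-separated map columns (chromosome, SNP ID, genetic distance, base-pair position), and six leading ped columns (family ID, individual ID, paternal ID, maternal ID, sex, phenotype) followed by two alleles per SNP. The function names are hypothetical and are not part of GPASS or Galaxy.

```python
def parse_map_line(line):
    """Split one .map line into (chromosome, snp_id, genetic_distance, position)."""
    chrom, snp_id, dist, pos = line.split()
    return chrom, snp_id, float(dist), int(pos)


def parse_ped_line(line):
    """Split one .ped line into its six leading fields and a list of allele pairs."""
    fields = line.split()
    meta, alleles = fields[:6], fields[6:]
    if len(alleles) % 2 != 0:
        raise ValueError("expected two alleles per SNP")
    # Group the flat allele list into one (allele1, allele2) tuple per SNP.
    genotypes = [(alleles[i], alleles[i + 1]) for i in range(0, len(alleles), 2)]
    return meta, genotypes


print(parse_map_line("1  rs0  0  738547"))

# A shortened ped line with three SNPs, for illustration only.
meta, genotypes = parse_ped_line("1 1 0 0 1  1  G G  A A  A G")
print(meta)
print(genotypes)
```

The number of allele pairs on each ped line should equal the number of lines in the map file; a consistency check along those lines is a natural extension of this sketch.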

- output dataset, showing significant SNPs and their p-values and FDR::

    #ID   chr   position   Statistics  adj-Pvalue  FDR
    rs35  chr1  136606952  4.890849    0.991562    0.682138
    rs36  chr1  137748344  4.931934    0.991562    0.795827
    rs44  chr2  14423047   7.712832    0.665086    0.218776
    etc.
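
As a hedged illustration of post-processing, the Python sketch below reads an output table with the columns shown above and keeps the rows at or below a chosen FDR. It assumes whitespace-delimited columns and a header line starting with ``#``; the function names and the 0.25 threshold are hypothetical and chosen only so that the example rows produce a non-empty result.

```python
def read_gpass_output(text):
    """Parse the tabular output into a list of dicts keyed by column name."""
    rows = []
    header = None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        if line.startswith("#"):
            # Header line: drop the leading '#' from the first column name.
            header = [name.lstrip("#") for name in fields]
            continue
        if header is None:
            raise ValueError("missing header line")
        row = dict(zip(header, fields))
        row["position"] = int(row["position"])
        for key in ("Statistics", "adj-Pvalue", "FDR"):
            row[key] = float(row[key])
        rows.append(row)
    return rows


def filter_by_fdr(rows, threshold):
    """Keep the rows whose FDR does not exceed the threshold."""
    return [r for r in rows if threshold >= r["FDR"]]


example = """#ID   chr   position   Statistics  adj-Pvalue  FDR
rs35  chr1  136606952  4.890849    0.991562    0.682138
rs36  chr1  137748344  4.931934    0.991562    0.795827
rs44  chr2  14423047   7.712832    0.665086    0.218776
"""
rows = read_gpass_output(example)
print([r["ID"] for r in filter_by_fdr(rows, 0.25)])  # ['rs44']
```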

-----

**Reference**

Zhang Y, Liu JS. (2010)
Fast and accurate significance approximation for genome-wide association studies.
Submitted.

  </help>
</tool>