root/galaxy-central/tools/rgenetics/rgGLM.xml

Revision 2, 6.8 KB (committer: hatakeyama, 14 years ago)

import galaxy-central

<tool id="rgGLM1" name="Linear Models:" version="0.2">
    <code file="listFiles.py"/>
    <code file="rgGLM_code.py"/>

    <description>for genotype data</description>

    <command interpreter="python">
        rgGLM.py '$i.extra_files_path/$i.metadata.base_name' '$phef.extra_files_path/$phef.metadata.base_name'
        "$title1" '$predvar' '$covar' '$out_file1' '$logf' '$i.metadata.base_name'
        '$inter' '$cond' '$gender' '$mind' '$geno' '$maf' '$logistic' '$gffout'
    </command>

    <inputs>
      <page>
       <param name='title1' label='Title for outputs' type='text' value='GLM' size="80" />
       <param name="i" type="data" format="pbed" label="Genotype file" size="80"  />
       <param name="phef"  type="data" format="pphe" label="Phenotype file" size="80"
       help="Dependent variable and covariates will be chosen from this file on the next page"/>
       <param name="logistic" type="text" value = "0" label="1=Use a logistic model (trait must be 1/2 coded like affection)"
       help="Please read the Plink documentation about this option"  />
       <param name="gender" type="text" value = "0" label="1=Add a gender term to model"  />
       <param name='inter' label='1=Build an interaction model - please read the docs carefully before using this'
         type='text' value='0' size="1" />
       <param name="cond"  type="text"  area='true' size='15x20' value = ""
       label="condition on this whitespace delimited rs (snp id) list"  />
       <param name="mind" type="float" value = "0.1" label="Remove subjects with missing genotypes gt (eg 0.1)"
       help = "Set to 1 to include all subjects in the input file" />
       <param name="geno"  type="float" value = "0.1" label="Remove markers with missing genotypes gt (eg 0.1)"
       help = "Set to 1 to include all markers in the input file"  />
       <param name="maf"  type="float" value = "0.01" label="Remove markers with MAF lt (eg 0.01) "
       help = "Set to 0 to include all markers in the input file"/>
      </page>
      <page>
       <param name="predvar" size="80"  type="select" label="Dependent Trait"
       dynamic_options="get_phecols(phef=phef,selectOne=1)"  display="radio" multiple="false"
       help="Model this characteristic in terms of subject snp genotypes - eg rare allele dosage for additive model" />
       <param name="covar" size="80"  type="select" label="Covariates"
       dynamic_options="get_phecols(phef=phef,selectOne=0)" multiple="true" display="checkboxes"
       help="Use these phenotypes as covariates in models of snp dosage effects on the dependent trait"/>
      </page>
   </inputs>

   <outputs>
       <data format="tabular" name="out_file1" />
       <data format="txt" name="logf"  />
       <data format="gff" name="gffout"  />
   </outputs>
<tests>
 <test>
  <param name='i' value='tinywga' ftype='pbed' >
   <metadata name='base_name' value='tinywga' />
   <composite_data value='tinywga.bim' />
   <composite_data value='tinywga.bed' />
   <composite_data value='tinywga.fam' />
   <edit_attributes type='name' value='tinywga' />
 </param>
 <param name='phef' value='tinywga' ftype='pphe' >
   <metadata name='base_name' value='tinywga' />
   <composite_data value='tinywga.pphe' />
   <edit_attributes type='name' value='tinywga' />
 </param>
 <param name='title1' value='rgGLMtest1' />
 <param name='predvar' value='c1' />
 <param name='covar' value='None' />
 <param name='inter' value='0' />
 <param name='cond' value='' />
 <param name='gender' value='0' />
 <param name='mind' value='1.0' />
 <param name='geno' value='1.0' />
 <param name='maf' value='0.0' />
 <param name='logistic' value='0' />
 <output name='out_file1' file='rgGLMtest1_GLM.xls' ftype='tabular' compare="diff" />
 <output name='logf' file='rgGLMtest1_GLM_log.txt' ftype='txt' compare="diff" lines_diff='36'/>
 <output name='gffout' file='rgGLMtest1_GLM_topTable.gff' compare="diff" ftype='gff' />
 </test>
</tests>
<help>

.. class:: infomark

**Syntax**

Note that this is a two-page tool: you choose the dependent trait and covariates
on the second page, based on the phenotype file you select on the first page.

- **Genotype file** is the input Plink format compressed genotype (pbed) file
- **Phenotype file** is the input Plink phenotype (pphe) file with FAMID IID followed by phenotypes
- **Dependent variable** is the term on the left of the model and is chosen from the pphe columns on the second page
- **Logistic** fits a logistic model; use it if (eg) disease status (case/control) is the outcome variable - otherwise the model is linear.
- **Covariates** are covariate terms on the right of the model, also chosen on the second page
- **Interactions** will add interactions - please be careful how you interpret these - see the Plink documentation.
- **Gender** will add gender as a model term - described in the Plink documentation
- **Condition** will condition the model on one or more specific SNP rs ids as a whitespace delimited sequence
- **Format** determines how your data will be returned to your Galaxy workspace

-----

.. class:: infomark

**Summary**

This tool fits GLM models that test each SNP as a predictor of a dependent phenotype
variable, with adjustment for specified covariates.

If you don't see the genotype or phenotype data set you want here, it can be imported using
one of the methods available from the rg get data tool group.

Output format can be UCSC .bed if you want to see one column of your
results as a fully fledged UCSC genome browser track. A map file containing the chromosome and offset for each marker is
required for writing this kind of output.
Alternatively you can use .gg for the UCSC Genome Graphs tool, which has all of the advantages
of the .bed track, plus a neat, visual front end that displays a lot of useful clues.
Either of these is a very useful way of quickly getting a look
at your data in full genomic context.

Finally, if you can't live without
spreadsheet data, choose the .xls tab delimited format. It's not a stupid binary Excel file. Just a plain old tab
delimited one with a header. Fortunately Excel is dumb enough to open these without much protest.

-----

.. class:: infomark

**Attribution**

This tool allows you to control settings for models using Plink linear models, so we rely on the author (Shaun Purcell)
for the documentation you need specific to those settings - they are very nicely documented at
http://pngu.mgh.harvard.edu/~purcell/plink/anal.shtml#glm

Tool and Galaxy datatypes originally designed and written for the Rgenetics
series of whole genome scale statistical genetics tools by Ross Lazarus (ross.lazarus@gmail.com)
supported by NIH grant
Shaun Purcell created and maintains Plink.

Please acknowledge your use of this tool, Galaxy and Plink in your publications and let
us know so we can keep track. These tools all rely on highly competitive grant funding
so your letting us know about publications is important to our ongoing support.
</help>
</tool>
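The `command` template in this tool passes sixteen positional arguments to `rgGLM.py`, and the help text says the tool drives Plink linear models. The wrapper script is not shown here, so the following is only a minimal sketch of how such a wrapper might name those arguments and turn them into a Plink argument list. The names `ARG_NAMES`, `parse_args`, `build_plink_args` and `cond_list_path` are hypothetical, as are the `.pphe` suffix appended to the phenotype base name and the use of the title as the `--out` prefix. Only the argument order and the `'None'` covariate value come from the tool definition.

```python
# Hypothetical sketch: this is NOT the real rgGLM.py, whose source is not shown.

# Positional order copied from the <command> template of the tool definition.
ARG_NAMES = [
    "bfile", "phe_base", "title", "predvar", "covar", "out_file", "logf",
    "base_name", "inter", "cond", "gender", "mind", "geno", "maf",
    "logistic", "gffout",
]


def parse_args(argv):
    """Pair the 16 positional values with the names used above."""
    if len(argv) != len(ARG_NAMES):
        raise ValueError(
            "expected %d arguments, got %d" % (len(ARG_NAMES), len(argv)))
    return dict(zip(ARG_NAMES, argv))


def build_plink_args(opts, cond_list_path="cond_snps.txt"):
    """Translate the tool settings into a Plink 1.x argument list."""
    phe_file = opts["phe_base"] + ".pphe"  # assumption: extension is appended
    args = [
        "plink", "--noweb",
        "--bfile", opts["bfile"],
        "--pheno", phe_file,
        "--pheno-name", opts["predvar"],
        "--mind", opts["mind"],
        "--geno", opts["geno"],
        "--maf", opts["maf"],
    ]
    # '1' selects a logistic model; anything else falls back to linear.
    args.append("--logistic" if opts["logistic"] == "1" else "--linear")
    # The test block passes the literal string 'None' when no covariate is chosen.
    if opts["covar"] not in ("", "None"):
        args += ["--covar", phe_file, "--covar-name", opts["covar"]]
    if opts["gender"] == "1":
        args.append("--sex")
    if opts["inter"] == "1":
        args.append("--interaction")
    if opts["cond"].split():
        # The caller would write the SNP ids, one per line, to this file.
        args += ["--condition-list", cond_list_path]
    args += ["--out", opts["title"]]  # assumption: title doubles as output prefix
    return args
```

Returning a list rather than a shell string keeps quoting out of the picture: the list can be handed to `subprocess` directly, and a title containing spaces stays a single argument.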
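The second input page fills its two select lists through `dynamic_options="get_phecols(phef=phef,selectOne=...)"`, and the help describes the pphe file as FAMID and IID followed by phenotypes. The real `get_phecols` lives in `rgGLM_code.py`, which is not shown here, so the sketch below only illustrates the general idea: read the header row, skip the two ID columns, and return one `(label, value, selected)` tuple per phenotype column. The function names, the assumption that the pphe file starts with a whitespace-delimited header row, and the placeholder entry for an empty file are all hypothetical.

```python
# Hypothetical sketch: this is NOT the real get_phecols from rgGLM_code.py.


def phecols_from_header(header_line, select_one=True):
    """Build (label, value, selected) tuples from a pphe header line."""
    columns = header_line.split()
    phenotypes = columns[2:]  # skip the two leading ID columns
    if not phenotypes:
        # Placeholder entry so the select list is never empty.
        return [("No phenotype columns found", "None", True)]
    options = []
    for index, name in enumerate(phenotypes):
        # Radio list: preselect the first column. Checkboxes: preselect none.
        selected = bool(select_one) and index == 0
        options.append((name, name, selected))
    return options


def get_phecols_sketch(pphe_path, select_one=True):
    """Read only the first line of the file at pphe_path and list its columns."""
    with open(pphe_path) as handle:
        return phecols_from_header(handle.readline(), select_one)
```

Splitting the header parsing from the file access keeps the column logic easy to check without a Galaxy dataset object at hand.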