| 1 | <tool id="rgGLM1" name="Linear Models:" version="0.2"> |
---|
| 2 | <code file="listFiles.py"/> |
---|
| 3 | <code file="rgGLM_code.py"/> |
---|
| 4 | |
---|
| 5 | <description>for genotype data</description> |
---|
| 6 | |
---|
| 7 | <command interpreter="python"> |
---|
| 8 | rgGLM.py '$i.extra_files_path/$i.metadata.base_name' '$phef.extra_files_path/$phef.metadata.base_name' |
---|
| 9 | "$title1" '$predvar' '$covar' '$out_file1' '$logf' '$i.metadata.base_name' |
---|
| 10 | '$inter' '$cond' '$gender' '$mind' '$geno' '$maf' '$logistic' '$gffout' |
---|
| 11 | </command> |
---|
| 12 | |
---|
| 13 | <inputs> |
---|
| 14 | <page> |
---|
| 15 | <param name='title1' label='Title for outputs' type='text' value='GLM' size="80" /> |
---|
| 16 | <param name="i" type="data" format="pbed" label="Genotype file" size="80" /> |
---|
| 17 | <param name="phef" type="data" format="pphe" label="Phenotype file" size="80" |
---|
| 18 | help="Dependent variable and covariates will be chosen from this file on the next page"/> |
---|
| 19 | <param name="logistic" type="text" value = "0" label="1=Use a logistic model (trait must be 1/2 coded like affection)" |
---|
| 20 | help="Please read the Plink documentation about this option" /> |
---|
| 21 | <param name="gender" type="text" value = "0" label="1=Add a gender term to model" /> |
---|
| 22 | <param name='inter' label='1=Build an interaction model - please read the docs carefully before using this' |
---|
| 23 | type='text' value='0' size="1" /> |
---|
| 24 | <param name="cond" type="text" area='true' size='15x20' value = "" |
---|
| 25 | label="Condition on this whitespace-delimited list of SNP rs ids" /> |
---|
| 26 | <param name="mind" type="float" value = "0.1" label="Remove subjects with a missing genotype rate greater than (eg 0.1)" |
---|
| 27 | help = "Set to 1 to include all subjects in the input file" /> |
---|
| 28 | <param name="geno" type="float" value = "0.1" label="Remove markers with a missing genotype rate greater than (eg 0.1)" |
---|
| 29 | help = "Set to 1 to include all markers in the input file" /> |
---|
| 30 | <param name="maf" type="float" value = "0.01" label="Remove markers with MAF less than (eg 0.01)" |
---|
| 31 | help = "Set to 0 to include all markers in the input file"/> |
---|
| 32 | </page> |
---|
| 33 | <page> |
---|
| 34 | <param name="predvar" size="80" type="select" label="Dependent Trait" |
---|
| 35 | dynamic_options="get_phecols(phef=phef,selectOne=1)" display="radio" multiple="false" |
---|
| 36 | help="Model this characteristic in terms of subject snp genotypes - eg rare allele dosage for additive model" /> |
---|
| 37 | <param name="covar" size="80" type="select" label="Covariates" |
---|
| 38 | dynamic_options="get_phecols(phef=phef,selectOne=0)" multiple="true" display="checkboxes" |
---|
| 39 | help="Use these phenotypes as covariates in models of snp dosage effects on the dependent trait"/> |
---|
| 40 | </page> |
---|
| 41 | </inputs> |
---|
| 42 | |
---|
| 43 | <outputs> |
---|
| 44 | <data format="tabular" name="out_file1" /> |
---|
| 45 | <data format="txt" name="logf" /> |
---|
| 46 | <data format="gff" name="gffout" /> |
---|
| 47 | </outputs> |
---|
| 48 | <tests> |
---|
| 49 | <test> |
---|
| 50 | <param name='i' value='tinywga' ftype='pbed' > |
---|
| 51 | <metadata name='base_name' value='tinywga' /> |
---|
| 52 | <composite_data value='tinywga.bim' /> |
---|
| 53 | <composite_data value='tinywga.bed' /> |
---|
| 54 | <composite_data value='tinywga.fam' /> |
---|
| 55 | <edit_attributes type='name' value='tinywga' /> |
---|
| 56 | </param> |
---|
| 57 | <param name='phef' value='tinywga' ftype='pphe' > |
---|
| 58 | <metadata name='base_name' value='tinywga' /> |
---|
| 59 | <composite_data value='tinywga.pphe' /> |
---|
| 60 | <edit_attributes type='name' value='tinywga' /> |
---|
| 61 | </param> |
---|
| 62 | <param name='title1' value='rgGLMtest1' /> |
---|
| 63 | <param name='predvar' value='c1' /> |
---|
| 64 | <param name='covar' value='None' /> |
---|
| 65 | <param name='inter' value='0' /> |
---|
| 66 | <param name='cond' value='' /> |
---|
| 67 | <param name='gender' value='0' /> |
---|
| 68 | <param name='mind' value='1.0' /> |
---|
| 69 | <param name='geno' value='1.0' /> |
---|
| 70 | <param name='maf' value='0.0' /> |
---|
| 71 | <param name='logistic' value='0' /> |
---|
| 72 | <output name='out_file1' file='rgGLMtest1_GLM.xls' ftype='tabular' compare="diff" /> |
---|
| 73 | <output name='logf' file='rgGLMtest1_GLM_log.txt' ftype='txt' compare="diff" lines_diff='36'/> |
---|
| 74 | <output name='gffout' file='rgGLMtest1_GLM_topTable.gff' compare="diff" ftype='gff' /> |
---|
| 75 | </test> |
---|
| 76 | </tests> |
---|
| 77 | <help> |
---|
| 78 | |
---|
| 79 | .. class:: infomark |
---|
| 80 | |
---|
| 81 | **Syntax** |
---|
| 82 | |
---|
| 83 | Note that this is a two-page tool: you choose the dependent trait and covariates |
---|
| 84 | on the second page, based on the phenotype file you select on the first page. |
---|
| 85 | |
---|
| 86 | - **Genotype file** is the input Plink format compressed genotype (pbed) file |
---|
| 87 | - **Phenotype file** is the input Plink phenotype (pphe) file with FAMID IID followed by phenotypes |
---|
| 88 | - **Dependent variable** is the term on the left of the model and is chosen from the pphe columns on the second page |
---|
| 89 | - **Logistic** fits a logistic model, for example when the outcome variable is disease status (case/control); otherwise the model is linear. |
---|
| 90 | - **Covariates** are covariate terms on the right of the model, also chosen on the second page |
---|
| 91 | - **Interactions** will add interactions - please be careful how you interpret these - see the Plink documentation. |
---|
| 92 | - **Gender** will add gender as a model term - described in the Plink documentation |
---|
| 93 | - **Condition** will condition the model on one or more specific SNP rs ids as a whitespace delimited sequence |
---|
| 94 | - **Format** determines how your data will be returned to your Galaxy workspace |
---|
| 95 | |
---|
| 96 | ----- |
---|
| 97 | |
---|
| 98 | .. class:: infomark |
---|
| 99 | |
---|
| 100 | **Summary** |
---|
| 101 | |
---|
| 102 | This tool tests GLM models in which each SNP predicts a dependent phenotype |
---|
| 103 | variable, with adjustment for the specified covariates. |
---|
| 104 | |
---|
| 105 | If you don't see the genotype or phenotype data set you want here, it can be imported using |
---|
| 106 | one of the methods available from the rg get data tool group. |
---|
| 107 | |
---|
| 108 | Output format can be UCSC .bed if you want to see one column of your |
---|
| 109 | results as a fully fledged UCSC genome browser track. A map file containing the chromosome and offset for each marker is |
---|
| 110 | required for writing this kind of output. |
---|
| 111 | Alternatively you can use .gg for the UCSC Genome Graphs tool which has all of the advantages |
---|
| 112 | of the .bed track, plus a neat, visual front end that displays a lot of useful clues. |
---|
| 113 | Either of these is a very useful way of quickly getting a look |
---|
| 114 | at your data in full genomic context. |
---|
| 115 | |
---|
| 116 | Finally, if you can't live without |
---|
| 117 | spreadsheet data, choose the .xls tab-delimited format. It's not a stupid binary Excel file. Just a plain old tab |
---|
| 118 | delimited |
---|
| 119 | one with a header. Fortunately Excel is dumb enough to open these without much protest. |
---|
| 120 | |
---|
| 121 | ----- |
---|
| 122 | |
---|
| 123 | .. class:: infomark |
---|
| 124 | |
---|
| 125 | **Attribution** |
---|
| 126 | |
---|
| 127 | This tool allows you to control settings for models using Plink linear models. So, we rely on the author (Shaun Purcell) |
---|
| 128 | for the documentation you need specific to those settings - they are very nicely documented at |
---|
| 129 | http://pngu.mgh.harvard.edu/~purcell/plink/anal.shtml#glm |
---|
| 130 | |
---|
| 131 | Tool and Galaxy datatypes originally designed and written for the Rgenetics |
---|
| 132 | series of whole genome scale statistical genetics tools by Ross Lazarus (ross.lazarus@gmail.com) |
---|
| 133 | supported by NIH grant |
---|
| 134 | Shaun Purcell created and maintains Plink |
---|
| 135 | |
---|
| 136 | Please acknowledge your use of this tool, Galaxy and Plink in your publications and let |
---|
| 137 | us know so we can keep track. These tools all rely on highly competitive grant funding |
---|
| 138 | so your letting us know about publications is important to our ongoing support. |
---|
| 139 | </help> |
---|
| 140 | </tool> |
---|
| 141 | |
---|