| 1 | <tool id="rgEigPCA1" name="Eigensoft:"> |
---|
| 2 | <code file="rgEigPCA_code.py"/> |
---|
| 3 | <description>PCA Ancestry using SNP</description> |
---|
| 4 | |
---|
| 5 | <command interpreter="python"> |
---|
| 6 | rgEigPCA.py "$i.extra_files_path/$i.metadata.base_name" "$title" "$out_file1" |
---|
| 7 | "$out_file1.files_path" "$k" "$m" "$t" "$s" "$pca" |
---|
| 8 | </command> |
---|
| 9 | |
---|
| 10 | <inputs> |
---|
| 11 | |
---|
| 12 | <param name="i" type="data" label="Input genotype data file" |
---|
| 13 | size="120" format="ldindep" /> |
---|
| 14 | <param name="title" type="text" value="Ancestry PCA" label="Title for outputs from this run" |
---|
| 15 | size="80" /> |
---|
| 16 | <param name="k" type="integer" value="4" label="Number of principal components to output" |
---|
| 17 | size="3" /> |
---|
| 18 | <param name="m" type="integer" value="0" label="Max. outlier removal iterations" |
---|
| 19 | help="Leave m=0 to keep outlier removal off. To turn it on, set m=5 or so. Do this if you plan on using the principal components to adjust any analyses" |
---|
| 20 | size="3" /> |
---|
| 21 | <param name="t" type="integer" value="5" label="# principal components used for outlier removal" |
---|
| 22 | size="3" /> |
---|
| 23 | <param name="s" type="integer" value="6" label="#SDs for outlier removal" |
---|
| 24 | help="Any individual whose score along one of the top t principal components (t is set above) exceeds s SDs will be removed as an outlier." |
---|
| 25 | size="3" /> |
---|
| 26 | |
---|
| 27 | </inputs> |
---|
| 28 | |
---|
| 29 | <outputs> |
---|
| 30 | <data name="out_file1" format="html" /> |
---|
| 31 | <data name="pca" format="txt" /> |
---|
| 32 | </outputs> |
---|
| 33 | |
---|
| 34 | <!-- python $TOOLPATH/$TOOL.py "$INPATH/tinywga" "$NPRE" ${OUTPATH}/${NPRE}.html $OUTPATH 4 2 2 2 $OUTPATH/pca.out $BINPATH --> |
---|
| 35 | |
---|
| 36 | <tests> |
---|
| 37 | <test> |
---|
| 38 | <param name='i' value='tinywga' ftype='ldindep' > |
---|
| 39 | <metadata name='base_name' value='tinywga' /> |
---|
| 40 | <composite_data value='tinywga.bim' /> |
---|
| 41 | <composite_data value='tinywga.bed' /> |
---|
| 42 | <composite_data value='tinywga.fam' /> |
---|
| 43 | <edit_attributes type='name' value='tinywga' /> |
---|
| 44 | </param> |
---|
| 45 | <param name='title' value='rgEigPCAtest1' /> |
---|
| 46 | <param name="k" value="4" /> |
---|
| 47 | <param name="m" value="2" /> |
---|
| 48 | <param name="t" value="2" /> |
---|
| 49 | <param name="s" value="2" /> |
---|
| 50 | <output name='out_file1' file='rgtestouts/rgEigPCA/rgEigPCAtest1.html' ftype='html' compare='diff' lines_diff='195'> |
---|
| 51 | <extra_files type="file" name='rgEigPCAtest1_PCAPlot.pdf' value="rgtestouts/rgEigPCA/rgEigPCAtest1_PCAPlot.pdf" compare="sim_size" delta="3000"/> |
---|
| 52 | </output> |
---|
| 53 | <output name='pca' file='rgtestouts/rgEigPCA/rgEigPCAtest1.txt' compare='diff'/> |
---|
| 54 | </test> |
---|
| 55 | </tests> |
---|
| 56 | |
---|
| 57 | <help> |
---|
| 58 | |
---|
| 59 | |
---|
| 60 | **Syntax** |
---|
| 61 | |
---|
| 62 | - **Genotype data** is the input genotype file chosen from available library files. |
---|
| 63 | - **Title** is used to name the output files. |
---|
| 64 | - **Tuning parameters** are documented in the Eigensoft documentation; see below. |
---|
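
As a rough illustration of what the outlier tuning parameters control, the sketch below flags individuals whose score on any of the first t principal components lies more than s standard deviations from the mean. It is a minimal sketch only: the function name ``find_outliers`` and the plain list-of-rows input are assumptions made for this example, it performs a single pass, and it is not the EIGENSOFT implementation, which repeats the removal for up to m iterations.

```python
from statistics import mean, pstdev


def find_outliers(scores, t, s):
    """Return sorted indices of rows flagged as outliers.

    scores: one row per individual, one column per principal component
            (hypothetical input layout chosen for this sketch).
    t:      number of leading components to inspect.
    s:      threshold in standard deviations.
    """
    outliers = set()
    for pc in range(t):
        column = [row[pc] for row in scores]
        mu = mean(column)
        sd = pstdev(column)
        if sd == 0:
            # A constant component cannot separate anyone; skip it.
            continue
        for idx, value in enumerate(column):
            if abs(value - mu) / sd > s:
                outliers.add(idx)
    return sorted(outliers)


# Nine individuals at the origin and one far away on the first component.
toy_scores = [[0.0, 1.0] for _ in range(9)] + [[10.0, 1.0]]
flagged = find_outliers(toy_scores, t=2, s=2)
```

With the tool default of s=6 the same toy data would flag nobody, which is why a lower threshold is used in the toy call.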
| 65 | |
---|
| 66 | (Note that you may need to convert an existing genotype file into the ldindep format to use this tool.) |
---|
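
The form fields are passed to the wrapper script as positional arguments, in the order given by the command template of this tool. The sketch below shows that ordering with the form defaults (k=4, m=0, t=5, s=6). The helper name ``build_argv`` and the example paths are hypothetical and exist only for illustration; Galaxy itself fills in the real paths.

```python
def build_argv(base_path, title, html_out, files_path, pca_out,
               k=4, m=0, t=5, s=6):
    """Assemble the positional argument list for rgEigPCA.py.

    Order follows the tool's command template:
    input base name, title, html output, output files directory,
    k, m, t, s, pca output.
    """
    return [
        "python", "rgEigPCA.py",
        base_path, title, html_out, files_path,
        str(k), str(m), str(t), str(s),
        pca_out,
    ]


# Hypothetical paths, with the tuning values used by the tool's own test.
argv = build_argv("tinywga", "rgEigPCAtest1", "rgEigPCAtest1.html",
                  "outdir", "pca.txt", k=4, m=2, t=2, s=2)
```

A list like this could be handed to ``subprocess.run`` without going through a shell, which avoids quoting problems when the title contains spaces.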
| 67 | |
---|
| 68 | ----- |
---|
| 69 | |
---|
| 70 | **Summary** |
---|
| 71 | |
---|
| 72 | **Attribution** |
---|
| 73 | This tool runs and relies on the work of many others, including the |
---|
| 74 | maintainers of the Eigensoft program, and the R and |
---|
| 75 | Bioconductor projects. For full attribution, source code and documentation, please see |
---|
| 76 | http://genepath.med.harvard.edu/~reich/Software.htm, http://cran.r-project.org/ |
---|
| 77 | and http://www.bioconductor.org/ respectively. |
---|
| 78 | |
---|
| 79 | This implementation is a Galaxy tool wrapper around these third party applications. |
---|
| 80 | It was originally designed and written for family based data from the CAMP Illumina run of 2007 by |
---|
| 81 | Ross Lazarus (ross.lazarus@gmail.com) and incorporated into the rgenetics toolkit. |
---|
| 82 | |
---|
| 83 | Copyright Ross Lazarus 2007. |
---|
| 84 | Licensed under the terms of the LGPL as documented at http://www.gnu.org/licenses/lgpl.html, |
---|
| 85 | but it is about as useful as a sponge boat without the EIGENSOFT PCA code. |
---|
| 86 | |
---|
| 87 | **README from eigensoft2** |
---|
| 88 | |
---|
| 89 | [rerla@beast eigensoft2]$ cat README |
---|
| 90 | EIGENSOFT version 2.0, January 2008 (for Linux only) |
---|
| 91 | |
---|
| 92 | This is the same as our EIGENSOFT 2.0 BETA release with a few recent changes |
---|
| 93 | as described at http://genepath.med.harvard.edu/~reich/New_In_EIGENSOFT.htm. |
---|
| 94 | |
---|
| 95 | Features of EIGENSOFT version 2.0 include: |
---|
| 96 | -- Keeping track of ref/var alleles in all file formats: see CONVERTF/README |
---|
| 97 | -- Handling data sets up to 8 billion genotypes: see CONVERTF/README |
---|
| 98 | -- Output SNP weightings of each principal component: see POPGEN/README |
---|
| 99 | |
---|
| 100 | The EIGENSOFT package implements methods from the following 2 papers: |
---|
| 101 | Patterson N. et al. 2006 PLoS Genetics in press (population structure) |
---|
| 102 | Price A.L. et al. 2006 NG 38:904-9 (EIGENSTRAT stratification correction) |
---|
| 103 | |
---|
| 104 | See POPGEN/README for documentation of population structure programs. |
---|
| 105 | |
---|
| 106 | See EIGENSTRAT/README for documentation of EIGENSTRAT programs. |
---|
| 107 | |
---|
| 108 | See CONVERTF/README for documentation of programs for converting file formats. |
---|
| 109 | |
---|
| 110 | |
---|
| 111 | Executables and source code: |
---|
| 112 | ---------------------------- |
---|
| 113 | All C executables are in the bin/ directory. |
---|
| 114 | |
---|
| 115 | We have placed source code for all C executables in the src/ directory, |
---|
| 116 | for users who wish to modify and recompile our programs. For example, to |
---|
| 117 | recompile the eigenstrat program, type |
---|
| 118 | "cd src" |
---|
| 119 | "make eigenstrat" |
---|
| 120 | "mv eigenstrat ../bin" |
---|
| 121 | |
---|
| 122 | Note that some of our software will only compile if your system has the |
---|
| 123 | lapack package installed. (This package is used to compute eigenvectors.) |
---|
| 124 | Some users may need to change "blas-3" to "blas" in the Makefile, |
---|
| 125 | depending on how blas and lapack are installed. |
---|
| 126 | |
---|
| 127 | If cc is not available on your system, try "cp Makefile.alt Makefile" |
---|
| 128 | and then recompile. |
---|
| 129 | |
---|
| 130 | If you have trouble compiling and running our code, try compiling and |
---|
| 131 | running the pcatoy program in the src directory: |
---|
| 132 | "cd src" |
---|
| 133 | "make pcatoy" |
---|
| 134 | "./pcatoy" |
---|
| 135 | If you are unable to run the pcatoy program successfully, please contact |
---|
| 136 | your system administrator for help, as this is a systems issue which is |
---|
| 137 | beyond our scope. Your system administrator will be able to troubleshoot |
---|
| 138 | your systems issue using this trivial program. [You can also try running |
---|
| 139 | the pcatoy program in the bin directory, which we have already compiled.] |
---|
| 140 | </help> |
---|
| 141 | </tool> |
---|
| 142 | |
---|