root/galaxy-central/tools/regVariation/best_regression_subsets.xml @ 2

Revision 2, 2.4 KB (committer: hatakeyama, 14 years ago)

import galaxy-central

<tool id="BestSubsetsRegression1" name="Perform Best-subsets Regression">
  <description> </description>
  <command interpreter="python">
    best_regression_subsets.py
      $input1
      $response_col
      $predictor_cols
      $out_file1
      $out_file2
      1>/dev/null
      2>/dev/null
  </command>
  <inputs>
    <param format="tabular" name="input1" type="data" label="Select data" help="Query missing? See TIP below."/>
    <param name="response_col" label="Response column (Y)" type="data_column" data_ref="input1" />
    <param name="predictor_cols" label="Predictor columns (X)" type="data_column" data_ref="input1" multiple="true" >
        <validator type="no_options" message="Please select at least one column."/>
    </param>
  </inputs>
  <outputs>
    <data format="input" name="out_file1" metadata_source="input1" />
    <data format="pdf" name="out_file2" />
  </outputs>
  <requirements>
    <requirement type="python-module">rpy</requirement>
  </requirements>
  <tests>
    <!-- Testing this tool will not be possible because this tool produces a pdf output file.
    -->
  </tests>
  <help>

.. class:: infomark

**TIP:** If your data is not TAB delimited, use *Edit Queries-&gt;Convert characters*

-----

.. class:: infomark

**What it does**

This tool uses the 'regsubsets' function from the R statistical package for regression subset selection. It outputs two files: one containing a table of the best subsets with their summary statistics, and one containing a graphical representation of the results.

-----

.. class:: warningmark

**Note**

- This tool currently treats all predictor and response variables as continuous variables.

- Rows containing non-numeric (or missing) data in any of the chosen columns will be excluded from the analysis.

- The 6 columns in the output are described below:

  - Column 1 (Vars): the number of variables in the model
  - Column 2 ([c2 c3 c4...]): the list of user-selected predictor variables (full model). An asterisk denotes the presence of the corresponding predictor variable in the selected model.
  - Column 3 (R-sq): the fraction of variance explained by the model
  - Column 4 (Adj. R-sq): the R-squared statistic above, adjusted to penalize a higher number of predictors (p)
  - Column 5 (Cp): Mallows' Cp statistic
  - Column 6 (bic): Bayesian Information Criterion


  </help>
</tool>
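
The help text above describes skipping rows with non-numeric data and reporting R-sq, Adj. R-sq, Cp and BIC for the best subset of each size. The wrapped script `best_regression_subsets.py` is not shown here, so the following is only a minimal pure-Python sketch of that idea under stated assumptions: the function names (`parse_numeric_rows`, `solve`, `rss`, `best_subsets`) are hypothetical, the search is a brute-force enumeration rather than whatever `regsubsets` does internally, and the BIC uses the textbook form `n*ln(RSS/n) + (p+1)*ln(n)`, which may differ from R's value by a constant offset.

```python
from itertools import combinations
import math


def parse_numeric_rows(lines, cols):
    """Keep rows whose chosen 0-based columns all parse as finite floats.

    Each returned row holds the values in the order given by `cols`.
    """
    rows = []
    for line in lines:
        fields = line.rstrip("\r\n").split("\t")
        try:
            values = [float(fields[c]) for c in cols]
        except (ValueError, IndexError):
            continue  # non-numeric or missing value: skip the row
        if all(math.isfinite(v) for v in values):
            rows.append(values)
    return rows


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def rss(rows, y_idx, x_idx):
    """Residual sum of squares of a least-squares fit with an intercept."""
    design = [[1.0] + [r[j] for j in x_idx] for r in rows]
    y = [r[y_idx] for r in rows]
    k = len(design[0])
    # Normal equations: (A^T A) beta = A^T y
    ata = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    aty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    beta = solve(ata, aty)
    return sum((yi - sum(b * v for b, v in zip(beta, d))) ** 2
               for d, yi in zip(design, y))


def best_subsets(rows, y_idx, x_idx):
    """For each subset size p, report the lowest-RSS subset and its statistics."""
    n, k = len(rows), len(x_idx)
    tss = rss(rows, y_idx, [])  # intercept-only model gives the total sum of squares
    sigma2_full = rss(rows, y_idx, x_idx) / (n - k - 1)
    table = []
    for p in range(1, k + 1):
        subset = min(combinations(x_idx, p), key=lambda s: rss(rows, y_idx, s))
        r = rss(rows, y_idx, subset)
        r2 = 1.0 - r / tss
        table.append({
            "vars": p,
            "subset": subset,
            "r2": r2,
            "adj_r2": 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1),
            "cp": r / sigma2_full - n + 2 * (p + 1),
            "bic": n * math.log(r / n) + (p + 1) * math.log(n),  # textbook form (assumption)
        })
    return table


# Made-up tab-delimited data: column 0 is the response, columns 1-3 are predictors.
lines = [
    f"{2 * i + 0.5 * ((2 * i) % 5) + ((5 * i) % 3 - 1) * 0.1}\t{i}\t{(2 * i) % 5}\t{(3 * i) % 4}"
    for i in range(12)
]
lines.append("NA\t1\t2\t3")  # non-numeric response: this row is skipped
rows = parse_numeric_rows(lines, [0, 1, 2, 3])
table = best_subsets(rows, 0, [1, 2, 3])
for entry in table:
    print(entry)
```

The sketch needs more usable rows than predictors plus one, and a full model that does not fit the data exactly; otherwise the Cp and BIC expressions divide by zero or take the logarithm of zero. Enumerating every subset costs 2^k fits for k predictors, so it is only practical for a handful of columns.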