<tool id="rgQQ1" name="QQ Plots:">
  <code file="rgQQ_code.py"/>

  <description>for p values from an analysis </description>

  <command interpreter="python">
    rgQQ.py "$input1" "$name" "$sample" "$cols" "$allqq" "$height" "$width" "$logtrans" "$allqq.id" "$__new_file_path__"
  </command>

  <inputs>
    <page>
      <param name="input1" type="data" label="Choose the History dataset containing p values to QQ plot"
             size="80" format="tabular" help="Query missing? See Tip below" />
      <param name="name" type="text" size="80" label = "Descriptive title for QQ plot" value="QQ" />

      <param name="logtrans" type="boolean" label = "Use a log scale - recommended for p values in range 0-1.0"
             truevalue="true" falsevalue="false"/>
      <param name="sample" type="float" label="Random sample fraction - set to 1.0 for all data points" value="0.01"
             help="If you have a million values, the QQ plots will be huge - a random sample of 1% will be fine" />
      <param name="height" type="integer" label="PDF image height (inches)" value="6" />
      <param name="width" type="integer" label="PDF image width (inches)" value="6" />
    </page>
    <page>
      <param name="cols" type="select" display="checkboxes" multiple="True"
             help="Choose from these numeric columns in the data file to make a quantile-quantile plot against a uniform distribution"
             label="Columns (p values 0-1 eg) to make QQ plots" dynamic_options="get_columns( input1 )" />
    </page>
  </inputs>

  <outputs>
    <data format="pdf" name="allqq" />
  </outputs>

  <tests>
    <test>
      <param name='input1' value='tinywga.pphe' />
      <param name='name' value="rgQQtest1" />
      <param name='logtrans' value="false" />
      <param name='sample' value='1.0' />
      <param name='height' value='8' />
      <param name='width' value='10' />
      <param name='cols' value='3' />
      <output name='allqq' file='rgQQtest1.pdf' ftype='binary' compare="diff" lines_diff="29"/>
    </test>
  </tests>

  <help>

.. class:: infomark

**Explanation**

A quantile-quantile (QQ) plot is a good way to see systematic departures from the null expectation of uniform p values
from a genomic analysis. If the QQ plot shows departure from the null (i.e. a uniform 0-1 distribution), you hope that this will be
in the very smallest p values, suggesting that there might be some interesting results to look at. A log scale makes departures
from the null at low p values clearer.

-----

.. class:: infomark

**Syntax**

This tool has 2 pages. On the first page you choose the data set and output options; on the second page, the
column names are shown so you can choose the column or columns containing the p values you wish to plot.

- **History data** is one of your tabular history data sets
- **Descriptive Title** is the text that appears in the output file names to remind you what the plots are
- **Use a Log scale** is recommended for p values in the range 0-1 as it highlights departures from the null at small p values
- **Random Sample Fraction** is the fraction of points to randomly sample - highly recommended for >5k or so values
- **Height and Width** set the size of the PDF images in inches


-----

.. class:: infomark

**Summary**

Generate a uniform QQ plot for any large number of p values from an analysis.
Essentially, the n ranked p values are plotted against their rank expressed as a quantile, i.e. rank/n.
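
The rank/n idea can be sketched in a few lines of Python. This is only an illustration of the calculation described here, not the code in rgQQ.py; the function name ``uniform_qq_points`` is hypothetical.

```python
def uniform_qq_points(pvalues):
    """Pair each ranked p value with its expected uniform quantile, rank/n.

    Hypothetical helper for illustration; the tool itself may compute
    the expected quantiles differently.
    """
    observed = sorted(pvalues)
    n = len(observed)
    # Ranks run from 1 to n, so the expected quantiles are 1/n, 2/n, ..., 1.0
    expected = [rank / n for rank in range(1, n + 1)]
    return list(zip(expected, observed))


# Each row is (expected under the null, observed p value).
for expected, observed in uniform_qq_points([0.9, 0.02, 0.5, 0.3]):
    print(f"{expected:.2f}\t{observed:.2f}")
```

Points that fall below the diagonal are observed p values smaller than the uniform null predicts.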

Works well where you have a column containing p values from
a statistical test of some sort. These will be plotted against the values expected under the null. Departure
from the diagonal suggests one distribution is more extreme than the other. You hope your p values are
smaller than expected under the null.

The sampling fraction will help cut down the size of the PDFs. If there are fewer than 5k points on any plot, all will be shown.
Otherwise the number of points plotted is set by the sampling fraction or 5k, whichever is larger.
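
As a rough sketch of that sampling rule, assuming the count is simply the larger of 5000 and fraction times n (the function name ``sample_for_plot`` and its parameters are hypothetical, and rgQQ.py may round or sample differently):

```python
import random


def sample_for_plot(values, fraction, min_points=5000, seed=None):
    """Return the values to plot under the sampling rule described above.

    Hypothetical helper: an assumption about the rule, not the tool's code.
    """
    values = list(values)
    n = len(values)
    # Fewer than min_points values: plot everything.
    if min_points > n:
        return values
    # Otherwise keep fraction * n points, but never fewer than min_points
    # and never more than the n points available.
    k = min(n, max(min_points, int(round(fraction * n))))
    return random.Random(seed).sample(values, k)


print(len(sample_for_plot(range(100), 0.01)))
print(len(sample_for_plot(range(20000), 0.01, seed=1)))
```

With the default fraction of 0.01, the fraction only starts to matter above 500k values; below that the 5k floor decides how many points are drawn.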

Note that a log scale is ill-advised if you are plotting p values that have already been log transformed, because the
uniform distribution chosen for the QQ plot is always 0-1 and the log transformation is applied if requested.
The most useful plots for p values are log QQ plots of untransformed p values in the range 0-1.
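
A minimal sketch of the log version, assuming the transformation is -log10 applied to both the expected and the observed values (the base and sign are assumptions, and ``log_qq_points`` is a hypothetical name):

```python
import math


def log_qq_points(pvalues):
    """Pair -log10(rank/n) with -log10(p) for each ranked p value.

    Assumes untransformed p values greater than 0 and at most 1;
    a p value of exactly 0 would raise ValueError in math.log10.
    """
    observed = sorted(pvalues)
    n = len(observed)
    return [
        (-math.log10(rank / n), -math.log10(p))
        for rank, p in enumerate(observed, start=1)
    ]


for expected, observed in log_qq_points([0.001, 0.2, 0.5, 1.0]):
    print(f"{expected:.3f}\t{observed:.3f}")
```

Feeding in values that are already log transformed would compare them against expected values derived from a 0-1 uniform distribution, which is why the two no longer line up.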

Originally designed and written for family-based data from the CAMP Illumina run of 2007 by
Ross Lazarus (ross.lazarus@gmail.com)

  </help>
</tool>