<tool id="quality_score_distribution" name="Build base quality distribution" version="1.0.2">
  <description></description>

  <command interpreter="python">short_reads_figure_score.py $input1 $output1 </command>

  <inputs>
    <page>
      <param name="input1" type="data" format="qualsolexa, qual454" label="Quality score file" help="No dataset? Read tip below"/>
    </page>
  </inputs>

  <outputs>
    <data name="output1" format="png" />
  </outputs>
  <requirements>
    <requirement type="python-module">rpy</requirement>
  </requirements>
  <tests>
    <test>
      <param name="input1" value="solexa.qual" ftype="qualsolexa" />
      <output name="output1" file="solexaScore.png" ftype="png" />
    </test>
    <test>
      <param name="input1" value="454.qual" ftype="qual454" />
      <output name="output1" file="454Score.png" ftype="png" />
    </test>
  </tests>
  <help>

.. class:: warningmark

To use this tool, your dataset needs to be in the *Quality Score* format. Click the pencil icon next to your dataset to set the datatype to *Quality Score* (see below for examples).

-----

**What it does**

This tool takes quality files generated by Roche (454), Illumina (Solexa), or ABI SOLiD machines and builds a graph of the score distribution like the one below. Such a graph lets you make an initial assessment of data quality in a single pass.

-----

**Examples of Quality Data**

Roche (454) or ABI SOLiD data::

    >seq1
    23 33 34 25 28 28 28 32 23 34 27 4 28 28 31 21 28

Illumina (Solexa) data::

    -40 -40 40 -40 -40 -40 -40 40

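The Roche (454) / ABI SOLiD layout shown above (a header line starting with ``>`` followed by whitespace-separated integer scores) can be read with a few lines of Python. This is a minimal Python 3 sketch for illustration only, not the code in ``short_reads_figure_score.py``; the function name ``parse_quality_records`` is hypothetical, and the sketch assumes scores may continue over several lines until the next header.

```python
def parse_quality_records(lines):
    """Yield (name, scores) pairs from FASTA-like quality lines.

    Hypothetical helper: assumes each record starts with a '>' header
    and that every other non-empty line holds integer scores.
    """
    name, scores = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, scores
            name, scores = line[1:], []
        else:
            # Score lines are whitespace-separated integers.
            scores.extend(int(token) for token in line.split())
    if name is not None:
        yield name, scores


# The example record from the help text above.
example = [">seq1", "23 33 34 25 28 28 28 32 23 34 27 4 28 28 31 21 28"]
records = list(parse_quality_records(example))
```

Each element of ``records`` pairs a read name with its list of scores, which is the shape a per-position summary would need.
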

-----

**Output example**

Quality scores are summarized as a boxplot (Roche 454 FLX data):

.. image:: ../static/images/short_reads_boxplot.png

where the **X-axis** is the coordinate along the read and the **Y-axis** is the quality score adjusted to comply with the Phred score metric. Units on the X-axis depend on whether your data come from Roche (454) or from Illumina (Solexa) and ABI SOLiD machines:

- For Roche (454), the X-axis (shown above) indicates the **relative** position (in %) within reads, because this technology produces reads of different lengths;
- For Illumina (Solexa) and ABI SOLiD, the X-axis shows the **absolute** position in nucleotides within reads.

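The two axis conventions above can be sketched in Python. The help text says only that scores are adjusted to the Phred metric. The sketch below uses the standard Solexa-to-Phred relation, Q_phred = 10 * log10(10^(Q_solexa / 10) + 1), and it is an assumption that the tool's script applies exactly this formula. The function names and the 1-based percentage convention in ``relative_position`` are likewise hypothetical.

```python
import math


def solexa_to_phred(q_solexa):
    """Convert a Solexa-scaled score to the Phred scale.

    Standard relation between the two scales; whether the tool's
    script uses exactly this formula is an assumption.
    """
    return 10.0 * math.log10(10.0 ** (q_solexa / 10.0) + 1.0)


def relative_position(index, read_length):
    """Map a 0-based base index to a percentage of the read length.

    Hypothetical convention: the last base of a read maps to 100 %.
    """
    return 100.0 * (index + 1) / read_length


# Y-axis: a Solexa score of 40 is nearly unchanged on the Phred scale,
# while very low Solexa scores are pulled up toward 0.
high = solexa_to_phred(40)
low = solexa_to_phred(-40)

# X-axis for Roche (454): base 50 of a 100-base read sits at 50 %.
halfway = relative_position(49, 100)
```

Expressing 454 positions as percentages lets reads of different lengths share one X-axis, which is the reason the help text gives for the relative scale.
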

Every box on the plot shows the following values::

       o     &lt;---- Outliers
       o
      -+-    &lt;---- Upper Extreme Value that is no more
       |           than box length away from the box
       |
    +--+--+  &lt;---- Upper Quartile
    |     |
    +-----+  &lt;---- Median
    |     |
    +--+--+  &lt;---- Lower Quartile
       |
       |
      -+-    &lt;---- Lower Extreme Value that is no more
                   than box length away from the box
       o     &lt;---- Outlier

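The values in the diagram above can be computed for one set of scores as follows. This is a Python 3 sketch under stated assumptions, not the tool's plotting code (which relies on R through ``rpy``). The name ``box_stats`` and its ``coef`` parameter are hypothetical. ``coef=1.0`` follows the "no more than box length away" wording above, whereas R's own ``boxplot`` defaults to 1.5 box lengths, and the quartile convention used here may differ slightly from R's.

```python
import statistics


def box_stats(scores, coef=1.0):
    """Summarize scores the way one box on the plot does.

    coef is the whisker reach measured in box lengths (hypothetical
    parameter; 1.0 follows the help text, R's boxplot default is 1.5).
    Needs at least two scores.
    """
    data = sorted(scores)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    reach = coef * (q3 - q1)
    low_fence, high_fence = q1 - reach, q3 + reach
    # Extreme values are the outermost points still within reach of the box.
    inside = [x for x in data if x >= low_fence and high_fence >= x]
    # Everything beyond the fences is drawn as an individual outlier.
    outliers = [x for x in data if not (x >= low_fence and high_fence >= x)]
    return {
        "lower_extreme": inside[0],
        "lower_quartile": q1,
        "median": median,
        "upper_quartile": q3,
        "upper_extreme": inside[-1],
        "outliers": outliers,
    }


# Scores of seq1 from the quality data example in this help text.
example = [23, 33, 34, 25, 28, 28, 28, 32, 23, 34, 27, 4, 28, 28, 31, 21, 28]
stats = box_stats(example)
```

For these scores the single value of 4 lies more than one box length below the lower quartile, so it is reported as an outlier instead of stretching the lower whisker.
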

  </help>
</tool>