short_reads_figure_score.py $input1 $output1
rpy
.. class:: warningmark
To use this tool, your dataset needs to be in the *Quality Score* format. Click the pencil icon next to your dataset to set the datatype to *Quality Score* (see below for examples).
-----
**What it does**
This tool takes quality files generated by Roche (454), Illumina (Solexa), or ABI SOLiD machines and builds a graph showing the score distribution, like the one below. Such a graph lets you perform an initial evaluation of data quality in a single pass.
-----
**Examples of Quality Data**
Roche (454) or ABI SOLiD data::

    >seq1
    23 33 34 25 28 28 28 32 23 34 27 4 28 28 31 21 28

Illumina (Solexa) data::

    -40 -40 40 -40 -40 -40 -40 40
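
The two layouts above can be read with one small parser. The following is only a sketch, not the tool's own code: ``parse_quality_scores`` is a hypothetical helper, and it assumes that a line starting with ``>`` names the read that follows and that every other non-empty line is a whitespace-separated list of integers::

    def parse_quality_scores(lines):
        """Yield (name, scores) pairs from quality-score text.

        Assumptions (not taken from the tool itself):
        - a line starting with '>' names the read that follows, and its
          scores may continue over several lines;
        - without any header, each line is treated as one unnamed read.
        """
        name, scores = None, []
        for raw in lines:
            line = raw.strip()
            if not line:
                continue
            if line.startswith(">"):
                # A new header closes the previous named read, if any.
                if scores:
                    yield name, scores
                name, scores = line[1:], []
            elif name is None:
                # Headerless data: one read per line.
                yield None, [int(tok) for tok in line.split()]
            else:
                scores.extend(int(tok) for tok in line.split())
        if scores:
            yield name, scores

For example, ``list(parse_quality_scores([">seq1", "23 33 34"]))`` gives ``[("seq1", [23, 33, 34])]``.
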
-----
**Output example**
Quality scores are summarized as a boxplot (Roche 454 FLX data):
.. image:: ../static/images/short_reads_boxplot.png
where the **X-axis** is the coordinate along the read and the **Y-axis** is the quality score adjusted to comply with the Phred score metric. Units on the X-axis depend on whether your data come from Roche (454) machines or from Illumina (Solexa) and ABI SOLiD machines:
- For Roche (454), the X-axis (shown above) indicates the **relative** position (in %) within reads, because this technology produces reads of different lengths;
- For Illumina (Solexa) and ABI SOLiD, the X-axis shows the **absolute** position in nucleotides within reads.
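
The difference between the two X-axis conventions can be illustrated with a short sketch. This is not the tool's implementation: the function names and the default of 20 percent-bins are assumptions chosen only for illustration::

    def bin_scores_by_position(reads):
        """Absolute positions: column i collects the i-th score of every read."""
        columns = []
        for scores in reads:
            for i, score in enumerate(scores):
                if i == len(columns):
                    columns.append([])
                columns[i].append(score)
        return columns

    def bin_scores_by_percent(reads, n_bins=20):
        """Relative positions: scores are grouped by the fraction of the read
        at which they occur, so reads of different lengths share one axis.

        n_bins is a hypothetical parameter; the tool's real bin count is
        not stated in this help text.
        """
        bins = [[] for _ in range(n_bins)]
        for scores in reads:
            length = len(scores)
            for i, score in enumerate(scores):
                # i / length lies in [0, 1), so the index is always in range.
                bins[i * n_bins // length].append(score)
        return bins

With reads of length 4 and 2 and ``n_bins=2``, the first half of each read lands in the first bin and the second half in the second, regardless of read length.
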
Every box on the plot shows the following values::

       o     <---- Outliers
       o
      -+-    <---- Upper Extreme Value that is no more
       |           than box length away from the box
       |
    +--+--+  <---- Upper Quartile
    |     |
    +-----+  <---- Median
    |     |
    +--+--+  <---- Lower Quartile
       |
       |
      -+-    <---- Lower Extreme Value that is no more
                   than box length away from the box
       o     <---- Outlier
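
The values in one box can be computed roughly as follows. This is a sketch under stated assumptions, not the tool's code: ``box_stats`` is a hypothetical helper, and the ``whisker`` multiplier reads "box length" as 1.0 times the interquartile range (R's ``boxplot`` uses 1.5 by default, so the rendered plot may differ). It needs at least two scores::

    import statistics

    def box_stats(scores, whisker=1.0):
        """Return the values drawn for one box of a boxplot.

        whisker is an assumed multiplier of the box length (interquartile
        range) that limits how far the extreme values may lie from the box.
        """
        q1, median, q3 = statistics.quantiles(scores, n=4, method="inclusive")
        reach = whisker * (q3 - q1)
        low, high = q1 - reach, q3 + reach
        inside = [s for s in scores if low <= s <= high]
        return {
            "lower_extreme": min(inside, default=q1),
            "lower_quartile": q1,
            "median": median,
            "upper_quartile": q3,
            "upper_extreme": max(inside, default=q3),
            # Points beyond the extremes are drawn individually.
            "outliers": sorted(s for s in scores if s < low or s > high),
        }

For the Roche (454) example scores shown earlier, this gives a median of 28, quartiles of 25 and 31, extremes of 21 and 34, and a single outlier at 4.
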