<tool id="color2nuc" name="Convert Color Space" version="1.0.0">
  <description>to Nucleotides</description>
  <command interpreter="python">convert_SOLiD_color2nuc.py $input1 $input2 $output1 </command>

  <inputs>
    <param name="input1" type="data" format="txt" label="SOLiD color coding file" />
    <param name="input2" type="select" label="Keep prefix nucleotide">
      <option value="yes">Yes</option>
      <option value="no">No</option>
    </param>
  </inputs>
  <outputs>
    <data name="output1" format="fasta" />
  </outputs>
  <!--
  <tests>
    <test>
      <param name="input1" value="convert_SOLiD_color2nuc_test1.txt" ftype="txt" />
      <param name="input2" value="no" />
      <output name="output1" file="convert_SOLiD_color2nuc_test1.out" />
    </test>
  </tests>
  -->
  <help>

.. class:: warningmark

This tool is designed for color-space files generated by an ABI SOLiD sequencer. The file must be FASTA-like: each title line starts with a ">" character, and each color-space sequence starts with a leading nucleotide.

-----

**What it does**

This tool converts color-space sequences to nucleotide sequences. The leading character of each sequence must be a nucleotide: A, C, G, or T.

-----

**Example**

- Given a color-space file like this::

    >seq1
    A013
    >seq2
    T011213122200221123032111221021210131332222101

- **Keeping** the leading nucleotide gives::

    >seq1
    AACG
    >seq2
    TTGTCATGAGAAAGACAGCCGACACTCAAGTCAACGTATCTCTGGT

- **Not keeping** the leading nucleotide gives (each nucleotide sequence is one character shorter than its color-space sequence)::

    >seq1
    ACG
    >seq2
    TGTCATGAGAAAGACAGCCGACACTCAAGTCAACGTATCTCTGGT

-----

**ABI SOLiD Color Coding Alignment matrix**

Each dinucleotide is represented by a single digit, 0 to 3. The matrix is symmetric, and each digit stands for four different dinucleotides, so the leading nucleotide is necessary to determine the sequence (otherwise there are four possibilities, one per starting base).


.. image:: ../static/images/dualcolorcode.png


  </help>
</tool>
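<!--
The color-to-nucleotide decoding described in the help text can be sketched in Python as follows. This is a minimal illustration, not the contents of convert_SOLiD_color2nuc.py: the names COLOR_TO_NEXT and decode_color_space are hypothetical, and the lookup table assumes the standard SOLiD two-base encoding (0 keeps the base, 1 swaps A/C and G/T, 2 swaps A/G and C/T, 3 swaps A/T and C/G).

```python
# Illustrative sketch only; not the code in convert_SOLiD_color2nuc.py.
# COLOR_TO_NEXT and decode_color_space are hypothetical names.

# For each previous base, the string index is the color digit and the
# character at that index is the next base (assumed standard SOLiD encoding).
COLOR_TO_NEXT = {
    "A": "ACGT",
    "C": "CATG",
    "G": "GTAC",
    "T": "TGCA",
}


def decode_color_space(seq, keep_prefix=True):
    """Decode one color-space read such as 'A013' into nucleotides."""
    prefix, colors = seq[0].upper(), seq[1:]
    if prefix not in COLOR_TO_NEXT:
        raise ValueError("leading character must be A, C, G, or T")
    bases = [prefix]
    for color in colors:
        if color not in "0123":
            raise ValueError("unexpected color code: %r" % color)
        # The previous base plus the color digit determine the next base.
        bases.append(COLOR_TO_NEXT[bases[-1]][int(color)])
    decoded = "".join(bases)
    return decoded if keep_prefix else decoded[1:]


print(decode_color_space("A013"))                     # AACG
print(decode_color_space("A013", keep_prefix=False))  # ACG
```

Decoding the same colors "013" from each possible leading base (A, C, G, T) yields four different sequences, which is why the prefix nucleotide is required in the input.
-->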