<tool id="trimmer" name="Trim" version="0.0.1">
  <description>leading or trailing characters</description>
  <command interpreter="python">
  trimmer.py -a -f $input1 -c $col -s $start -e $end -i $ignore $fastq > $out_file1
  </command>
  <inputs>
    <param format="tabular,txt" name="input1" type="data" label="this dataset"/>
    <param name="col" type="integer" value="0" label="Trim this column only" help="0 = process entire line" />
    <param name="start" type="integer" size="10" value="1" label="Trim from the beginning to this position" help="1 = do not trim the beginning"/>
    <param name="end" type="integer" size="10" value="0" label="Remove everything from this position to the end" help="0 = do not trim the end"/>
    <param name="fastq" type="select" label="Is input dataset in fastq format?" help="If set to YES, the tool will not trim even-numbered lines (0, 2, 4, etc.)">
      <option selected="true" value="">No</option>
      <option value="-q">Yes</option>
    </param>
    <param name="ignore" type="select" display="checkboxes" multiple="True" label="Ignore lines beginning with these characters" help="Lines beginning with these characters are not trimmed">
      <option value="62">&gt;</option>
      <option value="64">@</option>
      <option value="43">+</option>
      <option value="60">&lt;</option>
      <option value="42">*</option>
      <option value="45">-</option>
      <option value="61">=</option>
      <option value="124">|</option>
      <option value="63">?</option>
      <option value="36">$</option>
      <option value="46">.</option>
      <option value="58">:</option>
      <option value="38">&amp;</option>
      <option value="37">%</option>
      <option value="94">^</option>
      <option value="35">#</option>
    </param>
  </inputs>
  <outputs>
    <data name="out_file1" format="input" metadata_source="input1"/>
  </outputs>
  <tests>
    <test>
      <param name="input1" value="trimmer_tab_delimited.dat"/>
      <param name="col" value="0"/>
      <param name="start" value="1"/>
      <param name="end" value="13"/>
      <param name="ignore" value="62"/>
      <param name="fastq" value="No"/>
      <output name="out_file1" file="trimmer_a_f_c0_s1_e13_i62.dat"/>
    </test>
    <test>
      <param name="input1" value="trimmer_tab_delimited.dat"/>
      <param name="col" value="2"/>
      <param name="start" value="1"/>
      <param name="end" value="2"/>
      <param name="ignore" value="62"/>
      <param name="fastq" value="No"/>
      <output name="out_file1" file="trimmer_a_f_c2_s1_e2_i62.dat"/>
    </test>
  </tests>

  <help>

**What it does**

Trims a specified number of characters from a dataset or from one of its fields (if the dataset is tab-delimited).

-----

**Example 1**

Trimming this dataset::

  1234567890
  abcdefghijk

by setting **Trim from the beginning to this position** to *2* and **Remove everything from this position to the end** to *6* will produce::

  23456
  bcdef

-----

**Example 2**

Trimming column 2 of this dataset::

  abcde 12345 fghij 67890
  fghij 67890 abcde 12345

by setting **Trim this column only** to *2*, **Trim from the beginning to this position** to *2*, and **Remove everything from this position to the end** to *4* will produce::

  abcde 234 fghij 67890
  fghij 789 abcde 12345
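
The position arithmetic in the two examples can be sketched in Python. This is a minimal illustration of the behaviour shown in the examples, not the trimmer.py source: the function names ``trim_text`` and ``trim_line`` are hypothetical, positions are assumed to be 1-based with the end position itself retained (as both examples imply), and fields are assumed to be tab-separated.

```python
def trim_text(text, start=1, end=0):
    """Keep characters from 1-based position start through end, inclusive.

    end == 0 means "keep everything to the end", mirroring the form default.
    """
    stop = end if end != 0 else len(text)
    return text[start - 1:stop]


def trim_line(line, col=0, start=1, end=0):
    """Trim a whole line (col == 0) or only one tab-delimited field (1-based col)."""
    if col == 0:
        return trim_text(line, start, end)
    fields = line.split("\t")
    # Only the selected field is trimmed; the others pass through unchanged.
    fields[col - 1] = trim_text(fields[col - 1], start, end)
    return "\t".join(fields)


# Example 1: whole-line trim with start=2, end=6.
print(trim_line("1234567890", start=2, end=6))
# Example 2: trim only column 2 with start=2, end=4.
print(trim_line("abcde\t12345\tfghij\t67890", col=2, start=2, end=4))
```

Run as written, the sketch prints ``23456`` followed by the first Example 2 row with its second field reduced to ``234``.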

-----

**Trimming FASTQ datasets**

This tool can be used to trim sequences and quality strings in fastq datasets. To do so, select *Yes* from the **Is input dataset in fastq format?** dropdown. If set to *Yes*, the tool will skip all even-numbered lines (see warning below). For example, trimming the last 5 bases of this dataset::

  @081017-and-081020:1:1:1715:1759
  GGACTCAGATAGTAATCCACGCTCCTTTAAAATATC
  +
  II#IIIIIII$5+.(9IIIIIII$%*$G$A31I&amp;&amp;B

can be done by setting **Remove everything from this position to the end** to *31*::

  @081017-and-081020:1:1:1715:1759
  GGACTCAGATAGTAATCCACGCTCCTTTAAA
  +
  II#IIIIIII$5+.(9IIIIIII$%*$G$A3

**Note** that headers are skipped.
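
The line-skipping rule for fastq input can be sketched in Python. This is an illustration of the behaviour described here, not the trimmer.py source: ``trim_fastq`` and the short record below are hypothetical, lines are counted from 0, and each line is assumed to have its newline already removed.

```python
def trim_fastq(lines, start=1, end=0):
    """Trim only odd-indexed lines (sequence and quality), counting from 0."""
    trimmed = []
    for index, line in enumerate(lines):
        if index % 2 == 0:
            # Lines 0, 2, 4, ... are the '@' and '+' headers: keep them untouched.
            trimmed.append(line)
        else:
            # end == 0 means "do not trim the end", as in the form default.
            stop = end if end != 0 else len(line)
            trimmed.append(line[start - 1:stop])
    return trimmed


# Hypothetical 20-base record, used only for illustration.
record = [
    "@read_1",
    "GGACTCAGATAGTAATCCAC",
    "+",
    "IIIIIIIIIIIIIIIIIIII",
]
for line in trim_fastq(record, end=15):
    print(line)
```

With ``end=15`` the sequence and quality lines are cut to 15 characters each, while the two header lines come back unchanged. The sketch relies purely on line position, which is why input that violates the layout described in the warning would be trimmed incorrectly.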

.. class:: warningmark

**WARNING:** This tool works only on properly formatted fastq datasets in which (1) each read and each quality string occupies a single line and (2) the '@' (read header) and '+' (quality header) lines are even-numbered (counting from 0), as in the example above.

  </help>
</tool>