awk_tool.xml

Revision 3, 5.9 KB (committer: kohda, 14 years ago)
Install Unix tools http://hannonlab.cshl.edu/galaxy_unix_tools/galaxy.html

<tool id="cshl_awk_tool" name="awk">
  <description></description>
  <command interpreter="sh">awk_wrapper.sh $input $output '$file_data' '$FS' '$OFS'</command>
  <inputs>
    <param format="txt" name="input" type="data" label="File to process" />

    <param name="FS" type="select" label="Input field-separator">
      <option value=",">comma (,)</option>
      <option value=":">colon (:)</option>
      <option value=" ">single space</option>
      <option value=".">dot (.)</option>
      <option value="-">dash (-)</option>
      <option value="|">pipe (|)</option>
      <option value="_">underscore (_)</option>
      <option selected="True" value="tab">tab</option>
    </param>

    <param name="OFS" type="select" label="Output field-separator">
      <option value=",">comma (,)</option>
      <option value=":">colon (:)</option>
      <option value=" ">space ( )</option>
      <option value="-">dash (-)</option>
      <option value=".">dot (.)</option>
      <option value="|">pipe (|)</option>
      <option value="_">underscore (_)</option>
      <option selected="True" value="tab">tab</option>
    </param>

    <!-- Note: the parameter name MUST BE 'url_paste' -
         This is a hack in the galaxy library (see ./lib/galaxy/util/__init__.py line 142)
         If the name is 'url_paste' the string won't be sanitized, and all the non-alphanumeric characters
         will be passed to the shell script -->
    <param name="file_data" type="text" area="true" size="5x35" label="AWK Program" help="">
      <validator type="expression" message="Invalid Program!">value.find('\'')==-1</validator>
    </param>

  </inputs>
  <tests>
    <test>
      <param name="input" value="unix_awk_input1.txt" />
      <output name="output" file="unix_awk_output1.txt" />
      <param name="FS" value="tab" />
      <param name="OFS" value="tab" />
      <param name="file_data" value="$2>0.5 { print $2*9, $1 }" />
    </test>
  </tests>
  <outputs>
    <data format="input" name="output" metadata_source="input" />
  </outputs>
  <help>

What it does

This tool runs the Unix awk command on the selected data file.

.. class:: infomark

TIP: This tool uses the extended regular expression syntax (not the Perl syntax).


Further reading

- Awk by Example (http://www.ibm.com/developerworks/linux/library/l-awk1.html)
- Long AWK tutorial (http://www.grymoire.com/Unix/Awk.html)
- Learn AWK in 1 hour (http://www.selectorweb.com/awk.html)
- awk cheat-sheet (http://cbi.med.harvard.edu/people/peshkin/sb302/awk_cheatsheets.pdf)
- Collection of useful awk one-liners (http://student.northpark.edu/pemente/awk/awk1line.txt)

-----

AWK programs

Most AWK programs consist of patterns (i.e. rules that match lines of text) and actions (i.e. commands to execute when a pattern matches a line).

The basic form of an AWK program is::

    pattern { action 1; action 2; action 3; }


Pattern Examples

- $2 == "chr3" will match lines whose second column is the string 'chr3'.
- $5-$4>23 will match lines where subtracting the value of the fourth column from the value of the fifth column gives a value larger than 23.
- /AG..AG/ will match lines that contain the regular expression AG..AG (meaning the characters AG followed by any two characters followed by AG). (This is the way to specify regular expressions on the entire line, similar to grep.)
- $7 ~ /A{4}U/ will match lines whose seventh column contains 4 consecutive A's followed by a U. (This is the way to specify regular expressions on a specific field.)
- 10000 &lt; $4 &amp;&amp; $4 &lt; 20000 will match lines whose fourth column value is larger than 10,000 but smaller than 20,000.
- If no pattern is specified, all lines match (meaning the action part will be executed on all lines).
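
As a rough illustration of the pattern examples above, the shell sketch below runs two of them against a small made-up table. The sample rows, the variable names and the direct awk invocation are assumptions for demonstration only; inside Galaxy the program text is handed to awk_wrapper.sh rather than typed on a command line.

```shell
# Hypothetical sample data: name, chromosome, start, end (tab-separated).
sample=$(printf 'geneA\tchr3\t100\t150\ngeneB\tchr1\t200\t210\ngeneC\tchr3\t300\t310\n')

# Pattern only, no action: matching lines are printed unchanged.
on_chr3=$(printf '%s\n' "$sample" | awk -F'\t' '$2 == "chr3"')

# Arithmetic pattern: keep rows where end minus start is larger than 23,
# and print only the name column for those rows.
long_rows=$(printf '%s\n' "$sample" | awk -F'\t' '$4-$3>23 { print $1 }')

printf '%s\n' "$on_chr3"
printf '%s\n' "$long_rows"
```

In the tool itself only the part between the single quotes would go into the "AWK Program" box, and the field separators come from the two select lists.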


Action Examples

- { print } or { print $0 } will print the entire input line (the line that matched the pattern). $0 is a special marker meaning 'the entire line'.
- { print $1, $4, $5 } will print only the first, fourth and fifth fields of the input line.
- { print $4, $5-$4 } will print the fourth column and the difference between the fifth and fourth columns. (If the fourth column is the start position and the fifth column is the end position, the output will contain the start position and the length.)
- If no action part is specified (not even the curly brackets), the default action is to print the entire line.
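
The action examples can be tried in the same way. The following is a minimal sketch, assuming a hypothetical three-column table (name, start, end) and a plain awk on the command line; it is not how the Galaxy wrapper is invoked.

```shell
# Hypothetical interval rows: name, start, end (tab-separated).
rows=$(printf 'a\t10\t25\nb\t40\t100\n')

# Action only, no pattern: the action runs on every line.
# OFS is set to a tab so the printed fields are tab-separated as well.
lengths=$(printf '%s\n' "$rows" | awk -F'\t' -v OFS='\t' '{ print $2, $3-$2 }')

printf '%s\n' "$lengths"
```

Each output row holds the start position followed by the interval length, mirroring the start/length example in the list above.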


AWK's Regular Expression Syntax

Regular expressions in AWK patterns select lines that contain (or do not contain) a match. A regular expression is a pattern describing a certain amount of text.

- ( ) { } [ ] . * ? + \ ^ $ are all special characters. \\ can be used to "escape" a special character, allowing that special character to be searched for.
- ^ matches the beginning of a string (but not an internal line).
- ( .. ) groups a particular pattern.
- { n or n, or n,m } specifies an expected number of repetitions of the preceding pattern.

  - {n} The preceding item is matched exactly n times.
  - {n,} The preceding item is matched n or more times.
  - {n,m} The preceding item is matched at least n times but not more than m times.

- [ ... ] creates a character class. Within the brackets, single characters can be placed. A dash (-) may be used to indicate a range such as a-z.
- . Matches any single character except a newline.
- ***** The preceding item will be matched zero or more times.
- ? The preceding item is optional and matched at most once.
- + The preceding item will be matched one or more times.
- ^ has two meanings:

  - matches the beginning of a line or string.
  - indicates negation in a character class. For example, [^...] matches every character except the ones inside brackets.

- $ matches the end of a line or string.
- \| Separates alternate possibilities.


Note: AWK uses extended regular expression syntax, not Perl syntax. Perl shorthands such as \\d, \\w and \\s are not part of this syntax; use bracket expressions such as [0-9] instead.
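
A few of the operators listed above can be checked with a short shell sketch. The sample strings and variable names are made up for illustration, and the interval form {n} is left out on purpose because its support varies between awk implementations.

```shell
# Hypothetical input, one string per line.
seqs=$(printf 'AAAAU\nAAAU\nAGCCAG\nsample12\n')

# Dot: AG, then any two characters, then AG.
dots=$(printf '%s\n' "$seqs" | awk '/AG..AG/')

# Bracket expression with + and $: one or more digits at the end of the line
# (used here in place of the Perl-style \d).
digits=$(printf '%s\n' "$seqs" | awk '/[0-9]+$/')

# Grouping with alternation, anchored at both ends of the line.
alt=$(printf '%s\n' "$seqs" | awk '/^(AAAU|AGCCAG)$/')

printf '%s\n' "$dots" "$digits" "$alt"
```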

  </help>
</tool>

root/galaxy-central/tools/unix_tools/awk_tool.xml