
execute_dwt_var_perClass.xml

Revision 2, 5.2 KB (committer: hatakeyama, 14 years ago)
import galaxy-central

<tool id="compute_p-values_max_variances_feature_occurrences_in_one_dataset_using_discrete_wavelet_transfom" name="Compute P-values and Max Variances for Feature Occurrences" version="1.0.0">
  <description>in one dataset using Discrete Wavelet Transforms</description>

  <command interpreter="perl">
    execute_dwt_var_perClass.pl $inputFile $outputFile1 $outputFile2 $outputFile3
  </command>

  <inputs>
    <param format="tabular" name="inputFile" type="data" label="Select the input file"/>
  </inputs>

  <outputs>
    <data format="tabular" name="outputFile1"/>
    <data format="tabular" name="outputFile2"/>
    <data format="pdf" name="outputFile3"/>
  </outputs>

  <help>

.. class:: infomark

What it does

This program generates plots and computes a table of maximum variances, p-values, and test orientations at multiple scales for the occurrences of a class of features in one dataset of DNA sequences, using a multiscale wavelet analysis technique.

The program assumes that the user has one set of DNA sequences, S, which consists of one or more sequences of equal length. Each sequence in S is divided into the same number of intervals, n, such that n = 2^k, where k is a positive integer. Thus, n can be any value in the set {2, 4, 8, 16, 32, 64, 128, ...}, and k is the number of scales.
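The constraint n = 2^k can be checked before an input file is built. The following Python snippet is a minimal sketch and is not part of the tool; the function name ``number_of_scales`` is hypothetical.

```python
def number_of_scales(n_intervals):
    """Return k such that n_intervals == 2**k with k >= 1, else raise ValueError."""
    # For an exact power of two, bit_length() - 1 is the exponent.
    k = n_intervals.bit_length() - 1
    if k == 0 or 2 ** k != n_intervals:
        raise ValueError("number of intervals must be 2, 4, 8, 16, ...")
    return k


print(number_of_scales(16))  # 16 intervals correspond to 4 scales
```

Interval counts such as 1, 12, or 0 are rejected because they are not of the form 2^k with k at least 1.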

The program takes one input file, obtained as follows:

For a given set of features, say motifs, the user counts the number of occurrences of each feature in each interval of each sequence in S, and builds a tabular file holding these counts, with one line per interval and one column per feature. This is the input file of the program.
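One way to build such a count table is sketched below in Python. This is an illustration under several assumptions that the tool does not state: motifs are treated as literal strings, occurrences are counted without overlap, occurrences that straddle an interval boundary are ignored, and counts are summed over all sequences in S. The names ``count_table`` and ``to_tabular`` are hypothetical.

```python
def count_table(sequences, motifs, n_intervals):
    """Count each motif in each interval, summed over all sequences.

    sequences   -- list of equal-length strings (the set S)
    motifs      -- dict mapping a feature name to a literal motif string
    n_intervals -- number of intervals per sequence (a power of two)
    """
    step = len(sequences[0]) // n_intervals
    rows = []
    for i in range(n_intervals):
        start, end = i * step, (i + 1) * step
        row = [sum(seq[start:end].count(motif) for seq in sequences)
               for motif in motifs.values()]
        rows.append(row)
    return rows


def to_tabular(names, rows):
    """Format a header line and the count rows as tab-separated text."""
    lines = ["\t".join(names)]
    lines += ["\t".join(str(count) for count in row) for row in rows]
    return "\n".join(lines) + "\n"


motifs = {"acgtSite": "ACGT", "aaRepeat": "AA"}  # made-up features
rows = count_table(["ACGTACGT", "AAAATTTT"], motifs, n_intervals=2)
print(to_tabular(list(motifs), rows))
```

The resulting text has a header line of feature names followed by one line per interval, which is the layout used in the example further down.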

The program produces three output files:

- The first output file is a TABULAR format file giving the scale at which each feature has its maximum variance.
- The second output file is a TABULAR format file giving the variances, p-values, and test orientations for the occurrences of the features at each scale, based on a random permutation test and a multiscale wavelet analysis technique.
- The third output file is a PDF file plotting the wavelet variances of each feature at each scale.

-----

.. class:: warningmark

Note

- If the number of features is greater than 12, the program divides each output file into subfiles, such that each subfile holds the results for a group of 12 features, except the last subfile, which holds the results for the remaining features. For example, if the number of features is 17, the p-values file consists of two subfiles, the first for features 1-12 and the second for features 13-17. The PDF file consists of two pages in this case.
- To obtain empirical p-values, the program runs a random permutation test, so it gives slightly different results each time it is run on the same input file.
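The two notes above can be illustrated with a short Python sketch. It is not the tool's implementation: ``feature_groups`` only mirrors the grouping rule described above, and ``pair_detail_energy`` is a stand-in statistic (the mean squared Haar detail coefficient at the finest scale) chosen for illustration. All three function names and the ``seed`` parameter are hypothetical.

```python
import random


def feature_groups(features, group_size=12):
    """Split features into consecutive groups of at most group_size items."""
    return [features[i:i + group_size]
            for i in range(0, len(features), group_size)]


def pair_detail_energy(values):
    """Stand-in statistic: mean squared Haar detail at the finest scale."""
    pairs = list(zip(values[0::2], values[1::2]))
    return sum((a - b) ** 2 / 2 for a, b in pairs) / len(pairs)


def empirical_p_value(values, statistic, n_perm=1000, seed=None):
    """Right-tailed empirical p-value from random permutations of values."""
    rng = random.Random(seed)  # seed=None: results differ from run to run
    observed = statistic(values)
    shuffled = list(values)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if statistic(shuffled) >= observed:
            hits += 1
    return hits / n_perm


groups = feature_groups(list(range(1, 18)))
print([len(group) for group in groups])  # [12, 5]
```

With 17 features the grouping yields one group of 12 and one of 5, matching the example in the note. Because the permutations are random, two unseeded calls to ``empirical_p_value`` on the same counts generally return slightly different values, whereas a fixed seed makes the sketch reproducible. Whether the tool itself can be seeded is not stated here.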

-----


Example

Counting the occurrences of 8 features (motifs) in 16 intervals (one line per interval) of a set of DNA sequences S gives the following tabular file::

  deletionHoptspot insertionHoptspot dnaPolPauseFrameshift indelHotspot topoisomeraseCleavageSite translinTarget vDjRecombinationSignal x-likeSite
  226 403 416 221 1165 832 749 1056
  236 444 380 241 1223 746 782 1207
  242 496 391 195 1116 643 770 1219
  243 429 364 191 1118 694 783 1223
  244 410 371 236 1063 692 805 1233
  230 386 370 217 1087 657 787 1215
  275 404 402 214 1044 697 831 1188
  265 443 365 231 1086 694 782 1184
  255 390 354 246 1114 642 773 1176
  281 384 406 232 1102 719 787 1191
  263 459 369 251 1135 643 810 1215
  280 433 400 251 1159 701 777 1151
  278 385 382 231 1147 697 707 1161
  248 393 389 211 1162 723 759 1183
  251 403 385 246 1114 752 776 1153
  239 383 347 227 1172 759 789 1141

The number of scales here is 4, because 16 = 2^4. Running the program on the above input file gives the following three output files:

The first output file::

  motifs max_var at scale
  deletionHoptspot NA
  insertionHoptspot NA
  dnaPolPauseFrameshift NA
  indelHotspot NA
  topoisomeraseCleavageSite 3
  translinTarget NA
  vDjRecombinationSignal NA
  x.likeSite NA

The second output file::

  motif 1_var 1_pval 1_test 2_var 2_pval 2_test 3_var 3_pval 3_test 4_var 4_pval 4_test

  deletionHoptspot 0.457 0.048 L 1.18 0.334 R 1.61 0.194 R 3.41 0.055 R
  insertionHoptspot 0.556 0.109 L 1.34 0.272 R 1.59 0.223 R 2.02 0.157 R
  dnaPolPauseFrameshift 1.42 0.089 R 0.66 0.331 L 0.421 0.305 L 0.121 0.268 L
  indelHotspot 0.373 0.021 L 1.36 0.254 R 1.24 0.301 R 4.09 0.047 R
  topoisomeraseCleavageSite 0.305 0.002 L 0.936 0.489 R 3.78 0.01 R 1.25 0.272 R
  translinTarget 0.525 0.061 L 1.69 0.11 R 2.02 0.131 R 0.00891 0.069 L
  vDjRecombinationSignal 0.68 0.138 L 0.957 0.46 R 2.35 0.071 R 1.03 0.357 R
  x.likeSite 0.928 0.402 L 1.33 0.261 R 0.735 0.431 L 0.783 0.422 R

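The second output file repeats three columns per scale (variance, p-value, test orientation). The Python sketch below shows one way to read that layout and list the feature and scale pairs whose p-value falls below a threshold. The function name ``significant_scales`` and the 0.05 threshold are assumptions made for illustration, and fields are assumed to be separated by whitespace.

```python
def significant_scales(text, alpha=0.05):
    """Return (motif, scale, p-value, orientation) for p-values below alpha."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    header, rows = lines[0], lines[1:]
    n_scales = (len(header) - 1) // 3  # three columns per scale
    hits = []
    for row in rows:
        motif = row[0]
        for s in range(n_scales):
            var, pval, test = row[1 + 3 * s:4 + 3 * s]
            if alpha > float(pval):
                hits.append((motif, s + 1, float(pval), test))
    return hits


sample = """motif 1_var 1_pval 1_test 2_var 2_pval 2_test 3_var 3_pval 3_test 4_var 4_pval 4_test

indelHotspot 0.373 0.021 L 1.36 0.254 R 1.24 0.301 R 4.09 0.047 R
x.likeSite 0.928 0.402 L 1.33 0.261 R 0.735 0.431 L 0.783 0.422 R
"""
for hit in significant_scales(sample):
    print(hit)
```

For the two sample rows, taken from the output above, this lists indelHotspot at scales 1 and 4 and nothing for x.likeSite.
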
The third output file:

.. image:: ../static/operation_icons/dwt_var_perClass.png

  </help>

</tool>


root/galaxy-central/tools/discreteWavelet/execute_dwt_var_perClass.xml
