root/galaxy-central/tools/annotation_profiler/annotation_profiler.xml

Revision 2, 7.4 KB (committer: hatakeyama, 14 years ago)

import galaxy-central

1<tool id="Annotation_Profiler_0" name="Profile Annotations" version="1.0.0">
2  <description>for a set of genomic intervals</description>
3  <command interpreter="python">annotation_profiler_for_interval.py -i $input1 -c ${input1.metadata.chromCol} -s ${input1.metadata.startCol} -e ${input1.metadata.endCol} -o $out_file1 $keep_empty -p ${GALAXY_DATA_INDEX_DIR}/annotation_profiler/$dbkey $summary -b 3 -t $table_names</command>
4  <inputs>
5    <param format="interval" name="input1" type="data" label="Choose Intervals">
6      <validator type="dataset_metadata_in_file" filename="annotation_profiler_valid_builds.txt" metadata_name="dbkey" metadata_column="0" message="Profiling is not currently available for this species."/>
7    </param>
8    <param name="keep_empty" type="select" label="Keep Region/Table Pairs with 0 Coverage">
9      <option value="-k">Keep</option>
10      <option value="" selected="true">Discard</option>
11    </param>
12    <param name="summary" type="select" label="Output per Region/Summary">
13      <option value="-S">Summary</option>
14      <option value="" selected="true">Per Region</option>
15    </param>
16    <param name="table_names" type="drill_down" display="checkbox" hierarchy="recurse" multiple="true" label="Choose Tables to Use" help="Selecting no tables will result in using all tables." from_file="annotation_profiler_options.xml"/>
17   </inputs>
18   <outputs>
19     <data format="input" name="out_file1">
20       <change_format>
21         <when input="summary" value="-S" format="tabular" />
22       </change_format>
23     </data>
24   </outputs>
25   <tests>
26     <test>
27       <param name="input1" value="4.bed" dbkey="hg18"/>
28       <param name="keep_empty" value=""/>
29       <param name="summary" value=""/>
30       <param name="table_names" value="acembly,affyGnf1h,knownAlt,knownGene,mrna,multiz17way,multiz28way,refGene,snp126"/>
31       <output name="out_file1" file="annotation_profiler_1.out" />
32     </test>
33     <test>
34       <param name="input1" value="3.bed" dbkey="hg18"/>
35       <param name="keep_empty" value=""/>
36       <param name="summary" value="Summary"/>
37       <param name="table_names" value="acembly,affyGnf1h,knownAlt,knownGene,mrna,multiz17way,multiz28way,refGene,snp126"/>
38       <output name="out_file1" file="annotation_profiler_2.out" />
39     </test>
40   </tests>
41   <help>
42**What it does**
43
44Takes an input set of intervals and, for each interval, determines its base coverage by a set of features (tables) available from UCSC. Genomic regions in each feature table are first merged where they overlap or are directly adjacent (e.g. a table with ranges 1-10, 6-12, 12-20 and 25-28 yields two merged ranges: 1-20 and 25-28).
45
46By default, this tool will check the coverage of your intervals against all available features; you may, however, choose to select only those tables that you want to include. Selecting a section heading will effectively cause all of its children to be selected.
47
48You may alternatively choose to receive a summary across all of the intervals that you provide.
49
50-----
51
52**Example**
53
54Using the interval below and selecting several tables::
55
56 chr1 4558 14764 uc001aab.1 0 -
57
58results in::
59
60 chr1 4558 14764 uc001aab.1 0 - snp126Exceptions 151 142
61 chr1 4558 14764 uc001aab.1 0 - genomicSuperDups 10206 1
62 chr1 4558 14764 uc001aab.1 0 - chainOryLat1 3718 1
63 chr1 4558 14764 uc001aab.1 0 - multiz28way 10206 1
64 chr1 4558 14764 uc001aab.1 0 - affyHuEx1 3553 32
65 chr1 4558 14764 uc001aab.1 0 - netXenTro2 3050 1
66 chr1 4558 14764 uc001aab.1 0 - intronEst 10206 1
67 chr1 4558 14764 uc001aab.1 0 - xenoMrna 10203 1
68 chr1 4558 14764 uc001aab.1 0 - ctgPos 10206 1
69 chr1 4558 14764 uc001aab.1 0 - clonePos 10206 1
70 chr1 4558 14764 uc001aab.1 0 - chainStrPur2Link 1323 29
71 chr1 4558 14764 uc001aab.1 0 - affyTxnPhase3HeLaNuclear 9011 8
72 chr1 4558 14764 uc001aab.1 0 - snp126orthoPanTro2RheMac2 61 58
73 chr1 4558 14764 uc001aab.1 0 - snp126 205 192
74 chr1 4558 14764 uc001aab.1 0 - chainEquCab1 10206 1
75 chr1 4558 14764 uc001aab.1 0 - netGalGal3 3686 1
76 chr1 4558 14764 uc001aab.1 0 - phastCons28wayPlacMammal 10172 3
77
78Where::
79
80 The first added column is the table name.
81 The second added column is the number of bases covered by the table.
82 The third added column is the number of regions from the table that are overlapped by the interval.
83
84Alternatively, requesting a summary, using the intervals below and selecting several tables::
85
86 chr1 4558 14764 uc001aab.1 0 -
87 chr1 4558 19346 uc001aac.1 0 -
88
89results in::
90
91 #tableName tableSize tableRegionCount allIntervalCount allIntervalSize allCoverage allTableRegionsOverlaped allIntervalsOverlapingTable nrIntervalCount nrIntervalSize nrCoverage nrTableRegionsOverlaped nrIntervalsOverlapingTable
92 snp126Exceptions 133601 92469 2 24994 388 359 2 1 14788 237 217 1
93 genomicSuperDups 12268847 657 2 24994 24994 2 2 1 14788 14788 1 1
94 chainOryLat1 70337730 2542 2 24994 7436 2 2 1 14788 3718 1 1
95 affyHuEx1 15703901 112274 2 24994 7846 70 2 1 14788 4293 38 1
96 netXenTro2 111440392 1877 2 24994 6100 2 2 1 14788 3050 1 1
97 snp126orthoPanTro2RheMac2 700436 690674 2 24994 124 118 2 1 14788 63 60 1
98 intronEst 135796064 2332 2 24994 24994 2 2 1 14788 14788 1 1
99 xenoMrna 129031327 1586 2 24994 20406 2 2 1 14788 10203 1 1
100 snp126 956976 838091 2 24994 498 461 2 1 14788 293 269 1
101 clonePos 224999719 39 2 24994 24994 2 2 1 14788 14788 1 1
102 chainStrPur2Link 7948016 119841 2 24994 2646 58 2 1 14788 1323 29 1
103 affyTxnPhase3HeLaNuclear 136797870 140244 2 24994 22601 17 2 1 14788 13590 9 1
104 multiz28way 225928588 38 2 24994 24994 2 2 1 14788 14788 1 1
105 ctgPos 224999719 39 2 24994 24994 2 2 1 14788 14788 1 1
106 chainEquCab1 246306414 141 2 24994 24994 2 2 1 14788 14788 1 1
107 netGalGal3 203351973 461 2 24994 7372 2 2 1 14788 3686 1 1
108 phastCons28wayPlacMammal 221017670 22803 2 24994 24926 6 2 1 14788 14754 3 1
109
110Where::
111 
112 tableName is the name of the table
113 tableChromosomeCoverage is the number of positions existing in the table for only the chromosomes that were referenced by the interval file
114 tableChromosomeCount is the number of regions existing in the table for only the chromosomes that were referenced by the interval file
115 tableRegionCoverage is the number of positions existing in the table between the minimal and maximal bounding regions that were referenced by the interval file
116 tableRegionCount is the number of regions existing in the table between the minimal and maximal bounding regions that were referenced by the interval file
117 
118 allIntervalCount is the number of provided intervals
119 allIntervalSize is the sum of the lengths of the provided intervals
120 allCoverage is the sum of the coverage for each provided interval
121 allTableRegionsOverlapped is the sum of the number of regions of the table (non-unique) that were overlapped for each interval
122 allIntervalsOverlappingTable is the number of provided intervals which overlap the table
123 
124 nrIntervalCount is the number of non-redundant intervals
125 nrIntervalSize is the sum of the lengths of non-redundant intervals
126 nrCoverage is the sum of the coverage of non-redundant intervals
127 nrTableRegionsOverlapped is the number of regions of the table (unique) that were overlapped by the non-redundant intervals
128 nrIntervalsOverlappingTable is the number of non-redundant intervals which overlap the table
129 
130
131.. class:: infomark
132
133**TIP:** Non-redundant (nr) refers to the set of intervals that remains after the provided intervals have been merged to resolve overlaps.
134
135  </help>
136</tool>
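
The merge rule and the per-region columns described in the help text can be illustrated with a small sketch. This is not the code in `annotation_profiler_for_interval.py`; the function names `merge_ranges` and `profile_interval` are hypothetical, and the sketch assumes half-open `(start, end)` coordinates and that the region count refers to merged table regions.

```python
def merge_ranges(ranges):
    """Merge (start, end) ranges that overlap or are directly adjacent."""
    merged = []
    for start, end in sorted(ranges):
        if merged and merged[-1][1] >= start:
            # Overlapping or touching the previous range: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def profile_interval(interval, table_ranges):
    """Return (bases_covered, regions_overlapped) for one interval.

    Assumption: coverage is counted against the merged table regions.
    """
    start, end = interval
    bases = 0
    regions = 0
    for r_start, r_end in merge_ranges(table_ranges):
        # Length of the intersection; zero or negative means no overlap.
        overlap = min(end, r_end) - max(start, r_start)
        if overlap > 0:
            bases += overlap
            regions += 1
    return bases, regions


# Ranges taken from the example in the help text.
table = [(1, 10), (6, 12), (12, 20), (25, 28)]
print(merge_ranges(table))              # [(1, 20), (25, 28)]
print(profile_interval((5, 26), table)) # (16, 2)
```

Here the interval 5-26 overlaps the merged range 1-20 by 15 bases and the range 25-28 by 1 base, giving 16 covered bases across 2 regions. These correspond to the second and third added columns of the per-region output.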