| 1 | <tool id="find_diag_hits" name="Find diagnostic hits" version="1.0.0"> |
---|
| 2 | <description></description> |
---|
| 3 | <requirements> |
---|
| 4 | <requirement type="package">taxonomy</requirement> |
---|
| 5 | </requirements> |
---|
| 6 | <command interpreter="python">find_diag_hits.py $input1 $id_col $rank_list $out_format $out_file1</command> |
---|
| 7 | <inputs> |
---|
| 8 | <param format="taxonomy" name="input1" type="data" label="Find diagnostic hits in"/> |
---|
| 9 | <param name="id_col" type="data_column" data_ref="input1" numerical="False" label="Select column with sequence id" /> |
---|
| 10 | <param name="rank_list" type="select" display="checkboxes" multiple="true" label="Select taxonomic ranks"> |
---|
| 11 | <option value="superkingdom">Superkingdom</option> |
---|
| 12 | <option value="kingdom">Kingdom</option> |
---|
| 13 | <option value="subkingdom">Subkingdom</option> |
---|
| 14 | <option value="superphylum">Superphylum</option> |
---|
| 15 | <option value="phylum">Phylum</option> |
---|
| 16 | <option value="subphylum">Subphylum</option> |
---|
| 17 | <option value="superclass">Superclass</option> |
---|
| 18 | <option value="class">Class</option> |
---|
| 19 | <option value="subclass">Subclass</option> |
---|
| 20 | <option value="superorder">Superorder</option> |
---|
| 21 | <option value="order">Order</option> |
---|
| 22 | <option value="suborder">Suborder</option> |
---|
| 23 | <option value="superfamily">Superfamily</option> |
---|
| 24 | <option value="family">Family</option> |
---|
| 25 | <option value="subfamily">Subfamily</option> |
---|
| 26 | <option value="tribe">Tribe</option> |
---|
| 27 | <option value="subtribe">Subtribe</option> |
---|
| 28 | <option value="genus">Genus</option> |
---|
| 29 | <option value="subgenus">Subgenus</option> |
---|
| 30 | <option selected="true" value="species">Species</option> |
---|
| 31 | <option value="subspecies">Subspecies</option> |
---|
| 32 | </param> |
---|
| 33 | <param name="out_format" type="select" label="Select output format"> |
---|
| 34 | <option value="reads">Diagnostic read list</option> |
---|
| 35 | <option value="counts">Number of diagnostic reads per taxonomic rank</option> |
---|
| 36 | </param> |
---|
| 37 | </inputs> |
---|
| 38 | <outputs> |
---|
| 39 | <data format="tabular" name="out_file1" /> |
---|
| 40 | </outputs> |
---|
| 41 | <tests> |
---|
| 42 | <test> |
---|
| 43 | <param name="input1" value="taxonomyGI.taxonomy" ftype="taxonomy"/> |
---|
| 44 | <param name="id_col" value="1" /> |
---|
| 45 | <param name="rank_list" value="order,genus" /> |
---|
| 46 | <param name="out_format" value="counts" /> |
---|
| 47 | <output name="out_file1" file="find_diag_hits.tabular" /> |
---|
| 48 | </test> |
---|
| 49 | </tests> |
---|
| 50 | |
---|
| 51 | |
---|
| 52 | <help> |
---|
| 53 | |
---|
| 54 | **What it does** |
---|
| 55 | |
---|
| 56 | Metagenomic analyses often require identifying sequence reads that are diagnostic of a particular taxonomic group, that is, reads whose hits all fall within a single taxon at a given rank. This tool takes the output of *Taxonomy manipulation->Fetch Taxonomic Ranks* and returns either a list of the reads unique to a taxon at each selected rank, or a list of taxa with the number of unique reads assigned to each. |
---|
| 57 | |
---|
| 58 | ------ |
---|
| 59 | |
---|
| 60 | **Example** |
---|
| 61 | |
---|
| 62 | Suppose *Taxonomy manipulation->Fetch Taxonomic Ranks* generated the following taxonomy representation:: |
---|
| 63 | |
---|
| 64 | read1 2 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Laurasiatheria n Ruminantia n Bovidae Bovinae n n Bos n Bos taurus n |
---|
| 65 | read2 12585 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Primates Haplorrhini Hominoidea Hominidae n n n Homo n Homo sapiens n |
---|
| 66 | read1 58615 root Eukaryota Metazoa n n Arthropoda n Hexapoda Insecta Neoptera Amphiesmenoptera Lepidoptera Glossata Papilionoidea Nymphalidae Nymphalinae Melitaeini Phyciodina Anthanassa n Anthanassa otanes n |
---|
| 67 | read3 56785 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Primates Haplorrhini Hominoidea Hominidae n n n Homo n Homo sapiens n |
---|
| 68 | |
---|
| 69 | Running this tool with the following parameters: |
---|
| 70 | |
---|
| 71 | * *Select column with sequence id* set to **c1** |
---|
| 72 | * *Select taxonomic ranks* with **order** and **genus** checked |
---|
| 73 | * *Select output format* set to **Diagnostic read list** |
---|
| 74 | |
---|
| 75 | will return:: |
---|
| 76 | |
---|
| 77 | read2 Primates order |
---|
| 78 | read3 Primates order |
---|
| 79 | read2 Homo genus |
---|
| 80 | read3 Homo genus |
---|
| 81 | |
---|
| 82 | Setting *Select output format* to **Number of diagnostic reads per taxonomic rank** will produce:: |
---|
| 83 | |
---|
| 84 | Primates 2 order |
---|
| 85 | Homo 2 genus |
---|
| 86 | |
---|
| 87 | .. class:: infomark |
---|
| 88 | |
---|
| 89 | Note that **read1** is omitted because it is non-unique: it hits Mammals and Insects at the same time. |
---|
| 90 | |
---|
| 91 | -------- |
---|
| 92 | |
---|
| 93 | .. class:: warningmark |
---|
| 94 | |
---|
| 95 | This tool omits "**n**" entries, which mark ranks missing from the NCBI taxonomy. In the above example, the *Homo sapiens* lineage contains an order name (Primates) while the *Bos taurus* lineage does not. |
---|
| 96 | |
---|
| 97 | |
---|
| 98 | </help> |
---|
| 99 | </tool> |
---|
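
The selection rule described in the help text can be illustrated with a short Python sketch. This is not the code of `find_diag_hits.py`, whose implementation is not shown here. It assumes tab-separated rows laid out as read id, taxonomy id, `root`, and then the 21 ranks in the order listed in the `rank_list` parameter. It also assumes a read counts as diagnostic at a rank only when all of its hits carry the same name at that rank and that name is not `n`. The names `diagnostic_reads`, `diagnostic_counts`, `RANKS`, and `RANK_OFFSET` are hypothetical, and `id_col` is 0-based here, unlike the 1-based column in the tool form.

```python
from collections import defaultdict

# Assumed column layout: read id, taxonomy id, "root", then these 21 ranks.
RANKS = [
    "superkingdom", "kingdom", "subkingdom", "superphylum", "phylum",
    "subphylum", "superclass", "class", "subclass", "superorder", "order",
    "suborder", "superfamily", "family", "subfamily", "tribe", "subtribe",
    "genus", "subgenus", "species", "subspecies",
]
RANK_OFFSET = 3  # 0-based index of the first rank column (assumption)


def diagnostic_reads(rows, ranks, id_col=0):
    """Return (read, name, rank) for reads whose hits agree on one name at a rank."""
    result = []
    for rank in ranks:
        col = RANK_OFFSET + RANKS.index(rank)
        names_by_read = defaultdict(set)
        for row in rows:
            names_by_read[row[id_col]].add(row[col])
        for read in sorted(names_by_read):
            names = names_by_read[read]
            # Keep the read only if every hit agrees and the rank is present.
            if len(names) == 1 and "n" not in names:
                result.append((read, next(iter(names)), rank))
    return result


def diagnostic_counts(rows, ranks, id_col=0):
    """Return (name, number of diagnostic reads, rank) in first-seen order."""
    counts = {}
    for _read, name, rank in diagnostic_reads(rows, ranks, id_col):
        counts[(name, rank)] = counts.get((name, rank), 0) + 1
    return [(name, n, rank) for (name, rank), n in counts.items()]


# The four rows from the help example, tab-separated.
EXAMPLE = [
    "read1\t2\troot\tEukaryota\tMetazoa\tn\tn\tChordata\tCraniata\tGnathostomata\t"
    "Mammalia\tn\tLaurasiatheria\tn\tRuminantia\tn\tBovidae\tBovinae\tn\tn\tBos\t"
    "n\tBos taurus\tn",
    "read2\t12585\troot\tEukaryota\tMetazoa\tn\tn\tChordata\tCraniata\tGnathostomata\t"
    "Mammalia\tn\tEuarchontoglires\tPrimates\tHaplorrhini\tHominoidea\tHominidae\t"
    "n\tn\tn\tHomo\tn\tHomo sapiens\tn",
    "read1\t58615\troot\tEukaryota\tMetazoa\tn\tn\tArthropoda\tn\tHexapoda\tInsecta\t"
    "Neoptera\tAmphiesmenoptera\tLepidoptera\tGlossata\tPapilionoidea\tNymphalidae\t"
    "Nymphalinae\tMelitaeini\tPhyciodina\tAnthanassa\tn\tAnthanassa otanes\tn",
    "read3\t56785\troot\tEukaryota\tMetazoa\tn\tn\tChordata\tCraniata\tGnathostomata\t"
    "Mammalia\tn\tEuarchontoglires\tPrimates\tHaplorrhini\tHominoidea\tHominidae\t"
    "n\tn\tn\tHomo\tn\tHomo sapiens\tn",
]
rows = [line.split("\t") for line in EXAMPLE]

for read, name, rank in diagnostic_reads(rows, ["order", "genus"]):
    print(read, name, rank, sep="\t")
```

Under these assumptions the sketch reproduces the two outputs shown in the help example: **read1** is dropped at the order rank because its two hits disagree (`n` versus Lepidoptera) and at the genus rank because they name different genera.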