root/galaxy-central/tools/fastx_toolkit/fasta_clipping_histogram.xml

Revision 2, 3.0 KB (committer: hatakeyama, 14 years ago)

import galaxy-central

<tool id="cshl_fasta_clipping_histogram" name="Length Distribution">
        <description>chart</description>
        <requirements><requirement type="package">fastx_toolkit</requirement></requirements>
        <command>fasta_clipping_histogram.pl $input $outfile</command>

        <inputs>
                <param format="fasta" name="input" type="data" label="Library to analyze" />
        </inputs>

        <outputs>
                <data format="png" name="outfile" metadata_source="input" />
        </outputs>
<help>

**What it does**

This tool creates a histogram image of the sequence-length distribution in a given FASTA dataset.

**TIP:** Use this tool after clipping your library (with the **FASTX Clipper** tool) to visualize the clipping results.

-----

**Output Examples**

In the following library, most sequences are 24-mers to 27-mers.
This could indicate an abundance of endo-siRNAs (depending, of course, on what you tried to sequence in the first place).

.. image:: ./static/fastx_icons/fasta_clipping_histogram_1.png


In the following library, most sequences are 19-, 22-, or 23-mers.
This could indicate an abundance of miRNAs (depending, of course, on what you tried to sequence in the first place).

.. image:: ./static/fastx_icons/fasta_clipping_histogram_2.png

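As a rough illustration of what the charts above summarize, the following Python sketch tallies sequence lengths from FASTA text in which each sequence sits on one line. It is not the tool's implementation (the tool runs ``fasta_clipping_histogram.pl``); the function name ``length_distribution`` and the text-based bars are assumptions made for this example only.

```python
from collections import Counter


def length_distribution(fasta_lines):
    """Count how many sequences of each length appear in single-line FASTA text."""
    counts = Counter()
    for line in fasta_lines:
        line = line.strip()
        # Skip blank lines and identifier lines; every other line is one whole sequence.
        if not line or line.startswith(">"):
            continue
        counts[len(line)] += 1
    return counts


# Sample reads, one sequence per line.
example = [
    ">sequence1",
    "AGTAGTAGGTGATGTAGAGAGAGAGAGAGTAG",
    ">sequence2",
    "GTGTGTGTGGGAAGTTGACACAGTA",
    ">sequence3",
    "CCTTGAGATTAACGCTAATCAAGTAAAC",
]

# Print one text bar per observed length, shortest first.
for length, n in sorted(length_distribution(example).items()):
    print(length, "#" * n)
```

A peak in such a tally at a given length corresponds to a tall bar at that length in the chart.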

-----


**Input Formats**

This tool accepts short-read FASTA files. The reads do not have to be short, but each sequence must be on a single line, like so::

   >sequence1
   AGTAGTAGGTGATGTAGAGAGAGAGAGAGTAG
   >sequence2
   GTGTGTGTGGGAAGTTGACACAGTA
   >sequence3
   CCTTGAGATTAACGCTAATCAAGTAAAC


The following sequence spans multiple lines::

   >sequence1
   CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAG
   TCTTCGGTCATAACACAAACCCAGACCTACGTATATGACAAAGCTAATAG
   aactggtctttacctTTAAGTTG

In that case, use the **FASTA Width Formatter** tool to reformat the FASTA file into single-line sequences::

   >sequence1
   CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAGTCTTCGGTCATAACACAAACCCAGACCTACGTATATGACAAAGCTAATAGaactggtctttacctTTAAGTTG

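The reformatting step above amounts to joining every line that belongs to the same record. The Python sketch below shows that idea only; it is not how the **FASTA Width Formatter** tool is implemented, and ``join_wrapped_sequences`` is a hypothetical helper name.

```python
def join_wrapped_sequences(fasta_lines):
    """Yield (identifier, sequence) pairs with each sequence joined onto one line."""
    identifier, chunks = None, []
    for line in fasta_lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new identifier closes the previous record, if there was one.
            if identifier is not None:
                yield identifier, "".join(chunks)
            # Sequence text seen before the first identifier is discarded here.
            identifier, chunks = line, []
        else:
            chunks.append(line)
    # Emit the final record once the input is exhausted.
    if identifier is not None:
        yield identifier, "".join(chunks)


wrapped = [
    ">sequence1",
    "CAGCATCTACATAATATGATCGCTATTAAACTTAAATCTCCTTGACGGAG",
    "TCTTCGGTCATAACACAAACCCAGACCTACGTATATGACAAAGCTAATAG",
    "aactggtctttacctTTAAGTTG",
]
for identifier, sequence in join_wrapped_sequences(wrapped):
    print(identifier)
    print(sequence)
```

The sketch preserves letter case, as in the example above, so lowercase bases stay lowercase after joining.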

-----



**Multiplicity counts (a.k.a. read counts)**

If the sequence identifier (the text after the '>') contains a dash and a number, the number is treated as a multiplicity count (i.e., how many times that individual sequence was repeated in the original FASTA file, before collapsing).

Example 1 - The following FASTA file *does not* have multiplicity counts::

    >seq1
    GGATCC
    >seq2
    GGTCATGGGTTTAAA
    >seq3
    GGGATATATCCCCACACACACACAC

Each sequence counts as one, to produce the following chart:

.. image:: ./static/fastx_icons/fasta_clipping_histogram_3.png


Example 2 - The following FASTA file has multiplicity counts::

    >seq1-2
    GGATCC
    >seq2-10
    GGTCATGGGTTTAAA
    >seq3-3
    GGGATATATCCCCACACACACACAC

The first sequence counts as 2, the second as 10, and the third as 3, to produce the following chart:

.. image:: ./static/fastx_icons/fasta_clipping_histogram_4.png

Use the **FASTA Collapser** tool to create FASTA files with multiplicity counts.

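The weighting described in this section can be sketched in Python as follows. The exact rule the tool uses to find the count is not shown here, so this sketch assumes the count is a trailing ``-NUMBER`` at the end of the identifier; ``multiplicity`` and ``weighted_lengths`` are hypothetical names, not part of the FASTX-toolkit.

```python
import re

# Assumption: the count is a trailing "-NUMBER" at the very end of the identifier.
COUNT_SUFFIX = re.compile(r"-(\d+)$")


def multiplicity(identifier):
    """Return the read count encoded in an identifier, or 1 if there is none."""
    match = COUNT_SUFFIX.search(identifier.strip())
    return int(match.group(1)) if match else 1


def weighted_lengths(records):
    """Sum multiplicities per sequence length for (identifier, sequence) pairs."""
    totals = {}
    for identifier, sequence in records:
        length = len(sequence)
        totals[length] = totals.get(length, 0) + multiplicity(identifier)
    return totals


# The records from Example 2.
records = [
    (">seq1-2", "GGATCC"),
    (">seq2-10", "GGTCATGGGTTTAAA"),
    (">seq3-3", "GGGATATATCCCCACACACACACAC"),
]
print(weighted_lengths(records))
```

With identifiers that carry no count, as in Example 1, every record falls back to a weight of one.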
</help>
</tool>
<!-- FASTA-Clipping-Histogram is part of the FASTX-toolkit, by A.Gordon (gordon@cshl.edu) -->