sequences
	fastx_toolkit
	zcat -f '$input' | fastx_collapser -v -o '$output' 
	
		
	
    
	
		
	
  
**What it does**
This tool collapses identical sequences in a FASTA file into a single sequence.
--------
**Example**
Example Input File (Sequence "ATAT" appears multiple times):: 
    >CSHL_2_FC0042AGLLOO_1_1_605_414
    TGCG
    >CSHL_2_FC0042AGLLOO_1_1_537_759
    ATAT
    >CSHL_2_FC0042AGLLOO_1_1_774_520
    TGGC
    >CSHL_2_FC0042AGLLOO_1_1_742_502
    ATAT
    >CSHL_2_FC0042AGLLOO_1_1_781_514
    TGAG
    >CSHL_2_FC0042AGLLOO_1_1_757_487
    TTCA
    >CSHL_2_FC0042AGLLOO_1_1_903_769
    ATAT
    >CSHL_2_FC0042AGLLOO_1_1_724_499
    ATAT
Example Output file::
    >1-1
    TGCG
    >2-4
    ATAT
    >3-1
    TGGC
    >4-1
    TGAG
    >5-1
    TTCA
    
.. class:: infomark
Original Sequence Names / Lane descriptions (e.g. "CSHL_2_FC0042AGLLOO_1_1_742_502") are discarded. 
The output sequence name is composed of two numbers: the first is the sequence's number, the second is the multiplicity value.
The following output::
    >2-4
    ATAT
means that the sequence "ATAT" is the second sequence in the file, and it appeared 4 times in the input FASTA file.