pass.xml @ 2

Revision 2, 4.4 KB (committer: hatakeyama, 14 years ago)
import galaxy-central

<tool id="hgv_pass" name="PASS" version="1.0.0">
  <description>significant transcription factor binding sites from ChIP data</description>

  <command interpreter="bash">
    pass_wrapper.sh "$input" "$min_window" "$max_window" "$false_num" "$output"
  </command>

  <inputs>
    <param format="gff" name="input" type="data" label="Dataset"/>
    <param name="min_window" label="Smallest window size (by # of probes)" type="integer" value="2" />
    <param name="max_window" label="Largest window size (by # of probes)" type="integer" value="6" />
    <param name="false_num" label="Expected total number of false positive intervals to be called" type="float" value="5.0" help="N.B.: this is a &lt;em&gt;count&lt;/em&gt;, not a rate." />
  </inputs>

  <outputs>
    <data format="tabular" name="output" />
  </outputs>

  <requirements>
    <requirement type="binary">pass</requirement>
    <requirement type="binary">sed</requirement>
  </requirements>

  <!-- we need to be able to set the seed for the random number generator
  <tests>
    <test>
      <param name="input" ftype="gff" value="pass_input.gff"/>
      <param name="min_window" value="2"/>
      <param name="max_window" value="6"/>
      <param name="false_num" value="5"/>
      <output name="output" file="pass_output.tab"/>
    </test>
  </tests>
  -->

  <help>
Dataset formats

The input is in GFF_ format, and the output is tabular_.
(`Dataset missing?`_)

.. _GFF: ./static/formatHelp.html#gff
.. _tabular: ./static/formatHelp.html#tab
.. _Dataset missing?: ./static/formatHelp.html

-----

What it does

PASS (Poisson Approximation for Statistical Significance) detects
significant transcription factor binding sites in the genome from
ChIP data. This is probably the only peak-calling method that
accurately controls the false-positive rate and FDR in ChIP data,
which is important given the large discrepancies among results obtained
from different peak-calling algorithms. At the same time, this
method achieves similar or better power than previous methods.

<!-- we don't have wrapper support for the "prior" file yet
Another unique feature of this method is that it allows varying
thresholds to be used for peak calling at different genomic
locations. For example, if a position lies in an open chromatin
region, is depleted of nucleosome positioning, or a co-binding
protein has been detected within the neighborhood, then the position
is more likely to be bound by the target protein of interest, and
hence a lower threshold will be used to call significant peaks.
As a result, weak but real binding sites can be detected.
-->

-----

Hints

- ChIP-Seq data:

  If the data is from ChIP-Seq, you need to convert the ChIP-Seq values
  into z-scores before using this program. It is also recommended that
  you group read counts within a neighborhood together, e.g. in tiled
  windows of 30 bp. In this way, the ChIP-Seq data will resemble
  ChIP-chip data in format.

- Choosing window size options:

  The window size is related to the probe tiling density. For example,
  if the probes are tiled every 100 bp, then setting the smallest
  window = 2 and largest window = 6 is appropriate, because the DNA
  fragment size is around 300-500 bp.

-----

Example

- input file::

    chr7  Nimblegen  ID  40307603  40307652  1.668944     .  .  .
    chr7  Nimblegen  ID  40307703  40307752  0.8041307    .  .  .
    chr7  Nimblegen  ID  40307808  40307865  -1.089931    .  .  .
    chr7  Nimblegen  ID  40307920  40307969  1.055044     .  .  .
    chr7  Nimblegen  ID  40308005  40308068  2.447853     .  .  .
    chr7  Nimblegen  ID  40308125  40308174  0.1638694    .  .  .
    chr7  Nimblegen  ID  40308223  40308275  -0.04796628  .  .  .
    chr7  Nimblegen  ID  40308318  40308367  0.9335709    .  .  .
    chr7  Nimblegen  ID  40308526  40308584  0.5143972    .  .  .
    chr7  Nimblegen  ID  40308611  40308660  -1.089931    .  .  .
    etc.

  In GFF, a dot '.' means "not applicable".

- output file::

    ID  Chr   Start     End       WinSz  PeakValue  # of FPs  FDR
    1   chr7  40310931  40311266  4      1.663446   0.248817  0.248817

-----

References

Zhang Y. (2008)
Poisson approximation for significance in genome-wide ChIP-chip tiling arrays.
Bioinformatics. 24(24):2825-31. Epub 2008 Oct 25.

Chen KB, Zhang Y. (2010)
A varying threshold method for ChIP peak calling using multiple sources of information.
Submitted.

  </help>
</tool>
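
The ChIP-Seq hint in the help text (group read counts into tiled windows of 30bp, then convert the values to z-scores so the data resembles ChIP-chip input) could be sketched roughly as follows. This is only an illustration under stated assumptions, not part of the tool or its wrapper: the function names are hypothetical, the z-score is a plain standardization over all tiles (mean and population standard deviation), and the help text does not say which normalization the PASS authors actually intend. The `source` and `feature` column values are placeholders modeled on the example input.

```python
from collections import Counter
from statistics import mean, pstdev


def tile_counts(read_starts, window=30):
    """Count reads per fixed-width tile; keys are 0-based tile start positions."""
    return dict(Counter((pos // window) * window for pos in read_starts))


def to_zscores(values):
    """Standardize values to zero mean and unit variance (assumed normalization)."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("all tiles have the same count; z-scores are undefined")
    return [(v - mu) / sigma for v in values]


def tiles_to_gff(chrom, read_starts, span, window=30,
                 source="ChIPSeq", feature="ID"):
    """Build GFF lines (one per tile) for the 0-based half-open region `span`.

    Tiles without reads are included with a count of zero, so that the
    z-scores are computed over the whole region rather than covered tiles only.
    """
    counts = tile_counts(read_starts, window)
    first_tile = span[0] - span[0] % window
    starts = list(range(first_tile, span[1], window))
    zscores = to_zscores([counts.get(s, 0) for s in starts])
    lines = []
    for s, z in zip(starts, zscores):
        # GFF coordinates are 1-based and inclusive.
        fields = [chrom, source, feature, str(s + 1), str(s + window),
                  f"{z:.6g}", ".", ".", "."]
        lines.append("\t".join(fields))
    return lines
```

Calling `tiles_to_gff("chr7", read_starts, (region_start, region_end))` yields tab-separated lines with the same nine columns as the example input, which could then be uploaded as a GFF dataset.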

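The window-size hint relates probe tiling density to DNA fragment size. As a rough check, the probe spacing can be estimated from the example input and compared with the 300-500bp fragment range quoted in the help text. The helper names below are hypothetical, and splitting on whitespace is an assumption that fits the listing as shown (real GFF files are tab-separated).

```python
from statistics import median

# The ten probe lines from the example input in the help text.
EXAMPLE_GFF = """\
chr7 Nimblegen ID 40307603 40307652 1.668944 . . .
chr7 Nimblegen ID 40307703 40307752 0.8041307 . . .
chr7 Nimblegen ID 40307808 40307865 -1.089931 . . .
chr7 Nimblegen ID 40307920 40307969 1.055044 . . .
chr7 Nimblegen ID 40308005 40308068 2.447853 . . .
chr7 Nimblegen ID 40308125 40308174 0.1638694 . . .
chr7 Nimblegen ID 40308223 40308275 -0.04796628 . . .
chr7 Nimblegen ID 40308318 40308367 0.9335709 . . .
chr7 Nimblegen ID 40308526 40308584 0.5143972 . . .
chr7 Nimblegen ID 40308611 40308660 -1.089931 . . .
"""


def probe_starts(gff_text):
    """Return the sorted start coordinates (4th GFF column) of all probes."""
    starts = []
    for line in gff_text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comments
        starts.append(int(line.split()[3]))
    return sorted(starts)


def median_spacing(starts):
    """Median distance between consecutive probe starts, in bp."""
    return median(b - a for a, b in zip(starts, starts[1:]))


def probes_per_fragment(spacing, fragment_bp):
    """Approximate number of probe intervals spanned by one DNA fragment."""
    return fragment_bp / spacing


spacing = median_spacing(probe_starts(EXAMPLE_GFF))
print(spacing)                             # 100
print(probes_per_fragment(spacing, 300))   # 3.0
print(probes_per_fragment(spacing, 500))   # 5.0
```

With probes about 100bp apart, a 300-500bp fragment spans roughly 3 to 5 probe intervals; the settings suggested in the help text (smallest window 2, largest window 6) bracket that range.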
root/galaxy-central/tools/human_genome_variation/pass.xml @ 2