CompareProspector

CompareProspector Output

The following is a typical output, with explanations:

****************************************
*                                      *
*   CompareProspector Search Result    *
*                                      *
****************************************

First, the program searches for motifs in a number of runs and the highest-scoring motif from each run is listed:

Each line gives: Try #n; motif score; the motif found (and its reverse complement); number of aligned segments. For example:

Try #19 3.101 CAGCTGTT AACAGCTG 21
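If you want to process these lines in a script, the sketch below shows one way to split a "Try" line into its fields and check that the second sequence is the reverse complement of the first. The field layout is inferred only from the example line above, and the names `parse_try_line` and `reverse_complement` are hypothetical helpers, not part of CompareProspector.

```python
import re

# Translation table for complementing plain DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# ASSUMPTION: layout inferred from the single example line in this document.
TRY_LINE = re.compile(r"^Try #(\d+)\s+([\d.]+)\s+([ACGT]+)\s+([ACGT]+)\s+(\d+)$")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string made of A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def parse_try_line(line):
    """Split one 'Try' line into run number, score, motif, reverse complement, segments."""
    match = TRY_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"not a Try line: {line!r}")
    run, score, motif, rev, segments = match.groups()
    return {
        "run": int(run),
        "score": float(score),
        "motif": motif,
        "reverse_complement": rev,
        "segments": int(segments),
    }


record = parse_try_line("Try #19 3.101 CAGCTGTT AACAGCTG 21")
# The two sequences on the line should be reverse complements of each other.
consistent = reverse_complement(record["motif"]) == record["reverse_complement"]
print(record, consistent)
```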

The program then reports as many top-scoring motifs as you specified. Each motif is reported as follows:

The highest scoring 3 motifs are:

Motif width (blk 1, blk 2); Gap [min gap, max gap]; Motif raw score (= log(number of segments) * (relative entropy of the motif)); Number of aligned segments. Since CompareProspector only searches for one-block motifs, the width of the second block, as well as the min and max gaps, will always be 0.

Motif #1: (CAGCTGTC/GACAGCTG)

******************************

Width (8, 0); Gap [0, 0]; MotifScore 3.169; Sites 21
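As a rough illustration of how the summary line relates to the score formula, the sketch below parses the line and backs out the relative entropy implied by the score and the number of sites. The base of the logarithm is not stated here, so the natural log is an assumption, and `parse_summary` and `implied_relative_entropy` are hypothetical names.

```python
import math
import re

# ASSUMPTION: layout inferred from the example summary line in this document.
SUMMARY = re.compile(
    r"Width \((\d+), (\d+)\); Gap \[(\d+), (\d+)\]; "
    r"MotifScore ([\d.]+); Sites (\d+)"
)


def parse_summary(line):
    """Extract block widths, gap range, raw score and site count."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError(f"not a motif summary line: {line!r}")
    w1, w2, gap_min, gap_max, score, sites = match.groups()
    return {
        "width": (int(w1), int(w2)),
        "gap": (int(gap_min), int(gap_max)),
        "score": float(score),
        "sites": int(sites),
    }


def implied_relative_entropy(score, sites):
    """Invert score = log(sites) * relative_entropy.

    ASSUMPTION: 'log' is the natural logarithm; the document does not say.
    """
    if sites < 2:
        raise ValueError("need at least 2 sites, since log(1) is 0")
    return score / math.log(sites)


summary = parse_summary("Width (8, 0); Gap [0, 0]; MotifScore 3.169; Sites 21")
entropy = implied_relative_entropy(summary["score"], summary["sites"])
print(summary, round(entropy, 3))
```

If the program uses a different log base, only `implied_relative_entropy` needs to change.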

Motifs are listed as position-specific probability matrices, with each line representing one motif column (position). The base probabilities for A, C, G, and T are given as percentages. Each line also gives, in IUPAC symbols, the consensus (Con: the most abundant base), the reverse complement consensus (rCon), the degenerate consensus (Deg: all bases with > 25% abundance are considered), and the reverse degenerate consensus (rDeg). Also listed are the sequences that have the motif, including sequence name, length of the sequence, site number, orientation of the site (“f” means the site is on the forward strand, whereas “r” means the site is on the reverse strand), the starting position of the site, and the actual sequence of the site.

Blk1      A      C      G      T  Con  rCon  Deg  rDeg
1      0.20  99.38   0.23   0.20    C     G    C     G
2     99.35   0.23   0.23   0.20    A     T    A     T
3      0.20  42.72  56.89   0.20    G     C    S     S
4      0.20  99.38   0.23   0.20    C     G    C     G
5      0.20   0.23   0.23  99.35    T     A    T     A
6      0.20   0.23  99.38   0.20    G     C    G     C
7      0.20  33.28   0.23  66.30    T     A    Y     R
8      0.20  56.89   0.23  42.69    C     G    Y     R
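To make the consensus columns concrete, the sketch below recomputes Con, rCon, Deg, and rDeg from the percentages in the matrix above, using the > 25% rule for the degenerate consensus. It is an illustration written for this page, not the program's own code. The function names are hypothetical, and the fallback to N when no base exceeds the threshold is an assumption.

```python
# IUPAC codes keyed by the sorted set of bases they stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "AG": "R", "CT": "Y", "CG": "S", "AT": "W", "GT": "K", "AC": "M",
    "CGT": "B", "AGT": "D", "ACT": "H", "ACG": "V", "ACGT": "N",
}
# Complement of each IUPAC symbol (e.g. R = A/G pairs with Y = T/C).
IUPAC_COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")
BASES = "ACGT"

# Percentages for A, C, G, T at positions 1-8, copied from the matrix above.
MATRIX = [
    (0.20, 99.38, 0.23, 0.20),
    (99.35, 0.23, 0.23, 0.20),
    (0.20, 42.72, 56.89, 0.20),
    (0.20, 99.38, 0.23, 0.20),
    (0.20, 0.23, 0.23, 99.35),
    (0.20, 0.23, 99.38, 0.20),
    (0.20, 33.28, 0.23, 66.30),
    (0.20, 56.89, 0.23, 42.69),
]


def consensus_base(row):
    """Most abundant base at one position."""
    return BASES[max(range(4), key=lambda i: row[i])]


def degenerate_base(row, threshold=25.0):
    """IUPAC symbol for all bases above the threshold (in percent)."""
    key = "".join(b for b, pct in zip(BASES, row) if pct > threshold)
    # ASSUMPTION: fall back to N if no base exceeds the threshold.
    return IUPAC.get(key, "N")


con = "".join(consensus_base(row) for row in MATRIX)
deg = "".join(degenerate_base(row) for row in MATRIX)
# The rCon and rDeg columns hold the complement at each position; reading
# them from the last position to the first gives the reverse complement.
rcon_column = con.translate(IUPAC_COMPLEMENT)
rdeg_column = deg.translate(IUPAC_COMPLEMENT)
print(con, rcon_column[::-1], deg, rdeg_column[::-1])
```

Reading the rCon column bottom to top reproduces the reverse complement shown in the motif header line.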

Sequences that contributed to this alignment are listed with: sequence name, sequence length, segment number for that sequence (e.g. a sequence named SeqName6 that contributed 2 segments to the motif would have two entries, one at f10 and the other at f63), starting alignment position (r53 means position 53 from the end of the sequence in the reverse direction; f47 means position 47 from the beginning of the sequence in the forward direction), and the sequence of the aligned segment.

> seq1 len 3197 site #1 f 3120

CAGCTGTC

> seq2 len 11081 site #1 r 10857

CACCTGTT

> seq3 len 973 site #1 f 130

CAGCTGTC
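The site listing above can be read into records with a few lines of code. The sketch below assumes that every entry is a header line of the form shown here followed by one line holding the site sequence. The function `parse_sites` and the record keys are hypothetical names chosen for illustration.

```python
import re
from collections import Counter

# ASSUMPTION: header layout inferred from the three example entries above.
SITE_HEADER = re.compile(
    r"^>\s*(\S+)\s+len\s+(\d+)\s+site\s+#(\d+)\s+([fr])\s+(\d+)$"
)


def parse_sites(text):
    """Return one dict per aligned site; blank lines are ignored."""
    sites = []
    pending = None  # header waiting for its sequence line
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = SITE_HEADER.match(line)
        if match:
            name, length, number, strand, start = match.groups()
            pending = {
                "name": name,
                "length": int(length),
                "site_number": int(number),
                "strand": strand,  # "f" = forward, "r" = reverse
                "start": int(start),
            }
        elif pending is not None:
            pending["site"] = line
            sites.append(pending)
            pending = None
    return sites


EXAMPLE = """
> seq1 len 3197 site #1 f 3120

CAGCTGTC

> seq2 len 11081 site #1 r 10857

CACCTGTT

> seq3 len 973 site #1 f 130

CAGCTGTC
"""

sites = parse_sites(EXAMPLE)
# Number of aligned segments contributed by each sequence.
per_sequence = Counter(site["name"] for site in sites)
print(sites, per_sequence)
```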

For any questions, please email iliu@smi.stanford.edu. The server is still in a developmental stage; we apologize for any inconvenience.

Thanks for using CompareProspector!