A Deterministic Motif Finding Algorithm with Application to the Human Genome
Lawrence S. Hon and Ajay N. Jain
We present a novel algorithm, MaMF, for identifying transcription factor (TF) binding site motifs. The method is deterministic and depends on an indexing technique to optimize the search process. On common yeast data sets, MaMF performs competitively with other methods. We also present results on a challenging group of eight sets of human genes known to be responsive to a diverse group of TFs. In every case, MaMF finds the annotated motif among the top-scoring putative motifs, performing better than other motif finders. We analyze the remaining high-scoring motifs and show that many correspond to other TFs that are known to co-occur with the annotated TF motifs. The significant and frequent presence of co-occurring TF binding sites explains in part the difficulty of human motif finding. MaMF is a very fast algorithm, suitable for application to large numbers of interesting gene sets.
Downloadable files include: