Gesellschaft für Informatik e.V.

Lecture Notes in Informatics

German Conference on Bioinformatics P-136, 96-105 (2008).

Gesellschaft für Informatik, Bonn


Andreas Beyer (ed.), Michael Schroeder (ed.)

Copyright © Gesellschaft für Informatik, Bonn


Statistical detection of cooperative transcription factors with similarity adjustment

Utz J. Pape, Holger Klein and Martin Vingron


Statistical assessment of cis-regulatory modules (CRMs) is a crucial task in computational biology. Usually, one concludes from exceptional co-occurrences of DNA motifs that the corresponding transcription factors are cooperative. However, similar DNA motifs tend to co-occur in random sequences because overlapping occurrences have a high probability. Therefore, it is important to account for the similarity of DNA motifs in the statistical assessment. Based on previous work, we propose to adjust the window size for co-occurrence detection. Using the derived approximation, one obtains different window sizes for different sets of DNA motifs depending on their similarities. This ensures that the probabilities of co-occurrence in random sequences are equal across motif sets. Applying the approach to selected similar and dissimilar DNA motifs of human transcription factors shows the necessity of the adjustment and confirms the accuracy of the approximation. Our previously published statistics can only deal with non-overlapping windows. Therefore, we extend the approach to overlapping windows and derive Chen-Stein error bounds for the approximation. Comparing the error bounds for similar and dissimilar DNA motifs shows that the approximation for similar DNA motifs yields large bounds. Hence, one has to be careful when using overlapping windows. Based on the error bounds, one can pre-compute the approximation errors and select an appropriate overlap scheme before running the analysis. Software and source code are available at
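
The effect described in the abstract can be illustrated with a small simulation. The sketch below is not the authors' method: it replaces position weight matrices with exact words, assumes a uniform i.i.d. background, and uses Monte Carlo estimation instead of the derived approximation. The motif words, the function names `cooccurrence_rate` and `smallest_window_reaching`, and all parameter values are hypothetical choices made for illustration. It estimates how often two motifs co-occur in random windows, and then searches for the window size at which a dissimilar pair reaches the same co-occurrence rate as a similar pair.

```python
import random


def cooccurrence_rate(motif_a, motif_b, window, n_windows, rng):
    """Estimate the fraction of random windows that contain both motifs.

    Assumption: motifs are exact words and the background is uniform i.i.d.
    over A, C, G, T. The paper works with DNA motifs, not exact words.
    """
    hits = 0
    for _ in range(n_windows):
        seq = "".join(rng.choices("ACGT", k=window))
        if motif_a in seq and motif_b in seq:
            hits += 1
    return hits / n_windows


def smallest_window_reaching(target, motif_a, motif_b, windows, n_windows, rng):
    """Return the first window size whose estimated rate is >= target, else None."""
    for w in windows:
        if cooccurrence_rate(motif_a, motif_b, w, n_windows, rng) >= target:
            return w
    return None


rng = random.Random(0)  # fixed seed so the run is repeatable

# "ACGT" and "CGTA" can overlap in the 5-letter word "ACGTA" (similar pair).
# "ACGT" and "GGCC" share no suffix/prefix, so they cannot overlap.
similar = ("ACGT", "CGTA")
dissimilar = ("ACGT", "GGCC")
n = 20000

rate_similar = cooccurrence_rate(*similar, window=20, n_windows=n, rng=rng)
rate_dissimilar = cooccurrence_rate(*dissimilar, window=20, n_windows=n, rng=rng)

# Enlarge the window for the dissimilar pair until its co-occurrence rate
# matches the rate of the similar pair at the reference window of 20.
adjusted_window = smallest_window_reaching(
    rate_similar, *dissimilar, windows=range(20, 81, 5), n_windows=n, rng=rng
)

print(f"similar pair, window 20:    {rate_similar:.4f}")
print(f"dissimilar pair, window 20: {rate_dissimilar:.4f}")
print(f"dissimilar pair window matching the similar rate: {adjusted_window}")
```

With equal window sizes the similar pair should co-occur noticeably more often by chance alone, which is why a single window size for all motif sets would bias the assessment. The grid search over window sizes is only a stand-in for the analytical adjustment the abstract refers to.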


ISBN 978-3-88579-226-0

Last changed 04.10.2013 18:19:02