Gesellschaft für Informatik e.V.

Lecture Notes in Informatics

German Conference on Bioinformatics 2004, GCB 2004, October 4-6, 2004, Bielefeld, Germany P-53, 13-24 (2004).

GI, Gesellschaft für Informatik, Bonn


Robert Giegerich, Jens Stoye (eds.)

Copyright © GI, Gesellschaft für Informatik, Bonn


Weighted sequencing from compomers: DNA de-novo sequencing from mass spectrometry data in the presence of false negative peaks

Sebastian Böcker


Efficient sequencing of long DNA molecules remains one of the main endeavors in today's life sciences. Most de-novo sequencing of DNA is still performed using electrophoresis-based Sanger sequencing, introduced in 1977, in spite of certain restrictions of this method. Recently, we proposed a new method for DNA sequencing using base-specific cleavage and mass spectrometry that appears to be a promising alternative to classical DNA sequencing approaches; among its benefits is the extremely fast data acquisition of mass spectrometry. This method leads to the combinatorial problem of Sequencing From Compomers (SFC) and to the definition of sequencing graphs. Simulations indicate that it may allow for de-novo sequencing of DNA molecules of 200+ nt. An open problem in the context of SFC is that it does not take into account false negative peaks (missing peaks), which are common in real-world mass spectra. Here, we present a natural generalization of SFC, the Weighted Sequencing from Compomers (WSC) Problem, that allows us to cope with false negative peaks. We also show that the family of graphs introduced to solve SFC can be generalized to capture the new aspects of WSC. Finally, we present a branch-and-bound algorithm to find all sequences that agree with the sample mass spectra with the exception of some missing peaks.
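As a rough illustration of what "agreeing with the spectra with the exception of some missing peaks" can mean, the following Python sketch works under strong simplifying assumptions that are not taken from the paper: a compomer is treated as the base composition of a fragment, cleavage is complete (the sequence is cut at every occurrence of one base, and that base is discarded), and peaks are represented directly as compomers rather than masses. The function names `compomers` and `missing_peaks` are hypothetical, and this is not the branch-and-bound algorithm of the paper.

```python
from collections import Counter


def compomers(seq, cut_base):
    """Return the set of compomers (base compositions) of the fragments
    obtained by cutting seq at every occurrence of cut_base.
    Assumption: complete cleavage, with the cut base itself discarded."""
    fragments = [f for f in seq.split(cut_base) if f]
    # A compomer is stored as a sorted tuple of (base, count) pairs
    # so that it is hashable and order-independent.
    return {tuple(sorted(Counter(f).items())) for f in fragments}


def missing_peaks(candidate, observed, cut_base):
    """Count the compomers predicted for candidate that are absent
    from the observed set, i.e. the false negative peaks it implies."""
    return len(compomers(candidate, cut_base) - observed)


sample = "ACGTTAGCAT"
full = compomers(sample, "T")  # compomers of fragments "ACG" and "AGCA"

# Simulate one false negative peak by dropping the compomer of "ACG".
dropped = tuple(sorted(Counter("ACG").items()))
observed = full - {dropped}

# A candidate is accepted if it implies at most max_missing missing peaks.
max_missing = 1
accepted = missing_peaks(sample, observed, "T") <= max_missing
```

With a tolerance of zero missing peaks the true sequence would be rejected against the incomplete spectrum; allowing a small number of missing peaks keeps it among the candidates, at the cost of admitting more candidate sequences.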


GI, Gesellschaft für Informatik, Bonn
ISBN 3-88579-382-2
