Gesellschaft für Informatik e.V.

Lecture Notes in Informatics

German Conference on Bioinformatics 2010, P-173, 61-70 (2010).

Gesellschaft für Informatik, Bonn

Copyright © Gesellschaft für Informatik, Bonn


Repeat-aware comparative genome assembly

Peter Husemann and Jens Stoye


Current high-throughput sequencing technologies produce gigabytes of data even when prokaryotic genomes are processed. In the subsequent assembly phase, the generated overlapping reads are merged, ideally into one contiguous sequence. Often, however, the assembly results in a set of contigs that must be stitched together with additional lab work. One reason why the assembly produces several distinct contigs is the presence of repetitive elements in the newly sequenced genome. While knowing the order and orientation of a set of non-repetitive contigs helps to close the gaps between them, special care has to be taken with repetitive contigs. Here we propose an algorithm that orders a set of contigs with respect to a related reference genome while treating the repetitive contigs appropriately.
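To make the idea of reference-based ordering concrete, here is a minimal sketch in Python. It is not the algorithm proposed in the paper, whose details the abstract does not give. It only illustrates one simple way to lay out contigs along a reference while letting repetitive contigs appear more than once. The names Match and layout_contigs, the match scores, and the repeat_ratio threshold are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    """A hypothetical contig-to-reference match (not a real tool's output format)."""
    contig: str      # contig identifier
    ref_start: int   # start position of the match on the reference genome
    forward: bool    # True if the contig matches the forward strand
    score: float     # match quality; higher is better


def layout_contigs(matches, repeat_ratio=0.9):
    """Order contigs by their match positions on the reference.

    Assumption for this sketch: a contig counts as repetitive if it has more
    than one match scoring at least repeat_ratio times its best score. Such a
    contig is placed at every strong match; a non-repetitive contig is placed
    once, at its single strong match. Weaker matches are ignored.
    """
    by_contig = {}
    for m in matches:
        by_contig.setdefault(m.contig, []).append(m)

    placements = []
    for contig, ms in by_contig.items():
        best = max(m.score for m in ms)
        strong = [m for m in ms if m.score >= repeat_ratio * best]
        is_repeat = len(strong) > 1
        for m in strong:
            orientation = "+" if m.forward else "-"
            placements.append((m.ref_start, contig, orientation, is_repeat))

    # Sort all placements by their position on the reference genome.
    placements.sort(key=lambda p: p[0])
    return [(contig, orient, rep) for _, contig, orient, rep in placements]


# Made-up example data: r1 has two near-equal matches, c2 has one weak extra match.
example = [
    Match("c1", 0, True, 100.0),
    Match("c2", 5000, False, 80.0),
    Match("r1", 3000, True, 50.0),
    Match("r1", 9000, True, 49.0),
    Match("c3", 10000, True, 120.0),
    Match("c2", 20000, True, 10.0),
]
layout = layout_contigs(example)
for contig, orient, is_repeat in layout:
    print(contig, orient, "repeat" if is_repeat else "unique")
```

In this toy layout the repetitive contig r1 appears twice, once between c1 and c2 and once between c2 and c3, which mirrors the observation above that repetitive contigs cannot be treated like contigs with a single position. A real method would also have to cope with rearrangements between the two genomes and with contigs that match the reference only partially.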


ISBN 978-3-88579-267-3
