License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.WABI.2022.23
URN: urn:nbn:de:0030-drops-170577
Go to the corresponding LIPIcs Volume Portal

Guerrini, Veronica ; Conte, Alessio ; Grossi, Roberto ; Liti, Gianni ; Rosone, Giovanna ; Tattini, Lorenzo

phyBWT: Alignment-Free Phylogeny via eBWT Positional Clustering

LIPIcs-WABI-2022-23.pdf (2 MB)


Molecular phylogenetics is a fundamental branch of biology. It studies the evolutionary relationships among the individuals of a population through their biological sequences, and may provide insights about the origin and the evolution of viral diseases, or highlight complex evolutionary trajectories.
In this paper we develop a method called phyBWT, describing how to use the extended Burrows-Wheeler Transform (eBWT) for a collection of DNA sequences to directly reconstruct phylogeny, bypassing the alignment against a reference genome or de novo assembly. Our phyBWT hinges on the combinatorial properties of the eBWT positional clustering framework. We employ eBWT to detect relevant blocks of the longest shared substrings of varying length (unlike the k-mer-based approaches that need to fix the length k a priori), and build a suitable decomposition leading to a phylogenetic tree, step by step. As a result, phyBWT is a new alignment-, assembly-, and reference-free method that builds a partition tree without relying on the pairwise comparison of sequences, thus avoiding to use a distance matrix to infer phylogeny.
The preliminary experimental results on sequencing data show that our method can handle datasets of different types (short reads, contigs, or entire genomes), producing trees of quality comparable to that found in the benchmark phylogeny.

BibTeX - Entry

  author =	{Guerrini, Veronica and Conte, Alessio and Grossi, Roberto and Liti, Gianni and Rosone, Giovanna and Tattini, Lorenzo},
  title =	{{phyBWT: Alignment-Free Phylogeny via eBWT Positional Clustering}},
  booktitle =	{22nd International Workshop on Algorithms in Bioinformatics (WABI 2022)},
  pages =	{23:1--23:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-243-3},
  ISSN =	{1868-8969},
  year =	{2022},
  volume =	{242},
  editor =	{Boucher, Christina and Rahmann, Sven},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{},
  URN =		{urn:nbn:de:0030-drops-170577},
  doi =		{10.4230/LIPIcs.WABI.2022.23},
  annote =	{Keywords: Phylogeny, partition tree, BWT, positional cluster, alignment-free, reference-free, assembly-free}

Keywords: Phylogeny, partition tree, BWT, positional cluster, alignment-free, reference-free, assembly-free
Collection: 22nd International Workshop on Algorithms in Bioinformatics (WABI 2022)
Issue Date: 2022
Date of publication: 26.08.2022
Supplementary Material: Software (Source Code):

DROPS-Home | Fulltext Search | Imprint | Privacy Published by LZI