Abstract
Morphisms are widely studied combinatorial objects that can be used for generating infinite families of words. In the context of Information theory, injective morphisms are called (variable length) codes. In Data compression, the morphisms, combined with parsing techniques, have been recently used to define new mechanisms to generate repetitive words. Here, we show that the repetitiveness induced by applying a morphism to a word can be captured by a compression scheme based on the BurrowsWheeler Transform (BWT). In fact, we prove that, differently from other compressionbased repetitiveness measures, the measure r_bwt (which counts the number of equalletter runs produced by applying BWT to a word) strongly depends on the applied morphism. More in detail, we characterize the binary morphisms that preserve the value of r_bwt(w), when applied to any binary word w containing both letters. They are precisely the Sturmian morphisms, which are wellknown objects in Combinatorics on words. Moreover, we prove that it is always possible to find a binary morphism that, when applied to any binary word containing both letters, increases the number of BWTequal letter runs by a given (even) number. In addition, we derive a method for constructing arbitrarily large families of binary words on which BWT produces a given (even) number of new equalletter runs. Such results are obtained by using a new class of morphisms that we call ThueMorselike. Finally, we show that there exist binary morphisms μ for which it is possible to find words w such that the difference r_bwt(μ(w))r_bwt(w) is arbitrarily large.
BibTeX  Entry
@InProceedings{fici_et_al:LIPIcs.CPM.2023.10,
author = {Fici, Gabriele and Romana, Giuseppe and Sciortino, Marinella and Urbina, Cristian},
title = {{On the Impact of Morphisms on BWTRuns}},
booktitle = {34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023)},
pages = {10:110:18},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {9783959772761},
ISSN = {18688969},
year = {2023},
volume = {259},
editor = {Bulteau, Laurent and Lipt\'{a}k, Zsuzsanna},
publisher = {Schloss Dagstuhl  LeibnizZentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2023/17964},
URN = {urn:nbn:de:0030drops179641},
doi = {10.4230/LIPIcs.CPM.2023.10},
annote = {Keywords: Morphism, BurrowsWheeler transform, Sturmian word, Sturmian morphism, ThueMorse morphism, Repetitiveness measure}
}
Keywords: 

Morphism, BurrowsWheeler transform, Sturmian word, Sturmian morphism, ThueMorse morphism, Repetitiveness measure 
Collection: 

34th Annual Symposium on Combinatorial Pattern Matching (CPM 2023) 
Issue Date: 

2023 
Date of publication: 

21.06.2023 