Abstract
The texttopattern Hamming distances problem asks to compute the Hamming distances between a given pattern of length m and all lengthm substrings of a given text of length n ≥ m. We focus on the wellstudied kmismatch version of the problem, where a distance needs to be returned only if it does not exceed a threshold k. Moreover, we assume n ≤ 2m (in general, one can partition the text into overlapping blocks). In this work, we develop data structures for the dynamic version of the kmismatch problem supporting two operations: An update performs a singleletter substitution in the pattern or the text, whereas a query, given an index i, returns the Hamming distance between the pattern and the text substring starting at position i, or reports that the distance exceeds k.
First, we describe a simple data structure with 𝒪̃(1) update time and 𝒪̃(k) query time. Through considerably more sophisticated techniques, we show that 𝒪̃(k) update time and 𝒪̃(1) query time is also achievable. These two solutions likely provide an essentially optimal tradeoff for the dynamic kmismatch problem with m^{Ω(1)} ≤ k ≤ √m: we prove that, in that case, conditioned on the 3SUM conjecture, one cannot simultaneously achieve k^{1Ω(1)} time for all operations (updates and queries) after n^{𝒪(1)}time initialization. For k ≥ √m, the same lower bound excludes achieving m^{1/2Ω(1)} time per operation. This is known to be essentially tight for constantsized alphabets: already Clifford et al. (STACS 2018) achieved 𝒪̃(√m) time per operation in that case, but their solution for large alphabets costs 𝒪̃(m^{3/4}) time per operation. We improve and extend the latter result by developing a tradeoff algorithm that, given a parameter 1 ≤ x ≤ k, achieves update time 𝒪̃(m/k +√{mk/x}) and query time 𝒪̃(x). In particular, for k ≥ √m, an appropriate choice of x yields 𝒪̃(∛{mk}) time per operation, which is 𝒪̃(m^{2/3}) when only the trivial threshold k = m is provided.
BibTeX  Entry
@InProceedings{clifford_et_al:LIPIcs.CPM.2022.18,
author = {Clifford, Rapha\"{e}l and Gawrychowski, Pawe{\l} and Kociumaka, Tomasz and Martin, Daniel P. and Uzna\'{n}ski, Przemys{\l}aw},
title = {{The Dynamic kMismatch Problem}},
booktitle = {33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022)},
pages = {18:118:15},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {9783959772341},
ISSN = {18688969},
year = {2022},
volume = {223},
editor = {Bannai, Hideo and Holub, Jan},
publisher = {Schloss Dagstuhl  LeibnizZentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2022/16145},
URN = {urn:nbn:de:0030drops161454},
doi = {10.4230/LIPIcs.CPM.2022.18},
annote = {Keywords: Pattern matching, Hamming distance, dynamic algorithms}
}
Keywords: 

Pattern matching, Hamming distance, dynamic algorithms 
Collection: 

33rd Annual Symposium on Combinatorial Pattern Matching (CPM 2022) 
Issue Date: 

2022 
Date of publication: 

22.06.2022 