License: Creative Commons Attribution 4.0 International license (CC BY 4.0)
When quoting this document, please refer to the following
DOI: 10.4230/LIPIcs.ICALP.2021.11
URN: urn:nbn:de:0030-drops-140803
Go to the corresponding LIPIcs Volume Portal

Agarwal, Pankaj K. ; Hu, Xiao ; Sintos, Stavros ; Yang, Jun

Dynamic Enumeration of Similarity Joins

LIPIcs-ICALP-2021-11.pdf (0.9 MB)


This paper considers enumerating answers to similarity-join queries under dynamic updates: Given two sets of n points A,B in ℝ^d, a metric ϕ(⋅), and a distance threshold r > 0, report all pairs of points (a, b) ∈ A × B with ϕ(a,b) ≤ r. Our goal is to store A,B into a dynamic data structure that, whenever asked, can enumerate all result pairs with worst-case delay guarantee, i.e., the time between enumerating two consecutive pairs is bounded. Furthermore, the data structure can be efficiently updated when a point is inserted into or deleted from A or B.
We propose several efficient data structures for answering similarity-join queries in low dimension. For exact enumeration of similarity join, we present near-linear-size data structures for 𝓁₁, 𝓁_∞ metrics with log^{O(1)} n update time and delay. We show that such a data structure is not feasible for the 𝓁₂ metric for d ≥ 4. For approximate enumeration of similarity join, where the distance threshold is a soft constraint, we obtain a unified linear-size data structure for 𝓁_p metric, with log^{O(1)} n delay and update time. In high dimensions, we present an efficient data structure with worst-case delay-guarantee using locality sensitive hashing (LSH).

BibTeX - Entry

  author =	{Agarwal, Pankaj K. and Hu, Xiao and Sintos, Stavros and Yang, Jun},
  title =	{{Dynamic Enumeration of Similarity Joins}},
  booktitle =	{48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)},
  pages =	{11:1--11:19},
  series =	{Leibniz International Proceedings in Informatics (LIPIcs)},
  ISBN =	{978-3-95977-195-5},
  ISSN =	{1868-8969},
  year =	{2021},
  volume =	{198},
  editor =	{Bansal, Nikhil and Merelli, Emanuela and Worrell, James},
  publisher =	{Schloss Dagstuhl -- Leibniz-Zentrum f{\"u}r Informatik},
  address =	{Dagstuhl, Germany},
  URL =		{},
  URN =		{urn:nbn:de:0030-drops-140803},
  doi =		{10.4230/LIPIcs.ICALP.2021.11},
  annote =	{Keywords: dynamic enumeration, similarity joins, worst-case delay guarantee}

Keywords: dynamic enumeration, similarity joins, worst-case delay guarantee
Collection: 48th International Colloquium on Automata, Languages, and Programming (ICALP 2021)
Issue Date: 2021
Date of publication: 02.07.2021

DROPS-Home | Fulltext Search | Imprint | Privacy Published by LZI