Abstract
Designing coresets  smallspace sketches of the data preserving cost of the solutions within (1± ε)approximate factor  is an important research direction in the study of centerbased kclustering problems, such as kmeans or kmedian. Feldman and Langberg [STOC'11] have shown that for kclustering of n points in general metrics, it is possible to obtain coresets whose size depends logarithmically in n. Moreover, such a dependency in n is inevitable in general metrics. A significant amount of recent work in the area is devoted to obtaining coresests whose sizes are independent of n for special metrics, like ddimensional Euclidean space [Huang, Vishnoi, STOC'20], doubling metrics [Huang, Jiang, Li, Wu, FOCS'18], metrics of graphs of bounded treewidth [Baker, Braverman, Huang, Jiang, Krauthgamer, Wu, ICML’20], or graphs excluding a fixed minor [Braverman, Jiang, Krauthgamer, Wu, SODA’21].
In this paper, we provide the first constructions of coresets whose size does not depend on n for kclustering in the metrics induced by geometric intersection graphs. For example, we obtain (k log²k)/ε^𝒪(1) size coresets for kclustering in Euclideanweighted unitdisk graphs (UDGs) and unitsquare graphs (USGs). These constructions follow from a general theorem that identifies two canonical properties of a graph metric sufficient for obtaining coresets whose size is independent of n. The proof of our theorem builds on the recent work of CohenAddad, Saulpic, and Schwiegelshohn [STOC '21], which ensures smallsized coresets conditioned on the existence of an interesting set of centers, called centroid set. The main technical contribution of our work is the proof of the existence of such a smallsized centroid set for graphs that satisfy the two canonical properties. Loosely speaking, the metrics of geometric intersection graphs are "similar" to the Euclidean metrics for points that are close, and to the shortest path metrics of planar graphs for points that are far apart. The main technical challenge in constructing centroid sets of small sizes is in combining these two very different metrics.
The new coreset construction helps to design the first (1+ε)approximation for centerbased clustering problems in UDGs and USGs, that is fixedparameter tractable in k and ε (FPTAS).
BibTeX  Entry
@InProceedings{bandyapadhyay_et_al:LIPIcs.SoCG.2023.10,
author = {Bandyapadhyay, Sayan and Fomin, Fedor V. and Inamdar, Tanmay},
title = {{Coresets for Clustering in Geometric Intersection Graphs}},
booktitle = {39th International Symposium on Computational Geometry (SoCG 2023)},
pages = {10:110:16},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {9783959772730},
ISSN = {18688969},
year = {2023},
volume = {258},
editor = {Chambers, Erin W. and Gudmundsson, Joachim},
publisher = {Schloss Dagstuhl  LeibnizZentrum f{\"u}r Informatik},
address = {Dagstuhl, Germany},
URL = {https://drops.dagstuhl.de/opus/volltexte/2023/17860},
URN = {urn:nbn:de:0030drops178605},
doi = {10.4230/LIPIcs.SoCG.2023.10},
annote = {Keywords: kmedian, kmeans, clustering, coresets, geometric graphs}
}
Keywords: 

kmedian, kmeans, clustering, coresets, geometric graphs 
Collection: 

39th International Symposium on Computational Geometry (SoCG 2023) 
Issue Date: 

2023 
Date of publication: 

09.06.2023 