Abstract
We consider the problem of estimating the graph size, where one is given only local access to the graph. We formally define a query model in which one starts with a seed node and is allowed to make queries about neighbours of nodes that have already been seen. In the case of undirected graphs, an estimator of Katzir et al. (2014) based on a sample from the stationary distribution pi uses O(1/pi_2 + d_avg) queries; we prove that this is tight. In addition, we establish this as a lower bound even when the algorithm is allowed to crawl the graph arbitrarily; the results of Katzir et al. give an upper bound that is worse by a multiplicative factor t_mix(1/n^4).
The picture becomes significantly different in the case of directed graphs. We show that without strong assumptions on the graph structure, the number of nodes cannot be predicted to within a constant multiplicative factor without using a number of queries that are at least linear in the number of nodes; in particular, rapid mixing and small diameter, properties that most realworld networks exhibit, do not suffice. The question of interest is whether any algorithm can beat breadthfirst search. We introduce a new parameter, generalising the wellstudied conductance, such that if a suitable bound on it exists and is known to the algorithm, the number of queries required is sublinear in the number of edges; we show that this is tight.
BibTeX  Entry
@InProceedings{kanade_et_al:LIPIcs:2017:7976,
author = {Varun Kanade and Frederik MallmannTrenn and Victor Verdugo},
title = {{How Large Is Your Graphl}},
booktitle = {31st International Symposium on Distributed Computing (DISC 2017)},
pages = {34:134:16},
series = {Leibniz International Proceedings in Informatics (LIPIcs)},
ISBN = {9783959770538},
ISSN = {18688969},
year = {2017},
volume = {91},
editor = {Andr{\'e}a W. Richa},
publisher = {Schloss DagstuhlLeibnizZentrum fuer Informatik},
address = {Dagstuhl, Germany},
URL = {http://drops.dagstuhl.de/opus/volltexte/2017/7976},
URN = {urn:nbn:de:0030drops79767},
doi = {10.4230/LIPIcs.DISC.2017.34},
annote = {Keywords: Estimation, Random Walks, Social Networks}
}
Keywords: 

Estimation, Random Walks, Social Networks 
Collection: 

31st International Symposium on Distributed Computing (DISC 2017) 
Issue Date: 

2017 
Date of publication: 

12.10.2017 