On Coresets for Clustering in Small Dimensional Euclidean Spaces

作者: Lingxiao Huang , Ruiyuan Huang , Zengfeng Huang , Xuan Wu

DOI:

关键词:

摘要: We consider the problem of constructing small coresets for -Median in Euclidean spaces. Given a large set of data points , a coreset is a much smaller set , so that the -Median costs of any centers wrt and are close. Existing literature mainly focuses on the high-dimension case and there has been great success in obtaining dimension-independent bounds, whereas the case for small is largely unexplored. Considering many applications of Euclidean clustering algorithms are in small dimensions and the lack of systematic studies in the current literature, this paper investigates coresets for -Median in small dimensions. For small , a natural question is whether existing near-optimal dimension-independent bounds can be significantly improved. We provide affirmative answers to this question for a range of parameters. Moreover, new lower bound results are also proved, which are the highest for small . In particular, we completely settle the coreset size bound for -d -Median (up to log factors). Interestingly, our results imply a strong separation between -d -Median and -d -Median. As far as we know, this is the first such separation between and in any dimension.

参考文章(0)