作者: Vladimir Braverman , Robert Krauthgamer , Shaofeng H.-C. Jiang , Xuan Wu
DOI:
关键词:
摘要: We design coresets for Ordered k-Median, a generalization of classical clustering problems such as k-Median and k-Center, that offers more flexible data analysis, like easily combining multiple objectives (e.g., to increase fairness or Pareto optimization). Its objective function is defined via the Weighted Averaging (OWA) paradigm Yager (1988), where points are weighted according predefined weight vector, but in order their contribution (distance from centers). A powerful data-reduction technique, called coreset, summarize point set $X$ $\mathbb{R}^d$ into small (weighted) $X'$, every $k$ potential centers, value coreset $X'$ approximates within factor $1\pm \epsilon$. When there (weights), above standard might have limited usefulness, whereas \emph{simultaneous} which was introduced recently by Bachem Lucic Lattanzi (2018), approximation holds all weights (in addition Our main result construction simultaneous size $O_{\epsilon, d}(k^2 \log^2 |X|)$ k-Median. To validate efficacy our we ran experiments on real geographical set. find algorithm produces translates massive speedup computations, while maintaining high accuracy range weights.