Frank NIELSEN | Centroids, statistical centroids, and center-based clustering

Centroids and statistical centroids (and other centers: medians, circumcenters, etc.)

In general, let us define a centroid as the minimizer of the average distance of a center object to a given collection of objects. For example, the Euclidean centroid minimizes the average squared Euclidean distance to a point set (that is, it minimizes the variance of the set with respect to a center), and is well known to be the center of mass, the arithmetic mean of the points. When the collection of objects is a set of distributions and the distance is a statistical distance, we get a statistical centroid (a probability distribution). In statistics, distances can be metric (e.g., the total variation or Wasserstein distances) or non-metric and thrice differentiable (called divergences, like the Kullback-Leibler divergence or the Bhattacharyya divergence, and more generally f-divergences). Computing centroids is essential for center-based clustering à la k-means (or variational k-means). Here is a collection of centroids that we have studied:

Bregman centroids (closed-forms) and symmetrized Bregman centroids (usually not in closed-form). When dealing with exponential families, Kullback-Leibler centroids amount to Bregman centroids.
Sided and Symmetrized Bregman Centroids (IEEE TIT, 2009)
Jeffreys centroids, the symmetrized Kullback-Leibler centroids (closed-form for positive non-normalized distributions and guaranteed approximations for probability distributions).
Jeffreys Centroids: A Closed-Form Expression for Positive Histograms and a Guaranteed Tight Approximation for Frequency Histograms (IEEE SPL, 2013)
Dual pair of centroids for mixed divergences (closed-form for mixed alpha-divergences).
On Clustering Histograms with k-Means by Using Mixed alpha-Divergences (Entropy 2014)
Jensen centroids (including Jensen-Shannon centroids), also called Burbea-Rao centroids, and skewed Jensen centroids (with Bregman centroids in limit cases).
- The Burbea-Rao and Bhattacharyya Centroids (IEEE TIT, 2011)
- Jensen divergence based SPD matrix means and applications (ICPR 2012)
Robust total Bregman centroids.
Shape retrieval using hierarchical total Bregman soft clustering (IEEE PAMI, 2012)
Robust (and non-robust) total Jensen centroids.
Total Jensen divergences: Definition, properties and clustering (ICASSP 2015)
Conformal centroids (with respect to conformal divergences).
On Conformal Divergences and Their Population Minimizers (IEEE TIT 2016)
Symmetrized skewed centroids smoothly generalizing Jensen-Shannon centroids and Jeffreys centroids.
A family of statistical symmetric divergences based on Jensen's inequality (arxiv 2010)
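
As an illustration of the closed-form cases listed above, here is a minimal Python sketch for positive (non-normalized) histograms. It assumes the extended Kullback-Leibler divergence, for which the two sided centroids are the arithmetic and geometric means. It writes the Jeffreys centroid coordinate-wise as c = a / W(e a / g), where a is the arithmetic mean, g the geometric mean, and W the principal Lambert W function; this expression is obtained here by setting the derivative of the average Jeffreys divergence to zero. The function names and the Newton solver for W are illustrative choices, not taken from the cited papers, which remain the authoritative reference.

```python
import math

def lambert_w0(z, tol=1e-14, max_iter=100):
    """Principal Lambert W for z >= 0: solves w * exp(w) = z by Newton's method."""
    if z < 0.0:
        raise ValueError("this sketch only handles z >= 0")
    w = math.log1p(z)  # starting guess at or to the right of the root
    for _ in range(max_iter):
        ew = math.exp(w)
        step = (w * ew - z) / (ew * (w + 1.0))
        w -= step
        if abs(step) <= tol * (1.0 + abs(w)):
            break
    return w

def arithmetic_mean(hists):
    """Coordinate-wise arithmetic mean: minimizes the average of KL(h_i : c)."""
    n = len(hists)
    return [sum(col) / n for col in zip(*hists)]

def geometric_mean(hists):
    """Coordinate-wise geometric mean: minimizes the average of KL(c : h_i)."""
    n = len(hists)
    return [math.exp(sum(math.log(x) for x in col) / n) for col in zip(*hists)]

def jeffreys_positive_centroid(hists):
    """Minimizer of the average Jeffreys divergence over positive histograms."""
    a = arithmetic_mean(hists)
    g = geometric_mean(hists)
    return [ai / lambert_w0(math.e * ai / gi) for ai, gi in zip(a, g)]

def extended_kl(p, q):
    """Extended Kullback-Leibler divergence between positive arrays."""
    return sum(pi * math.log(pi / qi) - pi + qi for pi, qi in zip(p, q))

def jeffreys(p, q):
    """Jeffreys divergence: KL(p : q) + KL(q : p)."""
    return sum((pi - qi) * math.log(pi / qi) for pi, qi in zip(p, q))

def average_divergence(div, hists, c):
    """Average of div(h, c) over the input histograms h."""
    return sum(div(h, c) for h in hists) / len(hists)

# Hypothetical toy data: three positive histograms with three bins each.
example = [[1.0, 2.0, 4.0], [3.0, 2.0, 1.0], [2.0, 8.0, 0.5]]
print("arithmetic:", arithmetic_mean(example))
print("geometric: ", geometric_mean(example))
print("Jeffreys:  ", jeffreys_positive_centroid(example))
```

In this sketch each coordinate of the Jeffreys centroid lies between the geometric and the arithmetic mean, which gives a quick sanity check on any implementation. For frequency histograms (normalized to sum to one) the expression above no longer applies as is, consistent with the "guaranteed approximations" mentioned in the list.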

When we define new statistical distances, we investigate how to compute them efficiently (and can then consider computing the corresponding centroids):

Divergences based on Shannon, Rényi, Tsallis, and Sharma-Mittal entropies have closed-form formulas for distributions belonging to the same exponential family.
- Entropies and cross-entropies of exponential families (ICIP 2010)
- On Rényi and Tsallis entropies and divergences for exponential families (arxiv 2011)
- A closed-form expression for the Sharma-Mittal entropy of exponential families (Journal of Physics A: Mathematical and Theoretical, 2012)
Chernoff divergence (called Chernoff information) can be exactly characterized geometrically, with a closed form for uni-order exponential families.
- An Information-Geometric Characterization of Chernoff Information (IEEE SPL 2013)
- Chernoff information of exponential families (arxiv 2011)
The Kullback-Leibler divergence of statistical mixtures does not admit a closed-form expression, but we get deterministic lower and upper bounds using log-sum-exp inequalities.
- Guaranteed Bounds on Information-Theoretic Measures of Univariate Mixtures Using Piecewise Log-Sum-Exp Inequalities
- Guaranteed bounds on the Kullback-Leibler divergence of univariate mixtures using piecewise log-sum-exp inequalities (arxiv, IEEE SPL 2016)
Model hyperbolic centroids (Fisher-Rao centroid of location-scale families).
Model centroids for the simplification of Kernel Density estimators (ICASSP 2012)
Tailored divergences with closed-form formulas for statistical mixtures: Cauchy-Schwarz divergence, Jensen-Rényi divergence.
Closed-Form Information-Theoretic Divergences for Statistical Mixtures (ICPR 2012)
Approximations of arbitrary f-divergences for exponential families with natural affine parameter space.
On the chi square and higher-order chi distances for approximating f-divergences (IEEE SPL, 2014)
Non-flat alpha-divergence probability centroids and flat alpha-divergence positive measure centroids.
Non-flat clustering with alpha-divergences (ICASSP 2011)
Finsler centroids (and medians).
Medians and means in Finsler geometry (LMS Journal of Computation and Mathematics 2012)
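
For the exponential-family items in the list above, the closed-form claim can be checked on the simplest case. The sketch below assumes univariate Gaussians and uses the standard identity KL(p1 : p2) = B_F(θ2 : θ1), where F is the log-normalizer, θ the natural parameters, and B_F the Bregman divergence induced by F. The function names are illustrative and are not taken from the cited papers.

```python
import math

def natural_params(mu, sigma):
    """Natural parameters (theta1, theta2) of the Gaussian N(mu, sigma^2)."""
    var = sigma * sigma
    return (mu / var, -0.5 / var)

def log_normalizer(theta):
    """F(theta) = -theta1^2 / (4 theta2) + (1/2) log(-pi / theta2)."""
    t1, t2 = theta
    return -t1 * t1 / (4.0 * t2) + 0.5 * math.log(-math.pi / t2)

def grad_log_normalizer(theta):
    """Gradient of F, equal to the expectation parameters (E[x], E[x^2])."""
    t1, t2 = theta
    return (-t1 / (2.0 * t2), t1 * t1 / (4.0 * t2 * t2) - 1.0 / (2.0 * t2))

def bregman(F, grad_F, x, y):
    """Bregman divergence B_F(x : y) = F(x) - F(y) - <x - y, grad F(y)>."""
    g = grad_F(y)
    inner = sum((xi - yi) * gi for xi, yi, gi in zip(x, y, g))
    return F(x) - F(y) - inner

def kl_gauss_bregman(mu1, s1, mu2, s2):
    """KL(N(mu1, s1^2) : N(mu2, s2^2)) with swapped natural parameters."""
    theta1 = natural_params(mu1, s1)
    theta2 = natural_params(mu2, s2)
    return bregman(log_normalizer, grad_log_normalizer, theta2, theta1)

def kl_gauss_closed_form(mu1, s1, mu2, s2):
    """Textbook closed form of the same divergence, used as a cross-check."""
    return math.log(s2 / s1) + (s1 * s1 + (mu1 - mu2) ** 2) / (2.0 * s2 * s2) - 0.5

# Both routes should agree up to floating-point rounding.
print(kl_gauss_bregman(0.0, 1.0, 1.0, 2.0))
print(kl_gauss_closed_form(0.0, 1.0, 1.0, 2.0))
```

Because the divergence reduces to a Bregman divergence on the natural parameters, the closed-form sided Bregman centroids carry over to Kullback-Leibler centroids within the same exponential family.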

So far we have considered sets of independent random variables. Let us now consider dependent random variables like correlated stochastic processes (random walks, etc) estimated from time-series datasets.

Optimal copula transport: Copulas can be used to either analyze the intra-dependence of a multivariate time series or the inter-dependence between two time series.
Optimal copula transport for clustering multivariate time series (ICASSP 2016)
Fisher-Rao geodesic distance between copulas.
Optimal Transport vs. Fisher-Rao distance between Copulas for Clustering Multivariate Time Series (IEEE SSP 2016)
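
Since a copula describes dependence separately from the margins, a common practical first step on data is the rank transform, which maps each series to normalized ranks. The sketch below builds a generic bivariate empirical copula this way. It is an illustration under that assumption, with hypothetical function names, and is not presented as the pipeline of the papers above.

```python
import math

def rank_transform(xs):
    """Map observations to normalized ranks in (0, 1]; ties follow input order."""
    n = len(xs)
    order = sorted(range(n), key=lambda i: xs[i])
    u = [0.0] * n
    for rank, i in enumerate(order, start=1):
        u[i] = rank / n
    return u

def empirical_copula(xs, ys):
    """Return C(u, v): fraction of samples whose normalized ranks are <= (u, v)."""
    us, vs = rank_transform(xs), rank_transform(ys)
    n = len(xs)

    def C(u, v):
        return sum(1 for a, b in zip(us, vs) if a <= u and b <= v) / n

    return C

# Hypothetical toy series: ys is an increasing function of xs, so the
# dependence is perfect even though the margins differ.
xs = [0.3, 1.2, -0.5, 2.0]
ys = [math.exp(x) for x in xs]
C = empirical_copula(xs, ys)
print(C(0.5, 0.75))
```

The ranks are unchanged by any strictly increasing transformation of a series, which is why the copula isolates the dependence structure from the marginal distributions.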

Now, instead of minimizing the average divergence (or squared metric distance), we may consider minimizing the maximum distance of the representative center to the input set: this is the definition of the minimax center, 1-center, or circumcenter of the smallest enclosing ball.
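
To make the minimax definition concrete in the Euclidean case, here is a minimal sketch of the Bădoiu-Clarkson iteration, a simple known scheme that repeatedly moves the current center a shrinking step toward the farthest point. It is given as an illustration only; the function names and the iteration count are assumptions, and the papers listed below should be consulted for the Bregman, Riemannian, and hyperbolic settings.

```python
import math

def farthest_point(points, c):
    """Input point at maximum Euclidean distance from the current center c."""
    return max(points, key=lambda p: math.dist(p, c))

def radius(points, c):
    """Radius of the smallest ball centered at c that encloses all points."""
    return max(math.dist(p, c) for p in points)

def approx_one_center(points, iterations=2000):
    """Approximate Euclidean 1-center by the Badoiu-Clarkson update rule."""
    c = list(points[0])  # start from an arbitrary input point
    for t in range(1, iterations + 1):
        f = farthest_point(points, c)
        # Move toward the farthest point with step size 1 / (t + 1).
        c = [ci + (fi - ci) / (t + 1) for ci, fi in zip(c, f)]
    return c

# Hypothetical toy data: the exact circumcenter is (1, 0) with radius 1.
pts = [(0.0, 0.0), (2.0, 0.0), (1.0, 0.5)]
center = approx_one_center(pts)
print(center, radius(pts, center))
```

After t iterations the center is within r*/sqrt(t) of the optimal one, where r* is the optimal radius, so the enclosing radius is at most (1 + 1/sqrt(t)) r*. The update needs only a farthest-point query and a move along the segment joining two points.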

Euclidean circumcenter.
Bregman circumcenter.
- On the smallest enclosing information disk (IPL 2008)
- Fitting the Smallest Enclosing Bregman Ball (ECML 2005)
Riemannian minimax center.
On approximating the Riemannian 1-center (DCG 2013)
Hyperbolic circumcenter.
- Approximating Covering and Minimum Enclosing Balls in Hyperbolic Geometry
- Hyperbolic Voronoi Diagrams Made Easy (ICCSA 2010)

Related works:

Wasserstein centroids.
- Fast Computation of Wasserstein Barycenters, Marco Cuturi, Arnaud Doucet, 2013 (ICML 2014).
- Tsallis Regularized Optimal Transport and Ecological Inference (arxiv 2016)
Projective divergences and projective centroids.
Patch matching with polynomial exponential families and projective divergences (SISAP 2016)

Last updated in November 2016 by Frank Nielsen.