Randomised Wasserstein Barycenter Computation: Resampling with Statistical Guarantees


Heinemann F, Munk A, Zemel Y


SIAM Journal on Mathematics of Data Science


SIAM Journal on Mathematics of Data Science 2022 4:1, 229-259.


We propose a hybrid resampling method to approximate finitely supported Wasserstein barycenters on large-scale datasets, which can be combined with any exact solver. Nonasymptotic bounds on the expected error of the objective value as well as the barycenters themselves allow one to calibrate computational cost and statistical accuracy. The rate of these upper bounds is shown to be optimal and independent of the underlying dimension, which appears only in the constants. Using a simple modification of the subgradient descent algorithm of Cuturi and Doucet, we showcase the applicability of our method on myriad simulated datasets, as well as a real-data example from cell microscopy, which are out of reach for state-of-the-art algorithms for computing Wasserstein barycenters.
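As a rough illustration of the resampling idea, the sketch below replaces each input measure by an empirical measure built from a fixed number of i.i.d. draws and then passes these smaller measures to an exact solver. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names (`resample`, `barycenter_1d_equal_atoms`, `resampled_barycenter`) and the parameter `size` are hypothetical, and the exact solver is swapped for one-dimensional quantile averaging, which is exact for the 2-Wasserstein barycenter on the real line when every measure has the same number of equally weighted atoms. The modified subgradient descent mentioned in the abstract is not reproduced here.

```python
import numpy as np


def resample(support, weights, size, rng):
    """Draw `size` i.i.d. points from the discrete measure
    sum_k weights[k] * delta_{support[k]} (weights must sum to 1)."""
    idx = rng.choice(len(support), size=size, p=weights)
    return support[idx]


def barycenter_1d_equal_atoms(samples, lam):
    """Stand-in exact solver (assumption: data on the real line).

    Each row of `samples` holds the atoms of one empirical measure, all with
    mass 1/S. The 2-Wasserstein barycenter then has S atoms of mass 1/S,
    obtained by averaging the sorted atoms with the weights `lam`.
    """
    sorted_samples = np.sort(np.asarray(samples, dtype=float), axis=1)
    return np.asarray(lam, dtype=float) @ sorted_samples


def resampled_barycenter(measures, lam, size, seed=0):
    """Resample every measure to `size` atoms, then solve exactly.

    `measures` is a list of (support, weights) pairs of 1-D NumPy arrays.
    """
    rng = np.random.default_rng(seed)
    samples = [resample(x, w, size, rng) for x, w in measures]
    return barycenter_1d_equal_atoms(samples, lam)


if __name__ == "__main__":
    measures = [
        (np.array([0.0, 1.0]), np.array([0.5, 0.5])),
        (np.array([10.0, 11.0]), np.array([0.5, 0.5])),
    ]
    atoms = resampled_barycenter(measures, lam=[0.5, 0.5], size=50)
    print(atoms.min(), atoms.max())
```

In this sketch the sample size `size` is the knob that trades computational cost against statistical accuracy: the solver only ever sees `size` atoms per measure, regardless of how large the original supports are. For measures in higher dimensions, the quantile-averaging function would have to be replaced by a genuine exact barycenter solver.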