Bernoulli Society: One World Probability Webinar #4

Date: 20 Nov 2025
Time: 14:00–16:00 (GMT+01:00)
Level of instruction: Intermediate

Sub-Gaussian estimation under heavy tails and contamination 
(Zoraida F. Rico)

Abstract

Estimating the mean and covariance under heavy tails and adversarial contamination remains a central challenge in robust statistics. In this talk, we revisit the classical trimmed mean estimator for one-dimensional mean estimation, providing new finite-sample insights. We show that the trimmed mean achieves optimal performance in this setting and satisfies a strong form of the central limit theorem (CLT). This work is joint with Roberto I. Oliveira and Paulo Orenstein (2025).
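To make the object of study concrete, here is a minimal sketch of a symmetrically trimmed mean in Python. It is only an illustration of the classical idea (discard the k smallest and k largest observations, then average), not the speakers' exact estimator or tuning of k:

```python
import statistics

def trimmed_mean(xs, k):
    """Average of the sample after discarding the k smallest and
    k largest observations. Illustrative helper only; the choice
    of k in the talk's finite-sample analysis may differ."""
    if 2 * k >= len(xs):
        raise ValueError("k too large for the sample size")
    ys = sorted(xs)
    return statistics.fmean(ys[k:len(ys) - k])

# A single gross outlier barely affects the trimmed mean:
sample = [1.0, 2.0, 3.0, 4.0, 1000.0]
print(trimmed_mean(sample, 1))  # averages [2.0, 3.0, 4.0] -> 3.0
```

The robustness to contamination comes precisely from the trimming step: an adversary who corrupts fewer than k observations on either side cannot move the estimate beyond the range of the retained order statistics.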

We then turn to covariance estimation for a d-dimensional random vector from an i.i.d. sample, and show that, even in the presence of relatively heavy tails and adversarial contamination, the proposed estimator achieves the optimal dimension-free rate of convergence. This part is based on joint work with Roberto I. Oliveira (Annals of Statistics, 2024).

Robust, sub-Gaussian mean estimators in metric spaces 
(Roberto I. Oliveira)

Abstract

Estimating the mean of a random vector from i.i.d. data has received considerable attention. When the data take values in more general metric spaces, an appropriate extension of the notion of the mean is the Fréchet mean. While asymptotic properties of the most natural Fréchet mean estimator (the empirical Fréchet mean) have been thoroughly researched, non-asymptotic performance bounds have only been studied recently. This talk considers the performance of estimators of the Fréchet mean in general metric spaces under possibly heavy-tailed and contaminated data. In such cases, the empirical Fréchet mean is a poor estimator. We propose a general estimator based on high-dimensional extensions of trimmed means and prove general performance bounds. Unlike all previously established bounds, ours generalize the optimal bounds known for Euclidean data. Much like in the Euclidean case, the optimal accuracy is governed by two “variance” terms: a “global variance” term that is independent of the prescribed confidence, and a potentially much smaller, confidence-dependent “local variance” term. We apply our results to metric spaces with curvature bounded from below, such as Wasserstein spaces, and to uniformly convex Banach spaces. The talk is based on arXiv:2509.13606, which is joint work with Daniel Bartl (NUS), Gabor Lugosi (ICREA/UPF), and Zoraida Rico (Bocconi).
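For readers new to the notion: the Fréchet mean of a random element $X$ of a metric space $(M, d)$ is the standard generalization of the Euclidean mean as the minimizer of expected squared distance, with the empirical Fréchet mean obtained by replacing the expectation by a sample average:

\[
m^\ast \in \operatorname*{arg\,min}_{x \in M} \mathbb{E}\,\big[d(x, X)^2\big],
\qquad
\widehat{m}_n \in \operatorname*{arg\,min}_{x \in M} \frac{1}{n} \sum_{i=1}^{n} d(x, X_i)^2 .
\]

When $M = \mathbb{R}^d$ with the Euclidean distance, $m^\ast$ reduces to the ordinary mean $\mathbb{E}[X]$ and $\widehat{m}_n$ to the sample average, which is why bounds for Fréchet mean estimation are naturally compared to the optimal Euclidean bounds mentioned above.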