
AMCS Colloquium

Friday, April 28, 2023 - 1:45pm

Bodhisattva Sen

Columbia University, Statistics


University of Pennsylvania

DRL - A4

Abstract: We propose a general framework for distribution-free nonparametric testing in multiple dimensions, based on a notion of multivariate ranks defined using the theory of optimal transport (see, e.g., Villani (2003)). We demonstrate the applicability of this approach by constructing exactly distribution-free tests for two classical nonparametric problems: (i) testing for the equality of two multivariate distributions, and (ii) testing for mutual independence between two random vectors. In particular, we propose (multivariate) rank versions of Hotelling's T^2 and kernel two-sample tests (e.g., Gretton et al. (2012), Szekely and Rizzo (2013)) for scenario (i), and of kernel tests for independence (e.g., Gretton et al. (2007), Szekely et al. (2007)) for scenario (ii). We investigate the consistency and asymptotic distributions of these tests, both under the null and under local contiguous alternatives. We also study the local power and asymptotic (Pitman) efficiency of these multivariate tests (based on optimal transport), and show that a subclass of these tests achieves attractive efficiency lower bounds that mimic the remarkable efficiency results of Hodges and Lehmann (1956) and Chernoff and Savage (1958) for the Wilcoxon rank-sum test. To the best of our knowledge, this is the first collection of multivariate, nonparametric, exactly distribution-free tests that provably achieve such attractive efficiency lower bounds. We also study the rates of convergence of the rank maps (also known as optimal transport maps).
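As a rough illustration of the optimal-transport ranks mentioned in the abstract, the sketch below assigns each sample point to one of n fixed reference points so that the total squared distance is minimized; the reference point assigned to an observation plays the role of its multivariate rank. This is not the speaker's implementation. The names `sq_dist` and `ot_ranks`, the choice of reference grids, and the example data are all hypothetical, and the brute-force search over permutations stands in for a proper assignment solver, so it is only practical for a handful of points.

```python
from itertools import permutations


def sq_dist(x, c):
    """Squared Euclidean distance between two points given as tuples."""
    return sum((xi - ci) ** 2 for xi, ci in zip(x, c))


def ot_ranks(points, reference):
    """Assign each point to a distinct reference point, minimizing total squared distance.

    Brute-force search over all permutations (illustrative only; cost grows as n!).
    Returns the reference point assigned to each input point, in input order.
    """
    n = len(points)
    if len(reference) != n:
        raise ValueError("need exactly one reference point per sample point")
    best = min(
        permutations(range(n)),
        key=lambda s: sum(sq_dist(points[i], reference[s[i]]) for i in range(n)),
    )
    return [reference[j] for j in best]


# One dimension: with reference points k/n, the assignment reduces to ordinary ranks / n.
sample_1d = [(0.3,), (-1.2,), (2.5,), (0.9,)]
grid_1d = [(k / 4,) for k in range(1, 5)]
ranks_1d = ot_ranks(sample_1d, grid_1d)
print(ranks_1d)  # [(0.5,), (0.25,), (1.0,), (0.75,)]

# Two dimensions: the ranks are a rearrangement of a fixed grid in the unit square.
sample_2d = [(0.1, 2.0), (-1.0, 0.5), (1.5, -0.3), (0.4, 0.4)]
grid_2d = [(0.25, 0.25), (0.25, 0.75), (0.75, 0.25), (0.75, 0.75)]
ranks_2d = ot_ranks(sample_2d, grid_2d)
print(ranks_2d)
```

Because the output is always a rearrangement of the same fixed reference points, any statistic computed from these ranks takes values in a set that does not depend on the data distribution, which is the intuition behind the distribution-free property described above.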