Anti-concentration of random variables from zero-free regions

- Marcus Michelen, University of Illinois at Chicago
- Julian Sahasrabudhe, University of Cambridge

### Editorial introduction

Anti-concentration of random variables from zero-free regions, Discrete Analysis 2022:13, 29 pp.

In recent years there have been a number of breakthroughs concerning the probability that a random matrix is singular. There are several natural notions of “random matrix”, but perhaps the most famous result in this direction is Tikhomirov’s theorem that a random \(n\times n\) matrix with independent \(\pm 1\) entries is singular with probability \((1/2+o(1))^n\).

A key tool in many arguments in this area is the use of *anticoncentration estimates*. A random variable \(X\) is informally said to be “concentrated” at a point \(t\) if \(X\) is close to \(t\) with high probability. There have been many proofs that exploit the “concentration of measure” phenomenon, which is roughly the fact that if a random variable is made out of several “sufficiently independent” parts, then it will be highly concentrated about its mean. An anticoncentration estimate is, as its name suggests, more like the opposite – finding upper bounds on the probability that the random variable takes a particular value, or belongs to some small interval.

Sometimes this can be shown by means of a local central limit theorem – that is, a theorem that says not just that some discrete random variable belongs to large intervals with approximately the probability given by an appropriate normal distribution, but even that it takes particular values with the probabilities one would heuristically expect: if these are small, as they often are, then one obtains anticoncentration.
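As a hedged numerical sketch of this idea (not an argument from the paper), one can check a local central limit theorem directly for \(X\sim\mathrm{Binomial}(n,1/2)\): the point probability \(\mathbb P(X=k)\) is well approximated by the normal density with matching mean \(n/2\) and variance \(n/4\), and in particular no point probability much exceeds order \(n^{-1/2}\).

```python
from math import comb, exp, pi, sqrt

# Local CLT sketch for X ~ Binomial(n, 1/2): compare the exact point
# probability P(X = k) with the normal density of matching mean and variance.
n = 400
mean, var = n / 2, n / 4

for k in (190, 200, 210):
    exact = comb(n, k) * 0.5 ** n
    gaussian = exp(-((k - mean) ** 2) / (2 * var)) / sqrt(2 * pi * var)
    print(f"k={k}: exact={exact:.5f}, gaussian={gaussian:.5f}")
```

The two columns agree to within a fraction of a percent, and both are of order \(n^{-1/2}\), which is the anticoncentration one heuristically expects.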

Unfortunately, there are not many general tools for proving anticoncentration. What makes this paper interesting is that it provides us with another one: it shows that anticoncentration estimates for a probability distribution can be deduced from properties of its generating function.

To get some idea of why this might be the case, let \(X_1,X_2,\dots,X_n\) be independent random variables with \(X_i\) taking the value \(1\) with probability \(p_i\) and \(0\) with probability \(1-p_i\) for each \(i\). The generating function of the sum \(\sum_{i=1}^nX_i\) is \(\prod_{i=1}^n(1-p_i+p_ix)\), which has roots at \(-(1-p_i)/p_i\), and it is easy to see that, conversely, if the generating function of a random variable \(X\) is a polynomial of degree \(n\) with negative roots \(-c_1,\dots,-c_n\), then \(X\) is a sum of independent Bernoulli variables \(X_i\) with \((1-p_i)/p_i=c_i\) for each \(i\).
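The correspondence between the factors \(1-p_i+p_ix\) and the roots \(-(1-p_i)/p_i\) can be verified numerically; the particular probabilities \(p_i\) below are arbitrary illustrative choices.

```python
import numpy as np

# The PGF of a sum of independent Bernoulli(p_i) variables is
# prod_i (1 - p_i + p_i x), whose roots are the negative reals -(1 - p_i)/p_i.
p = np.array([0.2, 0.5, 0.7])  # arbitrary illustrative probabilities

# Multiply out the factors p_i * x + (1 - p_i); coefficients highest degree first.
poly = np.array([1.0])
for pi_ in p:
    poly = np.convolve(poly, [pi_, 1 - pi_])

roots = np.sort(np.roots(poly))
expected = np.sort(-(1 - p) / p)
print(np.real(roots))  # negative real roots of the generating function
print(expected)        # -(1 - p_i)/p_i for each i
```

Conversely, reading off the roots of such a polynomial recovers the probabilities via \(p_i = 1/(1+c_i)\).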

By standard techniques, one can prove good anticoncentration estimates for such a random variable \(X\) as long as the \(p_i\) are bounded away from 0 and 1 (obviously if too many of the \(X_i\) take particular values with high probability then we cannot hope for anticoncentration), so we see that we have good anticoncentration estimates if the roots of the generating function are negative, bounded, and bounded away from 0. Note that good anticoncentration here means that no value is taken with probability significantly greater than \(n^{-1/2}\), which is the trivial minimum given that the standard deviation of \(X\) is at most \(n^{1/2}\).
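This \(n^{-1/2}\) behaviour is easy to observe numerically. The following sketch (with probabilities drawn, as an arbitrary choice, uniformly from \([0.3,0.7]\), so bounded away from 0 and 1) computes the exact distribution of \(\sum_i X_i\) by repeated convolution and checks that the largest point probability decays like \(n^{-1/2}\).

```python
import numpy as np

# Anticoncentration sketch: for a sum of n independent Bernoulli(p_i)
# variables with p_i bounded away from 0 and 1, the maximum point
# probability is of order n^{-1/2}.
rng = np.random.default_rng(0)

for n in (100, 400, 1600):
    p = rng.uniform(0.3, 0.7, size=n)  # bounded away from 0 and 1
    dist = np.array([1.0])
    for pi_ in p:
        dist = np.convolve(dist, [1 - pi_, pi_])  # add one Bernoulli term
    print(n, dist.max(), dist.max() * np.sqrt(n))
```

The rescaled quantity `dist.max() * sqrt(n)` stays roughly constant as \(n\) grows, consistent with anticoncentration at the optimal scale \(n^{-1/2}\).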

The main result of this paper shows that the example just described is a special case of a much more general phenomenon: to guarantee anticoncentration of a random variable \(X\) that takes values in \(\{0,1,\dots,n\}\) it is enough if the absolute values of the roots of the generating function are bounded and bounded away from zero, and the arguments of the roots are bounded away from zero. Thus, the condition on the absolute values is the same, but the requirement on the arguments is greatly weakened – instead of all the arguments being \(\pi\), they simply have to lie outside the interval \((-\delta,\delta)\) for some \(\delta>0\) that is independent of \(n\). One of the useful aspects of this result is that nothing is assumed about how \(X\) is built: this makes it possible in principle (and, as the authors demonstrate, also in practice) to prove anticoncentration results for random variables that are not sums of many independent variables.
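A toy illustration of the more general phenomenon (again a numerical sketch, not the paper's proof): the generating function \((1+x+x^2)/3\) of a uniform variable on \(\{0,1,2\}\) has roots at the primitive cube roots of unity, which have modulus 1 and arguments \(\pm 2\pi/3\), so they are bounded away from zero in modulus and argument but are not negative reals. The sum of \(n\) independent copies nevertheless exhibits anticoncentration at scale \(n^{-1/2}\).

```python
import numpy as np

# The PGF (1 + x + x^2)/3 has roots exp(±2*pi*i/3): modulus 1, arguments
# bounded away from 0, but not on the negative real axis.  The sum of n
# independent uniform {0,1,2} variables still satisfies anticoncentration.
base = np.array([1 / 3, 1 / 3, 1 / 3])

for n in (100, 400, 1600):
    dist = np.array([1.0])
    for _ in range(n):
        dist = np.convolve(dist, base)  # add one uniform {0,1,2} summand
    print(n, dist.max(), dist.max() * np.sqrt(n))
```

Here the rescaled maximum again stabilises, in line with the theorem's conclusion for roots whose arguments lie outside \((-\delta,\delta)\).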

Broadly, the proof consists of two parts. First, the variance of \(X\) is bounded below in terms of a truncated Mellin transform of the difference of the logarithm of the probability generating function evaluated at two related points. Second, this Mellin transform is bounded below near zero in terms of the geometry of the roots of the probability generating function. Both parts are non-trivial and require a combination of various techniques, some of them taken from an earlier paper of the same authors, which related probability generating functions to central limit theorems.