An algebraic inverse theorem for the quadratic Littlewood-Offord problem, and an application to Ramsey graphs

Matthew Kwan; Lisa Sauermann

doi:10.19086/da.14351

Probability

August 11, 2020 BST

math.CO (Mathematics - Combinatorics); math.PR (Mathematics - Probability)

Kwan, Matthew, and Lisa Sauermann. 2020. “An Algebraic Inverse Theorem for the Quadratic Littlewood-Offord Problem, and an Application to Ramsey Graphs.” Discrete Analysis, August. https://doi.org/10.19086/da.14351.

Editorial introduction

Read article at ArXiv

An algebraic inverse theorem for the quadratic Littlewood-Offord problem, and an application to Ramsey graphs, Discrete Analysis 2020:12, 34 pp.

The Littlewood-Offord problem is the following general question. Let $v_1,\dots,v_n$ be vectors in $\mathbb R^d$ of norm at least 1 and let $B\subset\mathbb R^d$ be a closed ball of diameter $\Delta$. Of the $2^n$ sums $\sum_{i\in E}v_i$, where $E\subset\{1,2,\dots,n\}$, how many can lie in $B$? Erdős observed that when $d=1$, $\Delta<1$ and the $v_i$ are all positive (which one may assume after changing signs), it is not possible for $\sum_{i\in E}v_i$ and $\sum_{i\in F}v_i$ both to lie in $B$ if $E$ is a proper subset of $F$. Sperner’s theorem states that the largest collection of subsets of $\{1,2,\dots,n\}$ with no set in the collection being a proper subset of another has size at most $\binom n{\lfloor n/2\rfloor}$ (with equality if one takes all sets of size $\lfloor n/2\rfloor$ or alternatively all sets of size $\lceil n/2\rceil$). It follows that at most $\binom n{\lfloor n/2\rfloor}$ of the sums can lie in $B$. Erdős showed more generally that if $B$ is an interval of length $\Delta$, then the best bound is the sum of the $\lfloor\Delta\rfloor+1$ largest binomial coefficients. It is easy to see that these results are best possible by considering the case where every $v_i$ is equal to 1. For general $d$, Kleitman proved in 1965 that if $\Delta<1$ then the same bound of $\binom n{\lfloor n/2\rfloor}$ holds, and a good understanding for general $\Delta$ was achieved by Frankl and Füredi in 1988.
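The one-dimensional bounds above can be checked by brute force for small $n$. The following is a minimal sketch, not taken from the paper; the function name `max_sums_in_interval` and the sample values are illustrative assumptions.

```python
from itertools import combinations
from math import comb


def max_sums_in_interval(values, length):
    """Largest number of subset sums of `values` (counted with multiplicity)
    that lie in a single closed interval of the given length."""
    sums = sorted(
        sum(subset)
        for r in range(len(values) + 1)
        for subset in combinations(values, r)
    )
    best = 0
    lo = 0
    for hi in range(len(sums)):
        # Advance the left end until sums[lo..hi] fits in an interval of this length.
        while sums[hi] - sums[lo] > length:
            lo += 1
        best = max(best, hi - lo + 1)
    return best


n = 8
all_ones = [1] * n  # the extremal example: every v_i equal to 1

# Delta < 1: the bound is the middle binomial coefficient.
print(max_sums_in_interval(all_ones, 0.5), comb(n, n // 2))  # 70 70

# Delta = 1.5: the bound is the sum of the floor(1.5) + 1 = 2 largest coefficients.
print(max_sums_in_interval(all_ones, 1.5), comb(n, 4) + comb(n, 5))  # 126 126

# Hypothetical sample values, all at least 1: the count cannot exceed the bound.
sample = [1, 1.5, 2, 3, 1, 1.25, 2.5, 4]
print(max_sums_in_interval(sample, 0.5) <= comb(n, n // 2))  # True
```

Enumerating all $2^n$ subsets is only feasible for small $n$; the point is simply to see the all-ones example attain the bound.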

Of particular interest is the case $\Delta=0$, where one is asking in how many ways the same value can be taken. It is convenient to rephrase this question in an equivalent probabilistic way. When $d=1$ the rephrased question asks the following. Let $\xi_1,\dots,\xi_n$ be independent unbiased Bernoulli random variables (each equal to 0 or 1 with probability $1/2$) and let $a_1,\dots,a_n$ be real numbers of absolute value at least 1. How large can a probability of the form $\mathbb P[\sum_ia_i\xi_i=t]$ be? The result of Erdős implies that it cannot be greater than $2^{-n}\binom n{\lfloor n/2\rfloor}$, which is of order $n^{-1/2}$.
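For reference, the order $n^{-1/2}$ follows from Stirling's formula $m!\sim\sqrt{2\pi m}\,(m/e)^m$. A brief sketch of the standard estimate, taking $n$ even (an assumption made here only to avoid floor signs):

$$2^{-n}\binom{n}{n/2}=\frac{n!}{2^n\,\bigl((n/2)!\bigr)^2}\sim\frac{\sqrt{2\pi n}\,(n/e)^n}{2^n\cdot\pi n\,\bigl(n/(2e)\bigr)^n}=\sqrt{\frac{2}{\pi n}}.$$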

Sums of this kind come up naturally in the study of random matrices. Indeed, if $M$ is a random $n\times n$ 0-1 matrix with all its entries independent, then the coordinates of the image of the vector $(a_1,\dots,a_n)^T$ under $M$ will all be of the above form. (In practice it is more convenient to study $\pm 1$-valued matrices, but the problems are essentially equivalent.) In a series of important papers, Tao and Vu realized that it was helpful to understand the properties of sequences $(a_1,\dots,a_n)$ that come somewhere near achieving the Erdős bound, in the sense that the probability is at least $n^{-C}$ for some constant $C$. They proved an inverse Littlewood-Offord theorem, showing that for such sequences there must be arithmetic relationships between the $a_i$; their results in this direction are closely related to Freiman’s inverse theorem for sets with small sumsets.

This paper considers a generalization in another direction. The problem above can be expressed as a question about linear polynomials in $n$ variables. If $f$ is such a polynomial, how large can the probability that $f(\xi_1,\dots,\xi_n)=t$ be, given suitable assumptions about the coefficients? Once the question is phrased that way, it is natural to ask what happens if we replace “linear” by “quadratic”. The example $(x_1+\dots+x_n)^2$ (with $n$ even and the variables taking the values $\pm 1$) shows that the probability that $f(x)=0$ can be as large as the maximum probability in the linear case, but it is notable that this example has very low rank: it is the square of a single linear polynomial. The paper proves that this phenomenon is more general: if a value is taken with probability significantly greater than $n^{-1}$, then the polynomial must be close to a polynomial of low rank.
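The rank-one example can be verified exactly for small $n$. The sketch below assumes the variables are $\pm 1$-valued, so that the sum can vanish; with 0-1 variables one would instead use $(x_1+\dots+x_n-n/2)^2$. The helper names `prob_zero` and `square_of_sum` are illustrative.

```python
from fractions import Fraction
from itertools import product
from math import comb


def prob_zero(f, n):
    """Exact probability that f(x) = 0 for x uniform on {-1, 1}^n."""
    hits = sum(1 for x in product((-1, 1), repeat=n) if f(x) == 0)
    return Fraction(hits, 2 ** n)


def square_of_sum(x):
    """The rank-one quadratic (x_1 + ... + x_n)^2."""
    return sum(x) ** 2


n = 10
p = prob_zero(square_of_sum, n)
print(p)                                    # 63/256
print(p == Fraction(comb(n, n // 2), 2 ** n))  # True: matches the linear extremal value
```

The quadratic vanishes exactly when half of the variables equal $+1$, which is why it attains $2^{-n}\binom n{n/2}$.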

To see that $n^{-1}$ is a natural bound, consider a quadratic $\sum_{i,j=1}^n\epsilon_{ij}x_ix_j$, where the $\epsilon_{ij}$ take the values $\pm 1$ independently at random. For each fixed choice of $\{0,1\}$-values for the $x_i$, the variance of this sum (over the random choice of the $\epsilon_{ij}$) is of order the square of the number of $x_i$ that are equal to 1, which is typically of order $n^2$. The standard deviation is therefore of order $n$, so there is a range of order $n$ roughly equally likely integer values that the sum might take. We thus expect that if we make a typical choice for the $\epsilon_{ij}$, these values will be roughly equally likely as we range over sequences $x$ with a given number of 1s, so they will occur with probability around $n^{-1}$. Since the quadratic typically has full rank, we cannot conclude anything about the rank from the existence of a value that occurs with this probability.
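This heuristic can be explored numerically for small $n$. The following is a rough sketch rather than anything from the paper: it fixes one random choice of the $\epsilon_{ij}$ (the seed and the value $n=12$ are arbitrary assumptions), enumerates all $x\in\{0,1\}^n$, and compares the largest point probability with $1/n$. For such small $n$ the comparison is only indicative, since the heuristic is asymptotic.

```python
import random
from collections import Counter
from itertools import product


def point_probabilities(eps):
    """Distribution of sum_{i,j} eps[i][j] * x_i * x_j for x uniform on {0, 1}^n."""
    n = len(eps)
    counts = Counter()
    for x in product((0, 1), repeat=n):
        support = [i for i in range(n) if x[i]]
        # Only pairs (i, j) with x_i = x_j = 1 contribute to the quadratic.
        value = sum(eps[i][j] for i in support for j in support)
        counts[value] += 1
    total = 2 ** n
    return {value: count / total for value, count in counts.items()}


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 12
eps = [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(n)]

probs = point_probabilities(eps)
print("largest point probability:", max(probs.values()))
print("1/n for comparison:       ", 1 / n)
```

Replacing `eps` by the all-ones matrix recovers the low-rank quadratic $(x_1+\dots+x_n)^2$, whose most likely value has the much larger probability $2^{-n}\binom n{\lfloor n/2\rfloor}$.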

As this example shows, there are two phenomena at play here. As in the linear case, there are arithmetic properties of the set of coefficients, but in the quadratic case it also makes a great deal of difference how the coefficients are distributed amongst the pairs $ij$ – in the case where the coefficients take just two values, the structure of the graph of pairs where one of the values is taken is very important.

In the light of this, it should not come as too much of a surprise that the results of this paper have graph-theoretic applications. A Ramsey graph is a graph with $n$ vertices that contains no clique or independent set with more than $C\log n$ vertices, for some fixed constant $C$. The main examples of Ramsey graphs are random graphs, and there are a number of results that aim to show that a Ramsey graph must be random-like in various ways. For instance, one class of such results states that there must be many different numbers of edges that an induced subgraph can have. The results of this paper are highly relevant, since the number of edges in an induced subgraph is equal to the value of the associated quadratic polynomial on the characteristic function of the vertex set of that subgraph, so the paper shows that if any number of edges occurs with probability significantly greater than $n^{-1}$, then the graph must have quite strong “low rank structure”. The authors show that this structure is incompatible with the graph being Ramsey, thereby giving an asymptotic answer to a question of Kwan, Sudakov and Tran.
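The link between edge counts and quadratic polynomials can be made concrete: for a graph with edge set $E$, the polynomial $\sum_{\{u,v\}\in E}x_ux_v$ evaluated at the characteristic function of a vertex set counts the edges induced by that set. A minimal sketch, using a 5-cycle as an arbitrary illustrative graph (the function names are assumptions, not from the paper):

```python
from itertools import combinations


def induced_edge_count(edges, subset):
    """Number of edges with both endpoints in `subset`."""
    chosen = set(subset)
    return sum(1 for u, v in edges if u in chosen and v in chosen)


def quadratic_value(edges, x):
    """Value of the quadratic sum over edges {u, v} of x_u * x_v."""
    return sum(x[u] * x[v] for u, v in edges)


# Illustrative graph: the cycle on vertices 0, 1, 2, 3, 4.
n = 5
edges = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]

for r in range(n + 1):
    for subset in combinations(range(n), r):
        x = [1 if i in subset else 0 for i in range(n)]  # characteristic function
        assert induced_edge_count(edges, subset) == quadratic_value(edges, x)

print("edge counts agree with the quadratic on all", 2 ** n, "vertex subsets")
```

Choosing the vertex set uniformly at random corresponds to taking the $x_i$ to be independent unbiased 0-1 variables, which is how the quadratic Littlewood-Offord problem enters.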

Video: Matthew Kwan talking about the results in this paper.
