Random multilinear maps and the Erdős box problem

David Conlon; Cosmin Pohoata; Dmitriy Zakharov

doi:10.19086/da.28336

Extremal combinatorics

September 27, 2021 BST

Random multilinear maps and the Erdős box problem

https://doi.org/10.19086/da.28336

math.co (mathematics - combinatorics)

ccby-4.0

Photo by Esther Jiao on Unsplash

Conlon, David, Cosmin Pohoata, and Dmitriy Zakharov. 2021. “Random Multilinear Maps and the Erdős Box Problem.” Discrete Analysis, September. https://doi.org/10.19086/da.28336.

Editorial introduction

Read article at ArXiv

Random multilinear maps and the Erdős box problem, Discrete Analysis 2021:17, 8 pp.

A major theme in extremal combinatorics is determining the maximum number of edges that a graph or hypergraph can have if it does not contain a certain fixed graph or hypergraph. In the case of graphs, Turán’s theorem gives an exact result when the forbidden subgraph is the complete graph $K_r$ (the maximum number of edges is attained when the graph is an $(r-1)$ -partite graph with vertex sets as equal as possible in size), and the Erdős-Stone theorem shows that if the forbidden subgraph has chromatic number $r$ then the same example is at least asymptotically best possible.

When $r=2$ , meaning that the forbidden subgraph is bipartite, the Erdős-Stone theorem merely tells us that the asymptotic density of the graph is zero, which can be proved by a relatively simple direct argument, first noted by Kővári, Sós and Turán. The Zarankiewicz problem, which has led to a large amount of research, is to obtain more precise bounds for the number of edges. Even for general complete bipartite graphs there is a significant gap between the best known upper and lower bounds.

One case of interest is that of 4-cycles. If $G$ is a graph with $n$ vertices and we write $N(x)$ for the neighbourhood of a vertex $x$ , then the number of edges of $G$ is $\sum_x|N(x)|$ (ignoring a factor 2) and the number of labelled 4-cycles, including degenerate ones, is $\sum_{x,y}|N(x)\cap N(y)|^2$ . By Cauchy-Schwarz, the latter quantity is at least $n^{-2}(\sum_{x,y}|N(x)\cap N(y)|)^2$ . But counts the number of labelled and possibly degenerate paths of length 2, so it is equal to which is up to a constant also the number of degenerate 4-cycles. Thus, there is a non-degenerate cycle provided that
$n^{-2}(\sum_z|N(z)|^2)^2\gg\sum_z|N(z)|^2,$
or in other words provided that $\sum_z|N(z)|^2\gg n^2$ . But by Cauchy-Schwarz again, $\sum_z|N(z)|^2\geq n^{-1}(\sum_z|N(z)|)^2=m^2n^{-1}$ . Thus, if $m\gg n^{3/2}$ , then $G$ contains a non-degenerate 4-cycle.

One might expect an upper bound proved in as natural a way as this to be matched by a lower bound given by the standard technique of picking a random graph and removing a few edges. However, if we choose edges independently with probability $p$ , then the expected number of (labelled and possibly degenerate) 4-cycles is $p^4n^4$ while the number of edges is $pn^2$ (up to a constant), so for this to work we need $p^3\gg n^{-2}$ , which results in a graph with around $n^{4/3}$ edges instead of $n^{3/2}$ . Nevertheless, the $n^{3/2}$ bound can be shown to be sharp using a construction due to Klein that appears in a paper of Erdős. Define a graph by taking the vertex set to be $\mathbb F_p^2$ and join $(x_1,y_1)$ to $(x_2,y_2)$ if $y_1+y_2=(x_1+x_2)^2$ . If $(x_1,y_1), (x_2,y_2), (x_3,y_3), (x_4,y_4)$ forms a 4-cycle in this graph, then setting $a_i=x_i+x_{i+1}$ (where $i+1$ is interpreted as 1 when $i=4$ ), we have that $a_1+a_3=a_2+a_4$ and $a_1^2+a_3^2=a_2^2+a_4^2$ . This implies that $a_1-a_2=a_4-a_3$ and $a_1^2-a_2^2=a_4^2-a_3^2$ , which implies that either $a_1=a_2$ or $a_1+a_2=a_4+a_3$ , and therefore that $a_1=a_4$ and $a_2=a_3$ . Thus, all 4-cycles are degenerate.

There are many situations like this in extremal graph theory and extremal combinatorics more generally, where random constructions work reasonably well, but are outperformed by explicit algebraic constructions. However, for many of these situations, even the best known algebraic constructions do not give bounds that match what is known in the other direction.

Recently, an interesting new method has emerged, which is a sort of hybrid of the random and algebraic approaches. Though the roots of the idea can be found earlier, it took off properly in 2014 with an influential paper by Boris Bukh entitled Random algebraic construction of extremal graphs. (A somewhat different mixture of randomness and algebra can also be found in Peter Keevash’s remarkable constructions of designs from the same year.)

Broadly speaking, Bukh’s insight was that if a random construction has algebraic aspects to it, then it is sometimes possible to show that if a forbidden configuration is present, then some much larger configuration must also be present. It is easier to rule out the larger configuration, and this ultimately leads to a better bound. Bukh’s example was a bipartite graph with vertex sets equal to $\mathbb F_q^s$ , with $x$ joined to $y$ if and only if $(x,y)$ is a zero of a random polynomial in $2s$ variables of degree at most $d$ in each variable (for suitably chosen $d$ ). The randomness of the polynomial is enough for the calculation of various moments, but the all-important non-randomness arises because the size of the zero set of a polynomial is tightly constrained: the zero set itself has a dimension $k$ and the size of the set must be approximately $q^k$ . In particular, if it is not bounded, then it must be at least comparable to $q$ .

This paper concerns the natural hypergraph generalization of the 4-cycle problem above. That is, one is looking for the maximum number of edges in a $d$ -partite $d$ -uniform hypergraph that contains no complete $d$ -partite subhypergraph with two vertices in each vertex class. If the vertex sets of the hypergraph are $X_1,\dots,X_d$ and one represents edges by points in $X_1\times\dots\times X_d$ , then this forbidden configuration is a set of the form $Y_1\times\dots\times Y_d$ where each $Y_i$ is a set of size 2: that is, it is a discrete aligned cuboid, which explains the name “box problem”. Alternatively, if we think of the edges of the hypergraph as simplices, then the forbidden configuration is a $d$ -dimensional octahedron (often known as a cross-polytope).

One motivation for these extremal problems is that $d$ -dimensional octahedra are important in the theory of quasirandomness: just as a graph is quasirandom if it has as few 4-cycles as possible given its density, so a hypergraph is quasirandom (in many useful senses) if it has as few $d$ -dimensional octahedra as possible.

Cauchy-Schwarz type arguments show that if the $d$ vertex sets have size $n$ , then a box-free hypergraph has $O(n^{d-1/2^{d-1}})$ edges. However, as we have already seen when $d=2$ , the probabilistic method with deletions does not give a matching lower bound: instead it gives a hypergraph with $cn^{d-d/(2^d-1)}$ edges.

Gunderson, Rödl and Sidorenko used a random algebraic construction to improve on this bound for all $d$ such that $d$ is coprime to $2^d-1$ . When $d=3$ , they took three copies $X,Y$ and $Z$ of $\mathbb F_p^5$ as their vertex sets and for each $x\in X$ and $y\in Y$ they took all triples $(x,y,z)$ such that $z$ belonged to a randomly chosen 3-dimensional affine subspace $R_{xy}$ of $\mathbb F_p^5$ . This construction yields a lower bound of $cn^{13/5}$ , whereas the bound from the purely random method is $cn^{18/7}$ , which is smaller (by 1/35).

In this paper the authors use random multilinear maps. That is, they let each $X_i$ be a copy of $\mathbb F_q^s$ for a carefully chosen $s$ , and then for suitable $r$ they choose a random multilinear map $\mu:X_1\times\dots\times X_d\to\mathbb F_q^r$ and take as their edge set the set of all $(x_1,\dots,x_d)$ such that $\mu(x_1,\dots,x_d)=(1,1,\dots,1)$ .

The key point about this construction is that if the resulting hypergraph contains a box $Y_1\times\dots\times Y_d$ , then in fact it contains the much larger set $Z_1\times\dots\times Z_d$ , where each $Z_i$ is the line the contains $Y_i$ . This enables the authors to obtain a non-trivial gain using a double counting argument and then to apply the deletion method more efficiently. Their method works for all $d$ and improves on the bound of Gunderson, Rödl and Sidorenko, when their method works, except if $d$ is a power of 2. This very interesting proof technique may well have further applications.

The bound they obtain is slightly complicated to specify, but the exponent is at least $d-1/{\lceil d^{-1}(2^d-1)\rceil}$ . Since $d$ cannot divide $2^d-1$ , this is bigger than the $d-d/(2^d-1)$ that is given by the probabilistic method. Though the gain is small, it turns out that by a result of Ferber, McKinley and Samotij, any power-type gain on the obvious probabilistic bound yields an optimal counting result for the number of box-free hypergraphs with $n$ vertices in each class – thus, the small gain has a very satisfying corollary.

More details can be found in the paper, or in the following talk given by one of the authors.

Read article at ArXiv