An efficient container lemma

Jozsef Balogh; Wojciech Samotij

doi:10.19086/da.17354

Extremal combinatorics

December 10, 2020 BST

An efficient container lemma

https://doi.org/10.19086/da.17354

containershypergraphscombinatorics

Photo by Bernd Dittrich on Unsplash

Balogh, Jozsef, and Wojciech Samotij. 2020. “An Efficient Container Lemma.” Discrete Analysis, December. https://doi.org/10.19086/da.17354.

Editorial introduction

Read article at ArXiv

An efficient container lemma, Discrete Analysis 2020:17, 56 pp.

The hypergraph container lemma, discovered independently in 2012 by David Saxton and Andrew Thomason, and by József Balogh, Robert Morris and Wojciech Samotij, is an extremely powerful tool that has been used to solve a large number of important problems in extremal combinatorics and to give simple new proofs of many others. It is not easy to give a precise statement of it in a short space, but one can give something of its flavour by looking at a special case.

Suppose one would like to estimate the number of triangle-free graphs with vertex set $\{1,2,\dots,n\}$ . One might at first guess that a typical triangle-free graph would be something like a random graph with edge probability around $n^{-1/2}$ and appropriate edge-deletions. This should give us a lower bound of the form $n^{cn^{3/2}}$ . However, a moment’s reflection shows that a bound like this is a very long way indeed from being best possible, since if $H$ is a complete bipartite graph with $n^2/4$ edges, then the number of subgraphs of $H$ , all of which are triangle free, is $2^{n^2/4}$ , which is vastly bigger than $n^{n^{3/2}}$ .

Encouraged by this, one may try to go further, by exploiting the fact that one could have started with any number of different complete bipartite graphs. However, there are only $2^n$ of them, so this will give at best a marginal improvement to the exponent $n^2/4$ .

It therefore begins to appear as though almost all triangle-free graphs are in a certain sense organized into “clusters” – that is, families of triangle free graphs where all the graphs in a given family are contained in some large representative triangle-free graph. And while the above remarks do not come close to proving a statement of such a kind, it turns out that this apparent clustering phenomenon is real, at least when appropriately formulated. Indeed, the following result is a special case of the container lemma.

Theorem. For every $\epsilon>0$ there exists a constant $C$ such that for every $n$ there is a family $\mathcal G$ of graphs with vertex set $\{1,2,\dots,n\}$ with the following properties.
1. No graph in $\mathcal G$ contains more than $\epsilon n^3$ triangles.
2. Every triangle-free graph is contained in some $G$ that belongs to $\mathcal G$ .
3. $|\mathcal G|\leq n^{Cn^{3/2}}$ .

To apply this result to our counting problem requires a lemma that says that a graph with at most $\epsilon n^3$ triangles cannot have more than $(\frac 14+c(\epsilon))n^2$ edges, where $c(\epsilon)\to 0$ as $\epsilon\to 0$ . Such lemmas are usually called supersaturation lemmas, and are usually stated in the contrapositive, so in this case the statement would be that a graph with substantially more than $n^2/4$ edges must contain many triangles.

Given this lemma, we obtain immediately that the number of triangle-free graphs is at most $2^{(1/4+c(\epsilon))n^2}n^{C(\epsilon)n^{3/2}}$ , which implies an upper bound of the form $2^{(1/4+o(1))n^2}$ , and a more precise statement can be obtained by optimizing over $\epsilon$ .

One can also use the theorem to show that the largest triangle-free subgraph $H$ of a sparse random graph $G$ cannot contain substantially more than half the edges of $G$ . A straightforward approach to this problem is difficult, because a simple calculation shows that the expected number of triangle-free subgraphs of a random graph with the appropriate edge probability (which is of order $n^{-1/2}$ ) is large. However, that is because a few sparse graphs – the ones that are close to being bipartite – have a huge number of triangle-free subgraphs. With the help of the theorem above, one can instead argue as follows. A typical sparse random graph will intersect every graph in $\mathcal G$ in roughly half its edges (provided the edge probability is significantly larger than $Cn^{-1/2}$ , so that estimates are strong enough for a union bound to work). Therefore no subgraph with substantially more than half the edges is contained in any $G\in\mathcal G$ . It follows that every subgraph with substantially more than half the edges contains a triangle.

These results were already known, but the hypergraph container lemma is a considerable generalization of the theorem above, which enables one to prove many other results of a similar kind, as well as other classes of results, and all in a unified way. In the above special case, the relevant hypergraph is a 3-uniform hypergraph whose vertices are the edges of the complete graph on $\{1,2,\dots,n\}$ and whose faces are the triples of edges that form triangles. A triangle-free graph can be regarded as an independent set in this hypergraph. The container lemma is a similar statement that applies to a much wider class of hypergraphs.

This important paper provides a substantial extension of the container method, by providing a genuinely different and more efficient way to construct containers. A significant advantage of this is that it enables one to prove results where the hypergraph is $k$ -uniform for some $k$ that tends to infinity with $n$ at a significant rate. For example, as long as $r=o(\log n/\log\log n)$ the authors can prove results about graphs that contain no clique of size $r$ . Since a random graph does not contain cliques of size $C\log n$ , this is almost as much as could be hoped for.

The authors obtain a number of other applications, but it seems very likely that this is just the start, and that, just like the original container lemma, the efficient container lemma they prove will become an essential tool in extremal combinatorics.

Read article at ArXiv