Efficient arithmetic regularity and removal lemmas for induced bipartite patterns

Let $G$ be an abelian group of bounded exponent and $A \subseteq G$. We show that if the collection of translates of $A$ has VC dimension at most $d$, then for every $\epsilon>0$ there is a subgroup $H$ of $G$ of index at most $\epsilon^{-d-o(1)}$ such that one can add or delete at most $\epsilon|G|$ elements to/from $A$ to make it a union of $H$-cosets. We also establish a removal lemma with polynomial bounds, with applications to property testing, for induced bipartite patterns in a finite abelian group with bounded exponent.


Introduction
Szemerédi's regularity lemma [26] gives a rough structural decomposition for all graphs and is one of the most powerful tools in graph theory. A major drawback of the regularity lemma is that the number of parts in the decomposition grows as an exponential tower of 2's of height a power of $1/\epsilon$, where $\epsilon$ is the regularity parameter. A natural question that has been studied by many researchers is: in what circumstances can one get a more effective bound? Namely, under what conditions does every graph in a family $\mathcal{F}$ of graphs necessarily have a partition with far fewer parts, say polynomial in $1/\epsilon$? One natural condition for a family $\mathcal{F}$ of graphs is that it is hereditary, that is, if $G \in \mathcal{F}$ then every induced subgraph of $G$ is also in $\mathcal{F}$. For hereditary families, it turns out that the bound on the number of parts in a regular partition is polynomial in $1/\epsilon$ if the neighborhood set system of every graph in the family has bounded VC dimension, and otherwise the bound is tower-type. This gives a satisfactory answer to the problem.
A set system S is a collection of subsets of some ground set Ω. Here we only consider finite Ω. We say that U ⊂ Ω is shattered by S if for every U ′ ⊂ U there is some T ∈ S with T ∩ U = U ′ . The Vapnik-Chervonenkis dimension (or VC dimension) of S, denoted VC-dim S, is the size of the largest shattered set.
Let G be a graph. The neighborhood N (v) of a vertex v ∈ V (G) is the set of vertices adjacent to v. The VC dimension of a graph G is defined to be VC-dim{N (v) : v ∈ V }.
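To make the two definitions above concrete, here is a minimal brute-force sketch that computes the VC dimension of a small set system and of a small graph. The function names (`is_shattered`, `vc_dimension`, `graph_vc_dimension`) are illustrative and do not come from the paper, and the exhaustive search is only practical for tiny examples.

```python
from itertools import combinations

def is_shattered(points, sets):
    """True if every subset of `points` equals T ∩ points for some T in `sets`."""
    pts = frozenset(points)
    traces = {t & pts for t in sets}
    return len(traces) == 2 ** len(pts)

def vc_dimension(ground, sets):
    """Size of the largest shattered subset of `ground` (exhaustive search)."""
    ground = list(ground)
    sets = {frozenset(t) for t in sets}
    d = 0
    for size in range(1, len(ground) + 1):
        # A shattered set of this size needs at least 2**size distinct sets.
        if 2 ** size > len(sets):
            break
        if not any(is_shattered(c, sets) for c in combinations(ground, size)):
            break  # subsets of shattered sets are shattered, so we can stop
        d = size
    return d

def graph_vc_dimension(adj):
    """VC dimension of the neighborhood set system {N(v) : v in V}."""
    return vc_dimension(adj.keys(), adj.values())

# The 4-cycle 0-1-2-3-0 has only two distinct neighborhoods, {1, 3} and {0, 2}.
cycle4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
cycle4_dim = graph_vc_dimension(cycle4)

# The full power set of {0, 1, 2} shatters the whole ground set.
power_set = [set(c) for r in range(4) for c in combinations(range(3), r)]
power_set_dim = vc_dimension(range(3), power_set)
```

The early exit relies on the fact that every subset of a shattered set is itself shattered, so once no set of a given size is shattered, no larger set can be.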
Given a bipartite graph $F$ with vertex bipartition $V(F) = U \cup V$, we say that a map $\phi : V(F) \to V(G)$ bi-induces $F$ if for every $(u, v) \in U \times V$, the pair $uv$ is an edge of $F$ if and only if $\phi(u)\phi(v)$ is an edge of $G$. Note that we have no requirements about edges in $G$ between vertices in the image of $U$, and likewise with $V$. We say that $G$ contains a bi-induced copy of $F$ if there exists a map $\phi$ as above that is injective on each of $U$ and $V$.$^1$ It is known that the following are equivalent for a hereditary family $\mathcal{F}$ of graphs:
(1) The VC dimension of the graphs in $\mathcal{F}$ is uniformly bounded.
(2) There is a bipartite graph $F$ such that none of the graphs in $\mathcal{F}$ has a bi-induced copy of $F$.
(3) The family $\mathcal{F}$ has a forbidden induced bipartite graph, a forbidden induced complement of a bipartite graph, and a forbidden induced split graph.
(4) The number of graphs in $\mathcal{F}$ on $n$ vertices is at most $2^{n^{2-\epsilon}}$ for some $\epsilon = \epsilon(\mathcal{F}) > 0$. In contrast, every other hereditary family of graphs contains at least $2^{n^2/4}$ labeled graphs on $n$ vertices.
(5) There is a constant $k = k(\mathcal{F})$ such that every $n$-vertex graph in $\mathcal{F}$ has an equitable vertex partition into at most $\epsilon^{-k}$ parts such that all but at most an $\epsilon$-fraction of the pairs of parts have edge density at most $\epsilon$ or at least $1 - \epsilon$. In contrast, every other hereditary family of graphs has a graph for which any $\epsilon$-regular equitable vertex partition requires a number of parts that is a tower of height a power of $1/\epsilon$.
NA was supported in part by an ISF grant and a GIF grant. JF was supported by a Packard Fellowship, by NSF Career Award DMS-1352121 and by an Alfred P. Sloan Fellowship. YZ was supported by NSF Award DMS-1362326.
$^1$ Having a bi-induced copy of $F$ is weaker than having an induced copy of $F$, where in the latter we also require that there are no edges in $G$ between vertices in the image of $U$, and likewise with $V$. Also, an alternative notion of bi-induced copy of $F$ assumes that $\phi$ is injective. The discussed results hold for this alternative notion as well.
The above characterizations give an interesting dichotomy between hereditary families of graphs of bounded VC dimension versus those of unbounded VC dimension. It shows that families of graphs with bounded VC dimension have smaller growth and are more structured. The equivalence of (1) and (4) was given by Alon, Balogh, Bollobás, and Morris [1]. Alon, Fischer, and Newman [3] proved a bipartite version of the regularity lemma for graphs of bounded VC dimension, and the version for all graphs is due to Lovász and Szegedy [15]. The proof was simplified with improved bounds by Fox, Pach, and Suk [11]. Further results related to the above equivalences for tournaments can be found in [10].
A half-graph is a bipartite graph on $2k$ vertices $\{u_1, \dots, u_k\} \cup \{v_1, \dots, v_k\}$ such that $u_i$ is adjacent to $v_j$ if and only if $i \le j$. Malliaris and Shelah [16] proved that if a graph has no bi-induced copy of the half-graph on $2k$ vertices, then one can partition the vertex set into $\epsilon^{-O_k(1)}$ many parts such that every pair of parts is $\epsilon$-regular (there are no irregular pairs). Bi-inducing a half-graph is related to a notion of stability in model theory, and for this reason Malliaris and Shelah called their result a "stable regularity lemma".
The above discussion summarizes some relevant results for graphs. We now turn our attention to subsets of groups and their associated Cayley graphs. Let G be a finite abelian group, written additively. Let A ⊂ G. Consider the Cayley sum graph formed by taking the elements of G as vertices, where x, y ∈ G are adjacent if x+ y ∈ A (we may end up with some loops; alternatively, we can consider a bipartite version of this construction). The VC dimension of the graph corresponds to the VC dimension of the collection of translates of A, which we simply call the VC dimension of A, defined as VC-dim A := VC-dim{A + x : x ∈ G}.
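As a small sanity check of the definition of VC-dim $A$, the sketch below computes it by exhaustive search in the bounded-exponent group $G = (\mathbb{Z}/2\mathbb{Z})^n$. Encoding group elements as $n$-bit integers with XOR as addition is our choice for illustration, as are the function names.

```python
from itertools import combinations

def translates(A, n):
    """Distinct translates A + x in (Z/2Z)^n; elements are n-bit ints, addition is XOR."""
    return {frozenset(a ^ x for a in A) for x in range(1 << n)}

def vc_dim_of_subset(A, n):
    """VC-dim A := VC-dim{A + x : x in G} for G = (Z/2Z)^n, by exhaustive search."""
    sets = translates(A, n)
    d = 0
    for size in range(1, (1 << n) + 1):
        # Shattering `size` points needs at least 2**size distinct translates.
        if 2 ** size > len(sets):
            break
        shattered = any(
            len({s & frozenset(pts) for s in sets}) == 2 ** size
            for pts in combinations(range(1 << n), size)
        )
        if not shattered:
            break
        d = size
    return d

n = 3
subgroup = {0, 1, 2, 3}   # an index-2 subgroup of (Z/2Z)^3
singleton = {0}
dim_subgroup = vc_dim_of_subset(subgroup, n)
dim_singleton = vc_dim_of_subset(singleton, n)
```

A subgroup of index 2 has only two distinct translates (its cosets), so it cannot shatter any pair of points.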
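For a bipartite graph $F$ with vertex bipartition $U \cup V$, we say that a map $\phi : V(F) \to G$ bi-induces $F$ in $A$ if, for every $(u, v) \in U \times V$, the pair $uv$ is an edge of $F$ if and only if $\phi(u) + \phi(v) \in A$. We say that $A$ has a bi-induced copy of $F$ if there exists a map $\phi$ as above that is injective on each of $U$ and $V$.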
Observe that $A$ has a bi-induced copy of $F$ if its VC dimension is large enough. To see this, first note that if no pair of vertices in $U$ has identical neighborhoods in $V$, and $A$ has VC dimension at least $|V|$, then $A$ has a bi-induced copy of $F$. Indeed, we can construct $\phi$ by mapping $V$ to a subset of $G$ shattered by translates of $A$ (such a choice exists since VC-dim $A \ge |V|$). Since $\phi(V)$ is shattered, for every $u \in U$, there is some $y_u \in G$ such that $(A - y_u) \cap \phi(V) = \phi(N(u))$. Let $\phi$ send $u$ to this $y_u$, for each $u \in U$. We obtain a map $\phi : V(F) \to G$ that bi-induces $F$, though this map may not be injective on $U$ (it is always injective on $V$) if some pairs of vertices of $U$ have identical neighborhoods, but this can be easily fixed.$^2$
Green [12] proved an arithmetic analogue of Szemerédi's regularity lemma for abelian groups. The statement is much simpler in the case of abelian groups of bounded exponents, which is the main focus of our paper (some remarks regarding general groups are given in the final section). For an abelian group G and a subset A ⊂ G, a coset H + x of a subgroup H is called ǫ-regular if all the nontrivial Fourier coefficients of A ∩ (H + x), when interpreted as a subset of H + x, are at most ǫ. For each ǫ > 0 and positive integer r, Green's arithmetic regularity lemma states that there is K = K(r, ǫ) such that the following holds. If G has exponent at most r and A ⊂ G, then there is a subgroup H ⊂ G of index at most K such that all but an ǫ-fraction of the cosets of H are ǫ-regular.
Recently, an arithmetic analog of the Malliaris-Shelah stable regularity lemma was proved by Terry and Wolf [25] for $G = \mathbb{F}_p^n$ with $p$ fixed. It was shown that if $A \subset G$ has no bi-induced copy of a half-graph on $2k$ vertices, then there is a subgroup $H$ of $G$ of index at most $e^{\epsilon^{-O_{k,p}(1)}}$ such that for every $x \in G$, one has either […] Here the subscripts on the $O_{k,p}(1)$ mean that the constant is allowed to depend on $k$ and $p$. The result was subsequently extended to general groups by Conant, Pillay, and Terry [6], who showed that for every finite group $G$, if $A \subset G$ has no bi-induced copy of the half-graph on $2k$ vertices, then there […] However, the general group version of the theorem [6] gives no quantitative bounds on the index of $H$ due to the model theoretic tools involved in its proof.
We saw earlier that forbidding a fixed bi-induced bipartite graph implies bounded VC dimension. Our first main result generalizes a variant of Terry and Wolf's result to sets of bounded VC dimension, and gives bounds of polynomial order in 1/ǫ. Its proof can be found in Section 2.

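Theorem 1.1 (Regularity lemma). Fix positive integers $r$ and $d$. If $G$ is a finite abelian group with exponent at most $r$, and $A \subset G$ has VC dimension at most $d$, then for every $\epsilon > 0$ there is a subgroup $H$ of $G$ of index at most $\epsilon^{-d-o(1)}$ such that $|A \Delta S| \le \epsilon|G|$ for some union $S$ of $H$-cosets.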
Here o(1) is some quantity that goes to zero as ǫ → 0, at a rate possibly depending on r and d.
We also prove a removal lemma for bi-induced copies of a fixed bipartite graph. Let us first recall the classical graph removal lemma. We say that an $n$-vertex graph is $\epsilon$-far from some property if one needs to add or delete more than $\epsilon n^2$ edges to satisfy the property. The triangle removal lemma$^3$ says that if an $n$-vertex graph is $\epsilon$-far from triangle-free, then its triangle density is at least $\delta(\epsilon) > 0$. The original graph regularity proof [19] of the triangle removal lemma shows that we may take $1/\delta(\epsilon)$ to be a tower of 2's of height $\epsilon^{-O(1)}$, which was improved to height $O(\log(1/\epsilon))$ in [9]. It is known that there exists a constant $c > 0$ such that the bound in the triangle removal lemma cannot be improved to $\delta = \epsilon^{c\log(1/\epsilon)}$ (see [8] for a survey on graph removal lemmas). There is also a removal lemma for induced subgraphs [2], initially proved using a so-called strong regularity lemma, though better bounds were later obtained in [7].
Our second main result gives an arithmetic analog of the removal lemma, with polynomial bounds, for bi-induced patterns. We say that $A \subset G$ is $\epsilon$-far from bi-induced-$F$-free if every $A' \subset G$ with $|A \Delta A'| \le \epsilon|G|$ contains a bi-induced copy of $F$. Here is our second main result, whose proof can be found in Section 4.

Theorem 1.2 (Removal lemma). Fix a positive integer $r$ and a bipartite graph $F$. Let $G$ be a finite abelian group with exponent at most $r$. For every $0 < \epsilon < 1/2$, if $A \subset G$ is $\epsilon$-far from bi-induced-$F$-free, then the probability that a uniform random map $\phi : V(F) \to G$ bi-induces $F$ is at least […]
We mention an application to property testing. The removal lemma gives a polynomial-time randomized sampling algorithm for distinguishing sets $A \subset G$ that are bi-induced-$F$-free from those that are $\epsilon$-far from bi-induced-$F$-free. Indeed, sample a random map $\phi : V(F) \to G$, and output YES if $\phi$ bi-induces $F$ and is injective on each vertex part of $F$, and otherwise output NO. If $A$ is bi-induced-$F$-free, then the algorithm always outputs NO. On the other hand, if $A$ is $\epsilon$-far from bi-induced-$F$-free, then by the theorem above, the algorithm outputs YES with probability at least $\epsilon^{O_F(1)}$, provided that $G$ is large enough, so that $\phi$ is injective with high probability. We can then repeat the experiment $\epsilon^{-O_F(1)}$ times to obtain a randomized algorithm that succeeds with high probability.

Regularity lemma
In this section, we prove Theorem 1.1. We say that a set system S on a finite ground set Ω is δ-separated if |S∆T | ≥ δ |Ω| for all distinct S, T ∈ S. We quote a bound on the size of a δ-separated system.
By taking a maximal δ-separated collection of translates of A ⊂ G, we deduce, below, that A must be δ-close to many of its own translates.
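Lemma 2.2. Let $G$ be a finite abelian group, and $A \subset G$ a subset with VC dimension at most $d$, and $\delta > 0$. Then $B = \{x \in G : |A \Delta (A + x)| \le \delta|G|\}$ has size at least $(\delta/30)^d|G|$.
We quote a result from additive combinatorics. We use the following standard notation: $A + B = \{a + b : a \in A, b \in B\}$, $A - B = \{a - b : a \in A, b \in B\}$, and $kA = A + \cdots + A$ ($k$ summands).
Theorem 2.3 (Bogolyubov-Ruzsa lemma for groups of bounded exponent). Let $G$ be an abelian group of exponent at most $r$, and $A \subset G$ a finite set such that $|A + A| \le K|A|$. Then $2A - 2A$ contains a subgroup of $G$ of size at least $c_r(K)|A|$ for some constant $c_r(K) > 0$.
The name "Bogolyubov-Ruzsa lemma" was given by Sanders [20], who proved the theorem with the current best bound $c_r(K) = e^{-O_r(\log^4 2K)}$ (see [20, Theorem 11.1]). We refer the readers to the introductions of [20, 21] for the history of this result. A version of the theorem for $G = \mathbb{Z}$ was initially proved by Ruzsa [17] as a key step towards his proof of Freiman's theorem. The assertion of the polynomial Freiman-Ruzsa conjecture, a central open problem in additive combinatorics, would follow from an improvement of the bound to $c_r(K) = K^{-O_r(1)}$.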
In our next lemma, we start from the conclusion of Lemma 2.2, which gives us a large set $B$ such that $A \approx A + x$ for all $x \in B$. Consider the sequence $B, 2B, 4B, 8B, \dots$. Since $B$ is large, the size of $2^i B$ cannot keep on growing, so we can find a set $B' = 2^i B$ with small doubling $|B' + B'| \le K|B'|$, and $i$ not too large. Theorem 2.3 then implies that $2B' - 2B'$ contains a large subgroup, in which every element $x$ satisfies $A \approx A + x$, which is close to what we need.
Proof. Let $K = K(\delta)$ be a quantity increasing sufficiently slowly to infinity as $\delta \to 0$. Since $|2^i B| \le |G|$ for every $i \ge 0$, we have $|2\ell B| \le K|\ell B|$ for some $\ell = 2^i < 2^{\log_K(|G|/|B|)} = k^{\log_K 2} = k^{o(1)}$.
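The search for a scale with small doubling in the sequence $B, 2B, 4B, \dots$ can be illustrated with the following sketch. It assumes, purely for illustration, that $G = (\mathbb{Z}/2\mathbb{Z})^n$ with elements encoded as integers and XOR as addition; the names `sumset` and `find_small_doubling` are ours.

```python
def sumset(X, Y):
    """X + Y in (Z/2Z)^n, with elements encoded as ints and XOR as addition."""
    return frozenset(x ^ y for x in X for y in Y)

def find_small_doubling(B, K):
    """Return (ell, ell*B) for the first ell = 2^i with |2*ell*B| <= K * |ell*B|."""
    if K < 1:
        raise ValueError("K must be at least 1")
    ell, C = 1, frozenset(B)
    while True:
        doubled = sumset(C, C)            # this is (2 * ell) B
        if len(doubled) <= K * len(C):
            return ell, C
        # |C| grew by a factor larger than K, which can only happen
        # finitely often inside a finite group.
        ell, C = 2 * ell, doubled

# B = {0, e1, e2, e3, e4} inside (Z/2Z)^4.
B = {0, 1, 2, 4, 8}
ell, C = find_small_doubling(B, K=2)
```

In this toy instance $|B| = 5$ and $|2B| = 11 > 2|B|$, while $4B$ is the whole group of size $16 \le 2|2B|$, so the search stops at $\ell = 2$.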
Remark. Instead of applying the Bogolyubov-Ruzsa lemma as we do above, it is also possible to prove Lemma 2.4 using Freiman's theorem for groups of bounded exponent (due to Ruzsa [18]): if $A$ is a finite subset of an abelian group of exponent at most $r$ such that $|A + A| \le K|A|$, then $A$ is contained in a subgroup of size $O_{r,K}(1)|A|$. At the point in the proof of Lemma 2.4 where we apply Theorem 2.3, we can instead contain $\ell B$ inside a subgroup of size $\delta^{-o(1)}|\ell B|$, and using Kneser's theorem (see [24, Theorem 5.5]) we can deduce that $H = \ell' B$ is a subgroup for some $\ell' = \delta^{-o(1)}\ell$. From this point we can proceed as in the rest of the proof.

A strengthened regularity lemma
In the next section, we prove a removal lemma for bi-induced patterns. The regularity lemma we stated in Theorem 1.1 seems not quite strong enough to establish the removal lemma. Below we prove a strengthening, where the VC dimension hypothesis is weakened to a more robust one. Instead of requiring that A has bounded VC dimension, we will ask that, with probability at least 0.9, say, the VC dimension of the collection of translates of A is bounded if we restrict the ground set G to a random set. We state the result below in the form of two alternatives: either A has high VC dimension when sampled, or it satisfies a regularity lemma with polynomial bounds. Here o(1) refers to a quantity that goes to zero as ǫ → 0, at a rate that can depend on r and d.
Recall that Lemma 2.2 tells us that if VC-dim $A \le d$, then $B = \{x : |A \Delta (A + x)| \le \delta|G|\}$ has size at least $(\delta/30)^d|G|$. We will derive a similar bound for $B$ under the weaker hypothesis, namely the negation of (a) in Proposition 3.1, from which we can deduce (b) using Lemma 2.4 as in the proof of the previous regularity lemma, Theorem 1.1.
Lemma 3.2. Let $k \le n/2$ be positive integers. In an $n$-vertex graph with maximum degree at most $n/k$, a random $k$-element subset of the vertices contains an independent set of size at least $k/4$ with probability at least $1 - e^{-k/8}$.

Proof.
Let $v_1, \dots, v_k$ be a sequence of $k$ vertices chosen uniformly at random without replacement. Let $I$ be the independent set formed greedily by, starting with the empty set, putting each $v_i$, sequentially as $i = 1, 2, \dots$, into $I$ if doing so keeps $I$ an independent set. During the process, while at most $k/4$ elements have been added to $I$, the probability that a new $v_i$ is added to $I$ is at least $1 - \frac{(k/4)(n/k)}{n-k} \ge \frac{1}{2}$, since among the at least $n - k$ remaining vertices, at most $(k/4)(n/k)$ of them are adjacent to vertices already added to $I$ at this point. It follows that $|I|$ stochastically dominates $\min\{X, k/4\}$, where $X$ is distributed as Binomial$(k, 1/2)$. Thus $\mathbb{P}(|I| < k/4) \le \mathbb{P}(X < k/4) \le e^{-k/8}$ by the Chernoff bound. Therefore, $\{v_1, \dots, v_k\}$ contains an independent set $I$ of size at least $k/4$ with probability at least $1 - e^{-k/8}$.
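The greedy process in this proof is easy to simulate. The sketch below runs it on a cycle graph, which is our choice of test graph (maximum degree 2, comfortably below $n/k$ for the parameters used); the values of `n`, `k` and the random seed are likewise illustrative.

```python
import random

def greedy_independent(order, adj):
    """Scan vertices in the given order, keeping each one that is not
    adjacent to any vertex kept so far."""
    kept = []
    for v in order:
        if all(u not in adj[v] for u in kept):
            kept.append(v)
    return kept

n, k = 1000, 40
# Cycle graph on n vertices: maximum degree 2 <= n / k.
adj = {v: {(v - 1) % n, (v + 1) % n} for v in range(n)}

rng = random.Random(0)
sample = rng.sample(range(n), k)   # k vertices drawn without replacement
I = greedy_independent(sample, adj)
```

The set `I` is a maximal independent set of the subgraph induced on the sample, so for this particular graph its size is at least $k/3$ for every sample; the lemma is needed for graphs whose maximum degree is as large as $n/k$.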
We recall a basic result on VC dimension. […] $X$ be a random $12m^d$-element subset of $G$, and $Y$ a random $m$-[…]
Proof. Suppose, on the contrary, that $|B| < |G|/(12m^d)$. Consider the Cayley graph on $G$ generated by $B \setminus \{0\}$, i.e., there is an edge between $x, y \in G$ whenever $x - y \in B$. Applying Lemma 3.2 with $k = 12m^d$ to this graph, we find that with probability at least $1 - e^{-m^d}$, a random $12m^d$-element subset $X \subset G$ contains an independent set $I \subset X$ with $|I| \ge 3m^d$ with respect to this graph, i.e., $|(A + x) \Delta (A + y)| > \delta|G|$ for all distinct $x, y \in I$. It follows, by union bound and averaging, that we can fix such a set $X$ so that VC-dim$\{(A + x) \cap Y : x \in X\} \le d$ with probability at least $3m^{2d}(1 - \delta)^m$ for the random $m$-element set $Y \subset G$.
If $|B| < |G|/(12m^d)$, then by Lemma 3.5, if $X$ and $Y$ are random $2m^d$-element subsets of $G$, then VC-dim$\{(A + x) \cap Y : x \in X\} > d$ with probability at least 0.9.
The result then follows after an appropriate choice of $\delta = \epsilon^{1+o(1)}$.

Removal lemma
In this section, we prove the removal lemma, Theorem 1.2, for bi-induced patterns. The result is analogous to the induced removal lemma [2] which can be proved using a strong version of the graph regularity lemma. The usual way of proving the strong graph regularity lemma is to iterate the graph regularity lemma. For our arithmetic setting, as we are concerned with bi-induced patterns, the situation is a bit easier: we simply apply the regularity lemma, Proposition 3.1, twice, where the second time we choose a smaller error parameter compared to the first time. If option (a) holds either time, then we can extract a bi-induced copy of F from each sample with high VC dimension. Otherwise, (b) holds, and we can modify A by a small amount to A ′ , which must also have a bi-induced copy of F (since A is ǫ-far from bi-induced-F -free). The set A ′ is a union of H-cosets where H is a subgroup of bounded index, and we will show that a single bi-induced copy of F in A ′ leads to many copies.
We may assume that $|G| \ge \epsilon^{-O(|V(F)|)}$, or else the conclusion is automatic from just a single bi-induced copy of $F$ in $A$.
Suppose, for some $k = \epsilon^{-O(|V(F)|)}$, with probability at least 0.9, random $k$-element subsets $X, Y \subset G$ satisfy VC-dim$\{(A + x) \cap Y : x \in X\} > d$, in which case there exist injective maps $U \to X$ and $V \to Y$ that bi-induce $F$ in $A$ by footnote 2. Then the probability that random injections $U \to G$ and $V \to G$ bi-induce $F$ is at least $0.9\,k^{-|V(F)|}$, since we can choose the random injection $U \to G$ by first choosing the random $k$-element subset $X \subset G$ and then taking a random injection $U \to X$, and similarly with $V$. With probability $1 - O_F(|G|^{-1})$ a random map $V(F) \to G$ is injective on $U$ and $V$, so it bi-induces $F$ with probability at least $\epsilon^{O(|V(F)|^2)}$.
We apply Proposition 3.1 with two different parameters $\epsilon_1 = \epsilon/10$ and some $\epsilon_2 = \epsilon_1^{d+o(1)}$. If option (a) is true in either case, then the previous paragraph implies the conclusion of the theorem. Otherwise, we obtain subgroups $H_1$ and $H_2$ of $G$, such that for each $i \in \{1, 2\}$, one has $h_i := |G|/|H_i| \le \epsilon_i^{-d-o(1)}$ and there exists some union $S_i$ of $H_i$-cosets satisfying $|A \Delta S_i| \le \epsilon_i|G|$. Furthermore, we may assume that $\epsilon_2$ is chosen so that $h_1 \epsilon_2 |U||V| < 1/4$. Let $H = H_1 \cap H_2$, which has index at most $h_1 h_2 \le \epsilon^{-d^2-d-o(1)}$.
We say that a coset $x + H$ of $H$ is good if $|A \cap (x + H)|/|H|$ is within $\eta := 1/(2|U||V|)$ of 0 or 1, and bad otherwise. At most an $\epsilon_2/\eta$-fraction of $H$-cosets are bad, since otherwise bad $H$-cosets would together contribute more than $(\epsilon_2/\eta)\eta|G|$ elements to $A \Delta S_2$ as $S_2$ is also a union of $H$-cosets, but this is impossible as $|A \Delta S_2| \le \epsilon_2|G|$.
Pick an arbitrary subgroup $K$ of $G$ containing exactly one element from each coset of $H_1$ (so that $G = H_1 \oplus K$ as a direct sum). Let $z \in H_1$ be chosen uniformly at random. Then $z + K + H$ is a union of $|K| = h_1$ many $H$-cosets. For each $y \in K$, the random $H$-coset $z + y + H$ is uniformly chosen from all $H$-cosets in $y + H_1$. Applying the union bound, we see that the probability that $z + K + H$ contains a bad $H$-coset is at most $h_1 \epsilon_2/\eta = 2h_1 \epsilon_2 |U||V| < 1/2$.
Therefore there is some instance of $z$ such that $|A' \Delta A| < \epsilon|G|$, and $z + K + H$ is a union of good $H$-cosets.
Since $A$ is $\epsilon$-far from bi-induced-$F$-free, $A'$ contains a bi-induced copy of $F$. So there exist $x'_u \in G$ for each $u \in U$ and $y'_v \in G$ for each $v \in V$ such that $x'_u + y'_v \in A'$ if and only if $uv \in E(F)$. Since $A'$ is a union of $H_1$-cosets, and there is an element of $K$ in every $H_1$-coset, we may assume that $x'_u \in K$ for each $u \in U$ and $y'_v \in z + K$ for each $v \in V$. Consider independent and uniform random elements $x_u \in x'_u + H$ for each $u \in U$, and $y_v \in y'_v + H$ for each $v \in V$. For each $(u, v) \in U \times V$, the random element $x_u + y_v$ is distributed uniformly in the $H$-coset $x'_u + y'_v + H$, which is a good $H$-coset since $x'_u + y'_v \in z + K$ as $K$ is a subgroup. So with probability at least $1 - \eta$, one has $x_u + y_v \in A$ if and only if $x'_u + y'_v \in A'$, which in turn occurs if and only if $uv \in E(F)$. Taking a union bound over $(u, v) \in U \times V$, the following holds with probability at least $1 - |U||V|\eta = 1/2$: for every $(u, v) \in U \times V$, one has $x_u + y_v \in A$ if and only if $uv \in E(F)$. Since each $x_u$ and $y_v$ is restricted to a single $H$-coset, it follows that a uniform random map $\phi : V(F) \to G$ bi-induces $F$ with probability at least $\frac{1}{2}(|H|/|G|)^{|V(F)|} \ge \epsilon^{(d^2+d+o(1))|V(F)|}$.

Concluding remarks
We conjecture that the result can be extended to general groups, not necessarily abelian. A special case of the conjecture, though with a somewhat stronger but non-quantitative conclusion, where one forbids a half-graph of fixed size (instead of assuming bounded VC dimension), was recently established by Conant, Pillay, and Terry [6] using model theoretic tools.
Note that the bounded exponent hypothesis in the conjecture above cannot be dropped. Indeed, if $G = \mathbb{Z}/p\mathbb{Z}$ with $p$ prime, and $A = \{1, 2, \dots, \lfloor p/2 \rfloor\}$, then VC-dim $A \le 3$, while $G$ has no nontrivial proper subgroups, so the conclusion of the conjecture is false. Nonetheless, there may be regularity lemmas using other structures in addition to subgroups. An example of such a result is discussed later in this section.
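The bound VC-dim $A \le 3$ for an interval in $\mathbb{Z}/p\mathbb{Z}$ can be checked numerically for a small prime. The sketch below does this by exhaustive search; the choice $p = 17$ and the function name are ours, and $p$ is taken large enough that the group has at least $2^4$ translates, so that sets of size 4 are actually examined.

```python
from itertools import combinations

def vc_dim_of_translates_mod_p(A, p):
    """VC dimension of {A + x : x in Z/pZ}, by exhaustive search."""
    sets = {frozenset((a + x) % p for a in A) for x in range(p)}
    d = 0
    for size in range(1, p + 1):
        # Shattering `size` points needs at least 2**size distinct translates.
        if 2 ** size > len(sets):
            break
        shattered = any(
            len({s & frozenset(pts) for s in sets}) == 2 ** size
            for pts in combinations(range(p), size)
        )
        if not shattered:
            break
        d = size
    return d

p = 17
A = set(range(1, p // 2 + 1))   # {1, 2, ..., floor(p/2)}
dim = vc_dim_of_translates_mod_p(A, p)
```

Every translate of $A$ is a cyclic interval, and among four points in cyclic order no interval can contain the first and third while missing the second and fourth, which is why no 4-element set is shattered.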
We also conjecture that the removal lemma should generalize to arbitrary groups as well, although it seems to be open even for general abelian groups.
Conjecture 5.2. Fix a bipartite graph $F$. Let $G$ be a finite group. For every $0 < \epsilon < 1/2$, if $A \subset G$ is $\epsilon$-far from bi-induced-$F$-free, then the probability that a uniform random map $\phi : V(F) \to G$ bi-induces $F$ is at least […]
It seems likely that the theory developed by Breuillard, Green, and Tao [4, 5] on the structure of approximate groups should be useful in the case of nonabelian groups. We hope to study these problems in the future.
In classical results in additive combinatorics, such as Freiman's theorem, generalized progressions and Bohr sets often play the role of subgroups when the ambient group does not have many subgroups. For example, in Green and Ruzsa's [13] extension of Freiman's theorem to general abelian groups, the basic structural objects are coset progressions, which are sets of the form $P = Q + H$, where $H$ is a subgroup, and $Q$ is some generalized arithmetic progression $\{x_0 + i_1 x_1 + \cdots + i_d x_d : 0 \le i_j < \ell_j \text{ for each } j\}$, and the sum $Q + H$ is a direct sum in the sense that every element in $Q + H$ can be written as $q + h$ with $q \in Q$ and $h \in H$ in a unique way. We say that the progression is proper if all the terms $x_0 + i_1 x_1 + \cdots + i_d x_d$ in $Q$ are distinct. We call $d$ the dimension of the progression.
The Bogolyubov-Ruzsa lemma, Theorem 2.3, holds for general abelian groups (see [13, Section 5]; also see [20]).
Theorem 5.3 (Bogolyubov-Ruzsa lemma for general abelian groups). Let $G$ be an abelian group, and $A \subset G$ a finite set such that $|A + A| \le K|A|$. Then $2A - 2A$ contains a proper coset progression $P$ of dimension at most $d(K)$ and size at least $c(K)|A|$, for some constants $c(K), d(K) > 0$.
By modifying the proof of Theorem 2.3 so that we apply Theorem 5.3 instead of 2.3, we obtain an analog of the first claim in Theorem 2.3 for general finite abelian groups. We are not sure if some variant of this result can be used to prove a removal lemma.
Proposition 5.4. For every $\epsilon > 0$ and $D = D(\epsilon) \to \infty$ as $\epsilon \to 0$, if $G$ is a finite abelian group, and $A \subset G$ has VC dimension at most $d$, then there exists some proper coset progression $P$ of dimension at most $D$ and size $|P| \ge \epsilon^{d+o(1)}|G|$, such that $|(A + x) \Delta A| \le \epsilon|G|$ for all $x \in P$. Here $o(1)$ is some quantity that goes to zero as $\epsilon \to 0$, at a rate depending on $d$ and $D$.
We conclude with the following related question that we do not know how to answer (even for k = 2). An affirmative answer would strengthen Szemerédi's theorem.
Question 5.5. Let k be a positive integer and δ > 0. Let p be a sufficiently large prime, and A ⊂ Z/pZ with δp ≤ |A| ≤ (1 − δ)p. Can we always find a 2k-term arithmetic progression in Z/pZ where the first k terms lie in A and the last k terms lie outside of A?
If $p$ were replaced by a modulus with a small prime factor, then taking $A$ to be a non-trivial subgroup of the cyclic group would give a counterexample. To see the relevance to the rest of this paper, observe that such a $2k$-term arithmetic progression would bi-induce a half-graph on $2k$ vertices. For example, if $x - (k-1)d, x - (k-2)d, \dots, x \in A$ and $x + d, \dots, x + kd \notin A$, then $x_i = x - id$ and $y_j = jd$ have the property that, for $1 \le i, j \le k$, $x_i + y_j \in A$ if and only if $j \le i$.
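The last observation can be verified mechanically on a small instance. In the sketch below the prime $p = 101$, the set $A = \{0, \dots, 49\}$ and the progression parameters are our own illustrative choices; the code builds the table of memberships $x_i + y_j \in A$ and compares it with the half-graph pattern $j \le i$.

```python
def half_graph_table(A, p, x, d, k):
    """Table M[i-1][j-1] = (x_i + y_j in A), with x_i = x - i*d and y_j = j*d mod p."""
    return [[((x - i * d) + j * d) % p in A for j in range(1, k + 1)]
            for i in range(1, k + 1)]

p, k, d, x = 101, 3, 1, 49
A = set(range(50))   # {0, 1, ..., 49} in Z/101Z

# The 2k-term progression x - (k-1)d, ..., x, x + d, ..., x + kd.
progression = [(x + t * d) % p for t in range(-(k - 1), k + 1)]
first_half_in_A = all(t in A for t in progression[:k])
second_half_outside = all(t not in A for t in progression[k:])

table = half_graph_table(A, p, x, d, k)
expected = [[j <= i for j in range(1, k + 1)] for i in range(1, k + 1)]
```

Here `table == expected` holds because $x_i + y_j = x + (j - i)d$ is a term of the progression, and it falls among the first $k$ terms exactly when $j \le i$.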