Discrepancy of High-Dimensional Permutations

Let $L$ be an order-$n$ Latin square. For $X, Y, Z \subseteq \{1, \ldots, n\}$, let $L(X, Y, Z)$ be the number of triples $i\in X, j\in Y, k\in Z$ such that $L(i,j) = k$. We conjecture that asymptotically almost every Latin square satisfies $\left|L(X, Y, Z) - \frac 1n |X||Y||Z|\right| \le O(\sqrt{|X||Y||Z|})$ for every $X, Y$ and $Z$. Let $\varepsilon(L)$ be the maximum of $|X||Y||Z|$ over all $X, Y, Z$ with $L(X, Y, Z)=0$. The above conjecture implies that $\varepsilon(L) \le O(n^2)$ holds asymptotically almost surely (this bound is obviously tight). We show that there exist Latin squares with $\varepsilon(L) \le O(n^2)$, and that $\varepsilon(L) \le O(n^2 \log^2 n)$ for almost every order-$n$ Latin square. On the other hand, we recall that $\varepsilon(L)\geq \Omega(n^{33/14})$ if $L$ is the multiplication table of an order-$n$ group. Some of these results extend to higher dimensions. Many open problems remain.


Introduction
The notion of discrepancy is central to all branches of discrete mathematics. Indeed, several books [11,4,2] have been dedicated to this subject. Roughly speaking, one asks how well finite sets can approximate a uniform measure. A bit more concretely, the problem is defined in terms of a collection $F$ of subsets in a probability space $(\Omega, \mu)$. We seek the minimum of $\sup_{X\in F} \left| \frac{|S\cap X|}{|S|} - \mu(X) \right|$ over all sets $S$ of given cardinality. Such questions and their many variants make sense and are interesting in numerous contexts. An important example from graph theory is the expander mixing lemma. Let $G = (V, E)$ be a $d$-regular $n$-vertex graph. This lemma asserts that if $G$ is an expander graph, then for every two subsets $A, B \subseteq V$ there holds $\left|e(A, B) - \frac dn |A||B|\right| = O(\sqrt{|A||B|})$, where $e(A, B)$ is the number of ordered pairs $(a, b)$ with $a \in A$, $b \in B$ and $ab \in E$. The unspecified constant in the big-oh term depends on $G$'s spectrum, but we do not elaborate on this point and refer the reader to the survey [6].
A considerable body of recent research is aimed at developing a theory of high-dimensional combinatorics. Many basic combinatorial constructs have interesting high-dimensional counterparts, and it is natural to study discrepancy phenomena in these frameworks. Specifically, we consider discrepancy in high-dimensional permutations. Let us briefly recall this concept [10]. We equate a (classical, i.e., one-dimensional) permutation with its permutation matrix, namely, an $n \times n$ array of zeros and ones where every row and every column contains exactly one 1. In analogy, a $d$-dimensional permutation $A$ is an $[n]^{d+1} = n \times n \times \cdots \times n$ array of zeros and ones such that for every index $1 \le i \le d+1$ and every choice of integers $\alpha_j \in [n]$ over $1 \le j \ne i \le d+1$ there is exactly one choice of $x \in [n]$ for which $A(\alpha_1, \ldots, \alpha_{i-1}, x, \alpha_{i+1}, \ldots, \alpha_{d+1}) = 1$. Note, in particular, that a two-dimensional permutation is synonymous with a Latin square.
Date: July 26, 2016. Supported by ERC grant 339096 High-Dimensional Combinatorics.
The class $F$ that defines our discrepancy problem consists of all boxes $T = T_1 \times \cdots \times T_{d+1}$ with $T_i \subseteq [n]$. The volume of this box is defined to be $\mathrm{vol}(T) := \prod_{i=1}^{d+1} |T_i|$. Our discrepancy problem is to find $d$-dimensional permutations $A$ such that for every box $T$ the count $A(T) := |\{\alpha \in T : A(\alpha) = 1\}|$ is close to $\mathrm{vol}(T)/n$. (Clearly this is what one would expect, since the density of 1 entries in a $d$-dimensional permutation is $1/n$.) We propose the following conjecture.
There are at least two reasons why we expect this to be true. Consider the following "poor man's analog" of a random Latin square. It is a random $n \times n \times n$ array of zeros and ones whose entries are chosen independently with the same distribution, where 1 is chosen with probability $1/n$. It is easily verified that this relation holds in that model. In addition, a $d$-dimensional permutation may be viewed as a $(d+1)$-partite $(d+1)$-uniform hypergraph, and we find the similarity with the expander mixing lemma rather compelling.
We say that $T$ is an empty box in $A$ if $A(T) = 0$, and denote by $\varepsilon(A)$ the maximal volume of an empty box in $A$. One consequence of the above conjecture is that there are $d$-dimensional permutations $A$ such that $\varepsilon(A) = O(n^2)$. On the other hand, it is easy to see that $\varepsilon(A) = \Omega(n^2)$ for every $d$-dimensional permutation, since every (classical) permutation matrix contains a $\lfloor n/2 \rfloor \times \lfloor n/2 \rfloor$ block of zeros. Indeed, let $A$ be an arbitrary $d$-dimensional permutation. Pick some $T_2 \subseteq [n]$ of cardinality $\lfloor n/2 \rfloor$ and some $t_3, \ldots, t_{d+1} \in [n]$, and let $T_3 = \{t_3\}, \ldots, T_{d+1} = \{t_{d+1}\}$. We can find a subset $T_1 \subseteq [n]$ of cardinality $\lfloor n/2 \rfloor$ for which $T = T_1 \times \cdots \times T_{d+1} \subseteq [n]^{d+1}$ is an empty box in $A$. Indeed, for every $t \in T_2$, there is exactly one $x \in [n]$ for which $A(x, t, t_3, \ldots, t_{d+1}) = 1$, and clearly $x$ cannot belong to $T_1$. But altogether only $\lfloor n/2 \rfloor$ elements are ruled out from being in $T_1$, one for each element of $T_2$, so that at least $\lfloor n/2 \rfloor$ are still acceptable and the claim follows.
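The argument above is constructive, and a minimal Python sketch of it for the classical case (a single permutation matrix, i.e., one two-dimensional layer of $A$) may help. The function name `empty_block` and the encoding of the permutation as a list are illustrative assumptions, not notation from the paper.

```python
def empty_block(perm):
    """Find a floor(n/2) x floor(n/2) all-zero block in a permutation matrix.

    Assumed encoding: perm[j] is the row holding the unique 1 of column j.
    Returns (rows, cols) such that no 1 lies in rows x cols.
    """
    n = len(perm)
    half = n // 2
    cols = list(range(half))                  # T_2: any floor(n/2) columns
    forbidden = {perm[j] for j in cols}       # one row ruled out per column
    # At least n - half >= half rows survive; keep the first `half` of them.
    rows = [i for i in range(n) if i not in forbidden][:half]
    return rows, cols
```

For a $d$-dimensional permutation one would apply the same step to the layer obtained by fixing the coordinates $t_3, \ldots, t_{d+1}$.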
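We prove the following theorems in this spirit for 2-dimensional permutations, i.e., for Latin squares.
Theorem 1.2. Asymptotically almost every order-$n$ Latin square $L$ satisfies $\varepsilon(L) = O(n^2 \log^2 n)$.
Theorem 1.3. For infinitely many $n$ there exist order-$n$ Latin squares $L$ with $\varepsilon(L) = O(n^2)$.
We tend to believe the following statement, which subsumes both theorems:
Conjecture 1.4. Asymptotically almost every order-$n$ Latin square $L$ satisfies $\varepsilon(L) = O(n^2)$.
Moreover, it is conceivable that the discrepancy condition of Conjecture 1.1 holds for asymptotically almost every $d$-dimensional permutation.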
It is easy to see that the multiplication table of a finite group is a Latin square, and problems that we consider here have been previously addressed in the group theory literature. Babai and Sós [1] defined a subset $S \subset \Gamma$ of a finite group to be product-free if there are no three elements $x, y, z \in S$ with $xy = z$. Note that in our language this means that $S \times S \times S$ is an empty box in the Latin square $L$ corresponding to $\Gamma$. Using the classification of finite simple groups, Babai and Sós showed that every finite group contains large product-free sets. Subsequently, Kedlaya [7] improved their bound. His result implies:
Theorem 1.5 (Kedlaya). If $L$ is a Latin square that is the multiplication table of an order-$n$ group, then $\varepsilon(L) \ge cn^{33/14}$ for some fixed $c > 0$.
On the other hand, Gowers [5] has exhibited order-$n$ groups for which $\varepsilon(L) \le Cn^{8/3}$ for some fixed $C > 0$.
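To make the link between product-free sets and empty boxes concrete, here is a small Python sketch for the cyclic group $\mathbb{Z}_n$ (written additively, so "product-free" means sum-free). This is only an illustration in the easiest abelian case and is not the construction of Babai and Sós or of Kedlaya; the helper names are assumptions.

```python
def cyclic_latin_square(n):
    """Addition table of Z_n, viewed as an order-n Latin square."""
    return [[(i + j) % n for j in range(n)] for i in range(n)]

def middle_third(n):
    """Elements x of Z_n with n/3 <= x < 2n/3; this set is sum-free mod n."""
    return [x for x in range(n) if n <= 3 * x < 2 * n]

def is_empty_box(L, X, Y, Z):
    """True if no i in X, j in Y satisfy L[i][j] in Z, i.e. L(X, Y, Z) = 0."""
    targets = set(Z)
    return all(L[i][j] not in targets for i in X for j in Y)
```

With `S = middle_third(n)`, the cube `S x S x S` is empty in `cyclic_latin_square(n)`, so for cyclic groups the empty box has volume of order $n^3$, far above the general lower bound of Theorem 1.5.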
These results show that a typical Latin square has substantially lower discrepancy than any group of the same order.
A cube is a box $A \times B \times C$ with $|A| = |B| = |C|$. It is easy to see that every order-$n$ Latin square has an empty cube of side $\lfloor (n + 1/4)^{1/2} - 1/2 \rfloor$, and we can show the following.
Theorem 1.6. There exist infinitely many order-$n$ Latin squares $L$ in which every empty cube has side $O((n \log n)^{1/2})$.
As mentioned, Kedlaya finds an empty cube of side $\Omega(n^{11/14})$ in the Latin square of every order-$n$ group.
Again, analogs in general dimension suggest themselves, as we discuss in Section 5. The proof of Theorem 1.2 is based on our earlier work [10] in which we derived an upper bound on the number of $d$-dimensional permutations. The proofs of Theorems 1.3 and 1.6 are based on ideas developed by P. Keevash in his recent breakthrough work on the theory of combinatorial designs. He considers in [9] a random greedy process in which a set system evolves as sets are added to it in sequence. As he shows, with high probability the partial design that is obtained this way can be completed to a bona fide design.

Proof of theorem 1.2
This result follows from an upper bound on high-dimensional permanents proved in [10]. Recall that the support of an $r$-dimensional array $X$ is $\mathrm{Supp}(X) := \{\alpha : X(\alpha) \ne 0\}$. Define the permanent $\mathrm{Per}(A)$ of a $(d+1)$-dimensional 0-1 array $A$ to be the number of $d$-dimensional permutations whose support is contained in $\mathrm{Supp}(A)$, and let $r_{i_1,\ldots,i_d}$ be the number of ones in the line $A(i_1, \ldots, i_d, \cdot)$, i.e., the number of integers $x \in [n]$ for which $A(i_1, \ldots, i_d, x) = 1$. Then the bound of [10], which we denote by (1), controls $\mathrm{Per}(A)$ in terms of the numbers $r_{i_1,\ldots,i_d}$.
We denote the number of order-$n$ Latin squares by $L(n)$. Fix sets $X, Y, Z \subseteq [n]$ and let $B$ denote the $n \times n \times n$ 0-1 array which is 0 in the box $X \times Y \times Z$ and 1 elsewhere. The probability that $X \times Y \times Z$ is an empty box of an order-$n$ Latin square chosen uniformly at random is $\mathrm{Per}(B)/L(n)$. A counting argument due to van Lint and Wilson [13] shows that $L(n) = \left((1 + o(1))\, n/e^2\right)^{n^2}$, and so we obtain the following upper bound by applying (1) to $B$:
$\mathrm{Per}(B)/L(n) \le e^{O(n \log^2 n)}\, e^{-|X||Y||Z|/n}$.
Next, we apply the union bound over all boxes whose volume is at least $M n^2 \log^2 n$ for a large constant $M$ whose value will be chosen later. There are at most $(2^n)^3$ ways to choose $X$, $Y$ and $Z$, and so if $L$ is an order-$n$ Latin square chosen uniformly at random, we have
$\Pr(\varepsilon(L) \ge M n^2 \log^2 n) \le 2^{3n} \cdot e^{O(n \log^2 n)}\, e^{-M n \log^2 n}$.
Therefore, for any constant M that is larger than the constant in the big-oh term, we obtain a vanishingly small probability.
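As a quick numerical sanity check of this last step, the following Python sketch evaluates the natural logarithm of the union bound $2^{3n} e^{C n \log^2 n} e^{-M n \log^2 n}$. The constant `C` stands in for the unspecified constant hidden in the big-oh term; its value below is a hypothetical placeholder, not one derived in the paper.

```python
import math

def log_union_bound(n, M, C):
    """Natural log of 2^(3n) * exp(C n log^2 n) * exp(-M n log^2 n).

    C is a hypothetical stand-in for the constant in the O(n log^2 n) term.
    """
    log_sq = math.log(n) ** 2
    return 3 * n * math.log(2) + (C - M) * n * log_sq
```

Whenever `M > C`, the term `(C - M) * n * log_sq` dominates `3 * n * log(2)` for large `n`, so the returned value tends to minus infinity and the probability bound tends to zero.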

Proof of theorem 1.3
Here we use an insight from Keevash's breakthrough papers [8,9] on the existence and asymptotic enumeration of designs. We consider his construction for the specific case of Steiner triple systems. The first part of the algorithm involves a random greedy strategy which is stopped when all but a vanishingly small fraction of the vertex pairs are covered by triples. The crux of his proof is that, with high probability, the resulting set of uncovered triples can be completed to a Steiner triple system.
An analogous result is most likely also true for the random construction of Latin squares. However, to simplify matters, we use Keevash's results on Steiner triple systems and adapt them to our needs. Note that every order-$n$ Steiner triple system $X$ yields a (symmetric) order-$n$ Latin square $L$ as follows: $L(i, j) = k \Leftrightarrow \{i, j, k\} \in X$, and $L(i, i) = i$ for all $i \in [n]$. We define an empty box in $X$ to be a triple of sets $A, B, C \subseteq [n]$ such that $\{i, j, k\} \notin X$ for every $i \in A$, $j \in B$, $k \in C$. We say that this box has volume $|A||B||C|$, and denote the largest volume of an empty box in $X$ by $\phi(X)$. Since an empty box in $L$ is also an empty box in $X$, we have $\varepsilon(L) \le \phi(X)$.
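The passage from a Steiner triple system to a Latin square can be sketched in Python as follows. The Fano plane on the points 0, ..., 6 is used as a small example; the function names and the 0-based labelling are illustrative assumptions.

```python
# The Fano plane: the Steiner triple system of order 7 (points 0..6).
FANO = [{0, 1, 2}, {0, 3, 4}, {0, 5, 6}, {1, 3, 5},
        {1, 4, 6}, {2, 3, 6}, {2, 4, 5}]

def sts_to_latin_square(n, triples):
    """Build L with L[i][i] = i and L[i][j] = k whenever {i, j, k} is a triple."""
    L = [[None] * n for _ in range(n)]
    for i in range(n):
        L[i][i] = i
    for t in triples:
        for i in t:
            for j in t:
                if i != j:
                    (k,) = set(t) - {i, j}   # the third point of the triple
                    L[i][j] = k
    return L

def is_latin_square(L):
    """Check that every row and every column is a permutation of 0..n-1."""
    n = len(L)
    full = set(range(n))
    rows_ok = all(set(row) == full for row in L)
    cols_ok = all({L[i][j] for i in range(n)} == full for j in range(n))
    return rows_ok and cols_ok
```

Since every pair of points lies in exactly one triple, each off-diagonal entry is set exactly once, and the resulting square is symmetric.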
Steiner triple systems constructed using Keevash's method tend to have no large empty boxes:
Proposition 3.1. There is a constant $M$ such that, for every large enough $n$ with $n \equiv 1$ or $3 \pmod 6$, Keevash's algorithm asymptotically almost surely constructs an order-$n$ Steiner triple system $X$ with $\phi(X) \le M n^2$.
This proposition implies that for such $n$ there exist order-$n$ Latin squares $L$ with $\varepsilon(L) \le M n^2$.

Proof of the proposition:
In view of the way in which Keevash's construction proceeds, it suffices to show that at the end of the random greedy process there remain no large empty boxes. Since that process is monotone and triples only get added, it suffices to show that after a small fraction of this stage is completed, no large box remains empty. Recall that at each step of the process a triple is chosen at random from among the legal triples, i.e., those that have at most one vertex in common with every previously selected triple.
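For intuition, here is a minimal Python sketch of a plain random greedy triple process of the kind just described: at each step a uniformly random legal triple is added, where a triple is legal if it shares at most one vertex with every previously chosen triple. This is a simplified stand-in and not Keevash's full algorithm, which restricts the candidate triples and adds a completion phase; the function name and parameters are assumptions.

```python
import random
from itertools import combinations

def random_greedy_triples(n, steps, seed=0):
    """Run `steps` rounds of the random greedy triple process on n vertices.

    Brute force: the list of legal triples is recomputed at every step,
    so this is only meant for small n.
    """
    rng = random.Random(seed)
    covered = set()   # pairs already contained in some chosen triple
    chosen = []
    for _ in range(steps):
        # A triple is legal iff none of its three pairs is already covered.
        legal = [t for t in combinations(range(n), 3)
                 if not any(p in covered for p in combinations(t, 2))]
        if not legal:
            break
        t = rng.choice(legal)
        chosen.append(t)
        covered.update(combinations(t, 2))
    return chosen
```

The process is monotone in the sense used above: `chosen` only grows, so a box that contains a chosen triple at some step is never empty afterwards.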
Given $A, B, C \subseteq [n]$, an $ABC$ triple is a triple that meets $A$, $B$ and $C$. An $AB\bar{C}$ triple meets $A$ and $B$, but does not meet $C$, etc. Clearly, the set $F$ of all $ABC$ triples satisfies $\frac{1-o(1)}{6} |A||B||C| \le |F| \le |A||B||C|$. Let $\lambda > 0$ be a constant whose value will be chosen later. We refer to the initial $\lambda n^2$ steps of the random greedy process as the first stage, and prove that if $|A||B||C| \ge M n^2$, then it is very unlikely that no triple in $F$ is selected during the first stage. There are $8^n$ choices for $A, B, C$, so if for every choice of $A, B, C$ this statement fails with probability $o(8^{-n})$, our claim is established.
Indeed, this sounds plausible. Since $|F| / \binom{n}{3} \ge (1 - o(1))(M/n)$, the probability that during $\lambda n^2$ steps we never select a triple from $F$ ought to be exponentially small in $n$. However, this heuristic argument ignores the fact that triples in $F$ may become illegal during the process even if we never select a triple from $F$. Thus, the choice of an $AB\bar{C}$ triple may invalidate as many as $3|C|$ triples in $F$. We need to show that whp not too many such choices are made.
We will show that
(2) Whp, at the end of the first stage, at least $|F|/2$ triples in $F$ remain legal.
Consequently, the probability that during the first stage we select no member of $F$ is at most $\left(1 - \frac{|F|/2}{\binom{n}{3}}\right)^{\lambda n^2} \le e^{-(1+o(1))\, 3\lambda |F|/n}$.
To prove Statement (2) we show first that whp during the first stage at most $|A||B|/108$ triples of type $AB\bar{C}$ get chosen. Thus the chosen $AB\bar{C}$ triples invalidate at most $3|C| \cdot |A||B|/108 \le |F|/6$ triples of $F$. Together with the analogous contribution of types $A\bar{B}C$ and $\bar{A}BC$, at most half of the triples in $F$ get invalidated, and so at least half of them remain legal.
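There are at most $|A||B|n$ triples of type $AB\bar{C}$, and each chosen triple invalidates at most $3n$ triples. If $X$ is the number of type $AB\bar{C}$ triples that we sample during the first stage, then $X = \sum_{i=1}^{\lambda n^2} X_i$, where $X_i$ is the indicator random variable of the event that the $i$-th chosen triple is of type $AB\bar{C}$. Therefore, at every step of the first stage at least $\binom{n}{3} - 3\lambda n^3$ triples are legal.
We recall the following generalization of Chernoff's inequality (see Theorem 3.4 in [12]). Namely, if $Y$ is the sum of $N$ Bernoulli random variables $Y_1, \ldots, Y_N$ and for every subset $S \subseteq [N]$ we have $\Pr(\bigwedge_{i \in S} Y_i = 1) \le p^{|S|}$ for some $0 < p < 1$, then for every $\delta > 0$ the upper tail $\Pr(Y \ge (1+\delta)Np)$ obeys a Chernoff-type bound. Let $q := |A||B|n \big/ \left(\binom{n}{3} - 3\lambda n^3\right)$. The probability that $X_i = 1$ conditioned on the values of previous variables is always at most $q$, and so $\Pr(\bigwedge_{i \in S} X_i = 1) \le q^{|S|}$ for every $S \subseteq [\lambda n^2]$. Moreover, for small enough $\lambda$ the threshold $|A||B|/108$ exceeds $\lambda n^2 q$ by a constant factor, and so the inequality yields a bound on $\Pr(X \ge |A||B|/108)$ that is exponentially small in $|A||B|$.
Also, $|A||B| \ge |F|/n$, so if we take $\lambda = 1/1500$ this bound is exponentially small in $|F|/n$. Since $|F| \ge M n^2/6$, if we take $M = 9000$ the failure probability is $o(8^{-n})$, as required. It should be easy to substantially improve the estimate of $M$, but we do not do it here, since we have no specific guess as to the best attainable bound.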

Proof of theorem 1.6
Fix $s \ge 100 \sqrt{n \log n}$. We show that whp the Latin square constructed using Proposition 3.1 has no empty cube of side length $s$.
Indeed, in the spirit of the above proof, we give an upper bound on the probability that a fixed triple $A, B, C \subseteq [n]$ with $|A| = |B| = |C| = s$ is an empty box in a Keevash Steiner triple system. The bound that we get is the probability that statement (2) fails plus the probability that $A \times B \times C$ is empty given that statement (2) holds, which is at most $2e^{-(1+o(1))|F|/500n} \le 2e^{-s^3/3000n}$. Applying the union bound, the probability that there is such a box is at most $\binom{n}{s}^3 \cdot 2e^{-s^3/3000n} \le 2 \exp\left(s\left(3 \log n - s^2/3000n\right)\right)$.
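This tends to zero for $s \ge 100 \sqrt{n \log n}$, since then $s^2/3000n \ge \frac{10}{3} \log n > 3 \log n$ and the exponent is at most $-\frac{1}{3} s \log n$.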

Open questions and concluding remarks
Let us recall some of the open questions and aims raised above. There are some obvious implications among them, as the reader can easily see.
(1) Can one find explicit constructions of high-dimensional permutations with good discrepancy properties? Kedlaya's theorem, mentioned above, suggests that substantial new ideas will be required to accomplish this.
(2) Prove that $\varepsilon(L) = O(n^2)$ for almost every order-$n$ Latin square.
has side $\tilde{O}(n^{1/d})$? Here $\tilde{O}$ refers to an unspecified polylog term, but perhaps this is even true with $O(n^{1/d})$, which would clearly be tight.
We are presently unable to extend Theorem 1.2 to dimensions $d \ge 3$, since the available bounds on the number of $d$-dimensional permutations are not tight enough. Note that this would require very accurate estimates, which seem out of reach with current methods. In this context we recall our conjectured lower bound on the number of $d$-dimensional permutations [10]. It is conceivable that the machinery in [8] may be useful in this pursuit.
The prospect of extending Theorem 1.3 to higher dimensions seems more hopeful. In our proof we show that Keevash's triple systems contain no large empty boxes. This yields the same property for the Latin squares representing these Steiner systems. The proof of Proposition 3.1 goes through for $(n, d+1, d)$-Steiner systems in general, and for $d = 3$ it is even possible to associate 3-dimensional permutations to such Steiner systems. Namely, to an $(n, 4, 3)$-Steiner system $X$ we associate the 3-dimensional permutation $A$ given by $A(i, j, k, l) = 1$ if $\{i, j, k, l\} \in X$, and $A(i, i, j, j) = 1$ for all $i, j \in [n]$. Therefore Theorem 1.3 holds in dimension 3 as well. However, in dimensions $d \ge 4$ there seems to be no obvious way of associating Steiner systems with permutations, and so a different approach is needed. It is natural to try to adapt Keevash's method to the construction of high-dimensional permutations, i.e., to analyze the random greedy algorithm in this setting.
We note that the Latin squares constructed in the proof of Theorem 1.3 have a large discrepancy, due to overly dense boxes that they contain. Keevash's construction associates each vertex $v \in V$ with an element $a_v \in \mathbb{F}_{2^a}$, where $2^{a-2} \le n \le 2^{a-1}$. He then considers triples $x, y, z \in V$ such that $a_x + a_y + a_z = 0$ in $\mathbb{F}_{2^a}$. Such a triple, if it remains legal at the end of the greedy process, gets added to the Steiner triple system. But the additive group of $\mathbb{F}_{2^a}$ has many subgroups. If we take $X = Y = Z$ to be the members of a subgroup, we obtain a collection of vertices with many triples. From the perspective of Latin squares, this is an overly dense box.
It would be interesting to find an explicit construction of Latin squares without large empty boxes. However, most of the known explicit constructions of Latin squares come from groups, and Kedlaya's theorem implies that the multiplication tables of groups always have large empty boxes. This indicates that new ideas are needed here.
Our main conjecture can be viewed as a special case of a much broader problem, which we state in terms of Latin squares, but extensions to permutations of arbitrary dimensions suggest themselves as well. Consider an order-$n$ Latin square $A$, a subset $S \subseteq [n]$ and an index $1 \le t \le 3$, say $t = 2$. The corresponding section of $A$ is a bipartite graph $\Sigma_{S,t} = (U, V, E)$ on the vertex set $U \cup V$, where $U = V = [n]$ and $ij \in E$ iff there is an $x \in S$ for which $A(i, x, j) = 1$. We call this a $k$-section, where $|S| = k$. Now pick a parameter of interest $f = f(G)$ that is defined for $k$-regular bipartite graphs $G$ each part of which has $n$ vertices, and let $F(k, n)$ be the optimum of $f$ over all such graphs.
Problem: Do there exist Latin squares such that $f(\Sigma_{S,t}) = (1 + o(1)) F(k, n)$ for every $k$-section of $A$? For which graph parameters does this hold for almost every Latin square? For the function $f(G) = \max_{A \subseteq U, B \subseteq V} \left| |E(A, B)| - \frac{k}{n} |A||B| \right|$ we recover our discrepancy conjecture for Latin squares. Many other functions and problems suggest themselves, e.g., minimizing $f(G)$, the largest nontrivial eigenvalue of $G$.
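The $k$-section for $t = 2$ and the discrepancy parameter $f$ can be computed by brute force for very small orders, as in the Python sketch below. It assumes the Latin square is given as a 2D list `L` with `L[i][x] = j` encoding $A(i, x, j) = 1$; the function names are illustrative, and the running time of `discrepancy` is exponential in $n$.

```python
from itertools import combinations

def k_section(L, S):
    """Edges (i, j) of the section for t = 2: some x in S has L[i][x] == j."""
    return {(i, L[i][x]) for i in range(len(L)) for x in S}

def degrees(edges, n):
    """Left and right degree sequences of the bipartite graph."""
    left, right = [0] * n, [0] * n
    for i, j in edges:
        left[i] += 1
        right[j] += 1
    return left, right

def discrepancy(edges, n, k):
    """Brute-force max over A, B of | |E(A,B)| - k|A||B|/n | (tiny n only)."""
    subsets = [c for r in range(n + 1) for c in combinations(range(n), r)]
    best = 0.0
    for A in subsets:
        for B in subsets:
            e = sum((a, b) in edges for a in A for b in B)
            best = max(best, abs(e - k * len(A) * len(B) / n))
    return best
```

Because every row and column of a Latin square is a permutation, the graph returned by `k_section` is $k$-regular on both sides with $k = |S|$, which is what makes $F(k, n)$ the natural benchmark.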