The analytic rank of tensors and its applications

The analytic rank of a tensor, introduced by Gowers and Wolf in the context of higher-order Fourier analysis, is minus the logarithm of the bias of the tensor. We prove that it is a subadditive measure of rank, namely that the analytic rank of the sum of two tensors is at most the sum of their individual analytic ranks. This analytic property turns out to have surprising applications: (i) common roots of tensors are always positively correlated; and (ii) the slice rank and partition rank, which were defined recently in the resolution of the cap-set problem in Ramsey theory, can be replaced by the analytic rank.


Introduction
The main objects of study in this paper are tensors, or equivalently multilinear forms. Tensors have intimate relations to central problems in computer science and combinatorics. The complexity of matrix multiplication is captured by the rank of the matrix multiplication tensor, see e.g. the survey [Blä13]. In arithmetic complexity, proving super-linear lower bounds for tensors is related to proving lower bounds for arithmetic circuits and formulas, see for example [CKSV16] and the citations within. More relevant to the topic of the current paper, recently defined notions of rank were instrumental in the resolution of the cap-set problem in additive combinatorics [CLP17, EG17, Tao16], which is in itself intimately related to the problem of matrix multiplication [ASU13, BCC+16].
The standard notion of tensor rank, as well as the more recent notions (called slice rank and partition rank, which will be formally defined shortly), are inherently combinatorial notions of rank. The focus of this paper is on an analytic notion of rank, which was first defined by Gowers and Wolf [GW11] in the context of higher-order Fourier analysis. The purpose is to (i) explore the power of this new notion of rank and (ii) connect it to the more well-studied combinatorial notions. Our main results can be informally stated as follows.
Theorem 1.1 (Main results, informal). For any tensor, the analytic rank lower bounds all the previously known combinatorial notions of rank (standard one, slice rank and partition rank). Moreover, it can replace the role of the slice rank or partition rank in applications in Ramsey theory in product spaces.
The main reason why we find this theorem interesting is that the current techniques used to resolve the cap-set problem do not seem to extend to more general problems of a similar flavour, except for very few cases. In general, the problems studied are Ramsey-type problems in product spaces, where the goal is to upper bound the maximal size of a set without a particular sub-structure. We give two examples that illustrate this.
Example 1.2 (k-AP free sets). Consider $\mathbb{F}_p^n$ for a fixed prime $p$ (say $p = 5$) and $n$ large. A $k$-AP (length-$k$ arithmetic progression) in $\mathbb{F}_p^n$ is a set of the form $x, x+d, x+2d, \ldots, x+(k-1)d$ with $d \neq 0$. The cap-set problem asks for the maximal size of a set $A \subset \mathbb{F}_p^n$ which is 3-AP free. Ellenberg and Gijswijt [EG17] proved that the answer is $O(c^n)$ for some $c < p$; that is, $A$ has to be exponentially small. Previously, the best known bound was $O(p^n/n^{1+\varepsilon})$ for some $\varepsilon > 0$ [BK12]. However, these new techniques seem unable to bound the size of 4-AP free sets, where the best known bound is $O(p^n/n^{\varepsilon})$ for some $\varepsilon > 0$ [GT09b].
Example 1.3 (Erdös-Szemerédi sunflower free sets). Consider families of subsets of $[n]$, identified with sets $A \subset \{0,1\}^n$. [NS17] used techniques similar to those used to resolve the cap-set problem to prove that the largest 3-sunflower free set $A \subset \{0,1\}^n$ has size $O(c^n)$ for some $c < 2$. Again, $A$ is exponentially small. However, the techniques seem to fail to extend to bound the size of 4-sunflower free sets, where the best bounds are $2^{n-O(\sqrt{n})}$ (the bound follows from the Erdös-Rado sunflower theorem [ER60] via standard reductions).
Thus, we see that while the new tensor-based techniques achieve amazing success on some problems, they are not robust in the sense that they do not generalize easily. One of the results of this paper is that the analytic rank can replace the role played by the slice rank or partition rank in the current proofs, and in fact it is always a lower bound for these latter ranks. This suggests an alternative approach to using tensor-rank based techniques to study these Ramsey problems.

Tensors and tensor ranks
We start by giving a formal definition of tensors and tensor ranks.
Tensors. Let $\mathbb{F}$ be a field, $V$ a finite-dimensional linear space over $\mathbb{F}$, and $d \ge 1$. An order-$d$ tensor (also called a $d$-linear form) is a multilinear map $T : V^d \to \mathbb{F}$. Equivalently, if $V$ is $n$-dimensional, then we can identify $V \cong \mathbb{F}^n$, in which case
$$T(x^1, \ldots, x^d) = \sum_{i_1, \ldots, i_d \in [n]} T_{i_1, \ldots, i_d}\, x^1_{i_1} \cdots x^d_{i_d}.$$
Here we use the convention $[n] = \{1, \ldots, n\}$ and $x^i = (x^i_1, \ldots, x^i_n) \in \mathbb{F}^n$ for $i = 1, \ldots, d$. The tensor $T$ is identified with the $d$-dimensional array of its coefficients $(T_{i_1, \ldots, i_d} : i_1, \ldots, i_d \in [n])$.
Tensor ranks. There are several "combinatorial" notions of tensor rank studied in the literature. They all have the following form: the rank of $T$ is the minimal $r \ge 1$ such that $T$ can be factored as the sum of $r$ rank one tensors. The only difference is what is considered to be a "rank one tensor".
The most common definition, which is usually simply called "rank", is that $T$ is rank one if it can be factored as
$$T(x^1, \ldots, x^d) = T_1(x^1)\, T_2(x^2) \cdots T_d(x^d),$$
where each $T_i$ is an order-1 tensor (namely, a linear function). Recently, in the study of the cap-set problem and follow-up works, two other definitions were introduced. A tensor $T$ has "slice rank one" [CLP17, EG17, Tao16] if it can be factored as
$$T(x^1, \ldots, x^d) = T_1(x^i)\, T_2(x^j : j \in [d] \setminus \{i\}),$$
where $i \in [d]$, $T_1$ is an order-1 tensor and $T_2$ is an order-$(d-1)$ tensor. A tensor $T$ has "partition rank one" [Nas17] if it can be factored as
$$T(x^1, \ldots, x^d) = T_1(x^i : i \in A)\, T_2(x^j : j \in [d] \setminus A),$$
where $A \subset [d]$ is a set which satisfies $1 \le |A| \le d-1$, $T_1$ is an order-$|A|$ tensor, and $T_2$ is an order-$(d-|A|)$ tensor. Let us denote the rank, slice rank, and partition rank of a tensor $T$ by $\mathrm{rank}(T)$, $\mathrm{srank}(T)$, $\mathrm{prank}(T)$, respectively. Then since rank one tensors are also slice rank one tensors, and slice rank one tensors are also partition rank one tensors, we have:
$$\mathrm{prank}(T) \le \mathrm{srank}(T) \le \mathrm{rank}(T).$$

The analytic rank
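A different notion of rank was introduced by Gowers and Wolf [GW11] in the context of higher-order Fourier analysis. Let $\mathbb{F} = \mathbb{F}_p$ be a prime finite field, and let $\omega_p = \exp(2\pi i/p)$ be a primitive $p$-th root of unity. Let $T : V^d \to \mathbb{F}$ be an order-$d$ tensor. The bias of $T$ is defined as
$$\mathrm{bias}(T) := \mathbb{E}_{x^1, \ldots, x^d \in V}\left[\omega_p^{T(x^1, \ldots, x^d)}\right].$$
The bias of $T$ is always real and in $(0, 1]$. To see that, define $T(\cdot, x^2, \ldots, x^d)$ to be the order-1 tensor on $x^1$ given a fixing of $x^2, \ldots, x^d$. Then
$$\mathrm{bias}(T) = \mathbb{E}_{x^2, \ldots, x^d \in V}\left[\mathbb{E}_{x^1 \in V}\, \omega_p^{T(x^1, x^2, \ldots, x^d)}\right] = \Pr_{x^2, \ldots, x^d \in V}\left[T(\cdot, x^2, \ldots, x^d) \equiv 0\right]. \qquad (1)$$
This is since for an order-1 tensor (namely, a linear form), its bias is 1 if it is identically zero, and is 0 otherwise. The analytic rank of $T$ is defined to be $\mathrm{arank}(T) := -\log_{|\mathbb{F}|} \mathrm{bias}(T)$.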
As bias(T ) ∈ (0, 1] we have that arank(T ) ≥ 0. The following example might help shed some light on the definition. It shows that in the case of order-2 tensors (i.e. bilinear forms, corresponding to matrices), the analytic rank is equivalent to the standard notion of rank.
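Example 1.4. Consider the order-2 tensor $T : \mathbb{F}^n \times \mathbb{F}^n \to \mathbb{F}$ defined as $T(x, y) = \sum_{i=1}^{r} x_i y_i$ for some $r \le n$. Then, by Equation (1),
$$\mathrm{bias}(T) = \Pr_{y \in \mathbb{F}^n}[y_1 = \cdots = y_r = 0] = |\mathbb{F}|^{-r}.$$
Hence $\mathrm{arank}(T) = r$, which coincides with the usual notion of rank for bilinear forms.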
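The definition of bias and analytic rank can be checked numerically on small cases. The following is a minimal sketch, assuming a tensor is represented as a Python callable on a tuple of $d$ vectors over $\mathbb{F}_p$; the helper names `bias`, `arank` and `bilinear` are illustrative and not from the paper. It evaluates the exponential sum by brute force for the bilinear form $\sum_{i \le r} x_i y_i$.

```python
import cmath
import itertools
import math


def bias(T, p, n, d):
    """Brute-force average of omega_p^T(x^1..x^d) over all inputs in (F_p^n)^d."""
    omega = cmath.exp(2j * cmath.pi / p)
    vectors = list(itertools.product(range(p), repeat=n))
    total = sum(omega ** (T(xs) % p) for xs in itertools.product(vectors, repeat=d))
    # The bias is real, so the imaginary part is only floating-point noise.
    return (total / len(vectors) ** d).real


def arank(T, p, n, d):
    """Analytic rank: minus the base-|F| logarithm of the bias."""
    return -math.log(bias(T, p, n, d), p)


def bilinear(r):
    """Order-2 tensor T(x, y) = x_1 y_1 + ... + x_r y_r (hypothetical helper)."""
    return lambda xs: sum(xs[0][i] * xs[1][i] for i in range(r))


p, n = 3, 3  # small parameters chosen only to keep the enumeration cheap
ranks = [arank(bilinear(r), p, n, 2) for r in range(n + 1)]
print([round(v, 6) for v in ranks])
```

Up to floating-point error, the computed analytic ranks should agree with $r$ for each $r = 0, \ldots, n$. The enumeration has $p^{nd}$ terms, so this is only practical for very small parameters.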
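Gowers and Wolf [GW11] proved that the analytic rank is approximately sub-additive, in the sense that $\mathrm{arank}(T + S) \le 2^d\,(\mathrm{arank}(T) + \mathrm{arank}(S))$. We show that the analytic rank is in fact sub-additive.
Theorem 1.5. Let $T, S : V^d \to \mathbb{F}$ be order-$d$ tensors. Then $\mathrm{arank}(T + S) \le \mathrm{arank}(T) + \mathrm{arank}(S)$.
The fact that we do not lose any constant factor is crucial in the applications.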
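Sub-additivity of the analytic rank is equivalent to the inequality $\mathrm{bias}(T+S) \ge \mathrm{bias}(T)\,\mathrm{bias}(S)$, which can be sanity-checked on random small instances. The sketch below is an illustration only, not part of the proof: it assumes order-3 tensors given as nested coefficient lists over $\mathbb{F}_3$, computes the bias exactly via the zero-form probability, and uses hypothetical helper names (`bias3`, `random_tensor`, `add`).

```python
import itertools
import random
from fractions import Fraction


def bias3(C, p, n):
    """Exact bias of an order-3 tensor with coefficient array C over F_p.

    Uses bias(T) = Pr_{y,z}[T(., y, z) is the zero linear form].
    """
    vectors = list(itertools.product(range(p), repeat=n))
    zero_forms = 0
    for y, z in itertools.product(vectors, repeat=2):
        coeffs = [
            sum(C[i][j][k] * y[j] * z[k] for j in range(n) for k in range(n)) % p
            for i in range(n)
        ]
        zero_forms += all(c == 0 for c in coeffs)
    return Fraction(zero_forms, len(vectors) ** 2)


def random_tensor(p, n, rng):
    """Uniformly random coefficient array in F_p^{n x n x n}."""
    return [[[rng.randrange(p) for _ in range(n)] for _ in range(n)] for _ in range(n)]


def add(T, S, p, n):
    """Coefficient-wise sum of two order-3 tensors modulo p."""
    return [[[(T[i][j][k] + S[i][j][k]) % p for k in range(n)]
             for j in range(n)] for i in range(n)]


rng = random.Random(0)  # fixed seed, an arbitrary choice for reproducibility
p, n = 3, 2
violations = 0
for _ in range(100):
    T, S = random_tensor(p, n, rng), random_tensor(p, n, rng)
    if bias3(add(T, S, p, n), p, n) < bias3(T, p, n) * bias3(S, p, n):
        violations += 1
print("violations of bias(T+S) >= bias(T) bias(S):", violations)
```

Exact rational arithmetic is used so that the comparison is not affected by rounding.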

Applications
Theorem 1.5 has some surprising applications, which we describe next.
Common roots of tensors are positively correlated. We show that the common roots of order-d tensors on a common input are always positively correlated.
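Claim 1.6. Let $T_1, \ldots, T_m, S_1, \ldots, S_n : V^d \to \mathbb{F}$ be order-$d$ tensors. Then, for a uniformly chosen $x = (x^1, \ldots, x^d) \in V^d$,
$$\Pr_x[T_1(x) = \cdots = T_m(x) = S_1(x) = \cdots = S_n(x) = 0] \ge \Pr_x[T_1(x) = \cdots = T_m(x) = 0] \cdot \Pr_x[S_1(x) = \cdots = S_n(x) = 0]. \qquad (2)$$
Proof. Define two order-$(d+1)$ tensors as follows: for $y \in \mathbb{F}^{m+n}$,
$$T(x, y) = \sum_{i=1}^{m} T_i(x)\, y_i, \qquad S(x, y) = \sum_{j=1}^{n} S_j(x)\, y_{m+j}.$$
Equation (1), applied with $y$ in the role of the averaged variable, gives that the LHS of Equation (2) is equal to $\mathrm{bias}(T + S)$, whereas the RHS is equal to $\mathrm{bias}(T)\,\mathrm{bias}(S)$. The claim then follows from Theorem 1.5.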
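The positive correlation of common roots can be observed directly on small examples. The following sketch is an illustration under stated assumptions: it takes $d = 2$, so each tensor is a bilinear form given by a matrix over $\mathbb{F}_2$, and counts common roots by exhaustive enumeration. The names `common_root_prob` and `random_form` are hypothetical.

```python
import itertools
import random
from fractions import Fraction


def common_root_prob(forms, p, n):
    """Pr over (x, y) in (F_p^n)^2 that every bilinear form x^T M y vanishes."""
    vectors = list(itertools.product(range(p), repeat=n))
    hits = 0
    for x, y in itertools.product(vectors, repeat=2):
        hits += all(
            sum(M[i][j] * x[i] * y[j] for i in range(n) for j in range(n)) % p == 0
            for M in forms
        )
    return Fraction(hits, len(vectors) ** 2)


rng = random.Random(1)  # fixed seed, an arbitrary choice
p, n = 2, 3


def random_form():
    """Random n x n matrix over F_p, viewed as an order-2 tensor."""
    return [[rng.randrange(p) for _ in range(n)] for _ in range(n)]


failures = 0
for _ in range(50):
    Ts = [random_form() for _ in range(2)]
    Ss = [random_form() for _ in range(2)]
    joint = common_root_prob(Ts + Ss, p, n)
    if joint < common_root_prob(Ts, p, n) * common_root_prob(Ss, p, n):
        failures += 1
print("instances violating positive correlation:", failures)
```

The two families share the same input $(x, y)$, which is what makes the correlation statement non-trivial.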
The analytic rank can replace the partition rank. The motivation behind the introduction of the slice rank and the partition rank was to study the cap-set problem, and more generally Ramsey problems in product spaces. Works in this space include [CLP17, EG17, Nas17, BCC+17, Kle16, KSS16, NS17].
In all these problems, a certain tensor $T : (\mathbb{F}^n)^d \to \mathbb{F}$ is defined which captures the problem structure. An independent set in $T$ is a subset $A \subset [n]$ that satisfies
$$\forall\, i_1, \ldots, i_d \in A: \quad T_{i_1, \ldots, i_d} \neq 0 \iff i_1 = \cdots = i_d.$$
The goal is to upper bound the size of the largest independent set in $T$. The proofs combine the following two properties:
(i) The slice rank, or partition rank, of the specific tensor $T$ studied is low. This is usually an ad-hoc argument, which relies on the specific definition of $T$.
(ii) If $T$ contains an independent set $A$ then its partition rank (and hence also its slice rank) is at least $|A|$.
This allows one to upper bound the size of the maximal independent set in $T$. We show that the analytic rank can be used instead of the slice rank, or the partition rank, and obtain comparable bounds. This raises the possibility of proving bounds on the analytic rank directly, which may circumvent some of the challenges in extending the current line of work to other Ramsey problems (see Example 1.2 and Example 1.3 and the discussion that follows).

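Theorem 1.7. Let $T : V^d \to \mathbb{F}$ be an order-$d$ tensor. Then
(i) $\mathrm{arank}(T) \le \mathrm{prank}(T)$.
(ii) If $T$ contains an independent set $A$ then $\mathrm{arank}(T) \ge c|A|$.
Here $c = c(d, |\mathbb{F}|)$ satisfies $c \ge 2^{-d}$ always, and $c \ge 1 - \frac{\log(d-1)}{\log |\mathbb{F}|}$, which is better for large $\mathbb{F}$.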
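To get a feel for the two lower bounds on $c$ stated above, the following small sketch evaluates both and returns the larger one. The function name `c_lower_bound` and the parameter name `q` (standing for $|\mathbb{F}|$) are illustrative choices, and the sketch assumes $d \ge 2$.

```python
import math


def c_lower_bound(d, q):
    """Larger of the two stated lower bounds on c(d, |F|), with q = |F| and d >= 2."""
    general = 2.0 ** (-d)                              # holds for every field
    large_field = 1.0 - math.log(d - 1) / math.log(q)  # better when q is large
    return max(general, large_field)


for d, q in [(3, 2), (3, 1024), (4, 3), (4, 101)]:
    print(d, q, round(c_lower_bound(d, q), 4))
```

For small fields the second bound can be zero or negative, in which case only the first bound is informative.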
The fact that we do not obtain c = 1 is not important for the applications, as typically d is a small constant (say 3 or 4) and the goal is to prove bounds on |A| which are exponential in n = dim(V ), which is assumed to be large.

Is the analytic rank really better than the partition rank?
Given Theorem 1.7, a natural question arises: is the analytic rank "better" than the partition rank? Namely, are there tensors $T$ where $\mathrm{arank}(T) \ll \mathrm{prank}(T)$? This question is intimately related to the line of work known as "bias implies low rank" in higher-order Fourier analysis [GT09a, KL08, HS10, BL18, BL15]. Re-interpreting these results in the language of analytic rank vs partition rank, the known results are: (i) If $T$ is an order-$d$ tensor with $\mathrm{arank}(T) \le r$, then $\mathrm{prank}(T) \le f(r, d)$, where $f$ has an Ackermann-type dependence on its parameters [BL15]. Note that $f$ does not depend on the underlying field $\mathbb{F}$ or on the dimension $n$ of the tensor. However, there are no examples known where the gap between the analytic rank and partition rank is more than a constant. The best separation we know of is for the identity tensor.
Example 1.8 (Identity tensor). Let $T : (\mathbb{F}_p^n)^d \to \mathbb{F}_p$ be defined as
$$T(x^1, \ldots, x^d) = \sum_{i=1}^{n} x^1_i\, x^2_i \cdots x^d_i.$$
Naslund [Nas17] proved that $T$ has maximal partition rank, namely $\mathrm{prank}(T) = n$. On the other hand, the calculation in the proof of Theorem 1.7 shows that $\mathrm{arank}(T) = cn$ where $c = c(d, |\mathbb{F}|)$ is the constant given in Theorem 1.7.
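The analytic rank of the identity tensor can be computed exactly for small parameters. The sketch below is an illustration over $\mathbb{F}_2$ with $d = 3$: it counts the fixings of the last $d-1$ inputs for which the remaining linear form vanishes, and compares the result with $n \cdot (-\log_2(1 - 2^{-(d-1)}))$. That closed form is taken from the expression for $c(d, 2)$ in Section 3, under the assumption that the logarithm there carries a minus sign. The name `identity_bias` is hypothetical.

```python
import itertools
import math
from fractions import Fraction


def identity_bias(n, d, p):
    """Exact bias of T(x^1..x^d) = sum_i x^1_i ... x^d_i over F_p.

    T(., x^2..x^d) is the zero form iff x^2_i * ... * x^d_i = 0 for every i.
    """
    vectors = list(itertools.product(range(p), repeat=n))
    good = 0
    for rest in itertools.product(vectors, repeat=d - 1):
        good += all(math.prod(v[i] for v in rest) % p == 0 for i in range(n))
    return Fraction(good, len(vectors) ** (d - 1))


n, d, p = 3, 3, 2
b = identity_bias(n, d, p)
computed = -math.log(float(b), p)
predicted = -n * math.log2(1 - 2.0 ** (-(d - 1)))  # c(d, 2) * n, assumed sign
print(b, computed, predicted)
```

Since the coordinates $i = 1, \ldots, n$ are independent, the bias factors as an $n$-th power, which is why the analytic rank is exactly linear in $n$ here.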
We refer the reader also to [BHH+18], which studies related problems in the context of proving arithmetic circuit lower bounds. This leads to the following natural question.

Problem 1.9. Is it true that for any order-$d$ tensor $T$ it holds that $\mathrm{prank}(T) \le c_d \cdot \mathrm{arank}(T)$, where $c_d$ is a constant which depends only on $d$?
We conclude with another interesting problem, relating to the scope of definition of the analytic rank. We note that Gowers and Wolf in [GW11] defined the analytic rank for functions over $\mathbb{Z}_N$, but the treatment there does not seem related to the problems studied in this paper.
Organization. Theorem 1.5 is proved in Section 2, and Theorem 1.7 is proved in Section 3.
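2 Proof of Theorem 1.5
We prove Theorem 1.5 in this section. It suffices to prove that for any two order-$d$ tensors $T, S : V^d \to \mathbb{F}$ it holds that
$$\mathrm{bias}(T + S) \ge \mathrm{bias}(T)\,\mathrm{bias}(S). \qquad (3)$$
We first introduce some notation. Let $x = (x^1, \ldots, x^d)$ and $y = (y^1, \ldots, y^d)$ be elements of $V^d$. For $I \subseteq [d]$ let $I^c = [d] \setminus I$ and $x_I = (x^i : i \in I)$, and define
$$T_I(x_I, y_{I^c}) := T(z^1, \ldots, z^d), \qquad z^i = x^i \text{ if } i \in I, \quad z^i = y^i \text{ if } i \in I^c.$$
That is, $T_I$ is the tensor $T$ evaluated over $x_I, y_{I^c}$. Observe that $T(x + y)$ decomposes as the sum
$$T(x + y) = \sum_{I \subseteq [d]} T_I(x_I, y_{I^c}). \qquad (4)$$
By the definition of bias,
$$\mathrm{bias}(T)\,\mathrm{bias}(S) = \mathbb{E}_{x, y \in V^d}\left[\omega_p^{T(x) + S(y)}\right] = \mathbb{E}_{x, y \in V^d}\left[\omega_p^{T(x) + S(x + y)}\right].$$
Here, we used the fact that the joint distributions of $(x, y)$ and $(x, x + y)$ are identical. Next, decompose $S(x + y)$ using Equation (4), noting that the term $I = [d]$ equals $S(x)$, as
$$\mathrm{bias}(T)\,\mathrm{bias}(S) = \mathbb{E}_{b \in V^d}\, \mathrm{bias}\Big((T + S)(x) + \sum_{I \subsetneq [d]} S_I(x_I, b_{I^c})\Big),$$
where the bias of a function $F : V^d \to \mathbb{F}$ is $\mathbb{E}_x\, \omega_p^{F(x)}$. Observe that $S_I(x_I, b_{I^c})$ is an order-$|I|$ tensor of the inputs $x_I$. The proof of Theorem 1.5 follows from the following lemma, applied to the tensor $T + S$ and to $R_I(x_I) = S_I(x_I, b_{I^c})$.
Lemma 2.1. Let $T : V^d \to \mathbb{F}$ be an order-$d$ tensor, and for each $I \subsetneq [d]$ let $R_I(x_I)$ be an order-$|I|$ tensor of the inputs $x_I$. Then
$$\Big|\mathrm{bias}\Big(T(x) + \sum_{I \subsetneq [d]} R_I(x_I)\Big)\Big| \le \mathrm{bias}(T).$$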
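The decomposition of $T(x + y)$ into the terms $T_I$ is just multilinear expansion, and it can be verified mechanically. The sketch below is an illustration for an order-3 tensor over $\mathbb{F}_5$ given by a random coefficient array; the helper names `evaluate` and `decomposition_sum` are hypothetical.

```python
import itertools
import random


def evaluate(C, vs, p):
    """Evaluate an order-3 tensor with coefficient array C at vs = (v^1, v^2, v^3)."""
    n = len(C)
    return sum(
        C[i][j][k] * vs[0][i] * vs[1][j] * vs[2][k]
        for i in range(n) for j in range(n) for k in range(n)
    ) % p


def decomposition_sum(C, x, y, p):
    """Sum over all subsets I of T_I: inputs indexed by I come from x, the rest from y."""
    d = 3
    total = 0
    for size in range(d + 1):
        for I in itertools.combinations(range(d), size):
            z = tuple(x[i] if i in I else y[i] for i in range(d))
            total += evaluate(C, z, p)
    return total % p


rng = random.Random(2)  # fixed seed, an arbitrary choice
p, n = 5, 2
all_equal = True
for _ in range(20):
    C = [[[rng.randrange(p) for _ in range(n)] for _ in range(n)] for _ in range(n)]
    x = tuple(tuple(rng.randrange(p) for _ in range(n)) for _ in range(3))
    y = tuple(tuple(rng.randrange(p) for _ in range(n)) for _ in range(3))
    x_plus_y = tuple(
        tuple((a + b) % p for a, b in zip(xi, yi)) for xi, yi in zip(x, y)
    )
    all_equal = all_equal and evaluate(C, x_plus_y, p) == decomposition_sum(C, x, y, p)
print("T(x+y) equals the sum of T_I on all samples:", all_equal)
```

The term with $I = [d]$ is $T(x)$ itself, and every other term has order strictly less than $d$ in the inputs $x$.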
In order to prove Lemma 2.1, we first prove the following claim.
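3 Proof of Theorem 1.7
We prove Theorem 1.7 in this section. We break it into a series of claims. We first show that the analytic rank is at most the partition rank.
Claim 3.1. Let $T : V^d \to \mathbb{F}$ be an order-$d$ tensor. Then $\mathrm{arank}(T) \le \mathrm{prank}(T)$.
Proof. Given Theorem 1.5, it suffices to prove the claim for tensors $T$ of partition rank one. Assume that $T : V^d \to \mathbb{F}$ factors as
$$T(x) = T_1(x_A)\, T_2(x_B),$$
where $A \cup B$ is a partition of $[d]$ with $|A|, |B| \ge 1$. We will show that $\mathrm{bias}(T) \ge |\mathbb{F}|^{-1}$, which implies that $\mathrm{arank}(T) \le 1$. For $a, b \in \mathbb{F}$ define the function
$$F_{a,b}(x) = (T_1(x_A) + a)(T_2(x_B) + b).$$
Since $F_{a,b}$ differs from $T$ only by terms of lower order, Lemma 2.1 gives that $|\mathrm{bias}(F_{a,b})| \le \mathrm{bias}(T)$.
On the other hand, if we let $a, b \in \mathbb{F}$ be chosen uniformly, then for every fixed $x$ the pair $(T_1(x_A) + a,\, T_2(x_B) + b)$ is uniform in $\mathbb{F}^2$, and so
$$\mathbb{E}_{a, b \in \mathbb{F}}\, \mathrm{bias}(F_{a,b}) = \mathbb{E}_{a, b \in \mathbb{F}}\left[\omega_p^{ab}\right] = \Pr_{b \in \mathbb{F}}[b = 0] = |\mathbb{F}|^{-1}.$$
It follows that $\mathrm{bias}(T) \ge |\mathbb{F}|^{-1}$, as claimed.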
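The bound $\mathrm{bias}(T) \ge |\mathbb{F}|^{-1}$ for rank one factorizations can be checked on small examples. The sketch below is an illustration for the special case $|A| = 1$ (slice rank one) with $d = 3$ over $\mathbb{F}_3$, where $T(x, y, z) = (\sum_i a_i x_i)(\sum_{j,k} B_{jk} y_j z_k)$. The name `bias_partition_rank_one` is hypothetical.

```python
import itertools
import random
from fractions import Fraction


def bias_partition_rank_one(a, B, p):
    """Exact bias of T(x, y, z) = (sum_i a_i x_i) * (sum_jk B_jk y_j z_k) over F_p.

    T(., y, z) is the zero linear form iff a = 0 or B(y, z) = 0.
    """
    n = len(a)
    vectors = list(itertools.product(range(p), repeat=n))
    a_is_zero = all(c % p == 0 for c in a)
    zero_forms = 0
    for y, z in itertools.product(vectors, repeat=2):
        value = sum(B[j][k] * y[j] * z[k] for j in range(n) for k in range(n)) % p
        zero_forms += a_is_zero or value == 0
    return Fraction(zero_forms, len(vectors) ** 2)


rng = random.Random(3)  # fixed seed, an arbitrary choice
p, n = 3, 2
smallest = min(
    bias_partition_rank_one(
        [rng.randrange(p) for _ in range(n)],
        [[rng.randrange(p) for _ in range(n)] for _ in range(n)],
        p,
    )
    for _ in range(100)
)
print(smallest, smallest >= Fraction(1, p))
```

The smallest bias observed over the random samples should never fall below $1/p$, in line with $\mathrm{arank}(T) \le 1$ for such tensors.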
Next, we show that the analytic rank cannot increase in a restriction of a tensor to a subspace.
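Claim 3.2. Let $T : V^d \to \mathbb{F}$ be an order-$d$ tensor, let $U \subset V$ be a subspace and consider the restricted tensor $T|_U : U^d \to \mathbb{F}$. Then $\mathrm{arank}(T|_U) \le \mathrm{arank}(T)$.
Proof. Let $W \subset V$ be a subspace with $V = U \oplus W$, so that each $x^i \in V$ decomposes uniquely as $x^i = u^i + w^i$ with $u^i \in U$ and $w^i \in W$. Consider any fixing of $w^1, \ldots, w^d \in W$. Then, by Equation (4) and Lemma 2.1,
$$\Big|\mathbb{E}_{u^1, \ldots, u^d \in U}\, \omega_p^{T(u^1 + w^1, \ldots, u^d + w^d)}\Big| \le \mathrm{bias}(T|_U).$$
The claim follows by averaging over $w^1, \ldots, w^d \in W$.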
Finally, we show that if a tensor contains an independent set $A$ then its analytic rank is at least linear in $|A|$. A convexity argument shows that for $e \ge 2$, $c(d, e) \ge c(d, 2) = -\log_2(1 - 2^{-(d-1)}) \ge 2^{-(d-1)}$. Next, if we assume $e \ge d$ (otherwise the second bound on $c$ is trivial) then $c(d, e) \ge \log_e \frac{e}{d-1} = 1 - \frac{\log(d-1)}{\log e}$.