Polynomial bound for the partition rank vs the analytic rank of tensors

A tensor defined over a finite field $\mathbb{F}$ has low analytic rank if the distribution of its values differs significantly from the uniform distribution. An order $d$ tensor has partition rank 1 if it can be written as a product of two tensors of order less than $d$, and it has partition rank at most $k$ if it can be written as a sum of $k$ tensors of partition rank 1. In this paper, we prove that if the analytic rank of an order $d$ tensor is at most $r$, then its partition rank is at most $f(r,d,|\mathbb{F}|)$, where, for fixed $d$ and $\mathbb{F}$, $f$ is a polynomial in $r$. This is an improvement of a recent result of the author, where he obtained a tower-type bound. Prior to our work, the best known bound was an Ackermann-type function in $r$ and $d$, though it did not depend on $\mathbb{F}$. It follows from our results that a biased polynomial has low rank; there too we obtain a polynomial dependence improving the previously known Ackermann-type bound. A similar polynomial bound for the partition rank was obtained independently and simultaneously by Milićević.


Bias and rank of polynomials
For a finite field $\mathbb{F}$ and a polynomial $P : \mathbb{F}^n \to \mathbb{F}$, we say that $P$ is unbiased if the distribution of the values $P(x)$ is close to the uniform distribution on $\mathbb{F}$; otherwise we say that $P$ is biased. It is an important direction of research in higher order Fourier analysis to understand the structure of biased polynomials.
Note that a generic degree $d$ polynomial should be unbiased. In fact, as we will see below, if a degree $d$ polynomial is biased, then it can be written as a function of not too many polynomials of degree at most $d-1$. Let us now make this discussion more precise. For a polynomial $P : \mathbb{F}^n \to \mathbb{F}$ of degree $d$, the rank of $P$, denoted $\operatorname{rank}(P)$, is defined to be the smallest integer $r$ such that there exist polynomials $Q_1, \dots, Q_r : \mathbb{F}^n \to \mathbb{F}$ of degree at most $d-1$ and a function $f : \mathbb{F}^r \to \mathbb{F}$ such that $P = f(Q_1, \dots, Q_r)$.
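To make the notion of bias concrete, the following minimal sketch computes it by brute force over a prime field $\mathbb{F}_p$, using the character $x \mapsto \omega^x$ with $\omega = e^{2\pi i/p}$. The function name `bias`, the two example polynomials and the use of an absolute value are assumptions made for illustration; they are not taken from the paper.

```python
import cmath
from itertools import product

def bias(poly, p, n):
    """Return |E_x omega^{poly(x)}| over x in F_p^n, with omega = exp(2*pi*i/p).

    Assumption: p is prime and poly maps an n-tuple of ints to an int.
    The absolute value is taken only to get a real number to compare.
    """
    omega = cmath.exp(2j * cmath.pi / p)
    total = sum(omega ** (poly(x) % p) for x in product(range(p), repeat=n))
    return abs(total) / p ** n

p, n = 3, 4
# Hypothetical examples: both are quadratic forms on F_3^4.
few_variables = lambda x: x[0] * x[1]                  # function of two linear forms
more_variables = lambda x: x[0] * x[1] + x[2] * x[3]   # needs four linear forms

print(bias(few_variables, p, n))    # close to 1/3
print(bias(more_variables, p, n))   # close to 1/9
```

In this toy case the polynomial that is a function of fewer lower-degree polynomials has the larger bias, in line with the heuristic that biased polynomials have low rank.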
As discussed above, it is known that if a polynomial has large bias, then it has low rank. The first result in this direction was proved by Green and Tao [4], who showed that if $\mathbb{F}$ is a field of prime order and $P : \mathbb{F}^n \to \mathbb{F}$ is a polynomial of degree $d$ with $d < |\mathbb{F}|$ and $\operatorname{bias}(P) \ge \delta > 0$, then $\operatorname{rank}(P) \le c(\mathbb{F}, \delta, d)$. Kaufman and Lovett [8] proved that the condition $d < |\mathbb{F}|$ can be omitted.
In both results, $c$ has Ackermann-type dependence on its parameters. Finally, Bhowmick and Lovett [1] proved that if $d < \operatorname{char}(\mathbb{F})$ and $\operatorname{bias}(P) \ge |\mathbb{F}|^{-s}$, then $\operatorname{rank}(P) \le c'(d, s)$. The novelty of this result is that $c'$ does not depend on $\mathbb{F}$. However, it still has Ackermann-type dependence on $d$ and $s$.
One of our main results is the following theorem, which improves the result of Bhowmick and Lovett, unless |F| is very large.
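Theorem 1.4. Let $\mathbb{F}$ be a finite field and let $\chi$ be a nontrivial character of $\mathbb{F}$. Let $P$ be a polynomial $\mathbb{F}^n \to \mathbb{F}$ of degree $d < \operatorname{char}(\mathbb{F})$. Suppose that $\operatorname{bias}_\chi(P) \ge \epsilon > 0$ where $\epsilon \le 1/|\mathbb{F}|$. Then where $c$ is an absolute constant and $c$
Recall that if $G$ is an Abelian group and $d$ is a positive integer, then the Gowers $U^d$ norm (which is only a seminorm for $d = 1$) of $f : G \to \mathbb{C}$ is defined to be $\|f\|_{U^d} = \left| \mathbb{E}_{x, y_1, \dots, y_d \in G} \prod_{S \subset [d]} \mathcal{C}^{d-|S|} f\left(x + \sum_{i \in S} y_i\right) \right|^{1/2^d}$, where $\mathcal{C}$ is the conjugation operator. It is a major area of research to understand the structure of functions $f$ whose $U^d$ norm is large. Our next theorem is a result in this direction.
Theorem 1.5. Let $\mathbb{F}$ be a finite field and let $\chi$ be a nontrivial character of $\mathbb{F}$. Let $P$ be a polynomial $\mathbb{F}^n \to \mathbb{F}$ of degree $d < \operatorname{char}(\mathbb{F})$. Let $f(x) = \chi(P(x))$ and assume that $\|f\|_{U^d} \ge \epsilon > 0$ where $\epsilon \le 1/|\mathbb{F}|$. Then where $c$ is an absolute constant and $c'(d) = 4^{d^d}$.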
Our result implies a similar improvement to the bounds for the quantitative inverse theorem for Gowers norms for polynomial phase functions of degree d.
Theorem 1.6. Let $\mathbb{F}$ be a field of prime order and let $P$ be a polynomial x) where $\omega = e^{2\pi i/|\mathbb{F}|}$ and assume that $\|f\|_{U^d} \ge \epsilon > 0$ where $\epsilon \le 1/|\mathbb{F}|$. Then there exists a polynomial $Q$ : where $c$ is an absolute constant and $c'(d) = 4^{d^d}$.
Theorems 1.4 and 1.6 easily follow from Theorem 1.5.
Proof of Theorem 1.4. Note that when $f(x) = \chi(P(x))$, then The result is now immediate from Theorem 1.5.
Proof of Theorem 1.6. By Theorem 1.5, there exists a set of $r \le (c$ Thus, there exists some $\chi \in \hat{G}$ with

Analytic rank and partition rank of tensors
Related to the bias and rank of polynomials are the notions of analytic rank and partition rank of tensors. Recall that if $\mathbb{F}$ is a field and $V_1, \dots, V_d$ are finite dimensional vector spaces over $\mathbb{F}$, then an order $d$ tensor is a multilinear map $T : V_1 \times \dots \times V_d \to \mathbb{F}$. Each $V_k$ can be identified with $\mathbb{F}^{n_k}$ for some $n_k$, and then there exist , where $e_i$ is the $i$th standard basis vector.
The following notion was introduced by Gowers and Wolf [3].
Definition 1.7. Let $\mathbb{F}$ be a finite field, let $V_1, \dots, V_d$ be finite dimensional vector spaces over $\mathbb{F}$ and let $T : V_1 \times \dots \times V_d \to \mathbb{F}$ be an order $d$ tensor. Then the analytic rank of $T$ is defined to be $\operatorname{arank}(T) = -\log_{|\mathbb{F}|} \operatorname{bias}(T)$, where $\operatorname{bias}(T) = \mathbb{E}_{v_1 \in V_1, \dots, v_d \in V_d}[\chi(T(v_1, \dots, v_d))]$ for any nontrivial character $\chi$ of $\mathbb{F}$.
Remark 1.8. This is well-defined. Indeed, if $\chi$ is a nontrivial character of $\mathbb{F}$, then where $T(v_1, \dots, v_{d-1}, x)$ is viewed as a function in $x$. The second equality holds because ] does not depend on $\chi$, and is always positive. Moreover, it is at most 1, therefore the analytic rank is always nonnegative.
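As a sanity check of the definition in the simplest case $d = 2$, the sketch below computes the bias of a bilinear form $T(x, y) = x^{\top} M y$ over $\mathbb{F}_p$ in two ways: as a character sum, and as the probability that $T(x, \cdot)$ vanishes identically. Both should equal $p^{-\operatorname{rank}(M)}$, so the analytic rank of a matrix is its usual rank. The helper names and the example matrix are assumptions made for illustration, and the brute-force enumeration is only feasible for very small dimensions.

```python
import cmath
import math
from itertools import product

def bias_char_sum(M, p):
    """E_{x,y} omega^{x^T M y} over F_p (p prime), via direct enumeration."""
    n1, n2 = len(M), len(M[0])
    omega = cmath.exp(2j * cmath.pi / p)
    total = 0
    for x in product(range(p), repeat=n1):
        for y in product(range(p), repeat=n2):
            val = sum(x[i] * M[i][j] * y[j] for i in range(n1) for j in range(n2)) % p
            total += omega ** val
    return (total / p ** (n1 + n2)).real

def bias_kernel(M, p):
    """Fraction of x in F_p^{n1} for which the linear form y -> x^T M y is zero."""
    n1, n2 = len(M), len(M[0])
    hits = sum(
        all(sum(x[i] * M[i][j] for i in range(n1)) % p == 0 for j in range(n2))
        for x in product(range(p), repeat=n1)
    )
    return hits / p ** n1

def arank(bias_value, p):
    """Analytic rank: minus the base-p logarithm of the bias."""
    return -math.log(bias_value, p)

M = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]  # hypothetical example, rank 2 over F_3
print(bias_char_sum(M, 3), bias_kernel(M, 3), arank(bias_kernel(M, 3), 3))
```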
A different notion of rank was defined by Naslund [13].
We say that $T$ has In general, the partition rank of $T$ is the smallest $r$ such that $T$ can be written as the sum of $r$ tensors of partition rank 1. This number is denoted $\operatorname{prank}(T)$.
Kazhdan and Ziegler [9] and Lovett [11] proved that $\operatorname{arank}(T) \le \operatorname{prank}(T)$. In the other direction, it follows from the work of Bhowmick and Lovett [1] that if an order $d$ tensor $T$ has $\operatorname{arank}(T) \le r$, then $\operatorname{prank}(T) \le f(r, d)$ for some function $f$. Note that $f$ does not depend on $|\mathbb{F}|$ or the dimension of the vector spaces $V_k$. However, $f$ has an Ackermann-type dependence on $d$ and $r$. For $d = 3, 4$, better bounds were established by Haramaty and Shpilka [5]. They proved that for $d = 3$ we have $\operatorname{prank}(T) = O(r^4)$, and that for $d = 4$ we have $\operatorname{prank}(T) = \exp(O(r))$.
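The inequality $\operatorname{arank}(T) \le \operatorname{prank}(T)$ can be observed on a small example. The sketch below builds an order 3 tensor that is a product of an order 1 and an order 2 tensor (so its partition rank is at most 1) and evaluates its bias by brute force, using the standard identity that $\operatorname{bias}(T)$ equals the probability that $T(x, y, \cdot)$ vanishes identically. The function `bias_order3` and the example tensor are assumptions made for illustration.

```python
import math
from itertools import product

def bias_order3(T, p, n):
    """P_{x,y}[ T(x, y, z) = 0 for all z ] for a trilinear T on (F_p^n)^3.

    Since T is linear in z, it suffices to test z on the standard basis.
    """
    basis = [tuple(int(i == k) for i in range(n)) for k in range(n)]
    pts = list(product(range(p), repeat=n))
    hits = sum(all(T(x, y, e) % p == 0 for e in basis) for x in pts for y in pts)
    return hits / len(pts) ** 2

p, n = 3, 2
# Hypothetical example: T(x, y, z) = a(x) * b(y, z) with a(x) = x_1 and
# b(y, z) = y_1 z_1 + y_2 z_2, a product of tensors of orders 1 and 2.
T = lambda x, y, z: x[0] * (y[0] * z[0] + y[1] * z[1])

b = bias_order3(T, p, n)
print(b, -math.log(b, p))  # the analytic rank comes out below 1
```

Here $T(x, y, \cdot) \equiv 0$ exactly when $x_1 = 0$ or $y = 0$, which happens with probability $1/3 + (2/3)(1/9) = 11/27$.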
The main result of our paper is a polynomial upper bound, which holds for general d.
Theorem 1.10. Let $T : V_1 \times \dots \times V_d \to \mathbb{F}$ be an order $d$ tensor with $\operatorname{arank}(T) \le r$ and assume that $r \ge 1$. Then $\operatorname{prank}(T) \le (c$ for some absolute constant $c$, and $c$
We remark that a very similar result was obtained independently and simultaneously by Milićević [12]. Moreover, in the special case $d = 4$, a similar bound was proved independently by Lampert [10].
It is not hard to see that Theorem 1.10 implies Theorem 1.5. Indeed, let $P$ be a polynomial $\mathbb{F}^n \to \mathbb{F}$ of degree $d < \operatorname{char}(\mathbb{F})$, let $f(x) = \chi(P(x))$ and assume that $\|f\|_{U^d} \ge \epsilon > 0$, where Note that $T(y_1, \dots, y_d) = D_{y_1} \dots D_{y_d} P(x)$ where $D_y g(x) = g(x + y) - g(x)$. Thus, by Taylor's approximation theorem, since $d < \operatorname{char}(\mathbb{F})$, we get for some polynomial $W$ of degree at most $d - 1$.
By equation (1), $T$ can be written as a sum of at most (c d) tensors of partition rank 1. Hence, $\frac{1}{d!} T(x, \dots, x)$ can be written as a sum of at most (c d) , and therefore $P$ has rank at most
We remark that the proof of the main result of this paper follows the strategy introduced by the author in [7], but the argument is improved locally in a few places.
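The passage from a polynomial $P$ of degree $d$ to the tensor $T(y_1, \dots, y_d) = D_{y_1} \dots D_{y_d} P(x)$ can be checked numerically. The sketch below expands the iterated derivative as the alternating sum $\sum_{S \subset [d]} (-1)^{d-|S|} P(x + \sum_{i \in S} y_i)$ and verifies on one example that the result does not depend on $x$ and is additive in $y_1$. The function name and the cubic over $\mathbb{F}_5$ are assumptions made for illustration.

```python
from itertools import combinations, product

def iterated_derivative(P, ys, x, p):
    """Sum over S of (-1)^(d-|S|) * P(x + sum_{i in S} y_i), reduced mod p."""
    d, n = len(ys), len(x)
    total = 0
    for size in range(d + 1):
        for S in combinations(range(d), size):
            point = tuple((x[k] + sum(ys[i][k] for i in S)) % p for k in range(n))
            total += (-1) ** (d - size) * P(point)
    return total % p

p = 5
# Hypothetical cubic on F_5^2; its degree 3 is below the characteristic 5.
P = lambda x: (x[0] ** 2 * x[1] + 2 * x[1] ** 3 + x[0] + 4) % p

ys = [(1, 2), (3, 1), (2, 2)]
values = {iterated_derivative(P, ys, x, p) for x in product(range(p), repeat=2)}
print(values)  # a single value: the third derivative does not depend on x

# Additivity in the first argument, evaluated at the base point x = 0.
T = lambda y1, y2, y3: iterated_derivative(P, [y1, y2, y3], (0, 0), p)
lhs = T((3, 1), (3, 1), (2, 2))  # (1, 2) + (2, 4) = (3, 1) in F_5^2
rhs = (T((1, 2), (3, 1), (2, 2)) + T((2, 4), (3, 1), (2, 2))) % p
print(lhs == rhs)
```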
2 The proof of Theorem 1.10

Notation and preliminaries
In the rest of the paper, we identify $V_i$ with $\mathbb{F}^{n_i}$. Thus, the set of all tensors which will be denoted by $G$ throughout this section. Also, $B$ will always stand for the multiset can be viewed as the set of $d$-dimensional $(n_1, \dots, n_d)$-arrays over $\mathbb{F}$ which in turn can be viewed as $\mathbb{F}^{n_1 n_2 \dots n_d}$, equipped with the entry-wise dot product.
For $I \subset [d]$, we write $\mathbb{F}^I$ for $\bigotimes_{i \in I} \mathbb{F}^{n_i}$ so that we naturally have $G = \mathbb{F}^I \otimes \mathbb{F}^{I^c}$, where then $rs$ is the same as the entry-wise dot product $r.s$. Also, note that viewing $r$ as a $d$-multilinear map $R$ : Finally, we use a non-standard notation and write $kB$ to mean the set of elements of $G$ which can be written as a sum of at most $k$ elements of $B$, where $B$ is some fixed (multi)subset of $G$, and similarly, we write $kB - lB$ for the set of elements that can be obtained by adding at most $k$ members and subtracting at most $l$ members of $B$.
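The notation $kB$ and $kB - lB$ can be illustrated with a small brute-force computation over $\mathbb{F}_p^n$. In the sketch below, vectors are tuples of integers modulo $p$; the function names and the convention that the empty sum (the zero vector) counts as a sum of at most $k$ elements are assumptions made for illustration.

```python
def sums_of_at_most(B, k, p):
    """All sums of at most k elements of B over F_p, including the empty sum 0."""
    n = len(next(iter(B)))
    reached = {tuple([0] * n)}
    for _ in range(k):
        # Adding one more element of B to everything reached so far.
        reached |= {tuple((a + b) % p for a, b in zip(s, t)) for s in reached for t in B}
    return reached

def k_minus_l(B, k, l, p):
    """The set kB - lB: add at most k members of B and subtract at most l."""
    plus = sums_of_at_most(B, k, p)
    minus = sums_of_at_most(B, l, p)
    return {tuple((a - b) % p for a, b in zip(s, t)) for s in plus for t in minus}

B = {(1, 0), (0, 1)}  # hypothetical two-element set in F_3^2
print(len(sums_of_at_most(B, 2, 3)))  # 6 of the 9 vectors
print(len(k_minus_l(B, 2, 2, 3)))     # all 9 vectors of F_3^2
```

Even for this tiny $B$, the set $2B - 2B$ already fills the whole space, which is the kind of growth that Bogolyubov-type lemmas quantify.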
We will use the next result several times in our proofs. It is a version of Bogolyubov's lemma, due to Sanders.

Lemma 2.1 (Sanders [14]). There is an absolute constant $C$ with the following property. Let $A$ be a subset of
Throughout the paper, $C$ stands for the constant appearing in the previous lemma. Clearly we may assume that $C \ge 1$. Logarithms are base 2.

The main lemma and some consequences
Theorem 1.10 will follow easily from the next lemma, which is the main technical result of this paper. See [2] for an application of a qualitative version of this lemma.

Lemma 2.2. Let $d \ge 1$ be an integer and let
then there exists a multiset $Q$ whose elements are pure tensors chosen from $f_1(d)B' - f_1(d)B'$ (but with arbitrary multiplicity) with the following property. The set of arrays $r \in G$ with $r.q = 0$ for at least
Throughout the paper, the functions $G$, $c_1$, $c_2$ will refer to the functions introduced in the previous lemma. In fact, as $\mathbb{F}$ is fixed, we will write $G(d, \delta)$ to mean $G(d, \delta, \mathbb{F})$.
In this subsection we deduce Theorem 1.10 from Lemma 2.2. The notion introduced in the next definition is closely related to the partition rank, but will be somewhat more convenient to work with.
Definition 2.3. Let $k$ be a positive integer. We say that $r \in G$ is $k$-degenerate if for every $I \subset$ there exists a subspace $H_I \subset \mathbb{F}^I$ of dimension at most $k$ such that $r \in$ Note that for every $i$ we have $s_i.w = 0$ for all $w \in D'$ and so also $s_i.q = 0$ for all $q \in Q$.
Now we are in a position to prove Theorem 1.10 conditional on Lemma 2.2.
Proof of Theorem 1.10. Let $T$ : be an order $d$ tensor with $\operatorname{arank}(T) \le r$. By Remark 1.8, we have But there exists some absolute constant $c$ such that

The overview of the proof of Lemma 2.2
The proof of the lemma goes by induction on $d$. In what follows, we shall prove results conditional on the assumption that Lemma 2.2 has been verified for all $d' < d$. Eventually, we will use these results to prove the induction step.
In this subsection, we give a detailed sketch of the proof in the $d = 3$ case. At the end of the subsection, we also briefly sketch the $d > 3$ case.

The high-level outline in the case d = 3
We assume that Lemma 2.2 has been proven for $d \le 2$ and use this assumption to show that it holds for $d = 3$. We will take such that the $Q_I$ have roughly equal size. This implies that if for some $r \in G$ we have $r.q = 0$ for almost all $q \in Q$, then $r.q = 0$ holds for almost all $q \in Q_I$ for every $I = \{1\}, \{2\}, \{3\}, \{1, 2, 3\}$. We define $Q_{\{1,2,3\}}$ first, in such a way that if $r.q = 0$ for almost all $q \in Q_{\{1,2,3\}}$, then $r = x + y$ where $x \in V_{\{1,2,3\}}$ for a vector space $V_{\{1,2,3\}}$ which is independent of $r$ and has small dimension, and $y$ has small partition rank. This already implies that any array $r \in G$ with for some subspaces $H_I(r) \subset \mathbb{F}^I$ depending on $r$ and of small dimension. We then find and $V_{\{2,3\}} \subset \mathbb{F}^{\{2,3\}}$ are subspaces independent of $r$ and have small dimension, and $K_I(r) \subset \mathbb{F}^I$ are subspaces of small dimension (although quite a bit larger than $\dim(H_I(r))$). Then we find where $V_{\{2\}} \subset \mathbb{F}^{n_2}$ and $V_{\{1,3\}} \subset \mathbb{F}^{\{1,3\}}$ are subspaces independent of $r$ and have small dimension, and $L_{\{1,2\}}(r) \subset \mathbb{F}^{\{1,2\}}$ is a subspace of small dimension. Finally, we find , where $V_{\{3\}} \subset \mathbb{F}^{n_3}$ and $V_{\{1,2\}} \subset \mathbb{F}^{\{1,2\}}$ are subspaces independent of $r$ and have small dimension.
How will we find $Q_{\{1,2,3\}}$, $Q_{\{1\}}$, $Q_{\{2\}}$ and $Q_{\{3\}}$? In this outline we will only explain how to find is a subspace of low codimension, and for each $u \in U$, $Q_u \subset \mathbb{F}^{\{1,3\}}$ is a multiset consisting of pure tensors such that if for some $x \in \mathbb{F}^{\{1,3\}}$ we have $x.t = 0$ for almost all $t \in Q_u$, then for some subspaces $W_I(u) \subset \mathbb{F}^I$ not depending on $x$ and of small dimension. Let us call a $Q_u$ with this property forcing. We will also make sure that all the $Q_u$ have roughly the same size.

Why does this $Q_{\{2\}}$ work?
In what follows, we will sketch why this choice is suitable. We remark that in the general case this is done in Lemma 2.15. Let $R$ consist of those such that $r.q = 0$ for almost all $q \in Q_{\{2\}}$. Let $r \in R$. Write $r = r_2 + r_3 + r_4$ where It is enough to prove that for some small subspaces $V_{\{2\}} \subset \mathbb{F}^{n_2}$, $V_{\{1,3\}} \subset \mathbb{F}^{\{1,3\}}$ and $L'_{\{1,2\}}(r) \subset \mathbb{F}^{\{1,2\}}$ (in fact, we will be able to take First note that $r_2 u$ has small (partition) rank for every $u \in U$. Indeed, r)u, where, for a vector space $L$ of tensors, $Lu$ denotes the space $\{su : s \in L\}$.
Moreover, since the $Q_u$ all have roughly the same size, for almost every $u \in U$ we have that $r.(u \otimes t) = 0$ holds for almost every $t \in Q_u$. But $r.(u \otimes t) = (ru).t$, therefore as $Q_u$ is forcing, it follows that for any such for some subspaces $W_I(u) \subset \mathbb{F}^I$ not depending on $r$ and of small dimension. Since any element of $\mathbb{F}^{n_1} \otimes W_{\{3\}}(u) + W_{\{1\}}(u) \otimes \mathbb{F}^{n_3}$ has small partition rank, it follows that for almost every $u \in U$, where $s(u)$ is a tensor of small partition rank. Define a sequence $0 = Z(0) \subset Z(1) \subset \dots \subset Z(m) \subset \mathbb{F}^{\{1,3\}}$ of subspaces recursively as follows.
Given $Z(j)$, if there is some $r \in R$ such that $r_4 u$ is far from $Z(j)$ for many $u \in U$, then set . What we mean by $r_4 u$ being far from $Z(j)$ is that there is no $z \in Z(j)$ such that $r_4 u - z$ has small partition rank. For suitably chosen parameters, one can show that this procedure cannot go on for too long, i.e. that for some not too large $m$ we have that for every $r \in R$, for almost all $u \in U$ there is some $z \in Z(m)$ with $r_4 u - z$ having small partition rank. Now let $r \in R$. Let $X(r)$ be the set consisting of those $x \in K_{\{1,3\}}(r)$ which are close to $Z(m)$. Then $r_4 u \in X(r)$ for almost every $u \in U$. Let $t_1, \dots, t_\alpha$ be a maximal linearly independent subset of $X(r)$ and extend it to a basis $t_1, \dots, t_\alpha, t'_1, \dots, t'_\beta$ for $K_{\{1,3\}}(r)$. Now if a linear combination of $t_1, \dots, t_\alpha, t'_1, \dots, t'_\beta$ is in $X(r)$, then the coefficients of $t'_1, \dots, t'_\beta$ are all zero. Write $r_4 = \sum_{i \le \alpha} s_i \otimes t_i + \sum_{j \le \beta} s'_j \otimes t'_j$ for some $s_i, s'_j \in \mathbb{F}^{n_2}$. Since $r_4 u \in X(r)$ for almost all $u \in U$, we have, for all $j$, that $s'_j.u = 0$ for almost all $u \in U$. Since these hold for more than half of $u \in U$, we obtain $s'_j \in U^\perp$ for every $j$, therefore $\sum_{j \le \beta} s'_j \otimes t'_j \in U^\perp \otimes \mathbb{F}^{\{1,3\}}$. Since $t_i \in X(r)$ for every $i$, we may choose $z_i \in Z(m)$ such that $t_i = z_i + y_i$ where $y_i \in \mathbb{F}^{\{1,3\}}$ has small partition rank. Now $\sum_{i \le \alpha} s_i \otimes t_i \in \mathbb{F}^{n_2} \otimes Z(m) + \sum_{i \le \alpha} s_i \otimes y_i$. Moreover, as $\alpha$ is small and each $y_i$ has small partition rank, we have $\sum_{i \le \alpha} s_i \otimes y_i \in L'_{\{1,2\}}(r) \otimes \mathbb{F}^{n_3}$ for some small $L'_{\{1,2\}}(r) \subset \mathbb{F}^{\{1,2\}}$. So we have proved (2) with $V_{\{2\}} = U^\perp$ and $V_{\{1,3\}} = Z(m)$.

Why can we find such a $Q_{\{2\}}$?
Now we describe why there must exist $Q_{\{2\}}$ with elements chosen from $2^{3^{3+3}} B' - 2^{3^{3+3}} B'$ and having the required properties. We remark that in the general case this is done in Lemma 2.14.
We want to find a subspace $U \subset \mathbb{F}^{n_2}$ of low codimension, and forcing multisets $Q_u \subset \mathbb{F}^{\{1,3\}}$ ($u \in U$) consisting of pure tensors such that for every $u$ the induction hypothesis we can find a forcing set in $2^{3^{2+3}} R - 2^{3^{2+3}} R$ consisting of pure tensors. Therefore it is enough to find a low codimensional subspace $U$ and dense sets $R_u \subset D$ (for every $u \in U$) such that $u \otimes R_u \subset 32B' - 32B'$. As $B'$ is dense in $B$, we have a dense subset $S \subset \mathbb{F}^{n_2}$ and dense subsets $T_s \subset D$ ($s \in S$) such that $s \otimes T_s \subset B'$ for every $s \in S$. By Bogolyubov's lemma (Lemma 2.1), there is a low codimensional subspace $U$ contained in $2S - 2S$. To establish the existence of a dense $R_u \subset D$ with $u \otimes R_u \subset 32B' - 32B'$ for every $u \in U$, it is enough to prove the following lemma.
Indeed, once we have this lemma, it follows that for any $s_1, s_2, s_3$, Lemma 2.5 follows easily from the next two lemmas.
Lemma 2.6. Let $A$ be a dense subset of $D$. Then there exist a dense subspace $V \subset \mathbb{F}^{n_1}$ and for
Proof. There exist a dense subset B ⊂ F n
Lemma 2.7. Suppose that we have dense subspaces $V, V' \subset \mathbb{F}^{n_1}$, for each $v \in V$ a dense subspace In particular, this intersection is a dense subset of $D$.
Proof. The identity is trivial. Since the subspaces $V \cap V'$ and $W_v \cap W'_v$ are dense, the second assertion follows.
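The density claim used in this proof can be made quantitative by a one-line codimension count. The following is a sketch under the assumption that "dense subspace" means a subspace of bounded codimension, which is how the term appears to be used here:

```latex
\[
\operatorname{codim}(V \cap V') \le \operatorname{codim}(V) + \operatorname{codim}(V'),
\qquad\text{hence}\qquad
\frac{|V \cap V'|}{|\mathbb{F}^{n_1}|}
\;\ge\;
\frac{|V|}{|\mathbb{F}^{n_1}|} \cdot \frac{|V'|}{|\mathbb{F}^{n_1}|}.
\]
```

The same bound applies to $W_v \cap W'_v$ for each $v \in V \cap V'$, so the density of the intersection is at least the product of the densities of the individual pieces.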

How can this be extended to d > 3?
Now we briefly sketch what the main difficulties are in the $d > 3$ case and how we can address them. The underlying strategy is similar: we take an ordering $\prec$ of the set of non-empty subsets and for each such $I$ we choose $Q_I$ such that any array with $r.q = 0$ for almost all $q \in Q_I$ has where $U_J$, $U_{J^c}$, $K_{J^c}(r)$ can have dimension slightly larger than those of $W_J$, $W_{J^c}$ and $H_{J^c}$, but they are still low dimensional. In the $d = 3$ case, we have made use of a decomposition $r = r_2 + r_3 + r_4$ where $r_4 \in \mathbb{F}^I \otimes H_{I^c}(r)$, $r_2 u$ has small partition rank and $r_3 u$ is in a small subspace independent of $r$ for every $u \in \mathbb{F}^I$. In general, such a decomposition need not exist. For example, when $d = 4$ and $I = \{1, 2\}$, then an array in $W_{\{1\}} \otimes \mathbb{F}^{\{2,3,4\}}$ (or in $\mathbb{F}^{n_1} \otimes H_{\{2,3,4\}}(r)$ if we were to take $\{1, 2\} \prec \{1\}$), when multiplied by some pure tensor $u \in \mathbb{F}^{\{1,2\}}$, yields a tensor which need not have small partition rank and need not lie in a small space independent of $r$. However, by restricting the possible choices for $u$, we can make sure that the product is always zero. So we will take a decomposition $r = r_1 + r_2 + r_3 + r_4$ such that $r_4 \in \mathbb{F}^I \otimes H_{I^c}(r)$; for every pure tensor $u \in \mathbb{F}^I$, $r_2 u$ has small partition rank and $r_3 u$ lies in a small space depending only on $u$; and crucially, for every $q \in Q_I$, $r_1.q = 0$. To achieve this, we need to insist that $J \prec I$ whenever $J \subsetneq I$ and that $Q_I$ is orthogonal to certain subspaces. To see this, note that in the above example where $d = 4$ and $I = \{1, 2\}$ we need that $\{1\} \prec \{1, 2\}$ and $Q_{\{1,2\}}$ is orthogonal to $W_{\{1\}} \otimes \mathbb{F}^{\{2,3,4\}}$. (If we had $\{1, 2\} \prec \{1\}$, then in (4) we would have a term $\mathbb{F}^{n_1} \otimes H_{\{2,3,4\}}(r)$ rather than $W_{\{1\}} \otimes \mathbb{F}^{\{2,3,4\}}$, which we could not control.)
We also need to generalise Lemma 2.5 to the case $d > 3$. Instead of using $\bigcup_{v \in V} v \otimes W_v$ as in Lemma 2.6, we need to define an object in $B$ such that
1. an instance of the object can be found in $kB' - kB'$ for some small $k$ whenever $B'$ is dense in $B$ (generalising Lemma 2.6);
2. the intersection of few instances of this object is a dense subset of $B$ (generalising Lemma 2.7).
In the next subsection we describe this object and show that it has the required properties.
The next lemma describes a property of systems which was not needed for us in the $d = 3$ case, but is crucial in the general case. It is required for finding a suitable decomposition $r = r_1 + r_2 + r_3 + r_4$ described at the end of the previous subsection. Indeed, we need a set $Q_I$ which is orthogonal to certain spaces of the form $W_J \otimes \mathbb{F}^{J^c}$ (i.e. is contained in $W_J^\perp \otimes \mathbb{F}^{J^c}$) to make sure that $r_1.q = 0$ for every $q \in Q_I$. We will use the following lemma to guarantee the existence of such a set $Q_I$.
Lemma 2.11. Let $Q$ be a $k$-system and for every non-empty $I \subset [d]$, let $L_I \subset \mathbb{F}^I$ be a subspace of codimension at most $l$. Let $T = \bigcap_I (L_I \otimes \mathbb{F}^{I^c})$. Then $Q \cap T$ contains an $f$-system for $f = k + 2^d l$.
Proof. Let the spaces of $Q$ be $U_{u_1, \dots, u_{j-1}}$. It suffices to prove that for every $1 \le j \le d$, and every Thus, it suffices to prove that for every $I \subset [j]$ with

The proof of Lemma 2.2
We now turn to the proof of Lemma 2.2. As described in the outline, the first step is to find a $Q_{[d]}$ such that if $r.q = 0$ for almost all $q \in Q_{[d]}$, then $r = x + y$ where $x \in V_{[d]}$ for a small space $V_{[d]}$ independent of $r$, and $y$ has low partition rank.
Lemma 2.12. Let $d \ge 2$ and suppose that Lemma 2.2 has been proved for $d' = d - 1$. Let $B' \subset B$ be such that $|B'| \ge \delta|B|$ for some $\delta > 0$. Then there exist some $Q \subset 2B' - 2B'$ consisting of pure tensors and a subspace $V_{[d]} \subset \mathbb{F}^{[d]}$ of dimension at most $4C(\log(2/\delta))^4$ with the following property. Any array $r$ with $r.q = 0$ for at least $\frac{7}{8}|Q|$ choices $q \in Q$ can be written as $r = x + y$ where $x \in V$ Moreover, by Lemma 2.1, for every $t \in D'$, there exists a subspace $U_t \subset \mathbb{F}^{n_d}$ of codimension at most $C(\log(2/\delta))^4$ such that $t \otimes U_t \subset 2B' - 2B'$. After passing to suitable subspaces, we may assume that all $U_t$ have the same codimension $k \le C(\log(2/\delta))^4$. Now let $Q = \bigcup_{t \in D'} (t \otimes U_t)$.
The next lemma is the last ingredient of the proof. It is a generalisation of the discussion in Subsubsection 2.3.2. Given a tensor $r \in V_{[d]} + \sum_{I \subset [d-1], I \neq \emptyset} \mathbb{F}^I \otimes H_{I^c}(r)$, we turn the terms $\mathbb{F}^I \otimes H_{I^c}(r)$ one by one into terms $V_I \otimes \mathbb{F}^{I^c} + \mathbb{F}^I \otimes V_{I^c}$ where the $V_J$ are small and do not depend on $r$. (Note that this is not quite the same as our approach to the case $d = 3$.) As briefly explained in Subsubsection 2.3.4, the order in which the various $I$ are considered is important: we define $\prec$ to be any total order on the set of non-empty subsets of $[d - 1]$ such that if $J \subsetneq I$ then $J \prec I$. It is worth noting that unlike in the $d = 3$ case, the subspaces $V_J$, $V_{J^c}$ with $J \prec I$ are allowed to change when $V_I$ and $V_{I^c}$ get defined (although in fact the $V_{J^c}$ will not change, and the $V_J$ change only for $J \subsetneq I$). All we require is that they do not become much larger.
∅ and let $W_J \subset \mathbb{F}^J$, $W_{J^c} \subset \mathbb{F}^{J^c}$ be subspaces of dimension at most $k$ for every $J \prec I$. Moreover, let $W_{[d]} \subset \mathbb{F}^{[d]}$ have dimension at most $k$. Suppose that $Q'$, $Q_s$ (and $Q_I$) have the six properties described in Lemma 2.14. Then any array with $\dim(H_{J^c}(r)) \le k$ and the property that $r.q = 0$ for at least for some $U_J \subset \mathbb{F}^J$, $U_{J^c} \subset \mathbb{F}^{J^c}$ not depending on $r$ and some $K_{J^c}(r) \subset \mathbb{F}^{J^c}$ possibly depending on $r$, all of dimension at most $k^{2c_2(|I|)}$.
Proof. By (4) in Lemma 2.14, for every $s \in Q'$ there exist subspaces $V_J(s) \subset \mathbb{F}^J$ for every $J \subset I^c$, $J \neq \emptyset$, with dimension at most $g_1 = G(d - 1, |\mathbb{F}|^{-2^{3^{d+4}} C (\log 2^{d-1}/\delta)^4})$ such that the set of arrays $t \in \mathbb{F}^{I^c}$ with $t.q = 0$ for at least (1 where $g_2 = 2^{-3^{d+2}}$. Note, for future reference, that with $\dim(H_{J^c}(r)) \le k$ and the property that $r.q = 0$ for at least (1 every $s \in Q'(r)$, and in particular, for at least $(1 - g_3)|Q'|$ choices $s \in Q'$, there exists some $t(s) \in T(s)$ such that $r_4 s - t(s)$ is $g_4$-degenerate. So for at least $g_3|Q'|$ choices $s \in Q'$ there is some $t(s) \in T(s)$ such that $r_4 s - t(s)$ is $g_4$-degenerate, but there is no $z \in Z(j)$ such that $r_4 s - z$ is $(g_4 + 1)g_4$-degenerate. In this case there is no $z \in Z(j)$ such that $z - t(s)$ is $g_4^2$-degenerate. On the other hand, since $r_4 s \in H_{I^c}(r) \subset Z(j + 1)$, there is some $z \in Z(j + 1)$ such that $z - t(s)$ is $g_4$-degenerate. For any $i$, let $K(i, s)$ be the subspace of $T(s)$ spanned by those $t \in T(s)$ for which there is some $z \in Z(i)$ with $z - t$ being $g_4$-degenerate. Since the dimension of $T(s)$ is at most $g_4$, we have $t(s) \notin K(j, s)$, else there would exist some $z \in Z(j)$ such that $z - t(s)$ is $g_4^2$-degenerate. On the other hand, $t(s) \in K(j + 1, s)$. Thus, $\dim K(j + 1, s) > \dim K(j, s)$. This holds for at least Since $K(m, s) \subset T(s)$, we have $\dim K(m, s) \le g_4$. Thus, $m \le g_4/g_3$ and $\dim Z(m) \le k g_4/g_3$. Write $Z = Z(m)$. Now let $r \in R$.
Let $X(r)$ be the set consisting of those $x \in H_{I^c}(r)$ for which there is some $z \in Z$ with $x - z$ being $(g_4 + 1)g_4$-degenerate. Then $r_4 s \in X(r)$ apart from at most $2g_3|Q'|$ choices $s \in Q'$. Let $t_1, \dots, t_\alpha$ be a maximal linearly independent subset of $X(r)$ and extend it to a basis $t_1, \dots, t_\alpha, t'_1, \dots, t'_\beta$ for $H_{I^c}(r)$. Now if a linear combination of $t_1, \dots, t_\alpha, t'_1, \dots, t'_\beta$ is in $X(r)$, then the coefficients of $t'_1, \dots, t'_\beta$ are all zero. Write $r_4 = \sum_{i \le \alpha} s_i \otimes t_i + \sum_{j \le \beta} s'_j \otimes t'_j$ for some $s_i, s'_j \in \mathbb{F}^I$. Since $r_4 q \in X(r)$ for at least (1 in Lemma 2.14 there exist subspaces $L_J \subset \mathbb{F}^J$ ($J \subset I$, $J \neq \emptyset$) not depending on $r$, and of dimension at most $G(|I|, |\mathbb{F}|^{-2^{d+1}dk})$ such that $s'_j \in \sum_{J \subset I, J \neq \emptyset} L_J \otimes \mathbb{F}^{I \setminus J}$ for all $j$. Thus, $r_4 \in \sum_{i \le \alpha} s_i \otimes t_i + \sum_{J \subset I, J \neq \emptyset} L_J \otimes \mathbb{F}^{J^c}$. Moreover, for every $i \le \alpha$, we have $t_i \in X(r)$, so there exist We claim that $\dim(Z)$, $\dim(K'_{J^c})$ and $\dim(L_J)$ are all bounded by $k^{2c_2(|I|)} - k$. Firstly, note that g
Claim. For every $0 \le i \le 2^{d-1} - 1$ there exists a multiset $Q_{I_i}$ of pure tensors with elements chosen from $2^{3^{d+3}} B' - 2^{3^{d+3}} B'$, and subspaces $W_{I_j}(i) \subset \mathbb{F}^{I_j}$, $W_{(I_j)^c}(i) \subset \mathbb{F}^{(I_j)^c}$ for every $j \le i$ (for $j = 0$, we only require $W_{[d]}(i)$ and not $W_\emptyset(i)$) with the following properties. The dimension of each of these spaces is at most $g_1(i) = G(d - 1, \delta)^{\alpha(i)}$, where $\alpha(i) = 4 \cdot \prod_{1 \le j \le i} 2c_2(|I_j|)$. Moreover, if $r \in G$ has $r.q = 0$ for at least $(1 - \frac{1}{4}(2^{-3^{d+2}})^2)|Q_{I_j}|$ choices $q \in Q_{I_j}$ for all $j \le i$, then $r \in W_{[d]}(i) + \sum_{1 \le j \le i} (W_{I_j}(i) \otimes \mathbb{F}^{(I_j)^c} + \mathbb{F}^{I_j} \otimes W_{(I_j)^c}(i)) + \sum_{j > i} \mathbb{F}^{I_j} \otimes H_{(I_j)^c}(i, r)$ holds for some $H_{(I_j)^c}(i, r)$ possibly depending on $r$ and of dimension at most $g_1(i)$.
Proof of Claim. This is proved by induction on $i$. For $i = 0$, by Lemma 2.12, there exist $Q_\emptyset \subset 2B' - 2B'$ consisting of pure tensors and $V_{[d]} \subset \mathbb{F}^{[d]}$ of dimension at most $4C(\log(2/\delta))^4 \le 4C(2\log(1/\delta))^4 \le G(d - 1, \delta)^4$ such that if $r.q = 0$ for at least $\frac{7}{8}|Q_\emptyset|$ choices $q \in Q_\emptyset$, then $r$ can be written as $r = x + y$ where $x \in V_{[d]}$ and $y$ is $g_2$-degenerate for $g_2 = G(d - 1, \frac{\delta}{4|\mathbb{F}|^{4C(\log 2/\delta)^4}})$. Since Once we have found suitable sets $W_{I_j}(i - 1)$ and $W_{(I_j)^c}(i - 1)$ for all $j \le i - 1$, we can apply Lemmas 2.14 and 2.15 with $I = I_i$ and $k = g_1(i - 1)$ to find a suitable $Q_{I_i}$, $W_{I_j}(i)$ and $W_{(I_j)^c}(i)$ for all $j \le i$, and the claim is proved, since $g_1(i) = g_1(i - 1)$ Thus, $\alpha(2^{d-1} - 1) \le 4^{d^d}$. This completes the proof of the lemma.

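Definition 1.1. Let $\mathbb{F}$ be a finite field and let $\chi$ be a nontrivial character of $\mathbb{F}$. The bias of a function $f : \mathbb{F}^n \to \mathbb{F}$ with respect to $\chi$ is defined to be $\operatorname{bias}_\chi(f) = \mathbb{E}_{x \in \mathbb{F}^n}[\chi(f(x))]$. (Here and elsewhere in the paper $\mathbb{E}_{x \in G} h(x)$ denotes $\frac{1}{|G|} \sum_{x \in G} h(x)$.)
Remark 1.2. Most of the previous work is on the case $\mathbb{F} = \mathbb{F}_p$ with $p$ a prime, in which case the standard definition of bias is $\operatorname{bias}(f) = \mathbb{E}_{x \in \mathbb{F}^n} \omega^{f(x)}$ where $\omega = e^{2\pi i/p}$.
Definition 1.3. Let $P$ be a polynomial $\mathbb{F}^n \to \mathbb{F}$ of degree $d$. The rank of $P$ (denoted $\operatorname{rank}(P)$)
4 from [3], $T$ is a tensor of order $d$. Moreover, by the same lemma, we have $T(y_1, \dots, y_d) = \sum_{S \subset [d]} (-1)^{d-|S|} P(x + \sum_{i \in S} y_i)$ for any $x \in \mathbb{F}^n$. Thus, $\operatorname{bias}(T) = \mathbb{E}_{y_1, \dots, y_d \in \mathbb{F}^n} \chi(T(y_1, \dots, y_d)) = \mathbb{E}_{y_1, \dots, y_d \in \mathbb{F}^n} \prod_{S \subset [d]} \mathcal{C}^{d-|S|} f(x + \sum_{i \in S} y_i)$ for any $x \in \mathbb{F}^n$. By averaging over all $x \in \mathbb{F}^n$, it follows that $\operatorname{bias}(T) = \|f\|_{U^d}^{2^d} \ge \epsilon^{2^d}$. Thus, $\operatorname{arank}(T) \le 2^d \log_{|\mathbb{F}|}(1/\epsilon)$. Note that $2^d \log_{|\mathbb{F}|}(1/\epsilon) \ge 1$. Therefore, by Theorem 1.10 with r = 1 and dense subsets $C_b \subset \mathbb{F}^{n_3}$ for each $b \in B$ such that $b \otimes C_b \subset A$. By Bogolyubov's lemma, $2B - 2B$ contains a dense subspace $V \subset \mathbb{F}^{n_1}$, and for every $b$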