Random restrictions of high-rank tensors and polynomial maps
- Algorithms and complexity group, Centrum Wiskunde & Informatica
- More about Jop Briët
- Department of Applied Mathematics and Theoretical Physics, University of Cambridge
- More about Davi Castro-Silva
Editorial introduction
Random restrictions of high-rank tensors and polynomial maps, Discrete Analysis 2024:9, 24 pp.
Let A be an n×n matrix of rank r. The following argument shows that if we randomly restrict A by picking each row and each column independently with positive probability σ, then the resulting submatrix will be likely to have rank at least τr for some τ>0 that depends only on σ. We note first that A has r independent columns. By standard probabilistic estimates, there is an exponentially small (in r) probability that the number s of columns chosen in our random selection is smaller than, say, σr/2. The restriction of the matrix to those columns has row-rank s, so we can find s linearly independent rows, and by standard arguments again there is an exponentially small probability that fewer than σs/2 of those rows are selected. So the probability that we do not end up with a matrix of rank at least σ2r/4 is exponentially small.
The aim of this paper is to obtain a similar conclusion for tensors. (A tensor can be thought of as a higher-dimensional matrix. For example, a 3-tensor over a field F is a 3-dimensional array of elements of F.) However, for tensors there are many different notions of rank, each one typically taking a property of matrix rank and generalizing it: it turns out that several properties that are equivalent for matrices have natural generalizations that are not equivalent for tensors.
There is another problem that turns out to be more serious. Suppose that we have fixed on a notion of tensor rank. The natural thing to try to do in order to generalize the result about matrices would be to prove a result similar to the result that the rank of a matrix equals its row rank and its column rank. However, this is false: for instance, there exists a 3-tensor with slice-rank 4 with no 4×4×4 subtensor of slice-rank 4. (Slice rank is a notion introduced by Tao that played a central role in a reformulation of the solution to the cap-set problem by Ellenberg and Gijswijt, building on work of Croot, Lev and Pach.)
Let T be a d-tensor over a field F, defined as a function from [n1]×⋯×[nd] to F, where [n] is shorthand for {1,2,…,n}. Given a subset E⊂[ni] define Ti,E to be the restriction of T to [n1]×⋯×[ni−1]×E×[ni+1]×⋯×[nd].
To obtain a generalization of the matrix statement, it would be sufficient to prove that if T is a d-tensor of rank r, then for any i≤d we can find a set Ei of size linear in r such that the restriction Ti,Ei has rank |Ei|. In fact, a somewhat weaker property would also be sufficient, which the authors call the linear core property. This states that one can find a t×⋯×t subtensor of T of rank at least s, where t≤CrkT and s≥crkT for some constants C,c>0.
Unfortunately, for many kinds of tensor rank, while it is conjectured that they have the linear core property, this has not been proved. The best result we have in that direction, due to Thomas Karam, is that we can at least obtain such a subtensor for values of t and s that depend only on the rank of T (for many different definitions of rank), but the bounds he obtains are exponential rather than linear.
For this reason, the authors of the paper choose a different approach. To deal with the multiplicity of definitions of tensor ranks, they prove a more abstract result that applies to all “natural” definitions, which are ones that satisfy three obvious properties. The first is subadditivity – that rk(S+T)≤rkS+rkT. The second is monotonicity – that the rank of any restriction Ti,Ei is at most the rank of T. And the third is a Lipschitz condition: that if E⊂F, then rkTi,F−rkTi,E≤|F∖E|.
The problem is then reduced to one of proving a suitable concentration result for subadditive monotone Lipschitz functions on the collection of subsets of [n], or equivalently on {0,1}n with respect to the biased measure where each element is chosen independently with probability σ. If such a function has maximum value r, then the concentration result the authors prove gives an exponential upper bound for the probability that the value of such a function is smaller than σr multiplied by a suitable constant.
There is a close connection between d-tensors and multivariable polynomials of degree d – indeed, associated with a d-tensor is a d-linear map of which the tensor provides the coefficients, and one can also obtain a (symmetric) d-linear map from a polynomial by differencing d times. It is thus natural to ask how well randomly restricting a multivariable polynomial to a dense set of variables preserves its rank (which can be defined as the rank of the associated d-linear map or d-tensor).
Sometimes it is possible to deduce results about polynomials directly from corresponding results about tensors and multilinear maps, but for this result it is more difficult. An issue is that there is no canonical way to represent even a homogeneous polynomial as a multilinear map. (Such a canonical representation does exist when the field has high characteristic, but the result proved here is field-independent.) One might instead consider trying to apply the abstract concentration inequality directly by looking at the function that takes a set E to the rank of the restriction of a given polynomial to E, but this function turns out not to be subadditive. Instead, the authors prove another abstract result, making use of ideas from the analysis of Boolean functions, this time concerning monotone Lipschitz functions that are not necessarily subadditive.
The original motivation for these results was a problem in complexity theory. Very roughly, the results here can be used to show that bounded-depth and bounded fanin circuits supplemented by mod-2 addition gates of arbitrary fanin are insufficiently powerful to decode corrupted error-correcting codes.