Improved $\ell^p$-Boundedness for Integral $k$-Spherical Maximal Functions

We improve the range of $\ell^p(\mathbb Z^d)$-boundedness of the integral $k$-spherical maximal functions introduced by Magyar. The previously best known bounds for the full $k$-spherical maximal function require the dimension $d$ to grow at least cubically with the degree $k$. Combining ideas from our prior work with recent advances in the theory of Weyl sums by Bourgain, Demeter, and Guth and by Wooley, we reduce this cubic bound to a quadratic one. As an application, we deduce improved bounds in the ergodic Waring--Goldbach problem.


Introduction
Our interest lies in proving $\ell^p(\mathbb Z^d)$-bounds for the integral $k$-spherical maximal functions when $k \ge 3$. These maximal functions are defined in terms of their associated averages, which we now describe. Define a positive definite function $\mathfrak f$ on $\mathbb R^d$ by $\mathfrak f(x) = \mathfrak f_{d,k}(x) := |x_1|^k + \cdots + |x_d|^k$, and note that when $x \in \mathbb R_+^d$, $\mathfrak f(x)$ is the diagonal form $x_1^k + \cdots + x_d^k$. For $\lambda \in \mathbb N$, let $R(\lambda)$ denote the number of integral solutions to the equation $\mathfrak f_{d,k}(x) = \lambda$ (1.1). We are interested in averages given by convolution with these measures: We know from the literature on Waring's problem that as $\lambda \to \infty$, one has the asymptotic where $\mathfrak S_{d,k}(\lambda)$ is a convergent product of local densities: $\mathfrak S_{d,k}(\lambda) = \prod_{p \le \infty} \mu_p(\lambda)$. Here $\mu_p(\lambda)$ with $p < \infty$ is related to the solubility of (1.1) over the $p$-adic field $\mathbb Q_p$, and $\mu_\infty(\lambda)$ to solubility over the reals. It is known that when $d$ is sufficiently large in terms of $k$, one has $1 \lesssim \mathfrak S_{d,k}(\lambda) \lesssim 1$ (1.3). In particular, these bounds hold for $d \ge 4k$ when $k \ge 4$ is a power of 2, and for $d \ge \frac32 k$ otherwise (see Theorems 4.3 and 4.6 in Vaughan [17]).
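To make the counting function $R(\lambda)$ concrete, here is a small brute-force sketch that enumerates the integral solutions of $|x_1|^k + \cdots + |x_d|^k = \lambda$ for tiny parameters. It is an illustration only and plays no role in the argument; the name `count_solutions` is ours, and exhaustive search is feasible only when $d$ and $\lambda$ are very small.

```python
from itertools import product


def count_solutions(d, k, lam):
    """Count x in Z^d with |x_1|^k + ... + |x_d|^k == lam by exhaustive search."""
    # Largest integer m with m**k <= lam; every coordinate lies in [-m, m].
    m = 0
    while (m + 1) ** k <= lam:
        m += 1
    coords = range(-m, m + 1)
    return sum(
        1
        for x in product(coords, repeat=d)
        if sum(abs(c) ** k for c in x) == lam
    )
```

For instance, with $d = 2$ and $k = 3$ the equation $|x_1|^3 + |x_2|^3 = 9$ has exactly the solutions $(\pm 1, \pm 2)$ and $(\pm 2, \pm 1)$, so the count is 8.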
Throughout the paper we use the notation $f(x) \lesssim g(x)$ or $g(x) \gtrsim f(x)$ to mean that there exists a constant $C > 0$ so that $|f(x)| \le C|g(x)|$ for all sufficiently large $x \ge 0$. The implicit constant $C$ may depend on 'inessential' or fixed parameters, but will be independent of $x$; below, the implicit constants will often depend on the parameters $k$, $d$ and $p$. For instance, (1.3) means that there exist positive constants $C_1$ and $C_2$, depending on $d$ and $k$, so that $C_1 \le \mathfrak S_{d,k}(\lambda) \le C_2$.
In view of (1.2) and (1.3), we may replace the convolution $\sigma_\lambda * f$ above by the average Variants of this maximal function were introduced by Magyar [9] and studied later in [11,10,8,2,6,7]. In particular, Magyar, Stein and Wainger [11] considered the above maximal function when $k = 2$ and $d \ge 5$ and proved that it is bounded on $\ell^p(\mathbb Z^d)$ when $p > \frac{d}{d-2}$. This result is sharp except at the endpoint, for which the restricted weak-type bound was proved later by Ionescu [8]. To the best of our knowledge, the sharpest results on the boundedness of $A_*$ for degrees $k \ge 3$ are those obtained by the third author in [7]. In the present paper, we give a further improvement.
We remark that when $d > d_0(k)$, $p_0(d,k)$ always lies in the range $(1,2)$. Our main result is the following.
Theorem 1. If $k \ge 3$, $d > d_0(k)$ and $p > p_0(d,k)$, then the maximal operator $A_*$, defined by (1.5), is bounded on $\ell^p(\mathbb Z^d)$: that is, $\|A_* f\|_{\ell^p(\mathbb Z^d)} \lesssim \|f\|_{\ell^p(\mathbb Z^d)}$.
Let $d_0^*(k) := 1 + d_0(k)$ denote the least dimension in which Theorem 1 establishes that $A_*$ is bounded on $\ell^2(\mathbb Z^d)$. We emphasize that for large $k$, we have $d_0^*(k) = k^2 - k + O(k^{1/2})$, whereas previous results, such as the work of the third author [6,7], required $d > k^3 - k^2$. While our results improve on these previous results by a factor of the degree $k$, the conjecture is that the maximal function is bounded on $\ell^2(\mathbb Z^d)$ for $d \gtrsim k$ (see [6] for details); such a result appears to be well beyond the reach of present methods.
It is also instructive to compare $d_0^*(k)$ to the known bounds for the function $\tilde G(k)$ in the theory of Waring's problem (defined as the least value of $d$ for which the asymptotic formula (1.2) holds). It transpires that the values of $d_0^*(k)$ match the best known upper bounds on $\tilde G(k)$ for all but a handful of small values of $k$, and even in those cases, we miss the best known bound on $\tilde G(k)$ only by a dimension or two. For an easier comparison, we list the numerical values of $d_0^*(k)$, $k \le 10$, their respective analogues in earlier work, and the bounds on $\tilde G(k)$ in Table 1. The earlier values are those of [6], and the known upper bounds on $\tilde G(k)$ are those in Bourgain [3] and Vaughan [16].
A key ingredient in the proof of Theorem 1 and its predecessors is an approximation formula generalizing (1.2). First introduced in [11] when $k = 2$, such approximations are obtained for the average's corresponding Fourier multiplier: where $\xi \in \mathbb T^d$ and $e(z) = e^{2\pi i z}$. We need to introduce some notation in order to state our approximation formula. Given an integer $q \ge 1$, we write $\mathbb Z_q = \mathbb Z/q\mathbb Z$ and $\mathbb Z_q^*$ for the group of units; we also write $e_q(x) = e(x/q)$. The $d$-dimensional Gauss sum of degree $k$ is defined as for $a \in \mathbb Z_q$ and $b \in \mathbb Z_q^d$. If $\Sigma_\lambda$ denotes the surface in $\mathbb R_+^d$ defined by (1.1) and $dS_\lambda(x)$ denotes the induced Lebesgue measure on $\Sigma_\lambda$, we define a continuous surface measure $d\sigma_\lambda(x)$ on $\Sigma_\lambda$ by We note that $d\sigma_\lambda$ is essentially a probability measure on $\Sigma_\lambda$ for all $\lambda$. We also fix a smooth bump function $\psi$, which is 1 on − where the error terms $E_\lambda$ are the multipliers of convolution operators satisfying the dyadic maximal inequality for each $\Lambda \ge 1$ and all sufficiently small $\delta > 0$.
Our Approximation Formula takes the same shape as those in [6] and [7], but with an improved error term that relies on two recent developments:
• the improvement of the underlying analytic methods in the authors' previous work [1];
• the recent resolution of the main conjecture about Vinogradov's mean value integral [19,4] and related refinements of classical mean value estimates [18,3].
The most dramatic improvement, which extends the range of $\ell^2(\mathbb Z^d)$-boundedness by a factor of the degree $k$, follows from our improved analytic methods originating in [1]. In [7], this sort of improvement (which also used the recent resolution of the Vinogradov mean value theorems [18,3]) was limited to maximal functions over sufficiently sparse sequences. Here, our bounds supersede those for integral $k$-spherical maximal functions over sparse sequences in [7] because our treatment of the minor arcs in the error term is more efficient. The reader may compare Lemmas 3.2 and 3.1 below to Lemmas 2.1 and 2.2 of [7] to gauge the efficacy of our method here. Consequently, [18] and [3] allow us to further improve slightly upon a more direct application of the Vinogradov mean value theorems from [19] and [4]. One minor drawback is that in our method the $\varepsilon$-losses in [19,4] do not allow us to deduce endpoint bounds.
As an application, we deduce that the maximal function of the "ergodic Waring--Goldbach problem" introduced in our recent work [1] is bounded on the same range of spaces as above. That maximal function is associated with averages where, instead of sampling over integer points, we sample over points where all coordinates are prime. To be precise, let $R^*(\lambda)$ denote the number of prime solutions to the equation (1.1) weighted by logarithmic factors: that is, where $\mathbf 1_{\mathbb P^d}$ is the indicator function of vectors $x \in \mathbb Z^d$ with all coordinates prime. When $R^*(\lambda) > 0$, define the normalized arithmetic surface measure and the respective convolution operators Similarly to (1.2), we know that as $\lambda \to \infty$, one has the asymptotic where $\mathfrak S^*_{d,k}(\lambda)$ is a product of local densities similar to $\mathfrak S_{d,k}(\lambda)$ above. Moreover, when $d > 3k$ and $\lambda$ is restricted to a particular arithmetic progression $\Gamma_{d,k}$, we have $1 \lesssim \mathfrak S^*_{d,k}(\lambda) \lesssim 1$, and the above estimate turns into a true asymptotic formula for $R^*(\lambda)$ (see the introduction and references in [1]). By Theorem 6 of [1] and Theorem 1 above, we immediately obtain the following result. Here, as in [1], $d_1(3) = 13$ and $d_1(k) = k^2 + k + 3$.
As another application one may give analogous improvements of the ergodic theorems obtained in [6], but we do not consider this here.
To establish our theorems, we follow the paradigm in [11] and strengthen the connection to Waring's problem as initiated in [7] by using a lemma from [1]; we then use recent work on Waring's problem to obtain improved bounds. We remark that [15] and [12] previously connected mean values (Hypothesis $K^*$ and Vinogradov's mean value theorems, respectively) to discrete fractional integration. In Section 2, we outline the proofs of Theorems 1 and 2; we recall some results from [6,9] and state the key propositions required in the proofs. The remaining sections establish the propositions. In Section 3 we deal with the minor arcs; in particular, in Section 3.2, we use the recent work of Bourgain, Demeter and Guth [4] on Vinogradov's mean value theorem and a method of Wooley [18] for the estimation of mean values over minor arcs. In Section 4, we establish the relevant major arc approximations. Finally, in Section 5, we establish the boundedness of the maximal function associated with the main term in the Approximation Formula.

Outline of the Proof
Since $A_*$ is trivially bounded on $\ell^\infty(\mathbb Z^d)$, we may assume throughout the rest of the paper that $p \le 2$. Fix $\Lambda \in \mathbb N$ and consider the dyadic maximal operator When $\lambda \le \Lambda$, we have This representation allows us to decompose $A_\lambda$ into operators of the form for various measurable sets $B \subseteq \mathbb T$.
Our decomposition of $A_\lambda$ is inspired by the Hardy--Littlewood circle method. When $q \in \mathbb N$ and $0 \le a \le q$, we define the major arc $\mathfrak M(a/q)$ by We then decompose $\mathbb T$ into sets of major and minor arcs, given by Since the major arcs $\mathfrak M(a/q)$ are disjoint, this yields a respective decomposition of $A_\lambda$ as . We will use the notations $A_*^B$ and $A_\Lambda^B$ to denote the respective maximal functions obtained from the operators $A_\lambda^B$. For example, from (2.1) and the trivial bound for the trigonometric polynomial $h_\Lambda(\theta) * f$, we obtain the trivial $\ell^1$-bound In Section 3, we analyze the minor arc term and prove the following result.

For $1 < p \le 2$, interpolation between (2.2) and (2.3) yields
The estimation of the major arc terms is more challenging, because an analogue of (2.3) does not hold for $A_\Lambda^{\mathfrak M}$. Still, it is possible to establish a slightly weaker version of Proposition 2.1. The following result was first established by Magyar [9] for $d \ge 2^k$, and then extended by the third author [6] to the present form.
This proposition suffices to establish the $\ell^p$-boundedness of the dyadic maximal functions $A_\Lambda$ (this is the main result of Magyar [9]), but falls just short of what is needed for an equally quick proof of Theorem 1. In Section 4, we will handle the major arc approximations and prove the following proposition.
When we sum (2.6) over dyadic $\Lambda = 2^j$, we deduce that Combining this bound and (2.4), we conclude that the $\ell^p$-boundedness of the maximal operator $A_*$ follows from the $\ell^p$-boundedness of $M_*$. The following proposition, which we establish in Section 5, then completes the proof of Theorem 1.

Minor Arc Analysis
Our minor arc analysis splits naturally in two steps. The first step is a reduction to mean value estimates related to Waring's problem; for this we use a technique introduced in [1]. We then apply recent work on Waring's problem to estimate the relevant mean values and to prove Proposition 2.1.

Reduction to mean value theorems
The reduction step is based on the following lemma, a special case of Lemma 7 in [1]. In the present form, the result is a slight variation of Lemma 4.2 in [6] and is implicit also in [11]. where $B \subseteq \mathbb T$ is a measurable set and $K(\cdot;\xi) \in L^1(\mathbb T)$ is a kernel independent of $\lambda$. Further, for $\Lambda \ge 2$, define the dyadic maximal functions Thus, in the proof of Proposition 2.1, we apply (3.1) with $K = F_N$ and $B = \mathfrak m$. The supremum over $\xi$ on the right side of (3.1) then stands in the way of a direct application of known results from analytic number theory. Our next lemma overcomes this obstacle; its proof is a variant of the argument leading to (12) in Wooley [18]. We have $S_N(\theta,\xi)^s = \sum_{h_1 \le H_1} \cdots \sum_{h_l \le H_l} a_h(\theta) e(\xi h_1)$, so by applying the Cauchy--Schwarz inequality we deduce that Hence, We have h l ≤H l δ(n; h)δ(m; h).

Mean value theorems
We now recall several mean value estimates from the literature on Waring's problem. The first is implicit in the proof of Theorem 10 of Bourgain [3], which is a variant of a well-known lemma of Hua (see Lemma 2.5 in [17]). The present result follows from eqn. (6.6) in [3]. We note that when $l = k$, the left side of (3.6) turns into Vinogradov's integral $J_{s,k}(N)$ and the lemma turns into the main result of Bourgain, Demeter and Guth [4]. For small $k$, we will use another variant of Hua's lemma due to Brüdern and Robert [5]. The following is a weak form of Lemma 5 in [5]. (3.7) Note that by Remark 3.1 (with $l - 1$ in place of $l$) and Lemma 3.3 we obtain a version of (3.7) with $l(l+1)$ in place of $2^l + 2$. Together with Lemma 3.4, this observation yields the following bound. (3.8) We also use a variant of Lemma 3.3 that provides extra savings when the integration over $\theta$ is restricted to a set of minor arcs. Lemma 3.5, a slight modification of Theorem 1.3 in Wooley [18], improves on (3.6) in the case $l = k - 1$. Here, $\mathfrak m$ is the set of minor arcs defined at the beginning of §2.

Now we interpolate between the above bounds.
Lemma 3.6. If $k \ge 3$ and $2 \le l \le k - 1$ are natural numbers and $r$ is real, with where $\delta(r)$ is the linear function of $r$ with values $\delta(r_1) = 0$ and $\delta(k^2 + k) = 1$.
Proof. Let $r_0 = \min\{l^2 + l, 2^l + 2\}$. The hypothesis on $r$ implies that $r_0 < r \le k^2 + k$, so we can find $t \in [0,1)$ such that $r = tr_0 + (1-t)(k^2+k)$. By Lemma 3.2 with $l = 1$ and Corollary 3.1, On the other hand, Lemma 3.2 with $l = k - 2$ and Lemma 3.5 yield Using Hölder's inequality and the above bounds, we get This inequality takes the form (3.11) with $\delta(r) = 1 - t(k - l + 1)$. Since $t$ depends linearly on $r$ and $t = 0$ when $r = k^2 + k$, $\delta(r)$ is a linear function of $r$ with $\delta(k^2+k) = 1$. The value of $r_1(k,l)$ in (3.10) is the unique solution of the linear equation $\delta(r) = 0$.
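The last step of the proof can be unpacked as follows. This is a worked computation that uses only the two relations stated in the proof, namely $r = tr_0 + (1-t)(k^2+k)$ and $\delta(r) = 1 - t(k-l+1)$; the closed form for the zero of $\delta$ is our own rearrangement and should be read as a consistency check against (3.10), not as a restatement of it.

```latex
\begin{align*}
  t &= \frac{k^2 + k - r}{k^2 + k - r_0},
  \qquad
  \delta(r) = 1 - \frac{(k - l + 1)\,(k^2 + k - r)}{k^2 + k - r_0}, \\[4pt]
  \delta(r) = 0
  &\iff t = \frac{1}{k - l + 1}
  \iff r = k^2 + k - \frac{k^2 + k - r_0}{k - l + 1}.
\end{align*}
```

Since $l \le k - 1$, we have $r_0 \le l^2 + l < k^2 + k$ and $1/(k - l + 1) \le 1/2$, so the value of $t$ above lies in $[0,1)$ as required. For example, when $k = 3$ and $l = 2$ we have $r_0 = \min\{6, 6\} = 6$, and the zero of $\delta$ is $r = 12 - 6/2 = 9$.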

Proof of Proposition 2.1
By Lemma 3.1 and the arithmetic-geometric mean inequality, (3.14) where $I_{d,k}(N)$ is the integral defined in (3.11). Thus, the proposition will follow if we prove the inequality Let $l_0(k)$ denote the value of $l$ for which the maximum in the definition of $d_0(k)$ is attained (recall (1.6)). When $d_0(k) < d \le k^2 + k$, we may apply (3.11) with $r = d$ and $l = l_0$ to deduce (3.15) with $\delta = \delta(d)$. We now observe that when $d \le k^2 + k$, we have $\delta(d) = k\delta_0(d,k)$ and that the hypothesis $d > d_0(k)$ ensures that $\delta(d) > 0$.
When $d > k^2 + k$, we enhance our estimates with the help of the $L^\infty$-bound for $S_N(\theta,\xi)$ on the minor arcs: by combining a classical result of Weyl (see Lemma 2.4 in Vaughan [17]) and Theorem 5 in Bourgain [3], we have sup $\tau_k$ being the quantity that appears in the definition of $\delta_0(d,k)$. Thus, when $d > k^2 + k$, we have We conclude that (3.15) holds with $\delta = 1 + (d - k^2 - k)\tau_k$.

Major Arc Analysis
We will proceed through a series of successive approximations to $A_\lambda^{a/q}$, which we will define by their Fourier multipliers. Our approximations are based on the following major arc approximation for exponential sums that appears as part of Theorem 3 of Brüdern and Robert [5]. In this result and throughout the section, we write $G(q;a,b) := q^{-1} \sum_{x \in \mathbb Z_q} e_q(ax^k + bx)$ and $v_N(\theta,\xi) := \int_0^N e(\theta t^k + \xi t)\,dt$.

Proof of Proposition 2.3
The bulk of the work concerns the case $p = 2$ of the proposition. Since with $b_j$ the unique integer such that $-\frac12 \le b_j - q\xi_j < \frac12$ and $\eta_j = \xi_j - b_j/q$. Let $B_\lambda^{a/q}$ denote the operator on $\ell^2(\mathbb Z^d)$ with the above Fourier multiplier. To estimate the $\ell^2$-error of approximation of $A_\lambda^{a/q}$ by $B_\lambda^{a/q}$, we use that when $\theta \in \mathfrak M(0/q)$, (4.1) and (4.2) yield By the localization of $\psi$, the above sum has at most one term in which $b$ matches the integer vector that appears in the definition of $G_N(\theta;\xi)$. Hence, is supported on a set where $\frac18 \le |q\xi_j - b_j| \le \frac12$ for some $j$. For such $j$, by (4.3), and we conclude that where $wb = (w_1 b_1, \dots, w_d b_d)$ and We remark that $C_\lambda^{a/q}(\xi)$ can be expressed in a matching form, with $J_\lambda(\xi - q^{-1}wb)$ replaced by the analogous integral over $\mathfrak M(0/q)$. Thus, when $d > k$, we deduce from Lemma 3.1 and (4.2) that Here, $\mathfrak M(0/q)^c$ denotes the complement of the interval $\mathfrak M(0/q)$ in $\mathbb R$.
Finally, we note that $D_\lambda^{a/q}$ is really $M_\lambda^{a/q}$. Indeed, by the discussion on p. 498 in Stein [14] (see also Lemma 5 in Magyar [10]), we have since the surface measure $d\sigma_\lambda$ is supported in the cube $[0,N]^d$. Combining this observation and (4.4)-(4.6), and summing over $a$, $q$, we conclude that when $d > 2k$, where $d\sigma$ is the surface measure on the smooth manifold $x_1^k + \cdots + x_d^k = 1$. By Rubio de Francia's theorem, the maximal function Thus, the lemma will follow if we establish (5.2) with $a = d/k - 1$ and $d > k + 1$, because $a > 1/2$ once $d > \frac32 k$. We now turn to (5.2). Similarly to (4.7), we have By the corollary on p. 334 of Stein [14], we have uniformly in $\xi$. On the other hand, if $k2^k|\theta| \le |\xi|$, we have on the support of $\phi$. Hence, Proposition VIII.1 on p. 331 of Stein [14] yields for any fixed $M \ge 1$. We now choose an index $i$, $1 \le i \le d$, such that $|\xi| \le d|\xi_i|$ and set $\theta_0 = |\xi_i|/(k2^k)$. We apply (5.3) to the trigonometric integrals $v_\phi(\theta,\xi_j)$, $j \ne i$, and to $v_\phi(\theta,\xi_i)$ when $|\theta| > \theta_0$; we apply (5.4) to $v_\phi(\theta,\xi_i)$ when $|\theta| \le \theta_0$. From these bounds and the integral representation for $\mu(\xi)$, we obtain $\mu(\xi)$ for all $\frac{d}{d-k} < p \le 2$ and $d > k$. The proposition then follows by summing over $a$ and $q$ (the hypothesis on $p$ ensures that the resulting series over $q$ is convergent).
Fix $q \in \mathbb N$ and $a \in \mathbb Z_q^*$ and write $\psi_1(\xi) = \psi(\xi/2)$ (so that $\psi = \psi\psi_1$). We borrow a trick from Magyar, Stein and Wainger [11] to express $M_\lambda^{a/q}(\xi)$ as a linear combination of Fourier multipliers that separate the dependence on $\lambda$ from the dependence on $a/q$: Since $w$ takes on precisely $2^d$ values for each $a/q$, it suffices to prove that uniformly for $w \in \{-1,1\}^d$. To prove (5.5), we will first bound the maximal function over the 'Archimedean' multipliers $T^q_{\lambda,w}$, and then we will bound the non-Archimedean multipliers $S^{a/q}_w$. This is possible because $S^{a/q}_w$ is independent of $\lambda \in \mathbb N$. with an implicit constant independent of $q$ and $w$. We now observe that $S^{a/q}_w$ does not depend on $\lambda$ and apply (5.6) with $g = S^{a/q}_w f$ to find that sup λ∈N T q λ,w • S a/q under the same assumptions on $d$ and $p$, which we note are weaker than the hypotheses of the proposition. Finally, observe that $G(q;a,b)$ is a $q$-periodic function with $\mathbb Z_q$-Fourier transform equal to $\sum_{b \in \mathbb Z_q} e_q(-mb)\,G(q;a,b) = q^{-1} \sum_{x \in \mathbb Z_q} e_q(ax^k) \sum_{b \in \mathbb Z_q} e_q(b(x-m)) = e_q(am^k)$ for each $m \in \mathbb Z_q$. Hence, we may apply Proposition 2.2 in [11] and the bound (4.2) for $G(q;a,b)$ to deduce that S a/q The desired inequality (5.5) follows immediately from (5.7) and (5.8).
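The finite Fourier transform identity for $G(q;a,b)$ used in the last step is easy to test numerically. The sketch below evaluates both sides for the normalized one-dimensional Gauss sum $G(q;a,b) = q^{-1}\sum_{x \in \mathbb Z_q} e_q(ax^k + bx)$ and reports the largest discrepancy over $m \in \mathbb Z_q$. The function names are ours and the check is only a floating-point sanity test for small $q$, not a substitute for the orthogonality argument.

```python
import cmath


def e_q(x, q):
    """e_q(x) = exp(2*pi*i*x/q), with x reduced mod q first."""
    return cmath.exp(2j * cmath.pi * (x % q) / q)


def gauss_sum(q, a, b, k):
    """Normalized Gauss sum G(q; a, b) = q^{-1} * sum_x e_q(a x^k + b x)."""
    return sum(e_q(a * pow(x, k, q) + b * x, q) for x in range(q)) / q


def fourier_side(q, a, m, k):
    """Left-hand side: sum over b in Z_q of e_q(-m b) * G(q; a, b)."""
    return sum(e_q(-m * b, q) * gauss_sum(q, a, b, k) for b in range(q))


def max_discrepancy(q, a, k):
    """Largest |fourier_side - e_q(a m^k)| over all residues m mod q."""
    return max(
        abs(fourier_side(q, a, m, k) - e_q(a * pow(m, k, q), q))
        for m in range(q)
    )
```

The identity holds for every $a \in \mathbb Z_q$, since the inner sum over $b$ is $q$ when $x \equiv m \pmod q$ and vanishes otherwise; the coprimality of $a$ and $q$ is only needed later, for the bound (4.2).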