Decomposition of random walk measures on the one-dimensional torus

The main result of this paper is a decomposition theorem for a measure on the one-dimensional torus. Given a sufficiently large subset $S$ of the positive integers, an arbitrary measure on the torus is decomposed as the sum of two measures. The first one $\mu_1$ has the property that the random walk with initial distribution $\mu_1$ evolved by the action of $S$ equidistributes very fast. The second measure $\mu_2$ in the decomposition is concentrated on very small neighborhoods of a small number of points.


Introduction
This paper is concerned with the dynamics of subsemigroups of the positive integers acting on the one-dimensional torus $\mathbb{R}/\mathbb{Z}$. This extensive line of research goes back to Furstenberg, who showed that every closed subset of the torus that is invariant under the semigroup generated by two multiplicatively independent integers is either finite or the whole torus.
Furstenberg also made several conjectures about such actions, which had an enormous impact on the field. Perhaps the most prominent of these asks for a classification of invariant measures on the torus under the action of the semigroup generated by 2 and 3 (or any other pair of multiplicatively independent integers). There has been some remarkable progress on this problem, but the conjecture is still wide open.
These problems become more manageable if one considers the action of "larger" semigroups. For example, Einsiedler and Fish gave a classification of invariant measures under the action of a semigroup with positive logarithmic density.
The main result of this paper is a decomposition theorem for a measure on the torus. Given a "sufficiently large" subset $S$ of the positive integers, an arbitrary measure on the torus is decomposed as the sum of two measures. The first one, $\mu_1$, has the property that the random walk with initial distribution $\mu_1$ evolved by the action of $S$ equidistributes very fast. The second measure $\mu_2$ in the decomposition is concentrated on very small neighborhoods of a small number of points.
The proof of the main result uses tools from additive combinatorics and builds on the work of Bourgain, Furman, Lindenstrauss and Mozes on the classification of stationary measures under the action of non-commuting toral automorphisms.
We define the general setting as follows: for $L > 0$, let $S \subset [L, 2L]$ be a set of natural numbers with $|S| > L^{\beta}$ for some $0 < \beta < 1$, and assume that $S$ is $(C, \lambda)$-regular (the definition follows). The quantities $L$, $\beta$ and $C, \lambda > 0$ should be considered as global parameters; the theorems, propositions and lemmas refer to them, typically by imposing thresholds on their values in the hypotheses. All the measures in this paper are Borel measures, with the topology of the underlying space clear from the context. For countable spaces such as $\mathbb{N}$ the topology is the discrete one. In a minor abuse of terminology we refer to the Haar measure on $\mathbb{T}$ as the Lebesgue measure on $\mathbb{T}$.
For a non-empty finite set $S \subset \mathbb{N}$ we let $\nu_S = \frac{1}{|S|}\sum_{s \in S}\delta_s$ be the measure that averages over $S$. The set $S$ acts on the torus in the following standard way: $s.x = sx \pmod 1$ for $s \in S$. For $s \in S$, let $T_s : \mathbb{T} \to \mathbb{T}$ be the mapping $T_s(x) = s.x$. We denote by $P(\cdot)$ the space of Borel probability measures on a topological space. For $\mu \in P(\mathbb{T})$ and $\nu \in P(S)$, define the measure $\nu * \mu \in P(\mathbb{T})$ by $\nu * \mu = \sum_{s \in S}\nu(s)\, T_{s*}\mu$, where $T_{s*}\mu(E) = \mu(T_s^{-1}(E))$ for a Borel set $E \subset \mathbb{T}$.
The following definition says what it means for a set $S$ to be $(C, \lambda)$-regular.
Definition 1. Let $C, \lambda > 0$. We say that a set $S \subset [L, 2L] \subset \mathbb{N}$ is $(C, \lambda)$-regular at scale $r$, where $r$ is a positive real number, if
for any interval $I \subset [L, 2L] \subset \mathbb{R}$ with $|I| \ge r$. By $|\cdot|$ we denote cardinality or the Lebesgue measure, according to context.
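The action of $S$ and the convolution $\nu * \mu$ defined above can be tried out on finitely supported measures. The following is a minimal sketch, not taken from the paper: the names `fourier_coeff` and `walk_step`, the sample set `S` and the sample measure `mu` are all illustrative assumptions. It performs one step of the walk with the uniform measure on $S$ and checks numerically the standard identity $\widehat{\nu_S * \mu}(a) = \frac{1}{|S|}\sum_{s \in S}\hat\mu(sa)$, using the sign convention $e_a(x) = e^{-2\pi i a x}$.

```python
import cmath
from fractions import Fraction

def fourier_coeff(atoms, a):
    """Fourier coefficient of an atomic measure {point: weight} at frequency a,
    i.e. the sum of weight * exp(-2*pi*i*a*point)."""
    return sum(w * cmath.exp(-2j * cmath.pi * a * float(x)) for x, w in atoms.items())

def walk_step(atoms, S):
    """One step of the walk: the convolution of the uniform measure on S with the
    atomic measure, where s acts by x -> s*x mod 1."""
    out = {}
    for x, w in atoms.items():
        for s in S:
            y = (s * x) % 1                       # the action s.x on R/Z
            out[y] = out.get(y, 0) + w / len(S)   # each s carries weight 1/|S|
    return out

# Hypothetical sample data (assumptions, chosen only for illustration).
S = [5, 6, 7, 9]
mu = {Fraction(1, 7): Fraction(1, 2), Fraction(3, 11): Fraction(1, 2)}

nu_mu = walk_step(mu, S)
a = 3
lhs = fourier_coeff(nu_mu, a)                                   # coefficient after one step
rhs = sum(fourier_coeff(mu, s * a) for s in S) / len(S)         # average of dilated coefficients
print(abs(lhs - rhs))
```

Exact rational points are used so that the reduction modulo 1 introduces no rounding; only the evaluation of the exponential is done in floating point.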
If we say that a set is $(C, \lambda)$-regular, we mean that it is $(C, \lambda)$-regular at scale 1.
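Regularity can be explored numerically on small examples. The sketch below rests on an assumption: it takes the regularity condition to have the form $|S \cap I| \le C\,(|I|/L)^{\lambda}\,|S|$ for all intervals $I \subset [L, 2L]$ with $|I| \ge r$, which is the usual shape of such conditions but is not confirmed by the text above. Under that assumption, the hypothetical helper `regularity_constant` returns the smallest admissible $C$ for a given finite set, assuming also $S \subset [L, 2L]$ and $0 < r \le L$.

```python
def regularity_constant(S, L, lam, r=1.0):
    """Smallest C with |S ∩ I| <= C * (|I| / L)**lam * |S| for every interval
    I in [L, 2L] with |I| >= r.  The form of the inequality is an ASSUMPTION."""
    pts = sorted(S)
    n = len(pts)
    best = 0.0
    for i in range(n):
        for j in range(i, n):
            # The shortest admissible interval containing pts[i..j] has this length.
            length = max(r, pts[j] - pts[i])
            count = j - i + 1
            best = max(best, count / ((length / L) ** lam * n))
    return best

L = 100
spread_out = range(100, 201)   # every integer in [L, 2L]
clustered = range(100, 110)    # ten consecutive integers at the left end
print(regularity_constant(spread_out, L, 1.0))
print(regularity_constant(clustered, L, 1.0))
```

The quadratic loop over pairs of points is enough because, for a fixed run of consecutive points of $S$, the ratio is largest for the shortest admissible interval containing them. A set clustered in a short window needs a much larger constant than a set spread over all of $[L, 2L]$.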
We are ready to state the main theorem of this work, which is a decomposition theorem for a measure on the torus. In the statement, $\mu_1, \mu_2$ are non-negative Borel measures on $\mathbb{T}$, and $B_{x,r}$ denotes the open ball in $\mathbb{T}$ with center $x$ and radius $r$. A finite set $X \subset \mathbb{T}$ is $\delta$-separated if $d(x, y) > \delta$ for any two distinct points $x, y \in X$, where $d(\cdot, \cdot)$ is the usual metric on $\mathbb{T}$.

(1.4)
and there are finite subsets $X_1, X_2, \dots, X_l$ of $\mathbb{T}$, where $l < L^{C\tau}$, such that each $X_i$ is $\frac{1}{M}$-separated, where $N = L^{U}$ and $M = N^{1-\kappa}$, and $\mu_1$ is supported on the complement of the support of $\mu_2$.
The usefulness of this theorem is due to the fact that $M < N$ in a well-controlled manner. This implies a granulation phenomenon on the support of $\mu_2$, meaning that $\mu_2(\mathbb{T})$ is larger, in a controlled manner, than the Lebesgue measure of the support of $\mu_2$. In addition, (1.4) effectively describes $\mu_1$ as being "close to uniform" over its support.
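The $\delta$-separation condition used for the sets $X_i$ is easy to test on explicit point sets. The sketch below is illustrative only: the helper names are hypothetical, and the last function assumes that the concentrated part of the measure lives on balls of a common radius around the given points, in which case the Lebesgue measure of its support is at most twice the radius times the number of points.

```python
def torus_dist(x, y):
    """Usual metric on T = R/Z: distance from x - y to the nearest integer."""
    d = abs(x - y) % 1.0
    return min(d, 1.0 - d)

def is_separated(X, delta):
    """True if all pairs of distinct points of X are at torus distance > delta."""
    pts = list(X)
    return all(
        torus_dist(pts[i], pts[j]) > delta
        for i in range(len(pts))
        for j in range(i + 1, len(pts))
    )

def neighborhood_measure_bound(num_points, radius):
    """Upper bound for the Lebesgue measure of a union of num_points balls of the
    given radius in T (ASSUMED shape of the support of the concentrated part)."""
    return min(1.0, 2.0 * radius * num_points)

X = [0.0, 0.25, 0.5, 0.75]
print(is_separated(X, 0.2), is_separated(X, 0.25))
print(neighborhood_measure_bound(len(X), 0.001))
```

Note that the separation inequality is strict, so four equally spaced points are $0.2$-separated but not $0.25$-separated.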
In a follow-up paper we intend to show how Theorem 2 can be used to prove effective equidistribution results in this context.

Preliminaries from additive combinatorics
The following inequality is due to Ruzsa. It is surprisingly useful, given how simple it is to prove.
We will need the following graph-theoretic result, closely connected to the Balog-Szemerédi-Gowers Theorem, due to Sudakov, Szemerédi and Vu.
3. for each $a \in A'$ and $b \in B'$, there are at least $n^2/(2^{12}K^5)$ paths of length 3 whose two endpoints are $a$ and $b$.
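For orientation, Ruzsa's inequality from the start of this section is usually stated as the triangle inequality $|A - C|\,|B| \le |A - B|\,|B - C|$ for finite sets of integers; this usual form is assumed here, and the paper's own formulation may differ in presentation. A minimal sketch that checks it on a sample triple (the helper names and the sample sets are illustrative assumptions):

```python
def diff_set(A, B):
    """The difference set A - B = {a - b : a in A, b in B}."""
    return {a - b for a in A for b in B}

def ruzsa_triangle_holds(A, B, C):
    """Check |A - C| * |B| <= |A - B| * |B - C| for finite sets of integers."""
    return len(diff_set(A, C)) * len(B) <= len(diff_set(A, B)) * len(diff_set(B, C))

# Hypothetical sample sets.
A = {0, 1, 2, 3}
B = {0, 5, 10}
C = {1, 2, 4, 8}
print(len(diff_set(A, C)), len(diff_set(A, B)), len(diff_set(B, C)))
print(ruzsa_triangle_holds(A, B, C))
```

The inequality is typically used in the direction shown later in the text: control of two difference sets that share a common set $B$ gives control of the third.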
We need two pieces of notation for the statement of the next lemma. They will be used throughout the paper.
Definition 6. Given $\mu \in P(\mathbb{T})$ and a real number $\delta \ge 0$, let
The following lemma allows us to extract from an initial set of high Fourier coefficients a set of relatively large Fourier coefficients that is stable with respect to subtraction. The sets are in a window around 0 and are viewed at some resolution $M$. The proof uses a counting argument which involves the graph-theoretic Lemma 4 as well as the algebraic nature of the ring of integers.
Then there exists a subset $A_1 \subset A_0$ such that
Proof. By passing to a subset $A \subset A_0$ of size $|A| \ge |A_0|/4$ we may assume that $\operatorname{Re}(e^{i\theta}\hat\mu(a)) > \delta/2$ for some fixed $\theta \in [0, 2\pi)$ and for every $a \in A$. Therefore
where $e_a(x)$ stands for $e^{-2\pi i a x}$. Note that
We have that $\sum_{a,b \in A}$
where the inequality is due to the Cauchy-Schwarz inequality. Therefore
By Lemma 14 we have that
Then $a - b$ can be written as
in at least |Ā| 2 δ 2 2 37 ways with all $a - b_1, a_1 - b_1, a_1 - b \in H$, and so by (2.12)
And so (using (2.10))
Definition 8. We denote by $V(y, \rho)$ the $\rho$-neighborhood of $y \in P^1$. Formally it is:
The following theorem is a projection theorem by Bourgain, which can be found in [3].
Let $E \subset [0, 1]^2$ be an $r$-separated set with $|E| > r^{-\alpha}$ and a non-concentration property
Then there exist $D \subset P^1$ and $E' \subset E$ with
is the orthogonal projection of $U$ on the subspace of $\mathbb{R}^2$ spanned by some representative of $\theta$.

Regularity of sets
(3.1)
A set $B$ is said to be $(C, \alpha)$-regular at scale $r$ if the corresponding uniform measure $\rho = \frac{1}{|B|}\sum_{x \in B}\delta_x$ is $(C, \alpha)$-regular at scale $r$.
The following lemma [4, Lemma 5.2] relates the defined notion of regularity of a set to the expression of dimension via covering numbers.
then there is a point $x \in B_{0,1}$ and a probability measure is the following
Lemma 12. Let $\rho$ be a $(C, \alpha)$-regular probability measure at scale $r$ on $B \subset \mathbb{R}^d$. Then for any $\varepsilon > 0$ there is an $r$-separated subset $A \subset \operatorname{supp}(\rho)$ and $C_\varepsilon > 0$ such that the uniform measure on $A$ is $(C_\varepsilon, \alpha - \varepsilon)$-regular at scale $r$ on $B$.
We will be using the following lemma. Lemmas 11 and 12 are used in its proof.
Lemma 13 ([4, Lemma 6.7], one-dimensional torus). For any $\varepsilon > 0$, there is a $C_\varepsilon$ so that the following holds. Let $\mu$ be a probability measure on $\mathbb{T}$. Assume that for some $N > M$, $t$, $\alpha$
Then there is an $M < N_1 < N$ with

Main Bootstrapping Lemma
The following lemma is very simple but it is employed over and over again in the main lemmas, making its explicit statement and proof worthwhile.
Since each $a_i$ can be at most 1, we have that
This is a contradiction.
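The statement of Lemma 14 is not reproduced above, so the following sketch assumes that it is the standard averaging (pigeonhole) fact suggested by its proof: if $a_1, \dots, a_n \in [0, 1]$ have average at least $\delta$, then at least $\delta n/2$ of them are at least $\delta/2$. Otherwise the sum would be smaller than $\frac{\delta n}{2} \cdot 1 + n \cdot \frac{\delta}{2} = \delta n$. The function names and the sample data are hypothetical.

```python
def count_large(values, delta):
    """Number of entries that are at least delta / 2."""
    return sum(1 for v in values if v >= delta / 2)

def averaging_bound_holds(values, delta):
    """Check the ASSUMED form of the averaging lemma: if all values lie in [0, 1]
    and their mean is at least delta, then at least delta * n / 2 of them are
    at least delta / 2."""
    n = len(values)
    if not all(0 <= v <= 1 for v in values):
        raise ValueError("values must lie in [0, 1]")
    if sum(values) < delta * n:
        return True  # hypothesis not satisfied, nothing to check
    return count_large(values, delta) >= delta * n / 2

# Hypothetical sample data: a few large entries carry the whole average.
values = [1.0, 0.9, 0.0, 0.0, 0.3, 0.0, 0.0, 0.0]
print(count_large(values, 0.2), averaging_bound_holds(values, 0.2))
```

This is the mechanism by which a lower bound on an average over $S$ or over a set of frequencies is converted, throughout the proofs, into a lower bound on the number of individually large terms.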
The following lemma extracts an initial set of large Fourier coefficients from the information that the random walk measure has a single large Fourier coefficient.
Lemma 15 (Initial dimension). For any probability measure $\mu$ on $\mathbb{T}$ and $n \ge 1$, if for some $a \in \mathbb{Z}$, $|\hat\mu_n(a)| > \delta_0$ (4.5)
Proof. Note the equality
By the above, we have that
By Lemma 14
For the next lemma we need the following simple definition.
The following is the main technical tool of the proof of our main decomposition theorem, Theorem 2. We either find a large set of Fourier coefficients by regarding a smaller threshold on the coefficients as "large", or we look at the random walk measure of the previous generation; the assumption that no set meeting our requirements exists is combined with the additive structure of the Fourier coefficients to derive two contradicting inequalities.
(4.11) and if
Proof. Let $\alpha_\Delta$ be as in Theorem 9 for $\alpha_0 = \min(\alpha_{ini}, 1 - \alpha_{high})/2$ and for $\kappa = \lambda/10$.
be an $M$-separated set of maximal cardinality. Let $\varepsilon < \frac{\alpha_\Delta}{640 \cdot 20}$ be a constant to be determined when we explain how to apply the projection theorem (Theorem 9) later in this proof. Note that the proof may end without actually applying the projection theorem. Apply Lemma 13 with respect to $\mu_n$, $\delta$ (in the roles of $\mu$, $t$) to obtain an $M$-separated set which is $(C\delta^{-2}, \alpha - 10\varepsilon)$-regular at scale $M$ ($C$ depends on $\varepsilon$); $N_1$ is as obtained in the conclusion of Lemma 13. Let
be an $M$-separated set of maximal cardinality. And let
We first deal with the case that
By passing to a subset $E_1' \subset E_1$ of size $|E_1'| > |E_1|/4$ we may assume that for some fixed $\theta \in [0, 2\pi)$ and for all $\xi \in E_1'$, $\operatorname{Re}(e^{i\theta}\hat\mu(\xi)) > \delta^4/128$. Therefore,
Then by the relation (4.18) we have the inequality (4.20)
In particular there exists $s_0 \in S$ such that
be an $(LM)$-separated set of maximal cardinality. By Lemma 14 we have that
If the inequality (4.17) holds then we are done (as long as we choose $\alpha_{inc} \le \alpha_\Delta/1280$, and for $N_0 = 2N_1$) as
and $\varepsilon < \frac{\alpha_\Delta}{640 \cdot 20}$.
We now turn to the harder case where (4.17) fails. Define $\delta' = \delta^2/4$. By Lemma 7 there exists a set $E \subset E_0$ such that
where $c, c'$ are absolute constants. Set $\rho = 1024\,C\,(N/M)^{\alpha_\Delta/640}\,\delta^{-6}$ (this is the value in the bound in (4.17)). By (4.18) we have the inequality (4.24)
By Lemma 14 we have that
Assume that for $\alpha_{inc}$ small, to be determined later (but certainly $\le \alpha_\Delta/1280$), the following holds:
Therefore,
By the Cauchy-Schwarz inequality we have that
Writing inequality (4.31) in the following way
we see (using Lemma 14) that
For a specific $s_1 \in S$ we define the set $B$ as follows:
By the pigeonhole principle there exists $s_1 \in S$ so that
By assumption $S$ is $(C, \lambda)$-regular, and $|E_3| < (N_1/M)^{\alpha + \alpha_{inc}}$. By (E.2) and the $(C\delta^{-2}, \alpha - 10\varepsilon)$-regularity of $E_0$ we have (recall that $\delta' = \delta^2/4$),
Hence, we may conclude that
for some $c', C'$ independent of $N, M, L$. Using Ruzsa's triangle inequality (Lemma 3) we have the following, for fixed $s_1$ and for all $s_2 \in B$:
The set $E$ is $\frac{M}{N_1}$-separated and $(C^2\delta^{-8}, 2\alpha - 20\varepsilon)$-regular at scale $M/N_1$. We apply the projection theorem, Theorem 9, to the set $E \subset [-1, 1]^2$ with respect to the measure $\eta$ on the set of directions in $P^1$ corresponding to a uniform choice of direction from the projection of the set $\{-s_1\} \times B$ to $P^1$. This measure $\eta$ will satisfy (2.24) for any $\kappa < \lambda$ as long as the $\tau_0$ from Theorem 9 satisfies $(N_1/M)^{\tau_0(\lambda - \kappa)} > C_1$, which holds for a suitable choice of $C_*$, $\varepsilon$, $\alpha_{inc}$ once $L_1$ is large enough. Similarly, $E$ will satisfy (2.25) once $\kappa < \alpha_{ini}$, if $C_*$, $\varepsilon$ are small enough and $L_1$ is large enough. Recalling that $\alpha_{inc}$ was chosen to be $\le \alpha_\Delta/1280$, we get a contradiction between (4.38) and (4.42) if $C_*$ is small enough and $L_1$ is large enough. This completes the proof (with $N_0 = N_1$).

Dimensions of Projections
This section contains background material for a final bootstrapping lemma, which is stated and proved at the end. The following part is adapted from [4]. Closely related to the notion of a $(C, \alpha)$-regular measure introduced in Definition 10 is the notion of the $\alpha$-energy of a measure $\rho$, denoted by $E_\alpha(\rho)$, which we define for a compactly supported measure $\rho$ on $\mathbb{R}^d$ and $\alpha < d$.
Definition 18. The $\alpha$-energy of a compactly supported measure $\rho$ on $\mathbb{R}^d$, for $\alpha < d$, denoted by $E_\alpha(\rho)$, is defined by
If $\rho$ is $(C, \alpha + \varepsilon)$-regular on a set $B$ at all scales, then
The energy $E_\alpha(\rho)$ can also be expressed in terms of the Fourier transform of $\rho$, up to an implicit constant that tends to $\infty$ as $\alpha \to 1$ (see [11], Lemma 12.12):
If $E_\alpha(\rho) < \infty$, then any set of positive $\rho$-measure has Hausdorff dimension at least $\alpha$ (for this and further information about $\alpha$-energy, see [11]).
A simple way to adapt this notion to our "coarse" setup, where we do not care about the details of how $\rho$ behaves at scales smaller than $r$, is to smooth it by convolving with an appropriate kernel. Let $\Phi$ be a fixed radially symmetric nonnegative smooth function on $\mathbb{R}^d$ with $\|\Phi\|_1 = 1$, supported on $B_{0,1}$, and for
Then instead of using the possibly atomic measure $\rho$, we can consider its smoothed version $\rho * \Phi_r$. In particular, if $\rho$ is $(C, \alpha + \varepsilon)$-regular at scale $r$ on a subset $B \subset \mathbb{R}^d$, then
with the implicit parameter depending only on $d$ and the choice of $\Phi$. See [4], subsection 6.C, for more details. Let $\Psi : \mathbb{R} \to \mathbb{R}^+$ be the smooth compactly supported function
$\Psi(x_1) = \int dx_2 \cdots dx_d\, \Phi(x_1, x_2, \dots, x_d)$, (5.6)
and define $\Psi_r$ analogously to (5.4).
Lemma 19 ([4], Lemma 6.10). Let $\rho$ be a probability measure on $\mathbb{R}$, and let $\phi$ be the Radon-Nikodym derivative $\phi = \frac{d(\rho * \Psi_r)}{dx}$. Then for every $0 < r < r_1 < 1$
Moreover, for any subset $X \subset \operatorname{supp} \rho$, (5.8)
For the next proposition we need a further definition.
Definition 20. Let $\rho$ be a probability measure supported on the unit ball $B_{0,1}$ of $\mathbb{R}^d$ and let $\rho_\theta$ be the orthogonal projection of the measure $\rho$ in the direction $\theta \in P^{d-1}$. Then $\rho_\theta(t)$ is defined by
Proposition 21 ([4], Prop. 6.11). Let $\rho$ be a probability measure supported on the unit ball $B_{0,1}$ of $\mathbb{R}^d$ so that $E_\alpha(\rho) < \infty$ for some $0 < \alpha < d$, $0 < r < 1$, and let $\eta$ be a measure on $S^{d-1}$ such that for some $c_\eta, \beta > 0$,
$\eta(B_{\theta,\varepsilon}) \le c_\eta \varepsilon^{\beta}$ for every $\varepsilon > r$ and $\theta \in S^{d-1}$. (5.10)
(5.11)
We shall use Proposition 21 with $d = 2$. Almost quoting from [4], note that if $\alpha + \beta > d$ and $\rho$ is $(C, \alpha')$-regular at scale $r$ for $\alpha' > \alpha$, then by (5.2) the right-hand side of (5.11) is bounded from above by a constant (depending on $\alpha, \alpha', \beta, \beta', C, \dots$) while the left-hand side is at least
In view of Lemma 19, this in particular implies that for $\eta$-many choices of $\theta$, the covering number of $\operatorname{supp}(\rho_\theta)$ by $r$-intervals is large.
The next lemma will be used as a final step after the application of a number of iterations of Lemma 17.
then there exists $N_1$ such that
where $c$ is a constant and $N_1$ is such that
since we may always choose a subset $E_1 \subset E$ of cardinality $\ge |E|/4$ on which the above inequality holds which is $(C\delta^{-2}, 1 - 2\lambda/6)$-regular (possibly for a slightly different $C$). Set $\phi(x) = \sum_{s \in S}\sum_{\xi \in E} e_{s\xi}(x)$. Then by the Cauchy-Schwarz inequality we have
We then obtain,
and so (5.20)
Fix $s_2$ to be an element in $S$ such that the term corresponding to it in the above sum is the largest.
By Lemma 14 we have that
Next, we define a set $S'$ by
By Lemma 14 we have that $|S'| \ge \frac{\delta^2}{16}|S|$. Let $\eta$ be the uniform measure on the set of directions in $P^1$ corresponding to the set $\{-s_2\} \times S'$. The $(C, \lambda)$-regularity of $S$ ensures that for any $\xi \in \operatorname{supp}(\eta) \subset P^1$ we have the inequality
for any positive real number $r \ge M/N$ and some absolute constant $u$. Applying Proposition 21 with $\beta = \lambda$, $\beta' = \frac{5\lambda}{6}$, $\alpha = 1 - \frac{5\lambda}{6}$ and ρ = 1
with $C', c'$ depending on $\lambda$. Substituting into (5.26), we get
We conclude that there is a subset $S'' \subset S$ with $|S''| > (1 - \frac{\delta^2}{16})|S|$ for which if $s_1 \in S''$ and $\xi_0 = (-s_2, s_1) \in P^1$, then
For any such direction $\xi_0 \in P^1$, let $\pi_{\xi_0}$ denote the orthogonal projection onto the subspace spanned by $\xi_0$ (considered as a map $\mathbb{R}^2 \to \mathbb{R}$). By Lemma 19 and (5.29) it follows that
This yields the conclusion of our lemma.
We state and prove two key propositions. The first is a general statement which is stated and proved in [4]. The second is the main granulation estimate, which is used in the proof of the main theorem, Theorem 2.
The following proposition and its proof are adapted from Bourgain, Furman, Lindenstrauss and Mozes, [4]. The statement and its proof are harmonic analytic in nature.
Proposition 23 ([4, Proposition 7.5]). There exists $c > 0$ such that if $t > 0$ and a probability measure $\mu$ on $\mathbb{T}^d$ satisfies
Proof. We shall need an auxiliary smooth function $F$ on the torus such that
where $C_1$ is a constant depending on $d$ only. To construct such a function, consider the step function $F_1(x) = m(B_{0,r})^{-1} \cdot 1_{B_{0,r}}(x)$, where $r = \varepsilon/N$ for some fixed small $\varepsilon > 0$. Then $\hat F_1(a)$ is close to 1 for $a \in \mathbb{Z}^d \cap B_{0,N}$. If $F_2$ is a smooth symmetric approximation of $F_1$, then the convolution $F = F_2 * F_2$ has the desired properties.
Let $\tilde A$ be an $M$-separated set of size $|\tilde A| > s(N/M)^d$ consisting of coefficients $a \in \mathbb{Z}^d \cap B_{0,N}$ with $|\hat\mu(a)| > t$. Upon passing to a subset $A \subset \tilde A$ of size
we may assume that $\operatorname{Re}(e^{i\theta}\hat\mu(a)) > t/2$ for some fixed $\theta \in [0, 2\pi]$. Let
The density $g$ of $\lambda = \mu * F$ has the following upper bound:
By (6.8) and (6.9),
For each $i \in I$ choose $x_i \in Q_i$ such that (6.20)
Then (6.15) gives
The set $\tilde X = \{x_i : i \in I\}$ visits each of the cubes $Q_j$ at most once. Thus it may be separated into $2^d$ subsets, each of which never visits any neighboring $Q_j$ and is therefore $\frac{1}{M}$-separated. At least one of the $2^d$ subsets $X \subset \tilde X$ has
This completes the proof of the proposition.
Proposition 24. For $\lambda, \beta > 0$ there exist $k \in \mathbb{N}$ and $C_1, C', L_{lb} > 0$ such that if $L > L_{lb}$, $n \ge k$, and $S \subset [L, 2L]$ is a $(C, \lambda)$-regular set for some $C < L^{C_1}$ with $|S| > L^{\beta}$, and if the measure $\mu_n = \nu_S^{*n} * \mu$ satisfies, for some $a \in \mathbb{Z} \setminus \{0\}$ and $t > L^{-C_1}$,
$|\hat\mu_n(a)| > t > 0$, (6.24)
then there exists a $\frac{1}{M}$-separated set $X \subset \mathbb{T}$ with
where $N = L|a|$, $M = |a|$.
Since $t$ is bounded from below by $L^{-C_1}$, which will depend only on $\alpha_{ini}$ (and formally also on $\alpha_{high}$), we can modify $L_{lb}$, if necessary, to be large enough such that the following holds,
We now use our bootstrapping lemma, Lemma 17, to obtain denser and denser sets of large Fourier coefficients. We finish by applying the Final Bootstrapping Lemma, Lemma 22. The first step is to check whether we can reach the conclusion of this lemma by applying Lemma 22 once.
If $\alpha_{ini} > 1 - \varepsilon_0$ then apply Lemma 22 and Proposition 23 to complete the proof. If $\alpha_{ini} \le 1 - \varepsilon_0$ then we do the following. Let $\alpha_{inc}$ be as in Lemma 17 for the chosen values of $\alpha_{ini}$, $\alpha_{high}$. Let $k' = (1 - \varepsilon_0 - \alpha_{ini})/\alpha_{inc}$ and $k = k' + 1$. Let $C_1$ be such that if $L_{lb}^{-C_*} < t$ then $L_{lb}^{-C_1} <$ (t 2 k /4 6 k ) 4 /128. Apply Lemma 17 $k$ times to obtain
where $c$ is the constant in the conclusion of Lemma 22. Apply Proposition 23 to complete the proof.
As long as there are large Fourier coefficients of the measure $\mu_1$ in the relevant range, we continue in a similar manner: for $a \in \mathbb{Z} \setminus \{0\}$ in the range $|a| < L^{\tau}$ such that $|\widehat{\nu_S^{k} * \mu_1}(a)| > L^{-\tau}$, obtain $X_2$ using Proposition 24. In order to apply Proposition 24, the measure $\mu_1^{(1)}$ is normalized so that the input is a probability measure $\tilde\mu_1^{(1)}$, which only increases the Fourier coefficient, so
We obtain a set $X_2$ which is $\frac{1}{M}$-separated and has the property that 1 (T).
2 to be the following new measures:
We repeat this step in an analogous manner, as long as there is an $|a| < L^{\tau}$ for which µ
hence max < C −1 (log L) · L 33·2 k τ τ < L 34·2 k τ if $L_1$ is large enough.