Random volumes in d-dimensional polytopes

Suppose we choose $N$ points uniformly randomly from a convex body in $d$ dimensions. How large must $N$ be, asymptotically with respect to $d$, so that the convex hull of the points is nearly as large as the convex body itself? It was shown by Dyer, Füredi, and McDiarmid that exponentially many samples suffice when the convex body is the hypercube, and by Pivovarov that the Euclidean ball demands roughly $d^{d/2}$ samples. We show that when the convex body is the simplex, exponentially many samples suffice; this then implies the same result for any convex simplicial polytope with at most exponentially many faces.


Introduction
Consider sampling random points q_1, q_2, . . . uniformly and independently from a convex body X ⊆ R^d. We are interested in the asymptotics of the random variable V_{X,N} given by the volume of the convex hull of q_1, . . . , q_N. In particular, how large does N have to be to ensure that w.h.p. (that is, with probability tending to 1 as d → ∞) the volume of the convex hull of q_1, . . . , q_N is a significant fraction of the volume of X?
This problem is well understood when X is a product space (i.e., a hypercube) or a Euclidean ball. In the case where X is the hypercube [0, 1]^d, the coordinates of the q_i are independent uniform random variables in [0, 1], and Dyer, Füredi, and McDiarmid [5] proved:

Theorem 1.1 (Dyer, Füredi, McDiarmid, 1992). Let X be the hypercube [0, 1]^d, let

λ = exp( ∫_0^∞ ( 1/u − 1/(e^u − 1) )² du ) ≈ 2.14,

and let ε > 0. Then as d → ∞ we have that

E Vol(Q_N)/Vol(X) → 0 if N ≤ (λ − ε)^d, and E Vol(Q_N)/Vol(X) → 1 if N ≥ (λ + ε)^d.

In particular, an exponential number of sample points suffices to capture the volume of the hypercube with the convex hull of the sample (and the theorem even determines the correct base of the exponent). This was generalized in 2009 by Gatzouras and Giannopoulos [6] to the case of random points with i.i.d. coordinates drawn not from the uniform distribution but from any even, compactly supported distribution satisfying certain mild conditions.
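As a quick numerical sanity check (a sketch, assuming the constant above reads λ = exp(∫_0^∞ (1/u − 1/(e^u − 1))² du)), a short stdlib quadrature reproduces the value ≈ 2.14:

```python
import math

def f(u):
    # integrand (1/u - 1/(e^u - 1))^2; expm1 avoids cancellation for small u
    return (1.0 / u - 1.0 / math.expm1(u)) ** 2

def simpson(g, a, b, n):
    # composite Simpson rule with n (even) subintervals
    h = (b - a) / n
    s = g(a) + g(b)
    for k in range(1, n):
        s += (4 if k % 2 else 2) * g(a + k * h)
    return s * h / 3

# integrate on [1e-6, 50]; for u >= 50 the integrand is ~ 1/u^2,
# so the remaining tail contributes about 1/50
U = 50.0
integral = simpson(f, 1e-6, U, 100_000) + 1.0 / U
lam = math.exp(integral)  # lands near 2.14
```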
On the other hand, if X is the Euclidean ball, Pivovarov proved in [9] that the threshold is superexponential.
Theorem 1.2 (Pivovarov, 2007). Let X be the unit Euclidean ball in R^d and let ε > 0. Then as d → ∞ we have that

E Vol(Q_N)/Vol(X) → 0 if N ≤ d^{(1−ε)d/2}, and E Vol(Q_N)/Vol(X) → 1 if N ≥ d^{(1+ε)d/2}.

For results concerning a more general rotationally symmetric model of the so-called β-polytopes (also exhibiting super-exponential thresholds), see the recent papers [1,2]. For general bounds on N concerning arbitrary log-concave and κ-concave distributions, see [3].
We analyze the case where X is a convex simplicial polytope. In particular, we prove:

Theorem 1.3. Suppose that q_1, q_2, . . . is a sequence of points chosen independently and uniformly from X, where X ⊆ R^d is a convex simplicial polytope with m facets. Let Q_j = Q_{j,d} be the convex hull of {q_1, . . . , q_j}. There are positive universal constants c_0, C_0 such that for all large d, if N ≥ m · C_0^d, then

P( Vol(Q_N) ≥ (1 − e^{−c_0 d}) Vol(X) ) ≥ 1 − e^{−c_0 d}.

Since any convex simplicial polytope with m facets can be partitioned into at most m simplices, which are all affinely equivalent, it suffices to prove Theorem 1.3 in the case where X is a simplex. In particular, we let Ω_d denote the standard embedding of the (d − 1)-dimensional simplex in d-dimensional space:

Ω_d = { x ∈ R^d : x_1 + · · · + x_d = 1 and x_i ≥ 0 for all i }.

The heart of our results is thus the following:

Theorem 1.4. Suppose that q_1, q_2, . . . is a sequence of points chosen independently and uniformly from Ω = Ω_d, and let Q_j = Q_{j,d} ⊆ Ω be the convex hull of {q_1, . . . , q_j}. There are positive constants c_0, C_0 such that if N ≥ C_0^d, then

P( Vol(Q_N) ≥ (1 − e^{−c_0 d}) Vol(Ω) ) ≥ 1 − e^{−c_0 d}.

Remark 1.5. By the Borel-Cantelli lemma, it follows that if we take a sequence of instances Ω_1, Ω_2, . . . , then Vol(Q_{N,d})/Vol(Ω_d) → 1 as d → ∞ with probability 1.
Remark 1.6. For clarity, we do not try to optimize any constants in our proofs. We get the theorem with c_0 = 1/4 and C_0 = 300.
The following lower bound shows that an exponential dependence is necessary. A similar lower bound with a worse constant follows from Theorem 1 in [3]. To prove Theorem 1.7, we use the approach from [5]. We conjecture that the value of the constant e^γ is sharp (the method from [5] yields sharp results in the independent case as well as in rotationally symmetric ones, see [1,2,5,9], where the dependence between components is mild, as in the case of a simplex). For the upper bound, we follow a different strategy, which is summarized at the beginning of the next section.
The rest of the paper comprises two sections devoted to the proofs of Theorems 1.4 and 1.7.
2 Proofs of the upper bound: Theorem 1.4

For the sake of clarity we begin by sketching the structure of the whole proof. For i = 1, . . . , d we define the α-caps C_i(α) of the simplex to be the sets

C_i(α) = { x ∈ Ω : x_i ≥ 1 − α }. (1)

Note that they are disjoint as long as α < 1/2, and the volume Vol(C_i(α)) of C_i(α) is precisely α^{d−1} · Vol(Ω). In particular, when examining the sequence {q_j}, we expect to see a point in C_{i_0}(α) every (1/α)^{d−1} steps. And for α a constant, after exponentially many steps, we can collect points from each cap C_i(α). A routine calculation shows that the expected measure of the convex hull of a random set of d points with one from each C_i(α) is exponentially small compared with Ω, though it is not a priori clear how much overlap to expect from multiple such random simplices. The basic strategy of the proof is to define a large set Ω(ε, γ) ⊆ Ω, and then show that for any fixed x ∈ Ω(ε, γ), the point x is very likely to lie in the convex hull of some simplex with one point p^x_i in each cap C_i(α), where all the points p^x_1, p^x_2, . . . , p^x_d occur among the first C_0^d terms of the sequence q_1, q_2, . . . . We do this by showing (in Lemma 2.4) that within every stretch of exponentially many steps, one obtains not only a point p^x_i which lies in the cap C_i(α), but one which is similar to x with respect to its proximity to the lower-dimensional faces close to x; this provides points which give a good chance of containing x in the convex hull reasonably quickly. (The fact that the points p^x_i are large in coordinate i lets us view them as the rows of a diagonally dominant matrix, which we exploit to show that x is likely to lie in their convex hull.) Linearity of expectation will then show that the measure of the uncovered part of Ω(ε, γ) is very small, and Markov's inequality can then give a w.h.p. statement as in the theorem. In particular, although x lying in the convex hull of {q_1, . . . , q_N} is of course equivalent to x lying in some simplex S_x with vertices in {q_1, . . . , q_N}, it is perhaps surprising that we prove the theorem by actually identifying S_x, rather than, say, considering whether x is separated from the convex hull by a hyperplane.

The exponential model
A basic tool we use is the standard fact that the coordinate vector of a uniformly random point in the simplex Ω can be simply described using independent exponentials, as encapsulated in the first part of the following lemma:

Lemma 2.1. If we generate a random point q ∈ Ω by generating the coordinates q_j as

q_j = E_j / (E_1 + · · · + E_d), (2)

where the E_i's are independent, mean-1 exponentials, then q is uniform in Ω. Moreover, if we generate points p_i (i = 1, . . . , d) by generating the coordinates as

p_{i,j} = αE_{i,j} / ( αE_{i,i} + Σ_{k≠i} E_{i,k} ) for j ≠ i, (3)

p_{i,i} = ( (1 − α) Σ_{k≠i} E_{i,k} + αE_{i,i} ) / ( αE_{i,i} + Σ_{k≠i} E_{i,k} ), (4)

where the E_{i,j}'s are independent mean-1 exponentials, then each p_i is uniform in the cap C_i(α).
Proof. The statement about q follows from the fact that the coordinate vector of a random point in Ω has the same distribution as the vector of d gaps among d − 1 independent uniforms in [0, 1], and that these gaps are distributed as exponentials with a conditioned sum (see e.g., [4], Ch 5, Theorems 2.1 and 2.2).
Consider now a point p_i ∈ Ω which is uniform except that we condition that it lies in C_i(α). The coordinates p_{i,j} of p_i are distributed as

p_{i,j} = E_{i,j} / (E_{i,1} + · · · + E_{i,d}) (5)

for independent mean-1 exponentials E_{i,j}. Note that this conditioning is equivalent to conditioning on

E_{i,i} ≥ ((1 − α)/α) Σ_{j≠i} E_{i,j}.

Thus rather than condition in (5), by the memoryless property, we could have instead replaced E_{i,i} in that expression with a random variable E_i generated as

E_i = ((1 − α)/α) Σ_{j≠i} E_{i,j} + Ẽ_i, where Ẽ_i is a fresh mean-1 exponential,

and (3) and (4) follow by substitution (after renaming Ẽ_i as E_{i,i}).
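The two sampling statements in Lemma 2.1 are easy to put into code. The sketch below (function names are ours) samples a uniform point of Ω via normalized mean-1 exponentials and, for the cap, uses the equivalent geometric description of C_i(α) as the scaled copy (1 − α)e_i + αΩ, so that a uniform point of the cap is (1 − α)e_i plus α times a uniform point of Ω:

```python
import random

def uniform_simplex_point(d, rng):
    """Uniform point of the simplex: normalized i.i.d. mean-1 exponentials."""
    e = [rng.expovariate(1.0) for _ in range(d)]
    s = sum(e)
    return [v / s for v in e]

def uniform_cap_point(d, i, alpha, rng):
    """Uniform point of C_i(alpha) = {x : x_i >= 1 - alpha}, realized as the
    scaled copy (1 - alpha)e_i + alpha * (uniform point of the simplex)."""
    p = [alpha * v for v in uniform_simplex_point(d, rng)]
    p[i] += 1.0 - alpha
    return p

rng = random.Random(1)
d, alpha = 10, 0.3
q = uniform_simplex_point(d, rng)
p = uniform_cap_point(d, 3, alpha, rng)
# q and p are probability vectors; p puts mass at least 1 - alpha on coordinate 3
```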
We will also use the following result of Janson, which gives concentration for sums of exponentials:

Lemma 2.2 (Janson [7]). Let W_1, W_2, . . . , W_m be independent exponentials with means 1/a_i, let W = W_1 + · · · + W_m with μ = E W = Σ_i 1/a_i, and let a_* = min_i a_i. Then, for any λ ≤ 1,

P(W ≤ λμ) ≤ e^{−a_* μ (λ − 1 − log λ)}. (6)

The large typical set
Recall that our proof works by defining a large set of "typical" points in Ω, and then showing that any such point is very unlikely to be still uncovered after exponentially many steps.
To define and work with the appropriate typical set, we will be interested in the magnitudes of the smallest coordinates of points x in the set. (Roughly speaking, the typical set Ω(ε, γ) defined below is one where none of the smallest coordinates are much too small.) For this purpose, we make the following definitions: for x ∈ Ω we write x_{(1)} ≤ · · · ≤ x_{(d)} for the order statistics of the coordinates of x, we write i_x for the index of the ith smallest coordinate (so that x_{i_x} = x_{(i)}), and for a coordinate j we write r_x(j) for its rank (so that r_x(j) = i exactly when j = i_x). We now define our typical set as follows:

Ω(ε, γ) = { x ∈ Ω : x_{(i)} ≥ ε_i for all 1 ≤ i ≤ ⌊γd⌋ + 1 }, (7)

where the coordinates of the vector ε are defined in terms of a constant ε > 0 and γ by

ε_i = εi/d² for 1 ≤ i ≤ ⌊γd⌋, and ε_{⌊γd⌋+1} = γ/(2d).

Lemma 2.3. Let γ ≤ 1/6 and ε ≤ 1/8. Then there is a constant c_γ > 0, depending only on γ, such that Vol(Ω \ Ω(ε, γ)) ≤ e^{−c_γ d} · Vol(Ω) for all large d.

Proof. Let x be a random vector uniform on Ω. In view of Lemma 2.1 and (2), the vector (x_{(i)})_{i=1}^d of the order statistics of x has the same distribution as the vector of the order statistics of i.i.d. mean-one exponentials normalised by their sum, which combined with Theorem 2.3 from Chapter 5 in [4] gives that (x_{(i)})_{i=1}^d has the same distribution as the vector

( Σ_{j=d−i+1}^{d} E(j) / Σ_{j=1}^{d} jE(j) )_{i=1}^{d},

where the E(j)'s are independent exponentials with rate j. Thus,

P(x ∉ Ω(ε, γ)) ≤ Σ_{i=1}^{⌊γd⌋+1} P( Σ_{j=d−i+1}^{d} E(j) < ε_i Σ_{j=1}^{d} jE(j) ).

We estimate these probabilities using Janson's inequality (6). First define the event By (6),

Now consider the events
Note that By (6), Since u − 1 − log u > −(1/2) log u for u ≤ 0.2, we get for i ≤ γd, as long as 1.6ε ≤ 0.2, Thus, Similarly, for i = ⌊γd⌋ + 1, we get Putting these bounds together finishes the proof.
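The elementary inequality invoked above can be checked mechanically: the difference (u − 1 − log u) − (−(1/2) log u) = u − 1 − (1/2) log u has derivative 1 − 1/(2u) < 0 on (0, 1/2), so it is decreasing on (0, 0.2] and it suffices to verify the right endpoint. A short check (a sketch; the helper name is ours):

```python
import math

def gap(u):
    # (u - 1 - log u) - (-(1/2) * log u) = u - 1 - (1/2) * log u
    return u - 1.0 - 0.5 * math.log(u)

# gap is decreasing on (0, 1/2), so positivity at u = 0.2 certifies all of (0, 0.2]
assert gap(0.2) > 0
for u in (0.001, 0.01, 0.05, 0.1, 0.2):
    assert gap(u) > 0
```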

A lightly conditioned candidate simplex
We now fix an arbitrary x ∈ Ω(ε, γ), and consider choosing a p_i randomly from C_i(α), for some i ∈ {1, . . . , d}, using Lemma 2.1. To use p_i as the vertex of a candidate simplex to contain x, we hope to find that

p_{i,j_x} ≤ εj/(2d²) (10)

as j runs over the smallest γd coordinates of x; recall that j_x denotes the coordinate of the jth smallest component of x. Indeed, we will later argue that, conditioning on this event for every i, the random points p_1, . . . , p_d would have a reasonable chance of containing x in their convex hull. The following lemma shows that we can ensure that (10) is not too unlikely to be satisfied, without much conditioning on the random variables E_{i,j_x} for j > γd.
Lemma 2.4. Let γ ≤ 1/6 and 2εγ ≤ 5α. Let x ∈ Ω(ε, γ) and let p_i be chosen randomly from C_i(α) for some fixed i ∈ {1, . . . , d}, as in Lemma 2.1. Then for the event

B_{i,x} = { Σ_{j≠i : r_x(j) > γd} E_{i,j} ≥ 4d/5 }

and an event A_{i,x} depending only on the E_{i,j} for which r_x(j) ≤ γd (and so independent of B_{i,x}), we have

P(B_{i,x}) ≥ 1 − e^{−10^{−4} d} and P(A_{i,x}) ≥ δ_1^d, where δ_1 = (εγ/(5eα))^γ > 0, (12)

and A_{i,x} ∩ B_{i,x} ⊆ C_{i,x}, where C_{i,x} is the event that

p_{i,j_x} ≤ εj/(2d²) for all 1 ≤ j ≤ γd with j_x ≠ i.

Proof. We have

{ p_{i,j_x} ≤ εj/(2d²) for all j ≤ γd, j_x ≠ i } ⊇ { E_{i,j_x} ≤ 2εj/(5αd) for all j ≤ γd, j_x ≠ i } ∩ { Σ_{j≠i : r_x(j) > γd} E_{i,j} ≥ 4d/5 },

since on the second event the denominator in (3) is at least 4d/5, whence p_{i,j} ≤ (5α/(4d)) E_{i,j} for j ≠ i. The second event in the last line is B_{i,x}, and we define A_{i,x} to be the first event in the last line. We have the claimed probability bound on B_{i,x} from Lemma 2.2. Indeed, the sum defining B_{i,x} has at least (1 − γ)d − 2 mean-1 terms, so its mean μ satisfies μ ≥ (1 − γ)d − 2, and for γ ≤ 1/6 we have (1 − γ)d − 2 ≥ 5d/6 − 2; applying (6) with λ = (4d/5)/μ then gives P(B_{i,x}^c) ≤ e^{−10^{−4} d} for all large d. For A_{i,x}, we compute

P(A_{i,x}) = Π_{j ≤ γd, j_x ≠ i} (1 − e^{−2εj/(5αd)}) ≥ Π_{j=1}^{⌊γd⌋} (εj/(5αd)) ≥ (εγ/(5eα))^{γd},

using that 1 − e^{−u} ≥ u/2 for 0 ≤ u ≤ 1 (which applies since 2εγ ≤ 5α).

As a consequence of Lemma 2.4, we will have that if we sample exponentially many points in C_i(α), we will with probability at least 1 − e^{−d} have at least one point p_i for which the corresponding event A_{i,x} occurs. In particular, we will with probability at least 1 − de^{−d} have one such point p_i for each i = 1, . . . , d. Furthermore, with probability 1 − de^{−10^{−4} d}, we have that all the corresponding events B_{i,x} occur. These points p_1, . . . , p_d form the vertices of a candidate simplex; note that the lemma gives us that these points p_i satisfy p_{i,j_x} ≤ εj/(2d²) for all 1 ≤ j ≤ γd with j_x ≠ i. In the next section, we show that they are not too unlikely to contain the fixed vertex x ∈ Ω(ε, γ). In particular, this will mean that after collecting exponentially many such simplices (in time exponential in d), the probability that x is not covered by any such simplex will be exponentially small.
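The bookkeeping for "a point in every cap after exponentially many steps" is a union bound over the d caps: each draw lands in C_i(α) with probability α^{d−1}, so d(1 − α^{d−1})^N ≤ e^{−d} once N is a sufficiently large exponential in d. A small sketch of this calculation (the helper name and parameter values are ours):

```python
import math

def samples_to_hit_all_caps(d, alpha):
    """A sample size N with d * (1 - alpha^(d-1))^N <= e^(-d): after N uniform
    draws, the probability that some cap C_i(alpha) is still empty is at most
    d * (1 - p)^N with p = alpha^(d-1), by a union bound over the d caps."""
    p = alpha ** (d - 1)
    return math.ceil((d + math.log(d)) / -math.log1p(-p)) + 1

d, alpha = 8, 0.3
N = samples_to_hit_all_caps(d, alpha)
p = alpha ** (d - 1)
assert d * (1.0 - p) ** N <= math.exp(-d)
# N grows like (1/alpha)^(d-1), i.e. exponentially in d, matching the proof sketch
```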

Enclosing a fixed x ∈ Ω(ε, γ)
In this section we show that for any fixed x ∈ Ω(ε, γ), it is only exponentially unlikely to be contained in a simplex whose vertices p i , i = 1, 2, . . . , d are each chosen randomly from the corresponding set C i (α).
In particular, our goal in this section is to prove:

Lemma 2.5. Let γ ≤ 1/6 and 2εγ ≤ 5α. Fix x ∈ Ω(ε, γ) and suppose that for each i = 1, . . . , d, the point p_i is chosen randomly from C_i(α). Let A_{i,x} be the events from Lemma 2.4 and let A_x = ∩_{i=1}^d A_{i,x}. Then there is a universal constant δ > 0 such that

P( x ∈ conv{p_1, . . . , p_d} | A_x ) ≥ δ^d.

We define the matrix P = (p_{i,j}) whose rows are the random points p_i, and write P = D + R, where D is the diagonal part of P and R = P − D, and set M = D^{−1}R.
We will apply Gershgorin's Circle Theorem to this matrix M:

Theorem 2.6 (Gershgorin's Circle Theorem). Every eigenvalue of a d × d matrix A lies in at least one of the discs { z ∈ C : |z − A_{i,i}| ≤ Σ_{j≠i} |A_{i,j}| }, i = 1, . . . , d.

In particular, we use it to prove the following:

Lemma 2.7. We have that, almost surely,

(I + M)^{−1} = Σ_{k=0}^{∞} (−M)^k. (13)

Proof. Observe that if the sum in (13) converges, then we can write

(I + M) Σ_{k=0}^{∞} (−M)^k = Σ_{k=0}^{∞} (−M)^k − Σ_{k=1}^{∞} (−M)^k = I.

Thus it remains just to prove that the sum converges. Recall first from the definition of C_i(α) in (1) that the diagonal entries of P are all at least 1 − α, while the sum of each row is 1. In particular, the diagonal entries of M are 0 and its off-diagonal row sums are at most α/(1 − α), so Gershgorin's Circle Theorem implies that the eigenvalues of M have absolute value at most α/(1 − α), which is less than 1 assuming α < 1/2.
Next we argue that M is a.s. diagonalizable. This is the case whenever the discriminant of the characteristic polynomial of M is nonzero. This discriminant is a polynomial expression in the entries of M which is not identically zero; since the off-diagonal entries of M have a jointly continuous distribution, the discriminant is nonzero with probability 1.
Thus finally we write M = QΛQ^{−1}, so that M^k = QΛ^k Q^{−1}. Since the eigenvalues have absolute value less than 1, M^k → 0 exponentially fast, confirming convergence of the sum and completing the proof of the lemma.
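A small numeric illustration of the lemma (a sketch; the construction of P below with one uniform cap point per row, and the reading of (13) as the series for (I + M)^{−1} used to invert P = D(I + M), are our assumptions): the off-diagonal row sums of M are (1 − p_{i,i})/p_{i,i} ≤ α/(1 − α) < 1, so by Gershgorin the Neumann series converges.

```python
import numpy as np

rng = np.random.default_rng(0)
d, alpha = 8, 0.3

# rows: p_i uniform in C_i(alpha), realized as (1-alpha)e_i + alpha*(uniform in simplex)
E = rng.exponential(1.0, size=(d, d))
P = alpha * E / E.sum(axis=1, keepdims=True)
P[np.arange(d), np.arange(d)] += 1.0 - alpha

D = np.diag(np.diag(P))
M = np.linalg.inv(D) @ (P - D)  # off-diagonal part, rescaled row-wise by the diagonal

# Gershgorin discs for M are centered at 0 with radius
# sum_{j != i} M_ij = (1 - p_ii)/p_ii <= alpha/(1 - alpha) < 1
row_sums = np.abs(M).sum(axis=1)
assert row_sums.max() <= alpha / (1.0 - alpha) + 1e-12

# hence M^k -> 0 geometrically and the Neumann series converges
series = sum(np.linalg.matrix_power(-M, k) for k in range(200))
assert np.allclose(series, np.linalg.inv(np.eye(d) + M))
```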
We are now ready to prove Lemma 2.5.
Proof of Lemma 2.5. As the p_i lie in general position, we can always write the given x (uniquely) as a linear combination x = λ_1 p_1 + · · · + λ_d p_d; our goal is to show that, given ∩_i A_{i,x}, there is probability at least c^d, for some c > 0, that the λ_i are all nonnegative. Observe that, treating x and λ = (λ_1, . . . , λ_d) as row vectors, these coefficients are determined as

λ = xP^{−1} = x(I + M)^{−1} D^{−1}. (17)

From (13), we can write

λ = x ( Σ_{k≥0} (−M)^k ) D^{−1} = x (I − M) ( Σ_{k≥0} M^{2k} ) D^{−1},

which is nonnegative so long as x(I − M) is, since M has only nonnegative entries.
Note that the jth coordinate y_j of the product y = x(I − M) is given by

y_j = x_j − Σ_{i≠j} x_i M_{i,j} = x_j − Σ_{i≠j} x_i p_{i,j}/p_{i,i}.

Recall from Lemma 2.4 the events B_{i,x}, which are all independent of the events A_{i,x}, and set B_x = ∩_{i=1}^d B_{i,x} and A_x = ∩_{i=1}^d A_{i,x}. Each of the values of j corresponding to small coordinates of x, that is, the j for which r_x(j) ≤ γd, must satisfy y_j ≥ 0 if A_x ∩ B_x occurs. Indeed, from Lemma 2.4, we know that for all i ≠ j we have p_{i,j} ≤ x_j/2 in this case, and so in particular we have that (by (4), we have D_{i,i} ≥ 1 − α)

y_j ≥ x_j − (x_j/2) Σ_{i≠j} x_i/(1 − α) ≥ x_j ( 1 − 1/(2(1 − α)) ) ≥ 0.

This shows that y_j ≥ 0 for every j with r_x(j) ≤ γd. It therefore remains to handle the case of r_x(j) > γd. On the event B_x, for i ≠ j, we have p_{i,j} ≤ (5α/(4d)) E_{i,j} (recall (3)); thus, bounding D^{−1}_{i,i} ≤ 1/(1 − α) ≤ 2 (see (22)) and using that x_j ≥ γ/(2d) for j such that r_x(j) > γd, we obtain (on using the first equality in (17)) that By a simple inequality The fact that the E_{i,j} for j with r_x(j) > γd are not conditioned by A_x, Markov's inequality and independence yield The independence of B_x and A_x, a simple union bound and (12) yield
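The linear algebra at the start of the proof can be checked numerically (a sketch, under our reading of the decomposition P = D + R, M = D^{−1}R, with x and the coefficient vector as row vectors; the construction of P is our assumption for illustration):

```python
import numpy as np

rng = np.random.default_rng(1)
d, alpha = 6, 0.25

# candidate simplex: row i of P is a uniform point of the cap C_i(alpha)
E = rng.exponential(1.0, size=(d, d))
P = alpha * E / E.sum(axis=1, keepdims=True)
P[np.arange(d), np.arange(d)] += 1.0 - alpha

D = np.diag(np.diag(P))
M = np.linalg.inv(D) @ (P - D)

x = rng.dirichlet(np.ones(d))          # a point of the simplex, as a row vector
coeffs = x @ np.linalg.inv(P)          # the lambda_i with x = sum_i lambda_i p_i

# factoring P = D(I + M) gives the form used in the proof
coeffs2 = x @ np.linalg.inv(np.eye(d) + M) @ np.linalg.inv(D)
assert np.allclose(coeffs, coeffs2)
assert np.allclose(coeffs @ P, x)      # indeed x = sum_i lambda_i p_i
assert abs(coeffs.sum() - 1.0) < 1e-9  # coefficients form an affine combination
```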

Covering most of the simplex in exponentially many steps
We are now ready to combine the ingredients to prove the main theorem.
Proof of Theorem 1.4. Recall that we draw N random points q_1, q_2, . . . , q_N independently and uniformly from the simplex Ω, and that Q_N denotes their convex hull. First, note that by Fubini's theorem, we have

E Vol(Ω(ε, γ) \ Q_N) = ∫_{Ω(ε,γ)} P(x ∉ Q_N) dx, (19)

where Ω(ε, γ) is the typical set defined in (7). Fix x ∈ Ω(ε, γ). By Lemma 2.5, we will have a good lower bound on P(x ∈ Q_N), provided we know that among the q_i there are d points, one from each cap C_i(α), which moreover fulfill the events A_{i,x}. To use that, we condition on all possibilities for the q_i and then argue that the majority of the possibilities are good, provided N is large enough.
Formally, given two sequences l = (l_1, . . . , l_N) ∈ {0, 1, . . . , d}^N and θ = (θ_1, . . . , θ_N) ∈ {0, 1}^N, we define the event E_{l,θ} that, for each j ≤ N, we have q_j ∈ C_{l_j}(α) if l_j ≥ 1 and q_j ∉ C_1(α) ∪ · · · ∪ C_d(α) if l_j = 0, and that q_j satisfies A_{l_j,x} if and only if θ_j = 1; this tells us which among the points q_j fall in which caps, and which among those satisfy A_{i,x}. Let Good be the set of those pairs of sequences (l, θ) for which there are 1 ≤ j_1 < . . . < j_d ≤ N such that {l_{j_1}, . . . , l_{j_d}} = {1, . . . , d} and θ_{j_1} = . . . = θ_{j_d} = 1. Then,

P(x ∈ Q_N) ≥ Σ_{(l,θ) ∈ Good} P(x ∈ Q_N | E_{l,θ}) P(E_{l,θ}).

For (l, θ) ∈ Good, by Lemma 2.5, we have P(x ∈ Q_N | E_{l,θ}) ≥ δ^d, so it remains to estimate Σ_{(l,θ) ∈ Good} P(E_{l,θ}). Fix i ∈ {1, . . . , d} and let S_i be the number of points among the q_j which are in C_i(α) and satisfy A_{i,x}. We have

Σ_{(l,θ) ∈ Good} P(E_{l,θ}) ≥ P(S_1 ≥ 1, . . . , S_d ≥ 1) ≥ 1 − Σ_{i=1}^{d} P(S_i = 0).

By independence,

P(S_i = 0) = (1 − α^{d−1} P(A_{i,x}))^N,

where P(A_{i,x}) is taken with respect to the probability uniform on C_i(α). Therefore, by (12) and a union bound, Thus, Set γ = 1/6 and then choose α to be a small enough constant such that Choose ε ≤ 1/8 (allowing the use of Lemma 2.3 later) such that 2εγ ≤ 5α (allowing the use of Lemma 2.4). Then we take N = C_1^d with C_1 large enough so that the exponential term in (20) satisfies We then have Then, by independence, we get Finally, thanks to (19) and Lemma 2.3, for a positive universal constant c_0.
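The theorem operates in the regime of large d, but a toy simulation (ours, not part of the proof) already shows the hull filling up in a low dimension: for d = 3, the projection of Ω_3 to its first two coordinates is uniform on the triangle {x_1, x_2 ≥ 0, x_1 + x_2 ≤ 1}, and a few hundred samples capture most of its area.

```python
import random

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def polygon_area(v):
    # shoelace formula
    return 0.5 * abs(sum(v[k][0] * v[(k + 1) % len(v)][1] -
                         v[(k + 1) % len(v)][0] * v[k][1] for k in range(len(v))))

rng = random.Random(0)
pts = []
for _ in range(500):
    e = [rng.expovariate(1.0) for _ in range(3)]
    s = sum(e)
    pts.append((e[0] / s, e[1] / s))   # uniform on the triangle of area 1/2

frac = polygon_area(convex_hull(pts)) / 0.5
assert 0.9 < frac <= 1.0  # the hull of 500 samples covers most of the area
```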
Remark 2.8. All of these inequalities hold with explicit universal constants (provided d is large enough). Moreover, for the constant c_γ in Lemma 2.3, we can take c_γ = 3/8. These justify Remark 1.6.
3 Proof of the lower bound: Theorem 1.7

We begin by introducing a quantity which will be more convenient to work with here. The following fundamental lemma from [5] is a starting point.
Lemma 3.1 (Dyer, Füredi, McDiarmid [5]). Suppose X_1, X_2, . . . are i.i.d. copies of a random vector X in R^d. Define a random polytope Q_N = conv{X_1, . . . , X_N} and consider the function ξ = ξ_X defined by

ξ_X(x) = inf{ P(X ∈ H) : H a closed half-space containing x }. (23)

Then for every subset A of R^d, we have

E Vol(Q_N) ≤ Vol(A) + N ∫_{R^d \ A} ξ_X(x) dx, (24)

and a complementary lower bound, (25). We will only need the first part of Lemma 3.1, that is (24), which will be applied to sets of the form A = {x ∈ R^d : ξ(x) > λ}, the (convex) level sets of the function ξ. To get an upper bound on the volume of such sets, we shall use a standard lemma concerning the Legendre transform Λ*_X of the log-moment generating function Λ_X of X,

Λ_X(θ) = log E e^{⟨θ,X⟩}, Λ*_X(x) = sup_{θ ∈ R^d} ( ⟨θ, x⟩ − Λ_X(θ) ).

Lemma 3.2. For every α > 0, we have

{ x ∈ R^d : ξ_X(x) > e^{−α} } ⊆ { x ∈ R^d : Λ*_X(x) < α }.

Proof. Plainly, for the infimum in the definition (23) of ξ_X(x), it is enough to take half-spaces for which x is on the boundary, that is

ξ_X(x) = inf_{θ ∈ R^d} P( ⟨X − x, θ⟩ ≥ 0 ),

where ⟨u, v⟩ = Σ_i u_i v_i is the standard scalar product in R^d. By Chebyshev's inequality for the exponential function, P(⟨X − x, θ⟩ ≥ 0) ≤ e^{−⟨θ,x⟩} E e^{⟨θ,X⟩}. Consequently, ξ_X(x) ≤ e^{−Λ*_X(x)}.
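The Chebyshev (exponential Chernoff) step in the proof of Lemma 3.2 is easy to probe numerically; the sketch below (with X uniform on [0, 1]², a choice that is ours, purely for illustration) compares a Monte Carlo estimate of P(⟨X − x, θ⟩ ≥ 0) with the bound e^{−⟨θ,x⟩} E e^{⟨θ,X⟩}:

```python
import math
import random

rng = random.Random(0)
x = (0.8, 0.8)       # a point near the corner of [0, 1]^2
theta = (5.0, 5.0)   # normal direction of a half-space through x

# exact Chernoff bound: e^{-<theta,x>} E e^{<theta,X>}, using the fact that
# E e^{tU} = (e^t - 1)/t for U uniform on [0, 1]
mgf = lambda t: (math.exp(t) - 1.0) / t
bound = math.exp(-(theta[0] * x[0] + theta[1] * x[1])) * mgf(theta[0]) * mgf(theta[1])

# Monte Carlo estimate of P(<X - x, theta> >= 0) = P(X_1 + X_2 >= 1.6) = 0.08
trials = 100_000
hits = sum(rng.random() + rng.random() >= x[0] + x[1] for _ in range(trials))
p_hat = hits / trials
assert p_hat <= bound  # the Chernoff bound (about 0.29 here) dominates the true 0.08
```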
Choosing, say, δ = ε/8, we thus get and by the (weak) law of large numbers, the right-hand side converges to 0 as d → ∞.