Sorting probability for large Young diagrams

Swee Hong Chan; Igor Pak; Greta Panova

doi:10.19086/da.30071

Analytic Combinatorics

November 30, 2021 BST

Sorting probability for large Young diagrams

https://doi.org/10.19086/da.30071

05a16 06a07 60c05math.pr (mathematics - probability)math.co (mathematics - combinatorics)

ccby-4.0

Photo by Eugene Zh on Unsplash

Chan, Swee Hong, Igor Pak, and Greta Panova. 2021. “Sorting Probability for Large Young Diagrams.” Discrete Analysis, November. https://doi.org/10.19086/da.30071.

Editorial introduction

Read article at ArXiv

Sorting probability for large Young diagrams, Discrete Analysis 2021:24, 57 pp.

Let $P=(X,\leq_P)$ be a finite partially ordered set (or poset, for short). A linear extension $L$ of $P$ is a total ordering $\leq_L$ on $X$ such that for every $x,y\in X$ , if $x\leq_Py$ , then $x\leq_Ly$ . It is an easy exercise to prove that every poset has a linear extension: indeed, if any two elements $x,y$ are incomparable in $P$ , then one can choose to add to $P$ the inequality $x<y$ , together with all other inequalities that follow from this one by transitivity, and no difficulty can arise, since if it did, then the inequality $y<_Px$ would already have followed from transitivity.

Let $P$ be a finite poset and let $L$ be a random linear extension (chosen uniformly from the set of all linear extensions). The famous 1/3-2/3 conjecture states that for every finite poset that is not already a linear order, there exist $x,y$ such that $\mathbb P[x\leq_Ly]$ lies between 1/3 and 2/3. This conjecture is connected with sorting algorithms, since if one has $n$ objects $x_1,\dots,x_n$ totally ordered in some unknown way, then one way to sort the objects is as follows.

Let $P$ be the partial order $\leq_0$ on $X=\{x_1,\dots,x_n\}$ where no two elements are comparable.
While $P$ is not a total order do
- find two elements $x_i, x_j$ such that if $L$ is a random linear extension of $P$ , then $\mathbb{P}(x_i \prec_L x_j) \in \left[\frac{1}{3}, \frac{2}{3}\right]$
- find out which is true out of $x_i$ < $x_j$ and $x_i$ < $x_j$ ;
- $P:=$ the partial order generated by $P$ and whichever of the relations $x_i$ < $x_j$ and $x_j$ < $x_i$ holds.

Since the probability of the new relation is between $1/3$ and $2/3$ , at each step of this algorithm the number of linear extensions of $P$ is at most two thirds of what it was at the previous step, so the total number of comparisons made during the running of the algorithm is at most $\log_{3/2}(n!)$ , which is $O(n\log n)$ , which is within a constant of the obvious entropy lower bound of $\log_2(n!)$ . This bound is attained by several known sorting algorithms. The 1/3-2/3 conjecture is motivated by the question of what the optimal performance is in the worst case (though to determine this, it is not enough to determine what the worst performance is for a single step).

In the absence of a proof of the 1/3-2/3 conjecture, attention has turned to determining how close $\mathbb P[x<_Ly]$ comes to 1/2 in the case of specific posets or classes of posets. More precisely, the sorting probability for a poset $P$ is defined to be the minimum over all $x\ne y$ of $|\mathbb P[x<_Ly]-\mathbb P[y<_L x]|$ , where $L$ is a random linear extension of $P$ . In this language, the 1/3-2/3 conjecture states that the sorting probability of every poset (that is not totally ordered) is at most $1/3$ . (The best known bound to date, which has been known for 25 years, is $1/\sqrt 5$ .)

A general question of interest is to identify natural families of posets for which the sorting probability is small. For example, it is conjectured that the sorting probability tends to zero as the width (that is, the size of the largest antichain) tends to infinity. To date few such families have been found, the main difficulty being to identify a pair of elements $x,y$ for which one can prove that $\mathbb P[x<_Ly]$ is small. This paper takes a significant step in this direction, and introduces several new ideas into the area.

The posets examined in the paper come from Young diagrams. Recall that a Young diagram of an integer partitition $n=n_1+\dots+n_k$ with $n_1\geq\dots\geq n_k$ is an arrangement of $k$ rows of squares with $n_i$ squares in the $i$ th row (drawn with the first row at the top in “English notation” and at the bottom in “French notation”). Such a diagram inherits a partial order from $\mathbb Z^2$ : that is, we can label the box in the $i$ th row and $j$ th column by $(i,j)$ and then say that box $(i,j)$ is at most box $(k,l)$ if and only if $i\leq k$ and $j\leq l$ . If there are $n$ boxes in total, then a linear order that extends this partial order corresponds to a numbering of the boxes from 1 to $n$ such that each new number occupies a minimal position that has not yet been occupied. Equivalently, the numbering must increase along each row and down each column. Such a numbering gives a Young tableau.

Young diagrams and Young tableaux have many applications in algebraic combinatorics and in algebra more generally. For example, there is a one-to-one correspondence between Young diagrams of size $n$ and representations of the symmetric group $S_n$ , with the dimension of a given representation equalling the number of Young tableaux that can be formed from the corresponding Young diagram. (This number can be calculated using the hook-length formula.) Thus, the number of linear extensions of the poset corresponding to a Young diagram is a number of basic algebraic significance.

The paper shows, under various hypotheses on Young diagrams, that the sorting probability tends to zero. Before that, in what the authors describe as a “warm-up”, it gives a new and very elegant proof of the known result that the 1/3-2/3 conjecture holds for all Young diagrams. The ideas in this proof play an important part in the later arguments. One of the tools is called the Naruse hook-length formula, which concerns the number of linear extensions of a skew Young diagram, which is the difference between two Young diagrams, one of which is contained in the other. For the main results, this is combined with the use of Schur functions and random-walk estimates. With this mixture of techniques and algorithmic significance, the paper is of interest in algebraic combinatorics, probabilistic combinatorics, and theoretical computer science.

Read article at ArXiv