This seems to be an elementary question, but it's proving hard for me to just Google. Suppose you have a sequence which picks elements out of $\{a, a^{-1}, b, b^{-1}, c, c^{-1}\}$ with equal probability. After, say, seven steps you'll get words like $a b c c^{-1} a c b^{-1} $ which in this case reduces to $a b a c b^{-1} $ after canceling inverses. My question is this: if I write $c$ as some string of $a, a^{-1}, b, b^{-1}$, (and write $c^{-1}$ as the inverse of that) are there any theorems about the expected length of the new sequence after reduction? For example, I could take $c = a^{-1} b b$ so my previous word becomes just $ab b$. I care mostly about the case of a very long word.

Mr. G
  • You are looking at a random walk on a tree. It is known that such a walk tends to infinity linearly. The statement holds more generally for random walks on hyperbolic groups. – Seirios Dec 13 '17 at 08:25
  • @Seirios Right, so clearly the sequence of $\{a, a^{-1}, b, b^{-1}, c, c^{-1}\}$ (where each happens with probability $1/6$) goes to infinity at a rate $(5/6)(1) + (1/6)(-1) = 2/3$. But I'm wondering what happens when I replace $c$ with a function of $a, a^{-1}, b, b^{-1}$. If I write $c = a b^{-1} a$, then I'm taking a random walk on $\{a, a^{-1}, b, b^{-1}, a b^{-1} a, a^{-1} b a^{-1}\}$, where each of those six elements still happens with probability $1/6$. But this sequence grows faster than $2/3$ - my question is how much faster? – Mr. G Dec 13 '17 at 15:28
  • Does the word $aba^{-1}$ reduce to $b$, i.e., do the letters commute? – toliveira Jan 06 '18 at 01:35
  • @toliveira No, nothing commutes here. Well except $a a^{-1} = a^{-1} a = 1$ and such. – Mr. G Jan 06 '18 at 01:58

1 Answer


The keywords are random walks on (hyperbolic) groups and speed of escape. There are many references on the subject (e.g. this one, although it's likely to be too general for your problem), and the precise result you seem to be interested in is a theorem by Kesten (1959).


Let $F_2$ be the free group on two generators. Let us consider a sequence of i.i.d. $F_2$-valued random variables $(X_k)$, with $\mathbb{P} (X_0 = d) = 1/6$ for $d\in \{a,a^{-1}, b, b^{-1}, c, c^{-1}\}$ (replace $c$ by any word in $a$, $b$ you like). Note that the $X_k$'s are symmetric: $X_k^{-1}$ has the same distribution as $X_k$, and the support of their distribution generates $F_2$.

Define a random walk on $F_2$ by taking $S_0 := e$, and $S_{n+1} = X_n S_n$.

The group $F_2$ is endowed with the word metric, that is $d(e, g)$ is the length of the shortest way to write $g$ as a product of the generators $a$, $b$ and their inverses. What you are interested in is $d(e, S_n)$.
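To make the word metric concrete, here is a minimal Python sketch of free reduction, checked on the example from the question. The encoding is an assumption made for illustration only: lowercase letters stand for generators and uppercase letters for their inverses (so `A` is $a^{-1}$), and the helper names `reduce_word`, `invert_word` and `substitute` are hypothetical.

```python
def inverse(letter: str) -> str:
    """Inverse of a single letter: 'a' <-> 'A', 'b' <-> 'B', 'c' <-> 'C'."""
    return letter.swapcase()


def invert_word(word: str) -> str:
    """Inverse of a word: reverse the order and invert every letter."""
    return "".join(inverse(x) for x in reversed(word))


def reduce_word(word: str) -> str:
    """Freely reduce a word by cancelling adjacent inverse pairs (stack-based)."""
    stack = []
    for letter in word:
        if stack and stack[-1] == inverse(letter):
            stack.pop()           # x x^{-1} cancels
        else:
            stack.append(letter)
    return "".join(stack)


def substitute(word: str, c: str) -> str:
    """Replace the letter c by a word in a, b, and c^{-1} by the inverse word."""
    c_inv = invert_word(c)
    return "".join(c if x == "c" else c_inv if x == "C" else x for x in word)


example = "abcCacB"                    # a b c c^{-1} a c b^{-1}
expanded = substitute(example, "Abb")  # c = a^{-1} b b
reduced = reduce_word(expanded)
print(reduce_word(example), reduced, len(reduced))
```

With this encoding, $d(e, g)$ is simply `len(reduce_word(g))` for a word `g` written in $a$, $b$ and their inverses.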


Note that $d(e, S_{n+m}) \leq d(e, S_n)+d(e, S_{n+m} S_n^{-1})$, and that $S_{n+m} S_n^{-1} = X_{n+m-1} \cdots X_n$ has the same distribution as $S_m$. Hence, the stochastic process $(d(e, S_n))_{n \geq 0}$ is subadditive. Some general results of ergodic theory (e.g. Kingman's subadditive theorem, which is a generalization of the law of large numbers) tell you that there is a real $\ell \geq 0$ such that, almost surely,

$$\ell = \lim_{n \to + \infty} \frac{d(e, S_n)}{n}.$$

Note that this is true for any discrete group (with a left-invariant metric) and any random walk with bounded steps. However, the limit can easily be $0$; indeed, if you take a simple random walk on $\mathbb{Z}^d$, then the central limit theorem tells you that $d(e, S_n)$ is of the order of $\sqrt{n}$, and thus grows sub-linearly.
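The one-dimensional case can be checked by hand without the full central limit theorem. For the simple random walk on $\mathbb{Z}$ with steps $\pm 1$, the steps are independent with mean $0$ and variance $1$, so $\mathbb{E}[S_n^2] = n$ and, by Jensen's inequality,

$$\mathbb{E}\, d(e, S_n) = \mathbb{E}\, |S_n| \leq \sqrt{\mathbb{E}[S_n^2]} = \sqrt{n}, \qquad \text{hence} \qquad \frac{\mathbb{E}\, d(e, S_n)}{n} \leq \frac{1}{\sqrt{n}} \to 0.$$

The strong law of large numbers gives the almost sure version directly: $S_n / n \to 0$, so $\ell = 0$.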

(Non)-amenable groups

At this point, it is useful to distinguish between two kinds of groups: those which are amenable, and those which are not. While there are many equivalent definitions, I'll just give two criteria:

  • If a discrete group has polynomial (or sub-exponential) growth, then it is amenable. For instance, a ball of radius $n$ in $\mathbb{Z}^d$ has $\sim n^d$ elements, so $\mathbb{Z}^d$ has polynomial growth.

  • If a discrete group contains a copy of $F_2$ as a subgroup, then it is not amenable.

A theorem of Kesten

It turns out that non-amenability is enough to guarantee a linear rate of growth!

Theorem (Kesten)

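Let $G$ be a finitely generated group. Let $(S_n) = (X_{n-1} \ldots X_0)$ be a random walk on $G$. Let $\mu \in \mathcal{P} (G)$ be the distribution of $X_0$. Assume that the support of $\mu$ is bounded, and generates $G$. Assume furthermore that the random walk is symmetric, i.e. $X_0$ and $X_0^{-1}$ have the same distribution. If $G$ is non-amenable, then $\ell >0$.

(The converse does not hold in general: some amenable groups, such as the lamplighter group over $\mathbb{Z}^3$, also carry symmetric random walks with $\ell > 0$. What non-amenability characterizes exactly is the exponential decay of the return probabilities.)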

Your problem fits this framework perfectly. In other words, no matter what $c$ you choose, almost surely, the length of the reduced words will grow linearly (with a speed $\ell$ which depends only on $c$).
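For comparison, the one case where $\ell$ can be read off directly is when the steps are free generators. As a sketch, assume (this is not the setting of the question) the simple random walk on the free group $F_k$ with uniform steps among its $2k$ letters, with length measured in those same letters. Away from the identity exactly one of the $2k$ letters cancels, and for $k \geq 2$ the walk returns to the identity only finitely often, so

$$\ell = \frac{2k-1}{2k} - \frac{1}{2k} = \frac{k-1}{k}, \qquad \text{e.g. } \ell = \tfrac{1}{2} \text{ for } k = 2, \quad \ell = \tfrac{2}{3} \text{ for } k = 3.$$

The value $2/3$ is the rate computed in the comments when $c$ is treated as a third free letter. Once $c$ is replaced by a word in $a$, $b$, both the set of steps and the metric change, so this formula no longer applies.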

I guess that the next step would be to find $\ell$ as a function of $c$, and even get a more elementary proof for given (not too long) $c$. That said, before going to the specifics, I believe that a rough picture of the general theory is useful.
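Short of a closed formula, $\ell$ can at least be estimated numerically. The following is a minimal Monte Carlo sketch, not a rigorous computation: the function name `estimate_speed`, the encoding (uppercase letters for inverses) and the default parameters are assumptions made for illustration. It multiplies on the right rather than on the left, which does not change the distribution of the length because the steps are i.i.d.

```python
import random


def estimate_speed(c: str, n_steps: int = 200_000, seed: int = 0) -> float:
    """Estimate ell = lim d(e, S_n) / n for steps uniform on {a, A, b, B, c, c^{-1}}.

    `c` is a word in a, A, b, B, where A = a^{-1} and B = b^{-1}.
    """
    rng = random.Random(seed)
    c_inv = "".join(x.swapcase() for x in reversed(c))
    steps = ["a", "A", "b", "B", c, c_inv]
    stack = []  # reduced form of the current word
    for _ in range(n_steps):
        # Append the chosen step letter by letter, cancelling as we go.
        for letter in rng.choice(steps):
            if stack and stack[-1] == letter.swapcase():
                stack.pop()
            else:
                stack.append(letter)
    return len(stack) / n_steps


if __name__ == "__main__":
    for word in ["", "aB", "aBa", "Abb"]:
        print(repr(word), estimate_speed(word))
```

As a sanity check, taking `c` to be the empty word makes the walk a lazy simple random walk on $F_2$ (it stays put with probability $1/3$), whose speed is $\frac{2}{3} \cdot \frac{1}{2} = \frac{1}{3}$; the estimate should land close to that value. The fluctuations of $d(e, S_n)/n$ are of order $n^{-1/2}$, so more digits require many more steps.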

D. Thomine
  • Thanks. Does this give me any hope of actually computing $\ell$ though? That's really the hard part, isn't it? Finding $d(e, S_n)$ doesn't seem to be any more feasible than doing the multinomial expansion of $(a + a^{-1} + b + b^{-1} + a b^{-1} + b a^{-1})^n$ (or whatever) and counting the occurrences of words of a given length. – Mr. G Jan 07 '18 at 17:08
  • @Mr. G: My best guess is that the first three letters of $S_n$ converge for large $n$ to a stationary distribution. Computing this distribution should be feasible, and getting $\ell$ from this distribution should be easy (because the average length gained from $S_n$ to $S_{n+1}$ depends on the first three letters of $S_n$ and on $X_n$, so you only have to average among all possibilities). I don't know if this idea can be made rigorous. – D. Thomine Jan 07 '18 at 20:24
  • I think you're exactly right about the stationary distribution. The problem was that it seemed impossible to find the distribution as the fixed point of a recurrence relation. Say $c = a b^{-1}$ and say you consider the twelve possible values of $aa, ab, a b^{-1}$ etc. as your "states." The probability of transitioning to any state at step $n + 1$ depends on the first four letters of your current word (because you can remove $0, 1$ or $2$ at each step). But those four letters don't have to be the $n$ and $n - 1$ states; that information could have been lost if you just went backwards. – Mr. G Jan 07 '18 at 20:52