8

Let $w_{0}$ denote the finite word $01$ in the free monoid $\{ 0, 1 \}^{\ast}$, and for $i \in \mathbb{N}$ define $w_{i}$ as the word obtained by adjoining the first $\left\lfloor \frac{\ell(w_{i-1})}{2} \right\rfloor$ entries in $w_{i-1}$ to the right of $w_{i-1}$. We thus have that:

\begin{align*} w_{0} & = 01 \\ w_{1} & = 010 \\ w_{2} & = 0100 \\ w_{3} & = 010001 \\ w_{4} & = 010001010 \\ & \text{etc.} \end{align*}

Let $$ w = 0100010100100010001010001010010001010010000100010100100010001010100 \ldots$$ denote the infinite binary word obtained in the limit, with respect to the sequence $(w_{i} : i \in \mathbb{N}_{0})$. Since the construction of this infinite word is very simple and natural, it is surprising that the integer sequence $$(0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, \ldots)$$ given by the consecutive entries in $w$ is not currently in the On-Line Encyclopedia of Integer Sequences (OEIS).

Recall that the subword complexity function $\sigma_{v} = \sigma : \mathbb{N} \to \mathbb{N}$ of an infinite word $v$ is the function on $\mathbb{N}$ that maps $n \in \mathbb{N}$ to the number of distinct factors of $v$ of length $n$. Given the simple definition of the binary word $w$, it is natural to ask: what is $\sigma_{w}$? It is not obvious to me how to find a closed-form evaluation of the sequence $$(\sigma_{w}(n) )_{n \in \mathbb{N}} = (2, 3, 5, 8, 12, \ldots),$$ since proving a statement of the form $\sigma_{w}(n) = m$ for fixed $n \in \mathbb{N}$ (where $m \in \mathbb{N}$) appears to be nontrivial in general. However, for certain 'small' values of $n \in \mathbb{N}$, the evaluation of $\sigma_{w}(n)$ is relatively trivial. For example, using induction, it is easily seen that $\sigma_{w}(2)=3$.

It is also natural to ask: What is the abelian complexity function of $w$?

John M. Campbell
  • 1,809
  • 1
  • 11
  • 35
  • 1
    See also http://mathoverflow.net/questions/177996/do-runs-of-every-length-occur-in-this-sequence and http://math.stackexchange.com/questions/876937/do-runs-of-every-length-occur-in-this-string concerning repeatedly appending the *last* "half" of the current word (rather than the first "half"), starting with $001$. Regarding your "computed" values, I would be extremely cautious about the results of direct searches, because the first occurrence of some relatively short words might be infeasibly late to compute. – r.e.s. Apr 20 '16 at 17:06
  • 1
    Are you sure that $\sigma_w(7) =25$ (rather than $26$)? – r.e.s. Apr 20 '16 at 17:48
  • 1
    (My answer suggests that $\sigma_w(7)=\mathrm{Fibonacci}_{7+2}=34$, but that around $10^{11}$ digits would be needed to see an occurrence of each of the $34$ words.) – r.e.s. Apr 22 '16 at 06:07

2 Answers2

2

Claim 1. For any word $x\in\{0,1\}^*$, ($x$ is not a subword of $w$) iff ($11$ is a subword of $x$); i.e., the only words that are not subwords of $w$ are those that contain the subword $11$.

Proof:

  1. $x$ is not a subword of $w$ if $11$ is a subword of $x$

    Let $suffix(w_i)$ and $prefix(w_i)$ denote any suffix and prefix, respectively, of any of the $w_i$. Now, a word $x$ occurs as a subword of $w$ iff it is formed at the junction of some concatenation $(+)$ of $suffix(w_i) + prefix(w_i)$ during some iteration of the rewrite rule in the recursive definition of $w$. E.g., all length-$2$ words occur, except for $11$ -- the latter never occurs because $..1+1..$ cannot occur, as $1..$ is never a $prefix(w_i)$. Since $11$ doesn't occur as a subword in $w$, neither does any word in which $11$ occurs as a subword.

  2. $x$ is not a subword of $w$ only if $11$ is a subword of $x$

    (--to be completed--)

Claim 2. The number of words in $\{0,1\}^n$ that do not contain the subword $11$ is equal to the Fibonacci number $F_{n+2}=F_{n+1} + F_n$, with $F_0=0, F_1=1$.

Proof: See How many $N$ digits binary numbers can be formed where $0$ is not repeated. The same argument applies when $1$ (rather than $0$) is not repeated.

Claim 3. $\sigma_w(n) = F_{n+2}.$

Proof: Immediate from Claim 1 & Claim 2.

Claim 4. $\sigma_w^\mathrm{abelian}(n) = \left\lfloor\frac{n+3}{2}\right\rfloor.$

Proof: This follows from Claim 1 & Claim 2, together with the observation that for each $k$ in $\{0,1,2,...\left\lfloor\frac{n}{2}\right\rfloor\}$, there is a word in $\{0,1\}^n$ containing exactly $k$ $1$s and no $11$-subword, but that for any larger $k$ there is no such word.

Computations using Sage confirm these claims for small $n$, but the first-occurrence times of some relatively short words are infeasibly large, as suggested by the following table of results:

\begin{array}{|c|c||c|} n & \sigma_w(n) & \sigma_w^\mathrm{abelian}(n) & \mathrm{index\ of\ the\ 1st\ occurrence\ of\ }0^n \\ \hline 1&2&2&0 \\ 2&3&2&2 \\ 3&5&3&2 \\ 4&8&3&39 \\ 5&13&4&40360461 \\ 6&21^*&4^*&> 7854933623 \end{array}

$(^*)$ The longest prefix that my system can directly compute is $w_{55}$, which has length $7854933628$. Except for $0^6$, it contains all of the $21$ length-$6$ subwords that do not contain $11$.

r.e.s.
  • 12,647
  • 1
  • 33
  • 65
2

Not a full answer, but probably a useful reference.

You mention that the integer sequence $$(0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 1, 0, 1, 0, \ldots)$$ given by the consecutive entries in $w$ is not currently in OEIS. However, the sequence of positions of $1$ in this sequence, namely: $$ 1, 5, 7, 10, 14, 18, 20, 24, 26, 29, 33, 35, \dotsm $$ appears as A020942, First column of 3rd-order Zeckendorf array, in OEIS. Quoting OEIS

Any number $n$ has unique representation as a sum of terms from $\{1, 2, 3, 4, 6, 9, 13, 19, ...\}^{*}$ such that no two terms are adjacent or pen-adjacent; e.g., $7=6+1$. Sequence gives all $n$ where that representation involves $1$.

$^{(*)}\scriptsize\text{ These terms obey the recurrence equation $a(n) = a(n-1) + a(n-3)$.}$

Two references are given

[1] Larry Ericksen and Peter G. Anderson, Patterns in differences between rows in k-Zeckendorf arrays, The Fibonacci Quarterly 50, February 2012.

[2] C. Kimberling, The Zeckendorf array equals the Wythoff array, Fibonacci Quarterly 33 (1995) 3-8.

Reference [1] gives interesting connections with the sequence of words $w_k$ on a $k$-letter alphabet defined as follows: \begin{align} w_0 &= a_0,\\ w_1 &= a_0a_1,\\ &\ \vdots\\ w_{k-1} &= a_0a_1 \dotsm a_{k-1} \end{align} and $w_i = w_{i-1}w_{i-k}$ for $i \geqslant k$. This makes this article a reasonable approach towards the solution of your problem.

J.-E. Pin
  • 36,191
  • 3
  • 31
  • 82