9

Is it possible to count exactly the number of binary strings of length $n$ that contain no two adjacent blocks of 1s of the same length? More precisely, if we represent the string as $0^{x_1}1^{y_1}0^{x_2}1^{y_2}\cdots 0^{x_{k-1}}1^{y_{k-1}}0^{x_k}$ where all $x_i,y_i \geq 1$ (except perhaps $x_1$ and $x_k$ which might be zero if the string starts or ends with a block of 1's), we should count a string as valid if $y_i\neq y_{i+1}$ for every $1\leq i \leq k-2$.

Positive examples : 1101011 (block sizes are 2-1-2), 00011001011 (block sizes are 2-1-2), 1001100011101 (block sizes are 1-2-3-1)

Negative examples : 1100011 (block sizes are 2-2), 0001010011 (block sizes are 1-1-2), 1101011011 (block sizes are 2-1-2-2)

The sequence for the first $16$ integers $n$ is: 2, 4, 7, 13, 24, 45, 83, 154, 285, 528, 979, 1815, 3364, 6235, 11555, 21414. For $n=3$, only the string 101 is invalid, whereas for $n=4$, the invalid strings are 1010, 0101 and 1001.
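For reference, these first terms can be reproduced by brute force. The following is a minimal sketch (the helper names `is_valid` and `brute_force_count` are illustrative, not part of the question) that enumerates all $2^n$ strings and checks the condition $y_i \neq y_{i+1}$ directly:

```python
from itertools import groupby, product

def is_valid(bits):
    """True if no two consecutive blocks of 1s have the same length."""
    # Lengths of the maximal runs of 1s, in order of appearance.
    blocks = [len(list(run)) for bit, run in groupby(bits) if bit == 1]
    return all(a != b for a, b in zip(blocks, blocks[1:]))

def brute_force_count(n):
    """Count valid binary strings of length n by exhaustive enumeration."""
    return sum(is_valid(bits) for bits in product((0, 1), repeat=n))

# Should agree with the sequence quoted above for small n.
print([brute_force_count(n) for n in range(1, 13)])
```

This is exponential in $n$, so it is only useful as a sanity check for small lengths.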

N. F. Taussig
Nocturne
  • 1
    Could you clarify what exactly you mean by "two consecutive blocks of 1's of the same length"? – inavda Oct 09 '20 at 21:08
  • 2
    @inavda The string **should not** contain a substring of the form 01{k}0+1{k}0, i.e. a block of 1s of some length $k$, followed by one or more zeros and then by another block of 1s of the same length $k$. – Nocturne Oct 09 '20 at 21:11
  • that is, the blocks of $1$ contained are all of different length ? – G Cab Oct 09 '20 at 21:13
  • 1
    @GCab They can be of the same length if they are not adjacent. 1101011 is a valid string for $n=7$. – Nocturne Oct 09 '20 at 21:14
  • 1
    Any two equal length blocks of 1s must be separated by more than a single 0. Is how I read it. Or possibly must have a block of 1s between them. This is unclear. Is 110011 valid, or are the two 11s considered adjacent? – Arthur Oct 09 '20 at 21:14
  • 1
    What I understand is that if $0^{x_1}1^{y_1}\cdots 0^{x_n}1^{y_n},$ then $y_{i}\neq y_{i+1}.$ – Phicar Oct 09 '20 at 21:16
  • 1
    Basically $110011$ is not allowed but both $1100111$ (diff length) and $1101011$ (same length but there is another sequence of $1$ in between) are allowed. – Math Lover Oct 09 '20 at 21:16
  • 1
    and also $110011$ is valid ? but not $110110$? – G Cab Oct 09 '20 at 21:16
  • @GCab 110011 is not valid, because two adjacent blocks of 1's have the same length. – Nocturne Oct 09 '20 at 21:17
  • @Phicar Yes, I will edit the question with your notation. – Nocturne Oct 09 '20 at 21:18
  • @GCab you need at least one sequence of $1$ of diff length in between for $2$ sequence of $1$ to have same length. – Math Lover Oct 09 '20 at 21:19
  • 2
    Are you interested only in exact results, or also about approximations for large $n$ ? – leonbloy Oct 10 '20 at 19:42
  • 2
    Maybe [this answer](https://math.stackexchange.com/a/1956058/573047) to a simpler problem using generating functions could give some useful idea. – BillyJoe Oct 10 '20 at 20:07
  • 2
    A few sequences that are surprisingly close but not exact matches in the OEIS: http://oeis.org/search?q=2%2C+4%2C+7%2C+13%2C+24%2C+45 – Qiaochu Yuan Oct 11 '20 at 00:59

5 Answers

4

I confirm your results for $n \le 16$. It might be useful to compute the values by conditioning on $k\in\{1,\dots,\lfloor(n+3)/2\rfloor\}$:
\begin{matrix}
n\backslash k & 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 & 9 \\ \hline
0 & 1 \\
1 & 1 & 1 \\
2 & 1 & 3 \\
3 & 1 & 6 & 0 \\
4 & 1 & 10 & 2 \\
5 & 1 & 15 & 8 & 0 \\
6 & 1 & 21 & 22 & 1 \\
7 & 1 & 28 & 48 & 6 & 0 \\
8 & 1 & 36 & 92 & 25 & 0 \\
9 & 1 & 45 & 160 & 77 & 2 & 0 \\
10 & 1 & 55 & 260 & 196 & 16 & 0 \\
11 & 1 & 66 & 400 & 437 & 74 & 1 & 0 \\
12 & 1 & 78 & 590 & 883 & 254 & 9 & 0 \\
13 & 1 & 91 & 840 & 1652 & 726 & 54 & 0 & 0 \\
14 & 1 & 105 & 1162 & 2908 & 1818 & 239 & 2 & 0 \\
15 & 1 & 120 & 1568 & 4869 & 4116 & 857 & 24 & 0 & 0 \\
16 & 1 & 136 & 2072 & 7819 & 8602 & 2627 & 156 & 1 & 0 \\
\end{matrix}

Maybe try inclusion-exclusion together with stars-and-bars? For fixed $k$, the first term of inclusion-exclusion is the number of nonnegative integer solutions to $$\sum_{j=1}^k x_j + \sum_{j=1}^{k-1} y_j = n - (k-2) - (k-1) = n-2k+3,$$ which is $$\binom{(n-2k+3) + (2k-1)-1}{(2k-1)-1} = \binom{n+1}{2k-2}.$$ For $k\in\{1,2\}$, this formula is correct. For $k \ge 3$, it is only an upper bound.
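To illustrate the claim, here is a small sketch that splits the brute-force count by $k$ (one more than the number of blocks of 1s) and compares it with $\binom{n+1}{2k-2}$. The function name `count_by_k` is made up for this example.

```python
from itertools import groupby, product
from math import comb

def count_by_k(n):
    """Map k -> number of valid strings of length n with k-1 blocks of 1s."""
    counts = {}
    for bits in product((0, 1), repeat=n):
        blocks = [len(list(run)) for bit, run in groupby(bits) if bit == 1]
        if all(a != b for a, b in zip(blocks, blocks[1:])):
            k = len(blocks) + 1
            counts[k] = counts.get(k, 0) + 1
    return counts

for n in range(2, 9):
    for k, exact in sorted(count_by_k(n).items()):
        first_term = comb(n + 1, 2 * k - 2)   # stars-and-bars count without y_i != y_{i+1}
        relation = "=" if exact == first_term else "<"
        print(f"n={n} k={k}: exact {exact} {relation} binomial {first_term}")
```

For $k \ge 3$ the binomial overcounts because it also includes the solutions with $y_i = y_{i+1}$, which the remaining inclusion-exclusion terms would have to remove.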


An alternative approach is to condition on the tail $(y_{k-1},x_k)$. Explicitly, let the state space be $$S_n = \left\{k \in \{1,\dots,\lfloor(n+3)/2\rfloor\}, y \in \{[k\not=1],\dots,n\}, x \in \{0,\dots,n-y-2k+5\}\right\}.$$ For $(k,y,x) \in S_n$, let $f_n(k,y,x)$ be the number of such binary strings that end in $1^y 0^x$. Then $f$ satisfies the recursion $$f_n(k,y,x) = \begin{cases} 1 &\text{if $n = 0$} \\ [y = 0 \land x = n] &\text{if $k = 1$} \\ \sum\limits_{\substack{(k-1,y_{k-2},x_{k-1}) \in S_{n-y-x}:\\ y_{k-2} \not= y \land ((y_{k-2} \ge 1 \land x_{k-1} \ge 1) \lor k=2)}} f_{n-y-x}(k-1,y_{k-2},x_{k-1}) &\text{otherwise} \end{cases}$$

The desired values are then $\sum\limits_{(k,y,x) \in S_n} f_n(k,y,x)$.
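The same idea can be sketched in a few lines of Python. Note that this is a simplified variant and not the exact recursion above: it drops $k$ and $x$ from the state and only remembers the length of the most recent block of 1s. The names `completions` and `count_valid` are assumptions made for this example.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def completions(remaining, last):
    """Ways to fill `remaining` positions, given that the previous symbol is a 0
    (or the string has not started) and the previous block of 1s had length
    `last` (0 if there is no previous block)."""
    if remaining == 0:
        return 1
    total = completions(remaining - 1, last)          # next symbol is a 0
    for y in range(1, remaining + 1):                 # next comes a block 1^y
        if y == last:
            continue                                  # equal adjacent blocks are forbidden
        if y == remaining:
            total += 1                                # the block ends the string
        else:
            total += completions(remaining - y - 1, y)  # block, then a separating 0
    return total

def count_valid(n):
    return completions(n, 0)

print([count_valid(n) for n in range(1, 17)])
```

There are $O(n^2)$ states with $O(n)$ work each, so this runs in polynomial time, unlike direct enumeration.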

RobPratt
  • Might be messy to program with all the conditions, but I guess this dynamic programming approach is the best we can get. Nice diligent work ! – Nocturne Oct 10 '20 at 22:46
  • 1
    @Nocturne thanks. I did code it up to make sure it is correct. What is the source of the problem? – RobPratt Oct 10 '20 at 23:16
3

An approximation for large $n$

The runs of $0$s and $1$s can be approximated by iid geometric random variables (with $p=1/2$, mean $2$). Hence we have on average $n/2$ runs, of which $n/4$ are runs of $1$s.

Then, the problem is asymptotically equivalent to: given $m=n/4$ iid geometric variables $X_1, X_2, \cdots, X_m$, find $P_m=$ probability that $X_{i+1} \ne X_i$ for all $i$.

This does not seem a trivial problem, though (and I haven't found any reference).

A crude approximation would be to assume that the events $X_{i+1} \ne X_i$ are independent. Under this assumption we get

$$P_m \approx P_2^{m-1}= (2/3)^{m-1} \tag 1$$

This approximation is not justified, and it does not seem to improve with $n$ increasing.

The exact value can be obtained by a recursion on the probabilities for each final value, which together with a GF gives me this recursion :

$$P_m = r(1,m) $$

$$r(z,m)= \frac{1}{2z-1} r(1,m-1) - r(2z,m-1) \tag 2$$

with the initial value $r(z,1)=\frac{1}{2z-1}$

Finally, the total number of valid sequences is approximately $C_m = P_m \, 2^n$ (with $n=4m$).

I've not yet found an explicit or asymptotic expression for $(2)$.
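Recursion $(2)$ is easy to evaluate exactly, since starting from $z=1$ only powers of two ever appear as the first argument. A minimal sketch with exact rationals follows; the function names are illustrative, and memoization via `lru_cache` is just one convenient choice.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def r(z, m):
    """Recursion (2): r(z, m) = r(1, m-1) / (2z - 1) - r(2z, m-1)."""
    if m == 1:
        return Fraction(1, 2 * z - 1)
    return r(1, m - 1) / (2 * z - 1) - r(2 * z, m - 1)

def P(m):
    """Probability that m iid geometric(1/2) variables have no equal neighbours."""
    return r(1, m)

def P_independent(m):
    """Crude approximation (1), treating the events as independent."""
    return Fraction(2, 3) ** (m - 1)

for m in range(1, 8):
    n = 4 * m
    print(n, m, round(float(P(m) * 2**n), 1), round(float(P_independent(m) * 2**n), 1))
```

For instance $P_3 = 10/21$, slightly above the independent estimate $(2/3)^2 = 4/9$.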

Some values of $C_m$

n    m  r(2)            iid(1)          exact
4    1  16              16              13
8    2  170.6           170.6           154
12   3  1950.5          1820.4          1815
16   4  21637.3         19418.1         21414
20   5  243540.2        207126.1        252680     
24   6  2720810.9       2209345.3       2981452
28   7  30515606.3      23566350.0      35179282
leonbloy
  • 2
    The values for $n\in\{17,\dots,28\}$ are $$39688, 73558, 136333, \color{red}{252680}, 468314, 867962, 1608659, \color{red}{2981452}, 5525763, 10241348, 18981131, \color{red}{35179282}.$$ – RobPratt Oct 11 '20 at 04:17
2

Here I am going to use generating functions, as in this answer to a related problem, to compute the columns of @RobPratt's table for $k \ge 3$.

We can define:

$$S_y(k,i) = \left\{\text{n. of solutions for} \sum_{j=1}^{k-1} y_j = i \text{ with } y_j \neq y_{j+1}\right\} \tag{1}\label{1}$$

and then decompose the problem as follows:

$$\left\{\text{n. of solutions for} \sum_{j=1}^k x_j + \sum_{j=1}^{k-1} y_j = n-2k+3 \right\}=\\ \sum_{i=0}^{n-2k+3}\left\{\text{n. of solutions for} \sum_{j=1}^k x_j = n-2k+3-i \right\}S_y(k,i) =\\ \sum_{i=0}^{n-2k+3}{n-k+2-i \choose k-1}S_y(k,i) \tag{2}\label{2}$$

When $k=3$, the problem of determining $S_y(k,i)=S_y(3,i)$ is the same as in the linked problem above, only with $2$ variables instead of $4$. Instead of repeating all the calculations we can reuse that answer, removing all terms with an exponent for $y$ greater than $2$, to get the generating function:

$$f(x)=\left[\frac{y^2}{2!}\right]\prod_{n\ge0}(1+yx^n) = \left[\frac{y^2}{2!}\right]\left( 1+\frac y{1-x}+ \frac12\frac{y^2}{(1-x)^2}\right)\left( 1-\frac12\,\frac{y^2}{1-x^2}\right)=\\ \frac{1}{(1-x)^2}-\frac{1}{1-x^2}=\sum_{n=0}^{\infty}\left\{\frac 12 \left[1+(-1)^{n+1}\right]+n\right\}x^n$$

where in the last step I have used WolframAlpha because I am lazy, and then:

$$S_y(3,i) = [x^i]f(x) = \frac 12 \left[1+(-1)^{i+1}\right]+i \tag{3}\label{3}$$

OK, yes, using generating functions for $k = 3$ and $y_1+y_2=i$ is a little overkill, because the $\eqref{3}$ result is obvious (once we choose a value for $y_1$, which can be done in $i+1$ ways, $y_2$ is determined; the first term then accounts for discarding the $y_1=y_2=i/2$ solution when $i$ is even). Anyway, substituting into $\eqref{2}$ we obtain the formula for the third column of @RobPratt's table:

$$\sum_{i=0}^{n-3}{n-1-i \choose 2}\left\{\frac 12 \left[1+(-1)^{i+1}\right]+i\right\}=\\ \frac 1{48} (2 n^4 - 8 n^3 + 4 n^2 + 8 n + 3 (-1)^n - 3)\tag{4}\label{4}$$

where again I have used WolframAlpha for the last step (verified against @RobPratt table here).
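The closed form can also be checked numerically without WolframAlpha. A short sketch (function names are made up for this example) compares the sum on the left of $(4)$ with the polynomial on the right, and with the $k=3$ column of the table:

```python
from math import comb

def S_y3(i):
    """Formula (3): pairs (y1, y2) of nonnegative integers with sum i and y1 != y2."""
    return i + (1 + (-1) ** (i + 1)) // 2

def column3_sum(n):
    """Left-hand side of (4): formula (2) specialised to k = 3."""
    return sum(comb(n - 1 - i, 2) * S_y3(i) for i in range(0, n - 2))

def column3_closed(n):
    """Right-hand side of (4)."""
    numerator = 2 * n**4 - 8 * n**3 + 4 * n**2 + 8 * n + 3 * (-1) ** n - 3
    return numerator // 48

# k = 3 column of the table, n = 3..16
table_column = [0, 2, 8, 22, 48, 92, 160, 260, 400, 590, 840, 1162, 1568, 2072]
print([column3_sum(n) for n in range(3, 17)] == table_column)
print([column3_closed(n) for n in range(3, 17)] == table_column)
```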

Still thinking how to extend this to $k \gt 3$...

BillyJoe
2

Consider a binary string with $s$ ones and $m$ zeros in total.
Let's put an additional (dummy) fixed zero at the start and at the end of the string. We define a run as the consecutive $1$'s between two successive zeros, thereby including runs of null length. With this scheme we have a fixed number of $m+1$ runs.

[Figure: Bernoulli_runs_2]

The number of different strings with the above numbers of zeros and ones is obviously $$ \left( \matrix{ m + s \cr s \cr} \right) = \left( \matrix{ m + 1 + s - 1 \cr s \cr} \right) $$ which corresponds to the weak compositions of $s$ into $m+1$ parts.

The number of compositions of $s$ into $k$ non-null parts (strong compositions) is instead $$ \binom{s-1}{k-1} $$ and $$ \eqalign{ & \left( \matrix{ m + s \cr s \cr} \right) = \sum\limits_{\left( {1\, \le } \right)\,k\,\left( { \le \,\min \left( {m + 1,s} \right)} \right)} {\left( \matrix{ m + 1 \cr k \cr} \right)\left( \matrix{ s - 1 \cr k - 1 \cr} \right)} = \cr & = \sum\limits_{\left( {1\, \le } \right)\,k\,\left( { \le \,\min \left( {m + 1,s} \right)} \right)} {\left( \matrix{ m + 1 \cr m + 1 - k \cr} \right)\left( \matrix{ s - 1 \cr k - 1 \cr} \right)} \cr} $$

So we can concentrate on strong compositions with no equal consecutive parts.
Consider the strong composition of $s$ into $k$ parts, the last of which is $r$ $$ \left[ {r_{\,1} ,\,r_{\,2} ,\; \cdots ,\,r_{\,k - 1} ,r\;} \right] \quad \left| {\;r_{\,1} + \,r_{\,2} + \; \cdots + \,r_{\,k - 1} + r = s} \right. $$ whose number is $$ C_T(s,k,r) = \left[ {k = 1} \right] + \left( \matrix{ s - r - 1 \cr k - 2 \cr} \right) = \left( \matrix{ s - r - 1 \cr s - r - k + 1 \cr} \right) \quad \left| \matrix{ \;1 \le k \le s \hfill \cr \;1 \le r \le s \hfill \cr} \right. $$ where $[P]$ denotes the Iverson bracket.
Then the sum over $r$ will correctly give $$ \eqalign{ & \sum\limits_{r = 1}^s {C_T (s,k,r)} = \sum\limits_{r = 1}^s {\left( \matrix{ s - r - 1 \cr s - r - k + 1 \cr} \right)} = \sum\limits_{j = 0}^{s - 1} {\left( \matrix{ j - 1 \cr j - k + 1 \cr} \right)} = \cr & = \sum\limits_{\left( {k - 1\, \le } \right)\,j\,\left( { \le \,s - 1} \right)} {\left( \matrix{ s - 1 - j \cr s - 1 - j \cr} \right)\left( \matrix{ j - 1 \cr j - k + 1 \cr} \right)} = \left( \matrix{ s - 1 \cr s - k \cr} \right) \cr} $$

Let's denote by $C_G (s,p,r), \; C_B (s,p,r)$ the numbers of good and bad strong compositions of $s$ into $p$ parts, the last of which equals $r$, where a composition is good if it has no equal consecutive parts.

Then we have the relations $$ \left\{ \matrix{ C_T (s,p,r) = \left[ {1 \le p \le s} \right]\left[ {1 \le r \le s} \right]\left( \matrix{ s - r - 1 \cr s - r - p + 1 \cr} \right) \hfill \cr C_G (s,1,r) = C_T (s,1,r) = \left[ {r = s} \right]\quad C_B (s,1,r) = 0 \hfill \cr C_G (s,p,r) + C_B (s,p,r) = C_T (s,p,r) \hfill \cr C_B (s,p,r) = \sum\limits_{k = 1}^{s - r} {C_B (s - r,p - 1,k)} + C_G (s - r,p - 1,r) = \hfill \cr = \sum\limits_{k = 1}^{s - r} {C_B (s - r,p - 1,k)} - C_B (s - r,p - 1,r) + C_T (s - r,p - 1,r) \hfill \cr C_G (s,p,r) = \sum\limits_{k = 1}^{s - r} {C_G (s - r,p - 1,k)} - C_G (s - r,p - 1,r) \hfill \cr} \right. $$

In particular for the good strong compositions we can write the recurrence $$ C_G (s,p,r) = \sum\limits_{k = 1}^{s - r} {C_G (s - r,p - 1,k)} - C_G (s - r,p - 1,r) + \left[ {1 = p} \right]\left[ {r = s} \right] $$

After computing $C_G$, we can sum on $r$ and then go back along the previous steps to compute the good weak compositions in terms of $s,m$ and finally the number in $n$, i.e.: $$ N_G (n) = \sum\limits_{s = 1}^n {\sum\limits_{p = 1}^{n - s + 1} {\left( \matrix{ n - s + 1 \cr p \cr} \right) \sum\limits_{r = 1}^s {C_G (s,p,r)} } } $$ which in fact for $0 \le n \le 16$ gives $$ 0, \, 1, \, 3, \, 6, \, 12, \, 23, \, 44, \, 82, \, 153, \, 284, \, 527, \, 978, \, 1814, \, 3363, \, 6234, \, 11554, \, 21413 $$ not counting as good the all zeros string.
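A direct transcription of the recurrence for $C_G$ and of the formula for $N_G(n)$ might look as follows. This is only a sketch: the boundary handling (returning $0$ outside $1 \le p \le s$, $1 \le r \le s$) is my reading of the Iverson brackets above.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def C_G(s, p, r):
    """Good strong compositions of s into p parts whose last part equals r."""
    if not (1 <= p <= s and 1 <= r <= s):
        return 0
    if p == 1:
        return 1 if r == s else 0
    # Any good composition of s - r into p - 1 parts, minus those ending in r.
    prefixes = sum(C_G(s - r, p - 1, k) for k in range(1, s - r + 1))
    return prefixes - C_G(s - r, p - 1, r)

def N_G(n):
    """Valid strings of length n, not counting the all-zeros string."""
    total = 0
    for s in range(1, n + 1):                 # s = number of ones
        for p in range(1, n - s + 2):         # p non-null runs among n - s + 1 slots
            good = sum(C_G(s, p, r) for r in range(1, s + 1))
            total += comb(n - s + 1, p) * good
    return total

print([N_G(n) for n in range(0, 17)])
```

Adding $1$ for the all-zeros string recovers the sequence in the question.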

RobPratt
G Cab
2

I am gonna attempt to complement RobPratt's proposed approach involving inclusion exclusion and stars and bars and be that person who posts a horribly long formula.

Consider $$A_{n,k,r}=\left |\left \{0^{l_1}1^{k_1}\cdots 0^{l_r}1^{k_r}0^{l_{r+1}}\in \{0,1\}^n: k_i>0,k_i\neq k_{i+1}, \sum _{i=1}^rk_i=k \text{ and for $i\neq 1,r+1,$ }l_i>0\right \}\right |.$$ Our desired result will be $$A_n=\sum _{k=0}^n\sum _{r=0}^nA_{n,k,r}.$$ Notice that we can, by the multiplication principle, express $A_{n,k,r}$ as $$A_{n,k,r}=|B_{n,k,r}|\times |C_{k,r}|,$$ where $$B_{n,k,r}=\left \{(l_1,\cdots ,l_{r+1})\in \left (\mathbb{Z}^{\geq 0}\right )^{r+1}:\sum l_i=n-k, l_i>0 \text{ for $1<i<r+1$}\right \}$$ represents the way to place the $0'$s and $$C_{k,r}=\{(k_1,\cdots ,k_r)\in \left (\mathbb{Z}^{> 0}\right )^{r}:\sum k_i=k,k_i\neq k_{i+1}\}.$$ represents the way to place the 1's.

By stars and bars we get that $|B_{n,k,r}|=\binom{n-k-(r-1)+(r+1)-1}{(r+1)-1}=\binom{n-k+1}{r}.$ Now, consider the following set $C_{k,r,x}=\{(k_1,\cdots ,k_r)\in \left (\mathbb{Z}^{> 0}\right )^{r}:\sum k_i=k,k_x=k_{x+1}\}$ which carries the words in which two consecutive chunks of 1's have the same size (at indices $x$ and $x+1$). To illustrate the next step, notice that $|C_{k,r,x}|=\sum _{t=1}^{\lfloor k/2\rfloor}\binom{k-2t-1}{r-2-1},$ for any $1\leq x<r$: the summands at positions $x$ and $x+1$ are the same, and their common value is $t.$

We can then express $$|C_{k,r}|=\binom{k-1}{r-1}-\sum _{\ell =1}^{r-1}(-1)^{\ell -1}\sum _{X\in \binom{[r-1]}{\ell}}\left | \bigcap_{x\in X} C_{k,r,x}\right |,$$ but now the problem starts again because we need to know how many chunks of (consecutive) elements are in the set $X$ for us to be able to know how many summands are equal. When we know this, we can use stars and bars. Call the number of chunks $s$ and call the size of the $i$-th chunk $\ell _i.$ Notice that we want $\ell _i>0$ and $\sum \ell _i=\ell.$ We then associate a number $t_i$ to be the common value of all summands in the $i$-th chunk (for a total contribution of $(\ell _i+1)t_i$ to the whole sum). We get then that $$|C_{k,r}|=\sum _{\ell =0}^{r-1}(-1)^{\ell}\sum _{s = 0}^{\ell}\sum _{\substack{ \ell_1+\cdots +\ell_s=\ell \\ \ell _i,t_i>0}}\binom{r-1-\ell -(s-1)+(s+1)-1}{s+1-1}\binom{k-\left (\sum _{i=1}^s(\ell_i+1)t_i\right )-1}{r-(\ell+s)-1}.$$ Putting all this together we get $$A_{n}=\sum _{k=0}^n\sum _{r=0}^n\binom{n-k+1}{r}\sum _{\ell =0}^{r-1}(-1)^{\ell}\sum _{s = 0}^{\ell}\sum _{\substack{ \ell_1+\ell_2+\cdots +\ell_s=\ell \\ \ell _i,t_i>0}}\binom{r-\ell}{s}\binom{k-\left (\sum _{i=1}^s(\ell_i+1)t_i\right )-1}{r-(\ell+s)-1}.$$ In this formula, by the combinatorial interpretation, treat $\binom{-1}{-1}=1.$

Using sage I get the sequence $2,4,7,13,24,45,83,154,\dots.$
For the moment I do not see any way to make this less painful.
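As a cross-check of the decomposition $A_{n,k,r}=|B_{n,k,r}|\times |C_{k,r}|$, one can skip the inclusion-exclusion altogether and count $|C_{k,r}|$ by a direct recursion on the parts. This is a sketch in Python rather than the Sage code mentioned above, and the helper `count_C` is a made-up name.

```python
from functools import lru_cache
from math import comb

@lru_cache(maxsize=None)
def count_C(k, r, previous=0):
    """|C_{k,r}|: compositions of k into r positive parts with k_i != k_{i+1}.
    `previous` is the part just before the ones still to be chosen (0 if none)."""
    if r == 0:
        return 1 if k == 0 else 0
    return sum(count_C(k - part, r - 1, part)
               for part in range(1, k + 1) if part != previous)

def A(n):
    """Sum over k ones and r chunks of |B_{n,k,r}| * |C_{k,r}|, with |B| = binom(n-k+1, r)."""
    return sum(comb(n - k + 1, r) * count_C(k, r)
               for k in range(n + 1) for r in range(k + 1))

print([A(n) for n in range(1, 13)])
```

Here the term $k=r=0$ contributes the all-zeros string, so no correction is needed at the end.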


Another approach that might lead somewhere: construct the DFA associated with that language (fix the maximal number of chunks to be $n$ and the maximum number of $1$'s in each chunk to be $k$). The DFA looks like a $k\times n$ array, to which one can apply the Chomsky-Schützenberger technique. One has to solve a system of $k(n-k)$ equations and then try to take the limit as $n,k$ go to $\infty.$ The system to solve in variables $R_{i,j}\in \mathbb{Q}[[x]]$ looks like $$R_{i,j}=\begin{cases}xR_{i,j+1}+xR_{j,0}+[i\neq j] & \text{If }j>0\\ xR_{i,1}+xR_{i,0}+1 & \text{If }j=0.\end{cases}$$

Phicar