# Two solutions with 20 tests

I am about $10=\lceil\log_2 1000\rceil$ years late to the party (weird coincidence since it's the solution to $T_*(1,1000)$), yet I am surprised no one has mentioned a solution with $20$ steps for the $T_*(2,1000)$ problem. (The notation I am using is that $T_*(k,N)$ represents the maximum number of steps required to find exactly $k$ poisoned wine bottles among $N$ wine bottles using algorithm $*$.)

There is an algorithm known as the Generalised Binary Splitting, developed by Hwang, and for the sake of completeness, I shall reproduce that here later.

But first, here's an algorithm for the specific case of $T_s(k=2,N)$ which we claim can be solved in at most $4\lceil\log_4N\rceil$ tests, which gives, for $N=1000$, $T_s(2,1000)=4\lceil\log_41000\rceil=20$ tests.

(Side note: neither of these algorithms is proven to be optimal, although both are near optimal, as one can check by comparing their answers with the information lower bound $\lceil\log_2\binom{N}{k}\rceil$, which is an obvious lower bound but is in general unachievable.)
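To make the comparison in the side note concrete, here is a small sketch that evaluates the information lower bound $\lceil\log_2\binom{N}{k}\rceil$ and the claimed bound $4\lceil\log_4N\rceil$ for $N=1000$. The helper names `ceil_log`, `info_lower_bound` and `split_bound` are made up for this illustration.

```python
from math import comb

def ceil_log(n, base):
    """Smallest integer m with base**m >= n, using exact integer arithmetic."""
    m, power = 0, 1
    while power < n:
        power *= base
        m += 1
    return m

def info_lower_bound(k, n):
    """ceil(log2(C(n, k))): each test yields at most one bit of information."""
    return ceil_log(comb(n, k), 2)

def split_bound(n):
    """The claimed bound 4 * ceil(log4(n)) for two poisoned bottles."""
    return 4 * ceil_log(n, 4)

print(info_lower_bound(2, 1000))  # 19
print(split_bound(1000))          # 20
```

So for $N=1000$ the $20$-test bound sits one test above the information lower bound.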

Here's a proof of the claim. We proceed via induction on $N$. For $2\le N\le4$, we simply check every single bottle individually, so the number of tests is $N\le 4$; since $T_s(2,N)=4\lceil\log_4N\rceil=4$ in this range, the result holds in these cases.

Take these as the base cases and assume that the statement is true for every number of items $\le N-1$. Then for the case of $N$ items, consider the following algorithm:

Step 1: If we are in a base case, the number of tests required is at most $4$, so add $4$ and stop.

Step 2: If not, make $4$ groups of bottles such that there are $n_1$ groups of $\lambda=2^{\lceil \log_2(N/4)\rceil}$ bottles and another $n_2$ groups of $b\le \lambda$ bottles. This gives us 2 equations: $$n_1\lambda+n_2b=N$$ $$n_1+n_2=4$$ Eliminating $n_1$ gives: $$n_2(\lambda-b)=4\lambda-N$$ We only consider the solutions $(n_2,b)\in \mathbb Z^2$ which satisfy $0\le n_2\le 4$ and $0\le b\le \lambda$. Choose the pair such that $n_2$ is maximum (this minimizes $n_1$, but any solution will work; we fix one just for the sake of a concrete algorithm). Observe that $$\lceil \log_2 b\rceil\le\lceil\log_2\lambda\rceil$$

We label the groups with their sizes ($\lambda$ and $b$). Now we test each group, which requires $t_1=4$ tests. Consider the possibilities. If just $1$ group signals poisoned, then both poisoned bottles are in it, and it is either a $\lambda$ group or a $b$ group. If $2$ groups signal poisoned, then each of them contains exactly one poisoned bottle, and the possibilities are $2\ \lambda$ groups signalling or $2\ b$ groups signalling or a $\lambda$ and a $b$ group signalling. These scenarios are further referred to as $A,B,C,D$ and $E$ respectively.
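The choice of group sizes described above can be sketched as follows. This is only an illustration under the stated constraints ($0\le n_2\le4$, $0\le b\le\lambda$, largest $n_2$ preferred); the function name `group_sizes` and its return convention are assumptions of mine.

```python
def group_sizes(n):
    """Return (lam, n1, n2, b) for the four-group split, or None if no pair fits.

    Assumes n > 4; smaller n are the base cases handled separately.
    """
    lam = 1
    while 4 * lam < n:           # lam = 2**ceil(log2(n / 4))
        lam *= 2
    deficit = 4 * lam - n        # must equal n2 * (lam - b)
    for n2 in range(4, 0, -1):   # prefer the largest n2
        if deficit % n2 == 0:
            b = lam - deficit // n2
            if 0 <= b <= lam:
                return lam, 4 - n2, n2, b
    return None                  # no admissible (n2, b) under these constraints

print(group_sizes(1000))  # (256, 0, 4, 250): four groups of 250
print(group_sizes(1001))  # (256, 3, 1, 233): three groups of 256, one of 233
```

Note that the sketch returns `None` when no pair meets the constraints. This happens for $N=9$, where $\lambda=4$ and $4\lambda-N=7$; an actual implementation would need a fallback there, for example four groups of unequal sizes that are each at most $\lambda$.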

We shall use the result that $1$ poisoned bottle in $n$ bottles can be found in at most $\lceil \log_2 n\rceil$ tests.
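For completeness, here is a minimal sketch of the halving procedure behind that result. The names `find_one` and `is_poisoned_group` are hypothetical; `is_poisoned_group` stands in for one test on a mixed sample of the given bottles.

```python
def find_one(bottles, is_poisoned_group):
    """Locate the single poisoned bottle by repeated halving.

    Returns (bottle, tests_used). Assumes exactly one bottle is poisoned.
    """
    candidates = list(bottles)
    tests = 0
    while len(candidates) > 1:
        half = candidates[: len(candidates) // 2]
        tests += 1
        if is_poisoned_group(half):
            candidates = half                    # the poison is in the tested half
        else:
            candidates = candidates[len(half):]  # otherwise it is in the rest
    return candidates[0], tests

bottle, tests = find_one(range(1000), lambda group: 617 in group)
print(bottle, tests)
```

Each round leaves at most $\lceil n/2\rceil$ candidates, which is where the $\lceil\log_2 n\rceil$ bound comes from.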

Analysing $C,D$ and $E$ first, each signalling group contains exactly one poisoned bottle, so identifying the $2$ poisoned bottles in the groups can be done in $$C\to 2\lceil \log_2 \lambda\rceil$$ $$D\to 2\lceil \log_2 b\rceil\le 2\lceil \log_2 \lambda\rceil$$ $$E\to \lceil \log_2 b\rceil+\lceil \log_2 \lambda\rceil\le 2\lceil \log_2 \lambda\rceil$$ steps. They are all bounded by $2\lceil \log_2 \lambda\rceil$, and thus the total number of steps is bounded by $T_1=2\lceil \log_2 \lambda\rceil+4$. Putting in the definition of $\lambda$ and simplifying, we get the total number of tests to be $$T_1=2\lceil\log_2N\rceil=2\lceil2\log_4N\rceil\le 4\lceil\log_4N\rceil=T_s(2,N)$$

Now we analyse $A$ and $B$, where both poisoned bottles lie in a single group. If that group is a base case, go to step $1$. If not, go to the beginning of step $2$ with $N:=\lambda$ for $A$ or $N:=b$ for $B$.

By induction, this last step requires at most $4\lceil\log_4\lambda\rceil$ or $4\lceil\log_4b\rceil$ steps to identify the poisoned bottles. We need the maximum of these two, which is easy to see once we note that $4\lceil\log_4b\rceil\le 4\lceil\log_4\lambda\rceil$, and hence $T_2=4\lceil\log_4\lambda\rceil+4$. Simplifying again, we have $$T_2=4\left\lceil\frac12\lceil\log_2N\rceil\right\rceil=4\left\lceil\frac12\lceil2\log_4N\rceil\right\rceil\le 4\lceil\log_4N\rceil$$ This completes the proof. $\square$
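The two "simplifying" steps both rest on the same observation, which can be spelled out as follows (with $N=1000$ as a check):

$$\log_2\lambda=\lceil\log_2(N/4)\rceil=\lceil\log_2N\rceil-2$$
$$T_1=2\left(\lceil\log_2N\rceil-2\right)+4=2\lceil\log_2N\rceil$$
$$T_2=4\left\lceil\tfrac12\left(\lceil\log_2N\rceil-2\right)\right\rceil+4=4\left\lceil\tfrac12\lceil\log_2N\rceil\right\rceil$$

For $N=1000$ we get $\lambda=256$, so $T_1=2\cdot 8+4=20$ and $T_2=4\cdot\lceil 8/2\rceil+4=20$, both equal to $4\lceil\log_41000\rceil=20$. The final inequality for $T_2$ uses $\lceil\lceil y\rceil/2\rceil=\lceil y/2\rceil$ with $y=\log_2N=2\log_4N$.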

The Generalised Binary Splitting kind of uses the same idea, but is much more general. The algorithm goes as follows:

- Step 1: If $N\le 2k-2$, test the $N$ items individually. If $N\ge2k-1$, set $l=N-k+1$ and define $\alpha=\lfloor\log_2(l/k)\rfloor$.
- Step 2: If $N>2k-2$, test a group of size $2^\alpha$. If the outcome is negative, the $2^\alpha$ items in the group are identified as good. Set $N:=N-2^\alpha$ and go to Step 1. If the outcome is positive, use binary splitting to identify one defective and an unspecified number, say $x$, of good items. Set $N:=N-1-x$ and $k:=k-1$. Go to Step 1.
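The two steps above can be simulated with a short sketch. This is my reading of the procedure rather than a reference implementation: the name `generalised_binary_splitting` and the oracle `test` (one test on a group of items) are assumptions, and $k$ is taken to be the exact number of defectives.

```python
def generalised_binary_splitting(items, k, test):
    """Simulate the two-step procedure; returns (defectives_found, tests_used)."""
    pool = list(items)
    found, tests = [], 0
    while k > 0 and pool:
        n = len(pool)
        if n <= 2 * k - 2:
            # Step 1: test the remaining items one by one, then stop.
            for item in pool:
                tests += 1
                if test([item]):
                    found.append(item)
            break
        l = n - k + 1
        alpha = (l // k).bit_length() - 1  # floor(log2(l / k)), valid since l >= k
        group, pool = pool[: 2 ** alpha], pool[2 ** alpha:]
        tests += 1
        if not test(group):
            continue                       # the whole group is good; drop it
        # Positive outcome: binary splitting to isolate one defective.
        while len(group) > 1:
            half = group[: len(group) // 2]
            rest = group[len(half):]
            tests += 1
            if test(half):
                pool = rest + pool         # `rest` is unresolved, return it to the pool
                group = half
            else:
                group = rest               # `half` is good and is discarded
        found.append(group[0])
        k -= 1
    return found, tests

bad = {123, 877}
found, tests = generalised_binary_splitting(
    range(1000), 2, lambda group: any(x in bad for x in group)
)
print(sorted(found), tests)
```

The items discarded in the `else` branch play the role of the $x$ good items, so the pool shrinks by exactly $1+x$ after each defective is found.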

This algorithm gives: $$T_G(k,N)=\begin{cases}N&\text{for}\ N\le 2k-2\\ (\alpha+2)k+p-1&\text{for}\ N\ge 2k-1\end{cases}$$
where $p<k$ is a non-negative integer uniquely defined in $l=2^\alpha k+2^\alpha p+\theta,\ 0\le\theta<2^\alpha$.
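A direct transcription of this closed form is handy for checking values. The function name `t_g` is just a label for this sketch, and $p$ and $\theta$ are obtained by dividing $l-2^\alpha k$ by $2^\alpha$.

```python
def t_g(k, n):
    """Closed-form worst-case test count for generalised binary splitting."""
    if n <= 2 * k - 2:
        return n
    l = n - k + 1
    alpha = (l // k).bit_length() - 1                     # floor(log2(l / k))
    p, theta = divmod(l - (2 ** alpha) * k, 2 ** alpha)   # l = 2^a k + 2^a p + theta
    return (alpha + 2) * k + p - 1

print(t_g(2, 1000))  # 20
print(t_g(1, 1000))  # 10
```

For $k=2$, $N=1000$ we have $l=999=2^8\cdot2+2^8\cdot1+231$, so $\alpha=8$, $p=1$ and the formula gives $20$, which is the second solution with $20$ tests.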

The proof goes as follows. The case $N\le 2k-2$ is true due to step 1. For $2k-1\le N\le 3k-2$, $\alpha$ must be $0$ and $G$ is reduced to individual testing. Furthermore, $\theta=0$ and $p=l-k=N-2k+1$. Thus $$(\alpha+2)k+p-1=2k+N-2k+1-1=N=T_G(k,N)$$

For $k=1$, $l=N-k+1=N$. Note that except for $N=2^\alpha$, $G$ is reduced to binary splitting and $$(\alpha+2)k+p-1=\alpha+1=\lfloor\log_2N\rfloor+1=\lceil\log_2N\rceil=T_G(1,N)$$ For $N=2^\alpha$, $G$ spends one more test than binary splitting (by testing the whole set first) and $$(\alpha+2)k+p-1=1+\lceil\log_2N\rceil=T_G(1,N)$$

For the general case $k\ge 2$ and $N\ge 3k-1$, we proceed via induction on $k+N$. From step 2 $$T_G(k,N)=\max\{1+T_G(k,N-2^\alpha),\ 1+\alpha+T_G(k-1,N-1)\}$$

For $N'=N-2^\alpha$ and $k'=k$, $$\begin{aligned}l'&=N'-k'+1=N-2^\alpha-k+1=l-2^\alpha\\&=\begin{cases}2^\alpha k+2^\alpha(p-1)+\theta&\text{for}\ p\ge 1 \\2^{\alpha-1}k+2^{\alpha-1}(k-2)+\theta&\text{for}\ p=0,\ \theta<2^{\alpha-1}\\2^{\alpha-1}k+2^{\alpha-1}(k-1)+(\theta-2^{\alpha-1})&\text{for}\ p=0,\ \theta\ge 2^{\alpha-1}\end{cases}\end{aligned}$$ Hence by induction $$T_G(k,N-2^\alpha)=\begin{cases}(\alpha+2)k+(p-1)-1&\text{for}\ p\ge 1\\ (\alpha+1)k+(k-2)-1&\text{for}\ p=0,\ \theta<2^{\alpha-1}\\ (\alpha+1)k+(k-1)-1&\text{for}\ p=0,\ \theta\ge 2^{\alpha-1}\end{cases}$$ Consequently $$1+T_G(k,N-2^\alpha)=\begin{cases}(\alpha+2)k+p-2&\text{for}\ p=0,\ \theta<2^{\alpha-1}\\ (\alpha+2)k+p-1&\text{otherwise}\end{cases}$$

For $k'=k-1$ and $N'=N-1$, $$l'=N'-k'+1=l=\begin{cases}2^\alpha(k-1)+2^\alpha(p+1)+\theta&\text{for}\ p\le k-3\\ 2^{\alpha+1}(k-1)+\theta&\text{for}\ p=k-2\\ 2^{\alpha+1}(k-1)+2^\alpha+\theta&\text{for}\ p=k-1\end{cases}$$ And hence by induction $$T_G(k-1,N-1)=\begin{cases}(\alpha+2)(k-1)+(p+1)-1&\text{for}\ p\le k-3\\ (\alpha+3)(k-1)-1&\text{for}\ k-2\le p\le k-1\end{cases}$$ Which gives $$1+\alpha+T_G(k-1,N-1)=\begin{cases}(\alpha+2)k+p-2&\text{for}\ p=k-1\\(\alpha+2)k+p-1&\text{otherwise}\end{cases}$$

Since for $k\ge 2$, $p=0$ and $p=k-1$ are mutually exclusive, at least one of the two branches attains $(\alpha+2)k+p-1$, and so $$T_G(k,N)=\max\{1+T_G(k,N-2^\alpha),\ 1+\alpha+T_G(k-1,N-1)\}=(\alpha+2)k+p-1\quad\square$$
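As a quick sanity check, here is the recurrence worked out for $k=2$, $N=1000$:

$$l=999=2^8\cdot2+2^8\cdot1+231\implies\alpha=8,\ p=1,\ \theta=231,\qquad(\alpha+2)k+p-1=20$$
$$1+T_G(2,744)=1+19=20,\qquad 1+\alpha+T_G(1,999)=9+10=19$$

Here $T_G(2,744)=19$ comes from $l'=743=2^8\cdot2+231$ (so $p'=0$), and $T_G(1,999)=10$ from $l'=999=2^9+487$. The first branch falls under "otherwise" and the second under $p=k-1$, and the maximum is $20$ as the formula predicts.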