30

Let's say that I have an array of numbers, S = [6, 2, 1, 7, 4, 3, 9, 5, 3, 1]. I want to divide it into three arrays. The order of the numbers and the number of items in those arrays do not matter.

Let's say A1, A2, and A3 are the sub arrays. I want to minimize the function

f(A1, A2, A3) = ( SUM(A1) - SUM(S) / 3 )^2 / 3 +
                ( SUM(A2) - SUM(S) / 3 )^2 / 3 +
                ( SUM(A3) - SUM(S) / 3 )^2 / 3
  • I don't need an optimal solution; I just need a solution that is good enough.
  • I don't want an algorithm that is too slow. I can trade some speed for a better result, but not too much.
  • The length of S is around 10 to 30.
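
For concreteness, here is the objective in Python (a quick sketch; the partition shown is just an example, not necessarily the best one):

def f(a1, a2, a3):
    target = sum(a1 + a2 + a3) / 3.0  # each column sum should be close to SUM(S) / 3
    return sum((sum(ai) - target) ** 2 for ai in (a1, a2, a3)) / 3.0

# A partition of S with sums 14, 14, 13 against a target of 41/3:
print(f([9, 3, 1, 1], [7, 4, 3], [6, 5, 2]))  # 0.2222...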

Why

Why do I need to solve this problem? I want to arrange boxes into three columns such that the total height of each column is not too different from the others.

[Image: boxes arranged into three columns of roughly equal height]

What have I tried

My first instinct was to use a greedy algorithm. The result is not bad, but it does not guarantee an optimal solution. Is there a better way?

s = [6, 2, 1, 7, 4, 3, 9, 5, 3, 1]
s = sorted(s, reverse=True)

a = [[], [], []]
sum_a = [0, 0, 0]

# Greedy: put the next-largest number on the currently shortest pile
for x in s:
    i = sum_a.index(min(sum_a))
    sum_a[i] += x
    a[i].append(x)

print(a)
asked by invisal; edited by Peter Mortensen
  • My brain isn't fully booted yet, but I'm trying to decide if this is essentially the [knapsack problem](https://en.m.wikipedia.org/wiki/Knapsack_problem), which is NP-hard. – Jonathon Reinhart Dec 19 '16 at 11:49
  • @JonathonReinhart It is closer to the subset sum problem, but since the OP isn't looking for a perfect solution, they can use one of its approximation algorithms. – DeepSpace Dec 19 '16 at 11:51
  • But *"The result is not bad, but it does not guarantee an optimal solution."* sounds like he is looking for an optimal solution. – Jonathon Reinhart Dec 19 '16 at 11:52
  • Your problem is a generalization of the subset sum problem, but there is a pseudo-polynomial-time algorithm and an easy (very good) approximation for it. – Saeed Amiri Dec 19 '16 at 11:52
  • @JonathonReinhart No need to be optimal, but it would be nice if it were. As long as it produces a very good result. – invisal Dec 19 '16 at 11:53
  • Thanks @Saeed, yes, https://en.m.wikipedia.org/wiki/Subset_sum_problem is NP-complete. The effectiveness of your greedy algorithm likely depends on the number / range of your input values. (If you have lots of little ones to work with, it should "balance out" well.) – Jonathon Reinhart Dec 19 '16 at 11:54
  • By the way, `sorted(s, reverse=True)` has no effect as `sorted` returns a new list. – DeepSpace Dec 19 '16 at 11:55
  • Also note that [memoization](http://stackoverflow.com/questions/1988804/what-is-memoization-and-how-can-i-use-it-in-python) is often used in dynamic programming solutions and is easy to add to a function in Python. – Jonathon Reinhart Dec 19 '16 at 11:57
  • What DeepSpace said. To sort a list in place, use its `.sort` method. FWIW, `sorted` works by copying its iterable arg to a list and then calling the `.sort` method on that list, so it's generally better to use `.sort` rather than `sorted` when you can. – PM 2Ring Dec 19 '16 at 11:59
  • @PM2Ring Thanks for the feedback. Generally, I wanted to quickly write down what I have in mind to show what I have tried. Actually, I haven't done the real implementation yet; I just need to see more possibilities first :D – invisal Dec 19 '16 at 12:01
  • @invisal How long is the input list `S`? – Fomalhaut Dec 19 '16 at 12:01
  • @Fomalhaut The length of S is around 10 to 20. Not big. – invisal Dec 19 '16 at 12:03
  • Given that this works, you might be interested in [Code Review](http://codereview.stackexchange.com). Your code is very short, but people may still have helpful suggestions. – SuperBiasedMan Dec 19 '16 at 12:13
  • @Fomalhaut Nice spot, but I have re-edited the function I want to optimize. – invisal Dec 19 '16 at 12:14
  • How large is the range of numbers in S? A dynamic programming method that takes O(SUM(S)^3) time is straightforward. – Mo Tao Dec 19 '16 at 12:20
  • @MoTao SUM(S) is around 100~150. – invisal Dec 19 '16 at 12:21
  • This paper on multi-way number partitioning might interest you (in particular, section 4, Sequential Number Partitioning (SNP)): http://www.aaai.org/ocs/index.php/IJCAI/IJCAI-09/paper/viewFile/625/705 – גלעד ברקן Dec 19 '16 at 22:48
  • @SaeedAmiri, MartijnPieters The duplicate has nowhere near the interest or the variety of answers this question has. I think marking it as a duplicate is a mistake. – גלעד ברקן Dec 21 '16 at 23:52
  • @גלעדברקן It is a duplicate, even though it has better votes and better presentation here. – Saeed Amiri Dec 21 '16 at 23:59
  • @SaeedAmiri, MartijnPieters If this question is removed, SO loses quality content, not to mention all the work put in by the people who contributed. If this question has "better votes and better presentation," as Saeed says, why don't you mark the *other* question as a duplicate then? I don't like this kind of arbitrary moderation - "duplicate" should mean "contains no new information" rather than just following some guideline without thinking about what it means for SO content, and for the people who contribute to make SO what it is. – גלעד ברקן Dec 22 '16 at 03:53
  • @גלעדברקן This question does not add any extra information at all; it just has a picture which makes it look nicer. The answers in the other question are either better than the answers here or at least as good (the answers here are actually like solving homework). Also, if you read the rules in the definition of duplicates, you can see that "duplicate" does not even mean repeating the question; it means the question has been answered somewhere else. You might even ask a totally different-looking question, but if it was already answered somewhere, it does not add anything new. – Saeed Amiri Dec 22 '16 at 12:44
  • @SaeedAmiri (cc MartijnPieters) I used the word "better" because you did. I think some of the answers here add different content to SO than the answers in the other question. And your comment about the SO guideline for duplicates exactly illustrates my point: "*duplicate* should mean *contains no new or different information* rather than just following some guideline without thinking about what it means for SO content, and for the people who contribute to make SO what it is." – גלעד ברקן Dec 22 '16 at 14:27
  • @גלעדברקן It is always possible to move these answers to the other question. But if you want to discuss this further, you can always do so on meta. – Saeed Amiri Dec 22 '16 at 16:38
  • This is not a duplicate of the previously linked question. **Here:** "The order of the numbers and the number of items in those arrays do not matter" [**There:**](https://stackoverflow.com/questions/8292791/how-to-divide-an-array-into-3-parts-with-the-sum-of-each-part-roughly-equal#comment10214987_8292791) "I want to group them with an order" – Zero Piraeus Dec 26 '17 at 18:00

5 Answers

7

As you said you don't mind a non-optimal solution, I thought I would reuse your initial function and add a way to find a good starting arrangement for your initial list s.

Your initial function:

def pigeon_hole(s):
    a = [[], [], []]
    sum_a = [0, 0, 0]
    for x in s:
        i = sum_a.index(min(sum_a))
        sum_a[i] += x
        a[i].append(x)
    return [sum(x) for x in a]  # the three column sums

This is a way to find a sensible initial ordering for your list. It works by creating rotations of your list in sorted and reverse-sorted order. The best rotation is found by minimizing the standard deviation once the list has been pigeon-holed:

import numpy as np
import random

def rotate(l):
    l = sorted(l)
    lr = l[::-1]
    rotation = [np.roll(l, i) for i in range(len(l))] + [np.roll(lr, i) for i in range(len(l))]
    blocks = [pigeon_hole(i) for i in rotation]
    return rotation[np.argmin(np.std(blocks, axis=1))]  # the best rotation

print(pigeon_hole(rotate([random.randint(0, 20) for i in range(20)])))

# Testing with some random numbers; these are the sums of the three sub-lists
>>> [64, 63, 63]

Although this could be optimized further, it is quite quick, taking about 0.0013 s for 20 numbers. Doing a quick comparison with @Mo Tao's answer, using a = rotate(range(1, 30)):

# This method
a = rotate(range(1, 30))
>>> [[29, 24, 23, 18, 17, 12, 11, 6, 5], [28, 25, 22, 19, 16, 13, 10, 7, 4, 1], [27, 26, 21, 20, 15, 14, 9, 8, 3, 2]]
[sum(x) for x in a]
# Sums to [145, 145, 145] in 0.002s

# Mo Tao's method
>>> [[25, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4, 3, 2, 1], [29, 26, 20, 19, 18, 17, 16], [28, 27, 24, 23, 22, 21]]
# Sums to [145, 145, 145] in 1.095s

This method also seems to find the optimal solution in many cases, although this probably won't hold for all cases. Testing this implementation 500 times using a list of 30 numbers against Mo Tao's answer, and comparing whether the sub-lists sum to the same quantities:

c = 0
for i in range(500):
    r = [random.randint(1, 10) for j in range(30)]
    res = pigeon_hole(rotate(r))
    # Compare to the optimal solution by Mo Tao (`tao` is his method wrapped as a function)
    d, e = sorted(res), sorted(tao(r))
    if all(k == kk for k, kk in zip(d, e)):
        c += 1

>>> 500 # (they do)

I thought I would provide an update with a more optimized version of my function here:

def rotate2(l):
    # Calculate an acceptable minimum stdev of the pigeon-holed list
    if sum(l) % 3 == 0:
        std = 0
    else:
        std = np.std([0, 0, 1])

    l = sorted(l, reverse=True)
    best_rotation = None
    best_std = 100

    for i in range(len(l)):
        rotation = np.roll(l, i)
        sd = np.std(pigeon_hole(rotation))

        if sd == std:
            return rotation  # return early if a minimal stdev is found

        elif sd < best_std:
            best_std = sd
            best_rotation = rotation

    return best_rotation

The main change is that the search for a good ordering stops once a suitable rotation has been found. Also, only the reverse-sorted list is searched, which doesn't appear to alter the result. Timing this with

import timeit
print(timeit.timeit("rotate2([random.randint(1, 10) for i in range(30)])", "from __main__ import rotate2, random", number=1000) / 1000.)

results in a large speed-up. On my current computer, rotate takes about 1.84 ms and rotate2 takes about 0.13 ms, so about a 14x speed-up. For comparison, גלעד ברקן's implementation took about 0.99 ms on my machine.

kezzos
  • Not sure what you're running but here are the results on repl.it (https://repl.it/Euz5) Mine: `--- 0.0004673004150390625 seconds ---` | `rotate2`: `--- 0.0009853839874267578 seconds ---` Also, the function, `KK3`, is pretty cool on its own, lol. Cheers. – גלעד ברקן Dec 21 '16 at 17:00
4

As I mentioned in the comments on the question, this is the straightforward dynamic programming method. It takes less than 1 second for s = range(1, 30) and gives an optimal solution.

I think the code is self-explanatory if you know memoization.

s = list(range(1, 30))
# s = [6, 2, 1, 7, 4, 3, 9, 5, 3, 1]
n = len(s)

memory = {}
best_f = pow(sum(s), 3)
best_state = None

def search(state, pre_state):
    # state = (sum of A1, sum of A2, sum of A3, number of items assigned so far)
    global memory, best_f, best_state
    s1, s2, s3, i = state
    f = s1 * s1 + s2 * s2 + s3 * s3
    if state in memory or f >= best_f:
        return
    memory[state] = pre_state
    if i == n:
        best_f = f
        best_state = state
    else:
        search((s1 + s[i], s2, s3, i + 1), state)
        search((s1, s2 + s[i], s3, i + 1), state)
        search((s1, s2, s3 + s[i], i + 1), state)

search((0, 0, 0, 0), None)

# Reconstruct the three sub-lists by walking back through the memoized states
a = [[], [], []]
state = best_state
while state[3] > 0:
    pre_state = memory[state]
    for j in range(3):
        if state[j] != pre_state[j]:
            a[j].append(s[pre_state[3]])
    state = pre_state

print(a)
print(best_f, best_state, list(map(sum, a)))
Mo Tao
  • Thanks for answering this question. I will check it and compare different strategies. Thanks for your valuable time. And yes, I do know memoization :D – invisal Dec 19 '16 at 12:53
  • @invisal Glad I could help. – Mo Tao Dec 19 '16 at 13:18
  • Instead of using global variables I'd put that code in a function and use `nonlocal` inside `search`, in this way you avoid poisoning the global namespace... – Bakuriu Dec 19 '16 at 16:56
  • Your code is happy with `s = [random.randint(0,10) for r in xrange(30)]`. But even with a small number range, up to `random.randint(0,100)`, the code is not so happy anymore... – גלעד ברקן Dec 20 '16 at 14:54
  • @Bakuriu yes, that would be better. This code is used to demonstrate the algorithm only. – Mo Tao Dec 21 '16 at 06:42
  • @גלעדברקן Of course, the time complexity is O(SUM(S)^3), and the questioner said that "the SUM(S) is around 100~150". – Mo Tao Dec 21 '16 at 06:44
3

We can explore the stability of the solution you found with respect to moving and swapping elements between the found lists. My code is below. If a change improves the target function, we keep the new lists and continue, hoping to improve the function again with another change. As the starting point we can take your solution. The final result will be something like a local minimum.

from copy import deepcopy

s = [6, 2, 1, 7, 4, 3, 9, 5, 3, 1]
s = sorted(s, reverse=True)

a = [[], [], []]
sum_a = [0, 0, 0]

for x in s:
    i = sum_a.index(min(sum_a))
    sum_a[i] += x
    a[i].append(x)


def f(a):
    return ((sum(a[0]) - sum(s) / 3.0)**2 + (sum(a[1]) - sum(s) / 3.0)**2 + (sum(a[2]) - sum(s) / 3.0)**2) / 3


fa = f(a)

while True:
    modified = False

    # placing: try moving one element from one list to another
    for i_from, i_to in [(0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1)]:
        for j in range(len(a[i_from])):
            a_new = deepcopy(a)
            a_new[i_to].append(a_new[i_from][j])
            del a_new[i_from][j]
            fa_new = f(a_new)
            if fa_new < fa:
                a = a_new
                fa = fa_new
                modified = True
                break
        if modified:
            break

    # replacing: try swapping a pair of elements between two lists
    for i_from, i_to in [(0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1)]:
        for j_from in range(len(a[i_from])):
            for j_to in range(len(a[i_to])):
                a_new = deepcopy(a)
                a_new[i_to].append(a_new[i_from][j_from])
                a_new[i_from].append(a_new[i_to][j_to])
                del a_new[i_from][j_from]
                del a_new[i_to][j_to]
                fa_new = f(a_new)
                if fa_new < fa:
                    a = a_new
                    fa = fa_new
                    modified = True
                    break
            if modified:
                break
        if modified:
            break

    if not modified:
        break

print(a, f(a)) # [[9, 3, 1, 1], [7, 4, 3], [6, 5, 2]] 0.2222222222222222

It's interesting that this approach works well even if we start from an arbitrary a:

s = [6, 2, 1, 7, 4, 3, 9, 5, 3, 1]

a = [s, [], []]  # start with everything in the first list
fa = f(a)

# ... the same placing and replacing loop as above ...

print(a, f(a)) # [[3, 9, 2], [6, 7], [4, 3, 1, 1, 5]] 0.2222222222222222

It provides a different result but the same value of the function.

Fomalhaut
  • Interesting, I will try to understand the code and see the result. Thanks for taking the time to write such a long reply. – invisal Dec 19 '16 at 12:52
3

I would have to say that your greedy function produces good results but tends to become very slow if the input size is large, say more than 100.

But you've said that your input size is fixed in the range 10-30, so the greedy solution is actually quite good. Instead of becoming all too greedy at the beginning, I propose becoming a bit lazy at first and greedy at the end.

Here is an altered function, lazy:

def lazy(s):
    k = (len(s) // 3 - 2) * 3  # slice limit

    s.sort(reverse=True)
    # Perform limited extended slicing: deal the largest items
    # round-robin into the three sub-lists
    a = [s[1:k:3], s[2:k:3], s[:k:3]]

    sum_a = list(map(sum, a))
    # Finish greedily with the remaining (small) items
    for x in s[k:]:
        i = sum_a.index(min(sum_a))
        sum_a[i] += x
        a[i].append(x)
    return a

What it does is first sort the input in descending order and deal items into the three sub-lists one by one until about 6 items are left. (You can change this limit and test, but for sizes 10-30 I think this is the best.)

When that is done, it simply continues with the greedy approach. This method takes very little time and is more accurate than the greedy solution on average.
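
For example, on the list from the question (a quick check I traced by hand, so treat the exact split as illustrative):

s = [6, 2, 1, 7, 4, 3, 9, 5, 3, 1]
a = lazy(s)
print(a)                    # [[7, 4, 3], [6, 5, 2, 1], [9, 3, 1]]
print([sum(x) for x in a])  # column sums: [14, 14, 13]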

Here is a line plot of size versus time:

[Plot: list size versus time (seconds), "Lazy pigeon" vs. "Pigeon"]

and size versus accuracy:

[Plot: list size versus accuracy (standard deviation), "Lazy pigeon" vs. "Pigeon"]

Accuracy is measured as the standard deviation of the three sub-list sums around a third of the original list's sum; lower is better, because you want the columns to stack up to almost the same height.
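
In code, that metric looks something like this (a float-division variant of the rsd helper used in the tests below):

import math

def accuracy(sums):
    # Standard deviation of the three column sums around a third of the total
    target = sum(sums) / 3.0
    return math.sqrt(sum((target - y) ** 2 for y in sums) / 3.0)

print(accuracy([14, 14, 13]))  # ~0.47 for the lazy result above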

Also, the item values range between 3 and 15, so that the sum is around 100-150, as you mentioned.

These are the test functions:

import math
import time
from random import randint as r

import plotly
import plotly.graph_objs as go

# `lazy` is defined above; `pigeon` is the greedy function from the question,
# wrapped to return the three sub-lists.

def test_accuracy():
    rsd = lambda s:round(math.sqrt(sum([(sum(s)//3-y)**2 for y in s])/3),4)
    sm = lambda s:list(map(sum,s))

    N=[i for i in range(10,30)]
    ST=[]
    MT=[]
    for n in N:
        case = [r(3,15) for x in range(n)]

        ST.append(rsd(sm(lazy(case))))
        MT.append(rsd(sm(pigeon(case))))

    strace = go.Scatter(x=N,y=ST,name='Lazy pigeon')
    mtrace = go.Scatter(x=N,y=MT,name='Pigeon')
    data = [strace,mtrace]

    layout = go.Layout(
    title='Uniform distribution in 3 sublists',
    xaxis=dict(title='List size',),
    yaxis=dict(title='Accuracy - Standard deviation',))
    fig = go.Figure(data=data, layout=layout)

    plotly.offline.plot(fig,filename='N vs A2.html')

def test_timings():
    N=[i for i in range(10,30)]
    ST=[]
    MT=[]
    for n in N:
        case = [r(3,15) for x in range(n)]           
        start = time.perf_counter()  # time.clock() was removed in Python 3.8
        lazy(case)
        ST.append(time.perf_counter() - start)
        start = time.perf_counter()
        pigeon(case)
        MT.append(time.perf_counter() - start)

    strace = go.Scatter(x=N,y=ST,name='Lazy pigeon')
    mtrace = go.Scatter(x=N,y=MT,name='Pigeon')
    data = [strace,mtrace]

    layout = go.Layout(
    title='Uniform distribution in 3 sublists',
    xaxis=dict(title='List size',),
    yaxis=dict(title='Time (seconds)',))

    fig = go.Figure(data=data, layout=layout)

    plotly.offline.plot(fig,filename='N vs T2.html')

Here is the complete file.

Edit:

I tested kezzos' answer for accuracy and it performed really well. The deviation stayed below 0.8 the whole time.

Average standard deviation over 100 runs:

  Lazy Pigeon     Pigeon         Rotation
  1.10668795      1.1573573      0.54776425

In terms of speed, the rotation function is orders of magnitude slower than the other two, but 10^-3 seconds is fine unless you want to run it repeatedly.

  Lazy Pigeon     Pigeon         Rotation
  5.384013e-05    5.930269e-05   0.004980

Here is a bar chart comparing the accuracy of all three functions:

[Bar chart: average standard deviation of Lazy Pigeon, Pigeon, and Rotation]

All in all, kezzos' solution is the best if you are fine with its speed.

HTML files of the plotly charts: versus time, versus accuracy, and the bar chart.

Gurupad Mamadapur
1

Here's my nutty implementation of Korf's¹ Sequential Number Partitioning (SNP), but it only uses Karmarkar–Karp rather than Complete Karmarkar–Karp for the two-way partition (I've included an unused, somewhat unsatisfying version of CKK - perhaps someone has a suggestion to make it more efficient?). It places lower and upper bounds on the first subset; see the referenced article. I'm sure more efficient implementations can be made. Edit MAX_ITERATIONS for better results versus a longer wait :)

By the way, the function KK3 (an extension of Karmarkar–Karp to three-way partitioning, used to compute the first lower bound) seems pretty good by itself.
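
For intuition, plain two-way Karmarkar–Karp repeatedly replaces the two largest remaining numbers by their difference, committing them to opposite subsets. Here is a minimal sketch of that differencing idea on its own (an illustration only, separate from the full three-way code below):

import heapq

def kk2(nums):
    # Max-heap via negation
    heap = [-n for n in nums]
    heapq.heapify(heap)
    while len(heap) > 1:
        a = -heapq.heappop(heap)  # largest
        b = -heapq.heappop(heap)  # second largest
        heapq.heappush(heap, -(a - b))  # commit them to opposite subsets
    return -heap[0]  # final difference between the two subset sums

print(kk2([6, 2, 1, 7, 4, 3, 9, 5, 3, 1]))  # prints 1; the total is 41 (odd), so 1 is optimal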

from random import randint
from collections import Counter
from bisect import insort
from time import time

def KK3(s):
  s = list(map(lambda x: (x, 0, 0, [], [], [x]), sorted(s)))

  while len(s) > 1:
    large = s.pop()
    small = s.pop()
    combined = sorted([large[0] + small[2], large[1] + small[1], large[2] + small[0]], reverse=True)
    combined = list(map(lambda x: x - combined[2], combined))
    combined = combined + sorted((large[3] + small[5], large[4] + small[4], large[5] + small[3]), key=sum)
    insort(s, tuple(combined))

  return s

#s = [6, 2, 1, 7, 4, 3, 9, 5, 3, 1]

s = [randint(0,100) for r in range(0,30)]

# global variables
s = sorted(s,reverse=True)
sum_s = sum(s)
upper_bound = sum_s // 3
lower_bound = sum(KK3(s)[0][3])
best = (sum_s,([],[],[]))
iterations = 0
MAX_ITERATIONS = 10000

def partition(i, accum):
  global lower_bound, best, iterations
  sum_accum = sum(accum)

  if sum_accum > upper_bound or iterations > MAX_ITERATIONS:
    return

  iterations = iterations + 1

  if sum_accum >= lower_bound:
    rest = KK(diff(s,accum))[0]
    new_diff = sum(rest[1]) - sum_accum

    if new_diff < best[0]:
      best = (new_diff,(accum,rest[1],rest[2]))
      lower_bound = (sum_s - 2 * new_diff) // 3
      print("lower_bound: " + str(lower_bound))

  if not best[0] in [0, 1] and i < len(s) - 1 and sum(accum) + sum(s[i + 1:]) > lower_bound:
    _accum = accum[:]
    partition(i + 1, _accum + [s[i]])
    partition(i + 1, accum)

def diff(l1,l2):
  return list((Counter(l1) - Counter(l2)).elements())

def KK(s):
  s = list(map(lambda x: (x,[x],[]),sorted(s)))

  while len(s) > 1:
    large = s.pop()
    small = s.pop()
    insort(s,(large[0] - small[0],large[1] + small[2],large[2] + small[1]))

  return s


print(s)
start_time = time()
partition(0,[])
print(best)
print("iterations: " + str(iterations))
print("--- %s seconds ---" % (time() - start_time))

¹ Richard E. Korf, "Multi-Way Number Partitioning", Computer Science Department, University of California, Los Angeles; aaai.org/ocs/index.php/IJCAI/IJCAI-09/paper/viewFile/625/705

גלעד ברקן