-2

I have an array of integers. The value of each element represents the time taken to process a file. Processing consists of merging two files at a time, and each merge costs the sum of the two files being merged. What algorithm finds the minimum total time needed to process all the files? E.g. {3,5,9,12,14,18}.

The processing time can be calculated as follows. Case 1) -

a) [8],9,12,14,18
b) [17],12,14,18
c) [26],17,18
d) 26,[35]
e) 61

So total time for processing is 61 + 35 + 26 + 17 + 8 = 147

Case 2) -

a) [21],5,9,12,14
b) [17],[21],9,14
c) [21],[17],[23]
d) [40],[21]
e) 61

This time the total time is 61 + 40 + 23 + 17 + 21 = 162

It seems to me that repeatedly sorting the array and merging the two smallest elements gives the minimum, as in Case 1. Is my logic right? If not, what is the right and easiest way to achieve this with the best performance?

user9057272
  • 2
    Your question is a bit hard to grasp... the first part seems to be a simple **yes or no** question (*"Is this right?"*) and the second part is primarily opinion-based (*"Is there a better algorithm to achieve this?"*) without specifying the design goals of *"better"*. I think it doesn't really fit the format of Stack Overflow. Please rethink what your actual question is. – grek40 Jul 10 '19 at 21:20
  • I edited the question. I am just looking for the right and quickest logic or pseudo-code. – user9057272 Jul 11 '19 at 14:39

4 Answers

2

Once you have the sorted list, since you are only removing the two minimum items and replacing them with one, it makes more sense to do a sorted insert and place the new item in the correct place instead of re-sorting the entire list. However, this only saves a fractional amount of time - about 1% faster.

My method CostOfMerge doesn't assume the input is a List, but if it is, you can remove the ToList conversion step.

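using System.Collections.Generic;
using System.Linq;

public static class IEnumerableExt {
    public static long CostOfMerge(this IEnumerable<int> psrc) {
        // long: the accumulated cost can exceed int.MaxValue on large inputs
        long total = 0;
        var src = psrc.ToList();
        src.Sort();
        while (src.Count > 1) {
            // merge the two smallest items
            var sum = src[0]+src[1];
            src.RemoveRange(0, 2);

            // insert the sum at its sorted position instead of re-sorting
            var index = src.BinarySearch(sum);
            if (index < 0)
                index = ~index;
            src.Insert(index, sum);

            total += sum;
        }
        return total;
    }
}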
NetMage
2

As already discussed in other answers, the best strategy will be to always work on the two items with minimal cost for each iteration. So the only remaining question is how to efficiently take the two smallest items each time.
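One common way to take the two smallest items each time is a min-heap. The following is only a minimal sketch in Java using `java.util.PriorityQueue`; the class and method names (`HeapMerge`, `costOfMerge`) are made up for illustration, and this is not the variant that is benchmarked further down.

```java
import java.util.PriorityQueue;

class HeapMerge {
    // Hypothetical helper: total cost of always merging the two smallest items.
    static long costOfMerge(int[] files) {
        PriorityQueue<Long> heap = new PriorityQueue<>(); // min-heap by natural order
        for (int f : files) {
            heap.add((long) f);
        }

        long total = 0;
        while (heap.size() > 1) {
            // poll() removes and returns the current minimum
            long sum = heap.poll() + heap.poll();
            total += sum;   // each merge costs the size of the merged file
            heap.add(sum);  // the merged file goes back into the pool
        }
        return total;
    }
}
```

For the question's input {3,5,9,12,14,18} this returns 147, matching Case 1. Each of the n - 1 merges does a constant number of O(log n) heap operations, so the whole loop is O(n log n).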

Since you asked for best performance, I shamelessly took the algorithm from NetMage and modified it to speed it up roughly 40% for my test case (thanks and +1 to NetMage).

The idea is to work mostly in place on a single array. Each iteration increases the starting index by 1 and moves elements within the array to make space for the sum from the current iteration.

public static long CostOfMerge2(this IEnumerable<int> psrc)
{
    long total = 0;

    var src = psrc.ToArray();
    Array.Sort(src);

    var i = 1;
    int length = src.Length;
    while (i < length)
    {
        var sum = src[i - 1] + src[i];

        total += sum;

        // find insert position for sum
        var index = Array.BinarySearch(src, i + 1, length - i - 1, sum);
        if (index < 0)
            index = ~index;
        --index;

        // shift items that come before insert position one place to the left
        if (i < index)
            Array.Copy(src, i + 1, src, i, index - i);

        src[index] = sum;

        ++i;
    }

    return total;
}

I tested with the following calling code (switching between CostOfMerge and CostOfMerge2), with a few different values for random-seed, count of elements and max value of initial items.

static void Main(string[] args)
{
    var r = new Random(10);

    var testcase = Enumerable.Range(0, 400000).Select(x => r.Next(1000)).ToList();
    var sw = Stopwatch.StartNew();
    long resultCost = testcase.CostOfMerge();
    sw.Stop();
    Console.WriteLine($"Cost of Merge: {resultCost}");
    Console.WriteLine($"Time of Merge: {sw.Elapsed}");
    Console.ReadLine();
}

Result for shown configuration for NetMage CostOfMerge:

Cost of Merge: 3670570720
Time of Merge: 00:00:15.4472251

My CostOfMerge2:

Cost of Merge: 3670570720
Time of Merge: 00:00:08.7193612

Of course, the exact numbers are hardware-dependent, and the difference might be bigger or smaller depending on many factors.
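If the element shifting in CostOfMerge2 ever becomes the bottleneck, one alternative worth trying is the two-queue technique: because the greedy strategy produces sums in non-decreasing order, the sums can be appended to a plain FIFO queue, and the next smallest item is always at the front of either the sorted input or that queue. The sketch below is in Java, has not been benchmarked against the numbers above, and uses made-up names (`TwoQueueMerge`, `costOfMerge`).

```java
import java.util.ArrayDeque;
import java.util.Arrays;

class TwoQueueMerge {
    // Hypothetical helper: same greedy cost, no shifting and no heap.
    static long costOfMerge(int[] input) {
        long[] sorted = new long[input.length];
        for (int i = 0; i < input.length; i++) {
            sorted[i] = input[i];
        }
        Arrays.sort(sorted);

        ArrayDeque<Long> sums = new ArrayDeque<>(); // merged sums, in non-decreasing order
        int next = 0;   // index of the smallest unused original item
        long total = 0;

        // n items need exactly n - 1 pairwise merges
        for (int merges = 1; merges < sorted.length; merges++) {
            long sum = 0;
            for (int pick = 0; pick < 2; pick++) {
                // take whichever front element is smaller
                boolean fromSorted = next < sorted.length
                        && (sums.isEmpty() || sorted[next] <= sums.peekFirst());
                sum += fromSorted ? sorted[next++] : sums.pollFirst();
            }
            total += sum;
            sums.addLast(sum);
        }
        return total;
    }
}
```

After the initial sort, the merge loop itself is linear in the number of items. For {3,5,9,12,14,18} it returns 147.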

grek40
1

No, that's the minimum for a polyphase merge: where N is the bandwidth (number of files you can merge simultaneously), then you want to merge the smallest (N-1) files at each step. However, with this more general problem, you want to delay the larger files as long as possible -- you may want an early step or two to merge fewer than (N-1) files, somewhat like having a "bye" in an elimination tourney. You want all the latter steps to involve the full (N-1) files.

For instance, given N=4 and files 1, 6, 7, 8, 14, 22:

Early merge:

[22], 14, 22
[58]
total = 80

Late merge:

[14], 8, 14, 22
[58]
total = 72
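The "late merge" schedule can be sketched in Java as follows. This is only an illustration under one assumption: the parameter `k` is the maximum number of files merged in a single step (the example above merges up to four files at once, so `k = 4` there). The names `KWayMergeCost` and `minCost` are made up. The first step merges fewer files when necessary, so that every later step can merge a full `k`.

```java
import java.util.PriorityQueue;

class KWayMergeCost {
    // Hypothetical helper: total cost when at most k files are merged per step.
    static long minCost(int[] sizes, int k) {
        if (k < 2) {
            throw new IllegalArgumentException("k must be at least 2");
        }
        PriorityQueue<Long> heap = new PriorityQueue<>();
        for (int s : sizes) {
            heap.add((long) s);
        }
        if (heap.size() <= 1) {
            return 0; // nothing to merge
        }

        // Size of the first (possibly partial) merge, chosen so that all
        // later merges take exactly k files.
        int take = (heap.size() - 2) % (k - 1) + 2;

        long total = 0;
        while (heap.size() > 1) {
            long sum = 0;
            for (int j = 0; j < take; j++) {
                sum += heap.poll(); // smallest remaining file
            }
            total += sum;
            heap.add(sum);
            take = k; // every step after the first is a full merge
        }
        return total;
    }
}
```

With the files 1, 6, 7, 8, 14, 22 and `k = 4`, the first step merges three files (1 + 6 + 7 = 14) and the second merges the remaining four (58), for a total of 72. With `k = 2` the method reduces to the pairwise strategy from the question.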
Prune
1

Here, you can apply the following logic to get the desired output.

  1. Get the two smallest values from the list.
  2. Remove both of them from the list.
  3. Append their sum to the list and add that sum to a running total.
  4. Repeat until the list has only one element left.
  5. Return the running total, i.e. the minimum time taken to process every item. (The single remaining element is only the sum of all inputs, 61 in the question's example, not the total of 147.)

Here is my Java code, in case you find it helpful :)

import java.util.ArrayList;
import java.util.Arrays;

public class MinimumSums {
    // Returns the smallest value in the list (the list must not be empty).
    private static Integer getMinimum(ArrayList<Integer> list) {
        Integer min = list.get(0);

        for (int i = 1; i < list.size(); i++) {
            if (list.get(i) < min)
                min = list.get(i);
        }

        return min;
    }

    public static void main(String[] args) {
        Integer[] processes = {5, 9, 3, 14, 12, 18};

        ArrayList<Integer> list = new ArrayList<Integer>(Arrays.asList(processes));
        ArrayList<Integer> temp = new ArrayList<Integer>();
        int total = 0;

        while (list.size() > 1) {
            Integer firstMin = getMinimum(list);  // getting first min value
            list.remove(firstMin);                // remove(Object) drops one occurrence

            // Searching after the removal also handles duplicate values correctly.
            Integer secondMin = getMinimum(list); // getting second min
            list.remove(secondMin);

            int sum = firstMin + secondMin;
            list.add(sum);
            temp.add(sum);
            total += sum;
        }

        System.out.println(temp);  // prints the cost of each merge: [8, 17, 26, 35, 61]
        System.out.println(total); // prints the minimum total time: 147
    }
}