Most efficient (1-loop) way to use Priority Queue in Parallel Jobs task

Question

I am having a hard time with a data-structure related problem. I've tried quite a lot recently, but I don't know how to proceed. The problem is that I have the right output, but the timing is too slow and I don't pass the automated tests.

To solve the problem, I am using a min-heap to implement the priority queue with next free times for the workers -- how could I make it more efficient? Efficiency is critical here.

Task description

You have a program which is parallelized and uses m independent threads to process the given list of n jobs. Threads take jobs in the order they are given in the input. If there is a free thread, it immediately takes the next job from the list. If a thread has started processing a job, it doesn't interrupt or stop until it finishes processing the job. If several threads try to take jobs from the list simultaneously, the thread with the smaller index takes the job. For each job you know exactly how long it will take any thread to process this job, and this time is the same for all the threads. You need to determine for each job which thread will process it and when it will start processing.

Input Format. The first line of the input contains integers m (number of workers) and n (number of jobs). The second line contains n integers: the times in seconds it takes any thread to process a specific job. The times are given in the same order as they are in the list from which threads take jobs.

Output Format. Output exactly n lines. The i-th line (0-based index is used) should contain two space-separated integers: the 0-based index of the thread which will process the i-th job and the time in seconds when it will start processing that job.

from collections import deque
import numpy as np

class solveJobs:
    class Node(dict):

        def __getattr__(self, attr):
            return self.get(attr, None)

    # Order nodes by (nextfreetime, worker). The builtin int replaces np.int,
    # a deprecated alias that has been removed from recent NumPy releases.
    Node.__eq__ = lambda self, other: self.nextfreetime == other.nextfreetime and self.worker == other.worker
    Node.__ne__ = lambda self, other: not (self == other)
    Node.__lt__ = lambda self, other: self.nextfreetime < other.nextfreetime or (self.nextfreetime == other.nextfreetime and int(self.worker) < int(other.worker))
    Node.__le__ = lambda self, other: self < other or self == other
    Node.__gt__ = lambda self, other: self.nextfreetime > other.nextfreetime or (self.nextfreetime == other.nextfreetime and int(self.worker) > int(other.worker))
    Node.__ge__ = lambda self, other: self > other or self == other

    class nextfreetimeQueue:

        def __init__(self, nodes):
            self.size = 0
            self.heap = deque([None])
            self.labeled = False

        def __str__(self):
            return str(list(self.heap)[1:])


        def swap(self, i, j):
            '''
            Swap the values of nodes at index i and j.
            '''
            self.heap[i], self.heap[j] = self.heap[j], self.heap[i]
            #        if self.labeled:
            #            I, J = self.heap[i], self.heap[j]
            #            self.position[I.label] = i
            #            self.position[J.label] = j

        def shift_up(self, i):
            '''
            move upward the value at index i to restore heap property.
            '''
            p = i // 2                  # index of parent node
            while p:
                if self.heap[i] < self.heap[p]:
                    self.swap(i, p)     # swap with parent
                i = p                   # new index after swapping with parent
                p = p // 2              # new parent index

        def shift_down(self, i):
            '''
            move downward the value at index i to restore heap property.
            '''
            c = i * 2
            while c <= self.size:
                c = self.min_child(i)
                if self.heap[i] > self.heap[c] or self.heap[i] == self.heap[c]:
                    self.swap(i, c)
                i = c                   # new index after swapping with child
                c = c * 2               # new child index

        def min_child(self, i):
            '''
            Return index of minimum child node.
            '''
            l, r = (i * 2), (i * 2 + 1)     # indices of left and right child nodes
            if r > self.size:
                return l
            else:
                return l if self.heap[l] < self.heap[r] else r

        @property
        def min(self):
            '''
            Return minimum node in heap.
            '''
            return self.heap[1]



        def insert(self, node):
            '''
            Append `node` to the heap and move up
            if necessary to maintain heap property.
            '''
            #        if has_label(node) and self.labeled:
            #            self.position[node.label] = self.size
            self.heap.append(node)
            self.size += 1
            self.shift_up(self.size)


    def read_data(self):
        self.num_workers, jobcount = map(int, input().split()) # first number is the number of workers, second is the number of jobs
        self.job_durations = list(map(int, input().split())) # job durations, in input order
        self.wq = self.nextfreetimeQueue([]) # nested classes are reached through self, not by bare name
        for i in range(self.num_workers):
            self.wq.insert(self.Node(worker=i+1,nextfreetime=0))
        # assert jobcount == len(self.job_durations)
        self.assigned_workers = [None] * len(self.job_durations) # which thread takes
        self.start_times = [None] * len(self.job_durations) # WHEN A JOB IS STARTED

    def write_response(self):
        for i in range(len(self.job_durations)): # for each job, do:
          print(self.assigned_workers[i]-1, self.start_times[i]) # print the worker and when it starts the JOB I


    def assign_jobs(self):

        for i in range(len(self.job_durations)): # loop over all jobs
          next_worker_node =  self.wq.min # finds the minimum free time dict (worker, nextfreetime)
          # nft = next_worker_node['nextfreetime']
          self.assigned_workers[i] = next_worker_node['worker'] # assign the worker index to the list
          self.start_times[i] = next_worker_node['nextfreetime'] # assign that worker's next free time to job starting time
          self.wq.min['nextfreetime'] += self.job_durations[i] # increase workers next free time
          self.wq.shift_down(1)

    def solve(self):
        self.read_data()
        self.assign_jobs()
        self.write_response()

And the reason you're not using [heapq](https://docs.python.org/3.0/library/heapq.html) is? — Jim Mischel, Nov 30 '17 at 03:22
This is not a queue problem - this is a job shop scheduling problem. Use the [longest processing time rule](http://riot.ieor.berkeley.edu/Applications/Scheduling/algorithms.html) to minimize completion time. — theMayer, Nov 30 '17 at 05:28
@theMayer I really think it's a queue problem. The rules make it very clear that jobs are to be processed in the order given, by the first available thread. — Jim Mischel, Nov 30 '17 at 05:37
Ok yes, I take that back. The homework question sucks- the problem posted could never be encountered in the real world. — theMayer, Nov 30 '17 at 12:05
@theMayer This problem happens every day in the real world. It's a simulation of a single producer, multiple consumer system. It even has physical analogues: the checkout line at Fry's Electronics, for example, or the order line at Wendy's when there are multiple cashiers. — Jim Mischel, Nov 30 '17 at 14:15
Not exactly. If that were true, it would be modeled as a stochastic process with interarrival time lambda and service time normally distributed. I would never have written an exercise to be this overly- complicated if the learning objective was to pull from a queue. — theMayer, Nov 30 '17 at 15:05
And further- it is not possible for multiple threads to access a queue at the exact same instant. Resource contention is possible, but computer architecture precludes such a thing as-described. — theMayer, Nov 30 '17 at 15:11
Please read [Under what circumstances may I add “urgent” or other similar phrases to my question, in order to obtain faster answers?](//meta.stackoverflow.com/q/326569) - the summary is that this is not an ideal way to address volunteers, and is probably counterproductive to obtaining answers. Please refrain from adding this to your questions. — halfer, Dec 09 '17 at 19:28

score 1 · Answer 1 · answered Nov 30 '17 at 04:20

A few things come to mind after a quick read through your code.

First, unless there's some compelling reason to write your own min heap, you probably should just use the existing heapq.
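As a rough sketch of what that could look like: assume each heap entry is a `(next_free_time, worker_index)` tuple, so that ordinary tuple comparison breaks ties on time in favour of the smaller worker index. The function name `assign_jobs` and its return shape are illustrative choices, not something from your code.

```python
import heapq

def assign_jobs(num_workers, job_durations):
    """Return a list of (worker_index, start_time) pairs, one per job."""
    # (next_free_time, worker_index) tuples. The list is generated in
    # ascending order, so it already satisfies the min-heap invariant.
    heap = [(0, worker) for worker in range(num_workers)]
    schedule = []
    for duration in job_durations:
        free_at, worker = heap[0]      # earliest-free worker, smallest index on ties
        schedule.append((worker, free_at))
        # Pop the root and push the updated entry with a single sift.
        heapq.heapreplace(heap, (free_at + duration, worker))
    return schedule

if __name__ == "__main__":
    for worker, start in assign_jobs(2, [1, 2, 3, 4, 5]):
        print(worker, start)
```

Storing plain tuples instead of dict-based nodes also avoids calling Python-level comparison lambdas on every swap, which is likely to matter for the time limit.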

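A plain Python list will probably be faster than a deque in this case. All of your insertions and removals are at the end, so you never pay the O(n) cost a list incurs for inserting into or removing from the middle or the front. On top of that, indexing into the middle of a deque is slower than indexing a list, and the heap code indexes constantly.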

There are two problems, one minor and one major, with your shift_down code. You have this loop:

while c <= self.size:
    c = self.min_child(i)
    if self.heap[i] > self.heap[c] or self.heap[i] == self.heap[c]:
        self.swap(i, c)
    i = c                   # new index after swapping with child
    c = c * 2               # new child index

The major problem is that it always does log(n) iterations through the loop, even if the item you're inserting belongs at the top. You can reduce that by exiting the loop if at any time self.heap[i] < self.heap[c].

The minor problem is that there's no good reason to check for self.heap[i] == self.heap[c], because that can't happen in your program. If two nodes in the heap have the same value, it would mean that you have the same worker in the queue multiple times.

You can fix both problems with this simple change:

    if self.heap[i] < self.heap[c]:
        break         # node is where it belongs. Exit loop
    # otherwise, swap and go around again
    self.swap(i, c)
    i = c
    c = c * 2

Executing the loop too many times is probably your biggest performance problem, especially if the number of workers is large.
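To make the effect of the early exit concrete, here is a self-contained sketch of a sift-down over a 1-based list, written as a free function rather than as a method of your class. The name `sift_down` and the returned iteration count are assumptions added purely for illustration.

```python
def sift_down(heap, i, size):
    """Restore the min-heap property below index i of a 1-based heap list.

    Returns the number of loop iterations, for illustration only.
    """
    iterations = 0
    while 2 * i <= size:
        iterations += 1
        left, right = 2 * i, 2 * i + 1
        # Pick the smaller child; the right child may not exist.
        child = right if right <= size and heap[right] < heap[left] else left
        if heap[i] <= heap[child]:
            break                      # node is where it belongs, stop early
        heap[i], heap[child] = heap[child], heap[i]
        i = child
    return iterations

if __name__ == "__main__":
    # Index 0 is an unused placeholder, as in the question's code.
    heap = [None, (1, 0), (6, 1), (7, 2), (8, 3), (9, 4), (10, 5), (11, 6)]
    print(sift_down(heap, 1, 7))       # root is already the minimum
    heap[1] = (20, 0)                  # worker 0 becomes busy for a long time
    print(sift_down(heap, 1, 7), heap[1:])
```

When the root stays smaller than both children, the loop stops after a single comparison instead of walking the full height of the heap.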

There's a faster way to build the heap originally. A sorted list is a valid min-heap, and since every worker starts with a next free time of 0, the workers listed in index order are already sorted. So you could just initialize your heap with the worker numbers in order. No need to pay the O(n log n) cost to insert the individual workers (n being the number of workers here), when you can initialize the heap in O(n). If your workers aren't in sorted order, you can still initialize a heap in O(n). See "How can building a heap be O(n) time complexity?", or check out the heapify(x) function in the heapq source code.
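A minimal sketch of both initialization routes, again assuming `(next_free_time, worker_index)` tuples as heap entries. The helpers `is_min_heap` and `initial_worker_heap` are hypothetical names used only to check and build the list; `heapq.heapify` is the standard-library call.

```python
import heapq

def is_min_heap(items):
    """Check the 0-based heap invariant: every parent <= each of its children."""
    return all(items[(k - 1) // 2] <= items[k] for k in range(1, len(items)))

def initial_worker_heap(num_workers):
    # Entries are produced in ascending order, so no sifting is needed.
    return [(0, worker) for worker in range(num_workers)]

if __name__ == "__main__":
    heap = initial_worker_heap(5)
    print(is_min_heap(heap))

    # For entries that are not already sorted, heapify rearranges the
    # list in place in linear time.
    unsorted = [(3, 2), (0, 4), (7, 0), (1, 1)]
    heapq.heapify(unsorted)
    print(is_min_heap(unsorted), unsorted[0])
```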

Is there a data structure he can use to sort the jobs by processing time? — theMayer, Nov 30 '17 at 05:30
@theMayer There are many such data structures. How does that help him, when he has to process the jobs in the order entered? — Jim Mischel, Nov 30 '17 at 05:35
Thanks a lot Jim! You're right, the equality condition is redundant and breaking the while loop is definitely a good idea. I am still exceeding the required time by a significant margin, though. It seems that I need to use some sort of a fast built-in implementation... — Creative_Data, Dec 01 '17 at 07:01
The reason I have comparable Node-objects (dictionaries) in my data structure is that I can sort them while still knowing what their original position (worker index was). I need to keep track of the indices as I swap them, what would you suggest? — Creative_Data, Dec 01 '17 at 07:09
Okay, heapify sorts the list of Nodes fine, but does not take into account the indexing requirement of having the smaller worker index first. I tried to find the source code for sorted -- this sorting works fine, but I couldn't find the source code.. — Creative_Data, Dec 01 '17 at 08:09
Guys, what about this one? https://pypi.python.org/pypi/pqdict/#downloads — Creative_Data, Dec 01 '17 at 10:40
