Python multiprocessing: How to measure runtime of a subprocess in a worker?

Question

Let work_one() be a function like this:

def work_one(program_path):
    task = subprocess.Popen("./" + program_path, shell=True)
    t = 0
    while True:
        ret = task.poll()
        if ret is not None or t >= T: # T is the max time we allow for 'task'
            break
        else:
            t += 0.05
            time.sleep(0.05)
    return t

There are many such programs to feed to work_one(). When these programs run sequentially, the time reported by each work_one() is a reliable coarse measure of each program's runtime.

However, let's say we have a multiprocessing.Pool() instance with 20 workers, pool, and we call the function like this:

def work_all(programs):
    pool = multiprocessing.Pool(20)
    ts = pool.map(work_one, programs)
    return ts

Now, the runtime measures reported by work_all() are approximately 20 times larger than what a sequential work_one() would have reported.

This is reasonable, because inside work_one(), when a worker's process adds 0.05 to t and yields (by calling time.sleep()), it might not be able to yield to the subprocess it is managing (task), unlike when there is only one worker; instead, the OS might decide to give the CPU to another concurrent worker's process. Therefore the number of iterations inside a work_one() call may be 20 times higher before task completes.

Question:

How to correctly implement work_one() to get a good runtime measure, knowing that work_one() might be running concurrently?
I also want work_one() to return early if the subprocess task doesn't complete in T seconds, so the os.wait*() functions do not seem to be a good solution because they block the parent process.

Amadan · Accepted Answer · 2019-01-15T04:35:15.857

There are several relevant times for a process; here's how to get them all (see "What do 'real', 'user' and 'sys' mean in the output of time(1)?"; tl;dr: the total time the CPU spent on your process is user_time + system_time):

import time
import os
import subprocess

def work_one(program_path):
    start = time.perf_counter()
    task = subprocess.Popen(program_path, shell=True)
    pid, exit_status, resource_usage = os.wait4(task.pid, 0)
    end = time.perf_counter()
    real_time = end - start
    user_time = resource_usage.ru_utime
    system_time = resource_usage.ru_stime
    return (real_time, user_time, system_time)
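To see how the three numbers differ, here is a minimal sketch, assuming a POSIX system. The helper name measure and the sleeping child command are made up for illustration. A child that only sleeps accumulates wall-clock time but very little CPU time:

```python
import os
import subprocess
import sys
import time

def measure(argv):
    """Run argv as a child and return (real, user, system) in seconds."""
    start = time.perf_counter()
    task = subprocess.Popen(argv)
    # wait4 blocks until this child exits and returns its resource usage.
    _, _, usage = os.wait4(task.pid, 0)
    real = time.perf_counter() - start
    return real, usage.ru_utime, usage.ru_stime

# Hypothetical workload: the child sleeps, so wall-clock time passes
# while the CPU is mostly idle on its behalf.
real, user, system = measure(
    [sys.executable, "-c", "import time; time.sleep(0.5)"]
)
cpu = user + system
print(f"real={real:.2f}s user={user:.2f}s sys={system:.2f}s cpu={cpu:.2f}s")
```

Because user and system time are accounted per process by the kernel, they should not be inflated by other workers competing for the CPU, whereas real time can be.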

EDIT: Modified to provide a timeout. However, I couldn't make resource.getrusage return anything but zeroes in my test cases; maybe it needs a longer process, or maybe I'm doing something wrong.

import resource
import signal

def timeout_handler(signum, frame):
    raise TimeoutError()

timeout = 2

def work_one(program_path):
    try:
        timed_out = False
        signal.signal(signal.SIGALRM, timeout_handler)
        signal.alarm(timeout)
        start = time.perf_counter()
        task = subprocess.Popen(program_path, shell=True)
        pid, exit_status, resource_usage = os.wait4(task.pid, 0)
    except TimeoutError:
        timed_out = True
        resource_usage = resource.getrusage(resource.RUSAGE_CHILDREN)
        os.kill(task.pid, signal.SIGTERM)
    finally:
        signal.alarm(0)
    end = time.perf_counter()
    real_time = end - start
    user_time = resource_usage.ru_utime
    system_time = resource_usage.ru_stime
    return (timed_out, real_time, user_time, system_time)
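The zeroes are likely explained by how RUSAGE_CHILDREN is defined: it only covers children that have terminated and been waited for, and in the timeout branch above the child is still running when getrusage is called. Below is a sketch of one possible alternative, assuming a POSIX system: poll with os.wait4(pid, os.WNOHANG) against a monotonic deadline, and on timeout kill the child and reap it with a blocking os.wait4, so the usage comes from the same call in both branches. The argv-list form (instead of shell=True), the poll_interval parameter, and the use of SIGKILL are assumptions of this sketch. With shell=True the pid may belong to the shell rather than to the program itself, so killing it might not stop the program.

```python
import os
import signal
import subprocess
import sys
import time

def work_one(argv, timeout, poll_interval=0.05):
    """Return (timed_out, real, user, system) for one child process."""
    start = time.monotonic()
    task = subprocess.Popen(argv)
    timed_out = False
    while True:
        # WNOHANG makes wait4 return at once; pid is 0 while the child runs.
        pid, status, usage = os.wait4(task.pid, os.WNOHANG)
        if pid != 0:
            break
        if time.monotonic() - start >= timeout:
            timed_out = True
            # The child has not been reaped yet, so its pid is still ours.
            os.kill(task.pid, signal.SIGKILL)
            # Reap it so the kernel reports what it consumed before dying.
            pid, status, usage = os.wait4(task.pid, 0)
            break
        time.sleep(poll_interval)
    real = time.monotonic() - start
    return timed_out, real, usage.ru_utime, usage.ru_stime

# Hypothetical commands: one finishes immediately, one exceeds the limit.
print(work_one([sys.executable, "-c", "pass"], timeout=10))
print(work_one([sys.executable, "-c", "import time; time.sleep(30)"], timeout=0.3))
```

The elapsed time here is read from a clock rather than accumulated from sleep increments, so a worker that is descheduled for a while only reacts to the timeout late; it does not miscount it.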

Sorry.. I forgot to add the important part: I also want `work_one()` to return early if the subprocess `task` doesn't complete in `T` seconds.. I'll modify the question. Except this, this is a very good answer! — Leedehai, Jan 15 '19 at 02:19
