
I'm trying to adapt this module to support asynchronous execution when searching for many images in the same screenshot at a given time. I'm fairly new to async coding, and after a lot of research I chose Trio for it (because of its awesomeness and ease of use).

The point is:

  • The function receives a list of image paths
  • At each iteration, it takes a screenshot and tries to find each image in the array (it's better for performance if we don't take a new screenshot for every image in the array)
  • If it finds one, it returns the image's path and its coordinates
  • Otherwise, it starts over, because an image may have appeared on the screen in the meantime

I'm going to use this in another project that uses Trio for async, which is why I'm trying to convert it.

This is my attempt:


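import pyautogui
import trio

# is_retina and most_probable_location are assumed to be defined
# elsewhere in the module being adapted.
def image_search(image, precision=0.8, pil=None):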
    if pil is None:
        pil = pyautogui.screenshot()
    if is_retina:
        pil.thumbnail((round(pil.size[0] * 0.5), round(pil.size[1] * 0.5)))
    return most_probable_location(pil, image, precision)

async def multiple_image_search_loop(images, interval=0.1, timeout=None, precision=0.8):
    async def do_search():
        while True:
            pil = pyautogui.screenshot()
            for image in images:
                if pos := image_search(image, precision, pil):
                    return {
                        "position": pos,
                        "image": image
                    }
            await trio.sleep(interval)

    if timeout:
        with trio.fail_after(timeout):
            return await do_search()
    else:
        return await do_search()

Although the code looks correct, I feel I'm missing the point of asynchronous code. This could all be done synchronously, and I feel my changes haven't made any difference.

It's not so bad if there is no difference in performance, because the point is to make this function usable in an async context without blocking for the whole time it is searching for the images. But if I could optimize things, it sure would be better.

Maybe, instead of awaiting after a search over all the images, it would be better to adapt image_search() with a call to trio.sleep() and open a nursery in the main function (calling nursery.start_soon() within it for each image in the array)? This would block less in the other project where I'm going to use it, but it would take more time to find an image, am I right?

Teodoro

1 Answer


Trio won't directly parallelize CPU-bound code like this. Being an "async framework" means that it just uses a single CPU thread, while parallelizing I/O and networking operations. If you insert some calls to await trio.sleep(0) then that will let Trio interleave the image searching with other tasks, but it won't make the image searching any faster.

What you might want to do, though, is use a separate thread. I guess your code is probably spending most of its time in opencv, and opencv probably drops the GIL? So using threads will probably let you run your code across multiple CPUs at once, while also letting other async tasks run at the same time. To manage threads like this, Trio lets you do await trio.to_thread.run_sync(some_sync_function, *args), which runs some_sync_function(*args) in a thread. If you run multiple calls like this simultaneously in a nursery, then you'll use multiple threads.

There is one major gotcha to watch out for with threads, though: once a trio.to_thread.run_sync call starts, it can't be cancelled, so timeouts etc. won't take effect until after the call finishes. To work around this you might want to make sure that individual calls don't block for too long.

Also, a side note on style: functions made for Trio usually don't take timeout= arguments like that, because if the user wants to add a timeout, they can write the with block themselves around your function just as easily as passing an argument. So this way you don't have to clutter up APIs with timeout arguments everywhere.

Nathaniel J. Smith

Almost cried with `trio.to_thread.run_sync(some_sync_function, *args)`. Beautiful library, thank you! – Teodoro May 26 '20 at 21:26