I am trying to get my head around Python asyncio. This is a simple program I wrote. The logic I am trying to simulate is as follows:

  1. I get a list of names from some database. Since we need those names before we can do anything else, I made this a plain function rather than an asynchronous one.

  2. After we get the data, we make a call to some external API for each name. Since this is an expensive operation from an I/O standpoint, and the API calls for individual names don't depend on each other, it makes sense to make them asynchronous.

I looked up this thread on Stack Overflow (Cooperative yield in asyncio), which says that to give control back to the event loop so it can do something else, we have to call asyncio.sleep(0).

Here I am comparing the async behaviour of Node.js and Python. If I give control back to the event loop using the above syntax, would my long-running API call remain suspended, rather than happening in the background as in Node.js?

In Node.js, when we make an external API call we get back something called a Promise, which we can wait on until it finishes. It essentially means that the database call or API call is happening in the background and we get something back when it is done.

Am I missing some critical concept about Python asynchronous programming? Kindly shed some more light on this.

Below is the code and its output:

import asyncio
import time


async def get_message_from_api(name):
    # This is supposed to be a long running operation like getting data from external API
    print(f"Attempting to yield control to the other tasks....for {name}")
    await asyncio.sleep(0)
    time.sleep(2)
    return f"Creating message for {name}"


async def simulate_long_ops(name):
    print(f"Long running operation starting for {name}")
    message = await get_message_from_api(name)
    print(f"The message returned by the long running operation is {message}")


def get_data_from_database():
    return ["John", "Mary", "Sansa", "Tyrion"]


async def main():
    names = get_data_from_database()
    futures = []
    for name in names:
        futures.append(loop.create_task(simulate_long_ops(name)))
    await asyncio.wait(futures)


if __name__ == '__main__':
    try:
        loop = asyncio.get_event_loop()
        loop.run_until_complete(main())
    except Exception as e:
        print(e)
    finally:
        loop.close()

Output:

Long running operation starting for John
Attempting to yield control to the other tasks....for John
Long running operation starting for Mary
Attempting to yield control to the other tasks....for Mary
Long running operation starting for Sansa
Attempting to yield control to the other tasks....for Sansa
Long running operation starting for Tyrion
Attempting to yield control to the other tasks....for Tyrion
The message returned by the long running operation is Creating message for John
The message returned by the long running operation is Creating message for Mary
The message returned by the long running operation is Creating message for Sansa
The message returned by the long running operation is Creating message for Tyrion
user1867151

1 Answer

The mistake in your code is that you call time.sleep. You should never call that function in asyncio code, because it blocks the whole event loop; use await asyncio.sleep() instead. In JavaScript terms, calling time.sleep is almost as bad as sleeping in a busy-wait loop instead of awaiting a timer-based promise. (I say "almost" because time.sleep at least doesn't burn CPU cycles while waiting.)

Attempts to work around that mistake led to the second problem, the use of asyncio.sleep(0) to give control to the event loop. Although the idiom was added early, the behavior was documented only much later. As Guido hints in the original issue, explicit yielding to the event loop is only appropriate for advanced usage, and its use by beginners is most likely an error. If your long-running operation is async (as is the case in your code once time.sleep() is replaced with await asyncio.sleep()), you don't need to drop to the event loop manually. Instead, the async operation will yield to the event loop as needed at its await points, just like it would in JavaScript.
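As a rough sketch of that fix, here is the question's API stand-in with the blocking call swapped for await asyncio.sleep(). The 0.2 second delay and the elapsed-time measurement are my own additions for illustration, not part of the original program.

```python
import asyncio
import time


async def get_message_from_api(name, delay=0.2):
    # Hypothetical stand-in for a slow API call. asyncio.sleep suspends only
    # this coroutine, so the event loop is free to run the other tasks.
    await asyncio.sleep(delay)
    return f"Creating message for {name}"


async def main():
    names = ["John", "Mary", "Sansa", "Tyrion"]
    started = time.perf_counter()
    # Schedule all calls first, then await them; no asyncio.sleep(0) needed.
    tasks = [asyncio.create_task(get_message_from_api(name)) for name in names]
    messages = [await task for task in tasks]
    elapsed = time.perf_counter() - started
    return messages, elapsed


messages, elapsed = asyncio.run(main())
print(messages)
print(f"elapsed: {elapsed:.2f}s")  # close to one delay, not four
```

With time.sleep in place of asyncio.sleep, the same four calls would run one after another and take about four times as long.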

"In Node.js when we make an external API call we get something back called Promises on which we can wait to finish."

In Python a future is a close counterpart, and the async models are very similar. One significant difference is that Python's async functions don't return scheduled futures, but lightweight coroutine objects which you must either await or pass to asyncio.create_task() to get them to run. Since your code does the latter, it looks correct.
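A small sketch of that difference, using a made-up fetch coroutine and a calls list that exist only to make the order of execution visible:

```python
import asyncio

calls = []  # records which coroutine bodies have actually started


async def fetch(name):
    calls.append(name)
    await asyncio.sleep(0.01)
    return name.upper()


async def main():
    coro = fetch("john")            # just a coroutine object; the body has not run
    ran_before_await = list(calls)  # still empty at this point
    task = asyncio.create_task(fetch("mary"))  # scheduled to run on the event loop
    first = await coro              # awaiting is what drives the coroutine
    second = await task
    return ran_before_await, first, second


ran_before_await, first, second = asyncio.run(main())
print(ran_before_await, first, second, calls)
```

Calling a JavaScript async function starts it immediately and hands back a promise; calling fetch("john") here only creates an object that does nothing until it is awaited or wrapped in a task.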

The return value of create_task is an object that implements the Future interface. Future sports an add_done_callback method with the semantics you'd expect. But it's much better to simply await the future instead: it makes the code more readable, and it's clear where the exceptions go.
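To make the comparison concrete, here is a minimal sketch that uses both styles on the same task. The fetch coroutine and the seen list are illustrative names of my own.

```python
import asyncio


async def fetch(name):
    await asyncio.sleep(0.01)
    return f"Creating message for {name}"


async def main():
    seen = []
    task = asyncio.create_task(fetch("John"))
    # Callback style: the loop calls this with the finished task as argument.
    task.add_done_callback(lambda fut: seen.append(fut.result()))
    # Await style: the result (or the exception) arrives right here.
    result = await task
    # One extra pass through the loop so the callback has certainly run.
    await asyncio.sleep(0)
    return seen, result


seen, result = asyncio.run(main())
print(seen, result)
```

Both routes deliver the same value, but with await an exception raised inside fetch would propagate at the await line, whereas the callback would have to call fut.result() or fut.exception() itself to notice it.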

Also, you probably want to use asyncio.gather() rather than asyncio.wait() to ensure that exceptions do not go unnoticed. If you are using Python 3.7 or later, consider using asyncio.run() to run the async main function.
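A sketch of how the two differ when one call fails. The failure for "Tyrion" is invented purely to trigger an exception; everything else mirrors the question's names.

```python
import asyncio


async def fetch(name):
    await asyncio.sleep(0.01)
    if name == "Tyrion":
        # Invented failure, only here to show where the exception ends up.
        raise ValueError(f"API call failed for {name}")
    return f"Creating message for {name}"


async def with_wait(names):
    tasks = [asyncio.create_task(fetch(name)) for name in names]
    done, pending = await asyncio.wait(tasks)
    # wait() returns normally; the error stays stored inside the failed task
    # until something inspects it explicitly.
    return [task for task in done if task.exception() is not None]


async def with_gather(names):
    try:
        await asyncio.gather(*(fetch(name) for name in names))
    except ValueError as exc:
        # gather() re-raises the first exception at the await.
        return str(exc)
    return None


names = ["John", "Mary", "Sansa", "Tyrion"]
failed_tasks = asyncio.run(with_wait(names))
gather_error = asyncio.run(with_gather(names))
print(len(failed_tasks), gather_error)
```

asyncio.run() creates the event loop, runs the coroutine to completion, and closes the loop, which replaces the get_event_loop / run_until_complete / close sequence from the question.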

user4815162342
    Thanks a lot, wonderful explanation. I did not find any good documentation on the internet that explains everything in detail. There are quite a few resources, but none of them explain things with real-world examples. The documentation also has scores of methods, which is rather confusing. – user1867151 Oct 07 '19 at 22:07