Combining semaphore and time limiting in python-trio with asks http request

Question

I'm trying to use Python in an async manner in order to speed up my requests to a server. The server has a slow response time (often several seconds, but also sometimes faster than a second), but works well in parallel. I have no access to this server and can't change anything about it. So, I have a big list of URLs (in the code below, pages) which I know beforehand, and want to speed up their loading by making NO_TASKS=5 requests at a time. On the other hand, I don't want to overload the server, so I want a minimum pause between every request of 1 second (i. e. a limit of 1 request per second).

So far I have successfully implemented the semaphore part (five requests at a time) using a Trio queue.

import asks
import time
import trio

NO_TASKS = 5


asks.init('trio')
asks_session = asks.Session()
queue = trio.Queue(NO_TASKS)
next_request_at = 0
results = []


pages = [
    'https://www.yahoo.com/',
    'http://www.cnn.com',
    'http://www.python.org',
    'http://www.jython.org',
    'http://www.pypy.org',
    'http://www.perl.org',
    'http://www.cisco.com',
    'http://www.facebook.com',
    'http://www.twitter.com',
    'http://www.macrumors.com/',
    'http://arstechnica.com/',
    'http://www.reuters.com/',
    'http://abcnews.go.com/',
    'http://www.cnbc.com/',
]


async def async_load_page(url):
    global next_request_at
    sleep = next_request_at
    next_request_at = max(trio.current_time() + 1, next_request_at)
    await trio.sleep_until(sleep)
    next_request_at = max(trio.current_time() + 1, next_request_at)
    print('start loading page {} at {} seconds'.format(url, trio.current_time()))
    req = await asks_session.get(url)
    results.append(req.text)


async def producer(url):
    await queue.put(url)  


async def consumer():
    while True:
        if queue.empty():
            print('queue empty')
            return
        url = await queue.get()
        await async_load_page(url)


async def main():
    async with trio.open_nursery() as nursery:
        for page in pages:
            nursery.start_soon(producer, page)
        await trio.sleep(0.2)
        for _ in range(NO_TASKS):
            nursery.start_soon(consumer)


start = time.time()
trio.run(main)

However, I'm missing the implementation of the limiting part, i. e. the implementation of max. 1 request per second. You can see above my attempt to do so (first five lines of async_load_page), but as you can see when you execute the code, this is not working:

start loading page http://www.reuters.com/ at 58097.12261669573 seconds
start loading page http://www.python.org at 58098.12367392373 seconds
start loading page http://www.pypy.org at 58098.12380622773 seconds
start loading page http://www.macrumors.com/ at 58098.12389389973 seconds
start loading page http://www.cisco.com at 58098.12397854373 seconds
start loading page http://arstechnica.com/ at 58098.12405119873 seconds
start loading page http://www.facebook.com at 58099.12458010273 seconds
start loading page http://www.twitter.com at 58099.37738939873 seconds
start loading page http://www.perl.org at 58100.37830828273 seconds
start loading page http://www.cnbc.com/ at 58100.91712723473 seconds
start loading page http://abcnews.go.com/ at 58101.91770178373 seconds
start loading page http://www.jython.org at 58102.91875295573 seconds
start loading page https://www.yahoo.com/ at 58103.91993155273 seconds
start loading page http://www.cnn.com at 58104.48031027673 seconds
queue empty
queue empty
queue empty
queue empty
queue empty

I've spent some time searching for answers but couldn't find any.

score 4 · Accepted Answer · answered Oct 03 '18 at 20:19

One of the ways to achieve your goal would be using a mutex acquired by a worker before sending a request and released in a separate task after some interval:

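from typing import Iterator

import asks
import trio


async def fetch_urls(urls: Iterator, responses, n_workers, throttle):
    # `urls` must be a single shared iterator (e.g. `iter(pages)`), not a list;
    # otherwise every worker would loop over all URLs.
    # Using binary `trio.Semaphore` to be able
    # to release it from a separate task.
    mutex = trio.Semaphore(1)

    async def tick():
        # Give the token back once the throttle interval has passed.
        await trio.sleep(throttle)
        mutex.release()

    async def worker():
        for url in urls:
            await mutex.acquire()
            nursery.start_soon(tick)
            response = await asks.get(url)
            responses.append(response)

    async with trio.open_nursery() as nursery:
        for _ in range(n_workers):
            nursery.start_soon(worker)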

If a worker gets a response less than throttle seconds after sending its request, it will block on await mutex.acquire() when it moves on to the next URL. Otherwise the mutex will already have been released by tick, and another worker will be able to acquire it.

This is similar to how the leaky bucket algorithm works:

Workers waiting for the mutex are like water in a bucket.
Each tick is like a bucket leaking at a constant rate.

If you add a bit of logging just before sending a request you should get an output similar to this:

   0.00169 started
  0.001821 n_workers: 5
  0.001833 throttle: 1
  0.002152 fetching https://httpbin.org/delay/4
     1.012 fetching https://httpbin.org/delay/2
     2.014 fetching https://httpbin.org/delay/2
     3.017 fetching https://httpbin.org/delay/3
      4.02 fetching https://httpbin.org/delay/0
     5.022 fetching https://httpbin.org/delay/2
     6.024 fetching https://httpbin.org/delay/2
     7.026 fetching https://httpbin.org/delay/3
     8.029 fetching https://httpbin.org/delay/0
     9.031 fetching https://httpbin.org/delay/0
     10.61 finished

Thanks for this solution, which is the neatest posted so far! I tried it and was on the verge of writing that it doesn't work, because I used a list for `urls`, which causes every page to be loaded 5 times. But with `urls = iter(pages)` (referring to the variable in my original question), it works great! I will post an alternative version that is closely modeled on your solution, but I think this should be the accepted solution! – Manu CJ, Feb 01 '19 at 19:35

Matthias Urlichs · Answer 2 · 2018-08-02T08:39:15.690

2

Using trio.current_time() for this is much too complicated IMHO.

The easiest way to do rate limiting is a rate limiter, i.e. a separate task that basically does this:

async def ratelimit(queue, tick, task_status=trio.TASK_STATUS_IGNORED):
    with trio.open_cancel_scope() as scope:
        task_status.started(scope)
        while True:
            await queue.get()
            await trio.sleep(tick)

Example use:

async with trio.open_nursery() as nursery:
    q = trio.Queue(0)
    limiter = await nursery.start(ratelimit, q, 1)
    while whatever:
        await q.put(None) # will return at most once per second
        do_whatever()
    limiter.cancel()

In other words, you start that task with

q = trio.Queue(0)
limiter = await nursery.start(ratelimit, q, 1)

and then you can be sure that at most one call of

await q.put(None)

per second will return, as the zero-length queue acts as a rendezvous point. When you're done, call

limiter.cancel()

to stop the rate limiting task, otherwise your nursery won't exit.

If your use case includes starting sub-tasks which need to finish before the limiter gets cancelled, the easiest way to do that is to run them in another nursery, i.e. instead of

while whatever:
    await q.put(None) # will return at most once per second
    do_whatever()
limiter.cancel()

you'd use something like

async with trio.open_nursery() as inner_nursery:
    await start_tasks(inner_nursery, q)
limiter.cancel()

which would wait for the tasks to finish before touching the limiter.

NB: You can easily adapt this for "burst" mode, i.e. allow a certain number of requests before the rate limiting kicks in, by simply increasing the queue's length.

edited Aug 02 '18 at 08:39

answered Jul 09 '18 at 20:05

Matthias Urlichs


Helpful, thank you! As far as I understand it, there is an `await` missing in front of the `sleep`. But how do you make the `ratelimit` task stop, after the other queue is exhausted? In the current implementation you suggest, the nursery doesn't end because of the `while True`-loop. – Manu CJ Jul 24 '18 at 12:28
Easy: `nursery.cancel_scope.cancel()` terminates all tasks. Another way is to send a `bool` into the queue and teach `ratelimit` to terminate when that's `False`. – Matthias Urlichs Jul 24 '18 at 20:05
Missing `await` added. Thanks. Please don't hesitate to accept my answer. ;-) – Matthias Urlichs Jul 24 '18 at 20:06
Edit: now passing an explicit cancel scope for `ratelimit` to the caller – Matthias Urlichs Jul 24 '18 at 20:16
The reason I've not been accepting your answer so far is because I think it contains non-trivial steps for a final answer. For your first version of the answer I've successfully put your suggested lines in the correct places, but it's not perfectly obvious and for the sake of the next reader I'd ask you to provide an entire solution. (Remember that my original question was how to combine a rate limiter with a semaphore, not only how to write a rate limiter.) – Manu CJ Jul 30 '18 at 09:11
Also, in your current version of the answer, there are non-trivial steps for a final solution. For example, you write `When you're done, call ...`, but what does it mean to be done? An empty queue? All other tasks end? So please complete your answer (include all code) and I will accept it. – Manu CJ Jul 30 '18 at 09:16
added reasonably-complete example use. – Matthias Urlichs Jul 31 '18 at 06:51
I'd really like to give you the credit because you put me on track and now I have a working solution, but I can't manage to get a working solution from your code bits. `while whatever` and `do_whatever()` are substantially different from my code (I launch consumers in a *limited* for-loop). So I can't call `limiter.cancel()` after that for-loop, because it would be executed before the consumers finish, which in turn would make the limiter fail. So please provide more bits or, if in doubt, a full solution. – Manu CJ Jul 31 '18 at 16:23
The easiest way to do that is to use an inner nursery. Example amended. – Matthias Urlichs Aug 02 '18 at 08:39

score 1 · Answer 3 · answered Feb 01 '19 at 20:14

Motivation and origin of this solution

Some months have passed since I asked this question. Python has improved since then, so has trio (and my knowledge of them). So I thought it was time for a little update using Python 3.6 with type annotations and trio-0.10 memory channels.

I developed my own improvement of the original version, but after reading @Roman Novatorov's great solution, I adapted it again, and this is the result. Kudos to him for the main structure of the function (and the idea to use httpbin.org for illustration purposes). I chose memory channels instead of a mutex so that I could take any token re-release logic out of the worker.

Explanation of solution

I can rephrase the original problem like this:

I want to have a number of workers that start the request independently of each other (thus, they will be realized as asynchronous functions).
At any point, at most one token is available; any worker starting a request to the server consumes a token, and the next token will not be issued until a minimum time has passed. In my solution, I use trio's memory channels to coordinate between the token issuer and the token consumers (workers).

In case you're not familiar with memory channels and their syntax, you can read about them in the trio documentation. I think the logic of async with memory_channel and memory_channel.clone() can be confusing at first.

from typing import List, Iterator

import asks
import trio

asks.init('trio')

links: List[str] = [
    'https://httpbin.org/delay/7',
    'https://httpbin.org/delay/6',
    'https://httpbin.org/delay/4'
] * 3


async def fetch_urls(urls: List[str], number_workers: int, throttle_rate: float):

    async def token_issuer(token_sender: trio.abc.SendChannel, number_tokens: int):
        async with token_sender:
            for _ in range(number_tokens):
                await token_sender.send(None)
                await trio.sleep(1 / throttle_rate)

    async def worker(url_iterator: Iterator, token_receiver: trio.abc.ReceiveChannel):
        async with token_receiver:
            for url in url_iterator:
                await token_receiver.receive()

                print(f'[{round(trio.current_time(), 2)}] Start loading link: {url}')
                response = await asks.get(url)
                # print(f'[{round(trio.current_time(), 2)}] Loaded link: {url}')
                responses.append(response)

    responses = []
    url_iterator = iter(urls)
    token_send_channel, token_receive_channel = trio.open_memory_channel(0)

    async with trio.open_nursery() as nursery:
        async with token_receive_channel:
            nursery.start_soon(token_issuer, token_send_channel.clone(), len(urls))
            for _ in range(number_workers):
                nursery.start_soon(worker, url_iterator, token_receive_channel.clone())

    return responses

responses = trio.run(fetch_urls, links, 5, 1.)

Example of logging output:

As you see, the minimum time between all page requests is one second:

[177878.99] Start loading link: https://httpbin.org/delay/7
[177879.99] Start loading link: https://httpbin.org/delay/6
[177880.99] Start loading link: https://httpbin.org/delay/4
[177881.99] Start loading link: https://httpbin.org/delay/7
[177882.99] Start loading link: https://httpbin.org/delay/6
[177886.20] Start loading link: https://httpbin.org/delay/4
[177887.20] Start loading link: https://httpbin.org/delay/7
[177888.20] Start loading link: https://httpbin.org/delay/6
[177889.44] Start loading link: https://httpbin.org/delay/4

Comments on the solution

As is typical for asynchronous code, this solution does not preserve the original order of the requested URLs. One way to solve this is to associate an id with each original URL, e.g. with a tuple, put the responses into a response dictionary keyed by that id, and later read the responses out one after the other to build the response list (this avoids sorting and has linear complexity).

score 0 · Answer 4 · answered Jul 09 '18 at 18:40


You need to increment next_request_at by 1 every time you come into async_load_page. Try using next_request_at = max(trio.current_time() + 1, next_request_at + 1). Also I think you only need to set it once. You may get into trouble if you're setting it around awaits, where you're giving the opportunity for other tasks to change it before examining it again.
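The arithmetic behind this suggestion can be sketched as a small pure function. The name reserve_slot is made up for illustration; it computes the same values as the formula above, because sleeping until a time in the past returns immediately.

```python
def reserve_slot(now, next_request_at, interval=1.0):
    """Return (start_at, new_next_request_at) for one request.

    start_at is when this request may begin; the second value is the
    earliest start time for the following request.
    """
    start_at = max(now, next_request_at)
    return start_at, start_at + interval


# Five tasks arriving at the same moment each get their own one-second slot.
now = 100.0
next_request_at = 0.0
starts = []
for _ in range(5):
    start_at, next_request_at = reserve_slot(now, next_request_at)
    starts.append(start_at)
print(starts)  # [100.0, 101.0, 102.0, 103.0, 104.0]
```

Inside async_load_page, the reservation would be followed by await trio.sleep_until(start_at), with no await between reading and updating next_request_at, so that no other task can change it in between.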

answered Jul 09 '18 at 18:40

parity3


I'm not sure whether this would work, but @Matthias Urlichs's solution is way cleaner. I was a bit off-track in my original question. – Manu CJ Jul 31 '18 at 16:25
Yes, absolutely go with @matthias-urlichs's answer over modifying your proposed code. I only posted this answer in case you wanted to make the fewest changes possible, but I fully agree that the OP code does not need all that structure, at least in the scenario provided. I think you can get away with not using a queue or multiple tasks at all, but will leave that as an exercise. As to using trio.current_time() being too complicated, I would have to disagree; it's like saying this pseudo code is too complex: if it's not 5pm, wait til 5pm before leaving work...although, come to think of it.. – parity3 Jul 31 '18 at 20:10
