I have the following lines in my luigi.cfg file (on all nodes, scheduler and workers):
[core]
parallel-scheduling: true
However, when I monitor CPU utilization on my Luigi scheduler (a graph of roughly 4000 tasks, handling requests from about 100 workers), it uses only a single core, and the single luigid thread often hits 100% CPU utilization. My understanding is that this configuration option should parallelize the scheduling of tasks.
The source suggests that this flag should indeed use multiple cores on the scheduler. In https://github.com/spotify/luigi/blob/master/luigi/interface.py#L194, a call is made to https://github.com/spotify/luigi/blob/master/luigi/worker.py#L498 to check the .complete() state of the task in parallel.
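To be concrete about the behavior I expect, here is a minimal sketch of the pattern as I understand it from reading that code path. It is not Luigi's code: FakeTask and check_complete_in_parallel are hypothetical names I made up for illustration, and a standard-library thread pool stands in for whatever pool Luigi actually uses.

```python
from concurrent.futures import ThreadPoolExecutor


class FakeTask:
    """Hypothetical stand-in for a luigi.Task; only complete() matters here."""

    def __init__(self, task_id, done):
        self.task_id = task_id
        self._done = done

    def complete(self):
        # A real task would check whether its output target exists.
        return self._done


def check_complete_in_parallel(tasks, max_workers=4):
    """Call complete() on every task concurrently; return {task_id: bool}."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with tasks.
        results = pool.map(lambda t: t.complete(), tasks)
        return {t.task_id: r for t, r in zip(tasks, results)}


tasks = [FakeTask("a", True), FakeTask("b", False), FakeTask("c", True)]
status = check_complete_in_parallel(tasks)
print(status)
```

Note that in this sketch the pool lives in whichever process calls check_complete_in_parallel, so that is the process whose CPU usage I would expect to spread out.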
What am I missing to get my Luigi scheduler to utilize all of its cores?