My pipeline makes a lot of HTTP requests. Since that isn't a CPU-heavy operation, I'd like to spin up more processes than there are CPU cores. How can I change this?
1 Answer
ParallelRunner supports the max_workers parameter, but currently there's no way to pass it from the kedro run CLI command. This was done to reduce the complexity of the CLI.
You can add a CLI parameter manually, or just hard-code the value when instantiating the ParallelRunner in kedro_cli.py. The runner part might look like:
# `runner` is the name string passed on the command line, so compare the loaded class.
# ParallelRunner must be imported in kedro_cli.py (from kedro.runner import ParallelRunner).
runner_class = load_obj(runner, "kedro.runner") if runner else SequentialRunner
runner_params = {"max_workers": 100} if runner_class is ParallelRunner else {}
context = load_context(Path.cwd(), env=env)
context.run(
    tags=tag,
    runner=runner_class(**runner_params),
    node_names=node_names,
    from_nodes=from_nodes,
    to_nodes=to_nodes,
    from_inputs=from_inputs,
    load_versions=load_version,
    pipeline_name=pipeline,
)
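If you'd rather not hard-code the value, one option is to take it from the command line. Here is a minimal sketch, assuming you add your own option (say, a hypothetical `--max-workers`) to the `run` command in kedro_cli.py and pass its value through. `build_runner_params` is an illustrative helper of mine, not part of Kedro:

```python
from typing import Any, Dict, Optional


def build_runner_params(
    runner_name: Optional[str], max_workers: Optional[int] = None
) -> Dict[str, Any]:
    """Build keyword arguments for the runner constructor.

    Hypothetical helper: only ParallelRunner receives max_workers,
    every other runner gets no extra arguments.
    """
    if max_workers is not None and max_workers < 1:
        raise ValueError("max_workers must be a positive integer")

    # Accept both "ParallelRunner" and a dotted name ending in it.
    is_parallel = (
        runner_name is not None
        and runner_name.split(".")[-1] == "ParallelRunner"
    )
    if is_parallel and max_workers is not None:
        return {"max_workers": max_workers}
    return {}
```

In the snippet above you would then write something like `runner_params = build_runner_params(runner, max_workers)`, where `max_workers` is whatever your new CLI option supplies. Leaving the option unset returns an empty dict, so the runner falls back to its own default.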
921Kiyo