Override nested parameters using kedro run CLI command

Question

I am using nested parameters in my parameters.yml and would like to override these using runtime parameters for the kedro run CLI command:

train:
    batch_size: 32
    train_ratio: 0.9
    epochs: 5

The following doesn't seem to work:

kedro run --params train.batch_size:64,train.epochs:50

The values for epochs and batch_size are still those from parameters.yml. How can I override these parameters from the CLI?

napoleon_borntoparty · Accepted Answer · 2020-08-04T09:19:17.497

The additional parameters get passed into the KedroContext object via load_context(Path.cwd(), env=env, extra_params=params) in kedro_cli.py. Here you can see that there's a callback (protected) function called _split_params which splits the key-value pairs on :.

_split_params first splits the string on commas (to separate multiple params) and then splits each item on a colon. Adding a print or logging statement for what gets passed into extra_params shows something like:

{'train.batch_size': 64, 'train.epochs': 50}
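To make the splitting behaviour concrete, here is a minimal sketch of the comma-then-colon parsing described above. It is not kedro's actual _split_params; the function names split_params and to_number are made up for illustration, and the numeric conversion is an assumption based on the printed dictionary containing integers.

```python
def to_number(text):
    """Try int, then float; fall back to the raw string (assumed behaviour)."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def split_params(value):
    """Hypothetical stand-in for the CLI callback: 'a:1,b:2' -> {'a': 1, 'b': 2}."""
    result = {}
    for item in value.split(","):  # first split on commas
        item = item.strip()
        if not item:
            continue
        key, sep, raw = item.partition(":")  # then split on the first colon
        if not sep:
            raise ValueError(f"expected key:value, got {item!r}")
        # The key is stored verbatim, so 'train.batch_size' stays one flat key.
        result[key.strip()] = to_number(raw.strip())
    return result


flat = split_params("train.batch_size:64,train.epochs:50")
print(flat)
```

Nothing in this parsing step looks at the dots, which is why the keys come out flat.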

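The dotted names stay as flat top-level keys rather than being expanded into a nested dictionary. I think you have a couple of options: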

1. Un-nesting the params. That way you will override them correctly.
2. Adding custom logic to _split_params in kedro_cli.py to create a nested dictionary on . characters, which gets passed into the function mentioned above. I think you can reuse a lot of the existing logic.
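The second option could look roughly like the sketch below. The helpers nest_params and deep_update are hypothetical names, not part of kedro, and where exactly you would call them (inside _split_params or in your own context class) is left open. The deep merge is included on the assumption that a plain dict.update would replace the whole train entry and drop train_ratio.

```python
def nest_params(flat):
    """Turn {'a.b': 1} into {'a': {'b': 1}} by splitting keys on '.'."""
    nested = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            node = node.setdefault(part, {})  # create intermediate dicts as needed
        node[leaf] = value
    return nested


def deep_update(base, overrides):
    """Return a copy of base with overrides merged in, recursing into dicts."""
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_update(merged[key], value)
        else:
            merged[key] = value
    return merged


# Values from the question's parameters.yml and CLI call.
file_params = {"train": {"batch_size": 32, "train_ratio": 0.9, "epochs": 5}}
overrides = nest_params({"train.batch_size": 64, "train.epochs": 50})
merged = deep_update(file_params, overrides)
print(merged)
```

Here train_ratio survives because only the leaves named on the command line are replaced.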

NB: This was tested on kedro==0.16.2.

NB2: Kedro splits out nested params with the _get_feed_dict and _add_param_to_feed_dict functions in context.py. Specifically, _add_param_to_feed_dict is a recursive function that unpacks a dictionary and formats the nested names as "{}.{}".format(key, value). IMO you can reuse the logic from here.
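As an illustration of that recursive unpacking, here is a simplified sketch. It is modelled on the description above rather than copied from context.py, so treat the function body and the params: prefix handling as assumptions.

```python
def add_param_to_feed_dict(feed_dict, name, value):
    """Register a parameter and, if it is a dict, each nested entry as 'name.key'."""
    feed_dict[f"params:{name}"] = value
    if isinstance(value, dict):
        for key, val in value.items():
            # Build the dotted name for the nested entry and recurse.
            add_param_to_feed_dict(feed_dict, "{}.{}".format(name, key), val)


params = {"train": {"batch_size": 32, "train_ratio": 0.9, "epochs": 5}}

feed = {}
for name, value in params.items():
    add_param_to_feed_dict(feed, name, value)

for key in sorted(feed):
    print(key, "=", feed[key])
```

The same dotted naming scheme, run in reverse, is what turns a CLI key such as train.batch_size back into a nested entry.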

I was hoping there was a CLI syntax that would allow this, but I already feared that I would have to go with your suggested second option when dealing with nested parameters. Thank you @napoleon_borntoparty for confirming this. - evolved, Aug 04 '20 at 13:42

score 0 · Answer 2 · answered Nov 09 '20 at 12:50

I would suggest another way. I added code to my Kedro project's run.py that overrides KedroContext and ConfigLoader.

Now I can use something like kedro run .. --params "train_kwargs_max_epochs:1" in the CLI, and it is converted to train_kwargs.max_epochs = 1.

So I can use params:train_kwargs in my pipeline and use it in the node to initialise the Trainer: Trainer(**train_kwargs).
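The answer does not show how the underscore-joined name is mapped back to a nested key, so the following is only one possible sketch, not the author's code. It assumes the flat name is resolved against the parameters already loaded from file; resolve_flat_key, set_by_path and the gpus entry are hypothetical.

```python
def resolve_flat_key(params, flat_key, sep="_"):
    """Return the path of nested keys whose sep-joined form equals flat_key, or None."""
    for key, value in params.items():
        if key == flat_key:
            return [key]
        prefix = key + sep
        if isinstance(value, dict) and flat_key.startswith(prefix):
            rest = resolve_flat_key(value, flat_key[len(prefix):], sep)
            if rest is not None:
                return [key] + rest
    return None


def set_by_path(params, path, value):
    """Assign value at the nested location described by path (modifies params)."""
    node = params
    for part in path[:-1]:
        node = node[part]
    node[path[-1]] = value


# Hypothetical parameters as they might be loaded from parameters.yml.
params = {"train_kwargs": {"max_epochs": 10, "gpus": 0}}

path = resolve_flat_key(params, "train_kwargs_max_epochs")
set_by_path(params, path, 1)
print(path, params)
```

Matching against the existing structure avoids guessing where to split a name like train_kwargs_max_epochs, since underscores also appear inside the individual keys.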

I would be happy to provide the full source code if anybody is interested. The thing is, the current code is deeply integrated with my customer's sources, and I need time to separate and publish it.
