
I am using optuna for parameter optimisation of my custom models.

Is there any way to resample parameters until the current parameter set has not been tested before? That is, to try sampling another set if some past trial already used the same parameters.

In some cases this is impossible, for example when a categorical distribution is used and n_trials is greater than the number of possible unique parameter sets.

What I want: a config param like num_attempts so that parameters are resampled in a for-loop up to num_attempts times until a set is found that was not tested before; otherwise, run the trial on the last sampled set.

Why I need this: simply because it is too expensive to run heavy models several times on the same parameters.

What I do now: I just implement this "for-loop" thing myself, but it's messy.
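Roughly, my loop looks like the sketch below (simplified; the `sample_params` function is a hypothetical stand-in for my real `trial.suggest_*` calls, and `seen` collects parameter sets from past trials):

```python
import random

def sample_params():
    # Hypothetical sampler standing in for the real suggest_* calls.
    return {'x': random.randint(0, 10), 'y': random.choice([-10, -5, 0, 5, 10])}

def sample_unique(seen, num_attempts=10):
    # Resample up to num_attempts times; if every attempt is a duplicate,
    # fall back to the last sampled set.
    for _ in range(num_attempts):
        params = sample_params()
        key = frozenset(params.items())
        if key not in seen:
            break
    seen.add(key)
    return params
```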

If there is another, smarter way to do it, I would be very grateful for the information.

Thanks!

roseaysina

2 Answers


To the best of my knowledge, there is no direct way to handle your case for now. As a workaround, you can check for parameter duplication and skip the evaluation as follows:

import optuna

def objective(trial: optuna.Trial):
    # Sample parameters.
    x = trial.suggest_int('x', 0, 10)
    y = trial.suggest_categorical('y', [-10, -5, 0, 5, 10])

    # Check duplication and skip if it's detected.
    for t in trial.study.trials:
        if t.state != optuna.structs.TrialState.COMPLETE:
            continue

        if t.params == trial.params:
            # Note that if duplicate parameter sets are suggested too frequently,
            # you can use the pruning mechanism of Optuna to mitigate the problem.
            # By raising `TrialPruned` instead of just returning the previous value,
            # the sampler is more likely to avoid sampling the parameters in the succeeding trials:
            #
            #     raise optuna.structs.TrialPruned('Duplicate parameter set')
            return t.value  # Return the previous value without re-evaluating it.

    # Evaluate parameters.
    return x + y

# Start study.
study = optuna.create_study()

unique_trials = 20
while unique_trials > len(set(str(t.params) for t in study.trials)):
    study.optimize(objective, n_trials=1)
sile
  • Thanks! That looks much more accurate than my custom loop. The only problem I see: if duplicated values are sampled, the current trial is still "used up"; there is no attempt to sample non-duplicated values (i.e., the number of unique trials will be lower). That's why I don't want to use pruning: the actual number of `n_trials` becomes lower than desired. Anyway, if there is no such option, I will use your approach, as it looks more laconic :) – roseaysina Nov 13 '19 at 10:17
  • 1
    I'm glad that my answer helps you. BTW, to address the problem you mentioned (i.e., "actual number of n_trials becomes lower than desired"), I updated my example code a bit. Please see the bottom three lines of the code. – sile Nov 13 '19 at 11:14
  • Great! Thank you very much :) That's exactly what I wanted! Definitely looks much better than messy for-loop. – roseaysina Nov 13 '19 at 15:19

To second @sile's code comment, you may write a pruner such as:

import optuna
from optuna.pruners import BasePruner
from optuna.structs import TrialState  # `optuna.trial.TrialState` in newer versions

class RepeatPruner(BasePruner):
    def prune(self, study, trial):
        # type: (Study, FrozenTrial) -> bool

        trials = study.get_trials(deepcopy=False)
        completed_params = [t.params for t in trials if t.state == TrialState.COMPLETE]

        # No completed trials yet, so nothing can be a duplicate.
        if len(completed_params) == 0:
            return False

        # Prune if this exact parameter set has already been evaluated.
        if trial.params in completed_params:
            return True

        return False

then call the pruner as:

study = optuna.create_study(study_name=study_name, storage=storage, load_if_exists=True, pruner=RepeatPruner())
Rock
  • Thanks for the answer! One question: is it right that if I set `n_trials = 5` and 2 of the sampled param sets are the same (so only 4 unique sets were sampled), then with the pruner I will have only 4 completed attempts/trials? I mean, the pruner drops the duplicated sample and doesn't try to resample other params. – roseaysina Apr 23 '20 at 08:51
  • You may still use @sile's `while` loop to force the optimizer to try exactly the number of unique trials you want. Hope this answers your question. – Rock Apr 30 '20 at 16:06