
I am trying to use the Optuna library in Python to optimise parameters for recommender system models. The models are custom and look like standard sklearn fit/predict models (with get_params/set_params methods).

What I do: a simple objective function samples two parameters from a uniform distribution, sets them on the model, runs the model's predict (there is no fit stage, since this simple model uses the parameters only at predict time), and calculates a metric.

What I get: the first trial runs normally; it samples parameters and prints the result to the log. But on the second and every subsequent trial I get a strange error (see the log below) that I can't solve or find anything about online. When I run the study with just 1 trial, everything is fine.

What I tried: rearranging parts of the objective function, putting the fit stage inside it, and computing simpler metrics. Nothing helps.

Here is my objective function:

# getting train, test
# fitting model
self.model = SomeRecommender()
self.model.fit(train, some_other_params)

def objective(trial: optuna.Trial):
    # save study
    if some_path is not None:
        joblib.dump(study, some_path)

    # sampling params
    alpha = trial.suggest_uniform('alpha', 0, 100)
    beta = trial.suggest_uniform('beta', 0, 100)

    # setting params to model
    params = {'alpha': alpha,
              'beta': beta}
    self.model.set_params(**params)

    # getting predict
    recs = self.model.predict(some_other_params)

    # metric computing
    metric_result = Metrics.hit_rate_at_k(recs, test, k=k)

    return metric_result

# starting study
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=3, n_jobs=1)

This is what I get with three trials:

[I 2019-10-01 12:53:59,019] Finished trial#0 resulted in value: 0.1. Current best value is 0.1 with parameters: {'alpha': 59.6135986324444, 'beta': 40.714559720597585}.
[W 2019-10-01 13:39:58,140] Setting status of trial#1 as TrialState.FAIL because of the following error: AttributeError("'_BaseUniformDistribution' object has no attribute 'to_internal_repr'")
Traceback (most recent call last):
  File "/Users/roseaysina/anaconda3/envs/sauvage/lib/python3.7/site-packages/optuna/study.py", line 448, in _run_trial
    result = func(trial)
  File "/Users/roseaysina/code/project/model.py", line 100, in objective
    'alpha', 0, 100)
  File "/Users/roseaysina/anaconda3/envs/sauvage/lib/python3.7/site-packages/optuna/trial.py", line 180, in suggest_uniform
    return self._suggest(name, distribution)
  File "/Users/roseaysina/anaconda3/envs/sauvage/lib/python3.7/site-packages/optuna/trial.py", line 453, in _suggest
    self.study, trial, name, distribution)
  File "/Users/roseaysina/anaconda3/envs/sauvage/lib/python3.7/site-packages/optuna/samplers/tpe/sampler.py", line 127, in sample_independent
    values, scores = _get_observation_pairs(study, param_name)
  File "/Users/roseaysina/anaconda3/envs/sauvage/lib/python3.7/site-packages/optuna/samplers/tpe/sampler.py", line 558, in _get_observation_pairs
    param_value = distribution.to_internal_repr(trial.params[param_name])
AttributeError: '_BaseUniformDistribution' object has no attribute 'to_internal_repr'
[W 2019-10-01 13:39:58,206] Setting status of trial#2 as TrialState.FAIL because of the following error: AttributeError("'_BaseUniformDistribution' object has no attribute 'to_internal_repr'")
Traceback (most recent call last):
  File "/Users/roseaysina/anaconda3/envs/sauvage/lib/python3.7/site-packages/optuna/study.py", line 448, in _run_trial
    result = func(trial)
  File "/Users/roseaysina/code/project/model.py", line 100, in objective
    'alpha', 0, 100)
  File "/Users/roseaysina/anaconda3/envs/sauvage/lib/python3.7/site-packages/optuna/trial.py", line 180, in suggest_uniform
    return self._suggest(name, distribution)
  File "/Users/roseaysina/anaconda3/envs/sauvage/lib/python3.7/site-packages/optuna/trial.py", line 453, in _suggest
    self.study, trial, name, distribution)
  File "/Users/roseaysina/anaconda3/envs/sauvage/lib/python3.7/site-packages/optuna/samplers/tpe/sampler.py", line 127, in sample_independent
    values, scores = _get_observation_pairs(study, param_name)
  File "/Users/roseaysina/anaconda3/envs/sauvage/lib/python3.7/site-packages/optuna/samplers/tpe/sampler.py", line 558, in _get_observation_pairs
    param_value = distribution.to_internal_repr(trial.params[param_name])
AttributeError: '_BaseUniformDistribution' object has no attribute 'to_internal_repr'

I can't understand where the problem is or why the first trial works. Please help.

Thank you!

roseaysina

1 Answer


Your code seems to have no problems.

I ran a simplified version of your code (see below), and it worked well in my environment:

import optuna

def objective(trial: optuna.Trial):
    # sampling params
    alpha = trial.suggest_uniform('alpha', 0, 100)
    beta = trial.suggest_uniform('beta', 0, 100)

    # evaluating params
    return alpha + beta

# starting study
study = optuna.create_study(direction='maximize')
study.optimize(objective, n_trials=3, n_jobs=1)

Could you tell me about your environment so that I can investigate the problem? (e.g., OS, Python version, Python interpreter (CPython, PyPy, IronPython or Jython), Optuna version)
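One way to collect those details is a short standard-library script like the sketch below. The helper name `describe_environment` is made up for illustration, and it assumes Python 3.8 or newer, where `importlib.metadata` is available.

```python
import platform
from importlib import metadata  # standard library from Python 3.8 on


def describe_environment(package="optuna"):
    """Collect OS, Python version, interpreter and package version in a dict."""
    try:
        package_version = metadata.version(package)
    except metadata.PackageNotFoundError:
        # The package is not installed in this environment.
        package_version = "not installed"
    return {
        "os": platform.platform(),
        "python_version": platform.python_version(),
        "interpreter": platform.python_implementation(),  # e.g. CPython, PyPy
        package: package_version,
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Pasting the printed lines into the question would cover everything asked for here.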

Regarding "why the first trial is working":

This error is raised at line 558 of optuna/samplers/tpe/sampler.py, and that line is only executed when the number of completed trials in the study is greater than zero. The first trial has no completed trials to look up, so it never reaches that line.
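To make that behaviour concrete, here is a toy sketch of a history-based sampler. It is not Optuna's TPE implementation: `ToyHistorySampler` and `run_toy_study` are hypothetical names, and the sampling rule is invented purely to show that the history branch is skipped on the first trial and taken on every later one.

```python
import random


class ToyHistorySampler:
    """Toy stand-in for a history-based sampler; NOT Optuna's TPE code."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)
        self.history = []         # (value, score) pairs from completed trials
        self.history_lookups = 0  # how many times the history branch ran

    def suggest(self, low, high):
        if not self.history:
            # No completed trials yet: plain random sampling, no history used.
            return self._rng.uniform(low, high)
        # From the second trial on, past observations are consulted. This is
        # the kind of branch in which the error from the question is raised.
        self.history_lookups += 1
        best_value, _ = max(self.history, key=lambda pair: pair[1])
        candidate = self._rng.gauss(best_value, (high - low) * 0.1)
        return min(high, max(low, candidate))

    def report(self, value, score):
        self.history.append((value, score))


def run_toy_study(n_trials):
    """Run n_trials and return how often the history branch was executed."""
    sampler = ToyHistorySampler()
    for _ in range(n_trials):
        alpha = sampler.suggest(0, 100)
        sampler.report(alpha, score=alpha)  # toy objective: maximise alpha
    return sampler.history_lookups
```

With a single trial, `run_toy_study(1)` never enters the history branch, while `run_toy_study(3)` enters it on the second and third trials. That matches the pattern in the log: trial#0 finishes, trial#1 and trial#2 fail.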

BTW, you might be able to avoid this problem by using RandomSampler as follows:

sampler = optuna.samplers.RandomSampler()
study = optuna.create_study(direction='maximize', sampler=sampler)

Note that RandomSampler tends to optimize less effectively than TPESampler, which is Optuna's default sampler.

sile
  • Thanks a lot! The `RandomSampler` fix helps. Running your simplified code fully isolated from my project, I found a weird bug: if `import pyspark` comes before `import optuna` in your code (just that one line), it fails. In the reverse order (optuna before pyspark), everything works fine, even without the fix. I guess PySpark overrides something that Optuna uses afterwards. I also created an [issue](https://github.com/pfnet/optuna/issues/571) on GitHub. – roseaysina Oct 02 '19 at 12:48
  • Thank you for the additional information. It's really useful. I'll investigate this problem and reply to the GitHub issue. – sile Oct 03 '19 at 00:50