
I'm having trouble loading a file from Dagster setup code (the repository definition, not the pipelines themselves). Say I have the following project structure:

pipelines/
-app/
--environments/
---schedules.yaml
--repository.py
--repository.yaml

When I run dagit from inside the project folder ($ cd project && dagit -y app/repository.yaml), that folder becomes the working directory, so inside repository.py I can load a file using a path relative to the project root:

# repository.py

with open('app/environments/schedules.yaml', 'r') as f:
    schedules = f.read()  # do something with the file

However, if I set up a schedule, the pipelines in the project do not run. According to the cron logs, the open call raises a FileNotFoundError. Does this happen because the working directory is different when the schedule is executed by cron?

For context, I'm loading a config file with parameters of cron_schedules for each pipeline. Also, here's the tail of the stacktrace in my case:

  File "/home/user/.local/share/virtualenvs/pipelines-mfP13m0c/lib/python3.8/site-packages/dagster/core/definitions/handle.py", line 190, in from_yaml
    return LoaderEntrypoint.from_file_target(
  File "/home/user/.local/share/virtualenvs/pipelines-mfP13m0c/lib/python3.8/site-packages/dagster/core/definitions/handle.py", line 161, in from_file_target
    module = import_module_from_path(module_name, os.path.abspath(python_file))
  File "/home/user/.local/share/virtualenvs/pipelines-mfP13m0c/lib/python3.8/site-packages/dagster/seven/__init__.py", line 75, in import_module_from_path
    spec.loader.exec_module(module)
  File "<frozen importlib._bootstrap_external>", line 783, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/user/pipelines/app/repository.py", line 28, in <module>
    schedule_builder = ScheduleBuilder(settings.CRON_PRESET, settings.ENV_DICT)
  File "/home/user/pipelines/app/schedules.py", line 12, in __init__
    self.cron_schedules = self._load_schedules_yaml()
  File "/home/user/pipelines/app/schedules.py", line 16, in _load_schedules_yaml
    with open(path) as f:
FileNotFoundError: [Errno 2] No such file or directory: 'app/environments/schedules.yaml'
ElBrocas

1 Answer


You could build the path relative to the source file instead of the working directory, so that the file opens correctly no matter where the process is started from.

from dagster.utils import file_relative_path

with open(file_relative_path(__file__, './environments/schedules.yaml'), 'r') as f:
    schedules = f.read()  # do something with the file

file_relative_path simply does the following, so you can call the os.path functions directly if you prefer:

import os

def file_relative_path(dunderfile, relative_path):
    return os.path.join(os.path.dirname(dunderfile), relative_path)
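
To see why anchoring on the source file helps, here is a minimal standalone sketch that uses only the standard library. It recreates the app/environments/schedules.yaml layout in a temporary directory, switches to an unrelated working directory (roughly what I assume happens when the scheduler launches the process), and compares the two lookups. The helper name resolve_relative_to, the fake_repository_py path standing in for __file__, and the sample YAML content are all made up for illustration; none of them are part of dagster.

```python
import os
import tempfile
from pathlib import Path


def resolve_relative_to(anchor_file, relative_path):
    """Hypothetical helper: resolve relative_path against anchor_file's directory."""
    return (Path(anchor_file).resolve().parent / relative_path).resolve()


original_cwd = os.getcwd()
with tempfile.TemporaryDirectory() as tmp, tempfile.TemporaryDirectory() as elsewhere:
    # Recreate the layout from the question under a throwaway root.
    root = Path(tmp).resolve()
    env_dir = root / "app" / "environments"
    env_dir.mkdir(parents=True)
    (env_dir / "schedules.yaml").write_text("my_pipeline: '0 * * * *'\n")

    # Stand-in for __file__ inside app/repository.py (the file need not exist).
    fake_repository_py = root / "app" / "repository.py"

    try:
        # Assumption: the scheduled process starts in some other directory.
        os.chdir(elsewhere)

        # A path relative to the working directory no longer finds the file...
        cwd_relative_exists = Path("app/environments/schedules.yaml").exists()

        # ...while a path anchored on the source file still does.
        anchored = resolve_relative_to(fake_repository_py, "environments/schedules.yaml")
        anchored_exists = anchored.exists()
    finally:
        # Restore the working directory before the temporary folders are removed.
        os.chdir(original_cwd)

print("cwd-relative lookup found the file:", cwd_relative_exists)
print("file-anchored lookup found the file:", anchored_exists)
```

The first lookup depends on where the process happens to start, while the second depends only on where the source file lives, which is why the same code behaves the same under dagit and under cron.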
carte
  • Thanks, I fixed it that way. I'm still unsure what the working directory is when the scheduled processes are invoked. – ElBrocas May 29 '20 at 14:04