
Overview

We're building a pipeline inside a Docker container using Luigi. This is my first time using Luigi and I'm trying to get it running but I'm stuck on a Python threading/signal error.

What we're building
We have a container that runs a setup.py script as an entrypoint. This script imports my Luigi tasks, but its main function is to open up a PubSub channel to Google Cloud Services. When it receives a message on that channel, it kicks off a run of tasks.

The Error
I am calling Luigi directly from Python, experimenting with variations of this command:

luigi.build([GetImageSet()], workers=2, local_scheduler=True, no_lock=True)

And receiving this error:

ValueError: signal only works in main thread


Background on Signal & Luigi

From the Python Signal module docs:
signal.signal: this function can only be called from the main thread; attempting to call it from other threads will cause a ValueError exception to be raised.

From the Luigi worker.py script here
Luigi provides the no_install_shutdown_handler flag (defaults to false). 'If true, the SIGUSR1 shutdown handler will NOT be installed on the worker'. This is also where the error is occurring (on line 538). The script checks that the configuration flag for no_install_shutdown_handler is false (the default) before running signal.signal(). I have failed so far to get Luigi to read my client.cfg file with that flag set to true, and Docker may be to blame.

From the Luigi interface.py script here
"If you don't want to run luigi from the command line. You may use the methods defined in this module to programmatically run luigi." In this script, I could provide a custom worker scheduler factory, but I haven't been able to wrap my head around that yet.

Local vs. global Luigi scheduler
Luigi provides two scheduler options for running tasks. The local

Dockerfile woes: In my Dockerfile for this container, I am installing Luigi through pip but not doing much else. After reviewing this and this Docker/Luigi implementation on GitHub, I'm starting to worry that I'm not doing enough in my Dockerfile.


Possible reasons I think the error is happening

  1. The pub-sub channel subscriber is non-blocking, so I'm doing something that might be terrible to keep the main thread from exiting while we wait for messages in the background. This seems a likely source of my thread woes.
  2. The no_install_shutdown_handler flag is not being successfully set to True. Setting it would hopefully circumvent the error, but it isn't necessarily what I want.
  3. The local task scheduler. Maybe I should be using the global scheduler instead. I'm eventually going to have to get this working for production anyway.
  4. Running the script from Python instead of the command line.
  5. Using luigi.build when I should be using luigi.run. But according to the documentation page for Running from Python, build is "useful if you want to get some dynamic parameters from another source, such as database, or provide additional logic before you start tasks." That sounds like it fits my use case (triggering tasks after receiving a message from a pub-sub channel that passes a variable needed to run the first task).

Am I doing this all wrong anyway? If you have any suggestions for implementing the system I've described, please let me know. I will also post my Dockerfile and setup.py attempts upon request.


Some code examples

Here's the Dockerfile

# ./Dockerfile
# sfm-base:test is the container with tensorflow & our python sfm-library. It installs Ubuntu, Python, pip etc.
FROM sfm-base:test
LABEL maintainer "---@---.io"

# Install luigi, google-cloud w/ pip in module mode
RUN python -m pip install luigi && \
python -m pip install --upgrade google-cloud

# for now at least, start.sh just calls setup.py and sets google credentials. Ignore that chmod bit if it's bad I don't know.
COPY start.sh /usr/local/bin
RUN chmod -R 755 "/usr/local/bin/start.sh"
ENTRYPOINT [ "start.sh" ]

WORKDIR /src
COPY . .

# this was my attempt at setting the Luigi client.cfg in the container
# all I'm having the config do for now is set [core] no_install_shutdown_handler: true
ENV LUIGI_CONFIG_PATH /src/client.cfg

And here's the setup.py (edited down for SO)

# setup.py
from google.cloud import pubsub_v1
from time import sleep
import luigitasks
import luigi
import logging
import json

subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path(
'servicename', 'pubsubcommand')

# Example task. These are actually in luigitasks.py
class GetImageSet(luigi.Task):
     uri = luigi.Parameter(default='')

     def requires(self):
          return []

     def output(self):
          # write zip to local
          return

     def run(self):
          # use the URI to retrieve the ImageSet.zip from the bucket
          logging.info('Running getImageSet')

# Pubsub message came in
def onMessageReceived(message):
     print('Received message: {}'.format(message))

     if message.attributes:
          for key in message.attributes:
               if key == 'ImageSetUri':
                    value = message.attributes.get(key)
                    # Kick off the pipeline starting with the GetImageSet Task
                    # I've tried setting no_lock, take_lock, local_scheduler...
                    # General flags to try and prevent the thread issues
                    # Pass the URI from the message so the task actually receives it
                    luigi.build([GetImageSet(uri=value)], workers=3, local_scheduler=True, no_lock=False)
                    message.ack()

subscriber.subscribe(subscription_path, callback=onMessageReceived)

# The subscriber is non-blocking, so I keep the main thread from
# exiting to allow it to process messages in the background. Is this
# the cause of my woes?
print('Listening for messages on {}'.format(subscription_path))
while True:
    sleep(60)
    hey Justin, did you ever figure this out? I'm trying to run a flask server and I'm running into a similar problem - Flask dispatches threads to handle requests and I'm trying to `build` my pipeline within the response generating method – Nikhil Shinday Nov 24 '18 at 00:14

1 Answer

This happens because subscriber.subscribe starts a background thread. When that thread invokes luigi.build the exception is thrown.
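The restriction can be reproduced without Luigi or Pub/Sub at all. The sketch below is only an illustration: the function names are made up, and it registers a handler for SIGINT rather than SIGUSR1 (which Luigi's worker uses) so that it also runs on platforms without SIGUSR1. Registering the handler from any thread other than the main one raises ValueError, which is what happens when the subscriber's callback thread reaches Luigi's worker setup.

```python
import signal
import threading


def try_install_handler(results):
    # Roughly what a worker does at startup: register a signal handler.
    # SIGINT is an assumption made for portability; Luigi uses SIGUSR1.
    try:
        signal.signal(signal.SIGINT, signal.default_int_handler)
        results.append("installed")
    except ValueError as exc:
        # signal.signal refuses to run outside the main thread.
        results.append("ValueError: {}".format(exc))


def run_in_background_thread():
    # Run the registration on a non-main thread, as a subscriber callback would.
    results = []
    worker = threading.Thread(target=try_install_handler, args=(results,))
    worker.start()
    worker.join()
    return results[0]


if __name__ == "__main__":
    print(run_in_background_thread())
```

The exact wording of the message differs between Python versions, but the exception type is the same.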

The best solution here is to read pub-sub messages from the main thread using subscriber.pull. See example in the docs.
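If switching to subscriber.pull is not practical, a variation on the same idea is to keep the callback but have it only enqueue the message, and let the main thread do the build. This is a different technique from the synchronous pull recommended above, and the sketch makes several assumptions: messages are simplified to plain dicts, run_pipeline is a hypothetical stand-in for the luigi.build call, and the names pending and drain are invented for illustration.

```python
import queue
import threading

# Thread-safe hand-off between the callback thread and the main thread.
pending = queue.Queue()


def on_message_received(message):
    # Runs on the subscriber's background thread: enqueue only,
    # never call luigi.build from here.
    pending.put(message)


def run_pipeline(uri):
    # Hypothetical stand-in for something like:
    #   luigi.build([GetImageSet(uri=uri)], workers=2, local_scheduler=True)
    return "built {} on {}".format(uri, threading.current_thread().name)


def drain(max_messages, timeout=1.0):
    # Call this from the main thread; handles up to max_messages messages
    # and stops early if the queue stays empty for `timeout` seconds.
    results = []
    for _ in range(max_messages):
        try:
            message = pending.get(timeout=timeout)
        except queue.Empty:
            break
        results.append(run_pipeline(message["ImageSetUri"]))
        # Acknowledging the real Pub/Sub message would go here.
    return results


if __name__ == "__main__":
    # Simulate a message arriving on a background thread.
    producer = threading.Thread(
        target=on_message_received,
        args=({"ImageSetUri": "gs://example/set.zip"},),
    )
    producer.start()
    producer.join()
    print(drain(max_messages=1))
```

In the question's script, the `while True: sleep(60)` loop would be replaced by a loop that calls drain, so the main thread does the work instead of idling.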

Oren