20

I am new to Airflow. I am following a tutorial and have written the following code.

from airflow import DAG
from airflow.operators.python_operator import PythonOperator
from datetime import datetime, timedelta
from models.correctness_prediction import CorrectnessPrediction

default_args = {
    'owner': 'abc',
    'depends_on_past': False,
    'start_date': datetime.now(),
    'email': ['abc@xyz.com'],
    'email_on_failure': False,
    'email_on_retry': False,
    'retries': 1,
    'retry_delay': timedelta(minutes=5)
}

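# provide_context=True passes the task context as keyword arguments,
# so the callable has to accept **kwargs.
def correctness_prediction(**context):
    CorrectnessPrediction.train()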

dag = DAG('daily_processing', default_args=default_args)

task_1 = PythonOperator(
    task_id='print_the_context',
    provide_context=True,
    python_callable=correctness_prediction,
    dag=dag)

Running the script produces no errors, but when I check the web UI, the DAG does not appear under Menu->DAGs.


However, I can see the scheduled job under Menu->Browse->Jobs.


I also cannot see anything in $AIRFLOW_HOME/dags. Is this expected? Can someone explain why?

snakecharmerb
Rusty

7 Answers

17

Run airflow list_dags to check whether the DAG file is in the right location.

For some reason, I didn't see my DAG in the browser UI until I ran this command. It may be a browser cache issue or something similar.

If that doesn't work, you should just restart the webserver with airflow webserver -p 8080 -D

Stephen
samutamm
  • Do you know how to fix the browser UI problem? – Eric Bellet Aug 27 '19 at 12:48
  • @EricBellet for me `airflow list_dags` helped as quick fix, I don't know the root cause for this – samutamm Aug 27 '19 at 13:29
  • 2
    Yes. Restarting the UI with airflow webserver -p 8080 -D is another quick fix – Eric Bellet Aug 27 '19 at 13:57
  • 2
    Sometimes even this takes a while to work. I had an experience just now where I followed all of the instructions in this answer, but it still took about 3 minutes for the new DAG to show up in the UI. At some point maybe I'll dig into the configuration settings to see if this is a refresh frequency that can be tweaked. – Stephen Jan 07 '20 at 19:50
  • I had a DAG that was throwing an error, but rather than the error propagating to the UI, the DAG just wouldn't show up. Running `airflow list_dags` allowed me to see the error and debug that way. I am using an older version of Airflow. – ChristopherTull Oct 26 '20 at 18:10
12

I had the same issue. To resolve it, I had to run the scheduler:

airflow scheduler

Without this command, I don't see my new DAGs. By the way, the UI shows a warning related to this problem:

The scheduler does not appear to be running. Last heartbeat was received 9 seconds ago. The DAGs list may not update, and new tasks will not be scheduled.

DenisOgr
8

The SchedulerJob that you see on the Jobs page is an entry for the Scheduler itself. That's not the DAG being scheduled.

It's odd that your $AIRFLOW_HOME/dags is empty. All DAGs must live within the $AIRFLOW_HOME/dags directory (more precisely, in the dags directory configured in your airflow.cfg file). It looks like you are not storing the actual DAG file in that directory.

Alternatively, sometimes you also need to restart the webserver for the dag to show up (though that doesn't seem to be the issue here).
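
As a rough illustration of the "right directory" check, here is a minimal sketch in plain Python (it does not use Airflow itself). The function name and the example paths are hypothetical; substitute the dags directory from your own airflow.cfg.

```python
from pathlib import Path


def is_inside_dags_folder(dag_file, dags_folder):
    """Return True if dag_file is located somewhere under dags_folder."""
    file_path = Path(dag_file).expanduser().resolve()
    folder_path = Path(dags_folder).expanduser().resolve()
    # The file is "inside" the folder if the folder is one of its parents.
    return folder_path in file_path.parents
```

For example, is_inside_dags_folder("~/airflow/dags/daily_processing.py", "~/airflow/dags") returns True, while a script kept in some other project directory returns False and would not be picked up.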

Vineet Goel
  • 1
    Do I need to run the script _mentioned in the question_ in the $AIRFLOW_HOME/dags folder? – Rusty Aug 17 '16 at 20:08
  • Yes, that's right. All your DAG definitions (the Python files that initialize DAGs, such as the line `dag = DAG(...)` in your example above) should be in the global scope, in files within the DAGs dir configured in your airflow.cfg file. – Vineet Goel Aug 17 '16 at 22:40
8

We need to clarify several things:

  1. You never need to run the DAG file yourself (unless you're testing it for syntax errors). That is the job of the Scheduler/Executor.
  2. For a DAG file to be visible to the Scheduler (and consequently to the Webserver), you need to add it to dags_folder (specified in airflow.cfg; by default it is the $AIRFLOW_HOME/dags subfolder).

The Airflow Scheduler checks dags_folder for new DAG files every 5 minutes by default (governed by dag_dir_list_interval in airflow.cfg). So if you just added a new file, you have two options:

  1. Restart the Scheduler.
  2. Wait until the current Scheduler process picks up the new DAGs.
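
To see which values your installation actually uses, the two settings mentioned above can be read straight from airflow.cfg. This is only a sketch using the standard library: the function name is made up, and it assumes dags_folder sits in the [core] section and dag_dir_list_interval in the [scheduler] section, which you should verify against your own file.

```python
import configparser
from pathlib import Path


def read_dag_discovery_settings(cfg_path):
    """Return (dags_folder, dag_dir_list_interval) from an airflow.cfg file."""
    # interpolation=None: the file may contain literal '%' characters.
    parser = configparser.ConfigParser(interpolation=None)
    with open(Path(cfg_path).expanduser()) as cfg_file:
        parser.read_file(cfg_file)
    # Section names are assumptions; check them in your airflow.cfg.
    dags_folder = parser.get("core", "dags_folder", fallback=None)
    # The 300-second fallback mirrors the 5-minute default described above.
    interval = parser.getint("scheduler", "dag_dir_list_interval", fallback=300)
    return dags_folder, interval
```

Calling read_dag_discovery_settings("~/airflow/airflow.cfg") would then tell you where the Scheduler looks for DAG files and how long you may have to wait before a new file is noticed.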
ptyshevs
4

Check the dags_folder variable in airflow.cfg. If you have a virtual environment then run the command export AIRFLOW_HOME=$(pwd) from the main project directory. Note that running export AIRFLOW_HOME=$(pwd) expects your dags to be in a dags subdirectory in the project directory.
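
The relationship between AIRFLOW_HOME and the dags subdirectory can be sketched as follows. This is an illustration only: the function name is hypothetical, it assumes AIRFLOW_HOME falls back to ~/airflow when unset, and it ignores any explicit dags_folder in airflow.cfg, which would take precedence.

```python
import os
from pathlib import Path


def default_dags_folder(environ=None):
    """Return the dags folder implied by AIRFLOW_HOME (or ~/airflow if unset)."""
    env = os.environ if environ is None else environ
    # Assumption: ~/airflow is used when AIRFLOW_HOME is not exported.
    airflow_home = env.get("AIRFLOW_HOME", "~/airflow")
    return Path(airflow_home).expanduser() / "dags"
```

If default_dags_folder() does not print the directory that holds your DAG file, the export was probably run in a different shell or from a different directory than you expected.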

deerishi
1

I had the same issue. I had installed Airflow twice, once without sudo and once with sudo. I was using the sudo version, but the directories were under my user path. I fixed it by running this shell command: export AIRFLOW_HOME=~/airflow

Jonathan
0

Check the paused DAGs; your DAG might have ended up there. If you are sure that you added the .py file correctly, manually type the URL of the DAG using its dag_id, for example http://AIRFLOW_URL/graph?dag_id=dag_id. Then you can see whether Airflow has accepted your DAG.
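
If you want to build that URL in code, a small sketch could look like this. The helper name is hypothetical, and it simply follows the /graph?dag_id=... pattern shown above; the exact path may differ between Airflow versions.

```python
from urllib.parse import urlencode


def dag_graph_url(base_url, dag_id):
    """Build the graph-view URL for dag_id, following the pattern above."""
    # urlencode escapes any characters in dag_id that are unsafe in a URL.
    return f"{base_url.rstrip('/')}/graph?{urlencode({'dag_id': dag_id})}"
```

For the DAG in the question, dag_graph_url("http://localhost:8080", "daily_processing") gives a link you can paste into the browser, assuming the webserver runs on port 8080 as in the other answers.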

Nikhil Redij